Code for the Paper
ChangeNet-v2: Semantic Change detection with Convolutional Neural Networks
K. Ram Prabhakar, Akshaya Ramasamy, Suvaansh Bhambri, Jayavardhana Gubbi, R. Venkatesh Babu, Balamuralidhar Purushothaman
The above-mentioned paper is currently under review at the journal Computer Vision and Image Understanding.
In this paper, a novel deep learning architecture is proposed for change detection that targets higher-level inferencing. The new network architecture involves extracting features using CNN and combining filter outputs at different levels to localize the change. Finally, detected changes are identified using the same network, and output is an object-level change detection with the label.
The proposed architecture is compared with state-of-the-art methods on three modern change detection datasets: VL-CMU-CD (Alcantarilla et al. (2018)), TSUNAMI (Sakurada and Okatani (2015)), and GSV (Sakurada and Okatani (2015)).
This repository has been tested with Python 3.
- Install PyTorch (Python3) by following instructions on PyTorch Homepage.
- Install Torchvision via pip3. It provides the feature extractor (VGG) pretrained on ImageNet.
- Install tqdm via pip3. It is used for generating progress bars.
The VL-CMU-CD dataset can be downloaded from the project page of the paper Street-View Change Detection with Deconvolutional Networks (RSS'16). This dataset is available on request.
.
├── ...
├── VL_CMU_CD # Dataset root directory
│ ├── left # Contains test images
│ │ ├── 001_00.jpeg
│ │ ├── 001_01.jpeg
│ │ └── ...
│ │
│ ├── right # Contains reference images
│ │ ├── 001_00.jpeg
│ │ ├── 001_01.jpeg
│ │ └── ...
│ │
│ ├── GT_MULTICLASS # Contains ground-truth 3D maps with 11 channels; each channel marks a single class at every pixel
│ │ ├── 001_00.npy
│ │ ├── 001_01.npy
│ │ └── ...
│ │
│ ├── mask # Binary Mask for region of interest
│ │ ├── 001_00.jpeg
│ │ ├── 001_01.jpeg
│ │ └── ...
│ │
│ └── ...
│
└── ...
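As a rough illustration of the layout above, the following sketch builds the four file paths that belong to one sample. The helper name `sample_paths` and the example ID `001_00` are assumptions made for illustration; they are not taken from the repository's code.

```python
from pathlib import Path


def sample_paths(root, sample_id):
    """Return the paths of the files that make up one sample.

    Hypothetical helper: the folder names and extensions follow the tree
    above, but the function itself is not part of the repository.
    """
    root = Path(root)
    return {
        "left": root / "left" / f"{sample_id}.jpeg",            # test image
        "right": root / "right" / f"{sample_id}.jpeg",          # reference image
        "label": root / "GT_MULTICLASS" / f"{sample_id}.npy",   # 11-channel ground truth
        "mask": root / "mask" / f"{sample_id}.jpeg",            # region-of-interest mask
    }


if __name__ == "__main__":
    for name, path in sample_paths("/path/to/dataset/VL_CMU_CD", "001_00").items():
        print(name, path)
```

Keeping the four paths together under one sample ID makes it easy to check that every image pair has a matching label map and mask before training starts.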
The main.py script is used for training. It trains the model iteratively over the entire dataset for the specified number of epochs. Use the following command to train the baseline model provided in this repository. The baseline experiment used the Adam optimiser with an initial learning rate of 1e-4. The model trains for 50 epochs by default.
python3 main.py --data /path/to/dataset/VL_CMU_CD
We can resume training from a saved checkpoint by using the --resume option and passing the checkpoint path as its argument:
python3 main.py --data /path/to/dataset/VL_CMU_CD --resume models/checkpoint.pth.tar
We can train our model on multiple GPUs by using the --device_ids option and passing the device ids as a comma-separated string:
python3 main.py --data /path/to/dataset/VL_CMU_CD --device_ids "gpu ids separated by commas (e.g. 0,1,2,...)"
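The comma-separated string passed to --device_ids has to be turned into a list of integer GPU indices before it can be used. A minimal sketch of that step, assuming a hypothetical helper `parse_device_ids` (the repository may parse the option differently):

```python
def parse_device_ids(arg):
    """Convert a string such as "0,1,2" into a list of integer GPU ids.

    Hypothetical helper, not taken from main.py. Blank entries are skipped,
    so stray spaces or a trailing comma do not raise an error.
    """
    return [int(part) for part in arg.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_device_ids("0,1,2"))
```

A list of integers like this is the form accepted by the `device_ids` argument of `torch.nn.DataParallel`, which is one common way to spread a PyTorch model over several GPUs.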
The main.py script, together with the --evaluate flag, is used for evaluation. It takes a pretrained model and evaluates it on the image ids listed in the csv file passed as an argument to the --efile option.
python3 main.py --data /path/to/dataset/VL_CMU_CD --resume /path/to/saved/model.pth.tar --evaluate --efile test
The above command will test the trained model on the test.csv file.
The metrics used for evaluation are:
Precision: Precision tells us how reliable the model's positive predictions are: out of the pixels predicted as positive, how many are actually positive.
Recall: Recall tells us, out of all the actual positives, how many the model identifies by labelling them as positive.
F Measure: F Measure is the harmonic mean of Precision and Recall. We use this metric when we need to balance the two, since its value drops if either of them is low. This makes it well suited to class-imbalanced datasets.
The best-performing checkpoint has been made available in this repository.
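The three metrics can be written down in a few lines. The sketch below computes them for a single class from flat sequences of 0/1 labels; the function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not the repository's evaluation code.

```python
def precision_recall_f(pred, target):
    """Compute precision, recall and F measure for binary labels.

    pred and target are equal-length sequences of 0/1 values (for example,
    flattened prediction and ground-truth masks for one class).
    Illustrative helper only; not taken from the repository.
    """
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    pred = [1, 1, 0, 0]
    target = [1, 0, 1, 0]
    print(precision_recall_f(pred, target))
```

Because the F measure is a harmonic mean, a model that predicts almost nothing as changed can reach high precision but will still receive a low F measure through its poor recall.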
TODO: Add inferencing code using trained checkpoint
| Classification | Class→ Metric↓ | Barrier | Bin | Construction | Other Objects | Person Bicycle | Rubbish Bin | Sign Board | Traffic Cone | Vehicle |
|---|---|---|---|---|---|---|---|---|---|---|
| Pixel Based | Precision | 0.74 | 0.76 | 0.90 | 0.67 | 0.84 | 0.56 | 0.78 | 0.67 | 0.92 |
| | Recall | 0.70 | 0.72 | 0.85 | 0.65 | 0.79 | 0.50 | 0.69 | 0.60 | 0.88 |
| | F_Measure | 0.72 | 0.74 | 0.87 | 0.66 | 0.81 | 0.53 | 0.73 | 0.63 | 0.90 |
| Object Based | Precision | 1.00 | 0.97 | 0.88 | 1.00 | 1.00 | 0.96 | 1.00 | 1.00 | 1.00 |
| | Recall | 0.78 | 1.00 | 1.00 | 0.63 | 1.00 | 1.00 | 0.87 | 0.58 | 0.97 |
| | F_Measure | 0.87 | 0.98 | 0.94 | 0.78 | 1.00 | 0.97 | 0.93 | 0.73 | 0.98 |

| | Accuracy | Precision | Recall | f-score |
|---|---|---|---|---|
| Binary | 99.2 | 93.7 | 93.9 | 93.8 |
| Multi-class | 78.5 | 76.0 | 71.3 | 73.4 |

| Metric→ Methods↓ | Precision (FPR = 0.1) | Recall (FPR = 0.1) | f-score (FPR = 0.1) | Precision (FPR = 0.01) | Recall (FPR = 0.01) | f-score (FPR = 0.01) |
|---|---|---|---|---|---|---|
| Super-pixel | 0.17 | 0.35 | 0.23 | 0.23 | 0.12 | 0.15 |
| CDnet | 0.40 | 0.85 | 0.55 | 0.79 | 0.46 | 0.58 |
| ChangeNet | 0.79 | 0.80 | 0.79 | 0.80 | 0.79 | 0.79 |
| ChangeNet-v2 | 0.93 | 0.93 | 0.93 | 0.90 | 0.94 | 0.93 |

1. Alcantarilla, P.F., Stent, S., Ros, G., Arroyo, R., Gherardi, R., 2018. Street-view change detection with deconvolutional networks. Autonomous Robots 42, 1301–1322.
2. Babaee, M., Dinh, D.T., Rigoll, G., 2018. A deep convolutional neural network for video sequence background subtraction. Pattern Recognition 76, 635–649.
3. Gressin, A., Vincent, N., Mallet, C., Paparoditis, N., 2013. Semantic approach in image change detection, in: International Conference on Advanced Concepts for Intelligent Vision Systems, Springer. pp. 450–459.
4. Gubbi, J., Ramaswamy, A., Sandeep, N., Varghese, A., Balamuralidhar, P., 2017. Visual change detection using multiscale super pixel, in: Digital Image Computing: Techniques and Applications (DICTA), 2017 International Conference on, IEEE. pp. 1–6.
5. Hussain, M., Chen, D., Cheng, A., Wei, H., Stanley, D., 2013. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS Journal of photogrammetry and remote sensing 80, 91–106.
6. Sakurada, K., Okatani, T., 2015. Change detection from a street image pair using cnn features and superpixel segmentation, in: BMVC, pp. 61–1.
7. St-Charles, P.L., Bilodeau, G.A., Bergevin, R., 2015. Subsense: A universal change detection method with local adaptive sensitivity. IEEE Transactions on Image Processing 24, 359–373.
8. Varghese, A., Jayavardhana, G., Akshaya, R., Balamuralidhar, P., 2018. Changenet: A deep learning architecture for visual change detection, in: European Conference on Computer Vision Workshops (ECCVW), IEEE.