Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim, Tae Hyun Kim
Video stabilization is a longstanding computer vision problem, and pixel-level synthesis solutions, which synthesize full frames, add to the complexity of this task. These techniques aim to stabilize videos by synthesizing full frames while enhancing the stability of the considered video, which intensifies the complexity of the task because each video sequence presents a distinct mix of unique motion profiles and visual content, making robust generalization with fixed parameters difficult. In our study, we introduce a novel approach that enhances the performance of pixel-level synthesis solutions for video stabilization by adapting these models to individual input video sequences. The proposed adaptation exploits low-level visual cues accessible at test time to improve both the stability and the quality of the resulting videos. We highlight the efficacy of our "test-time adaptation" methodology through simple fine-tuning of one of these models, followed by significant stability gains via the integration of meta-learning techniques. Notably, significant improvement is achieved with only a single adaptation step. The versatility of the proposed algorithm is demonstrated by consistently improving the performance of various pixel-level synthesis models for video stabilization in real-world scenarios.
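For intuition, the test-time adaptation described above boils down to adapting a pretrained full-frame synthesis model to the input video itself with very few gradient steps on a self-supervised objective. The sketch below is a minimal, hypothetical PyTorch illustration and not the actual code in this repository; the model, the input tensor, and the simple L1 fidelity term are placeholders for the networks and low-level visual cues used in the paper.

```python
import torch
import torch.nn.functional as F

def adapt_one_step(stabilizer, unstable_clip, lr=1e-5):
    """Hypothetical single-step test-time adaptation sketch.

    stabilizer    : a pretrained full-frame synthesis model (e.g., a meta-trained DMBVS)
    unstable_clip : (T, C, H, W) tensor holding the frames of the input video
    """
    stabilizer.train()
    optimizer = torch.optim.Adam(stabilizer.parameters(), lr=lr)

    # Self-supervised objective that is computable at test time. A plain L1
    # fidelity term to the input frames is used here purely as a stand-in;
    # the paper's objective relies on low-level visual cues of the input video.
    synthesized = stabilizer(unstable_clip)
    loss = F.l1_loss(synthesized, unstable_clip)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # a single adaptation step already yields a notable gain

    stabilizer.eval()
    return stabilizer
```

When the model has been meta-trained, this single inner-loop update is enough to specialize it to the motion profile and content of the video at hand.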
First, compile the dependencies for PWCNet from its official repository, as GlobalFlowNet uses a modified version of PWCNet. Then, run the following commands to clone this repository and set up the environment:
# Clone this repo
git clone https://github.com/MKashifAli/MetaVideoStab.git
cd MetaVideoStab
# Create and activate conda environment
conda env create -f environments.yaml
conda activate mvs
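Optionally, you can confirm from inside the activated environment that PyTorch is installed and can see your GPU (a generic sanity check, not part of this repository's scripts):

```python
# Run inside the activated "mvs" environment (e.g., in a Python shell)
import torch
print(torch.__version__, torch.cuda.is_available())
```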
To meta-train the different variants discussed in the paper (DMBVS, DMBVS_recurrent, or DIFRINT), download the DeepStab Dataset. Organize the files as shown below and update the corresponding directories in the code files:
root
├───unstable_frames
│   ├───Video_1
│   │       xxxx.png
│   │       xxxx.png
│   │       ...
│   ├───Video_2
│   │       xxxx.png
│   │       xxxx.png
│   │       ...
│   └───Video_3
│           xxxx.png
│           xxxx.png
│           ...
└───stable_frames
    ├───Video_1
    │       xxxx.png
    │       xxxx.png
    │       ...
    ├───Video_2
    │       xxxx.png
    │       xxxx.png
    │       ...
    └───Video_3
            xxxx.png
            xxxx.png
            ...
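Assuming the layout above and that an unstable frame and its stable counterpart share the same video folder and file name, a minimal data-loading sketch could look like the following. This is a hypothetical illustration, not the repository's own loader; the class and argument names are made up for this example.

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class DeepStabPairs(Dataset):
    """Hypothetical loader pairing unstable/stable DeepStab frames laid out as above."""

    def __init__(self, root):
        self.to_tensor = transforms.ToTensor()
        self.pairs = []
        unstable_root = os.path.join(root, "unstable_frames")
        stable_root = os.path.join(root, "stable_frames")
        for video in sorted(os.listdir(unstable_root)):
            for frame in sorted(os.listdir(os.path.join(unstable_root, video))):
                self.pairs.append((
                    os.path.join(unstable_root, video, frame),
                    os.path.join(stable_root, video, frame),  # assumes matching file names
                ))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        unstable_path, stable_path = self.pairs[idx]
        unstable = self.to_tensor(Image.open(unstable_path).convert("RGB"))
        stable = self.to_tensor(Image.open(stable_path).convert("RGB"))
        return unstable, stable
```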
Once the dataset is in place, you can begin meta-training the different variants by running the corresponding scripts as described in the table below:
| Variant | Corresponding Code File |
|---|---|
| DMBVS | train_DMBVS_vanilla.py |
| DMBVS_recurrent | train_DMBVS_recurrent.py |
| DIFRINT | train_DIF.py |
Please adjust the "opts" parameters in the respective files according to your data and resources before starting the training.
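For reference, the adjustments usually concern the dataset and checkpoint paths plus resource-related settings. The snippet below only illustrates the kind of fields to look for; the actual option names are defined inside each training script and may differ.

```python
# Illustrative example only; check the respective train_*.py script for the real option names.
opts = {
    "unstable_dir": "/path/to/DeepStab/unstable_frames",  # unstable input frames
    "stable_dir":   "/path/to/DeepStab/stable_frames",    # corresponding stable frames
    "batch_size":   4,                                     # lower this if GPU memory is limited
    "ckpt_dir":     "./checkpoints",                       # where meta-trained weights are written
}
```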
To perform test-time adaptation on your videos, download the pretrained network checkpoints and place them in the "pretrained_models" folder as shown below (or modify the paths in the test_time_xxx.py
scripts accordingly):
root
└───pretrained_models
    ├───checkpoints
    │   ├───dif_eval            # Meta-trained DIFRINT checkpoints
    │   │       xxx.pth
    │   ├───eval                # Meta-trained DMBVS checkpoints
    │   │       xxx.pth
    │   └───rec_eval            # Meta-trained DMBVS_recurrent checkpoints
    │           xxx.pth
    ├───CoarseStabilizer.pth
    ├───GFlowNet.pth
    └───baseline_ckpts          # For performance comparison
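Optionally, a quick way to confirm that the checkpoints ended up in the expected subfolders (a generic check, not part of the repository's scripts):

```python
# List the meta-trained checkpoints found in each expected subfolder
from pathlib import Path

for sub in ["checkpoints/dif_eval", "checkpoints/eval", "checkpoints/rec_eval"]:
    folder = Path("pretrained_models") / sub
    found = sorted(p.name for p in folder.glob("*.pth")) if folder.exists() else "missing"
    print(f"{sub}: {found}")
```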
Run the corresponding files as described in the table below to perform test-time adaptation for the different variants:
| Variant | Corresponding Code File |
|---|---|
| DMBVS | test_time_adaptation_DMBVS.py |
| DMBVS_recurrent | test_time_adaptation_DMBVSr.py |
| DIFRINT | test_time_adaptation_DIF.py |
If you find our work useful in your research, please cite our publications:
@inproceedings{ali2024harnessing,
  title={Harnessing Meta-Learning for Improving Full-Frame Video Stabilization},
  author={Ali, Muhammad Kashif and Im, Eun Woo and Kim, Dongjin and Kim, Tae Hyun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12605--12614},
  year={2024}
}

@article{ali2020deep,
  title={Deep motion blind video stabilization},
  author={Ali, Muhammad Kashif and Yu, Sangjoon and Kim, Tae Hyun},
  journal={arXiv preprint arXiv:2011.09697},
  year={2020}
}
The code in this repository incorporates parts of the methods from Learning Blind Video Temporal Consistency and GlobalFlowNet. We thank the authors for sharing their code.
Additionally, to improve readability, I have significantly rewritten and cleaned up the original repository. If you encounter any bugs or issues, please report them, and I will address them as soon as I have time.