This is a simple guide to help you use TSCUNet for video upscaling.
- Clone or download this repository: `git clone https://github.com/Kim2091/SCUNet` (or download the zip file from GitHub)
- Install PyTorch with CUDA from https://pytorch.org/get-started/locally/
- Install the required packages: `pip install -r requirements.txt` (a quick sanity check follows this list)
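Once those steps are done, you can confirm that the CUDA build of PyTorch actually sees your GPU. This is just a standard PyTorch check, not part of the repository:

```python
import torch

# Should print a CUDA-enabled version string and True if the GPU is usable.
print(torch.__version__)
print(torch.cuda.is_available())
```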
TensorRT support has been added thanks to pifroggi! Refer to the TensorRT guide on how to use it.
For easy use with PyTorch or ONNX, you can launch the graphical interface with `python vsr_gui.py`.
- Input Video: Drop your video file or click to browse and select one.
- Model Selection: Choose the model you wish to use. Use PyTorch unless you know what you're doing.
- Output: Specify where to save the upscaled video.
- Options:
  - Presize: Resizes the video by the model scale, so if you're using a 2x model, it would downscale the video by 50%. This can improve quality in some cases, and always improves performance.
  - Video Codec: Select the codec to use. Just use `libx264` if you're unsure.
- Processing: Click "Process Video" to start upscaling. A progress bar will show the status, and you can monitor the details in the log area.
- Stop Processing: If needed, you can stop the upscale. This is currently broken.
That's it!
To get started from the command line, simply:
- Clone this repository with `git clone https://github.com/Kim2091/SCUNet` (or download the zip)
- Install PyTorch with CUDA: https://pytorch.org/get-started/locally/
- Run `pip install -r requirements.txt`
- Use one of the following:
  - Video/Temporal Inference:
    python test_vsr.py --model_path pretrained_models/2x_eula_anifilm_vsr.pth --input example/lr/ --output example/sr/ --depth 16
  - Single-Image Inference:
    python test_sisr.py --model_path pretrained_models/scunet_color_real_psnr.pth --input example/lr/ --output example/sr/ --depth 16
If a folder of images is provided as input, they all must match in resolution.
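If you want to verify this before starting a long run, a small sketch like the following will flag mismatched frames. It assumes the frames are PNG files in `example/lr/` and that Pillow is installed; neither assumption comes from this repository.

```python
from pathlib import Path
from PIL import Image

# Collect the (width, height) of every frame in the input folder.
sizes = {Image.open(p).size for p in sorted(Path("example/lr/").glob("*.png"))}
if len(sizes) > 1:
    raise SystemExit(f"Input frames have mixed resolutions: {sizes}")
print("All frames share the same resolution:", sizes.pop())
```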
Both architectures support image inputs with video output and vice versa. The input and output arguments can be a path to a single image, a folder of images, or a video file. To output to a video, the `--video` argument must be provided to select the output video codec. Additional ffmpeg arguments such as `--profile`, `--preset`, `--crf`, and `--pix_fmt` can also be provided if desired. In the original repository by eula, the `--res` argument was required; this is no longer necessary, as the script calculates the output resolution for you.
Additionally, the `--presize` argument can be used to resize the input to the target resolution divided by the scale, which can produce better results when the output resolution falls short of the target resolution or when the original aspect ratio does not match the target aspect ratio.
python test_vsr.py --model_path pretrained_models/tscu_2x.pth --input example/lr_video.mp4 --output example/sr_video.mp4 --video libx264 --presize
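To make the arithmetic behind `--presize` concrete, here is an illustrative calculation with made-up numbers (the script performs this for you):

```python
# Hypothetical example: a 2x model with a 1920x1080 target resolution.
scale = 2
target_w, target_h = 1920, 1080

# --presize first resizes the input to target / scale, so the model's
# upscale lands exactly on the target resolution.
presize_w, presize_h = target_w // scale, target_h // scale
print(presize_w, presize_h)                   # 960 540
print(presize_w * scale, presize_h * scale)   # 1920 1080
```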
You can also convert an FP32 version of the model to FP16 using `pth_fp32_to_fp16.py`. However, this is currently experimental and comes at the cost of visual quality.
python pth_fp32_to_fp16.py --model path/to/model.pth --output path/to/output.pth
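Conceptually, the conversion boils down to casting the model's floating-point weights to half precision. A minimal sketch of that idea (not the actual script, and assuming the checkpoint is a plain state dict of tensors) looks like this:

```python
import torch

# Load an FP32 checkpoint, cast floating-point tensors to FP16, and save it back.
state = torch.load("path/to/model.pth", map_location="cpu")
half = {
    k: v.half() if isinstance(v, torch.Tensor) and v.is_floating_point() else v
    for k, v in state.items()
}
torch.save(half, "path/to/output.pth")
```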
TensorRT support has been added; the main reason to convert a model to ONNX is for use with TensorRT.
Please note that on April 7, 2025, a breaking change was pushed that renders all existing TSCUNet ONNX models incompatible. You must re-convert them using the updated `convert_to_onnx.py` script.
- Convert the PyTorch model to ONNX format:
  python convert_to_onnx.py --model pretrained_models/model.pth --output model.onnx --dynamic
Optional arguments:
- `--dynamic`: Outputs a dynamic ONNX model, good for processing input videos of various sizes (you can verify the result with the check after this list)
- `--height`, `--width`: Outputs a static ONNX model, good for upscaling videos of a specific resolution. Specify the input dimensions (e.g. 256)
- `--batch`: Set the batch size. Don't mess with this (default: 1)
- `--no-optimize`: Disable the optimization wrapper
- `--fp16`: Converts the model to FP16, which provides a speed boost
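If you want to confirm whether an exported model ended up dynamic or static, you can inspect its input shapes with ONNX Runtime (an optional check, not part of the conversion script):

```python
import onnxruntime as ort

# Dynamic axes show up as symbolic names (strings) instead of fixed integers.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
for inp in session.get_inputs():
    print(inp.name, inp.shape)
```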
- Run the model:
For upscaling video:
python test_onnx.py --model_path model.onnx --input path/to/video.mp4 --output path/to/output.mp4
For upscaling images (untested):
python test_onnx.py --model_path model.onnx --input example/lr/ --output example/sr/
The test script supports both image and video inputs/outputs, similar to the PyTorch testing scripts. Additional arguments:
- `--video`: Specify the video codec for video output (e.g., 'h264_nvenc', 'libx264')
- `--res`: Output video resolution
- `--fps`: Output frame rate
- `--presize`: Resize the input before processing
- `--providers`: ONNX Runtime execution providers (default: 'CUDAExecutionProvider,CPUExecutionProvider'); see the sketch after this list
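The `--providers` value is a comma-separated list in priority order, matching what ONNX Runtime itself accepts. As a rough sketch of how such a string maps onto an inference session (illustrative only, not the test script itself):

```python
import onnxruntime as ort

# Split the comma-separated provider string into the list ONNX Runtime expects.
providers = "CUDAExecutionProvider,CPUExecutionProvider".split(",")
session = ort.InferenceSession("model.onnx", providers=providers)

# Shows which providers ONNX Runtime actually loaded for this session.
print(session.get_providers())
```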
[Paper]
@article{zhang2022practical,
  title={Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis},
  author={Zhang, Kai and Li, Yawei and Liang, Jingyun and Cao, Jiezhang and Zhang, Yulun and Tang, Hao and Timofte, Radu and Van Gool, Luc},
  journal={arXiv preprint},
  year={2022}
}