
Getting Started with Mask2Former

This document provides a brief introduction to the usage of Mask2Former.

Please see Getting Started with Detectron2 for full usage.

Inference Demo with Pre-trained Models

  1. Pick a model and its config file from model zoo, for example, configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml.
  2. We provide demo.py, which can run a demo with the built-in configs. Run it with:
cd demo/
python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --input input1.jpg input2.jpg \
  [--other-options] \
  --opts MODEL.WEIGHTS /path/to/checkpoint_file

The configs are made for training, so you need to point MODEL.WEIGHTS to a model from the model zoo for evaluation. This command runs inference and shows the visualizations in an OpenCV window.

For details of the command-line arguments, see demo.py -h or look at its source code to understand its behavior. Some common arguments are listed below, followed by an example command:

  • To run on your webcam, replace --input files with --webcam.
  • To run on a video, replace --input files with --video-input video.mp4.
  • To run on CPU, add MODEL.DEVICE cpu after --opts.
  • To save outputs to a directory (for images) or a file (for webcam or video), use --output.
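
For example, these options can be combined. The following runs the demo on a video file on CPU and saves the visualization to an output file (video.mp4 and output.mkv are placeholder file names for illustration):

cd demo/
python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --video-input video.mp4 \
  --output output.mkv \
  --opts MODEL.WEIGHTS /path/to/checkpoint_file MODEL.DEVICE cpu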

Training & Evaluation in Command Line

We provide a script, train_net.py, which can train all the configs provided in Mask2Former.

To train a model with "train_net.py", first set up the corresponding datasets following datasets/README.md, then run:

python train_net.py --num-gpus 8 \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml

The configs are made for 8-GPU training. Since we use the AdamW optimizer, it is not clear how to scale the learning rate with the batch size. To train on 1 GPU, you need to choose a learning rate and batch size yourself:

python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH SET_TO_SOME_REASONABLE_VALUE SOLVER.BASE_LR SET_TO_SOME_REASONABLE_VALUE
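
As a purely illustrative sketch (the values below are an assumption, not a recommended recipe, since there is no established rule for scaling AdamW learning rates), one starting point is to scale the 8-GPU settings of this config (batch size 16, base learning rate 0.0001) down linearly and tune from there:

python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0000125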

To evaluate a model's performance, use

python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file

For more options, see python train_net.py -h.

Video instance segmentation

Please use demo_video/demo.py for the video instance segmentation demo, and train_net_video.py to train and evaluate video instance segmentation models.
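
Assuming train_net_video.py accepts the same arguments as train_net.py (the config path below is a placeholder; pick one of the video configs from the model zoo), training and evaluation look like:

python train_net_video.py --num-gpus 8 \
  --config-file /path/to/video_config.yaml

python train_net_video.py \
  --config-file /path/to/video_config.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file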

Getting Started with Cluster2Former

We adjusted the batch sizes, learning rates, iterations, and learning-rate steps to our hardware resources. See more in the Model Zoo.

Find the documentation of the CLUSTER_2_FORMER parameters in the code, e.g. here.

Training & Evaluation in Command Line

To use Cluster2Former with scribble VIS datasets, set the following configuration parameters:

  MODEL.META_ARCHITECTURE: VideoCluster2Former
  INPUT.DATASET_MAPPER_NAME: cluster2former_scribble
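
For example, assuming these settings can be passed as command-line overrides in the same way as the other config options above (the config path is a placeholder), a training run might look like:

python train_net_video.py --num-gpus 8 \
  --config-file /path/to/your_vis_config.yaml \
  MODEL.META_ARCHITECTURE VideoCluster2Former \
  INPUT.DATASET_MAPPER_NAME cluster2former_scribble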

To use Cluster2Former with the full-mask-annotated VIS datasets, set the following configuration parameters:

  MODEL.META_ARCHITECTURE: VideoCluster2Former

To use MaskCluster2Former (the original Mask2Former plus a Similarity-based Clustering loss) with the full-mask-annotated VIS datasets, set the following configuration parameters:

  MODEL.META_ARCHITECTURE: VideoMaskCluster2Former