This document provides a brief introduction to the usage of Mask2Former.
Please see Getting Started with Detectron2 for full usage.
- Pick a model and its config file from the model zoo, for example, `configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml`.
- We provide `demo.py` that is able to demo builtin configs. Run it with:
```
cd demo/
python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --input input1.jpg input2.jpg \
  [--other-options]
  --opts MODEL.WEIGHTS /path/to/checkpoint_file
```
The configs are made for training, therefore we need to specify `MODEL.WEIGHTS` to a model from the model zoo for evaluation.
This command will run the inference and show visualizations in an OpenCV window.
For details of the command line arguments, see `demo.py -h` or look at its source code to understand its behavior. Some common arguments are:
- To run on your webcam, replace `--input files` with `--webcam`.
- To run on a video, replace `--input files` with `--video-input video.mp4`.
- To run on CPU, add `MODEL.DEVICE cpu` after `--opts`.
- To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.
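For example, a single command combining several of these options might look like the sketch below (assuming the flags can be combined this way; the video, output, and checkpoint paths are placeholders):

```
# Run the demo on a video file, on CPU, and save the result to a file.
cd demo/
python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --video-input video.mp4 \
  --output output.mp4 \
  --opts MODEL.WEIGHTS /path/to/checkpoint_file MODEL.DEVICE cpu
```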
We provide a script `train_net.py` that is made to train all the configs provided in Mask2Former.
To train a model with `train_net.py`, first set up the corresponding datasets following datasets/README.md, then run:
```
python train_net.py --num-gpus 8 \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml
```
The configs are made for 8-GPU training. Since we use the AdamW optimizer, it is not clear how to scale the learning rate with the batch size. To train on 1 GPU, you need to figure out a suitable learning rate and batch size yourself:
```
python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH SET_TO_SOME_REASONABLE_VALUE SOLVER.BASE_LR SET_TO_SOME_REASONABLE_VALUE
```
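Purely as an illustration (not a value recommended by this repository), one common heuristic is to scale both linearly with the number of GPUs, i.e. divide the 8-GPU batch size of 16 (the "bs16" in the config name) and the config's base learning rate by 8, and then tune from there:

```
# Illustrative single-GPU override using linear scaling; take the learning rate
# from your config, divide it by 8, and tune both values for your hardware.
python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR <config_base_lr_divided_by_8>
```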
To evaluate a model's performance, use
```
python train_net.py \
  --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file
```
For more options, see `python train_net.py -h`.
Please use `demo_video/demo.py` for the video instance segmentation demo and `train_net_video.py` to train and evaluate video instance segmentation models.
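Assuming `train_net_video.py` follows the same command-line interface as `train_net.py` (an assumption here; the config path below is a placeholder), training and evaluation would look like:

```
# Train a video instance segmentation model (the config path is a placeholder).
python train_net_video.py --num-gpus 8 \
  --config-file /path/to/video_config.yaml

# Evaluate a trained checkpoint.
python train_net_video.py \
  --config-file /path/to/video_config.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file
```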
We adjusted the batch sizes, learning rates, iterations, and learning rate steps to our hardware resources. See more in the Model Zoo.
Find the documentation of the `CLUSTER_2_FORMER` parameters in the code, e.g. here.
To use Cluster2Former with scribble VIS datasets, set the following configuration parameters:
```
MODEL.META_ARCHITECTURE: VideoCluster2Former
INPUT.DATASET_MAPPER_NAME: cluster2former_scribble
```
To use Cluster2Former with the full mask annotated VIS datasets, set the following configuration parameters:
```
MODEL.META_ARCHITECTURE: VideoCluster2Former
```
To use MaskCluster2Former (the original Mask2Former plus a Similarity-based Clustering loss) with the full mask annotated VIS datasets, set the following configuration parameters:
```
MODEL.META_ARCHITECTURE: VideoMaskCluster2Former
```
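These parameters can either be set in a config file or, assuming they are exposed as detectron2-style overrides (an assumption here; the config path is a placeholder), passed on the command line, for example:

```
# Train Cluster2Former on a scribble-annotated VIS dataset via command-line overrides.
python train_net_video.py --num-gpus 8 \
  --config-file /path/to/video_config.yaml \
  MODEL.META_ARCHITECTURE VideoCluster2Former \
  INPUT.DATASET_MAPPER_NAME cluster2former_scribble
```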