This repository provides instructions for the annotations used in the paper 'Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks' (https://arxiv.org/abs/1912.09930). We collected annotations for 180,049 videos from the Something-Something Dataset (https://20bn.com/datasets/something-something); they include per-frame bounding-box annotations for each object and hand involved in the human-object interaction in the video.
The annotation file can be downloaded in four parts from https://drive.google.com/open?id=1XqZC2jIHqrLPugPOVJxCH_YWa275PBrZ. It contains a dictionary mapping each video id (the name of the video file) to the list of per-frame annotations. The annotations assume a frame rate of 12 frames per second. An example of a per-frame annotation is shown below: the names and number of "something"s in the frame correspond to the fields 'gt_placeholders' and 'nr_instances', the frame path is given in the field 'name', and 'labels' is a list of object and hand bounding boxes and names.
[
 {'gt_placeholders': ['pillow'],
  'labels': [{'box2d': {'x1': 97.64950730138266,
                        'x2': 427,
                        'y1': 11.889318166856967,
                        'y2': 239.92858832368972},
              'category': 'pillow',
              'standard_category': '0000'},
             {'box2d': {'x1': 210.1160330781122,
                        'x2': 345.4329005999551,
                        'y1': 78.65516045335991,
                        'y2': 209.68758889799403},
              'category': 'hand',
              'standard_category': 'hand'}],
  'name': '2/0001.jpg',
  'nr_instances': 2},
 {...},
 ...
 {...},
]
The annotations for the example videos are a small subset of the full annotation file and can be found in annotations.json.
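For reference, here is a minimal sketch of how the annotation file can be loaded and inspected with the standard json module. It assumes the file parses as ordinary JSON with the structure shown above; annotations.json is the sample file in this repository, and the printed fields are just examples.

```python
import json

# Load the (sample) annotation file: a dictionary mapping each video id
# to its list of per-frame annotations.
with open('annotations.json') as f:
    annotations = json.load(f)

# Inspect the first annotated frame of an arbitrary video.
video_id, frames = next(iter(annotations.items()))
frame = frames[0]
print(video_id, frame['name'], frame['nr_instances'], frame['gt_placeholders'])

# Each entry in 'labels' holds a bounding box plus the object/hand names.
for label in frame['labels']:
    box = label['box2d']
    print(label['category'], label['standard_category'],
          box['x1'], box['y1'], box['x2'], box['y2'])
```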
If you use our annotations in your research or wish to refer to the baseline results, please use the following BibTeX entry.
@inproceedings{CVPR2020_SomethingElse,
title={Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks},
author={Materzynska, Joanna and Xiao, Tete and Herzig, Roei and Xu, Huijuan and Wang, Xiaolong and Darrell, Trevor},
booktitle = {CVPR},
year={2020}
}
The compositional, compositional shuffle, one-class compositional, and few-shot splits of the Something-Something V2 dataset are available in the dataset_splits folder.
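The exact file names and formats inside dataset_splits are not documented here, so a quick exploratory listing (a sketch, nothing repository-specific) can help you see which split files are present before loading them:

```python
import os

# Walk dataset_splits and print every split file with its size; the folder
# layout and file formats are assumptions to be checked against the repo.
for root, _, files in os.walk('dataset_splits'):
    for name in sorted(files):
        path = os.path.join(root, name)
        print(path, os.path.getsize(path), 'bytes')
```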
The videos folder contains example videos from the dataset, together with the selected annotations file (the full file is available on Google Drive). To visualize the videos with annotated bounding boxes, run:
python annotate_videos.py
The annotated videos will be saved in the annotated_videos folder.
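If you prefer to draw the boxes yourself (for example, on individual extracted frames rather than whole videos), the sketch below shows how the box2d coordinates map to pixels using OpenCV. This is not the repository's annotate_videos.py; the frames/ directory layout and the annotated_frame.jpg output name are assumptions made for illustration.

```python
import json
import cv2  # pip install opencv-python

# Illustrative only: draw the annotated boxes on a single extracted frame.
# Assumes frames are stored as JPEGs under frames/<video_id>/... so that
# the 'name' field (e.g. '2/0001.jpg') resolves to an image on disk.
with open('annotations.json') as f:
    annotations = json.load(f)

video_id, frames = next(iter(annotations.items()))
frame_info = frames[0]
image = cv2.imread('frames/' + frame_info['name'])

for label in frame_info['labels']:
    box = label['box2d']
    x1, y1 = int(box['x1']), int(box['y1'])
    x2, y2 = int(box['x2']), int(box['y2'])
    # Hands in green, objects in red (arbitrary choice for this sketch).
    color = (0, 255, 0) if label['standard_category'] == 'hand' else (0, 0, 255)
    cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
    cv2.putText(image, label['category'], (x1, max(y1 - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)

cv2.imwrite('annotated_frame.jpg', image)
```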