This is the code repository for our paper *MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification*, published at EMNLP 2024.
The images and labels for the PrideMM dataset are available here (Warning: Insensitive content).
The label encodings for the four classification tasks in PrideMM are as follows:

**Hate Classification**

Class | Label |
---|---|
No Hate | 0 |
Hate | 1 |

**Target Classification**

Class | Label |
---|---|
Undirected | 0 |
Individual | 1 |
Community | 2 |
Organization | 3 |

**Stance Classification**

Class | Label |
---|---|
Neutral | 0 |
Support | 1 |
Oppose | 2 |

**Humor Classification**

Class | Label |
---|---|
No Humor | 0 |
Humor | 1 |
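For reference, the same mappings can be written as Python dictionaries (the dictionary names below are illustrative and are not defined in this repository):

```python
# Label encodings from the tables above (names are illustrative only).
HATE_LABELS   = {"No Hate": 0, "Hate": 1}
TARGET_LABELS = {"Undirected": 0, "Individual": 1, "Community": 2, "Organization": 3}
STANCE_LABELS = {"Neutral": 0, "Support": 1, "Oppose": 2}
HUMOR_LABELS  = {"No Humor": 0, "Humor": 1}
```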
All experimental settings can be changed through a single file: configs.py.
Directory and file paths can be set in the following variables (a minimal sketch of configs.py follows the list):
- cfg.root_dir
- cfg.img_folder
- cfg.info_file
- cfg.checkpoint_path
- cfg.checkpoint_file
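As a rough illustration, the path settings might be filled in like this (the SimpleNamespace container and the example paths are assumptions, not the actual contents of configs.py):

```python
from types import SimpleNamespace

# Hypothetical sketch of the path settings in configs.py; adjust to your setup.
cfg = SimpleNamespace()
cfg.root_dir = "/path/to/MemeCLIP"                      # repository root
cfg.img_folder = "/path/to/PrideMM/images"              # folder with meme images
cfg.info_file = "/path/to/PrideMM/labels.csv"           # CSV with image path, text, and label
cfg.checkpoint_path = "/path/to/checkpoints"            # directory for saving checkpoints
cfg.checkpoint_file = "/path/to/checkpoints/best.ckpt"  # checkpoint loaded for testing
```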
To train, validate, and test MemeCLIP, set cfg.test_only = False and run main.py.
To test MemeCLIP, set cfg.test_only = True and run main.py.
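Continuing the sketch above, the two modes differ only in this flag (a sketch, not the literal contents of configs.py):

```python
# In configs.py:
cfg.test_only = False   # train, validate, and then test
# cfg.test_only = True  # skip training and only evaluate a trained checkpoint

# Then launch the pipeline from the repository root:
#   python main.py
```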
CSV files are expected to contain the image path, the meme text, and the label for each sample; the columns may appear in any order.
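A quick sanity check of the annotation file might look like this (the path and the pandas-based loading are assumptions; the column names are not specified beyond the line above):

```python
import pandas as pd

# Illustrative only: inspect the annotation CSV pointed to by cfg.info_file.
df = pd.read_csv("/path/to/PrideMM/labels.csv")
print(df.columns.tolist())  # expect columns for the image path, text, and label
print(df.head())
```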
Pre-trained weights for MemeCLIP (Hate Classification Task) are available here.
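A minimal sketch of inspecting the released checkpoint with PyTorch (the filename is a placeholder, and how main.py actually restores the weights may differ):

```python
import torch

# Illustrative only: load the hate-classification checkpoint onto the CPU.
checkpoint = torch.load("memeclip_hate.ckpt", map_location="cpu")
print(type(checkpoint))  # typically a dict or state_dict-like object
```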
Please cite our work as:

@article{shah2024memeclip,
  title={MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification},
  author={Shah, Siddhant Bikram and Shiwakoti, Shuvam and Chaudhary, Maheep and Wang, Haohan},
  journal={arXiv preprint arXiv:2409.14703},
  year={2024}
}
OR
Siddhant Bikram Shah, Shuvam Shiwakoti, Maheep Chaudhary, and Haohan Wang. 2024. MemeCLIP: Leveraging CLIP Representations for Multimodal Meme Classification. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 17320–17332, Miami, Florida, USA. Association for Computational Linguistics.