kazemihabib/Sexism_Detection

Sexism Detection

This project, developed as part of the Natural Language Processing (NLP) course at the University of Bologna (UniBo), addresses the problem of sexism detection in text. The goal is to classify whether a given text (here, a tweet) contains or describes sexist expressions or behaviors. The project explores a range of modern NLP techniques, including LSTM-based models, Transformer-based models, and Large Language Models (LLMs).


Table of Contents

  1. Introduction
  2. Approaches
  3. Contributors
  4. License

Introduction

The project is divided into two assignments, each applying a different family of models to sexism detection:

  1. Assignment 1: Focuses on LSTM-based models and Transformer-based models for sexism detection.

    Dataset: A small version of the EXIST dataset (see its GitHub repository).

  2. Assignment 2: Explores Large Language Models (LLMs) for zero-shot and few-shot prompting for sexism detection.

    Dataset: A small test-set version of the EDOS dataset (see its GitHub repository).


Approaches

Assignment 1: LSTM and Transformer-Based Models

LSTM-Based Models

Three LSTM-based models were implemented:

  1. Baseline Model: A Bidirectional LSTM with a Dense layer on top.
  2. Model 1: Extends the Baseline with an additional LSTM layer.
  3. Model 2: Uses two LSTM layers with the same hidden dimension.
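As a rough illustration of the Baseline and a stacked variant, here is a minimal PyTorch sketch. It is not the repository's code: the framework, the class name `LSTMClassifier`, the embedding and hidden sizes, the vocabulary size, and the choice to make the extra layer bidirectional are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Bidirectional LSTM encoder with a Dense (linear) layer on top.

    Hypothetical sketch: all sizes below are illustrative assumptions.
    """

    def __init__(self, vocab_size, embed_dim=50, hidden_dim=64, num_layers=1):
        super().__init__()
        # Index 0 is reserved for padding tokens.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(
            embed_dim,
            hidden_dim,
            num_layers=num_layers,
            batch_first=True,
            bidirectional=True,
        )
        # Forward and backward states are concatenated, hence 2 * hidden_dim.
        self.classifier = nn.Linear(2 * hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)      # (batch, seq, embed_dim)
        _, (h_n, _) = self.lstm(embedded)         # (layers * 2, batch, hidden_dim)
        # Final hidden states of the last layer: forward, then backward.
        final = torch.cat([h_n[-2], h_n[-1]], dim=1)
        return self.classifier(final).squeeze(-1)  # one logit per text


# Baseline: a single bidirectional LSTM layer.
baseline = LSTMClassifier(vocab_size=1000, num_layers=1)
# Stacked variant: one additional LSTM layer on top of the Baseline.
stacked = LSTMClassifier(vocab_size=1000, num_layers=2)

batch = torch.randint(1, 1000, (4, 12))  # 4 dummy texts, 12 token ids each
logits = baseline(batch)
```

The model returns raw logits, so training would typically pair it with `nn.BCEWithLogitsLoss`. For simplicity the sketch does not pack padded sequences, which a real pipeline would likely want to handle.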

Transformer-Based Models

The project fine-tuned the Twitter-roBERTa-base for Hate Speech Detection model, available on Hugging Face, for sexism detection. Starting from a transformer that was pre-trained on tweets and already adapted to hate speech detection gives the classifier a domain-relevant initialization for this task.


Assignment 2: LLM-Based Models

This part of the project applies Large Language Models (LLMs) to sexism detection using zero-shot and few-shot prompting.

The following LLMs were used:

  • Mistral-7B-Instruct-v0.3
  • Phi-3.5-mini-instruct
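To make the difference between the two prompting modes concrete, the sketch below builds a zero-shot prompt (instruction only) and a few-shot prompt (instruction plus labeled demonstrations) and maps a model reply to a binary label. The instruction wording, the function names, and the YES/NO answer format are assumptions made for illustration, not the prompts used in the project; the demonstration texts are placeholders.

```python
INSTRUCTION = (
    "You are an annotator for sexism detection. "
    "Answer only YES if the text is sexist, otherwise answer NO."
)


def build_zero_shot_prompt(text: str) -> str:
    """Instruction plus the text to classify, with no demonstrations."""
    return f"{INSTRUCTION}\n\nTEXT: {text}\nANSWER:"


def build_few_shot_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Same instruction, preceded by labeled (text, answer) demonstrations."""
    demos = "\n\n".join(f"TEXT: {t}\nANSWER: {a}" for t, a in examples)
    return f"{INSTRUCTION}\n\n{demos}\n\nTEXT: {text}\nANSWER:"


def parse_answer(response: str) -> int:
    """Map the model's free-text reply to a label (1 = sexist, 0 = not)."""
    return 1 if response.strip().lower().startswith("yes") else 0


# Placeholder demonstrations; real ones would be drawn from labeled data.
demonstrations = [
    ("<example text labeled sexist>", "YES"),
    ("<example text labeled not sexist>", "NO"),
]

zero_shot = build_zero_shot_prompt("<text to classify>")
few_shot = build_few_shot_prompt("<text to classify>", demonstrations)
```

Either prompt string would then be passed to the chosen LLM (for instance through its chat template), and `parse_answer` applied to the generated text. Treating any reply that does not start with "yes" as the negative class is a simplifying assumption.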

Contributors

  • Habib Kazemi
  • Hesam Sheikh Hassani
  • Ehsan Ramezani

License

This project is licensed under the MIT License. See the LICENSE file for details.