GloVe Model for Igala Language

Overview

This repository contains the code and resources for training a GloVe (Global Vectors for Word Representation) model on Igala, a low-resource language. The project includes scripts for preprocessing the text data, training the GloVe model, evaluating the resulting word embeddings, and visualizing the embeddings using techniques such as t-SNE.
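As a rough illustration of the data that a GloVe model is trained on, the sketch below tokenizes text, builds distance-weighted co-occurrence counts, and applies the weighting function from the original GloVe formulation. It is a minimal sketch, not the notebook's actual code: the function names (`tokenize`, `cooccurrence_counts`, `glove_weight`) and the window size are assumptions chosen for illustration, and `x_max=100`, `alpha=0.75` are the defaults from the original GloVe paper rather than values confirmed for this project.

```python
import string
from collections import defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and trim ASCII punctuation from word edges.

    Only the edges are trimmed, so tone marks and other combining
    characters inside a word are left untouched.
    """
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(string.punctuation)
        if word:
            tokens.append(word)
    return tokens


def cooccurrence_counts(tokens, window=2):
    """Return symmetric co-occurrence counts, weighted by 1/distance.

    `window` (hypothetical default) is how many preceding tokens count
    as context for each position.
    """
    counts = defaultdict(float)
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), i):
            context = tokens[j]
            weight = 1.0 / (i - j)  # closer words contribute more
            counts[(center, context)] += weight
            counts[(context, center)] += weight
    return counts


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting f(x): damps rare pairs and caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0
```

Each training step would then fit word and context vectors so that their dot product (plus biases) approximates the logarithm of the co-occurrence count, with each squared error scaled by `glove_weight`.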

Features

Data Preprocessing: Tools for cleaning and tokenizing Igala text data.
GloVe Model Training: Implementation of a GloVe model using PyTorch.
Evaluation: Scripts to evaluate word embeddings using cosine similarity, intrinsic evaluation methods, and extrinsic tasks.
Visualization: Interactive visualization of word embeddings using t-SNE and Plotly.
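The cosine-similarity evaluation listed above can be sketched as a nearest-neighbor lookup over a word-to-vector mapping. This is an illustrative sketch under stated assumptions: the helper names (`cosine_similarity`, `nearest_neighbors`) and the toy vectors with placeholder keys are hypothetical, and plain Python lists are used for clarity where the notebook presumably works with NumPy arrays or PyTorch tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, embeddings, k=3):
    """Return the k words most similar to `word`, best first."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional vectors with placeholder keys (not real Igala embeddings).
toy_embeddings = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
}

if __name__ == "__main__":
    print(nearest_neighbors("w1", toy_embeddings, k=2))
```

With trained embeddings, inspecting the neighbors of a handful of known words is a quick sanity check before running any larger intrinsic or extrinsic evaluation.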

Getting Started

Prerequisites

Before running the code, ensure you have the following dependencies installed:

Python 3.x
PyTorch
NumPy
Matplotlib
Scikit-learn
Plotly
Pandas
GPU (hardware, not a pip package)

You can install the Python packages using pip:

pip install torch numpy matplotlib scikit-learn plotly pandas
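After installation, a short check like the one below can confirm that the packages are importable and whether PyTorch can see a CUDA GPU. It is a hedged sketch: the `REQUIRED` mapping and the `check_environment` helper are names invented for this example, and the only assumption about the project is the dependency list above.

```python
import importlib.util

# pip package name -> importable module name (scikit-learn imports as sklearn).
REQUIRED = {
    "torch": "torch",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "plotly": "plotly",
    "pandas": "pandas",
}


def check_environment():
    """Report which dependencies are installed and whether a CUDA GPU is visible."""
    status = {
        pip_name: importlib.util.find_spec(module) is not None
        for pip_name, module in REQUIRED.items()
    }
    gpu_available = False
    if status["torch"]:
        import torch  # imported lazily so the check works without PyTorch

        gpu_available = torch.cuda.is_available()
    status["gpu"] = gpu_available
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

Any entry reported as `False` points to a package that still needs to be installed, or, for `gpu`, to a machine where PyTorch would fall back to the CPU.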

Repository Files

.gitignore
GLoveModel.ipynb
README.md
final_processed_wikipedia_all_pages_contents.txt
glove_embeddings.txt
glove_model_wiki.pth
igala_embedding_table.csv

Repository: Joshuaatanu/GloVeModel
