JaesungHuh/voice-gender-classifier

Voice gender classifier

  • This repo contains the inference code for a pretrained human voice gender classifier.

Installation

First, clone this repository

git clone https://github.com/JaesungHuh/voice-gender-classifier.git

and install the packages via pip.

cd voice-gender-classifier
pip install -r requirements.txt

Usage

import torch

from model import ECAPA_gender

# You can download the model directly from the Hugging Face model hub
model = ECAPA_gender.from_pretrained("JaesungHuh/voice-gender-classifier")
model.eval()

# Move the model to the GPU if one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the audio file and use predict function to directly get the output
example_file = "data/00001.wav"
with torch.no_grad():
    output = model.predict(example_file, device=device)
    print("Gender : ", output)
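To classify several recordings in one go, the `predict` call above can be wrapped in a small loop. The following is a minimal sketch and not part of this repository: `predict_folder` is a hypothetical helper, and it assumes only that `predict_fn` accepts a file path and returns a label, as `model.predict` does in the example above.

```python
from pathlib import Path
from typing import Callable, Dict


def predict_folder(
    predict_fn: Callable[[str], object],
    folder: str,
    pattern: str = "*.wav",
) -> Dict[str, object]:
    """Apply predict_fn to every file in `folder` matching `pattern`.

    Returns a dict mapping each file name to whatever predict_fn returned.
    (Hypothetical helper, for illustration only.)
    """
    results: Dict[str, object] = {}
    # Sort the paths so the output order is deterministic across runs.
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = predict_fn(str(path))
    return results
```

With the model loaded as above, this could be called inside the `torch.no_grad()` block as `predict_folder(lambda p: model.predict(p, device=device), "data")`.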

Pretrained weights

For those who need pretrained weights, please download them here.

Training details

State-of-the-art speaker verification models already produce good representations of the speaker's gender.

I used the pretrained ECAPA-TDNN from TaoRuijie's repository, added one linear layer to make a two-class classifier, and fine-tuned the model on the VoxCeleb2 dev set.
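The idea of putting one linear layer on top of a speaker encoder can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the code used to train this model: `GenderHead` and `DummyEncoder` are hypothetical names, the stand-in encoder takes the place of the pretrained ECAPA-TDNN, and the 192-dimensional embedding size, the label encoding, and the cross-entropy loss are all assumptions.

```python
import torch
import torch.nn as nn


class DummyEncoder(nn.Module):
    """Stand-in for a pretrained speaker encoder (hypothetical, for illustration)."""

    def __init__(self, emb_dim: int = 192):  # embedding size is an assumption
        super().__init__()
        self.proj = nn.Linear(1, emb_dim)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav: (batch, samples) -> mean over time (batch, 1) -> (batch, emb_dim)
        return self.proj(wav.mean(dim=1, keepdim=True))


class GenderHead(nn.Module):
    """Speaker encoder followed by one linear layer that outputs two class logits."""

    def __init__(self, encoder: nn.Module, emb_dim: int = 192):
        super().__init__()
        self.encoder = encoder
        self.fc = nn.Linear(emb_dim, 2)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        return self.fc(self.encoder(wav))


model = GenderHead(DummyEncoder())
wav = torch.randn(4, 16000)          # four random clips of 16000 samples each
labels = torch.tensor([0, 1, 1, 0])  # assumed label encoding, for illustration only
logits = model(wav)                  # shape: (4, 2)
loss = nn.functional.cross_entropy(logits, labels)  # assumed loss, not confirmed by the source
loss.backward()                      # gradients reach both the head and the encoder
```

Because the encoder's parameters are left trainable, the backward pass updates the whole network rather than only the new layer, which is what fine-tuning means here.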

The model achieved 98.7% accuracy on the VoxCeleb1 identification test split.

Caveat

The training dataset I used for this model (VoxCeleb) may not represent the global human population. Please be mindful of unintended biases when using this model.
