🧠 Punctuation Restoration Model

This repository contains a fine-tuned transformer-based model for restoring punctuation in automatic speech recognition (ASR) output and other spoken transcripts. The model inserts six missing punctuation marks (period, comma, question mark, exclamation mark, colon, and semicolon) to improve readability and downstream NLP performance.

✨ Features

Fine-tuned on diverse spoken text data: a Wikipedia corpus, Hugging Face datasets, podcast transcripts, and manually created YouTube captions from TED Talks and interviews
Restores six marks: ; : ! ? , . (including the less common ; and :)
Built on top of google/bert_uncased_L-4_H-256_A-4
Easy to plug into any transcript-cleaning pipeline
Does not restore capitalization, and expects input transcripts that contain none of the marks ; : ! ? , .
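
Because the model expects input with none of the six marks, a transcript that already contains some punctuation needs to be stripped first. The snippet below is a minimal sketch of such a pre-processing step; the function name `strip_punctuation` is hypothetical and not part of this repository, and lowercasing is an assumption based on the base model being uncased.

```python
import re

# The six marks this README lists as model outputs.
PUNCTUATION = ";:!?,."


def strip_punctuation(text: str) -> str:
    """Remove the six supported marks, collapse whitespace, and lowercase.

    Hypothetical helper, not part of this repository.
    """
    # Replace each mark with a space so "hello,world" does not fuse into one word.
    without_marks = re.sub(f"[{re.escape(PUNCTUATION)}]", " ", text)
    # Collapse runs of whitespace left behind by the substitution.
    collapsed = " ".join(without_marks.split())
    # Assumption: lowercase to match the uncased base model.
    return collapsed.lower()


print(strip_punctuation("Hello, world! How are you?"))
```

Note that this also splits decimal numbers and abbreviations (for example, "3.14" becomes "3 14"), so a real pipeline may want to handle those cases separately.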

📦 Installation

Follow these steps to install and run the punctuation restoration model locally.

1. Clone the repository

git clone https://github.com/yyihaoc/punctuate-bert-mini.git
cd punctuate-bert-mini

2. Set up the virtual environment

python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

3. Install dependencies

pip install -r requirements.txt

4. Run the model

Open test_result_bert_mini.py and replace the example input with your own text. Then run:

python test_result_bert_mini.py
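
To plug the output into a pipeline, per-word predictions have to be merged back into a single string. The following is a minimal sketch of that step under stated assumptions: it supposes the model yields one label per word, where the label is either the mark to append or a placeholder `"O"` for no punctuation. The function `apply_labels` and this label scheme are hypothetical; the scripts in this repository may represent predictions differently.

```python
from typing import List


def apply_labels(words: List[str], labels: List[str], no_punct: str = "O") -> str:
    """Append each predicted mark to its word and join the result.

    Hypothetical helper: assumes one label per word, where the label is
    either a punctuation mark or `no_punct`.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    pieces = []
    for word, label in zip(words, labels):
        # Leave the word unchanged when no punctuation was predicted.
        pieces.append(word if label == no_punct else word + label)
    return " ".join(pieces)


words = ["hello", "how", "are", "you"]
labels = [",", "O", "O", "?"]
print(apply_labels(words, labels))
```

Capitalization is left untouched here, consistent with the model not restoring it.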

