Punctuator by @ottokart

Bomoda uses a model from @ottokart to restore missing punctuation in documents, especially radio transcripts. The original project is at https://github.com/ottokart/punctuator2

How to use

First, install the required Python packages:

pip install -r requirements.txt

Then, in a Python session:

from nltk.tokenize import TweetTokenizer

from lib.punctuator import Punctuator

# Define your own tokenize function; this example uses NLTK's TweetTokenizer.
tknzr = TweetTokenizer()

P = Punctuator(tokenize_func=tknzr.tokenize)
P.load()
P.punctuate(u"hi this is the best-looking guy on globe why you laugh get lost")
# The return value will look like:
# u"hi, this is the best-looking guy on globe, why you laugh get lost? "
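
If you would rather not depend on NLTK, a plain function can probably stand in for the tokenizer. The sketch below assumes that tokenize_func only needs to accept a string and return a list of token strings, as TweetTokenizer.tokenize does; the name simple_tokenize is hypothetical and not part of this repository.

```python
import re

# Hypothetical helper, not part of this repository.
# Words may contain internal hyphens or apostrophes ("best-looking", "don't");
# any other non-space symbol becomes its own token.
_TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]", flags=re.UNICODE)


def simple_tokenize(text):
    """Split text into a list of word and symbol tokens."""
    return _TOKEN_RE.findall(text)


print(simple_tokenize(u"hi this is the best-looking guy"))
```

Under that assumption, it would be passed the same way as the NLTK tokenizer: Punctuator(tokenize_func=simple_tokenize).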
