Language Identification

This was a fun side project in which I trained a few n-gram models on Wikipedia content in different languages, then compared each of them to the n-gram model of the input text to find the closest match.
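
To give a rough idea of the approach, here is a minimal sketch. It is not the code from the notebook: the use of character trigrams, the cosine-similarity comparison, the function names (`char_ngrams`, `cosine_similarity`, `identify_language`), and the tiny sample sentences standing in for Wikipedia text are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumed here: trigrams)."""
    # Lowercase and collapse whitespace so formatting does not affect counts.
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles."""
    # Counter returns 0 for missing keys, so unseen n-grams contribute nothing.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_language(text, profiles, n=3):
    """Return the language whose profile is most similar to the input text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda lang: cosine_similarity(query, profiles[lang]))


# Hypothetical toy training text; the real project uses Wikipedia content.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y luego el perro duerme al sol",
    "german": "der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne",
}
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}

print(identify_language("the dog sleeps", PROFILES))
print(identify_language("el perro duerme", PROFILES))
```

With training text this small the profiles are very sparse, so the sketch only behaves sensibly on inputs that resemble the samples. Larger corpora give much more reliable profiles.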

To try it out, clone this repo and run the included notebook locally. It should be self-explanatory.