The Datuk Corpus

The Datuk corpus is a free and open-source Malayalam–Malayalam dictionary dataset with over 106,000 definitions for more than 83,000 Malayalam words. It is an extensively refined and semanticized version of Datuk's original digitisation work, incorporating tens of thousands of changes. Most words and definitions are grammar-tagged, and many records also carry additional metadata.

Documentation

For documentation and other information, visit http://olam.in/open/datuk

License

Open Database License (ODbL)

Kailash Nadh, May 2013 - http://nadh.in