Malaya is a Natural-Language-Toolkit library for Bahasa Malaysia, powered by TensorFlow deep-learning models.
Proper documentation is available at https://malaya.readthedocs.io/
CPU version
$ pip install malaya
GPU version
$ pip install malaya-gpu
Only Python 3.6 and above, and TensorFlow 1.10 and above (but not 2.0), are supported.
Emotion Analysis
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to build deep emotion analysis models.
Entities Recognition
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to perform Named Entity Recognition.
Language Detection
Fast-text and sparse deep-learning models to classify Malay (formal and social media), Indonesian (formal and social media), Rojak language and Manglish.
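The fastText-style detector above essentially compares bags of character n-grams against per-language profiles. Below is a minimal, self-contained sketch of that idea — not Malaya's actual model, and the tiny profiles are made up for illustration:

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Bag of character trigrams, the core feature of fastText-style detectors."""
    padded = f"<{text.lower()}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def detect(text, profiles):
    """Pick the language whose n-gram profile overlaps the text the most."""
    grams = char_ngrams(text)
    return max(profiles, key=lambda lang: sum((grams & profiles[lang]).values()))

# Hypothetical tiny profiles built from a few words per language; a real
# detector trains weights over millions of sentences instead.
profiles = {
    "malay": char_ngrams("saya suka makan nasi lemak di kedai"),
    "english": char_ngrams("i like to eat fried rice at the shop"),
}
print(detect("saya nak makan", profiles))
```

A real detector learns weighted n-gram embeddings rather than counting overlaps, but the featurization is the same.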
Normalizer
Local Malaysian NLP research combined with Transformer models to normalize any Bahasa text.
Num2Word
Convert from numbers to cardinal or ordinal representation.
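To make the cardinal/ordinal distinction concrete, here is a self-contained sketch of the conversion for Malay numbers up to 999. This is an illustration of the rule structure, not Malaya's Num2Word module, which covers a much larger range:

```python
UNITS = ["kosong", "satu", "dua", "tiga", "empat",
         "lima", "enam", "tujuh", "lapan", "sembilan"]

def to_cardinal(n):
    """Spell out 0-999 in Malay (illustrative subset)."""
    if n < 10:
        return UNITS[n]
    if n == 10:
        return "sepuluh"
    if n == 11:
        return "sebelas"
    if n < 20:
        return UNITS[n - 10] + " belas"
    if n < 100:
        tens, rest = divmod(n, 10)
        word = UNITS[tens] + " puluh"
        return word if rest == 0 else word + " " + to_cardinal(rest)
    hundreds, rest = divmod(n, 100)
    word = "seratus" if hundreds == 1 else UNITS[hundreds] + " ratus"
    return word if rest == 0 else word + " " + to_cardinal(rest)

def to_ordinal(n):
    """Malay ordinals: 'pertama' for 1, otherwise the 'ke-' form."""
    return "pertama" if n == 1 else "ke" + to_cardinal(n)

print(to_cardinal(21))   # dua puluh satu
print(to_ordinal(3))     # ketiga
```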
Part-of-Speech Recognition
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to perform Part-of-Speech tagging.
Dependency Parsing
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to perform Dependency Parsing.
Relevancy Analysis
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to build deep relevancy analysis models.
Sentiment Analysis
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to build deep sentiment analysis models.
Spell Correction
Local Malaysian NLP research combined with Transformer models to auto-correct any Bahasa word.
Stemmer
Character-level LSTM Seq2Seq with attention to perform state-of-the-art Bahasa stemming.
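Malaya learns the stemming mapping with a character-level seq2seq model; the sketch below is only a naive rule-based illustration of what that model has to capture — stripping common Malay affixes. The affix lists and the minimum-stem-length heuristic are assumptions chosen for the demo, not the library's logic:

```python
# Common Malay affixes, sorted longest-first so "meng-" wins over "me-".
PREFIXES = sorted(["ber", "ter", "men", "mem", "meng", "me", "di", "ke", "se", "pe"],
                  key=len, reverse=True)
SUFFIXES = sorted(["kan", "lah", "nya", "an", "i"], key=len, reverse=True)

def naive_stem(word):
    """Strip at most one prefix and one suffix, keeping a stem of >= 4 chars."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 4:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 4:
            word = word[:-len(s)]
            break
    return word

print(naive_stem("berjalan"))   # jalan
print(naive_stem("makanan"))    # makan
```

Rules like these break on sound changes (e.g. "menulis" from "tulis"), which is exactly why a learned character-level model is used instead.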
Subjectivity Analysis
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to build deep subjectivity analysis models.
Similarity
Deep encoders, Doc2Vec, BERT, XLNET and ALBERT to build deep semantic similarity models.
Summarization
BERT, XLNET, ALBERT, skip-thought, LDA, LSA and Doc2Vec to give precise unsupervised summarization, with TextRank as the scoring algorithm.
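The TextRank scoring mentioned above runs power iteration over a sentence-similarity graph. Here is a minimal self-contained sketch using bag-of-words cosine similarity as the edge weight — Malaya instead builds the vectors from the deep encoders listed, but the scoring loop is the same idea:

```python
import math
import re
from collections import Counter

def sentence_vector(sentence):
    """Bag-of-words vector; stands in for the deep sentence encodings."""
    return Counter(re.findall(r"\w+", sentence.lower()))

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def textrank(sentences, d=0.85, iters=30):
    """Score sentences by power iteration over the similarity graph."""
    vecs = [sentence_vector(s) for s in sentences]
    n = len(sentences)
    sim = [[cosine(vecs[i], vecs[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out = sum(sim[j])          # total outgoing weight of node j
                if sim[j][i] and out:
                    rank += sim[j][i] / out * scores[j]
            new.append((1 - d) + d * rank)
        scores = new
    return scores

sents = [
    "Malaya provides unsupervised summarization.",
    "Summarization in Malaya is unsupervised and precise.",
    "The weather is sunny today.",
]
scores = textrank(sents)
print(scores)  # the off-topic last sentence scores lowest
```

Top-scoring sentences form the extractive summary.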
Topic Modelling
Provide Attention, LDA2Vec, LDA, NMF and LSA interfaces for easy topic modelling with topic visualization.
Toxicity Analysis
Transfer learning on BERT-Bahasa, XLNET-Bahasa and ALBERT-Bahasa to build deep toxicity analysis models.
Word2Vec
Provide pretrained Word2Vec trained on Bahasa Wikipedia and Bahasa news, with an easy interface and visualization.
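The typical Word2Vec query is a cosine-similarity nearest-neighbour lookup. The toy 3-dimensional vectors below are invented for illustration — the pretrained Wikipedia/news embeddings have hundreds of dimensions — but the lookup logic is the same:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u))
                  * math.sqrt(sum(b * b for b in v)))

# Toy embeddings standing in for the pretrained vectors.
embeddings = {
    "raja":       [0.9, 0.1, 0.0],   # king
    "permaisuri": [0.8, 0.2, 0.1],   # queen
    "kereta":     [0.0, 0.1, 0.9],   # car
}

def nearest(word, k=1):
    """k most cosine-similar words, the usual Word2Vec neighbourhood query."""
    others = [w for w in embeddings if w != word]
    return sorted(others, key=lambda w: -cosine(embeddings[word], embeddings[w]))[:k]

print(nearest("raja"))   # ['permaisuri']
```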
Transformer
Provide an easy interface to load BERT-Bahasa, XLNET-Bahasa, ALBERT-Bahasa and ALXLNET-Bahasa.
If you use our software for research, please cite:
@misc{Malaya,
  author = {Husein, Zolkepli},
  title = {Malaya},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huseinzol05/malaya}}
}
Thanks to Im Big, LigBlou, Mesolitica and KeyReply for sponsoring AWS, Google Cloud and private cloud to train Malaya models.
Thank you for contributing to this library; it really helps a lot. Feel free to contact me with suggestions, or to contribute in other forms: we accept everything, not just code!