In this paper, we discriminate the Baybayin script, a pre-colonial writing system used in the Philippines, from the Latin script at the character level. The proposed algorithm uses four main Support Vector Machine (SVM) classifiers, one for each of the following tasks: distinguishing Baybayin from Latin script, classifying Baybayin characters, classifying Latin characters, and classifying Baybayin diacritical marks. Because the method emphasizes the recognition of Baybayin characters, we tested the algorithm on the set of images in (1) that satisfy our system assumptions. We also include the code used to generate these classifiers from the datasets in (2), (3), and (4) for Baybayin, Latin, and Baybayin diacritic characters, respectively. Finally, we discuss the strengths and limitations of the system, its experimental results, and recommendations for further research.
Dataset links:
(1) https://www.kaggle.com/jamesnogra/baybayn-baybayin-handwritten-images
(2) https://www.kaggle.com/rodneypino/baybayin-and-latin-binary-images-in-mat-format?select=Baybayin
(3) https://www.kaggle.com/rodneypino/baybayin-and-latin-binary-images-in-mat-format?select=Latin
(4) https://www.kaggle.com/rodneypino/baybayin-and-latin-binary-images-in-mat-format?select=Baybayin+Diacritics
The full paper is available at https://peerj.com/articles/cs-360/.
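
To make the role of the four classifiers concrete, the following is a minimal Python sketch of one way they could be chained: a script classifier first decides between Baybayin and Latin, and the result selects which character classifier to apply. The routing order, the `ScriptCascade` class, the label strings, and the toy stand-in classifiers are all assumptions made for illustration; they are not taken from the paper, and in practice each callable would wrap a trained SVM's prediction function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Features = Sequence[float]            # hypothetical feature vector for one glyph
Classifier = Callable[[Features], str]  # stand-in for a trained SVM's predict


@dataclass
class ScriptCascade:
    """Hypothetical wiring of the four classifiers named in the abstract."""
    script_clf: Classifier     # Baybayin vs. Latin
    baybayin_clf: Classifier   # Baybayin character labels
    latin_clf: Classifier      # Latin character labels
    diacritic_clf: Classifier  # Baybayin diacritic labels

    def classify(
        self, glyph: Features, diacritic: Optional[Features] = None
    ) -> Tuple[str, str, Optional[str]]:
        # Step 1: decide the script of the main glyph.
        script = self.script_clf(glyph)
        if script == "latin":
            # Latin glyphs go to the Latin classifier; no diacritic step.
            return ("latin", self.latin_clf(glyph), None)
        # Step 2: Baybayin glyphs go to the Baybayin classifier, and any
        # accompanying mark goes to the diacritic classifier.
        mark = self.diacritic_clf(diacritic) if diacritic is not None else None
        return ("baybayin", self.baybayin_clf(glyph), mark)


# Toy stand-ins (assumptions): simple rules in place of trained SVMs.
cascade = ScriptCascade(
    script_clf=lambda f: "baybayin" if f[0] > 0.5 else "latin",
    baybayin_clf=lambda f: "ba",
    latin_clf=lambda f: "b",
    diacritic_clf=lambda f: "mark_above" if f[0] > 0 else "mark_below",
)

result_baybayin = cascade.classify([0.9, 0.1], diacritic=[1.0])
result_latin = cascade.classify([0.2, 0.7])
print(result_baybayin)
print(result_latin)
```

Keeping the classifiers as plain callables means any model with a prediction function can be substituted without changing the routing logic.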