- -

Blackboard CLS

-

V1.0

-

Model Details

-
    -
  • Developer: Wannaphong Phatthiyaphaibun
  • -
  • This report author: Wannaphong Phatthiyaphaibun
  • -
  • Model date: 2022-10-14
  • -
  • Model version: 1.0
  • -
  • Used in PyThaiNLP version: 3.2 +
  • -
  • Filename: pythainlp/corpus/blackboard-cls_v1.0.crfsuite
  • -
  • GitHub: https://github.com/PyThaiNLP/pythainlp/issues/729
  • -
  • CRF Model
  • -
  • License: CC0
  • -
-

Intended Use

-
    -
  • Segmenting Thai text into clauses (smaller than a sentence but bigger than a word)
  • -
  • Not suitable for other language or non-news domains.
  • -
-

Factors

-
    -
  • Based on known problems with thai natural Language processing.
  • -
-

Metrics

-
    -
  • Evaluation metrics include precision, recall and f1-score.
  • -
-

Training Data

-

Blackboard treebank

-

Evaluation Data

-

Blackboard treebank

-

Quantitative Analyses

-
              precision    recall  f1-score   support
-
-       B_CLS       1.00      1.00      1.00     91698
-       E_CLS       1.00      1.00      1.00     91700
-       I_CLS       1.00      1.00      1.00    707795
-
-   micro avg       1.00      1.00      1.00    891193
-   macro avg       1.00      1.00      1.00    891193
-weighted avg       1.00      1.00      1.00    891193
- samples avg       1.00      1.00      1.00    891193
-
-

Ethical Considerations

-
    -
  • It trains from Blackboard treebank. It is possible to have a bias from Blackboard treebank.
  • -
-

Caveats and Recommendations

-
    -
  • The user must perform word segmentation first before using this model.
  • -
  • Thai text only
  • -