Traversal of different vectorization and modeling approaches
Using the Wiki comments competition data as the data set, these Jupyter notebooks explore different machine learning and deep learning algorithms to understand how they perform and what trade-offs they involve. For the ML models, the vectorization methods are bag-of-words (BoW) and TF-IDF, and the models are Naive Bayes and a decision tree. For the DL models, the notebooks use the Keras tokenizer, different embedding approaches, and a feed-forward (FF) model and a Simple RNN model.
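To make the difference between the two vectorization methods concrete, here is a minimal standard-library sketch. It is not taken from the notebooks: the function names (`tokenize`, `bag_of_words`, `tf_idf`) and the toy comments are hypothetical, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` is one common choice assumed here for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical, deliberately naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def bag_of_words(docs):
    # Build a sorted vocabulary, then count each term's occurrences per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    # Reweight raw counts by a smoothed inverse document frequency
    # (assumed formula: ln((1 + n) / (1 + df)) + 1).
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
    weighted = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, weighted


if __name__ == "__main__":
    # Toy comments invented for illustration; not from the competition data.
    toy_docs = ["you are great", "you are awful awful"]
    vocab, bow = bag_of_words(toy_docs)
    _, weighted = tf_idf(toy_docs)
    print(vocab)
    print(bow)
    print(weighted)
```

BoW keeps raw counts, so words that appear in every comment ("you", "are") weigh as much as distinctive ones. TF-IDF scales up terms that appear in fewer documents ("awful", "great"), which is the trade-off between the two methods.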