Can classifier update() be faster than training from scratch? #123

I am building a dataset and training a NaiveBayesClassifier as it grows. Rather than retraining the classifier from scratch each time I add a few new entries, I was hoping to use the update() method to train on just the new entries and cut training time. What I found is that loading a pickled, trained classifier and updating it with only the new entries is no faster than retraining from scratch. Re-reading the docs, they say that update() will "Update the classifier with new training data and re-trains the classifier", which implies retraining on the entire dataset.

Question: is there such a thing as incremental retraining, or does the classifier realistically process the entire dataset _from scratch_ every time I want to update it with new data?
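
To illustrate what I mean by incremental retraining, here is a minimal sketch of a count-based multinomial Naive Bayes whose update() only touches the new entries. The `IncrementalNB` class and its whitespace tokenization are hypothetical, written just for this example; this is not the library's NaiveBayesClassifier, and I am not claiming the library works this way internally.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Hypothetical count-based multinomial Naive Bayes (not the library's class)."""

    def __init__(self):
        self.label_counts = Counter()             # documents seen per label
        self.word_counts = defaultdict(Counter)   # word occurrences per label
        self.vocab = set()                        # all words seen so far

    def update(self, examples):
        """Fold new (text, label) pairs into the counts; old data is not revisited."""
        for text, label in examples:
            words = text.lower().split()          # naive whitespace tokenizer (assumption)
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def classify(self, text):
        """Return the label with the highest log-posterior under Laplace smoothing."""
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = IncrementalNB()
clf.update([("great movie", "pos"), ("terrible plot", "neg")])
clf.update([("great acting", "pos")])   # cost depends only on the new entries
print(clf.classify("great acting"))
```

Because the model state is just counts, updating in two steps should give the same counts as training once on the combined data, and the cost of each update() depends on the size of the new batch rather than on the whole dataset.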

