Project made in collaboration with Ben Chen https://github.com/benbenbang
- kkbox.py (as `main`)
- README.md
- tools
  - dataIO.py
  - helperOutput.py
  - wrangling.py
- requirement.txt
- Python Version: `python3`
- Run with `python3` or `jupyter notebook`, or set the default directory under the `py_script` folder
- To make `main` easier to use:
  - for csv files: place them in `../data`
    - Concatenated csv files
      - kkbox.csv
      - kkbox_test.csv
    - Raw csv files
      - train.csv
      - test.csv
      - songs.csv
      - members.csv
      - song_extra_info.csv
  - for pickle files: place them in `../data/pickle`
    - train_sparse.pickle
    - test_sparse.pickle
    - target.pickle
  - for xgboost model weights: place them in `../data/models`
    - xgbt_nn.model (nn is a number, e.g. 01, 02, ...)
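Before running `main`, it can help to confirm that the files listed above are where the scripts expect them. The sketch below is not part of the repository: `EXPECTED` and `missing_files` are hypothetical names, and it assumes the pickle folder is a `pickle` subfolder of the data folder. Model weights are skipped because their file names vary.

```python
from pathlib import Path

# Expected layout relative to the data folder, taken from the list above.
# Assumption: pickle files sit in a "pickle" subfolder of the data folder.
EXPECTED = {
    "concatenated csv": ["kkbox.csv", "kkbox_test.csv"],
    "raw csv": ["train.csv", "test.csv", "songs.csv",
                "members.csv", "song_extra_info.csv"],
    "pickle": ["pickle/train_sparse.pickle", "pickle/test_sparse.pickle",
               "pickle/target.pickle"],
}


def missing_files(data_dir="../data"):
    """Return {group: [missing relative paths]} for files not found under data_dir."""
    root = Path(data_dir)
    missing = {}
    for group, names in EXPECTED.items():
        absent = [name for name in names if not (root / name).is_file()]
        if absent:
            missing[group] = absent
    return missing


if __name__ == "__main__":
    # Report only what is absent; print nothing when everything is in place.
    for group, names in missing_files().items():
        print(f"{group}: missing {', '.join(names)}")
```

An empty result means every listed csv and pickle file was found.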
- Basically, we need at least `kkbox.csv` and `kkbox_test.csv`. If one or both are missing, use `importAndMergeCSV('train')` and `importAndMergeCSV('test')` to get each as a `pd.DataFrame`. Don't forget to save a copy so the next load is faster.
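The save-a-copy advice can be sketched as a small cache wrapper. This is an illustration, not code from the repository: `load_or_build` is a hypothetical helper, and `build_fn` stands in for a call such as `lambda: importAndMergeCSV('train')`, which is assumed to return a `pd.DataFrame` as described above.

```python
from pathlib import Path

import pandas as pd


def load_or_build(csv_path, build_fn):
    """Load csv_path if it exists; otherwise call build_fn() and cache its result.

    build_fn is any zero-argument callable returning a DataFrame, for example
    lambda: importAndMergeCSV('train') (assumed usage, based on the README).
    """
    csv_path = Path(csv_path)
    if csv_path.is_file():
        # Fast path: reuse the copy saved by an earlier run.
        return pd.read_csv(csv_path)
    df = build_fn()
    # Save one copy so the next load can skip the merge.
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(csv_path, index=False)
    return df
```

A call would then look like `load_or_build('../data/kkbox.csv', lambda: importAndMergeCSV('train'))`. Note that a CSV round trip does not preserve every dtype (categorical columns come back as plain strings, for example), so any dtype-dependent preprocessing may need to be reapplied after loading.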
- Useful functions in `tools`
  - In `dataIO`:
    - `importAndMergeCSV(type_)`
    - `importAndMergeHDF5(type_)`
    - `loadPickle(path)`
    - `savePickle(data, path)`
  - In `wrangling`:
    - `loadAndPreprocess(csv_path=None, to_train_mode=False, to_test_mode=False)`
    - `getFreqOfTarget(df)`
  - In `helperOutput`:
    - `outputHelper(model, X_test_sparse=None, load_test_set=False)`
    - `loadModelHelper(version)`
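The list above names `loadPickle(path)` and `savePickle(data, path)` without showing them. Below is a minimal sketch of helpers with the same argument order, assuming they are thin wrappers around the standard `pickle` module. `save_pickle` and `load_pickle` are hypothetical stand-ins, and the actual code in `tools/dataIO.py` may differ.

```python
import pickle
from pathlib import Path


def save_pickle(data, path):
    """Serialize data to path, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Load and return the object stored at path."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

Only unpickle files you created or trust, since `pickle.load` can execute arbitrary code while loading.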