Steps to reproduce
When preparing the file assigned to FILE_projects_list in config.ini, do NOT end it with an empty line (that is, leave no trailing newline after the last path):
path/to/project1.zip
path/to/project2.zip
Run python tokenizer.py zip, then check the log file. It shows:
[INFO] (MainThread) Starting zip project <1, path/to/project1.zip> (process 0)
...
[INFO] (MainThread) Starting zip project <2, path/to/project2.zi> (process 0)
The path of the last project is truncated (path/to/project2.zi instead of path/to/project2.zip), so the project is not found.
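This may be caused by proj_paths.append(line[:-1]) in tokenizers/file-level/tokenizer.py, which drops the last character of every line on the assumption that it is a newline. When the last line has no trailing newline, the final character of the path is removed instead.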
I recommend using line.strip() instead of line[:-1].
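
A minimal sketch of the difference, not the actual code from tokenizer.py: the helper names read_paths_slicing and read_paths_stripping are hypothetical, and the project list is simulated with an in-memory string instead of the real file. Skipping blank lines is an extra assumption on top of the suggested line.strip() change.

```python
import io


def read_paths_slicing(handle):
    # Mirrors the reported pattern: always drop the last character of each line.
    return [line[:-1] for line in handle]


def read_paths_stripping(handle):
    # Suggested alternative: strip surrounding whitespace (including "\n").
    # Blank lines are skipped here as an additional assumption.
    return [line.strip() for line in handle if line.strip()]


# Project list WITHOUT a trailing newline after the last path.
listing = "path/to/project1.zip\npath/to/project2.zip"

print(read_paths_slicing(io.StringIO(listing)))
# ['path/to/project1.zip', 'path/to/project2.zi']
print(read_paths_stripping(io.StringIO(listing)))
# ['path/to/project1.zip', 'path/to/project2.zip']
```

With strip(), the result is the same whether or not the file ends with a newline. Note that strip() also removes leading and trailing spaces, which would matter only if a project path legitimately started or ended with whitespace.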