ValueError: could not convert string to float: 'SH600000' when using dump_bin.py #1852

@nb7123

🐛 Bug Description

To Reproduce

Steps to reproduce the behavior:

  1. python scripts/data_collector/baostock_5min/collector.py download_data --source_dir ~/.qlib/stock_data/source/hs300_5min_original --start 2022-01-01 --end 2022-01-30 --interval 5min --region HS300

  2. python scripts/dump_bin.py dump_all --csv_path ~/.qlib/stock_data/source/hs300_5min_original --qlib_dir ~/.qlib/qlib_data/samples

Error:
"""
Traceback (most recent call last):
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 264, in _dump_bin
    self._data_to_bin(df, calendar_list, features_dir)
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 238, in _data_to_bin
    np.hstack([date_index, _df[field]]).astype("<f").tofile(str(bin_path.resolve()))
ValueError: could not convert string to float: 'SH600000'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 508, in <module>
    fire.Fire({"dump_all": DumpDataAll, "dump_fix": DumpDataFix, "dump_update": DumpDataUpdate})
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 568, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 271, in __call__
    self.dump()
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 322, in dump
    self._dump_features()
  File "/Users/didi/Code/Github/qlib/scripts/dump_bin.py", line 313, in _dump_features
    for _ in executor.map(_dump_func, self.csv_files):
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
    yield fs.pop().result()
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/Users/didi/miniconda3/envs/qlib/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
ValueError: could not convert string to float: 'SH600000'
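
The failing frame casts one column to float32 (`.astype("<f")`), and the rejected value 'SH600000' looks like a stock symbol, so it seems likely that a non-numeric column in the downloaded CSV was picked up as a feature field. The sketch below is only a diagnostic aid, not part of dump_bin.py: it uses the standard library to list which columns of a CSV contain values that cannot be converted to float. The function name `non_numeric_columns` and the sample column names and values are hypothetical, chosen for illustration.

```python
import csv
import io


def non_numeric_columns(csv_text):
    """Return the names of columns holding at least one value float() rejects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row in reader:
        for name, value in row.items():
            # Skip malformed extra cells, columns already flagged,
            # and empty or missing cells (treated as missing data here).
            if name is None or name in bad or not value:
                continue
            try:
                float(value)
            except ValueError:
                bad.append(name)
    return bad


# Hypothetical sample: column names and values are illustrative only,
# not taken from the actual baostock output.
SAMPLE = """date,symbol,open,close,volume
2022-01-04 09:35:00,SH600000,8.54,8.55,1200
2022-01-04 09:40:00,SH600000,8.55,,900
"""

if __name__ == "__main__":
    print(non_numeric_columns(SAMPLE))
```

If a check like this flags a symbol or date column in the real files, those columns would need to be kept out of the dumped fields. dump_bin.py appears to take `--exclude_fields` and `--include_fields` arguments for that purpose, but the exact names should be confirmed against the script in your checkout.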

Screenshot

Environment

Darwin
arm64
macOS-14.2-arm64-arm-64bit
Darwin Kernel Version 23.2.0: Wed Nov 15 21:54:55 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8122

Python version: 3.8.19 (default, Mar 20 2024, 15:27:52) [Clang 14.0.6 ]

Qlib version: 0.9.5
numpy==1.23.5
pandas==2.0.3
scipy==1.10.1
requests==2.32.3
sacred==0.8.6
python-socketio==5.11.4
redis==5.0.8
python-redis-lock==4.0.0
schedule==1.2.2
cvxpy==1.5.2
hyperopt==0.1.2
fire==0.6.0
statsmodels==0.14.1
xlrd==2.0.1
plotly==5.24.1
matplotlib==3.7.5
tables==3.7.0
pyyaml==6.0.2
mlflow==1.14.1
tqdm==4.66.5
loguru==0.7.2
lightgbm==4.5.0
tornado==6.4.1
joblib==1.4.2
fire==0.6.0
ruamel.yaml==0.17.36

Additional Notes
