@ahadjawaid

What this does

Fixes a bug in the handling of hidden .parquet files during dataset loading.
Some environments generate hidden files such as ._file-000.parquet, which cause Dataset.from_parquet to raise an error when memory-mapped loading is used.
This PR filters out these hidden .parquet files before loading.
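
For illustration only, the filtering step might look roughly like the sketch below. It is not the code from this PR: the helper name visible_parquet_files and the rule "skip any path component starting with a dot" are assumptions made for the example.

```python
from pathlib import Path


def visible_parquet_files(root):
    """Return sorted .parquet files under root, skipping hidden ones.

    Hypothetical helper: a file counts as hidden when its name, or any
    directory between root and the file, starts with a dot.
    """
    root = Path(root)
    return sorted(
        p
        for p in root.rglob("*.parquet")
        if p.is_file()
        and not any(part.startswith(".") for part in p.relative_to(root).parts)
    )
```

The resulting list of paths could then be handed to the loader in place of the raw directory listing, so files like ._file-000.parquet never reach Dataset.from_parquet.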

How it was tested

  • Reproduced the issue with paths containing both hidden and regular .parquet files.
  • Verified that dataset loading succeeds after filtering out hidden files.
  • Confirmed that normal dataset loading remains unchanged when no hidden files are present.
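
The test scenarios above could be set up along the lines of the following sketch. It only builds the directory layouts and separates hidden from visible file names, and it does not call the actual loader. The helper names make_fixture and partition_hidden are hypothetical, and the placeholder files are empty rather than valid Parquet data.

```python
import tempfile
from pathlib import Path


def make_fixture(root: Path, with_hidden: bool) -> None:
    """Create placeholder files; the contents are not real Parquet data."""
    for i in range(2):
        (root / f"file-{i:03d}.parquet").write_bytes(b"")
        if with_hidden:
            # Mimics the hidden sidecar files described in this PR.
            (root / f"._file-{i:03d}.parquet").write_bytes(b"")


def partition_hidden(root: Path):
    """Split the .parquet files in root into (visible, hidden) name lists."""
    names = sorted(p.name for p in root.iterdir() if p.suffix == ".parquet")
    visible = [n for n in names if not n.startswith(".")]
    hidden = [n for n in names if n.startswith(".")]
    return visible, hidden


if __name__ == "__main__":
    for with_hidden in (True, False):
        with tempfile.TemporaryDirectory() as tmp:
            make_fixture(Path(tmp), with_hidden)
            visible, hidden = partition_hidden(Path(tmp))
            print(f"with_hidden={with_hidden}: visible={visible} hidden={hidden}")
```

In a real check, the visible list would be passed to the loader in both cases, and the result with hidden files present should match the result without them.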

How to checkout & try? (for the reviewer)

python -c "from lerobot.datasets.utils import load_dataset; load_dataset('/path/to/episodes')"

Alternatively, run any pipeline that relies on Dataset.from_parquet and confirm that it loads without errors when hidden files are present.
