
Regarding Inconsistencies in Feature Dimensions and Column Names in TSB-AD-M Datasets #46

@User-Pass320


First, thank you for organizing and open-sourcing these valuable time series anomaly detection datasets.

In our current work, we are attempting to merge the individual data shards from each dataset into consolidated datasets for large-scale analysis. During this process, we've encountered some technical challenges related to feature consistency that we would like to bring to your attention.

Specifically, we've identified inconsistencies that prevent automated merging of the data shards:

In the MITDB dataset, we noticed significant discrepancies in column names across different shards:
Shard 1 uses: {'MLII', 'V4', 'Label'}
Shard 2 uses: {'MLII', 'V1', 'Label'}
Shard 3 uses: {'V5', 'V2', 'Label'}

Similarly, in the LTDB dataset, the number of feature columns differs across shards: Shards 1, 3, 4, and 5 contain 2 feature columns, while Shard 2 contains 3, with an additional column named "ECG3".

In both datasets, these inconsistencies require manual intervention to align the data structures before merging, which compromises the reproducibility and scalability of our analysis pipeline.
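For reference, discrepancies of this kind can be surfaced with a small header audit. The following is a minimal sketch, not the repository's own tooling: it assumes each shard is a CSV file with a header row and a `Label` column, and the helper names `read_header` and `audit_schemas` are hypothetical.

```python
import csv
from pathlib import Path


def read_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="") as f:
        return tuple(next(csv.reader(f), []))


def audit_schemas(paths, label_col="Label"):
    """Group shard files by their set of feature columns.

    Assumption: every column other than `label_col` is a feature.
    Returns a dict mapping frozenset(feature names) -> list of file names.
    """
    groups = {}
    for p in paths:
        features = frozenset(c for c in read_header(p) if c != label_col)
        groups.setdefault(features, []).append(Path(p).name)
    return groups
```

A dataset whose shards share one schema yields a single group; more than one group means the shards cannot be concatenated column-for-column without an explicit alignment step.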

We would greatly appreciate any guidance or clarification you could provide regarding:

The intended data schema for each dataset
Recommended approaches for handling these inconsistencies
Whether there are established protocols for merging these shards
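To make the second and third questions concrete, one possible stopgap is to concatenate shards over the union of their columns and pad the missing ones. This is only a sketch under assumptions (CSV shards with a header row and a `Label` column; `merge_shards` and the added `shard` column are hypothetical), not an established protocol. In particular, padding treats different ECG leads as unrelated features, and whether that is appropriate is part of what we are asking.

```python
import csv
from pathlib import Path


def merge_shards(paths, out_path, label_col="Label", fill="nan"):
    """Concatenate CSV shards over the union of their columns.

    Columns absent from a shard are written as `fill`; a `shard` column
    records the source file so each row stays traceable.
    """
    paths = [Path(p) for p in paths]

    # First pass: collect the union of feature columns in first-seen order.
    features = []
    for p in paths:
        with open(p, newline="") as f:
            for col in next(csv.reader(f), []):
                if col != label_col and col not in features:
                    features.append(col)
    fieldnames = ["shard"] + features + [label_col]

    # Second pass: stream rows to the output, filling missing columns.
    with open(out_path, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames, restval=fill)
        writer.writeheader()
        for p in paths:
            with open(p, newline="") as f:
                for row in csv.DictReader(f):
                    row["shard"] = p.name
                    writer.writerow(row)
    return fieldnames
```

Streaming row by row keeps memory use flat for long recordings, at the cost of reading each header twice.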

Thank you again for your valuable contribution to the research community. We look forward to your insights on this matter.
