v1.4.2 SEP token & data augmentation offset combinations argument

@Natooz Natooz released this 26 Jan 14:47
· 321 commits to main since this release

Changes

  • f6225a1 Added the option to include a SEP special token, which can be used to train models on tasks such as "Next sequence prediction"
  • bb24512 Data augmentation now accepts the all_offset_combinations argument, which performs augmentation with every combination of offsets. With the offsets $\left( x_1 , x_2 , x_3 \right)$, it performs a total of $\prod_i x_i$ combinations ($\prod_i (x_i \times 2)$ if going both up and down). This is disabled by default to save you from hundreds of augmentations 🤓 (and is not chained with tokenize_midi_dataset); by default, augmentations are performed on the original input only.
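
To get a feel for how quickly the number of augmentations grows with all_offset_combinations, here is a minimal sketch in plain Python. It does not call the library and is not its implementation: the offset_combinations helper and the example offsets (2, 1, 1) are hypothetical, and it assumes each offset $x_i$ means "shift by 1 up to $x_i$ steps" in that dimension.

```python
from itertools import product
from math import prod


def offset_combinations(max_offsets, up_and_down=False):
    """Enumerate every combination of non-zero offsets, one value per dimension.

    Hypothetical helper for illustration only, not part of the library.
    """
    per_dimension = []
    for x in max_offsets:
        values = list(range(1, x + 1))  # +1 .. +x
        if up_and_down:
            # Prepend the mirrored negative offsets: -x .. -1, then +1 .. +x
            values = [-v for v in reversed(values)] + values
        per_dimension.append(values)
    # Cartesian product across dimensions gives all combinations
    return list(product(*per_dimension))


offsets = (2, 1, 1)  # example values, chosen arbitrarily
up_only = offset_combinations(offsets)
both_ways = offset_combinations(offsets, up_and_down=True)

print(len(up_only), prod(offsets))                    # both equal prod_i x_i
print(len(both_ways), prod(2 * x for x in offsets))   # both equal prod_i (x_i * 2)
```

Even with these small offsets the count goes from 2 to 16 when going both up and down, which is why the option is off by default.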