Introducing Synthetic Data Workshop: Your Gateway to Easy Synthetic Dataset Creation Jun 20, 2024 • 12
Synthetic dataset generation techniques: generating custom sentence similarity data May 23, 2024 • 16
Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? May 7, 2024 • 7
Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20, 2024 • 72
Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model Aug 22, 2023 • 28
Huggy Lingo: Using Machine Learning to Improve Language Metadata on the Hugging Face Hub Aug 2, 2023 • 1
Parallia/Fairly-Multilingual-ModernBERT-Embed-BE Sentence Similarity • Updated 2 days ago • 93 • 16
Rapidata/text-2-image-Rich-Human-Feedback Viewer • Updated about 2 hours ago • 13k • 375 • 18
nomic-ai/modernbert-embed-base-unsupervised Sentence Similarity • Updated 13 days ago • 959 • 10