
Gabriel Martín Blázquez

gabrielmbmb

https://gabrielmb.com

AI & ML interests

ML Engineer

Articles

How we leveraged distilabel to create an Argilla 2.0 Chatbot

Jul 16, 2024 • 32

gabrielmbmb's activity

upvoted a paper 1 day ago

Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP

Paper • 2408.04303 • Published Aug 8, 2024 • 14

reacted to Kseniase's post with 👍 1 day ago

Post

2997

**15 Agentic Systems and Frameworks of 2024**

This year, we started our “AI Agents and Agentic Workflows” series (https://www.turingpost.com/t/AI-Agents) to explore everything about AI agents step by step: all the vocabulary, how they work, and how to build them.
The huge interest in this series and the large number of studies conducted on agents showed that it was one of the most popular and important themes of the year. In 2025, most likely, agents will reach new highs – we will be covering that for you. Now, let’s review the agentic systems that have emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

1. GUI Agents: A Survey (2412.13501)

2. Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level (2411.03562)

3. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (2408.06292)

4. MALT: Improving Reasoning with Multi-Agent LLM Training (2412.01928)

5. Agent S: An Open Agentic Framework that Uses Computers Like a Human (2410.08164)

6. Automated Design of Agentic Systems (2408.08435)

7. AgentInstruct: Toward Generative Teaching with Agentic Flows (2407.03502)

8. AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant (2410.18603)

9. WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents (2410.07484)

10. Generative Agent Simulations of 1,000 People (2411.10109)

11. DynaSaur: Large Language Agents Beyond Predefined Actions (2411.01747)

12. PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking (2410.12375)

13. Generative World Explorer (2411.11844)

14. Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines (2412.14684)

15. AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions (2410.20424)

Thanks for reading Turing Post!
Subscribe to receive new posts straight to your inbox: https://www.turingpost.com/subscribe

upvoted an article 1 day ago

Article

🌁#81: Key AI Concepts to Follow in 2025

19 days ago • 24

liked a model 3 days ago

microsoft/phi-4

Text Generation • Updated 3 days ago • 35.9k • 965

upvoted an article 3 days ago

Article

Fine-tune a SmolLM on domain-specific synthetic data from a LLM

8 days ago • 29

liked a Space 6 days ago

Running

299

📈

2024 AI Timeline

reacted to anton-l's post with 🚀 21 days ago

Post

2186

Introducing 📐 FineMath: the best public math pre-training dataset with 50B+ tokens!
HuggingFaceTB/finemath

Math remains challenging for LLMs, and by training on FineMath we see considerable gains over other math datasets, especially on GSM8K and MATH.

We build the dataset by:
🛠️ carefully extracting math data from Common Crawl;
🔎 iteratively filtering and recalling high-quality math pages using a classifier trained on synthetic annotations to identify math reasoning and deduction.

We conducted a series of ablations comparing the performance of Llama-3.2-3B-Base after continued pre-training on FineMath and observed notable gains compared to the baseline model and other public math datasets.

We hope this helps advance the performance of LLMs on math and reasoning! 🚀
We’re also releasing all the ablation models as well as the evaluation code.

HuggingFaceTB/finemath-6763fb8f71b6439b653482c2

updated a dataset 22 days ago

gabrielmbmb/gsm8k-reasoning-paths-combined

Viewer • Updated 22 days ago • 100 • 46

upvoted a paper 22 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 23 days ago • 339

liked a Space 23 days ago

Running

206

🏃

Jupyter Agent

liked a model 23 days ago

answerdotai/ModernBERT-base

Fill-Mask • Updated about 14 hours ago • 2.89M • 646

upvoted a paper 23 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 24 days ago • 121

reacted to burtenshaw's post with 🤗❤️ 23 days ago

Post

2659

People are flexing their end-of-year stats, so I made this app to show hub stats in a tidy design!

Thanks @Ameeeee and @jfcalvo for the feature from Argilla!
burtenshaw/recap


liked a dataset 23 days ago

HuggingFaceTB/finemath

Viewer • Updated 19 days ago • 48.3M • 35.3k • 241

updated 3 datasets 25 days ago

liked a Space 25 days ago

Running

456

📈

Scaling test-time compute

updated a dataset 26 days ago

gabrielmbmb/math-500-dota-math

Viewer • Updated 26 days ago • 500 • 115