Moritz Laurer

MoritzLaurer

AI & ML interests

None yet

Organizations

Hugging Face, Amazon SageMaker Community, Zero Shot NLI, Hugging Test Lab, Deutsche Gesellschaft fΓΌr internationale Zusammenarbeit, HuggingFaceM4, Aledade Inc, classroom-test-room, Prezi, Blog-explorers, Enterprise Explorers, ZeroGPU Explorers, Spectral, C&A, Social Post Explorers, Triple, Dev Mode Explorers, moritz-test-organization-changed-2, Hugging Face Discord Community, Moritz Test Org

MoritzLaurer's activity

posted an update about 3 hours ago
FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!

πŸ“ The paper introduces the FACTS Grounding benchmark for evaluating the factuality of LLM outputs.

πŸ€– Fact-checking is automated by an ensemble of LLM judges that verify if a response is fully grounded in a factual reference document.

πŸ§ͺ The authors tested different prompt templates on held-out data to ensure their generalization.

πŸ“š It's highly educational to read these templates to learn how frontier labs design prompts and understand their limitations.

πŸ’Ύ You can now download and reuse these prompt templates via the prompt-templates library!

πŸ”„ The library simplifies sharing prompt templates on the HF hub or locally via standardized YAML files. Let’s make LLM work more transparent and reproducible by sharing more templates like this!

Links πŸ‘‡
- prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
- all templates on the HF Hub: MoritzLaurer/facts-grounding-prompts
- FACTS paper: https://storage.googleapis.com/deepmind-media/FACTS/FACTS_grounding_paper.pdf
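
For example, loading one of these templates with the prompt-templates library could look roughly like this (a minimal sketch: the loader API is the one from the library docs, but the filename below is a hypothetical placeholder, so check the Hub repo for the actual file names):

# Minimal sketch; the filename is a hypothetical placeholder, see the Hub repo for real file names.
# !pip install prompt-templates
from prompt_templates import PromptTemplateLoader

prompt_template = PromptTemplateLoader.from_hub(
    repo_id="MoritzLaurer/facts-grounding-prompts",
    filename="judge_grounding.yaml",  # hypothetical filename
)
print(prompt_template)
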
reacted to merve's post with ❀️ about 23 hours ago
What a beginning to this year in open ML 🀠
Let's unwrap! merve/jan-10-releases-677fe34177759de0edfc9714

Multimodal πŸ–ΌοΈ
> ByteDance released SA2VA: a family of vision LMs that can take image, video, text and visual prompts
> moondream2 is out with new capabilities like outputting structured data and gaze detection!
> Dataset: Alibaba DAMO lab released multimodal textbook β€” 22k hours worth of samples from instruction videos 🀯
> Dataset: SciCap captioning on scientific documents benchmark dataset is released along with the challenge!

LLMs πŸ’¬
> Microsoft released Phi-4, a SOTA open-source 14B language model πŸ”₯
> Dolphin is back with Dolphin 3.0 Llama 3.1 8B 🐬🐬
> Prime-RL released Eurus-2-7B-PRIME, a new language model trained using PRIME alignment
> SmallThinker-3B is a new small reasoning LM based on Qwen2.5-3B-Instruct πŸ’­
> Dataset: QWQ-LONGCOT-500K is the dataset used to train SmallThinker, generated using QwQ-32B-preview πŸ“•
> Dataset: @cfahlgren1 released React Code Instructions: a dataset of code instruction-code pairs πŸ“•
> Dataset: the Qwen team is on a roll, they just released CodeElo, a dataset of code preferences πŸ‘©πŸ»β€πŸ’»

Embeddings πŸ”–
> @MoritzLaurer released zero-shot version of ModernBERT large πŸ‘
> KaLM is a new family of performant multilingual embedding models with MIT license built using Qwen2-0.5B

Image/Video Generation ⏯️
> NVIDIA released Cosmos, a new family of diffusion/autoregressive World Foundation Models generating worlds from images, videos and texts πŸ”₯
> Adobe released TransPixar: a new text-to-video model that can generate assets with transparent backgrounds (a first!)
> Dataset: fal released cosmos-openvid-1m Cosmos-tokenized OpenVid-1M with samples from OpenVid-1M

Others
> Prior Labs released TabPFNv2, the best tabular transformer is out for classification and regression
> Metagene-1 is a new RNA language model that can be used for pathogen detection, zero-shot embedding and genome understanding
posted an update 2 days ago
The TRL v0.13 release is πŸ”₯! My highlights are the new process reward trainer for training models similar to o1, and tool call support:

🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning. Perfect for tasks like stepwise reasoning.

πŸ”€ Model merging: A new callback leverages mergekit to merge models during training, improving performance by blending reference and policy models - optionally pushing merged models to the Hugging Face Hub.

πŸ› οΈ Tool call support: TRL preprocessing now supports tool integration, laying the groundwork for agent fine-tuning with examples like dynamic temperature fetching in prompts.

βš–οΈ Mixture of judges: The new AllTrueJudge combines decisions from multiple binary judges for more nuanced evaluation.

Read the release notes and other resources here πŸ‘‡
Release: https://github.com/huggingface/trl/releases/tag/v0.13.0
Mergekit: https://github.com/arcee-ai/mergekit
Mixture of judges paper: The Perfect Blend: Redefining RLHF with Mixture of Judges (2409.20370)
reacted to andrewrreed's post with 🀝πŸ”₯ 3 days ago
πŸš€ Supercharge your LLM apps with Langfuse on Hugging Face Spaces!

Langfuse brings end-to-end observability and tooling to accelerate your dev workflow from experiments through production.

Now available as a Docker Space directly on the HF Hub! πŸ€—

πŸ” Trace everything: monitor LLM calls, retrieval, and agent actions with popular frameworks
1⃣ One-click deployment: on Spaces with persistent storage and integrated OAuth
πŸ›  Simple Prompt Management: Version, edit, and update without redeployment
βœ… Intuitive Evals: Collect user feedback, run model/prompt evaluations, and improve quality
πŸ“Š Dataset Creation: Build datasets directly from production data to enhance future performance

Kudos to the Langfuse team for this collab and the awesome, open-first product they’re building! πŸ‘ @marcklingen @Clemo @MJannik

πŸ”— Space: langfuse/langfuse-template-space
πŸ”— Docs: https://huggingface.co/docs/hub/spaces-sdks-docker-langfuse
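
As a quick illustration of the tracing workflow, instrumenting a function with the Langfuse Python SDK and pointing it at your Space could look roughly like this (a minimal sketch, assuming the SDK's @observe decorator; the host URL and keys are placeholders from your own deployment):

# Minimal sketch; host and keys are placeholders for your own Langfuse Space deployment.
import os

os.environ["LANGFUSE_HOST"] = "https://your-langfuse-space.hf.space"  # hypothetical Space URL
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."

from langfuse.decorators import langfuse_context, observe

@observe()  # records inputs, outputs and latency of this call as a trace
def answer(question: str) -> str:
    # call your LLM / retrieval / agent logic here
    return f"Echo: {question}"

print(answer("What does Langfuse trace?"))
langfuse_context.flush()  # make sure traces are sent before the script exits
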
  • 1 reply
Β·
posted an update 4 days ago
OpenAI is losing money on the $200/month subscription 🀯. It's crazy how expensive it is to run the largest LLMs:

- ChatGPT Pro costs $200/month ($2,400/year) and is still unprofitable for OpenAI due to higher-than-expected usage.
- OpenAI reportedly expected losses of about $5 billion on revenue of $3.7 billion last year, with ChatGPT alone once costing an estimated $700,000 per day to operate. πŸ’ΈπŸ”₯
- They build strong models and do great research. Whether this business model will work in the long run is one of the biggest questions in the AI economy today.

Source with the numbers πŸ‘‡
https://techcrunch.com/2025/01/05/openai-is-losing-money-on-its-pricey-chatgpt-pro-plan-ceo-sam-altman-says/
posted an update 5 days ago
πŸš€ Releasing a new zeroshot-classifier based on ModernBERT! Some key takeaways:

- ⚑ Speed & efficiency: It's multiple times faster and uses significantly less memory than DeBERTaV3. You can use larger batch sizes, and enabling bf16 (instead of fp16) gave me a ~2x speed boost as well.
- πŸ“‰ Performance tradeoff: It performs slightly worse than DeBERTaV3 on average across my zeroshot classification task collection.
- 🧠 Use cases: I recommend using it for scenarios requiring speed and a larger context window (8k).
- πŸ’‘ What’s next? I’m preparing a newer version trained on better + longer synthetic data to fully leverage the 8k context window and improve upon the training mix of my older zeroshot-v2.0 models. I also hope that there will be a multilingual variant in the future.

Great work by https://huggingface.co/answerdotai !

If you’re looking for a high-speed zeroshot classifier, give it a try!

πŸ“„ Resources below: πŸ‘‡
Base model: MoritzLaurer/ModernBERT-base-zeroshot-v2.0
Large model: MoritzLaurer/ModernBERT-large-zeroshot-v2.0
Updated zeroshot collection: MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f
ModernBERT collection with paper: answerdotai/modernbert-67627ad707a4acbf33c41deb
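
Trying it out takes a few lines with the standard zero-shot classification pipeline (a minimal sketch; the example text and candidate labels are arbitrary, and bf16 only helps on GPUs that support it):

# Minimal sketch using the standard transformers zero-shot-classification pipeline.
import torch
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/ModernBERT-large-zeroshot-v2.0",
    torch_dtype=torch.bfloat16,  # bf16 roughly doubled throughput in my tests (GPU only)
    device=0 if torch.cuda.is_available() else -1,
)

result = classifier(
    "The new GPU driver fixed the rendering glitches in the game.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], round(result["scores"][0], 3))
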
replied to their post 5 days ago

I'm experiencing the same, will release a 0-shot model based on ModernBERT soon

posted an update 22 days ago
Quite excited by the ModernBERT release! 0.15B/0.4B model sizes, 2T tokens of modern pre-training data (including code), a modern tokenizer, and an 8k context window: a great, efficient base for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTaV3 from 2021 :D

Congrats @answerdotai , @LightOnIO and collaborators like @tomaarsen !

Paper and models here πŸ‘‡ https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
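
For a quick smoke test of the base checkpoint, the standard fill-mask pipeline works (a minimal sketch; it assumes a transformers version recent enough to include ModernBERT support):

# Minimal sketch; requires a transformers release with ModernBERT support.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
print(fill_mask("Paris is the [MASK] of France.")[0]["token_str"])
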
replied to their post 24 days ago

Hey @borowis, I don't think there is a plan to add embedding models to the NIM API. Embedding models are quite small, which makes them easier to run on accessible hardware (vs. the H100 GPUs running the large LLMs on the NIM API). I'd recommend using a cheap GPU (or even a CPU) via the HF dedicated endpoints for deploying embedding models: https://huggingface.co/inference-endpoints/dedicated And you can use the autoscaling/scale-to-zero feature to avoid unnecessary costs.
(The smaller BGE models from the MTEB leaderboard are always a good place to start)
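
For reference, running one of the smaller BGE models locally is light enough for CPU (a minimal sketch, assuming sentence-transformers and the BAAI/bge-small-en-v1.5 checkpoint as an example starting point):

# Minimal sketch; a small BGE embedding model runs fine on CPU.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")
embeddings = model.encode(
    ["How do I deploy an embedding model?", "Dedicated endpoints can scale to zero."],
    normalize_embeddings=True,  # normalized vectors make cosine similarity a dot product
)
print(embeddings.shape)  # (2, 384) for this checkpoint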

posted an update 30 days ago
I've been building a small library for working with prompt templates on the HF hub: pip install prompt-templates. Motivation:

The community currently shares prompt templates in a wide variety of formats: in datasets, in model cards, as strings in .py files, as .txt/.yaml/.json/.jinja2 files etc. This makes sharing and working with prompt templates unnecessarily complicated.

Prompt templates are currently the main hyperparameter that people tune when building complex LLM systems or agents. If we don't have a common standard for sharing them, we cannot systematically test and improve our systems. After comparing different community approaches, I think that working with modular .yaml or .json files is the best approach.

The prompt-templates library:
- proposes a standard for sharing prompts (entirely locally or on the HF hub)
- provides some utilities that are interoperable with the broader ecosystem

Try it:
# !pip install prompt-templates
from prompt_templates import PromptTemplateLoader 
prompt_template = PromptTemplateLoader.from_hub(repo_id="MoritzLaurer/closed_system_prompts", filename="claude-3-5-artifacts-leak-210624.yaml")


The library is in its early stages, and feedback is welcome!

More details in the docs: https://github.com/MoritzLaurer/prompt_templates/
  • 1 reply
Β·
reacted to julien-c's post with ❀️ about 1 month ago
After some heated discussion πŸ”₯, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community πŸ”₯

cc: @reach-vb @pierric @victor and the HF team
replied to their post about 1 month ago

It actually works for me with this code. The only change I made is minimal: I removed the "Meta-" prefix from the model name, because Meta recently renamed their models on the HF Hub. Can you try again with this code?

#!pip install "huggingface_hub>=0.25.0"
import os
from huggingface_hub import InferenceClient


client = InferenceClient(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key=os.getenv("HF_TOKEN_ENTERPRISE")  # see docs: https://huggingface.co/blog/inference-dgx-cloud#create-a-fine-grained-token
)

output = client.chat.completions.create(
    model="meta-llama/Llama-3.1-405B-Instruct-FP8",  #"meta-llama/Meta-Llama-3.1-405B-Instruct-FP8",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    max_tokens=1024,
)

print(output)
replied to their post about 1 month ago

OK, I could reproduce the issue and get the same error. I'm reporting it to NVIDIA, thanks for flagging it.

reacted to clem's post with πŸ”₯ 3 months ago
This is no Woodstock AI, but it will be fun nonetheless haha. I'll be hosting a live workshop with team members next week about the Enterprise Hugging Face Hub.

1,000 spots available, first-come first-served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
reacted to m-ric's post with πŸ”₯ 3 months ago
Transformers v4.45.0 released: includes a lightning-fast method to build tools! ⚑️

During user research with colleagues @MoritzLaurer and @Jofthomas, we discovered that the class definition currently used to define a Tool in transformers.agents is a bit tedious, because it requires you to spell out every detail.

➑️ So I’ve made an easier way to build tools: just make a function with type hints + a docstring, and add a @tool decorator in front.

βœ…Β VoilΓ , you’re good to go!

Read all about it in the new doc here: https://huggingface.co/docs/transformers/main/en/agents#create-a-new-tool

And don’t hesitate to give feedback, I’m all ears! πŸ€—
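
For example, a tool defined in this new style could look roughly like this (a minimal sketch, assuming the @tool decorator from transformers.agents in v4.45; the weather function is a made-up placeholder):

# Minimal sketch; assumes the transformers.agents @tool decorator from v4.45.
from transformers.agents import tool

@tool
def get_weather(city: str) -> str:
    """Returns a short, made-up weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"The weather in {city} is always sunny."  # placeholder logic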