Have you tried out Transformers.js v3? Here are the new features:
* WebGPU support (up to 100x faster than WASM)
* New quantization formats (dtypes)
* 120 supported architectures in total
* 25 new example projects and templates
* Over 1200 pre-converted models
* Node.js (ESM + CJS), Deno, and Bun compatibility
* A new home on GitHub and NPM
For anyone who struggles with NER or information extraction with LLMs.
We showed an efficient workflow for token classification, including zero-shot suggestions and model fine-tuning, with Argilla, GLiNER, the NuMind NuExtract LLM, and SpanMarker. @argilla
* The leaderboard is available in both Japanese and English.
* It is based on the evaluation tool llm-jp-eval, which covers more than 20 datasets for Japanese LLMs.
* The leaderboard shows every metric for NLP experts, plus averages for NLP beginners.
* For the comfort of users, we chose a horizontal UI and implemented light and dark themes in Gradio.
* The radar chart provides a very interesting visualization of the metrics!
* We are using the Japanese research platform MDX, so please be patient!
* LLMs larger than 70B will be evaluated soon…
How do you say βGPUs Go Brrrβ in Japanese - > GPUγγγ³γγ³ο½! (To pronounce "GPU ga bunbun!") π₯
Built a collection of the trending demos recently released by the Chinese community. From Qwen2.5 Turbo to FishAgent, see what these models can really do: zh-ai-community/trending-demo-673b6ca2416a3b3c9d3bf8f1
In August, the XetHub team joined Hugging Face - https://huggingface.co/blog/xethub-joins-hf - and we've been rolling up our sleeves to bring the best of both worlds together. We started with a deep dive into the current state of files stored with Git LFS on the Hub.
Getting this information was no small feat. We had to:
* Analyze a complete database dump of all repositories and files stored in Git LFS across Hugging Face.
* Parse through metadata on file sizes and types to accurately map the storage breakdown across Spaces, Models, and Datasets.
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That's where our chunk-based approach comes in.
Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:
In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn't just a performance boost. It's a rethinking of how we manage models and datasets on the Hub.
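To make the chunk-based idea concrete, here is a minimal sketch of content-defined chunking in Python. It is not the XetHub or Hugging Face implementation: the Gear-style rolling hash, the size limits, the boundary mask, and the toy data are all assumptions picked for illustration. It cuts a byte stream wherever a rolling hash of the most recent bytes matches a pattern, then shows that a small edit to a file leaves most chunks reusable.

```python
import hashlib
import random

# Hypothetical parameters, chosen only to keep the demo small.
MIN_SIZE = 32               # never cut a chunk shorter than this
MAX_SIZE = 512              # always cut once a chunk reaches this size
TOP_BITS_MASK = 0xFC000000  # cut when the top 6 bits of the hash are zero

# Gear table: one fixed pseudo-random 32-bit value per possible byte value.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> list[bytes]:
    """Split data into variable-sized chunks at content-defined boundaries."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each shift pushes older bytes out of the 32-bit hash, so once a
        # chunk is at least 32 bytes long the hash depends only on the
        # last 32 bytes, not on where the chunk started.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        at_boundary = size >= MIN_SIZE and (h & TOP_BITS_MASK) == 0
        if at_boundary or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def digest(piece: bytes) -> str:
    """Identify a chunk by the SHA-256 of its content."""
    return hashlib.sha256(piece).hexdigest()


# Toy "file": version 2 is version 1 with a few bytes inserted near the start.
_data_rng = random.Random(1)
v1 = bytes(_data_rng.getrandbits(8) for _ in range(20_000))
v2 = v1[:1000] + b"a small edit" + v1[1000:]

store = {digest(c) for c in chunk(v1)}  # chunks already stored for v1
v2_chunks = chunk(v2)
new_chunks = [c for c in v2_chunks if digest(c) not in store]
print(f"{len(new_chunks)} of {len(v2_chunks)} chunks need uploading")
```

With fixed-size blocks, the insertion would shift every later block and none of them would match. Because boundaries here depend on content rather than offsets, the chunking typically realigns shortly after the edit, so only the chunks around the change are new.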
We're planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?