Check out phi-4 from Microsoft, dropped a day ago... If you ❤️ the Phi series, then here is the GGUF - Sri-Vigneshwar-DJ/phi-4-GGUF. phi-4 is a highly efficient 14B open LLM that beats much larger models at math and reasoning - check out the evaluations on the Open LLM Leaderboard.
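To try the GGUF locally, here's a minimal llama-cpp-python sketch (the quant filename below is an assumption; pick one that actually exists in the repo's file list):

```python
# pip install llama-cpp-python huggingface-hub
from llama_cpp import Llama

# Downloads the GGUF from the Hub and loads it; the filename glob is a
# guess — check the repo for the quantization you want.
llm = Llama.from_pretrained(
    repo_id="Sri-Vigneshwar-DJ/phi-4-GGUF",
    filename="*Q4_K_M.gguf",  # hypothetical quant name
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```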
Just sharing a thought: I've started using DeepSeek V3 a lot, and an idea struck me about agents "orchestrating during inference" on a test-time-compute model like DeepSeek V3 or the o1 series.
Agents (instructions + function calls + memory) execute during inference, and based on the intermediate output, the orchestrator decides whether to scale up the time spent reasoning or to move on to other tasks.
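A rough sketch of the control flow I have in mind; every name here is hypothetical, purely to illustrate the decision loop, not any real API:

```python
import random

# Hypothetical stand-ins: generate() would call a test-time-compute model
# (e.g. DeepSeek V3) with a reasoning budget; confidence() would be some
# self-evaluation signal (a critique pass, logprobs, a verifier, ...).
def generate(prompt: str, reasoning_budget: int) -> str:
    return f"answer to {prompt!r} after {reasoning_budget} reasoning tokens"

def confidence(answer: str) -> float:
    return random.random()

def orchestrate(task: str, max_budget: int = 8192) -> str:
    budget = 1024
    while True:
        answer = generate(task, reasoning_budget=budget)
        # Decision point: accept the answer, or scale time-to-reason and retry.
        if confidence(answer) > 0.8 or budget >= max_budget:
            return answer
        budget *= 2

print(orchestrate("plan a data pipeline"))
```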
Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:
1. Code-Based Agents: Write actions as Python code, reducing steps by 30% (see the sketch after this list).
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.
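As a taste, here's roughly the hello-world from the smolagents docs for a code-based agent (class names like CodeAgent and HfApiModel are accurate as of this writing but may shift between releases):

```python
# pip install smolagents
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Code-based agent: instead of emitting JSON tool calls, the agent writes
# short Python snippets as its actions and executes them step by step.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would it take a leopard at full speed to run through Pont des Arts?")
```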
Hi HuggingFacers🤗, I decided to ship early this year, and here's what I came up with:
𝐏𝐝𝐟𝐈𝐭𝐃𝐨𝐰𝐧 (https://github.com/AstraBert/PdfItDown) - If you're like me and have your whole RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown files and (why not?) CSVs and XMLs to PDF, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub Repo 👉 https://github.com/AstraBert/PdfItDown
PyPi Package 👉 https://pypi.org/project/pdfitdown/
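To give a flavor of what such a conversion involves (note: this is not PdfItDown's own API — see the repo for that), here's a sketch of the underlying idea, using MarkItDown to extract text and fpdf2, purely as an example writer, to render the PDF:

```python
# pip install markitdown fpdf2
from markitdown import MarkItDown
from fpdf import FPDF

def to_pdf(input_path: str, output_path: str) -> None:
    # MarkItDown handles .docx, .pptx, .html, .csv, .xml, and more.
    text = MarkItDown().convert(input_path).text_content
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=11)
    pdf.multi_cell(0, 6, text)  # naive layout: plain text only
    pdf.output(output_path)

to_pdf("report.docx", "report.pdf")
```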
🎉 Reached HuggingFace Trending Top 100 in Just One Day! Introducing Mouse-I
First, we want to thank everyone who helped Mouse-I reach the HuggingFace Spaces Trending Top 100! We're especially excited that a game called "Jewel Pop Game," created using Mouse-I, has reached the global top 160. With this overwhelming response, we're thrilled to introduce Mouse-I, an AI-powered code generation and automatic deployment tool by Bidraft.
✨ What is Mouse-I? Mouse-I is an innovative tool that automatically generates and deploys working web services within 60 seconds, simply based on your prompt input.
🚀 Key Features
One-Click Real-time Deployment: Complete from prompt to deployment in just 60 seconds
Real-time Preview: Instantly check your generated code results
40+ Templates: Ready-to-use templates including MBTI tests, investment management tools, Tetris games, and more
Real-time Editing: Instantly modify and apply generated code
⚡ How to Use
Create your own web service in just 3 steps:
1. Enter your prompt (15 seconds)
2. Code generation (40 seconds)
3. Deploy (5 seconds)
🌟 What Makes Us Special
Ultra-fast code generation powered by NVIDIA H100 GPUs
Advanced multi-LLM complex agent technology
All generated web apps available for free viewing and use in our marketplace
🔍 Current Status
Over 3,000 web apps generated, with 160+ successfully deployed
30x faster service completion compared to competing services
🎈 Join Our Beta Test
Try Mouse-I for free right now! 👉 Experience Mouse-I

🔮 Future Plans
We're planning to launch 'Mouse-II', specialized for backend system development, within this year. When used together with Mouse-I, it will enable complete automation of full-stack development.
We look forward to your feedback and suggestions about Mouse-I! Thank you for your interest and support 🙏 #AI #CodeGeneration #WebDevelopment #HuggingFace #MouseI #Bidraft #AICodeAssistant
🏄♂️ While browsing new models, I stumbled upon Lumiere from aixonlab. After testing it, I feel it has considerable potential. Keep up the good work!
Lumiere Alpha is a model focused on improving realism without compromising prompt coherence or drastically changing composition relative to the original Flux.1-Dev model.
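If it ships as a diffusers checkpoint (I haven't verified this, and the repo id below is a guess), trying it should look like any other Flux.1-Dev fine-tune:

```python
# pip install diffusers transformers accelerate torch
import torch
from diffusers import FluxPipeline

# Repo id is a guess — check aixonlab's profile for the actual checkpoint.
pipe = FluxPipeline.from_pretrained("aixonlab/Lumiere-Alpha", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # helps fit on smaller GPUs

image = pipe(
    "golden-hour portrait, natural skin texture, 35mm photo",
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("lumiere_test.png")
```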
🎉 We’re excited to announce, in collaboration with @kaleidophon , the release of the models from our Apricot 🍑 paper, "Apricot: Calibrating Large Language Models Using Their Generations Only," accepted at ACL 2024! Reproducibility is essential in science, and we've worked hard to make it as seamless as possible. parameterlab/apricot-models-673d2cae40b6ff437a86f0bf
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That’s where our chunk-based approach comes in.
Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:
⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 We store your file as deduplicated chunks.
In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn't just a performance boost. It's a rethinking of how we manage models and datasets on the Hub.
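For intuition, here's a toy sketch of the CDC idea (not our actual implementation, which uses a proper rolling hash such as Gear/Rabin fingerprinting). Because boundaries depend on the bytes themselves, an edit only re-cuts nearby chunks, and every unchanged chunk dedupes by its digest:

```python
import hashlib

MASK = (1 << 13) - 1  # ~8 KiB average chunk: boundary when low 13 bits hit 0

def chunks(data: bytes, min_size: int = 2048, max_size: int = 65536):
    # A toy rolling-style hash over the byte stream picks chunk boundaries.
    start, h = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) ^ b) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]

store: dict[str, bytes] = {}  # chunk digest -> chunk bytes

def add_version(data: bytes) -> list[str]:
    # A file version is just a list of chunk digests; only chunks the
    # store hasn't seen before would need to be uploaded.
    refs = []
    for c in chunks(data):
        d = hashlib.sha256(c).hexdigest()
        store.setdefault(d, c)
        refs.append(d)
    return refs
```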
We're planning on rolling out our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?
I'd like to share a bit more here about the Deep Learning Containers (DLCs) we built with Google Cloud to transform the way you build AI with open models on this platform!
With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:
⚡ Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
🛠️ Hassle-free environment setup, no more dependency issues.
🔄 Seamless updates to the latest stable versions.
💼 Streamlined workflow, reducing dev and maintenance overheads.
🔒 Robust security features of Google Cloud.
☁️ Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
📦 Community examples for easy experimentation and implementation.
🔜 TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!
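To give a taste of the Vertex AI flow, here's a hedged sketch of deploying the TGI DLC with the google-cloud-aiplatform SDK; the container URI, model id, and machine shape are placeholders, so grab the current image tag from the DLC documentation first:

```python
# pip install google-cloud-aiplatform
from google.cloud import aiplatform

aiplatform.init(project="my-gcp-project", location="us-central1")

# Placeholder image URI — look up the current TGI DLC tag before using.
TGI_DLC = (
    "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/"
    "huggingface-text-generation-inference-cu121.2-2.ubuntu2204.py310"
)

model = aiplatform.Model.upload(
    display_name="gemma-2b-it-tgi",
    serving_container_image_uri=TGI_DLC,
    serving_container_environment_variables={
        "MODEL_ID": "google/gemma-2b-it",  # any TGI-supported model
        "NUM_SHARD": "1",
    },
)
endpoint = model.deploy(
    machine_type="g2-standard-4",  # 1x NVIDIA L4
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)
print(endpoint.predict(instances=[{"inputs": "What are Hugging Face DLCs?"}]))
```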