After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub
TL;DR: - public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible - private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
INTELLECT-1 is the first collaboratively trained 10 billion parameter language model trained from scratch on 1 trillion tokens of English text and code.
Drag and drop your assets (images/videos/audios) to create any video you want using natural language!
It works by asking the model to output a valid FFMPEG and this can be quite complex but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT4 and it was almost impossible to make it work with open models back then (~1.5 years ago), but not anymore, let's go open weights 🚀.
Qwen2.5-72B is now the default HuggingChat model. This model is so good that you must try it! I often get better results on rephrasing with it than Sonnet or GPT-4!!
🎙️ Listen to the audio "Podcast" of every single Hugging Face Daily Papers.
Now, "AI Paper Reviewer" project can automatically generates audio podcasts on any papers published on arXiv, and this is integrated into the GitHub Action pipeline. I sounds pretty similar to hashtag#NotebookLM in my opinion.
This audio podcast is powered by Google technologies: 1) Google DeepMind Gemini 1.5 Flash model to generate scripts of a podcast, then 2) Google Cloud Vertex AI's Text to Speech model to synthesize the voice turning the scripts into the natural sounding voices (with latest addition of "Journey" voice style)
"AI Paper Reviewer" is also an open source project. Anyone can use it to build and own a personal blog on any papers of your interests. Hence, checkout the project repository below if you are interested in! : https://github.com/deep-diver/paper-reviewer
This project is going to support other models including open weights soon for both text-based content generation and voice synthesis for the podcast. The only reason I chose Gemini model is that it offers a "free-tier" which is enough to shape up this projects with non-realtime batch generations. I'm excited to see how others will use this tool to explore the world of AI research, hence feel free to share your feedback and suggestions!
Effortlessly stay up-to-date with AI research trends using a new AI tool, "AI Paper Reviewer" !!
It analyzes a list of Hugging Face Daily Papers(w/ @akhaliq) and turn them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents. blog link: https://deep-diver.github.io/ai-paper-reviewer/
Also, here is the link of GitHub repository for parsing and generating pipeline. By using this, you can easily build your own GitHub static pages based on any arXiv papers with your own interest! : https://github.com/deep-diver/paper-reviewer
I just released an unofficial demo for Moonshine ASR!
Moonshine is a fast, efficient, & accurate ASR model released by Useful Sensors. It's designed for on-device inference and licensed under the MIT license!
Maybe like me you have always wanted a super easy way to compare llama3.2-1B vs. llama3.2-3B? or the same model with different temperatures?
Trying and comparing warm Inference API models has never been easier! Just go to https://hf.co/playground, set your token and you're ready to go. We'll keep improving, feedback welcome 😊
With the open-weight release of CogVideoX-5B from THUDM, i.e. GLM team, the Video Generation Model (how about calling it VGM) field has officially became the next booming "LLM"
What does the landscape look like? What are other video generation models? This collection below is all your need.
Find out by playing Fake Insects 🐞 a Game where you need to identify which insects are fake (AI generated). Good luck & share your best score in the comments!