
SixOpen

AI & ML interests

None yet

Recent Activity

Organizations

GEM benchmark · Blog-explorers · C4AI Community

SixOpen's activity

reacted to clem's post with ❤️ 6 months ago
5,000 new repos (models, datasets, spaces) are created EVERY DAY on HF now. The community is amazing!
New activity in SixOpen/Florence-2-large-ft 6 months ago

[solved]

1
#3 opened 7 months ago by
Qualzz20

reacted to reach-vb's post with 🔥 6 months ago
Yet another rewarding week in Open Source AI:

1. Google dropped Gemma 2 27B & 9B - the best open (commercially permissive) LLMs out there, according to LMSYS.
google/gemma-2-release-667d6600fd5220e7b967f315

2. Mars5 TTS - text-to-speech with insane prosody control & voice cloning.
CAMB-AI/MARS5-TTS

3. Meta shipped LLM Compiler - beats GPT-4 on code optimisation and compiler reasoning.
facebook/llm-compiler-667c5b05557fe99a9edd25cb

4. Arcee-Spark - Qwen2 7B (w/ merging) fine-tuned further to beat GPT-3.5 on MT-Bench.
arcee-ai/Arcee-Spark

5. Gemini Nano out in the wild in Chrome - on-device LLM with just 2 lines of code (fully offline)

6. Fal released a fully open-source GAN-based super-resolution model (with a second version already cooking)
fal/AuraSR

7. NYU released Cambrian-1 - a vision multimodal LLM that beats pretty much all closed-source competition at the 8-34B model sizes
https://huggingface.co/nyu-visionx

And much more: the Open LLM Leaderboard got a major update, LMSYS released the Chat Vision Arena, and OpenAI released a paper on CriticGPT!

What a lovely week, can’t wait for the next one to see what the community is up to! Drop it in the comments if I missed something 🔥
New activity in ggml-org/gguf-my-repo 7 months ago

Please support this method:

7
#96 opened 7 months ago by
ZeroWw
reacted to alex-abb's post with 🔥 7 months ago
Hi everyone!
I'm Alex, I'm 16, I've been doing an internship at Hugging Face for a little over a week and I've already learned a lot about using and prompting LLMs. With @victor as my tutor, I've just finished a Space that analyzes your feelings by prompting an LLM chat model. The aim is to extend it so that it can categorize Hugging Face posts.

alex-abb/LLM_Feeling_Analyzer
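The approach described above, prompting a chat model to classify feelings, can be sketched roughly like this. The prompt wording and message structure here are my own illustration, not the Space's actual code:

```python
def sentiment_prompt(text: str) -> list[dict]:
    # Build a chat-style message list asking the model to classify
    # the emotional tone of a piece of text.
    return [
        {"role": "system",
         "content": "Classify the emotional tone of the user's text as "
                    "positive, negative, or neutral. Answer with one word."},
        {"role": "user", "content": text},
    ]

msgs = sentiment_prompt("I love this new release!")
print(msgs[1]["content"])  # prints the user text to be classified
```

The message list would then be sent to any chat-completion endpoint; extending it to categorize posts is mostly a matter of swapping the system instruction.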
Β·
reacted to merve's post with 🤗 7 months ago
Fine-tune Florence-2 on any task 🔥

Today we release a notebook and a walkthrough blog on fine-tuning Florence-2 on the DocVQA dataset with @andito @SkalskiP

Blog: https://huggingface.co/blog 📕
Notebook: https://colab.research.google.com/drive/1hKDrJ5AH_o7I95PtZ9__VlCTNAo1Gjpf?usp=sharing 📖
Florence-2 is a great vision-language model thanks to its massive dataset and small size!

This model requires conditioning through task prefixes, and it's not a generalist, so it requires fine-tuning for a new task such as DocVQA 📝

We have fine-tuned the model on an A100 (one can also use a smaller GPU with a smaller batch size) and saw that the model picks up new tasks 🥹

See below what it looks like before and after FT 🤩
Play with the demo here: andito/Florence-2-DocVQA 🏄‍♀️
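The task-prefix conditioning mentioned above can be sketched like this. A toy illustration only: the prefix strings and record fields are my assumptions, not the notebook's actual code, so check the model card for the exact prompt format a checkpoint supports:

```python
def build_prompt(task_prefix: str, question: str = "") -> str:
    # Florence-2-style generation is conditioned on a task prefix;
    # QA-style tasks append the question right after the prefix.
    return f"{task_prefix}{question}"

def docvqa_pairs(examples: list[dict]) -> list[tuple[str, str]]:
    # Turn DocVQA-like records into (prompt, target) pairs for
    # fine-tuning a new task the base model wasn't trained on.
    return [(build_prompt("<DocVQA>", ex["question"]), ex["answer"])
            for ex in examples]

data = [{"question": "What is the invoice total?", "answer": "$120.00"}]
print(docvqa_pairs(data))
# [('<DocVQA>What is the invoice total?', '$120.00')]
```

During fine-tuning the processor would tokenize the prompt alongside the document image, with the answer as the generation target.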
New activity in SixOpen/Florence-2-large-ft 7 months ago

Update app.py

2
#2 opened 7 months ago by
D4ve-R

Great work!

1
#1 opened 7 months ago by
merve
reacted to merve's post with 🔥 7 months ago
I love Depth Anything V2 😍
It’s Depth Anything, but scaled with both a larger teacher model and a gigantic dataset!

Here's a small TL;DR of the paper with a lot of findings, experiments and more.
I have also created a collection that has the models, the dataset, the demo and a CoreML-converted model 😚 merve/depth-anything-v2-release-6671902e798cd404513ffbf5

The authors have analyzed Marigold, a diffusion-based model, against Depth Anything and found out what’s up with using synthetic images vs real images for MDE:

🔖 Real data has a lot of label noise and inaccurate depth maps (caused by depth sensors missing transparent objects etc.), and many details are overlooked

🔖 Synthetic data has more precise and detailed depth labels that are truly ground truth, but there’s a distribution shift between real and synthetic images, and it has restricted scene coverage

The authors train different image encoders only on synthetic images and find that unless the encoder is very large, the model can’t generalize well (though large models generalize inherently anyway) 🧐
But even those still fail on real images with a wide distribution in labels (e.g. diverse instances of objects) 🥲

The Depth Anything V2 framework is to:

🦖 Train a teacher model based on DINOv2-G on 595K synthetic images
🏷️ Label 62M real images using the teacher model
🦕 Train a student model on the real images labelled by the teacher
Result: 10x faster and more accurate than Marigold!
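The three-step recipe above is a teacher-student pseudo-labelling pipeline, which can be sketched generically like this. A toy stand-in only: linear least squares plays the role of the depth networks, and every name and number here is illustrative rather than from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: a small "synthetic" set with exact, noise-free labels
# (the ground truth here is simply y = 3x + 1).
x_syn = rng.uniform(-1.0, 1.0, (500, 1))
y_syn = 3.0 * x_syn + 1.0

# Teacher: a least-squares fit standing in for the large DINOv2-G model.
X_syn = np.hstack([x_syn, np.ones_like(x_syn)])
w_teacher, *_ = np.linalg.lstsq(X_syn, y_syn, rcond=None)

# Step 2: pseudo-label a much larger pool of unlabelled "real" inputs
# with the teacher's predictions.
x_real = rng.uniform(-1.0, 1.0, (5000, 1))
X_real = np.hstack([x_real, np.ones_like(x_real)])
y_pseudo = X_real @ w_teacher

# Step 3: the student trains only on the teacher's pseudo-labels.
w_student, *_ = np.linalg.lstsq(X_real, y_pseudo, rcond=None)

print(w_student.ravel())  # close to [3, 1]
```

The point of the recipe is that the student never sees a noisy real label: it inherits the teacher's precise (synthetic-trained) labelling at a much larger data scale.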

The authors also construct a new benchmark called DA-2K that is less noisy, highly detailed and more diverse!