moondream

Enterprise
company
Activity Feed

Recent Activity

vikhyatk new activity 1 day ago
moondream/gaze-demo: Total fail.
vikhyatk updated a Space 1 day ago
moondream/gaze-demo
Jamari updated a dataset 2 days ago
moondream/temp-VLMEvalKit-datasets

moondream's activity

vikhyatk
in moondream/gaze-demo 1 day ago

Total fail.

#1 opened 1 day ago by BrknSoul
vikhyatk posted an update 3 months ago
Just released a dataset with 7000+ hours of synthetically generated lo-fi music. vikhyatk/lofi
vikhyatk posted an update 5 months ago
Pushed a new update to vikhyatk/moondream2 today. TextVQA up from 60.2 to 65.2, DocVQA up from 61.9 to 70.5.

The Space has been updated to the new model if you want to try it out: vikhyatk/moondream2
vikhyatk posted an update 6 months ago
🚀 Exciting news! We've just launched "Thundermoon" - the latest version of Moondream, our open-source vision language model! 🌙

Key improvements in this release:
1. Massive leap in OCR capabilities
2. Enhanced document understanding
3. Significant boosts across key metrics:
* DocVQA: 61.9 (↑103%)
* TextVQA: 60.2 (↑5.2%)
* GQA: 64.9 (↑2.9%)
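As a rough sanity check on the percentages above, and assuming each arrow denotes a relative (not absolute) gain over the previous release, the implied prior scores can be back-computed. The function name below is illustrative and not part of any Moondream tooling.

```python
def implied_baseline(new_score: float, gain_pct: float) -> float:
    """Return the previous score implied by a relative gain in percent."""
    # Assumption: gain_pct is relative to the old score, i.e.
    # new = old * (1 + gain_pct / 100).
    return new_score / (1.0 + gain_pct / 100.0)


# Scores and gains as reported in the post above.
reported = {
    "DocVQA": (61.9, 103.0),
    "TextVQA": (60.2, 5.2),
    "GQA": (64.9, 2.9),
}

for name, (score, gain) in reported.items():
    print(f"{name}: {score} now, about {implied_baseline(score, gain):.1f} before")
```

Under that assumption the implied TextVQA baseline comes out near 57.2, which lines up with the TextVQA score reported for the earlier higher-resolution release.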

What does this mean? Moondream can now tackle complex document analysis tasks with unprecedented accuracy for a model of its size. From deciphering handwritten notes to interpreting data tables, the applications are vast.

Check out the image for a glimpse of Moondream in action, effortlessly extracting insights from a 1944 sugar industry document!

Why it matters:
* Democratizing AI: As an open-source project, we're making advanced vision AI accessible to all developers.
* Efficiency: Proving that smaller models can deliver big results.
* Real-world impact: From historical document analysis to modern business intelligence, the potential use cases are exciting.

Curious? Try the live demo here: https://moondream.ai/playground
vikhyatk posted an update 8 months ago
Just released a new version of vikhyatk/moondream2 - now supporting higher resolution images (up to 756x756)!

TextVQA score (which measures the model's ability to read and reason about text in images) is up from 53.1 to 57.2 (+7.7%). Other visual question answering and counting benchmark results are up ~0.5%.
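The quoted +7.7% is a relative gain; the absolute gain is 4.1 points. A minimal sketch of the distinction, using only the two scores from the post (the helper name is illustrative):

```python
def relative_gain_pct(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


old_textvqa, new_textvqa = 53.1, 57.2

# Absolute gain is measured in benchmark points.
absolute_gain = new_textvqa - old_textvqa

print(f"absolute: +{absolute_gain:.1f} points")
print(f"relative: +{relative_gain_pct(old_textvqa, new_textvqa):.1f}%")
```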
vikhyatk posted an update 8 months ago
Cool new dataset from @isidentical - isidentical/moondream2-coyo-5M-captions

The VeCLIP paper showed a +3% gain while only using 14% of the data by synthetically captioning like this. You get diversity from the alt text (middle column) without having to deal with all of the noise.
vikhyatk posted an update 9 months ago
Updated the vikhyatk/lnqa dataset to include images, so you no longer need to separately download them from OpenImages!
vikhyatk posted an update 9 months ago
Released a new version of vikhyatk/moondream2 today! Primarily focused on improving OCR and captioning (e.g. "Describe this image", "Describe this image in one sentence"), but also seeing general improvement across all benchmarks.
vikhyatk posted an update 10 months ago
Just released a dataset with 1.5M image question/answer pairs! vikhyatk/lnqa
vikhyatk posted an update 10 months ago
New moondream update out with significantly improved OCR performance (among other benchmarks)!
vikhyatk/moondream2
vikhyatk posted an update 10 months ago
Just released moondream2 - a small 1.8B parameter vision language model. Now fully open source (Apache 2.0) so you can use it without restrictions on commercial use!

vikhyatk/moondream2
vikhyatk posted an update 12 months ago
moondream1 can now be used directly from transformers!