
Paraskevi Kivroglou

KvrParaskevi

AI & ML interests

I am looking forward to a world full of AI innovation. Starting from small ideas in new projects, I want to take the next step and bring them to life.

Recent Activity

liked a dataset 1 day ago
semeru/code-text-python
liked a dataset 4 days ago
CodeEval-Pro/mbpp-pro
liked a dataset 5 days ago
m-ric/huggingface_doc

Organizations

GEM benchmark, lora concepts library, Blog-explorers, ZeroGPU Explorers, INNOVA AI, Cognitive Computations

KvrParaskevi's activity

reacted to reach-vb's post with 🚀 2 months ago
Smol models ftw! AMD released AMD OLMo 1B - beats OpenELM, TinyLlama on MT Bench, Alpaca Eval - Apache 2.0 licensed 🔥

> Trained with 1.3 trillion (dolma 1.7) tokens on 16 nodes, each with 4 MI250 GPUs

> Three checkpoints:

- AMD OLMo 1B: Pre-trained model
- AMD OLMo 1B SFT: Supervised fine-tuned on Tulu V2, OpenHermes-2.5, WebInstructSub, and Code-Feedback datasets
- AMD OLMo 1B SFT DPO: Aligned with human preferences using Direct Preference Optimization (DPO) on UltraFeedback dataset

Key Insights:
> Pre-trained with less than half the tokens of OLMo-1B
> Post-training steps include two-phase SFT and DPO alignment
> Data for SFT:
- Phase 1: Tulu V2
- Phase 2: OpenHermes-2.5, WebInstructSub, and Code-Feedback

> Model checkpoints on the Hub & Integrated with Transformers ⚡️

Congratulations & kudos to AMD on a brilliant smol model release! 🤗

amd/amd-olmo-6723e7d04a49116d8ec95070
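
The training recipe described in the post can be summarized as a small data structure. This is only an illustrative sketch built from the figures and dataset names quoted above; the `Stage` class, the stage names, and the helper function are hypothetical and are not part of any AMD or Hugging Face API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the recipe as described in the post (hypothetical type)."""
    name: str          # illustrative label, not an official identifier
    method: str        # "pre-training", "SFT", or "DPO"
    datasets: tuple    # dataset names exactly as quoted in the post
    checkpoint: str    # checkpoint the post associates with this step


# Hardware figures quoted in the post: 16 nodes, each with 4 MI250 GPUs.
NODES = 16
GPUS_PER_NODE = 4
TOTAL_GPUS = NODES * GPUS_PER_NODE

# Stages in the order the post lists them.
PIPELINE = (
    Stage("pretrain", "pre-training", ("dolma 1.7",), "AMD OLMo 1B"),
    Stage("sft_phase_1", "SFT", ("Tulu V2",), "AMD OLMo 1B SFT"),
    Stage(
        "sft_phase_2",
        "SFT",
        ("OpenHermes-2.5", "WebInstructSub", "Code-Feedback"),
        "AMD OLMo 1B SFT",
    ),
    Stage("dpo", "DPO", ("UltraFeedback",), "AMD OLMo 1B SFT DPO"),
)


def datasets_for(method):
    """Return every dataset used by stages of the given method, in order."""
    return [d for stage in PIPELINE if stage.method == method for d in stage.datasets]


if __name__ == "__main__":
    print(f"GPUs used for training: {TOTAL_GPUS}")
    for stage in PIPELINE:
        print(f"{stage.name}: {stage.method} on {', '.join(stage.datasets)}")
```

Calling `datasets_for("SFT")` lists the phase 1 dataset first and the three phase 2 datasets after it, mirroring the two-phase order given in the post.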
replied to qq8933's post 2 months ago

Awesome work. Can we fine-tune this reasoning model further?

reacted to qq8933's post with 👍 2 months ago
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual policy paradigm and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry! 🍓

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
reacted to nroggendorff's post with 👀 2 months ago
When huggingface patches this, I'm going to be really sad, but in the meantime, here you go:

When AutoTrain creates a new space to train your model, it does so via the huggingface API. If you modify the code so that it includes a premade README.md file, you can add these two lines:

---
app_port: 8080 # or any integer besides 7860 that's greater than 2 ** 10
startup_duration_timeout: 350m
---


This will tell huggingface to listen for the iframe on your port, instead of the one autotrain is actually hosting on, and because startup time isn't charged, you get the product for free. (you can take this even further by switching compute type to A100 or something)