Clelia (Astra) Bertelli PRO

as-cle-bert

https://www.cleliasportfolio.xyz

AI & ML interests

Articles

Organizations

as-cle-bert's activity

replied to their post about 17 hours ago

Thank you so much for letting me know! This is indeed a very interesting role :)

posted an update 1 day ago

Post

716

Hi HuggingFace community!🤗

I recently released PrAIvateSearch v2.0-beta.0 (https://github.com/AstraBert/PrAIvateSearch), my privacy-first, AI-powered, user-centered and data-safe application aimed at providing a local and open-source alternative to big AI search engines such as SearchGPT or Perplexity AI.

We have several key changes:

- New chat UI built with NextJS
- DuckDuckGo API used for web search instead of Google
- Qwen/Qwen2.5-1.5B-Instruct as the language model, served through an API built with FastAPI
- Crawl4AI crawler used for web scraping
- Optimizations in the data workflow inside the application
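As a rough illustration of how the pieces listed above could fit together, here is a minimal sketch of a search, scrape and answer loop. It is not taken from the PrAIvateSearch code: the `answer_query` function, the `Page` class and the three callables (`search`, `crawl`, `generate`) are hypothetical stand-ins for a web search client, a crawler and a language model endpoint.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    url: str
    text: str


def answer_query(
    query: str,
    search: Callable[[str], List[str]],  # hypothetical: query -> result URLs
    crawl: Callable[[str], str],         # hypothetical: URL -> page text
    generate: Callable[[str], str],      # hypothetical: prompt -> model reply
    max_pages: int = 3,
    max_chars: int = 1000,
) -> str:
    """Search the web, scrape the top results, then ask the model to answer."""
    pages = [Page(url, crawl(url)) for url in search(query)[:max_pages]]
    # Truncate each page so the prompt stays short.
    context = "\n\n".join(f"[{p.url}]\n{p.text[:max_chars]}" for p in pages)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)
```

Passing the three steps in as plain callables keeps the sketch independent of any specific search, crawling or serving library, so each one can be swapped or stubbed out for testing.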

Read more in my blog post 👉 https://huggingface.co/blog/as-cle-bert/search-the-web-with-ai

Have fun and feel free to leave feedback about how to improve the application!✨

3 replies

upvoted an article 1 day ago

Article

Search the Web with AI

1 day ago • 2

liked a model 1 day ago

prithivida/Splade_PP_en_v1

Fill-Mask • Updated Aug 24, 2024 • 2.81k • 21

liked a model 2 days ago

microsoft/phi-4

Text Generation • Updated 3 days ago • 35.9k • 958

liked a model 3 days ago

OramaSearch/query-translator-mini

Updated 3 days ago • 47 • 4

liked 2 models 5 days ago

Qwen/Qwen2.5-1.5B-Instruct

Text Generation • Updated Sep 25, 2024 • 383k • 270

meta-llama/Llama-3.2-1B-Instruct

Text Generation • Updated Oct 24, 2024 • 1.02M • 688

posted an update 7 days ago

Post

534

Are you using Obsidian to write your notes?
If the answer is yes, then this post might be for you!✅
I recently created 𝐨𝐛𝐬𝐢𝐝𝐢𝐚𝐧-𝐝𝐢𝐠𝐞𝐬𝐭, a Google Gemini-powered application that gives you feedback on the style and content of the documents you have been working on🧠

Repo 👉 https://github.com/AstraBert/obsidian-digest
PyPI Package 👉 https://pypi.org/project/obsidian-digest/

The app is available as:
- 𝐜𝐨𝐦𝐦𝐚𝐧𝐝-𝐥𝐢𝐧𝐞 𝐭𝐨𝐨𝐥: install it as a python package with 𝗽𝗶𝗽, and execute it from terminal anytime!📦
- 𝐃𝐢𝐬𝐜𝐨𝐫𝐝 𝐁𝐨𝐭 𝐛𝐮𝐢𝐥𝐭 𝐟𝐫𝐨𝐦 𝐬𝐨𝐮𝐫𝐜𝐞 𝐜𝐨𝐝𝐞: clone the GitHub repo, install the needed dependencies through 𝗰𝗼𝗻𝗱𝗮, and run the bot. You will get hourly messages with suggestions and observations about your Obsidian activity in the previous hour🤖
- 𝐃𝐢𝐬𝐜𝐨𝐫𝐝 𝐁𝐨𝐭 𝐝𝐞𝐩𝐥𝐨𝐲𝐞𝐝 𝐥𝐨𝐜𝐚𝐥𝐥𝐲 𝐰𝐢𝐭𝐡 𝐝𝐨𝐜𝐤𝐞𝐫 𝐜𝐨𝐦𝐩𝐨𝐬𝐞: clone the GitHub repo and launch 𝗱𝗼𝗰𝗸𝗲𝗿 𝗰𝗼𝗺𝗽𝗼𝘀𝗲 𝘂𝗽. Docker builds an image on the fly with all the needed dependencies and scripts, and runs them. You get the same functionality as the source-code build, but with a much easier deployment process🐋
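To give an idea of what "activity in the previous hour" could mean in practice, here is a minimal sketch that lists the Markdown notes modified within a time window. This is an assumption about one possible approach, not the actual obsidian-digest implementation: the `recently_modified_notes` function and its parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def recently_modified_notes(
    vault: Path,
    window_seconds: float = 3600.0,
    now: Optional[float] = None,
) -> List[Path]:
    """Return the .md files under `vault` modified in the last `window_seconds`."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    # rglob walks the vault recursively; only regular Markdown files are kept.
    return sorted(
        p for p in vault.rglob("*.md")
        if p.is_file() and p.stat().st_mtime >= cutoff
    )
```

The notes returned by a helper like this could then be sent to the model for feedback once per hour.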

Go check out the GitHub repo for more info 👉 https://github.com/AstraBert/obsidian-digest

Have fun!✨

1 reply

replied to their post 9 days ago

Hi and thanks a lot for the clarification!🥰

Just as a note from my side, in the article I specify that there is a difference between "open weights" and "open source" models, and I link this blog post: https://www.agora.software/en/llm-open-source-open-weight-or-proprietary/ for a deeper explanation of the difference. I never claimed (and would never claim) that Llama is open source, let alone free software (see the introduction of my article on privacy and data "stealing" risks: https://huggingface.co/blog/as-cle-bert/build-an-ai-powered-search-engine-from-scratch).

I would also have gladly used DeepSeek, if it had been available on HuggingChat! :)

I nevertheless highly appreciate your comment, and I'll certainly be more cautious about using the words "open" and "open source" in the future. Thanks!✨

replied to their post 9 days ago

Both PdfItDown and SenTrEv only work with text for now: in future releases, support for images will be added :)
For text extraction, I use PyPDF + Langchain

posted an update 9 days ago

Post

2053

🎉𝐄𝐚𝐫𝐥𝐲 𝐍𝐞𝐰 𝐘𝐞𝐚𝐫 𝐫𝐞𝐥𝐞𝐚𝐬𝐞𝐬🎉

Hi HuggingFacers🤗, I decided to ship early this year, and here's what I came up with:

𝐏𝐝𝐟𝐈𝐭𝐃𝐨𝐰𝐧 (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have your whole RAG pipeline optimized for PDFs but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs to PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub Repo 👉 https://github.com/AstraBert/PdfItDown
PyPI Package 👉 https://pypi.org/project/pdfitdown/

𝐒𝐞𝐧𝐓𝐫𝐄𝐯 𝐯𝟏.𝟎.𝟎 (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the 𝗿𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹 performance of your 𝘁𝗲𝘅𝘁 𝗲𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴 models, I have good news for you🥳🥳
The new release for 𝐒𝐞𝐧𝐓𝐫𝐄𝐯 now supports 𝗱𝗲𝗻𝘀𝗲 and 𝘀𝗽𝗮𝗿𝘀𝗲 retrieval (thanks to FastEmbed by Qdrant) with 𝘁𝗲𝘅𝘁-𝗯𝗮𝘀𝗲𝗱 𝗳𝗶𝗹𝗲 𝗳𝗼𝗿𝗺𝗮𝘁𝘀 (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new 𝗿𝗲𝗹𝗲𝘃𝗮𝗻𝗰𝗲 𝗺𝗲𝘁𝗿𝗶𝗰𝘀!
GitHub repo 👉 https://github.com/AstraBert/SenTrEv
Release Notes 👉 https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPI Package 👉 https://pypi.org/project/sentrev/
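
For readers new to retrieval evaluation, here is a minimal sketch of two common relevance metrics, success rate at k and mean reciprocal rank (MRR). It is a generic illustration and does not use the SenTrEv API, nor does it claim to match the metrics SenTrEv reports: the `success_rate_and_mrr` function and its input shapes are hypothetical, and it assumes exactly one relevant document per query.

```python
from typing import Dict, List, Tuple


def success_rate_and_mrr(
    ranked: Dict[str, List[str]],  # query id -> retrieved doc ids, best first
    relevant: Dict[str, str],      # query id -> the single relevant doc id
    k: int = 5,
) -> Tuple[float, float]:
    """Compute success rate@k and mean reciprocal rank over all queries."""
    if not relevant:
        return 0.0, 0.0
    hits = 0
    reciprocal_rank_sum = 0.0
    for query_id, expected in relevant.items():
        top_k = ranked.get(query_id, [])[:k]
        if expected in top_k:
            hits += 1
            # Rank is 1-based, so the first position contributes 1.0.
            reciprocal_rank_sum += 1.0 / (top_k.index(expected) + 1)
    n = len(relevant)
    return hits / n, reciprocal_rank_sum / n
```

Queries whose relevant document is missing from the top k contribute zero to both metrics, so a weaker embedding model shows up as a lower score on each.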

Happy New Year and have fun!🥂

2 replies

reacted to nroggendorff's post with ➕ 10 days ago

Post

6219

hey nvidia, can you send me a gpu?
comment or react if you want ~~me~~ to get one too. 👉👈

22 replies

posted an update 12 days ago

Post

544

Hi HF Community!🤗

As my last 2024 contribution, I decided to write an article about a Competitive Debate Championship simulation I ran with 5 LLMs as competitors and 2 as judges:

https://huggingface.co/blog/as-cle-bert/debate-championship-for-llms

The article covers code, analyses and results, and you can find everything to reproduce this tournament in the GitHub repo 👉 https://github.com/AstraBert/DebateLLM-Championship

I also released a dataset containing the data (motions, arguments, topics, winners...) collected during the tournament 👉 as-cle-bert/DebateLLMs

Happy reading and happy new yeAIr!🎉

3 replies

upvoted an article 12 days ago

Article

Debate Championship for LLMs

12 days ago • 4

published an article 12 days ago

Article

Debate Championship for LLMs

12 days ago • 4

updated a dataset 12 days ago

as-cle-bert/DebateLLMs

Viewer • Updated 12 days ago • 20 • 22 • 2

liked a dataset 13 days ago

as-cle-bert/DebateLLMs

Viewer • Updated 12 days ago • 20 • 22 • 2

liked 2 models 13 days ago

microsoft/Phi-3.5-mini-instruct

Text Generation • Updated Sep 18, 2024 • 613k • 749

google/gemma-2-2b-it

Text Generation • Updated Aug 27, 2024 • 380k • 857

Articles

Search the Web with AI

Debate Championship for LLMs

Building an AI-powered search engine from scratch

streamlit_supabase_auth_ui

AI is turning nuclear: a review

Is AI carbon footprint worrisome?

_Repetita iuvant_: how to improve AI code generation

BrAIn: next generation neurons?

What is going on with AlphaFold3?
