17 187 597

𝒕𝒂𝒏𝒗𝒊𝒓

Tanvir1337

https://linktr.ee/tanvir1337x

AI & ML interests

Deep Learning, Generative Adversarial Networks, Transformer, Diffusion, SOTA Foundation Models

Recent Activity

liked a model about 4 hours ago

NovaSky-AI/Sky-T1-32B-Preview

updated a collection about 20 hours ago

Spaces

liked a Space about 20 hours ago

Pendrokar/TTS-Spaces-Arena

View all activity

Organizations

Tanvir1337's activity

liked a model about 4 hours ago

NovaSky-AI/Sky-T1-32B-Preview

Text Generation • Updated about 23 hours ago • 70 • 43

updated a collection about 20 hours ago

Spaces

Collection

88 items • Updated about 20 hours ago

liked a Space about 20 hours ago

Running

153

🤗🏆

TTS Spaces Arena

Vote on the top HF TTS models!

updated a collection about 20 hours ago

Speech Models

Collection

10 items • Updated about 20 hours ago • 1

updated a dataset about 20 hours ago

Tanvir1337/rpapers

Updated about 20 hours ago • 745

upvoted a paper 1 day ago

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 4 days ago • 58

updated a collection 2 days ago

Language Models

Collection

150 items • Updated 2 days ago • 1

liked a model 2 days ago

microsoft/phi-4

Text Generation • Updated 3 days ago • 35.9k • 965

reacted to Severian's post with ❤️ 2 days ago

Post

3663

Interesting Solution to the Problem of Misguided Attention

So I've been fascinated by the problem of Misguided Attention for a few weeks. I am trying to build an inference algorithm to help LLMs address that issue; but in the process, I found a cool short-term fix I call "Mindful Attention" using just prompt-engineering.

Have you ever thought about how our brains filter reality through layers of past experiences, concepts, and mental images? For example, when you look at an oak tree, are you truly seeing that oak tree in all its unique details, or are you overlaying it with a generalized idea of "oak tree"? This phenomenon inspired the new approach.

LLMs often fall into a similar trap, hence the Misguided Attention problem. They process input not as it’s uniquely presented but through patterns and templates they’ve seen before. This leads to responses that can feel "off," like missing the point of a carefully crafted prompt or defaulting to familiar but irrelevant solutions.

I wanted to address this head-on by encouraging LLMs to slow down, focus, and engage directly with the input—free of assumptions. This is the core of the Mindful Attention Directive, a prompt designed to steer models away from over-generalization and back into the moment.

You can read more about the broader issue here: https://github.com/cpldcpu/MisguidedAttention

And if you want to try this mindful approach in action, check out the LLM I’ve set up for testing: https://hf.co/chat/assistant/677e7ebcb0f26b87340f032e. It works about 80% of the time to counteract these issues, and the results are pretty cool.

I'll add the Gist with the full prompt. I admit, it is quite verbose but it's the most effective one I have landed on yet. I am working on a smaller version that can be appended to any System Prompt to harness the Mindful Attention. Feel free to experiment to find a better version for the community!

Here is the Gist: https://gist.github.com/severian42/6dd96a94e546a38642278aeb4537cfb3