Beckett Dillon (Severian)
AI & ML interests: I make music, teach machines, study nature, and build things.
Posts
Interesting Solution to the Problem of Misguided Attention
So I've been fascinated by the problem of Misguided Attention for a few weeks. I'm trying to build an inference algorithm to help LLMs address that issue, but in the process I found a cool short-term fix I call "Mindful Attention" using just prompt engineering.
Have you ever thought about how our brains filter reality through layers of past experiences, concepts, and mental images? For example, when you look at an oak tree, are you truly seeing that oak tree in all its unique details, or are you overlaying it with a generalized idea of "oak tree"? This phenomenon inspired the new approach.
LLMs often fall into a similar trap, hence the Misguided Attention problem. They process input not as it’s uniquely presented but through patterns and templates they’ve seen before. This leads to responses that can feel "off," like missing the point of a carefully crafted prompt or defaulting to familiar but irrelevant solutions.
I wanted to address this head-on by encouraging LLMs to slow down, focus, and engage directly with the input—free of assumptions. This is the core of the Mindful Attention Directive, a prompt designed to steer models away from over-generalization and back into the moment.
You can read more about the broader issue here: https://github.com/cpldcpu/MisguidedAttention
And if you want to try this mindful approach in action, check out the LLM I’ve set up for testing: https://hf.co/chat/assistant/677e7ebcb0f26b87340f032e. It counteracts these issues about 80% of the time, and the results are pretty cool.
I'll add the Gist with the full prompt. I admit it's quite verbose, but it's the most effective version I've landed on yet. I'm working on a smaller version that can be appended to any system prompt to harness Mindful Attention. Feel free to experiment and find a better version for the community!
Here is the Gist: https://gist.github.com/severian42/6dd96a94e546a38642278aeb4537cfb3
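For anyone who wants to wire this into their own stack instead of using the hosted assistant, here's a minimal sketch of the idea in Python. It assumes huggingface_hub's InferenceClient chat API; the directive text and the model name below are placeholders for illustration, and the full, tested directive is the one in the Gist above.

```python
# Minimal sketch (not my exact setup): prepend a Mindful-Attention-style
# directive to whatever system prompt you already use, then call the model.
# The directive below is a short stand-in; the full version is in the Gist.
from huggingface_hub import InferenceClient

MINDFUL_ATTENTION_DIRECTIVE = (
    "Read the prompt below exactly as written. Do not map it onto a familiar "
    "puzzle or template you have seen before. First restate what is actually "
    "being asked, note any details that differ from the classic version, and "
    "only then answer."
)

def mindful_chat(user_prompt: str,
                 base_system_prompt: str = "You are a helpful assistant.") -> str:
    # Example model only; any chat-capable endpoint works here.
    client = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct")
    messages = [
        {"role": "system",
         "content": f"{base_system_prompt}\n\n{MINDFUL_ATTENTION_DIRECTIVE}"},
        {"role": "user", "content": user_prompt},
    ]
    response = client.chat_completion(messages=messages, max_tokens=512, temperature=0.3)
    return response.choices[0].message.content

if __name__ == "__main__":
    print(mindful_chat("A man and a goat are on one side of a river. "
                       "They have a boat. How can they get across?"))
```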
Early Morning Before Work Project:
🌌 Introducing Cascade of Semantically Integrated Layers (CaSIL): A Humorously Over-Engineered Algorithm That Actually… Works 🤷‍♂️
Let me introduce CaSIL – the Cascade of Semantically Integrated Layers. Imagine giving a single question the level of introspection typically reserved for philosophical debates or maybe therapy. In short, CaSIL is a pure Python reasoning algorithm that, in a series of semantically rich layers, takes any input and rebuilds it into a nuanced response that’s (surprisingly) meaningful to a human.
I’ve been experimenting with various reasoning and agent approaches lately and decided to contribute my own quirky take on layered processing. It’s built without agent frameworks—just good ol' Python and math—and it plays nicely with any LLM. The result? A transformation from simple responses to deeper, interconnected insights. Here’s a quick peek at the steps:
✨ How CaSIL Works:
Initial Understanding: The first layer captures the basic concepts in your input, just as a warm-up.
Relationship Analysis: A lightweight knowledge graph (because why not?) maps out related ideas and builds interconnections.
Context Integration: Adds historical or contextual knowledge, bringing a bit of depth and relevance.
Response Synthesis: Pieces it all together, aiming to produce a response that feels more like a conversation than an outdated search result.
Does it work? Yes! And in record time, too. Admittedly, the code is rough—two days of intense coding with some friendly help from Claude. The beauty of CaSIL is its simplicity and versatility; it’s a pure algorithm without complex dependencies, making it easy to integrate into your own LLM setups.
🔗 Explore the repo here: https://github.com/severian42/Cascade-of-Semantically-Integrated-Layers
📜 Example outputs: https://github.com/severian42/Cascade-of-Semantically-Integrated-Layers/blob/main/examples.md
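To make the flow concrete, here's a heavily simplified sketch of the cascade in plain Python. The prompts and function shape are illustrative only; the real layer logic, including the knowledge-graph step, lives in the repo above. Here, `llm` is any callable that takes a prompt string and returns the model's completion.

```python
# Illustrative sketch of the four-layer cascade -- not the repo's actual code.
from typing import Callable

def casil(user_input: str, llm: Callable[[str], str]) -> str:
    # Layer 1: Initial Understanding -- capture the basic concepts in the input.
    concepts = llm(
        f"List the key concepts and what is actually being asked in:\n{user_input}"
    )

    # Layer 2: Relationship Analysis -- map how those concepts interconnect
    # (the full version builds a lightweight knowledge graph for this step).
    relationships = llm(
        f"Given these concepts:\n{concepts}\n"
        "Describe how they relate to and constrain one another."
    )

    # Layer 3: Context Integration -- add historical or contextual knowledge.
    context = llm(
        f"Concepts:\n{concepts}\nRelationships:\n{relationships}\n"
        "Add any background knowledge needed to reason about them well."
    )

    # Layer 4: Response Synthesis -- piece everything together into one answer.
    return llm(
        f"Question: {user_input}\n"
        f"Concepts: {concepts}\nRelationships: {relationships}\nContext: {context}\n"
        "Write a single coherent, conversational answer."
    )
```

Plug in whatever you like as `llm` (an API client wrapper, a local model, etc.) and each layer's output simply feeds the next.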
Collections: 5
Spaces: 30
Models (44)
Severian/Nexus-IKM-RolePlay-StoryWriter-Hermes-2-Pro-7B-GGUF • Text Generation • Updated • 13 • 1
Severian/Jamba-v0.1-Claude-Chat-GGUF • Updated • 7 • 3
Severian/Jamba-Bagel-GGUF • Updated • 7 • 4
Severian/Jamba-UltraInteract-Instruct-1B-gguf • Updated • 23 • 2
Severian/Jamba-Nexus-4xMoE • Text Generation • Updated • 21 • 10
Severian/Jamba-900M-GGUF • Updated • 45 • 11
Severian/Llama-3-IMPACTS-2x8B-64k-GGUF • Text Generation • Updated • 43 • 2
Severian/Llama-3-IMPACTS-2x8B-64k-MLX • Text Generation • Updated • 8 • 4
Severian/Jamba-Hercules • Text Generation • Updated • 13 • 12
Severian/Mistral-v0.2-Nexus-Internal-Knowledge-Map-7B • Text Generation • Updated • 47 • 1
Datasets (6)
Severian/IMPACTS • Viewer • Updated • 47.7k • 37 • 5
Severian/Biomimicry-Nectar-BioDesign-STEM • Viewer • Updated • 2.04M • 37 • 2
Severian/Internal-Knowledge-Map • Viewer • Updated • 4.69k • 69 • 44
Severian/Internal-Knowledge-Map-StoryWriter-RolePlaying • Viewer • Updated • 2.07k • 35 • 11
Severian/Bio-Design-Process • Viewer • Updated • 60k • 31 • 2
Severian/Biomimicry • Viewer • Updated • 4.85k • 35 • 3