Mike Brummett PRO

GoDjMike

AI & ML interests

Edge detection, road anomaly identification, story-generation libraries

Organizations

None yet

GoDjMike's activity

liked a model 1 day ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 5 days ago • 8.1k • 673

liked a model 3 days ago

microsoft/phi-4

Text Generation • Updated 3 days ago • 35.9k • 978

liked a model 5 days ago

tiiuae/Falcon3-10B-Instruct

Text Generation • Updated 1 day ago • 18.9k • 81

liked a dataset 9 days ago

m-ric/agents_medium_benchmark_2

Viewer • Updated 15 days ago • 142 • 142 • 7

liked 2 models 16 days ago

opensourcerelease/DeepSeek-V3-bf16

Updated 12 days ago • 2.21k • 19

deepseek-ai/DeepSeek-V3

Updated 13 days ago • 110k • 1.66k

liked a model 18 days ago

Derendering/InkSight-Small-p

Updated about 1 month ago • 55 • 28

liked 2 Spaces 18 days ago

Running on Zero

🚀

Derendering/InkSight-Derenderings

Viewer • Updated Dec 5, 2024 • 300 • 61 • 1

liked a Space 20 days ago

Running on T4

2.11k

🐶

liked a Space 21 days ago

Running

💻🧲

20+ Multi LLM Playground with Web Search

liked a model 21 days ago

google/gemma-2-9b-it

Text Generation • Updated Aug 27, 2024 • 272k • 616

reacted to singhsidhukuldeep's post with 🧠 21 days ago

Post

3631

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
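
To make the idea of entropy-based patching concrete, here is a minimal sketch in Python. It is not BLT's implementation: the post only says that patch boundaries follow an entropy model, so this example assumes a toy bigram count model as a stand-in for that entropy model, and the names `fit_bigram_counts`, `next_byte_entropy`, `entropy_patches`, and the `threshold` parameter are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_bigram_counts(data: bytes) -> dict:
    """Count next-byte occurrences per preceding byte (toy stand-in for an entropy model)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        counts[prev][nxt] += 1
    return counts


def next_byte_entropy(counts: dict, prev: int) -> float:
    """Shannon entropy in bits of the estimated next-byte distribution after `prev`."""
    dist = counts.get(prev)
    if not dist:
        return 8.0  # unseen context: assume maximum uncertainty over 256 byte values
    total = sum(dist.values())
    return -sum((c / total) * math.log2(c / total) for c in dist.values())


def entropy_patches(data: bytes, counts: dict, threshold: float) -> list:
    """Split `data` into patches, opening a new patch where predicted entropy exceeds `threshold`."""
    if not data:
        return []
    patches = []
    start = 0
    for i in range(1, len(data)):
        # High entropy means the next byte is hard to predict, so start a new patch here.
        if next_byte_entropy(counts, data[i - 1]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches
```

Calling `entropy_patches(text, fit_bigram_counts(text), threshold)` with a lower `threshold` yields more, shorter patches, so more steps of the larger model are spent on hard-to-predict regions. A higher `threshold` yields fewer, longer patches.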

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms to move information between bytes and patches, and hash n-gram embeddings to enrich the byte representations. The architecture allows patch size and model size to be scaled up together while the inference cost stays fixed.
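
As a rough illustration of hash n-gram embeddings, the sketch below adds, for each byte position, the embeddings of the n-grams ending at that position, looked up in fixed-size tables by hash bucket. Everything specific here is an assumption rather than a detail from the post: the table size, the embedding width, the n-gram sizes, the use of `zlib.crc32` as the hash, and the function names are all hypothetical choices for the example.

```python
import random
import zlib

TABLE_SIZE = 1024        # hypothetical number of hash buckets per n-gram size
DIM = 8                  # hypothetical embedding width
NGRAM_SIZES = (3, 4, 5)  # illustrative sizes, not taken from the post


def make_table(rows: int, seed: int) -> list:
    """Build a table of small random vectors (stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(DIM)] for _ in range(rows)]


BYTE_TABLE = make_table(256, seed=0)
NGRAM_TABLES = {n: make_table(TABLE_SIZE, seed=n) for n in NGRAM_SIZES}


def ngram_bucket(ngram: bytes) -> int:
    """Map an n-gram to a bucket index; distinct n-grams may collide by design."""
    return zlib.crc32(ngram) % TABLE_SIZE


def embed(data: bytes) -> list:
    """Return one vector per byte: byte embedding plus hashed n-gram embeddings."""
    out = []
    for i, b in enumerate(data):
        vec = list(BYTE_TABLE[b])
        for n in NGRAM_SIZES:
            if i + 1 >= n:  # only n-grams that fit entirely before or at position i
                row = NGRAM_TABLES[n][ngram_bucket(data[i + 1 - n : i + 1])]
                vec = [v + r for v, r in zip(vec, row)]
        out.append(vec)
    return out
```

Hashing keeps the table size fixed no matter how many distinct n-grams occur, at the cost of occasional collisions between unrelated n-grams.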

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!