Bite-sized wisdom, maximum impact
- Basic Examples - Hello world to first app
- Streaming Examples - Real-time generation
- Batch Processing - Efficient bulk operations
- Chat Applications - Building conversational AI
- API Servers - REST/GraphQL/WebSocket
- Document QA - RAG systems
- Custom Routing - Smart model selection
- Caching Strategies - Performance optimization
- Model Ensemble - Multi-model inference
- Docker Examples - Containerization
- Kubernetes - Orchestration at scale
- Monitoring - Observability
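The Custom Routing category above is about picking the right model per request. As a rough sketch of the idea (the model names and length threshold here are made up for illustration, not part of llm-runner-router):

```javascript
// Illustrative custom-routing heuristic: choose a model id by
// prompt length. Model names and the threshold are assumptions
// for this sketch, not the library's built-in behavior.
function pickModel(prompt) {
  return prompt.length < 200
    ? 'tinyllama.gguf'    // small, fast model for short prompts
    : 'llama-large.gguf'; // larger model for long prompts
}

console.log(pickModel('Hi!')); // a short prompt selects the small model
```

The same shape generalizes to routing on cost budgets, latency targets, or task type instead of raw length.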
```javascript
// Quick start: one import, one call
import LLMRouter from 'llm-runner-router';

const router = new LLMRouter();
const response = await router.quick("Hello AI!");
```

```javascript
// Load models in different formats
await router.load('model.gguf'); // GGUF
await router.load('model.onnx'); // ONNX
await router.load('hf:gpt2');    // HuggingFace
```

```javascript
// Stream tokens in real time
for await (const token of router.stream(prompt)) {
  process.stdout.write(token);
}
```

```javascript
// Pick a routing strategy
router.setStrategy('quality-first'); // Or: 'cost-optimized', 'speed-priority'
```

Try these in your browser console:
```javascript
// Browser-ready example
const router = new LLMRouter({
  preferredEngine: 'webgpu'
});
await router.load('tinyllama.gguf');
console.log(await router.quick("Hi!"));
```

- Start Here: Basic Examples (5 min)
- Then: API Server (10 min)
- Advanced: Custom Routing (15 min)
- Production: Docker Deploy (20 min)
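The API Server step can be sketched with Node's built-in `http` module: wrap any generate function (for example `(p) => router.quick(p)`) in a request handler. The `/chat` route and JSON shape are assumptions for this sketch, not the library's documented server API.

```javascript
// Minimal API-server sketch using only Node's standard library.
// The generate function is injected, so the handler works with
// llm-runner-router (e.g. (p) => router.quick(p)) or any stub.
import http from 'node:http';

function createHandler(generate) {
  return async (req, res) => {
    if (req.method !== 'POST' || req.url !== '/chat') {
      res.writeHead(404).end();
      return;
    }
    let body = '';
    for await (const chunk of req) body += chunk; // collect request body
    const { prompt } = JSON.parse(body);
    const reply = await generate(prompt);
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ reply }));
  };
}

// Usage: http.createServer(createHandler((p) => router.quick(p))).listen(3000);
```

Injecting the generate function keeps the HTTP plumbing testable without loading a model.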
Examples are the bridges between theory and mastery 🌉