Local LLM assistance for research workflows
Location: infrastructure/llm/
Quick Reference: Modules Guide | API Reference
- Ollama Integration: Local model support (privacy-first)
- Template System: Pre-built prompts for research tasks
- Context Management: Multi-turn conversation handling
- Streaming Support: Real-time response generation
- Model Fallback: Automatic fallback to alternative models
- Token Counting: Track usage and costs
- Abstract summarization
- Literature review generation
- Code documentation
- Data interpretation
- Section drafting assistance
- Citation formatting
- Technical abstract translation (Chinese, Hindi, Russian)
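The Model Fallback feature listed above can be pictured as trying a list of models in order until one answers. The sketch below is only an illustration under stated assumptions, not the module's actual implementation: `query_with_fallback`, `ModelUnavailableError`, `fake_send`, and the model names are all hypothetical, and the transport is a stand-in so the example runs without an Ollama server.

```python
from typing import Callable, Sequence


class ModelUnavailableError(RuntimeError):
    """Hypothetical error raised when a model cannot serve a request."""


def query_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model, response) from the first success."""
    errors: list[str] = []
    for model in models:
        try:
            return model, send(model, prompt)
        except ModelUnavailableError as exc:
            # Remember why this model failed, then move on to the next one.
            errors.append(f"{model}: {exc}")
    raise ModelUnavailableError("all models failed: " + "; ".join(errors))


def fake_send(model: str, prompt: str) -> str:
    """Stand-in transport (assumption): pretends the first model is not installed."""
    if model == "primary-model":
        raise ModelUnavailableError("model not found")
    return f"[{model}] reply to: {prompt}"


used, reply = query_with_fallback(
    "What is machine learning?",
    ["primary-model", "backup-model"],
    fake_send,
)
print(used)
print(reply)
```

Passing the transport in as a callable keeps the fallback loop independent of how requests are actually sent, which also makes it easy to test offline.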
from infrastructure.llm import LLMClient
# Initialize client (reads OLLAMA_HOST, OLLAMA_MODEL from environment)
client = LLMClient()
# Simple query
response = client.query("What are the key findings in this paper?")
print(response)
# Apply research template
summary = client.apply_template(
    "summarize_abstract",
    text=abstract_text
)
# Generate literature review section
review = client.apply_template(
    "literature_review",
    topic="machine learning",
    papers=["paper1", "paper2", "paper3"]
)
# Short response (< 150 tokens)
answer = client.query_short("What is quantum entanglement?")
# Long response (> 500 tokens)
explanation = client.query_long("Explain quantum entanglement in detail")
# Structured JSON response
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "key_points": {"type": "array"}
    }
}
result = client.query_structured("Summarize...", schema=schema)
# Stream response in real-time
for chunk in client.stream_query("Write a research summary"):
    print(chunk, end="", flush=True)
# Check Ollama connection
uv run python -m infrastructure.llm.cli check
# List available models
uv run python -m infrastructure.llm.cli models
# Send query
uv run python -m infrastructure.llm.cli query "What is machine learning?"
# Apply template
uv run python -m infrastructure.llm.cli template summarize_abstract --input "Abstract text..."
Related: Reporting Module | Rendering Module
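The four CLI commands shown above (`check`, `models`, `query`, `template`) map naturally onto subcommands. The following is a minimal sketch of how such a command line could be parsed with the standard library's `argparse`; it is an assumption about the shape of the interface, not the actual contents of `infrastructure.llm.cli`, and the argument names `prompt` and `name` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per CLI action (illustrative only)."""
    parser = argparse.ArgumentParser(prog="infrastructure.llm.cli")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("check", help="Check the Ollama connection")
    sub.add_parser("models", help="List available models")

    query = sub.add_parser("query", help="Send a single prompt")
    query.add_argument("prompt")

    template = sub.add_parser("template", help="Apply a named template")
    template.add_argument("name")
    template.add_argument("--input", required=True)
    return parser


# Parse the same arguments as the template example above.
args = build_parser().parse_args(
    ["template", "summarize_abstract", "--input", "Abstract text..."]
)
print(args.command, args.name, args.input)
```

Each subcommand would then dispatch to the corresponding client call, for example `template` to `client.apply_template`.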