⚡️ Speed up function extract_content_graph by 443%
#48
📄 443% (4.43x) speedup for extract_content_graph in cognee/infrastructure/llm/extraction/knowledge_graph/extract_content_graph.py
⏱️ Runtime: 24.5 milliseconds → 4.52 milliseconds (best of 24 runs)
📝 Explanation and details
The optimization achieves a 443% speedup primarily by caching expensive Jinja2 template operations in the render_prompt function, which was the dominant bottleneck and consumed 71% of execution time.

Key optimizations applied:

1. Jinja2 Environment caching: added @lru_cache(maxsize=8) to _get_jinja_env() so that Environment instances are cached per base directory. This eliminates the expensive creation of Environment objects (including FileSystemLoader initialization) on every call.
2. Template directory path caching: added @lru_cache(maxsize=8) to _get_templates_dir() to cache the result of get_absolute_path(), avoiding repeated path resolution when base_directory is None.

Performance impact analysis:

- Before: render_prompt took 391 ms in total, with 297 ms (75.9%) spent in env.get_template() due to repeated Environment creation.
- After: render_prompt dropped to just 24 ms in total, with only 8.8 ms spent in template loading, a 94% reduction.

Throughput benefits: extract_content_graph repeatedly renders the same prompt template, so every call after the first reuses the cached Environment and the already parsed template.

Test case performance: the optimization excels in scenarios with repeated template usage, as demonstrated by the concurrent mixed-model tests, where multiple calls to the same underlying prompt templates benefit from the cached Jinja2 environments and template parsing.
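As a minimal sketch of this caching pattern (not the exact cognee implementation; the helper signatures and the default template path below are assumptions based on the summary above):

from functools import lru_cache
from typing import Optional

from jinja2 import Environment, FileSystemLoader


@lru_cache(maxsize=8)
def _get_templates_dir(base_directory: Optional[str] = None) -> str:
    # Resolve the templates directory once per distinct base_directory
    # (cognee resolves it via get_absolute_path(); a plain default is used here).
    return base_directory or "./prompts"


@lru_cache(maxsize=8)
def _get_jinja_env(base_directory: Optional[str] = None) -> Environment:
    # Build the Jinja2 Environment (and its FileSystemLoader) only once per
    # base directory instead of on every render_prompt call.
    return Environment(loader=FileSystemLoader(_get_templates_dir(base_directory)))


def render_prompt(filename: str, context: dict, base_directory: Optional[str] = None) -> str:
    # The Environment also caches compiled templates, so repeated renders of the
    # same prompt skip both loader setup and template compilation.
    return _get_jinja_env(base_directory).get_template(filename).render(**context)

Because lru_cache keys on the base_directory argument, distinct template roots still get their own Environment, while repeated calls with the same root hit the cache.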
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import asyncio  # used to run async functions

# --- FUNCTION UNDER TEST (EXACT COPY) ---
import os

# Patch the extract_content_graph function's dependencies for testing
import sys
from typing import Optional, Type

import pytest  # used for our unit tests
from cognee.infrastructure.llm.config import get_llm_config
from cognee.infrastructure.llm.extraction.knowledge_graph.extract_content_graph import extract_content_graph
from cognee.infrastructure.llm.LLMGateway import LLMGateway
from cognee.infrastructure.llm.prompts import render_prompt
from pydantic import BaseModel, ValidationError


# Minimal mock for LLMGateway.acreate_structured_output
class DummyResponseModel(BaseModel):
    type: str
    source: Optional[str] = None
    target: Optional[str] = None
    properties: Optional[dict] = None

from cognee.infrastructure.llm.extraction.knowledge_graph.extract_content_graph import extract_content_graph

# --- UNIT TESTS ---

# 1. Basic Test Cases
@pytest.mark.asyncio
#------------------------------------------------
import asyncio  # used to run async functions

# Function to test (copied exactly as provided)
import os

# Monkeypatching for all relevant imports inside extract_content_graph
import sys
import types
from typing import Optional, Type

import pytest  # used for our unit tests
from cognee.infrastructure.llm.config import get_llm_config
from cognee.infrastructure.llm.extraction.knowledge_graph.extract_content_graph import extract_content_graph
from cognee.infrastructure.llm.LLMGateway import LLMGateway
from cognee.infrastructure.llm.prompts import render_prompt
from pydantic import BaseModel, ValidationError

# --- Start: Minimal stubs and monkeypatching for dependencies ---

# Define a simple response model for testing
class DummyResponseModel(BaseModel):
    value: str


class DummyResponseModelInt(BaseModel):
    number: int

from cognee.infrastructure.llm.extraction.knowledge_graph.extract_content_graph import extract_content_graph

# --- Start: Unit tests ---

# 1. Basic Test Cases
@pytest.mark.asyncio
async def test_extract_content_graph_throughput_mixed_models():
    """Throughput: Test concurrent calls with mixed response models."""
    tasks = []
    for i in range(10):
        # Alternate between the int-valued and str-valued response models.
        if i % 2 == 0:
            tasks.append(extract_content_graph(str(i), DummyResponseModelInt))
        else:
            tasks.append(extract_content_graph(f"str{i}", DummyResponseModel))
    # Run all calls concurrently; exceptions are returned rather than raised.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for i, r in enumerate(results):
        if i % 2 == 0:
            pass
        else:
            pass
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
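As an illustrative sketch of that check (the capture variable is the one from the generated tests above; the actual comparison is performed by the codeflash harness and is only described in the comment, and the input text is made up):

@pytest.mark.asyncio
async def test_extract_content_graph_basic_equivalence():
    # The generated tests store the function result in codeflash_output; the harness
    # then compares the value recorded when running the original code with the value
    # recorded when running the optimized code.
    codeflash_output = await extract_content_graph("Alice works at Acme.", DummyResponseModel)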
To edit these changes, run git checkout codeflash/optimize-extract_content_graph-mhtudiim and push.