Conversation

@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 48% (0.48x) speedup for PatchMatch.patchmatch_available in invokeai/backend/image_util/infill_methods/patchmatch.py

⏱️ Runtime : 334 microseconds → 226 microseconds (best of 1 runs)

📝 Explanation and details

The optimized code achieves a 47% speedup by eliminating redundant method calls in the hot path. The key optimization is inlining the load check in patchmatch_available() instead of always calling _load_patch_match().

Key changes:

  1. Added explicit class attributes (patch_match = None, tried_load: bool = False) to make the caching mechanism clear and avoid potential AttributeError exceptions
  2. Inlined the tried_load check in patchmatch_available() - now it only calls _load_patch_match() when tried_load is False, avoiding the method call overhead for subsequent invocations

Why this is faster:

  • Eliminates method call overhead: The original code called _load_patch_match() on every invocation, which immediately returned if tried_load was True. The optimized version checks tried_load directly, avoiding the function call entirely
  • Reduces stack operations: Method calls involve stack frame creation/destruction, parameter passing, and return value handling - all eliminated for cached cases
  • Better branch prediction: The inlined check is simpler for the CPU to predict and optimize
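
As a concrete illustration of the pattern described above, here is a minimal, self-contained sketch. It approximates the class layout implied by the generated tests further down (class-level patch_match / tried_load attributes, a get_config().patchmatch flag, an optional patchmatch import); the stubbed get_config() and the exact method bodies are illustrative, not the upstream source.

# Minimal sketch of the caching pattern described above. The real class lives in
# invokeai/backend/image_util/infill_methods/patchmatch.py and reads InvokeAI's app
# config; the stubbed get_config() below exists only so this sketch runs on its own.

class _StubConfig:
    patchmatch = True  # stands in for the app-config flag that enables patchmatch


def get_config() -> _StubConfig:  # stand-in for InvokeAI's real get_config()
    return _StubConfig()


class PatchMatch:
    """Thin wrapper around the optional `patchmatch` package."""

    patch_match = None        # cached module handle, or None if unavailable
    tried_load: bool = False  # True once a load attempt has been made

    @classmethod
    def _load_patch_match(cls) -> None:
        if cls.tried_load:
            return
        if get_config().patchmatch:
            try:
                from patchmatch import patch_match  # optional dependency
                cls.patch_match = patch_match
            except ImportError:
                cls.patch_match = None
        cls.tried_load = True

    @classmethod
    def patchmatch_available(cls) -> bool:
        # Optimized hot path: check the flag inline instead of unconditionally
        # calling _load_patch_match(), so cached calls skip a method-call round trip.
        if not cls.tried_load:
            cls._load_patch_match()
        return cls.patch_match is not None and bool(cls.patch_match.patchmatch_available)


if __name__ == "__main__":
    # False unless a package providing `patchmatch.patch_match` is installed.
    print(PatchMatch.patchmatch_available())

The only difference on the hot path is the `if not cls.tried_load` guard: once the one-time load has happened, patchmatch_available() returns without entering _load_patch_match() at all.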

Performance characteristics based on test results:

  • Cached calls see 33-124% improvement (617ns vs 1.38μs for unavailable cached case)
  • First-time calls see 22-87% improvement across various scenarios
  • Large-scale repeated calls benefit significantly (45.7% faster for 1000 iterations)

The optimization particularly benefits workloads where patchmatch_available() is called frequently after the initial load, as the caching mechanism now has minimal overhead. This is especially valuable in image processing pipelines where availability checks might occur multiple times per operation.
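
To sanity-check the cached-call figures on your own machine, a rough micro-benchmark along these lines could be used (timings vary by hardware and Python version; this assumes an InvokeAI checkout on the import path):

import timeit

from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch

# The first call performs the one-time load attempt and populates the cache.
PatchMatch.patchmatch_available()

# Subsequent calls only hit the inlined tried_load check.
n = 100_000
per_call = timeit.timeit(PatchMatch.patchmatch_available, number=n) / n
print(f"cached patchmatch_available(): {per_call * 1e9:.0f} ns/call")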

Correctness verification report:

Test | Status
⚙️ Existing Unit Tests | 🔘 None Found
🌀 Generated Regression Tests | 12 Passed
⏪ Replay Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
📊 Tests Coverage | 75.0%
🌀 Generated Regression Tests and Runtime

import builtins
import types

# imports
import pytest
from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch

# --- End: PatchMatch class and dependencies ---

# --- Begin: Helper mocks for patching ---

class DummyConfig:
    def __init__(self, patchmatch):
        self.patchmatch = patchmatch

# Dummy patchmatch module
class DummyPatchMatch:
    def __init__(self, patchmatch_available):
        self.patchmatch_available = patchmatch_available

# 1. Basic Test Cases

def test_patchmatch_available_true(monkeypatch):
    """
    Basic: patchmatch is enabled in config and the patchmatch module is available and reports itself available.
    Should return True.
    """
    # Patch get_config to return patchmatch enabled
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    # Patch import of patchmatch.patch_match to return DummyPatchMatch with available True
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch(True)
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    # Run test
    codeflash_output = PatchMatch.patchmatch_available() # 2.17μs -> 1.29μs (69.1% faster)

def test_patchmatch_available_false_when_module_unavailable(monkeypatch):
    """
    Basic: patchmatch is enabled in config but the patchmatch module reports itself unavailable.
    Should return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch(False)
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.64μs -> 1.04μs (56.8% faster)

def test_patchmatch_disabled_in_config(monkeypatch):
    """
    Basic: patchmatch is disabled in config.
    Should return False, regardless of the patchmatch module.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=False))
    # __import__ should not even be called, but we'll patch it anyway to raise if called
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        raise RuntimeError("patchmatch module should not be imported when patchmatch is disabled")
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.62μs -> 941ns (72.3% faster)

def test_patchmatch_available_cached(monkeypatch):
    """
    Basic: patchmatch_available should not reload if tried_load is True.
    """
    PatchMatch.tried_load = True
    PatchMatch.patch_match = DummyPatchMatch(True)
    codeflash_output = PatchMatch.patchmatch_available() # 1.48μs -> 861ns (71.5% faster)

def test_patchmatch_unavailable_cached(monkeypatch):
    """
    Basic: patchmatch_available should not reload if tried_load is True, and patch_match is None.
    """
    PatchMatch.tried_load = True
    PatchMatch.patch_match = None
    codeflash_output = PatchMatch.patchmatch_available() # 1.38μs -> 617ns (124% faster)

# 2. Edge Test Cases

def test_patchmatch_import_raises(monkeypatch):
    """
    Edge: patchmatch is enabled in config but importing patchmatch raises ImportError.
    Should return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            raise ImportError("patchmatch not installed")
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.23μs -> 659ns (87.1% faster)

def test_patchmatch_module_missing_patchmatch_attribute(monkeypatch):
    """
    Edge: patchmatch is enabled, the patchmatch module exists, but it has no patch_match attribute.
    Should return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                pass
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.30μs -> 522ns (149% faster)

def test_patchmatch_patch_match_missing_patchmatch_available(monkeypatch):
    """
    Edge: patchmatch is enabled, the patchmatch module exists, patch_match exists but is missing the patchmatch_available attribute.
    Should return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    class PatchMatchNoAvailable:
        pass
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = PatchMatchNoAvailable()
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.26μs -> 533ns (136% faster)

def test_patchmatch_patch_match_available_not_bool(monkeypatch):
    """
    Edge: patchmatch_available is not a bool (e.g. the string 'yes').
    Should treat truthy values as True, falsy as False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch("yes")
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    # Python treats a non-empty string as True
    codeflash_output = PatchMatch.patchmatch_available() # 1.33μs -> 559ns (138% faster)

def test_patchmatch_patch_match_available_falsy(monkeypatch):
    """
    Edge: patchmatch_available is a falsy value (e.g. an empty string).
    Should return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch("")
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.23μs -> 722ns (70.5% faster)

def test_patchmatch_config_patchmatch_none(monkeypatch):
    """
    Edge: the patchmatch attribute in config is None.
    Should treat as disabled and return False.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=None))
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        raise RuntimeError("patchmatch module should not be imported when patchmatch is None")
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.15μs -> 634ns (81.5% faster)

def test_patchmatch_config_patchmatch_missing(monkeypatch):
    """
    Edge: the config object is missing the patchmatch attribute.
    Should treat as disabled and return False.
    """
    class NoPatchMatchConfig:
        pass
    monkeypatch.setitem(globals(), 'get_config', lambda: NoPatchMatchConfig())
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        raise RuntimeError("patchmatch module should not be imported when patchmatch is missing")
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available() # 1.28μs -> 637ns (101% faster)

# 3. Large Scale Test Cases

def test_patchmatch_available_many_times(monkeypatch):
    """
    Large scale: Call patchmatch_available repeatedly (1000 times) to test caching and performance.
    Should only load once.
    """
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    import_counter = {'count': 0}
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            import_counter['count'] += 1
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch(True)
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)

    # Call 1000 times
    for _ in range(1000):
        codeflash_output = PatchMatch.patchmatch_available() # 292μs -> 201μs (45.7% faster)

def test_patchmatch_available_toggle(monkeypatch):
    """
    Large scale: Simulate toggling the patchmatch config on/off between calls.
    Should reflect the config value only on the first call due to caching.
    """
    # First call: enabled, available True
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch(True)
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available()

    # Now, change config to disabled, but due to tried_load caching, value should not change
    monkeypatch.setitem(globals(), 'get_config', lambda: DummyConfig(patchmatch=False))
    codeflash_output = PatchMatch.patchmatch_available()  # still True due to caching

    # Reset class for a fresh test
    PatchMatch.tried_load = False
    PatchMatch.patch_match = None
    # Now, config disabled, so should return False
    codeflash_output = PatchMatch.patchmatch_available()

def test_patchmatch_available_with_large_config_object(monkeypatch):
    """
    Large scale: config object with many (1000) attributes, only patchmatch matters.
    Should still work and return the correct value.
    """
    class LargeConfig:
        def __init__(self, patchmatch):
            for i in range(1000):
                setattr(self, f"attr_{i}", i)
            self.patchmatch = patchmatch
    monkeypatch.setitem(globals(), 'get_config', lambda: LargeConfig(patchmatch=True))
    real_import = builtins.__import__
    def dummy_import_patchmatch(name, globals=None, locals=None, fromlist=(), level=0):
        if name == "patchmatch":
            class dummy_patchmatch_mod:
                patch_match = DummyPatchMatch(True)
            return dummy_patchmatch_mod()
        return real_import(name, globals, locals, fromlist, level)
    monkeypatch.setattr(builtins, '__import__', dummy_import_patchmatch)
    codeflash_output = PatchMatch.patchmatch_available()

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
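
One caveat worth noting when reusing these tests: the availability result is cached in class attributes, so tests can leak state into one another (several tests above reset PatchMatch.tried_load by hand). A small autouse fixture along these lines — a sketch assuming the tried_load and patch_match class attributes described in the explanation above, not part of the generated suite — would isolate each test:

import pytest

from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch


@pytest.fixture(autouse=True)
def reset_patchmatch_cache():
    # Start each test from an "unloaded" state and restore the previous
    # class-level cache afterwards.
    saved = (PatchMatch.tried_load, PatchMatch.patch_match)
    PatchMatch.tried_load = False
    PatchMatch.patch_match = None
    yield
    PatchMatch.tried_load, PatchMatch.patch_match = saved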

#------------------------------------------------
import pytest
from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch

# --- Begin: PatchMatch class and dependencies mock for testing purposes ---

# We'll create minimal stubs/mocks for the dependencies so that PatchMatch can be tested.
# These are NOT using pytest's mock, but just simple classes/functions.

# Simulate the patchmatch module and its patch_match object
class DummyPatchMatchModule:
    def __init__(self, available):
        self.patchmatch_available = available

# Simulate the config object returned by get_config()
class DummyConfig:
    def __init__(self, patchmatch_enabled):
        self.patchmatch = patchmatch_enabled

# --- End: PatchMatch class and dependencies mock ---

# --- Begin: Dependency Injection for tests ---

# We'll use global variables to inject dependencies for testing.
# This avoids using pytest.mock and keeps the environment close to production.

# These will be set by each test
_injected_config = None
_injected_patchmatch_module = None

# 1. Basic Test Cases

def test_patchmatch_enabled_and_available():
    """
    Test when config enables patchmatch and the patchmatch module is available.
    Should return True.
    """
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 3.97μs -> 2.15μs (84.1% faster)

def test_patchmatch_enabled_and_not_available():
    """
    Test when config enables patchmatch but the patchmatch module is NOT available.
    Should return False.
    """
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = DummyPatchMatchModule(available=False)
    codeflash_output = PatchMatch.patchmatch_available() # 2.44μs -> 1.62μs (50.3% faster)

def test_patchmatch_disabled():
    """
    Test when config disables patchmatch.
    Should return False regardless of the patchmatch module.
    """
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=False)
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 1.65μs -> 1.12μs (47.8% faster)

# 2. Edge Test Cases

def test_patchmatch_module_is_none():
    """
    Test when the patchmatch module import returns None (simulating import failure).
    Should return False.
    """
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = None
    codeflash_output = PatchMatch.patchmatch_available() # 1.36μs -> 1.11μs (22.2% faster)

def test_patchmatch_module_patchmatch_available_is_not_bool():
    """
    Test when the patchmatch_available attribute is not a bool.
    Should return the value as-is (Python truthiness).
    """
    class WeirdPatchMatchModule:
        patchmatch_available = "yes"
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = WeirdPatchMatchModule()
    # "yes" is truthy, so should return "yes"
    codeflash_output = PatchMatch.patchmatch_available() # 2.22μs -> 1.67μs (33.3% faster)

def test_patchmatch_called_multiple_times_caches_result():
    """
    Test that repeated calls do not reload patchmatch after the first load.
    Should return the same result and not re-import.
    """
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available(); result1 = codeflash_output # 1.48μs -> 1.08μs (37.0% faster)
    # Change the patchmatch module to unavailable, but tried_load should prevent reload
    _injected_patchmatch_module = DummyPatchMatchModule(available=False)
    codeflash_output = PatchMatch.patchmatch_available(); result2 = codeflash_output # 570ns -> 427ns (33.5% faster)

def test_patchmatch_available_with_config_patchmatch_is_none():
    """
    Test when config.patchmatch is None.
    Should treat as False (disabled).
    """
    class NonePatchMatchConfig:
        patchmatch = None
    global _injected_config, _injected_patchmatch_module
    _injected_config = NonePatchMatchConfig()
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 2.12μs -> 1.44μs (46.8% faster)

# 3. Large Scale Test Cases

def test_patchmatch_available_large_scale_enabled():
    """
    Test with a config object that simulates a large number of attributes.
    Only 'patchmatch' matters, but this ensures scalability.
    """
    class LargeConfig:
        def __init__(self, patchmatch):
            self.patchmatch = patchmatch
            # Add many unrelated attributes
            for i in range(1000):
                setattr(self, f"attr_{i}", i)
    global _injected_config, _injected_patchmatch_module
    _injected_config = LargeConfig(patchmatch=True)
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 2.22μs -> 1.37μs (62.8% faster)

def test_patchmatch_available_large_scale_disabled():
    """
    Test with a config object with many attributes, but patchmatch disabled.
    Should return False.
    """
    class LargeConfig:
        def __init__(self, patchmatch):
            self.patchmatch = patchmatch
            for i in range(1000):
                setattr(self, f"attr_{i}", i)
    global _injected_config, _injected_patchmatch_module
    _injected_config = LargeConfig(patchmatch=False)
    _injected_patchmatch_module = DummyPatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 1.98μs -> 1.29μs (52.8% faster)

def test_patchmatch_available_large_scale_module():
    """
    Test with a patchmatch module object with many attributes.
    Only patchmatch_available matters.
    """
    class LargePatchMatchModule:
        def __init__(self, available):
            self.patchmatch_available = available
            for i in range(1000):
                setattr(self, f"attr_{i}", i)
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = LargePatchMatchModule(available=True)
    codeflash_output = PatchMatch.patchmatch_available() # 1.79μs -> 1.25μs (43.5% faster)

def test_patchmatch_available_large_scale_module_unavailable():
    """
    Test with a patchmatch module object with many attributes, not available.
    Should return False.
    """
    class LargePatchMatchModule:
        def __init__(self, available):
            self.patchmatch_available = available
            for i in range(1000):
                setattr(self, f"attr_{i}", i)
    global _injected_config, _injected_patchmatch_module
    _injected_config = DummyConfig(patchmatch_enabled=True)
    _injected_patchmatch_module = LargePatchMatchModule(available=False)
    codeflash_output = PatchMatch.patchmatch_available() # 1.87μs -> 1.18μs (59.5% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

#------------------------------------------------
from invokeai.backend.image_util.infill_methods.patchmatch import PatchMatch

def test_PatchMatch_patchmatch_available():
    PatchMatch.patchmatch_available(PatchMatch)

To edit these changes, git checkout codeflash/optimize-PatchMatch.patchmatch_available-mhvqvp4g and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 08:35
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 12, 2025