Skip to content

Conversation

tafk7
Copy link
Collaborator

@tafk7 tafk7 commented Sep 26, 2025

Summary

This PR enables CustomOp registration through Python module namespaces, eliminating the need for manual dictionary registration. CustomOps can now be discovered automatically by importing the module corresponding to the ONNX domain name. The system maintains full backward compatibility with legacy custom_op dictionaries while providing better IDE support and a more Pythonic API.

Key Changes

Core Registry (src/qonnx/custom_op/registry.py)

  • CustomOps now discoverable via module namespace (e.g., getattr(module, op_type))
  • Namespace lookup prioritized with automatic fallback to legacy custom_op dictionaries
  • Domain aliasing system enables cross-package compatibility (add_domain_alias())
  • Added resolve_domain() helper for alias resolution
  • Enhanced error messages with actionable guidance

Performance & Safety Improvements

  • Lazy on-demand discovery with in-memory caching (_OP_REGISTRY)
  • Thread-safe access via RLock
  • Privatized registry state (_OP_REGISTRY, _DOMAIN_ALIASES)

Cleanup

  • Removed add_op_to_domain() and get_ops_in_domain() functions
  • Removed unnecessary __all__ exports from modules
  • Removed unused entrypoint system

Benefits

  • Developer Experience: CustomOps work like normal Python classes with IDE autocomplete
  • Extensibility: External packages register ops by exposing classes in module namespace
  • Simplicity: No manual dictionary management required
  • Performance: Cached lookups after initial discovery
  • Compatibility: Existing custom_op dictionaries continue to work

Migration Path

No code changes required. The system automatically discovers ops from both module namespaces and legacy custom_op dictionaries. New CustomOps can be registered by:

  1. Exporting CustomOp classes in module namespace (preferred, Pythonic)
  2. Adding to legacy custom_op dictionaries (backward compatibility)
  3. Using add_domain_alias() to map custom domains to module paths

  Replace dictionary-based registration with module namespace lookup:
  - Direct attribute lookup in module namespace (preferred)
  - Legacy custom_op dictionary support (backward compatibility)
  - Domain aliasing (e.g., onnx.brevitas -> qonnx.custom_op.general)
  - Runtime op registration via add_op_to_domain()
  - No expensive fallbacks (removed case-insensitive matching, full module scan)
  - Clear, actionable error messages

  Benefits:
  - Better IDE support with direct imports
  - Cleaner, more Pythonic API
  - O(1) lookup performance
  - Predictable behavior
  - External packages work via domain-based imports (e.g., finn.custom_op.fpgadataflow)
@jurevreca12 jurevreca12 mentioned this pull request Sep 29, 2025
@jurevreca12 jurevreca12 self-requested a review October 5, 2025 18:22
@jurevreca12
Copy link
Collaborator

How are these ops in the custom domain suppose to execute?
qonnx/core/onnx_exec.py uses an if statement to check for custom ops.

if is_finn_op(node.domain):
    ...

However, the function is_finn_op remains unchanged. It still checks for the "classic" domains:

def is_finn_op(op_type):
    return op_type.startswith("finn") or op_type.startswith("qonnx.custom_op") or op_type.startswith("onnx.brevitas")

I also tested the code on chisel4ml, with a custom defined in the chisel4ml.preprocess namespace. I then called
qonnx.custom_op.registry.add_domain_alias("chisel4ml.preprocess", "qonnx.custom_op.general"). However, the if statement described above did not trigger and I get an onnx runtime error.

Am I missing something? should I be calling something else? Honestly, I just don't see how this can execute.

Example:
add_op_to_domain("qonnx.custom_op.general", "TestOp", TestOp)
"""
if not inspect.isclass(op_class) or not issubclass(op_class, CustomOp):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is not correct. It evaluates to True even if op_class is not subclass of CustomOp. This code will evaluate to True on anything, even if op_class is a float.

Regardless of that, I do not see the point of using inspect here. If the intention is to prevent the user from using an object instance here, issubclass will raise a TypeError. Which is a more appropriate error than ValueError.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is from an old iteration where it cached all CustomOps at a given domain on first access, this avoided illegal comparisons. Thanks for catching this, will remove in next commit.

# Strategy 1: Direct namespace lookup (preferred)
if hasattr(module, op_type):
obj = getattr(module, op_type)
if inspect.isclass(obj) and issubclass(obj, CustomOp):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is obj a class? judging from the code bellow it is a class not an object. This makes the variable name obj somewhat misleading. Also the same argument about inspect holds here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in minor refactor, along with adding caching and thread safety.

@tafk7
Copy link
Collaborator Author

tafk7 commented Oct 11, 2025

However, the function is_finn_op remains unchanged. It still checks for the "classic" domains:

def is_finn_op(op_type):
    return op_type.startswith("finn") or op_type.startswith("qonnx.custom_op") or op_type.startswith("onnx.brevitas")

I also tested the code on chisel4ml, with a custom defined in the chisel4ml.preprocess namespace. I then called qonnx.custom_op.registry.add_domain_alias("chisel4ml.preprocess", "qonnx.custom_op.general"). However, the if statement described above did not trigger and I get an onnx runtime error.

Am I missing something? should I be calling something else? Honestly, I just don't see how this can execute.

@jurevreca12 Could you clarify the expected behavior of the "Brevitas exception?" Since it was only applied in the registry, I assumed similar aliasing should only be used for path resolution when the node's domain doesn't match its import path. Therefore, it's proper for is_finn_op to be directly applied to the node's domain attribute, instead of resolving any aliasing. Instead, should is_finn_op always compare against the module's true path?

  - Replace eager module inspection with on-demand discovery
  - Add RLock for thread-safe registry access
  - Cache discovered ops in _OP_REGISTRY dict
  - Remove get_ops_in_domain() and add_op_to_domain()
  - Privatize registry state variables
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants