SG-37902 - Remove duplicate filter conditions in find and summarize operations #409


camiloleal-globant
SG-37902 - Deduplicate filter conditions in the Python API before sending requests

Overview

This change introduces automatic deduplication of filter conditions within the Python API client. This prevents redundant filter data from being sent to the server, which reduces payload size and minimizes network latency. While this adds a small amount of overhead to local processing, the resulting decrease in network transfer time improves the end-to-end performance of find() and summarize() operations. The fix is self-contained within the Python API and requires no server-side changes.

Problem

API requests constructed with duplicate filter conditions are inefficient. They result in:

  • Larger Payloads: Unnecessary data is sent over the network, consuming bandwidth.
  • Increased Latency: Larger requests take more time to transfer, especially on slower networks.
  • Wasted Server Resources: The server must still receive and parse the redundant information.

Solution

A new set of utility functions has been added to shotgun_api3/shotgun.py to recursively clean filters before they are sent to the server.

  • remove_duplicate_filters(filters): The main entry point that sanitizes a list of filters.
  • Core Logic: The implementation uses a set of normalized filter representations to efficiently track and discard duplicates while preserving the original order of the first occurrence.
  • Resilience: The process is wrapped in a try...except block to ensure that if any unexpected error occurs during deduplication, the original, untouched filters are sent to the server, preventing any client-side crashes.
  • Integration: The _translate_filters_list function now calls remove_duplicate_filters before processing, making the change transparent to all API methods that use filters (find, find_one, summarize).
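
The approach described above can be sketched as follows. This is a minimal illustration, not the exact implementation in `shotgun_api3/shotgun.py`; the `_normalize_filter` helper name and the JSON-based normalization are assumptions for the sketch:

```python
import json


def _normalize_filter(f):
    # Hypothetical normalization helper: produce a hashable, canonical
    # representation of a filter. json.dumps with sort_keys=True makes
    # dict key order irrelevant; default=str covers values that are not
    # natively JSON-serializable (e.g. datetimes).
    return json.dumps(f, sort_keys=True, default=str)


def remove_duplicate_filters(filters):
    # Drop duplicate filter conditions while preserving the order of the
    # first occurrence. If anything unexpected goes wrong, return the
    # original filters untouched so the request still goes through.
    try:
        seen = set()
        result = []
        for f in filters:
            key = _normalize_filter(f)
            if key not in seen:
                seen.add(key)
                result.append(f)
        return result
    except Exception:
        return filters
```

The broad `except` is deliberate: deduplication is an optimization, so any failure should degrade to the original behavior rather than break the request.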

By handling deduplication on the client side, the API creates a more efficient and robust integration, saving network bandwidth and reducing the overall request time.

Performance Impact

Client-side deduplication has two main effects: a minor processing overhead and a major payload reduction.

  • The added processing overhead is negligible for most queries (under 0.1ms for 100 filters) and only becomes potentially noticeable in extreme cases with thousands of filters.
  • This overhead is offset by a significant reduction in payload size—up to 94% in tests—which directly translates to faster, more efficient network requests.
Full Benchmark Report

Overview of Tests

I tested the performance overhead and payload reduction of the client-side deduplication feature by running a series of filter translation operations. The tests covered scenarios with a small (10), medium (100), and large (1000) number of filters. For each of these sizes, I tested cases with no duplicates (0%), few duplicates (20%), some duplicates (50%), and many duplicates (90%).

Results Summary

The following table shows the median time it takes to prepare the API request (overhead) and the final size of the data sent to the server (payload). A positive percentage in the 'Impact' column means the feature made the process slower.

| Test Scenario | Time without Fix (ms) | Time with Fix (ms) | Performance Impact | Payload without Fix (bytes) | Payload with Fix (bytes) | Payload Reduction |
| --- | --- | --- | --- | --- | --- | --- |
| Small (10 filters), 0% Dups | 0.0030 | 0.0110 | +222.86% | 691 | 571 | 17.37% |
| Small (10 filters), 20% Dups | 0.0030 | 0.0110 | +232.35% | 687 | 589 | 14.26% |
| Small (10 filters), 50% Dups | 0.0030 | 0.0100 | +191.43% | 725 | 377 | 48.00% |
| Small (10 filters), 90% Dups | 0.0030 | 0.0080 | +142.86% | 713 | 105 | 85.27% |
| Medium (100 filters), 0% Dups | 0.0330 | 0.0980 | +194.28% | 6729 | 4367 | 35.10% |
| Medium (100 filters), 20% Dups | 0.0320 | 0.0940 | +195.61% | 6660 | 3600 | 45.95% |
| Medium (100 filters), 50% Dups | 0.0320 | 0.0910 | +181.99% | 6731 | 2461 | 63.44% |
| Medium (100 filters), 90% Dups | 0.0320 | 0.0810 | +150.78% | 6653 | 655 | 90.15% |
| Large (1000 filters), 0% Dups | 0.3680 | 0.9600 | +160.96% | 66267 | 33703 | 49.14% |
| Large (1000 filters), 20% Dups | 0.3560 | 0.8870 | +148.96% | 66351 | 26776 | 59.64% |
| Large (1000 filters), 50% Dups | 0.3510 | 0.8320 | +136.97% | 66303 | 17170 | 74.10% |
| Large (1000 filters), 90% Dups | 0.3470 | 0.7570 | +117.88% | 66262 | 3914 | 94.09% |

How the Tests Were Run

I ran the tests by directly calling the filter translation logic within the Python API. To get stable results, I ran each scenario 50 times and recorded the median processing time and resulting payload size.
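A harness along these lines can reproduce the methodology. This is an illustrative sketch, not the actual benchmark script: the `build_filters` generator, the inline `dedupe` stand-in for `remove_duplicate_filters`, and the specific filter shapes are assumptions, so absolute numbers will differ from the table above:

```python
import json
import statistics
import timeit


def build_filters(n, dup_ratio):
    # Hypothetical generator: n simple filters, where roughly dup_ratio
    # of the entries repeat an earlier filter.
    unique = max(1, int(n * (1 - dup_ratio)))
    base = [['sg_field_%d' % i, 'is', i] for i in range(unique)]
    return [base[i % unique] for i in range(n)]


def dedupe(filters):
    # Stand-in for remove_duplicate_filters: same order-preserving,
    # set-based deduplication idea.
    seen, out = set(), []
    for f in filters:
        key = json.dumps(f, sort_keys=True, default=str)
        if key not in seen:
            seen.add(key)
            out.append(f)
    return out


# Run the operation 50 times and take the median, as in the report.
filters = build_filters(100, 0.5)
overhead_ms = statistics.median(
    timeit.repeat(lambda: dedupe(filters), number=1, repeat=50)) * 1000

# Approximate payload size as the length of the serialized filter list.
payload_before = len(json.dumps(filters))
payload_after = len(json.dumps(dedupe(filters)))
print(overhead_ms, payload_before, payload_after)
```

Using the median rather than the mean keeps one-off scheduler hiccups from skewing the timing results.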

Example

The following example demonstrates how the filter list is cleaned before being sent to the server.

# sg is an existing shotgun_api3.Shotgun connection
project_filter = ['project', 'is', {'type': 'Project', 'id': 123}]

# Define filters with duplicates
filters_with_duplicates = [
    project_filter,
    ['sg_status_list', 'is', 'rev'],
    project_filter,  # DUPLICATE
    ['entity', 'type_is', 'Shot'],
    project_filter   # DUPLICATE
]

# This call will now automatically deduplicate the filters
shots = sg.find('Shot', filters_with_duplicates, ['id', 'code'])

Resulting conditions sent to the server (After client-side deduplication):
The Python API now generates a cleaner, smaller set of conditions to send in the request body.

{
  "logical_operator": "and",
  "conditions": [
    {"path": "project", "relation": "is", "values": [{"type": "Project", "id": 123}]},
    {"path": "sg_status_list", "relation": "is", "values": ["rev"]},
    {"path": "entity", "relation": "type_is", "values": ["Shot"]}
  ]
}

Additional examples including complex nested filters and edge cases can be found in the new test suite at tests/test_unit.py.
