⚡️ Speed up method Graph.topologicalSort by 140%
#890
📄 140% (1.40x) speedup for Graph.topologicalSort in code_to_optimize/topological_sort.py
⏱️ Runtime: 2.67 milliseconds → 1.11 milliseconds (best of 13 runs)
📝 Explanation and details
The optimized code achieves a 140% speedup by replacing a quadratic list operation with a linear one. The key change is replacing stack.insert(0, v) with stack.append(v), followed by a single stack.reverse() call.

What changed:
- topologicalSortUtil: stack.insert(0, v) → stack.append(v)
- topologicalSort: added stack.reverse() before returning
- visited[i] == False → not visited[i] (slightly more Pythonic)

Why this is faster:
The original code calls stack.insert(0, v) for every node visited. Each call is O(N), because a Python list must shift all existing elements when inserting at the head. For a graph with N nodes, the list operations alone cost O(N²) in total.

The optimized version uses stack.append(v) (amortized O(1)) for each node, then performs a single stack.reverse() (O(N)) at the end. This reduces the cost of the list operations from O(N²) to O(N).

Performance impact:
The line profiler shows the stack operation time in topologicalSortUtil dropped from 3.06ms (21% of total time) to 1.78ms (12.6% of total time). The optimization is most effective for larger graphs: test cases show 157-197% speedup for graphs with 1000 nodes, while smaller graphs (≤5 nodes) show minimal or mixed results, since the O(N²) vs O(N) difference is not significant at small scales.

This optimization preserves the function's output while substantially improving performance for larger topological sorting workloads.
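The change described above can be sketched roughly as follows. This is a minimal reconstruction, not the code from the PR: the constructor, the graph adjacency mapping, and the (order, sort_id) return value are inferred from the generated tests, and the UUID-based sort_id is an assumption.

```python
import uuid
from collections import defaultdict


class Graph:
    """Hypothetical reconstruction of the class under test."""

    def __init__(self, vertices):
        self.graph = defaultdict(list)  # adjacency list: node -> successors
        self.V = vertices

    def topologicalSortUtil(self, v, visited, stack):
        visited[v] = True
        for i in self.graph[v]:
            if not visited[i]:
                self.topologicalSortUtil(i, visited, stack)
        # Amortized O(1) append instead of O(N) insert(0, v);
        # the order is fixed up once in topologicalSort.
        stack.append(v)

    def topologicalSort(self):
        visited = [False] * self.V
        stack = []
        for i in range(self.V):
            if not visited[i]:
                self.topologicalSortUtil(i, visited, stack)
        # A single O(N) reversal yields the same order insert(0, v) would build.
        stack.reverse()
        return stack, str(uuid.uuid4())  # assumed: a fresh ID per call
```

Because each node is appended only after all of its reachable successors, reversing the finished list places every node ahead of its successors, which is the ordering the head-insertion version produced.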
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import uuid
from collections import defaultdict

# imports
import pytest
from code_to_optimize.topological_sort import Graph

# unit tests
# ----------- BASIC TEST CASES -----------

def test_empty_graph():
    # Edge: Empty graph (0 vertices)
    g = Graph(0)
    result, sort_id = g.topologicalSort() # 5.25μs -> 5.33μs (1.56% slower)

def test_single_node_graph():
    # Basic: Single node, no edges
    g = Graph(1)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.62μs (2.22% slower)

def test_two_nodes_no_edges():
    # Basic: Two nodes, no edges
    g = Graph(2)
    result, sort_id = g.topologicalSort() # 5.92μs -> 5.67μs (4.41% faster)

def test_two_nodes_one_edge():
    # Basic: Two nodes, one edge 0->1
    g = Graph(2)
    g.graph[0].append(1)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.33μs (3.13% faster)

def test_three_nodes_linear():
    # Basic: Linear chain 0->1->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.54μs -> 5.46μs (1.52% faster)

def test_three_nodes_branching():
    # Basic: Branching 0->1, 0->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    result, sort_id = g.topologicalSort() # 5.38μs -> 5.58μs (3.73% slower)

def test_three_nodes_diamond():
    # Basic: Diamond 0->1, 0->2, 1->2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.42μs -> 5.17μs (4.84% faster)

def test_disconnected_components():
    # Basic: Two disconnected chains: 0->1, 2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort() # 5.42μs -> 5.42μs (0.018% faster)
# ----------- EDGE TEST CASES -----------

def test_cycle_detection():
    # Edge: Graph with a cycle (should not be a valid topological sort)
    # The provided implementation does not detect cycles, so it will return a result
    # We check that the output does not satisfy topological sort constraints for a cycle
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    result, sort_id = g.topologicalSort() # 5.12μs -> 5.17μs (0.813% slower)
    # For a cycle, no valid topological sort exists. The output will be some permutation.
    # We check that at least one dependency is violated
    # 0->1, so 0 before 1; 1->2, so 1 before 2; 2->0, so 2 before 0
    # This is impossible; so at least one constraint must be violated.
    violations = 0
    if result.index(0) > result.index(1): violations += 1
    if result.index(1) > result.index(2): violations += 1
    if result.index(2) > result.index(0): violations += 1

def test_self_loop():
    # Edge: Node with a self-loop
    g = Graph(1)
    g.graph[0].append(0)
    result, sort_id = g.topologicalSort() # 4.62μs -> 4.79μs (3.46% slower)

def test_multiple_edges():
    # Edge: Multiple edges between same nodes
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort() # 5.08μs -> 5.12μs (0.820% slower)

def test_isolated_nodes():
    # Edge: Some nodes have no edges
    g = Graph(4)
    g.graph[0].append(1)
    # nodes 2 and 3 are isolated
    result, sort_id = g.topologicalSort() # 5.29μs -> 5.29μs (0.000% faster)

def test_reverse_edges():
    # Edge: All edges reversed (2->1, 1->0)
    g = Graph(3)
    g.graph[2].append(1)
    g.graph[1].append(0)
    result, sort_id = g.topologicalSort() # 4.96μs -> 5.00μs (0.820% slower)

def test_graph_with_no_edges():
    # Edge: Graph with nodes but no edges
    g = Graph(5)
    result, sort_id = g.topologicalSort() # 5.50μs -> 5.62μs (2.22% slower)

def test_graph_with_duplicate_edges():
    # Edge: Duplicate edges between nodes
    g = Graph(3)
    g.graph[0].extend([1,1,1])
    result, sort_id = g.topologicalSort() # 5.21μs -> 5.17μs (0.813% faster)

def test_large_graph_sparse_edges():
    # Large: 100 nodes, only 1 edge
    g = Graph(100)
    g.graph[0].append(99)
    result, sort_id = g.topologicalSort() # 24.7μs -> 20.9μs (18.1% faster)
# ----------- LARGE SCALE TEST CASES -----------

def test_large_linear_chain():
    # Large: Linear chain of 1000 nodes
    N = 1000
    g = Graph(N)
    for i in range(N-1):
        g.graph[i].append(i+1)
    result, sort_id = g.topologicalSort()

def test_large_branching_tree():
    # Large: Tree structure, 1000 nodes, each node i points to i+1 and i+2 (if possible)
    N = 1000
    g = Graph(N)
    for i in range(N):
        if i+1 < N:
            g.graph[i].append(i+1)
        if i+2 < N:
            g.graph[i].append(i+2)
    result, sort_id = g.topologicalSort()
    # For each i, i before i+1 and i+2
    for i in range(N):
        if i+1 < N:
            pass
        if i+2 < N:
            pass

def test_large_disconnected_graph():
    # Large: 10 chains of 100 nodes each, disconnected
    chains = 10
    chain_len = 100
    N = chains * chain_len
    g = Graph(N)
    for c in range(chains):
        start = c * chain_len
        for i in range(chain_len - 1):
            g.graph[start + i].append(start + i + 1)
    result, sort_id = g.topologicalSort() # 415μs -> 155μs (167% faster)
    # For each chain, order must be preserved
    for c in range(chains):
        start = c * chain_len
        for i in range(chain_len - 1):
            pass

def test_large_graph_all_isolated():
    # Large: 1000 nodes, all isolated
    N = 1000
    g = Graph(N)
    result, sort_id = g.topologicalSort() # 405μs -> 150μs (169% faster)

def test_large_graph_with_cycles():
    # Large: 100 nodes, create a cycle between last three
    N = 100
    g = Graph(N)
    for i in range(N-3):
        g.graph[i].append(i+1)
    # Create cycle: N-3 -> N-2 -> N-1 -> N-3
    g.graph[N-3].append(N-2)
    g.graph[N-2].append(N-1)
    g.graph[N-1].append(N-3)
    result, sort_id = g.topologicalSort() # 25.2μs -> 19.8μs (26.9% faster)
    # At least one constraint must be violated due to cycle
    violations = 0
    if result.index(N-3) > result.index(N-2): violations += 1
    if result.index(N-2) > result.index(N-1): violations += 1
    if result.index(N-1) > result.index(N-3): violations += 1
# ----------- DETERMINISM AND UUID TESTS -----------

def test_uuid_is_unique():
    # Each call should produce a unique uuid string
    g = Graph(1)
    uuid_set = set()
    for _ in range(10):
        _, sort_id = g.topologicalSort() # 29.7μs -> 29.9μs (0.699% slower)
        uuid_set.add(sort_id)

def test_uuid_format():
    # UUID should be a valid string representation
    g = Graph(1)
    _, sort_id = g.topologicalSort() # 4.88μs -> 5.00μs (2.50% slower)
    import re

# ----------- RANDOMIZED TEST CASES (DETERMINISTIC) -----------

def test_random_graph_small():
    # Random graph, small size, deterministic edges
    g = Graph(5)
    g.graph[0].append(2)
    g.graph[1].append(2)
    g.graph[2].append(3)
    g.graph[3].append(4)
    result, sort_id = g.topologicalSort() # 5.71μs -> 5.54μs (3.01% faster)
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
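How codeflash performs this comparison internally is not shown here. As a hypothetical standalone illustration, the sketch below runs the same depth-first traversal with both stack-building strategies and checks that the resulting orders match. The topo_order function and its use_append flag are invented for this example and are not part of the PR.

```python
def topo_order(adj, n, use_append):
    """DFS ordering of nodes 0..n-1 using either stack-building strategy."""
    visited = [False] * n
    stack = []

    def visit(v):
        visited[v] = True
        for w in adj.get(v, ()):
            if not visited[w]:
                visit(w)
        if use_append:
            stack.append(v)      # optimized: amortized O(1) per node
        else:
            stack.insert(0, v)   # original: O(N) per node

    for v in range(n):
        if not visited[v]:
            visit(v)
    if use_append:
        stack.reverse()          # one O(N) pass at the end
    return stack


# Same edges as test_random_graph_small: 0->2, 1->2, 2->3, 3->4
adj = {0: [2], 1: [2], 2: [3], 3: [4]}
original = topo_order(adj, 5, use_append=False)
optimized = topo_order(adj, 5, use_append=True)
assert original == optimized
```

Comparing the two outputs element by element is stricter than checking that each is a valid topological order, since a graph can have several valid orders.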
#------------------------------------------------
import uuid
from collections import defaultdict

# imports
import pytest # used for our unit tests
from code_to_optimize.topological_sort import Graph

# unit tests

def is_valid_topo_sort(order, graph_edges, num_vertices):
    """Helper function to check if the returned order is a valid topological sort."""
    position = {node: idx for idx, node in enumerate(order)}
    for u in range(num_vertices):
        for v in graph_edges[u]:
            if position[u] >= position[v]:
                return False
    return set(order) == set(range(num_vertices)) and len(order) == num_vertices
# ------------------------
# 1. Basic Test Cases
# ------------------------

def test_single_node_graph():
    # Graph with a single node and no edges
    g = Graph(1)
    order, sort_id = g.topologicalSort() # 4.96μs -> 4.88μs (1.70% faster)

def test_two_nodes_one_edge():
    # Graph: 0 -> 1
    g = Graph(2)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 4.96μs -> 4.92μs (0.854% faster)

def test_three_nodes_chain():
    # Graph: 0 -> 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    order, sort_id = g.topologicalSort() # 5.08μs -> 5.04μs (0.813% faster)

def test_three_nodes_branch():
    # Graph: 0 -> 1, 0 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    order, sort_id = g.topologicalSort() # 5.17μs -> 5.08μs (1.63% faster)

def test_multiple_components():
    # Graph: 0->1, 2->3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    order, sort_id = g.topologicalSort() # 5.21μs -> 5.17μs (0.813% faster)
# ------------------------
# 2. Edge Test Cases
# ------------------------

def test_empty_graph():
    # Empty graph (no nodes)
    g = Graph(0)
    order, sort_id = g.topologicalSort() # 4.25μs -> 4.38μs (2.86% slower)

def test_graph_with_no_edges():
    # Graph with 5 nodes, no edges
    g = Graph(5)
    order, sort_id = g.topologicalSort() # 5.58μs -> 5.46μs (2.31% faster)

def test_graph_with_cycle():
    # Graph: 0->1->2->0 (cycle)
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    # The function does NOT detect cycles, so it will recurse infinitely or stack overflow.
    # We expect a RecursionError or stack overflow.
    with pytest.raises(RecursionError):
        g.topologicalSort()

def test_graph_with_self_loop():
    # Graph: 0->0 (self loop)
    g = Graph(1)
    g.graph[0].append(0)
    with pytest.raises(RecursionError):
        g.topologicalSort()

def test_disconnected_graph():
    # Graph: 0->1, 2, 3 (disconnected nodes)
    g = Graph(4)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 7.21μs -> 7.79μs (7.48% slower)
    # 2 and 3 can be anywhere

def test_duplicate_edges():
    # Graph: 0->1, 0->1 (duplicate edge)
    g = Graph(2)
    g.graph[0].append(1)
    g.graph[0].append(1)
    order, sort_id = g.topologicalSort() # 6.08μs -> 6.46μs (5.81% slower)
# ------------------------
# 3. Large Scale Test Cases
# ------------------------

def test_large_linear_chain():
    # Graph: 0->1->2->...->999
    n = 1000
    g = Graph(n)
    for i in range(n-1):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort()
    for i in range(n-1):
        pass

def test_large_wide_graph():
    # Graph: 0->i for i=1 to 999
    n = 1000
    g = Graph(n)
    for i in range(1, n):
        g.graph[0].append(i)
    order, sort_id = g.topologicalSort() # 432μs -> 168μs (157% faster)
    # 0 must come before all others
    for i in range(1, n):
        pass

def test_large_sparse_graph():
    # Graph: edges only from even to next odd (0->1, 2->3, ...)
    n = 1000
    g = Graph(n)
    for i in range(0, n-1, 2):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort() # 387μs -> 130μs (197% faster)
    for i in range(0, n-1, 2):
        pass

def test_large_graph_with_multiple_components():
    # Two chains: 0->1->...->499 and 500->501->...->999
    n = 1000
    g = Graph(n)
    for i in range(0, 499):
        g.graph[i].append(i+1)
    for i in range(500, 999):
        g.graph[i].append(i+1)
    order, sort_id = g.topologicalSort() # 404μs -> 141μs (187% faster)
    # Check chains are respected
    for i in range(0, 499):
        pass
    for i in range(500, 999):
        pass

def test_large_graph_with_no_edges():
    # 1000 nodes, no edges
    n = 1000
    g = Graph(n)
    order, sort_id = g.topologicalSort() # 401μs -> 148μs (169% faster)
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from code_to_optimize.topological_sort import Graph

def test_Graph_topologicalSort():
    Graph.topologicalSort(Graph(1))
🔎 Concolic Coverage Tests and Runtime
codeflash_concolic_7dnvqc5e/tmptwugmlm2/test_concolic_coverage.py::test_Graph_topologicalSort

To edit these changes, git checkout codeflash/optimize-Graph.topologicalSort-mhq0bhxy and push.