⚡️ Speed up function _get_max_datetime_complete by 21%
#175
📄 21% (0.21x) speedup for _get_max_datetime_complete in optuna/visualization/_timeline.py
⏱️ Runtime: 1.21 milliseconds → 1.01 milliseconds (best of 250 runs)
📝 Explanation and details
The optimization achieves a speedup of about 20% by eliminating redundant iterations over study.trials and avoiding unnecessary computation.

Key optimizations:
- Single-pass data collection: The original code iterates over study.trials three times (once for the maximum duration, once in _is_running_trials_in_study, and once for the maximum datetime_complete). The optimized version merges the first and third iterations into a single loop that computes max_run_duration and max_datetime_complete together.
- Eliminated list comprehensions: The original code built intermediate lists of durations and datetime_complete values and passed them to max(). The optimized version compares values directly during iteration, which avoids the memory allocation and its overhead.
- Conditional datetime.now() calls: The optimized code calls datetime.now() only when it is needed (when returning the current time), rather than potentially calling it in several code paths.
- Reduced multiplication operations: In the running-trials check, 5 * max_run_duration is computed once and stored in a variable instead of being recalculated for each trial.

Performance characteristics by test case:
The line profiler shows that the optimization trades a more complex main loop (each iteration does more work) for lower total execution time by eliminating redundant work. This matters because the function appears to process trial data for visualization, where performance on larger datasets is important.
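The single-pass idea described above can be sketched as follows. This is a minimal illustration and not the code from this PR or from Optuna: the Trial dataclass and the functions summarize_multi_pass and summarize_single_pass are hypothetical names introduced only for this example. It shows how two max() calls over intermediate lists can be folded into one loop that tracks both maxima.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Trial:
    """Hypothetical stand-in for a trial record; not Optuna's FrozenTrial."""

    datetime_start: Optional[datetime]
    datetime_complete: Optional[datetime]


Summary = Tuple[Optional[timedelta], Optional[datetime]]


def summarize_multi_pass(trials: List[Trial]) -> Summary:
    """Two passes, each building an intermediate list before calling max()."""
    durations = [
        t.datetime_complete - t.datetime_start
        for t in trials
        if t.datetime_complete is not None and t.datetime_start is not None
    ]
    completes = [t.datetime_complete for t in trials if t.datetime_complete is not None]
    return max(durations, default=None), max(completes, default=None)


def summarize_single_pass(trials: List[Trial]) -> Summary:
    """One pass that tracks both maxima without allocating intermediate lists."""
    max_duration: Optional[timedelta] = None
    max_complete: Optional[datetime] = None
    for t in trials:
        complete = t.datetime_complete
        if complete is None:
            # Without a completion time the trial contributes to neither maximum.
            continue
        if max_complete is None or complete > max_complete:
            max_complete = complete
        start = t.datetime_start
        if start is not None:
            duration = complete - start
            if max_duration is None or duration > max_duration:
                max_duration = duration
    return max_duration, max_complete
```

Both functions return the same pair for any input list; the single-pass version simply does more work per iteration in exchange for fewer iterations and no temporary lists.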
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import datetime
from enum import Enum
from typing import List, Optional

# imports
import pytest
from optuna.visualization._timeline import _get_max_datetime_complete


# --- Minimal stubs for optuna classes to allow testing ---
class TrialState(Enum):
    RUNNING = "RUNNING"
    COMPLETE = "COMPLETE"
    FAIL = "FAIL"
    WAITING = "WAITING"
    PRUNED = "PRUNED"


class FrozenTrial:
    def __init__(
        self,
        datetime_start: Optional[datetime.datetime],
        datetime_complete: Optional[datetime.datetime],
        state: TrialState,
    ):
        self.datetime_start = datetime_start
        self.datetime_complete = datetime_complete
        self.state = state


class Study:
    def __init__(self, trials: List[FrozenTrial]):
        self.trials = trials

# --- Unit tests ---
class TestGetMaxDatetimeComplete:
    # --- Basic Test Cases ---
    def test_single_complete_trial(self):
        # One trial, completed, start and complete set
        start = datetime.datetime(2024, 6, 1, 10, 0, 0)
        complete = datetime.datetime(2024, 6, 1, 11, 0, 0)
        trial = FrozenTrial(start, complete, TrialState.COMPLETE)
        study = Study([trial])
        codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 6.10μs -> 2.44μs (150% faster)

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import datetime
from typing import List, Optional

# imports
import pytest
from optuna.visualization._timeline import _get_max_datetime_complete


# --- Minimal mocks for Optuna types ---
class TrialState:
    RUNNING = "RUNNING"
    COMPLETE = "COMPLETE"
    FAIL = "FAIL"
    WAITING = "WAITING"


class FrozenTrial:
    def __init__(
        self,
        datetime_start: Optional[datetime.datetime],
        datetime_complete: Optional[datetime.datetime],
        state: str = TrialState.COMPLETE,
    ):
        self.datetime_start = datetime_start
        self.datetime_complete = datetime_complete
        self.state = state


class Study:
    def __init__(self, trials: List[FrozenTrial]):
        self.trials = trials

# --- Unit tests ---

# 1. BASIC TEST CASES

def test_no_trials_returns_now(monkeypatch):
    # No trials at all, should return datetime.now()
    study = Study([])
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 3.67μs -> 2.13μs (71.9% faster)
    after = datetime.datetime.now()


def test_all_trials_complete_returns_max_datetime_complete():
    # All trials are complete, should return the latest datetime_complete
    dt1 = datetime.datetime(2023, 1, 1, 10)
    dt2 = datetime.datetime(2023, 1, 1, 12)
    dt3 = datetime.datetime(2023, 1, 1, 15)
    trials = [
        FrozenTrial(dt1, dt1 + datetime.timedelta(hours=2), TrialState.COMPLETE),
        FrozenTrial(dt2, dt2 + datetime.timedelta(hours=1), TrialState.COMPLETE),
        FrozenTrial(dt3, dt3 + datetime.timedelta(hours=3), TrialState.COMPLETE),
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 6.01μs -> 3.07μs (95.6% faster)


def test_some_trials_missing_complete():
    # Some trials have None for datetime_complete, should ignore those
    dt1 = datetime.datetime(2023, 1, 2, 10)
    dt2 = datetime.datetime(2023, 1, 2, 12)
    trials = [
        FrozenTrial(dt1, None, TrialState.COMPLETE),
        FrozenTrial(dt2, dt2 + datetime.timedelta(hours=1), TrialState.COMPLETE),
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 5.15μs -> 2.25μs (129% faster)


def test_some_trials_missing_start():
    # Some trials have None for datetime_start, should ignore those for duration
    dt1 = datetime.datetime(2023, 1, 3, 10)
    dt2 = datetime.datetime(2023, 1, 3, 12)
    trials = [
        FrozenTrial(None, dt1 + datetime.timedelta(hours=2), TrialState.COMPLETE),
        FrozenTrial(dt2, dt2 + datetime.timedelta(hours=1), TrialState.COMPLETE),
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 5.16μs -> 2.45μs (110% faster)


def test_trials_with_various_states():
    # Only COMPLETE trials' datetime_complete are considered for max
    dt1 = datetime.datetime(2023, 1, 4, 10)
    dt2 = datetime.datetime(2023, 1, 4, 12)
    dt3 = datetime.datetime(2023, 1, 4, 14)
    trials = [
        FrozenTrial(dt1, dt1 + datetime.timedelta(hours=1), TrialState.COMPLETE),
        FrozenTrial(dt2, dt2 + datetime.timedelta(hours=2), TrialState.RUNNING),
        FrozenTrial(dt3, dt3 + datetime.timedelta(hours=3), TrialState.FAIL),
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 5.43μs -> 2.67μs (103% faster)

# 2. EDGE TEST CASES

def test_all_trials_missing_complete_returns_now():
    # All trials have None for datetime_complete, so should return now
    dt1 = datetime.datetime(2023, 1, 5, 10)
    dt2 = datetime.datetime(2023, 1, 5, 12)
    trials = [
        FrozenTrial(dt1, None, TrialState.COMPLETE),
        FrozenTrial(dt2, None, TrialState.FAIL),
    ]
    study = Study(trials)
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 3.60μs -> 2.30μs (56.6% faster)
    after = datetime.datetime.now()


def test_running_trial_recent_returns_now(monkeypatch):
    # There is a running trial started recently, so should return now
    now = datetime.datetime.now()
    dt1 = now - datetime.timedelta(minutes=1)
    complete_trial = FrozenTrial(now - datetime.timedelta(hours=2), now - datetime.timedelta(hours=1), TrialState.COMPLETE)
    running_trial = FrozenTrial(dt1, None, TrialState.RUNNING)
    study = Study([complete_trial, running_trial])
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 4.63μs -> 2.24μs (106% faster)
    after = datetime.datetime.now()


def test_running_trial_old_does_not_return_now():
    # Running trial started long ago, should not return now if outside 5 * duration
    now = datetime.datetime.now()
    # Complete trial duration: 1 hour
    complete_trial = FrozenTrial(now - datetime.timedelta(hours=6), now - datetime.timedelta(hours=5), TrialState.COMPLETE)
    # Running trial started 10 hours ago (outside 5 * duration = 5 hours)
    running_trial = FrozenTrial(now - datetime.timedelta(hours=10), None, TrialState.RUNNING)
    study = Study([complete_trial, running_trial])
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 4.61μs -> 2.14μs (115% faster)


def test_running_trial_no_complete_trials_returns_now():
    # Running trial, but no complete trials (so max_run_duration is None)
    now = datetime.datetime.now()
    running_trial = FrozenTrial(now - datetime.timedelta(hours=1), None, TrialState.RUNNING)
    study = Study([running_trial])
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 3.40μs -> 2.10μs (62.5% faster)
    after = datetime.datetime.now()


def test_complete_and_running_trials_with_missing_dates():
    # Some trials missing start or complete, but running trial triggers now
    now = datetime.datetime.now()
    complete_trial = FrozenTrial(None, None, TrialState.COMPLETE)
    running_trial = FrozenTrial(now - datetime.timedelta(minutes=2), None, TrialState.RUNNING)
    study = Study([complete_trial, running_trial])
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 3.43μs -> 2.25μs (52.3% faster)
    after = datetime.datetime.now()


def test_all_trials_missing_start_and_complete_returns_now():
    # All trials missing both start and complete, should return now
    trials = [
        FrozenTrial(None, None, TrialState.COMPLETE),
        FrozenTrial(None, None, TrialState.RUNNING),
    ]
    study = Study(trials)
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 3.40μs -> 2.21μs (53.8% faster)
    after = datetime.datetime.now()


def test_trial_with_zero_duration():
    # Complete trial with zero duration (start == complete)
    dt = datetime.datetime(2023, 1, 7, 10)
    trials = [FrozenTrial(dt, dt, TrialState.COMPLETE)]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 5.06μs -> 2.13μs (138% faster)


def test_trial_with_negative_duration_ignored():
    # If a trial has datetime_complete < datetime_start, duration is negative but still valid for max
    dt1 = datetime.datetime(2023, 1, 8, 10)
    dt2 = datetime.datetime(2023, 1, 8, 11)
    trials = [
        FrozenTrial(dt2, dt1, TrialState.COMPLETE),  # negative duration
        FrozenTrial(dt1, dt2, TrialState.COMPLETE),  # positive duration
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 5.41μs -> 2.70μs (100% faster)

# 3. LARGE SCALE TEST CASES

def test_many_trials_all_complete():
    # 1000 trials, all complete, should return the latest datetime_complete
    base = datetime.datetime(2023, 1, 1)
    trials = [
        FrozenTrial(base + datetime.timedelta(hours=i), base + datetime.timedelta(hours=i+1), TrialState.COMPLETE)
        for i in range(1000)
    ]
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 156μs -> 116μs (34.1% faster)


def test_many_trials_some_missing_dates():
    # 1000 trials, half missing datetime_complete, should return the latest valid datetime_complete
    base = datetime.datetime(2023, 1, 1)
    trials = []
    for i in range(1000):
        if i % 2 == 0:
            trials.append(FrozenTrial(base + datetime.timedelta(hours=i), base + datetime.timedelta(hours=i+1), TrialState.COMPLETE))
        else:
            trials.append(FrozenTrial(base + datetime.timedelta(hours=i), None, TrialState.COMPLETE))
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 101μs -> 90.5μs (12.4% faster)


def test_many_trials_with_recent_running_trial_returns_now():
    # 999 complete, 1 running trial started recently, should return now
    base = datetime.datetime(2023, 1, 1)
    now = datetime.datetime.now()
    trials = [
        FrozenTrial(base + datetime.timedelta(hours=i), base + datetime.timedelta(hours=i+1), TrialState.COMPLETE)
        for i in range(999)
    ]
    running_trial = FrozenTrial(now - datetime.timedelta(minutes=2), None, TrialState.RUNNING)
    trials.append(running_trial)
    study = Study(trials)
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 136μs -> 116μs (16.7% faster)
    after = datetime.datetime.now()


def test_many_trials_with_old_running_trial_returns_max_complete():
    # 999 complete, 1 running trial started long ago, should return max datetime_complete
    base = datetime.datetime(2023, 1, 1)
    now = datetime.datetime.now()
    trials = [
        FrozenTrial(base + datetime.timedelta(hours=i), base + datetime.timedelta(hours=i+1), TrialState.COMPLETE)
        for i in range(999)
    ]
    # Complete trial duration: 1 hour, so 5*duration = 5 hours
    # Running trial started 10 hours ago
    running_trial = FrozenTrial(now - datetime.timedelta(hours=10), None, TrialState.RUNNING)
    trials.append(running_trial)
    study = Study(trials)
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 134μs -> 115μs (16.2% faster)


def test_many_trials_all_missing_complete_returns_now():
    # 1000 trials, all missing datetime_complete, should return now
    base = datetime.datetime(2023, 1, 1)
    trials = [
        FrozenTrial(base + datetime.timedelta(hours=i), None, TrialState.COMPLETE)
        for i in range(1000)
    ]
    study = Study(trials)
    before = datetime.datetime.now()
    codeflash_output = _get_max_datetime_complete(study); result = codeflash_output  # 58.6μs -> 61.0μs (3.93% slower)
    after = datetime.datetime.now()

# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
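The note above says that outputs of the original and optimized code are compared. A rough sketch of what such an equivalence check could look like is given here; it is an assumption for illustration, not Codeflash's actual mechanism. The helper outputs_match and the two example functions are hypothetical names.

```python
from typing import Any, Callable, Iterable


def outputs_match(
    original: Callable[[Any], Any],
    optimized: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> bool:
    """Return True if both callables produce equal results on every input."""
    return all(original(x) == optimized(x) for x in inputs)


def sum_of_squares_original(values):
    # Builds an intermediate list before summing.
    return sum([v * v for v in values])


def sum_of_squares_optimized(values):
    # Accumulates directly, with no intermediate list.
    total = 0
    for v in values:
        total += v * v
    return total


if __name__ == "__main__":
    cases = [[], [1], [1, 2, 3], list(range(100))]
    print(outputs_match(sum_of_squares_original, sum_of_squares_optimized, cases))
```

A check like this only works for deterministic functions. For a function that may return the current time, comparing against a before/after window, as several tests above set up, is a more suitable pattern.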
To edit these changes, git checkout codeflash/optimize-_get_max_datetime_complete-mhttaysv and push.