
Add stop train callback, torch filesystem #87


Merged · 23 commits merged into main on Apr 2, 2025
Conversation

@Matvezy (Contributor) commented Mar 31, 2025

Description

Added early stopping and an additional filesystem setting for PyTorch.

Type of change


  • New feature (non-breaking change which adds functionality)

How has this change been tested? Please provide a test case or example of how you tested the change.

Tested locally

Any specific deployment considerations

N/A

Docs

  • Docs updated? What were the changes: No

@Matvezy marked this pull request as ready for review on March 31, 2025, 17:08
@SkalskiP (Collaborator)

Hi @Matvezy 👋🏻 Could you show how this feature is intended to be used?

It seems that when you start training like this:

from rfdetr import RFDETRBase

model = RFDETRBase()

model.train(dataset_dir=<DATASET_PATH>, epochs=10, batch_size=4, grad_accum_steps=4, lr=1e-4, output_dir=<OUTPUT_PATH>)

The training will run until the end, and calling model.request_early_stop() won’t actually be possible during training?
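
For context, a possible usage pattern (a hedged sketch, not confirmed against this PR's implementation): if request_early_stop() only sets a flag that the training loop checks between epochs, it would have to be called from a separate thread while model.train() is still running, for example:

import threading
import time

from rfdetr import RFDETRBase

model = RFDETRBase()

# Run training in a background thread so the main thread stays free.
trainer = threading.Thread(
    target=model.train,
    kwargs=dict(
        dataset_dir=<DATASET_PATH>,
        epochs=10,
        batch_size=4,
        grad_accum_steps=4,
        lr=1e-4,
        output_dir=<OUTPUT_PATH>,
    ),
)
trainer.start()

# Later, once some external condition is met, ask training to stop
# at the next epoch boundary.
time.sleep(60)  # placeholder for the real stop condition
model.request_early_stop()
trainer.join()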

rfdetr/main.py Outdated
@@ -151,6 +159,17 @@ def train(self, callbacks: DefaultDict[str, List[Callable]], **kwargs):
        print(args)
        device = torch.device(args.device)

        # Initialize early stopping if enabled
        if args.early_stopping:
Collaborator

I think we should place the initialization of early stopping next to MetricsPlotSink and MetricsTensorBoardSink in rfdetr/detr.py to keep things consistent.

metrics_plot_sink = MetricsPlotSink(output_dir=config.output_dir)
self.callbacks["on_fit_epoch_end"].append(metrics_plot_sink.update)
self.callbacks["on_train_end"].append(metrics_plot_sink.save)

metrics_tensor_board_sink = MetricsTensorBoardSink(output_dir=config.output_dir)
self.callbacks["on_fit_epoch_end"].append(metrics_tensor_board_sink.update)
self.callbacks["on_train_end"].append(metrics_tensor_board_sink.close)

@Matvezy is there any reason why we couldn't do that?
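
For illustration, a sketch of what that registration could look like (the EarlyStoppingCallback name, its constructor arguments, and the config field names are assumptions, not taken from the merged code):

# Hypothetical registration next to the metrics sinks in rfdetr/detr.py.
if config.early_stopping:
    early_stopping_callback = EarlyStoppingCallback(
        model=self,
        patience=config.early_stopping_patience,
        min_delta=config.early_stopping_min_delta,
    )
    self.callbacks["on_fit_epoch_end"].append(early_stopping_callback.update)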

@Matvezy (Contributor, Author) commented Mar 31, 2025

Got that change in there

@SkalskiP (Collaborator)

Awesome! Let me test it and we should be good to merge this change

else:
    # No valid mAP metric found, skip early stopping check
    if self.verbose:
        print("Early stopping: No valid mAP metric found, skipping check")
Collaborator

this should probably just raise an exception
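
As a minimal sketch of that suggestion (exception type and message are assumptions):

else:
    # Fail loudly instead of silently skipping the early stopping check.
    raise ValueError(
        "Early stopping is enabled but no valid mAP metric was found "
        "in the evaluation results"
    )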

self.best_map = current_map
self.counter = 0
if self.verbose:
-    print(f"Early stopping: mAP improved to {current_map:.4f}")
+    print(f"Early stopping: mAP improved to {current_map:.4f} using {metric_source} metric")
Collaborator

prints should be logging.logger
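
A sketch of that suggestion, assuming a module-level logger:

import logging

logger = logging.getLogger(__name__)

# ...inside the early stopping update:
self.best_map = current_map
self.counter = 0
if self.verbose:
    logger.info(
        "Early stopping: mAP improved to %.4f using %s metric",
        current_map,
        metric_source,
    )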

@probicheaux (Collaborator) left a comment

let's not change the default behavior

@SkalskiP (Collaborator) commented Apr 2, 2025

@probicheaux I set the default value of early_stopping to False; I assume that's what you flagged

@SkalskiP merged commit e6de200 into main on Apr 2, 2025
1 check passed
@SkalskiP mentioned this pull request on Apr 2, 2025