Add performance tips tutorial #1065

mollyxu · 2025-11-20T05:31:08Z

Consolidate performance tips in docs

NicolasHug

Made a first pass. Thanks @mollyxu, it looks great!


NicolasHug · 2025-11-21T10:11:17Z

examples/decoding/performance_tips.py

+# If you need to decode multiple frames at once, it is faster when using the batch methods. TorchCodec's batch APIs reduce overhead and can leverage
+# internal optimizations.


Nit: here it might be useful to explicitly say that the batch methods are faster than the single-frame decoding methods - e.g. get_frames_at() is faster than calling get_frame_at() multiple times.
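To make that comparison concrete, here is a minimal sketch of the two call patterns. The helper names (`decode_one_by_one`, `decode_batched`, `demo`) and the indices are made up for illustration; only `get_frame_at()`, `get_frames_at()` and `VideoDecoder` come from TorchCodec, and any actual speedup will depend on the video and hardware.

```python
def decode_one_by_one(decoder, indices):
    """Decode each index with a separate get_frame_at() call (one call per frame)."""
    return [decoder.get_frame_at(i).data for i in indices]


def decode_batched(decoder, indices):
    """Decode all indices with a single get_frames_at() call."""
    return decoder.get_frames_at(indices=indices).data


def demo(video_path):
    """Hypothetical usage; `video_path` must point to a real video file."""
    # Imported lazily so the helpers above can be read without TorchCodec installed.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(video_path)
    indices = [0, 10, 20, 30]
    singles = decode_one_by_one(decoder, indices)  # list of per-frame tensors
    batch = decode_batched(decoder, indices)       # one stacked tensor
    return len(singles), batch.shape[0]
```

Both helpers return the same frames; the batched version just makes one call into the decoder instead of one per index.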


NicolasHug · 2025-11-21T10:47:45Z

examples/decoding/performance_tips.py

+# - If you care about exactness of frame seeking, use “exact”.
+# - If you can sacrifice exactness of seeking for speed, which is usually the case when doing clip sampling, use “approximate”.
+# - If your videos don’t have variable framerate and their metadata is correct, then “approximate” mode is a net win: it will be just as accurate as the “exact” mode while still being significantly faster.
+# - If your size is small enough and we’re decoding a lot of frames, there’s a chance exact mode is actually faster.


Above:

This is a good description. I think we can be more nuanced about when to recommend approximate, e.g. we should try to clearly articulate the last 3 bullet points which are currently slightly overlapping and contradictory (we now know that approximate won't always be "a net win").

That's on me: I need to first have a clear understanding of why approximate mode is sometimes slower, and I'll need to update the approximate mode tutorial with more detailed recommendations.

I won't be able to do that in the next few days, so to unblock yourself I think you can just remove the claims about approximate being strictly superior (bullet points 2 and 3), and the more generic recommendation could be something like:

If the video is long and you're only decoding a small number of frames, approximate mode should be faster.

It's not super actionable for users but I hope the dedicated tutorial I'll edit will be more precise.
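Until the recommendation is more precise, one option is to let users measure both modes on their own videos. The sketch below is only an illustration: `time_decoding`, `faster_seek_mode` and `demo` are hypothetical helpers, not part of TorchCodec, and the timing deliberately includes decoder creation on the assumption that the seek mode can affect that step as well.

```python
import time


def time_decoding(make_decoder, indices, seek_mode):
    """Return the seconds spent creating a decoder and decoding `indices`."""
    start = time.perf_counter()
    decoder = make_decoder(seek_mode)
    decoder.get_frames_at(indices=indices)
    return time.perf_counter() - start


def faster_seek_mode(make_decoder, indices, repeats=3):
    """Time both seek modes and return (fastest_mode, best_time_per_mode)."""
    best = {}
    for mode in ("exact", "approximate"):
        # Keep the best of several runs to reduce timing noise.
        best[mode] = min(
            time_decoding(make_decoder, indices, mode) for _ in range(repeats)
        )
    return min(best, key=best.get), best


def demo(video_path, indices):
    """Hypothetical usage; `video_path` must point to a real video file."""
    # Imported lazily so the helpers above can be read without TorchCodec installed.
    from torchcodec.decoders import VideoDecoder

    return faster_seek_mode(
        lambda mode: VideoDecoder(video_path, seek_mode=mode), indices
    )
```

The result only applies to the video and access pattern that were measured, so it is worth re-running when either changes.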


NicolasHug · 2025-11-21T11:38:59Z

examples/decoding/performance_tips.py

+#
+# **Performance impact:** CUDA decoding can significantly outperform CPU decoding,
+# especially for high-resolution videos and when combined with GPU-based transforms.
+# Actual speedup varies by hardware, resolution, and codec.


I think it's good to have those bullet points here. They overlap with what is already in the CUDA decoding tutorial, and I think we'll want to remove them from there and have them here instead.

Eventually we'll also want to update the CUDA tutorial to explain to users how to check whether they're falling back to the CPU.

In this tutorial, I think the main point we should insist on is that users should be using the Beta interface:

    with set_cuda_backend("beta"):
        dec = VideoDecoder("file.mp4", device="cuda")

mollyxu · 2025-11-21T16:39:16Z

Thanks for the feedback!

Dan-Flores · 2025-11-21T19:34:40Z

examples/decoding/performance_tips.py

+# - :meth:`~torchcodec.decoders.VideoDecoder.get_frames_at` for specific indices
+# - :meth:`~torchcodec.decoders.VideoDecoder.get_frames_in_range` for ranges
+# - :meth:`~torchcodec.decoders.VideoDecoder.get_frames_played_at` for timestamps
+# - :meth:`~torchcodec.decoders.VideoDecoder.get_frames_played_in_range` for time ranges


Maybe it would be clearer to group the similar functions here?
For example, we could add two small headers to group index-based vs timestamp-based retrieval:

For index-based frame retrieval:

get_frames_at

get_frames_in_range

For timestamp-based frame retrieval:
...
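A small sketch of how that grouping maps onto the four batch methods listed in the diff. The wrapper names `frames_by_index` and `frames_by_timestamp` and the argument values are hypothetical, chosen only to show which arguments each method takes.

```python
def frames_by_index(decoder):
    """Index-based retrieval: arguments are frame indices."""
    at = decoder.get_frames_at(indices=[0, 5, 10])
    in_range = decoder.get_frames_in_range(start=0, stop=30, step=10)
    return at, in_range


def frames_by_timestamp(decoder):
    """Timestamp-based retrieval: arguments are times in seconds."""
    at = decoder.get_frames_played_at(seconds=[0.0, 1.5, 3.0])
    in_range = decoder.get_frames_played_in_range(
        start_seconds=0.0, stop_seconds=2.0
    )
    return at, in_range
```

Here `decoder` is expected to be a `torchcodec.decoders.VideoDecoder` opened on a video long enough to contain the requested indices and timestamps.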

Dan-Flores · 2025-11-21T19:35:35Z

Let's update docs/source/index.rst so this tutorial appears on the main index.html page (similar to these changes)

Dan-Flores · 2025-11-21T19:37:30Z

examples/decoding/performance_tips.py

+#
+# - You need bit-exact results
+# - Small resolution videos and the PCI-e transfer latency is large
+# - GPU is already busy and CPU is idle


Super nit: this is a personal writing style preference, but within a section let's consistently use either active or passive voice. For example, we could remove "you" from the first bullet point and use the passive voice instead: "bit-exact results are needed".

first draft of performance tips tutorial

304fdf9

meta-cla bot added the CLA Signed label Nov 20, 2025

modify format

5693776

meta-pytorch deleted a comment from meta-codesync bot Nov 20, 2025

mollyxu added 2 commits November 20, 2025 14:18

Merge branch 'meta-pytorch:main' into performance-tips-tutorial

e8b2a73

Merge branch 'meta-pytorch:main' into performance-tips-tutorial

7ac0d2f

NicolasHug reviewed Nov 21, 2025


address feedback

a74f653

Dan-Flores reviewed Nov 21, 2025

