Accept float frame_rate in VideoEncoder #1061

Dan-Flores · 2025-11-18T19:56:53Z

As noted in #1057, frame_rate is often a float, so this PR updates the VideoEncoder API to reflect that.

test_video_encoder_against_ffmpeg_cli is updated to parametrize frame_rate and compare frame timestamp values, such as pts.

Dan-Flores · 2025-11-18T21:16:17Z

src/torchcodec/_core/Encoder.cpp

-  avCodecContext_->framerate = {inFrameRate_, 1};
+  // TODO-VideoEncoder: Add and utilize output frame_rate option
+  AVRational frameRate = av_d2q(inFrameRate_, INT_MAX);
+  avCodecContext_->time_base = av_inv_q(frameRate);


These functions are defined here, and do exactly what we need.

That's probably OK, but do see https://fburl.com/metamate/mcunp6zn: we might want to use a higher-resolution time base (I'm not sure). I added a comment below about checking the int pts and float pts (seconds) against the FFmpeg CLI. They might differ slightly from ours, and if they do, it probably means we should increase the resolution of our timebase here.

Thanks for sharing! I updated these lines to use the recommended call order.

In terms of actually using the timebase when setting frame pts, I believe it's handled by the call to av_packet_rescale_ts - https://ffmpeg.org/doxygen/5.1/group__lavc__packet.html#gae5c86e4d93f6e7aa62ef2c60763ea67e
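As a rough illustration of what the calls discussed above accomplish, here is a Python analogy built on the standard library's fractions.Fraction. It is not the FFmpeg implementation: limit_denominator stands in for av_d2q, taking the reciprocal stands in for av_inv_q, and the rescale helper is a hypothetical, simplified stand-in for the timestamp rescaling that av_packet_rescale_ts performs. The max_den value and the 1/1000 stream time base are assumptions chosen only for the example.

```python
from fractions import Fraction
import math


def to_rational(frame_rate: float, max_den: int = 100_000) -> Fraction:
    """Approximate a float frame rate by a rational (analogy for av_d2q)."""
    return Fraction(frame_rate).limit_denominator(max_den)


def rescale(ts: int, src_tb: Fraction, dst_tb: Fraction) -> int:
    """Convert a non-negative timestamp from src_tb units to dst_tb units.

    Simplified sketch: rounds to the nearest integer, halves rounded up.
    """
    return math.floor(ts * src_tb / dst_tb + Fraction(1, 2))


frame_rate = to_rational(29.97)   # rational form of the float frame rate
time_base = 1 / frame_rate        # seconds per tick (analogy for av_inv_q)

# Frame 10 has pts == 10 in codec time_base units; express it in a
# hypothetical stream time base of 1/1000 (milliseconds).
stream_tb = Fraction(1, 1000)
pts_ms = rescale(10, time_base, stream_tb)
```

With these inputs, frame 10 sits at 10 * 100/2997 seconds, which is why a coarser or finer stream time base changes the integer pts but not the time it represents.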

test/test_encoders.py

NicolasHug · 2025-11-19T09:57:38Z

src/torchcodec/_core/Encoder.cpp

-  avCodecContext_->framerate = {inFrameRate_, 1};
+  // TODO-VideoEncoder: Add and utilize output frame_rate option
+  AVRational frameRate = av_d2q(inFrameRate_, INT_MAX);
+  avCodecContext_->time_base = av_inv_q(frameRate);


That's probably OK, but do see https://fburl.com/metamate/mcunp6zn: we might want to use a higher-resolution time base (I'm not sure). I added a comment below about checking the int pts and float pts (seconds) against the FFmpeg CLI. They might differ slightly from ours, and if they do, it probably means we should increase the resolution of our timebase here.

Dan-Flores · 2025-11-19T16:07:02Z

test/test_encoders.py

-                fields=fields,
-            )
-            assert ffmpeg_metadata == encoder_metadata
+            # mkv container format handles frame_rate differently, so we skip it here


My understanding is that mkv does not store frame_rate but estimates it, so these assertions fail.
https://video.stackexchange.com/questions/18403/matroska-formats-framerate-is-not-accurate-with-certain-values

Update on this: webm format had the same error which led me to dig deeper, and I was able to resolve both by setting avStream_->r_frame_rate.

Dan-Flores · 2025-11-20T05:13:33Z

test/test_encoders.py

 class TestVideoEncoder:
    def decode(self, source=None) -> torch.Tensor:
-        return VideoDecoder(source).get_frames_in_range(start=0, stop=60)
+        return VideoDecoder(source).get_frames_in_range(start=0, stop=30).data


I reduced the frames we decode from 60 -> 30.
CI was failing with No space left on device, so this change alleviates that for now.
See the error at: https://github.com/meta-pytorch/torchcodec/runs/55868185381

NicolasHug

Thanks @Dan-Flores, this looks good. I left a big-ish comment below, but it should be easy to address. I'll approve to unblock.

NicolasHug · 2025-11-20T11:43:25Z

test/test_encoders.py

+                for key in ffmpeg_frame.keys():
+                    assert key in encoder_frame
+                    assert ffmpeg_frame[key] == encoder_frame[key]
+


This is good, but we can improve a bit:

We don't need key-by-key comparison. We can condense all of the above into a single dict comparison, i.e. assert ffmpeg_frames_info == encoder_frames_info

We should also validate the float values for duration and pts. For now we're only comparing the integer values, but the int values can be equal and still lead to different float values if the timebase is different. So it's safer to also assert the float values. To be even safer, we could also assert the timebase in the stream info!

In both _get_frames_info and _get_video_metadata, we should validate that we actually get back the fields we are requesting! Otherwise, we might think we're validating all these fields, but if FFmpeg doesn't find them we're not actually validating anything. I realized that pkt_pts, dts, and pkt_duration are often missing (but we may not actually need to validate those).
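To make the point about integer versus float timestamps concrete, here is a small sketch. The two time bases are assumptions picked for illustration, and pts_seconds is a hypothetical helper, not part of the test suite.

```python
from fractions import Fraction


def pts_seconds(pts: int, time_base: Fraction) -> float:
    """Convert an integer pts to seconds under the given time base."""
    return float(pts * time_base)


# Two streams report the same integer pts for a frame...
pts = 3

# ...but use different time bases (hypothetical values).
tb_a = Fraction(1, 30)
tb_b = Fraction(1001, 30000)

sec_a = pts_seconds(pts, tb_a)
sec_b = pts_seconds(pts, tb_b)

# Comparing only the integers would pass; the times in seconds differ.
ints_match = pts == pts
seconds_match = sec_a == sec_b
```

Here the integer comparison succeeds while the times in seconds do not agree, which is the case that asserting pts_time, duration_time, and time_base is meant to catch.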

I did all this locally to check a few things and validate my understanding, so you don't need to re-do it yourself. Here's a diff with my suggested changes:

diff --git a/test/test_encoders.py b/test/test_encoders.py
index 0be163a..8d106f9 100644
--- a/test/test_encoders.py
+++ b/test/test_encoders.py
@@ -603,9 +603,10 @@ class TestVideoEncoder:
             if "=" in line:
                 key, value = line.split("=", 1)
                 metadata[key] = value
+        assert all(field in metadata for field in fields)
         return metadata
 
-    def _get_frames_info(self, file_path):
+    def _get_frames_info(self, file_path, fields):
         """Helper function to get frame info (pts, dts, etc.) using ffprobe."""
         result = subprocess.run(
             [
@@ -615,7 +616,7 @@ class TestVideoEncoder:
                 "-select_streams",
                 "v:0",
                 "-show_entries",
-                "frame=pts,pkt_pts,dts,pkt_duration,duration",
+                f"frame={','.join(fields)}",
                 "-of",
                 "json",
                 str(file_path),
@@ -624,7 +625,9 @@ class TestVideoEncoder:
             check=True,
             text=True,
         )
-        return json.loads(result.stdout)["frames"]
+        frames = json.loads(result.stdout)["frames"]
+        assert all(field in frame for field in fields for frame in frames)
+        return frames
 
     @pytest.mark.parametrize("method", ("to_file", "to_tensor", "to_file_like"))
     def test_bad_input_parameterized(self, tmp_path, method):
@@ -1047,7 +1050,13 @@ class TestVideoEncoder:
 
         # Check that video metadata is the same
         if method == "to_file":
-            fields = ["duration", "duration_ts", "r_frame_rate", "nb_frames"]
+            fields = [
+                "duration",
+                "duration_ts",
+                "r_frame_rate",
+                "time_base",
+                "nb_frames",
+            ]
             ffmpeg_metadata = self._get_video_metadata(
                 ffmpeg_encoded_path,
                 fields=fields,
@@ -1059,16 +1068,17 @@ class TestVideoEncoder:
             assert ffmpeg_metadata == encoder_metadata
 
         # Check that frame timestamps and duration are the same
-        ffmpeg_frames_info = self._get_frames_info(ffmpeg_encoded_path)
-        encoder_frames_info = self._get_frames_info(encoder_output_path)
-
-        assert len(ffmpeg_frames_info) == len(encoder_frames_info)
-        for ffmpeg_frame, encoder_frame in zip(
-            ffmpeg_frames_info, encoder_frames_info
-        ):
-            for key in ffmpeg_frame.keys():
-                assert key in encoder_frame
-                assert ffmpeg_frame[key] == encoder_frame[key]
+        fields = ("pts", "pts_time")
+        if format != "flv":
+            fields += ("duration", "duration_time")
+        ffmpeg_frames_info = self._get_frames_info(
+            ffmpeg_encoded_path, fields=fields
+        )
+        encoder_frames_info = self._get_frames_info(
+            encoder_output_path, fields=fields
+        )
+
+        assert ffmpeg_frames_info == encoder_frames_info
 
     def test_to_file_like_custom_file_object(self):
         """Test to_file_like with a custom file-like object that implements write and seek."""

You can just copy-paste it into a file and then do git apply the_file. It passes locally, but I know ffmpeg encodes metadata fields differently depending on the version, so you might have to do some minor edits for it to pass on all our CI. Let's see. The most important part of the diff is the assertions that we're getting the fields we expect.

Thanks, I've applied the patch so CI can test across different FFmpeg versions.

we should validate that we actually get back the fields we are requesting! Otherwise, we might think we're validating all these fields, but if FFmpeg doesn't find them we're not actually validating anything

My thinking here was different - if our encoded video and FFmpeg's encoded video have the same metadata for the present fields, and any missing fields are missing from both, it seems likely we encoded the video correctly.

I do see the problem case of all fields being missing, in which case no validation was done.
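The all-fields-missing case can be sketched in a few lines. The helper name assert_fields_present and the sample dicts are hypothetical; the actual patch performs an equivalent check inline with assert all(...).

```python
def assert_fields_present(info: dict, fields) -> None:
    """Fail loudly if any requested field is absent from the probe output."""
    missing = [f for f in fields if f not in info]
    assert not missing, f"missing fields: {missing}"


requested = ("pts", "pts_time")

# Suppose ffprobe returned none of the requested fields for either file.
ffmpeg_frame = {}
encoder_frame = {}

# Equality alone passes even though nothing was validated.
vacuous_match = ffmpeg_frame == encoder_frame

# The presence check turns that silent pass into a failure.
try:
    assert_fields_present(ffmpeg_frame, requested)
    presence_check_failed = False
except AssertionError:
    presence_check_failed = True
```

Checking presence first means a passing equality assertion always reflects a comparison of real values.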

update type to float, update tests

e504970

meta-cla bot added the CLA Signed label Nov 18, 2025

Dan-Flores commented Nov 18, 2025

use pytest.approx

c1b3adc

Dan-Flores marked this pull request as ready for review November 18, 2025 22:29

NicolasHug reviewed Nov 19, 2025

compare frame ts in test, update framerate set

b6f45f6

Dan-Flores commented Nov 19, 2025

Dan-Flores added 2 commits November 19, 2025 15:14

set stream frame_rate to fix tests

d12c895

reduce decoded frames to avoid OOM in test_video_encoder_against_ffmpeg_cli

920a36c

Dan-Flores commented Nov 20, 2025

NicolasHug approved these changes Nov 20, 2025

Dan-Flores added 2 commits November 20, 2025 08:26

apply suggested patch

8a6a26d

skip metadata checks on ffmpeg 4, 5

3d8f479

Dan-Flores merged commit cac99ae into meta-pytorch:main Nov 20, 2025
70 checks passed

Dan-Flores deleted the float_input_frame_rate branch November 20, 2025 16:06

Dan-Flores mentioned this pull request Nov 20, 2025

Video Encoder notes #1057

Closed
