Add video encoding tutorial doc #1063

Dan-Flores · 2025-11-19T06:03:07Z

No description provided.

NicolasHug

Great tutorial @Dan-Flores, thank you! I left a few minor comments. Let's also edit the docstrings of the VideoEncoder methods so that they link to the relevant sections of this tutorial.

I think that since we'll now have to ship these large decoded video frames with our docs, we are potentially increasing the "docs" size by a few dozen MB. But I think that's OK.

NicolasHug · 2025-11-19T10:25:20Z

examples/encoding/video_encoding.py

+
+# %%
+# First, we'll download a video and decode some frames to tensors.
+# These will be the input to the VideoEncoder. For more details on decoding,


Make sure to use :class:`~torchcodec.encoders.VideoEncoder` everywhere.

It should link to the docstring page. Right now it doesn't, because you need to add it to https://github.com/meta-pytorch/torchcodec/blob/main/docs/source/api_ref_encoders.rst?plain=1

NicolasHug · 2025-11-19T10:27:01Z

examples/encoding/video_encoding.py

+
+
+# Video source: https://www.pexels.com/video/adorable-cats-on-the-lawn-4977395/
+# License: CC0. Author: Altaf Shah.


I think we can still use it, but I don't see the license being explicitly CC0.

Suggested change

-# License: CC0. Author: Altaf Shah.
+# Author: Altaf Shah.

NicolasHug · 2025-11-19T10:28:50Z

examples/encoding/video_encoding.py

+raw_video_bytes = response.content
+
+decoder = VideoDecoder(raw_video_bytes)
+frames = decoder[:60]  # Get first 60 frames


Let's use get_frames_in_range instead, it's more efficient, and we want users to use the most efficient decoding methods.

Slicing actually calls get_frames_in_range():

torchcodec/src/torchcodec/decoders/_video_decoder.py

Lines 193 to 203 in 1ea235a

def _getitem_slice(self, key: slice) -> Tensor:
    assert isinstance(key, slice)
    start, stop, step = key.indices(len(self))
    frame_data, *_ = core.get_frames_in_range(
        self._decoder,
        start=start,
        stop=stop,
        step=step,
    )
    return frame_data
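
To make the snippet above concrete, here is a minimal sketch of how a slice such as `decoder[:60]` resolves to explicit `start`/`stop`/`step` values through `slice.indices`, the call that `_getitem_slice` uses before forwarding to `get_frames_in_range`. The helper name `resolve_slice` and the `num_frames` value are hypothetical, and no decoder is involved; this only shows the index arithmetic.

```python
def resolve_slice(key: slice, num_frames: int) -> tuple[int, int, int]:
    """Return the (start, stop, step) a slice maps to for num_frames items."""
    return key.indices(num_frames)


num_frames = 250  # hypothetical number of frames in the video

first_60 = resolve_slice(slice(None, 60), num_frames)  # same as [:60]
everything = resolve_slice(slice(None), num_frames)    # same as [:]

print(first_60)    # (0, 60, 1)
print(everything)  # (0, 250, 1)
```

Under that reading, `decoder[:60]` and an explicit `get_frames_in_range` call with `start=0, stop=60` request the same frames, so the explicit call mainly makes the intent visible in the tutorial.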

NicolasHug · 2025-11-19T10:30:00Z

examples/encoding/video_encoding.py

+# round-trip encode/decode process works as expected:
+
+decoder_verify = VideoDecoder(encoded_frames)
+decoded_frames = decoder_verify[:]


NicolasHug · 2025-11-19T10:32:55Z

examples/encoding/video_encoding.py

+        capture_output=True,
+        text=True,
+    )
+    print(f"Codec used in {output}: {result.stdout.strip()}")


Great section above. The only issue is that it pollutes the codespace with libx264_encoded.mp4 and hevc_encoded.mp4. Let's use temporary files instead, with e.g. https://docs.python.org/3/library/tempfile.html
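
A minimal sketch of the temporary-file approach, using only the standard library. The helper name `encode_to_temp` is hypothetical, and `write_bytes` is a stand-in for the real encoder call that writes to a path; the point is only that nothing is left behind once the `with` block exits.

```python
import tempfile
from pathlib import Path


def encode_to_temp(names: list[str]) -> list[Path]:
    """Write placeholder outputs into a temporary directory and return their paths."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = [Path(tmp_dir) / name for name in names]
        for path in paths:
            # Stand-in for the encoder writing a file to `path` (assumption).
            path.write_bytes(b"placeholder")
            assert path.exists()
    # Leaving the with-block deletes the directory and its contents.
    return paths


paths = encode_to_temp(["libx264_encoded.mp4", "hevc_encoded.mp4"])
leftover = [p for p in paths if p.exists()]
print(leftover)  # []
```

Any inspection of the encoded files (for example the codec check in this section) would need to happen inside the `with` block, while the files still exist.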

NicolasHug · 2025-11-19T10:33:49Z

examples/encoding/video_encoding.py

+# Codec Selection
+# ---------------
+#
+# The ``codec`` parameter specifies which video codec to use for encoding.


We could start this section by indicating that, by default, the codec is selected automatically based on the container format. For example, "mp4" tends to default to h264 (I think? Please check me on this).

@NicolasHug made a comment similar to the one I left above. :) Doing it here or in the previous section both work for me.

I added an intro here explaining the default behavior, this way all codec related text is under the same header. I added the mp4 -> h264 example, as it is often the case in my experience.

I think this works well. My one follow-up suggestion is to connect the sentence about codec selection with the default. Something like, "If you want a codec other than the default, use the codec parameter." Followed by the explanation of what it is.

NicolasHug · 2025-11-19T10:35:39Z

examples/encoding/video_encoding.py

+# %%
+# Low quality (high CRF)
+low_quality_output = encoder.to_tensor(format="mp4", codec="libx264", crf=50)
+play_video(low_quality_output)


well done, it's really cool to visually see the effect it has on quality!

Agreed, I came here to say the same thing. :)

NicolasHug · 2025-11-19T10:38:00Z

examples/encoding/video_encoding.py

+#     to check available options for your selected codec.
+#
+
+import os


I think we can use pathlib instead https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat
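
For reference, a small sketch of reading a file size with `Path.stat()` rather than `os.path.getsize`. The helper name `file_size_kib` and the sample file are hypothetical; the sample is written to a temporary directory so the example is self-contained.

```python
import tempfile
from pathlib import Path


def file_size_kib(path: Path) -> float:
    """Return the size of the file at `path` in KiB."""
    return path.stat().st_size / 1024


with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.mp4"  # hypothetical output file
    sample.write_bytes(b"\x00" * 2048)     # 2048 bytes of placeholder data
    size_kib = file_size_kib(sample)

print(f"{size_kib:.1f} KiB")  # 2.0 KiB
```

This removes the `import os` from the tutorial and keeps path handling in one style if the outputs are already `Path` objects.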

scotts · 2025-11-19T12:53:04Z

Let's add a reference to this tutorial in index.rst in the "Encoding" section.

scotts · 2025-11-19T13:00:26Z

examples/encoding/video_encoding.py

+#
+# :class:`~torchcodec.encoders.VideoEncoder` supports encoding frames into a
+# file via the :meth:`~torchcodec.encoders.VideoEncoder.to_file` method, to
+# file-like objects via the :meth:`~torchcodec.encoders.VideoEncoder.to_filelike`


Suggested change

-# file-like objects via the :meth:`~torchcodec.encoders.VideoEncoder.to_filelike`
+# file-like objects via the :meth:`~torchcodec.encoders.VideoEncoder.to_file_like`

scotts · 2025-11-19T15:21:08Z

examples/encoding/video_encoding.py

+
+print(f"Re-decoded video: {decoded_frames.shape = }")
+print(f"Original frames: {frames.shape = }")
+


I think this is an excellent place to explain that the format parameter selects the default codec. We can also briefly explain the difference between, say, an mp4 video file and the actual codec used to decode and encode the video streams in that file. If this is well explained in any external FFmpeg docs, we can link to those as well.

That then sets us up for the next section, as the natural next question a reader may have is, what if I don't want the default codec?

At the end of the "Codec Selection" section, we should give some guidance on when to just use format and when to specify codec as well. Nothing elaborate, just a sentence or two. I think that will go a long way toward informing our users about the relationship between these two options.

Thanks for the suggestions, I added some brief guidance on codec vs format at the end.

I drafted text to explain the difference between container-format and codec, but I am worried it dilutes the "Codec Selection" section with text that is not specific to the API. I would be happy to add a link, but I was not able to find useful FFmpeg docs on this subject.

scotts · 2025-11-19T15:45:30Z

examples/encoding/video_encoding.py

+# control of encoding settings beyond the common parameters.
+#
+# For example, some potential extra options for the commonly used H.264 codec, ``libx264`` include:
+# For example, with , ``libx264``:


Looks like the second "For example" is from before adding the first.

Dan-Flores · 2025-11-19T19:21:37Z

src/torchcodec/encoders/_video_encoder.py

            extra_options (dict[str, Any], optional): A dictionary of additional
                encoder options to pass, e.g. ``{"qp": 5, "tune": "film"}``.
                Values will be converted to strings before passing to the encoder.
+                See :ref:`extra_options` for details.


I did not edit the docstrings, as I expect they can be useful as a quick reference to valid values. I am open to suggestions on this, though.

Docstrings look good, I left minor suggestions above. I agree it is super valuable to have short descriptions of valid values in there.

NicolasHug

Looks great, thank you! Left some minor comments that are easy to address but LGTM

NicolasHug · 2025-11-20T10:58:12Z

src/torchcodec/encoders/_video_encoder.py

                "yuv420p", "yuv444p"). If not specified, uses codec's default format.
+                See :ref:`pixel_format` for details.
            crf (int or float, optional): Constant Rate Factor for encoding quality. Lower values
                mean better quality. Valid range depends on the encoder (commonly 0-51).


Here and for the other docstrings, I think we should rather say

Suggested change

-                mean better quality. Valid range depends on the encoder (commonly 0-51).
+                mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264).

NicolasHug · 2025-11-20T10:59:31Z

src/torchcodec/encoders/_video_encoder.py

                Defaults to None (which will use encoder's default).
+                See :ref:`crf` for details.
            preset (str or int, optional): Encoder option that controls the tradeoff between
                encoding speed and compression. Valid values depend on the encoder (commonly


Here and in the other methods. I suggest this because it wasn't immediately obvious to me what "compression" meant in this case.

Suggested change

-                encoding speed and compression. Valid values depend on the encoder (commonly
+                encoding speed and compression (output size). Valid values depend on the encoder (commonly

NicolasHug · 2025-11-20T11:00:07Z

src/torchcodec/encoders/_video_encoder.py

+                See :ref:`preset` for details.
            extra_options (dict[str, Any], optional): A dictionary of additional
                encoder options to pass, e.g. ``{"qp": 5, "tune": "film"}``.
                Values will be converted to strings before passing to the encoder.


Let's remove this, I think this is an implementation detail.

NicolasHug · 2025-11-20T11:01:14Z

src/torchcodec/encoders/_video_encoder.py

            extra_options (dict[str, Any], optional): A dictionary of additional
                encoder options to pass, e.g. ``{"qp": 5, "tune": "film"}``.
                Values will be converted to strings before passing to the encoder.
+                See :ref:`extra_options` for details.


Docstrings look good, I left minor suggestions above. I agree it is super valuable to have short descriptions of valid values in there.

NicolasHug · 2025-11-20T11:04:32Z

examples/encoding/video_encoding.py

+# %%
+# For most cases, you can simply specify the format parameter and let the FFmpeg select the default codec.
+# However, specifying the codec parameter is useful to select a particular codec implementation
+# (``libx264`` vs ``libx265``) or to have more control over the encoding behavior.


No strong opinion, but I think this last part could be left out. The intro to this section (starting with "By default, the codec ...") is really good and covers everything necessary IMO.

Reading it back, it is a bit repetitive. Since it's the only text after a code block, I'll remove it for consistency.

NicolasHug · 2025-11-20T11:05:12Z

examples/encoding/video_encoding.py

+# For example, with the commonly used H.264 codec, ``libx264``:
+#
+# - Values range from 0 (lossless) to 51 (worst quality)
+# - Values 17 or 18 are conisdered visually lossless, and the default is 23.


Suggested change

-# - Values 17 or 18 are conisdered visually lossless, and the default is 23.
+# - Values 17 or 18 are considered visually lossless, and the default is 23.

NicolasHug · 2025-11-20T11:06:16Z

examples/encoding/video_encoding.py

+# For example, with the commonly used H.264 codec, ``libx264`` presets include:
+#
+# - ``"ultrafast"`` (fastest), ``"fast"``, ``"medium"`` (default), ``"slow"``, ``"veryslow"`` (slowest, best compression).
+# - See additional details in the `H.264 Video Encoding Guide <https://trac.ffmpeg.org/wiki/Encode/H.264#a2.Chooseapresetandtune>`_.


Definitely a nitpick, but I think we don't need bullet points here.

…o encoding_tutorial

add tutorial w videos

ba3cbbf

meta-cla bot added the CLA Signed label Nov 19, 2025

NicolasHug reviewed Nov 19, 2025

scotts reviewed Nov 19, 2025

Dan-Flores added 2 commits November 19, 2025 13:56

add suggestions, link in docstrings

d5be152

add word commonly

fd59e4c

Dan-Flores commented Nov 19, 2025

Dan-Flores marked this pull request as ready for review November 19, 2025 20:22

transition sentence between codec default and selection

3eaee28

NicolasHug approved these changes Nov 20, 2025

Dan-Flores added 3 commits November 20, 2025 08:41

adjust docstirngs, apply nits

9bbeb1f

Merge branch 'main' of https://github.com/meta-pytorch/torchcodec int…

49a6614

…o encoding_tutorial

remove todo to use float frame rate

1bcb9ce

Dan-Flores merged commit 559b27e into meta-pytorch:main Nov 20, 2025
77 of 78 checks passed

Dan-Flores deleted the encoding_tutorial branch November 20, 2025 18:43



