Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2.3.1 loops on "We must catchup" #4364

Open
martinkirch opened this issue Feb 20, 2025 · 12 comments
Open

2.3.1 loops on "We must catchup" #4364

martinkirch opened this issue Feb 20, 2025 · 12 comments
Labels

Comments

@martinkirch
Copy link
Contributor

Description

Since v2.3.0 , after a few (between 5 and 60) minutes of execution my script loops as follows:

2025/02/20 07:23:54 [clock.blank.2:2] We must catchup 0.20 seconds!
2025/02/20 07:23:55 [clock.blank.2:2] We must catchup 0.20 seconds!
2025/02/20 07:23:56 [clock.blank.2:2] We must catchup 0.21 seconds!
2025/02/20 07:23:57 [clock.blank.2:2] We must catchup 0.20 seconds!

Once in that loop, catchup time slowly increases.

Steps to reproduce

The script is mostly a combination of queues (complete script here) but this happens when queues were empty, so a playlist on a folder is playing. CPU use is low. The only suspicious log line above the loop is

[mp3float @ 0x7f6044114340] Could not update timestamps for skipped samples.

but this appears a few (3 to 10) times before the catchup loop, it might be unrelated.

I'm trying to reproduce it in smaller scripts, but hardly could isolate something deterministic.

Expected behavior

Liquidsoap should catchup.

Liquidsoap version

Liquidsoap rolling-release-v2.3.x-146-gbe901d2+dev
Copyright (c) 2003-2024 Savonet team
Liquidsoap is open-source software, released under GNU General Public License.
See <http://liquidsoap.info> for more information.

Liquidsoap build config

* Liquidsoap version  : rolling-release-v2.3.x-146-gbe901d2+dev

 * Compilation options
   - Release build       : false
   - Git SHA             : (none)
   - OCaml version       : 5.2.1
   - OS type             : Unix
   - Libs versions       : alsa=0.3.0 angstrom=0.16.1 asetmap=0.8.1 asn1-combinators=0.3.2 astring=0.8.5 backoff=0.1.1 base=v0.17.1 base.base_internalhash_types=v0.17.1 base.shadow_stdlib=v0.17.1 base64=3.5.1 bigarray-compat=1.1.0 bigstringaf=0.10.0 bos=0.2.1 bytes=[distributed with OCaml 4.02 or above] ca-certs=v1.0.0 camlp-streams camomile.lib=2.0 cohttp=5.3.1 cohttp-lwt=5.3.0 cohttp-lwt-unix=5.3.0 conduit=7.1.0 conduit-lwt=7.1.0 conduit-lwt-unix=7.1.0 cry=1.0.3 cstruct=6.2.0 ctypes=0.23.0 ctypes-foreign=0.23.0 ctypes.stubs=0.23.0 curl=0.9.2 digestif.c=1.2.0 domain-name=0.4.1 dtools=0.4.5 dune-build-info=3.17.2 dune-private-libs.dune-section=3.17.2 dune-site=3.17.2 dune-site.private=3.17.2 duppy=0.9.4 eqaf=0.10 faad=0.5.1 ffmpeg-av=1.2.2 ffmpeg-avcodec=1.2.2 ffmpeg-avdevice=1.2.2 ffmpeg-avfilter=1.2.2 ffmpeg-avutil=1.2.2 ffmpeg-swresample=1.2.2 ffmpeg-swscale=1.2.2 fileutils=v0.6.6 fmt=0.9.0 fpath=0.7.3 gen=1.1 gmap=0.3.0 inotify=v2.6 integers ipaddr=5.6.0 ipaddr-sexp=5.6.0 ipaddr.unix=5.6.0 kdf.hkdf=1.0.0 kdf.pbkdf=1.0.0 ladspa=0.2.2 lame=0.3.7 liquidsoap-lang=rolling-release-v2.3.x-146-gbe901d2 liquidsoap-lang.console=rolling-release-v2.3.x-146-gbe901d2 liquidsoap_alsa=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_builtins=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_core=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_faad=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_ffmpeg=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_ladspa=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_lame=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_mad=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_ndi=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_optionals=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_oss=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_prometheus=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_pulseaudio=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_runtime=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_sdl=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_sdl_log_level=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_soundtouch=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_ssl=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_stereotool=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_tls=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_xmlplaylist=rolling-release-v2.3.x-146-gbe901d2-dirty liquidsoap_yaml=rolling-release-v2.3.x-146-gbe901d2-dirty logs=0.7.0 logs.fmt=0.7.0 logs.lwt=0.7.0 lwt lwt.unix macaddr=5.6.0 mad=0.5.3 magic-mime=1.3.1 mem_usage=0.1.2 menhirLib=20240715 metadata=0.3.0 mirage-crypto=2.0.0 mirage-crypto-ec=2.0.0 mirage-crypto-pk=2.0.0 mirage-crypto-rng=2.0.0 mirage-crypto-rng.unix=2.0.0 mm=0.8.6 mm.audio=0.8.6 mm.base=0.8.6 mm.image=0.8.6 mm.midi=0.8.6 mm.video=0.8.6 multicore-magic=2.3.1 multicore-magic.__private__.multicore_magic_atomic_array_unboxed5=2.3.1 ndi=rolling-release-v2.3.x-146-gbe901d2-dirty ocaml_intrinsics_kernel=v0.17.1 ocplib-endian ocplib-endian.bigstring ohex ppx_compare.runtime-lib=v0.17.0 ppx_hash.runtime-lib=v0.17.0 ppx_sexp_conv.runtime-lib=v0.17.0 ppx_string.runtime=v0.17.0 prometheus=1.2 prometheus-app=1.2 ptime=1.2.0 ptime.clock=1.2.0 ptime.clock.os=1.2.0 pulseaudio=0.1.5 re=1.12.0 rresult=0.7.0 saturn_lockfree=0.5.0 sedlex=3.3 seq=[distributed with OCaml 4.07 or above] sexplib0=v0.17.0 soundtouch=0.1.9 ssl=0.7.0 stdlib-shims=0.3.0 stereotool=rolling-release-v2.3.x-146-gbe901d2-dirty str=5.2.1 stringext=1.6.0 threads=5.2.1 threads.posix=[internal] tls=2.0.0 tsdl=v1.1.0 tsdl-image=0.5 tsdl-ttf=0.6 unix=5.2.1 uri=4.4.0 uri-sexp=4.4.0 uri.services=4.4.0 x509=1.0.5 xml-light=2.5 xmlm=1.4.0 xmlplaylist=0.1.5 yaml=3.2.0 yaml.bindings=3.2.0 yaml.bindings.types=3.2.0 yaml.c=3.2.0 yaml.ffi=3.2.0 yaml.types=3.2.0 zarith=1.14
   - architecture        : amd64
   - host                : x86_64-pc-linux-gnu
   - target              : x86_64-pc-linux-gnu
   - system              : linux
   - ocamlopt_cflags     : -O2 -fno-strict-aliasing -fwrapv -pthread -fPIC -pthread
   - native_c_compiler   : gcc -O2 -fno-strict-aliasing -fwrapv -pthread -fPIC -pthread -D_FILE_OFFSET_BITS=64
   - native_c_libraries  : -lm -lpthread

 * Configured paths
   - mode              : default
   - standard library  : (set by dune-site)
   - scripted binaries : (set by dune-site)
   - rundir            : (set by dune-site)
   - logdir            : (set by dune-site)
   - user cache        : $HOME/.cache/liquidsoap (override with $LIQ_CACHE_USER_DIR)
   - system cache      : (set by dune-site) (override with $LIQ_CACHE_SYSTEM_DIR)
   - camomile files    : (set by dune-site)

 * Supported input formats
   - MP3               : yes
   - AAC               : yes
   - Ffmpeg            : yes
   - Flac (native)     : no (requires flac)
   - Flac (ogg)        : no (requires ogg)
   - Opus              : no (requires opus)
   - Speex             : no (requires speex)
   - Theora            : no (requires theora)
   - Vorbis            : no (requires vorbis)
   - WAV/AIFF          : yes (native)

 * Supported output formats
   - FDK-AAC           : no (requires fdkaac)
   - FFmpeg            : yes
   - MP3               : yes
   - MP3 (fixed-point) : no (requires shine)
   - Flac (native)     : no (requires flac)
   - Flac (ogg)        : no (requires ogg)
   - Opus              : no (requires opus)
   - Speex             : no (requires speex)
   - Theora            : no (requires theora)
   - Vorbis            : no (requires vorbis)
   - WAV/AIFF          : yes (native)

 * Tags
   - AAC               : yes
   - FFmpeg            : yes
   - FLAC (native)     : no (requires flac)
   - Flac (ogg)        : no (requires ogg)
   - Native decoder    : yes
   - Vorbis            : no (requires vorbis)

 * Input / output
   - ALSA              : yes
   - AO                : no (requires ao)
   - FFmpeg            : yes
   - JACK              : no (requires bjack)
   - NDI               : yes
   - OSS               : yes
   - Portaudio         : no (requires portaudio)
   - Pulseaudio        : yes
   - SRT               : no (requires srt)

 * Audio manipulation
   - FFmpeg            : yes
   - LADSPA            : yes
   - Lilv              : no (requires lilv)
   - Samplerate        : no (requires samplerate)
   - SoundTouch        : yes
   - StereoTool        : yes

 * Video manipulation
   - camlimages        : no (requires camlimages)
   - FFmpeg            : yes
   - frei0r            : no (requires frei0r)
   - ImageLib          : no (requires imagelib)
   - SDL               : yes

 * MIDI manipulation
   - DSSI              : no (requires dssi)

 * Visualization
   - GD                : no (requires gd)
   - Graphics          : no (requires graphics)
   - SDL               : yes

 * Additional libraries
   - FFmpeg filters    : yes
   - FFmpeg devices    : yes
   - inotify           : yes
   - irc               : no (requires irc-client-unix)
   - jemalloc          : no (requires jemalloc)
   - lo                : no (requires lo)
   - memtrace          : no (requires memtrace)
   - osc               : no (requires osc-unix)
   - ssl               : yes
   - sqlite3           : no (requires sqlite3)
   - tls               : yes
   - posix-time2       : no (requires posix)
   - windows service   : no (requires winsvc)
   - YAML support      : yes
   - XML playlists     : yes

 * Monitoring
   - Prometheus        : yes

Installation method

From OPAM

Additional Info

No response

@toots
Copy link
Member

toots commented Mar 6, 2025

That log line shouldn't be a problem.

Do you have more logs, in particular regarding clocks?

Since CPU usage remains low, something must be off in the streaming loop.

I'd also consider removing operators one by one until you cannot reproduce the issue anymore.

@vitoyucepi
Copy link
Collaborator

Hi @toots,
Is it possible to measure the performance of different source operators so that they can be collected using the Prometheus exporter?
For example, input.http should be able to play N seconds, but instead plays M seconds. The values are cumulative, like process CPU seconds.
Ideally, I think, the test case should be a pipeline with a bunch of operations, and for some of them the performance is not 100%. Maybe faster than realtime, maybe slower.

@toots
Copy link
Member

toots commented Mar 7, 2025

Yes! There's a latency metric:

# Metric factory
latency_metrics_create = prometheus.latency(labels=["radio", "type", "name"])

# Instantiate:
latency_metrics_create(label_values=[radio_name, "input", source.name.(s)], s)

It would be an excellent idea to get metrics at this level to understand the bottleneck indeed.

@martinkirch
Copy link
Contributor Author

I'd also consider removing operators one by one until you cannot reproduce the issue anymore.

it turns out I can reproduce it with from the main branch with a script as simple as this

music= playlist("/home/martin/radio/liquidsoap/Music")
main_output = output.pulseaudio(mksafe(music))

but it only fails when one MP3 is in the folder. That file seems to cause a few occurrences of [mp3float @ 0x7f30780d8780] Could not update timestamps for skipped samples before the catchup loop.

Said file is available for 7 days here, build config and debug log are here

@toots
Copy link
Member

toots commented Mar 17, 2025

Thanks I'm testing it. Can you send logs with log.level := 4?

@toots
Copy link
Member

toots commented Mar 17, 2025

Also, I can't figure out the version you're using, could you give more details on it?

@toots
Copy link
Member

toots commented Mar 17, 2025

Nevermind got it.

@toots
Copy link
Member

toots commented Mar 18, 2025

I'm really unable to reproduce here. Could you do the following:

  • Rebuild with 4.14.2
  • Install memtrace
  • Run with the env var: MEMTRACE=/tmp/trace.cf

And send me the resulting file once you see a sufficient catchup. Also, could you try with the latest mmain or rolling-release-v2.3.x? I pushed some changes there recently that may or may not be related.. 😄

@extobi
Copy link

extobi commented Mar 18, 2025

I have the same problem/error:

2025/03/18 19:26:03 [clock.generic.8:2] We must catchup 17.89 seconds!
2025/03/18 19:26:09 [clock.generic.8:2] We must catchup 19.98 seconds!
2025/03/18 19:26:11 [clock.generic.8:2] We must catchup 17.73 seconds!
2025/03/18 19:26:19 [clock.generic.8:2] We must catchup 22.79 seconds!
2025/03/18 19:26:21 [clock.generic.8:2] We must catchup 18.43 seconds!
2025/03/18 19:26:25 [clock.generic.8:2] We must catchup 19.80 seconds!
2025/03/18 19:26:31 [clock.generic.8:2] We must catchup 24.41 seconds!
2025/03/18 19:26:33 [clock.generic.8:2] We must catchup 24.64 seconds!
2025/03/18 19:26:37 [clock.generic.8:2] We must catchup 23.31 seconds!
2025/03/18 19:26:43 [clock.generic.8:2] We must catchup 21.55 seconds!
[tls @ 0x7c3bb0040400] IO error: Connection timed out
[mp3float @ 0x7c3b70e76e00] 2025/03/18 19:26:53 [clock.generic.8:2] We must catchup 29.51 seconds!
invalid new backstep -1
2025/03/18 19:26:53 [input.http.7:2] Feeding failed: Avutil.Error(Connection timed out)
2025/03/18 20:04:22 [input.http.9:3] Source created multiple tracks in a single frame! Sub-frame tracks are not supported and are merged into a single one..
[mp3float @ 0x7c3b68019bc0] Could not update timestamps for discarded samples.
2025/03/18 20:05:29 [input.http.3:2] Feeding failed: Ffmpeg_decoder.End_of_file
2025/03/18 20:05:29 [decoder.ffmpeg:3] Requested content-type for "https://www.xxx.de/api/sendungen/radio1.mp3": {audio=pcm(stereo)}
2025/03/18 20:05:29 [decoder.ffmpeg:3] FFmpeg recognizes "https://www.xxx.de/api/sendungen/radio1.mp3" as audio: {codec: mp3, 44100Hz, 2 channel(s)}
2025/03/18 20:05:29 [decoder.ffmpeg:3] Decoded content-type for "https://www.xxx.de/api/sendungen/radio1.mp3": {audio=pcm(stereo)}
[mp3float @ 0x7c3b680273c0] Could not update timestamps for skipped samples.
2025/03/18 20:05:29 [input.http.3:3] Source created multiple tracks in a single frame! Sub-frame tracks are not supported and are merged into a single one..

and the playlist is only planed for thursday on 20 a Clock. but loaded outside of thursday with this log. And much other Playlists with other Dates and Times, with the same on the log.
(Installed via: Azuracast)

The Playlist Entry (only copied here the important data from the conf, it is correctly from azuracst and much more ):
playlist_radio1 = mksafe(buffer(buffer=10., input.http("https://www.xxx.de/api/sendungen/radio1.mp3")))
({ (4w) and 20h0m-21h0m }, playlist_radio1)

@martinkirch
Copy link
Contributor Author

It might not be related :/
The problem still occured at rev 748cd0c , with OCaml 5 (and updates dependencies) and 4.14.2.
The build config and log are concatenated in this gist and memtrace is in this zip

@toots
Copy link
Member

toots commented Mar 19, 2025

Thanks that is very much appreciated! I see two files in those logs and a safe blank. Do you have logs with the simple script or do I need a slightly more complex script?

@martinkirch
Copy link
Contributor Author

not sure I understand your question ; but that log is executing this script:

music= playlist("/home/martin/radio/liquidsoap/Music")
main_output = output.pulseaudio(mksafe(music))

There was 3 files in the folder (but 2 from the same artist)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants