CM5 does not work with ptp4l #39

jclark · 2025-01-15T02:12:39Z

The CM5 doesn't work with ptp4l. Here's how to test without needing any special hardware.

1. Connect a CM4 and a CM5 directly with a short Ethernet cable. When running headless, I do this by having the CM4 use DHCP on eth0; on the CM5 I plug an Ethernet dongle into the USB port and connect that to my main network, then use nmcli to configure eth0 with the shared method.
2. Install and set up chrony on both the CM4 and CM5. Ideally configure both with a common local NTP server (e.g. by specifying server ntp.lan iburst in /etc/chrony/chrony.conf). Use chronyc sources to make sure chrony is syncing successfully on both the CM4 and CM5.
3. Install linuxptp on both the CM4 and CM5: sudo apt install linuxptp.
4. On the CM5, run phc2sys to synchronize the PHC to the system clock: sudo phc2sys -q -m -l 6 -c eth0 -s CLOCK_REALTIME -O 0 >phc2sys.log &
5. Do tail -f phc2sys.log and wait for 30 seconds or so until the sys offsets are consistently small (absolute value < 100).
6. Start ptp4l on the CM5: sudo ptp4l -i eth0 --tx_timestamp_timeout 100 -l 6 -m -q. After a few seconds it should display selected local clock 2ccf67.fffe.c114d2 as best master, where the first and last parts of the hex number shown correspond to the MAC address of eth0 on the CM5 (2c:cf:67:c1:14:d2 in this case).
7. Now open a terminal on the CM4 and run ptp4l as a slave: sudo ptp4l -i eth0 --tx_timestamp_timeout 100 -l 6 -m -q -s >ptp4l.log &
8. Do tail -f ptp4l.log and wait until the master offset has settled down (absolute value < 100).
9. Now do phc_ctl eth0 cmp. This is the difference in nanoseconds between the system clock and the PHC. If everything is working properly this should certainly be less than a millisecond, i.e. 1,000,000ns. The value I actually see varies all the time. Right now it is 4629122817ns, which is about 4.6 seconds. However, I get wildly different results on different days.
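
Two of the checks in the steps above can be scripted. The following is a minimal sketch, not part of linuxptp: both function names are hypothetical, and the offset is assumed to have been copied by hand from the phc_ctl output as a plain integer in nanoseconds. The first helper rebuilds the clock identity that ptp4l prints from a MAC address (the MAC with fffe inserted in the middle, as in the example above); the second applies the 1 ms threshold.

```shell
# Hypothetical helpers for the test procedure; names are illustrative only.

# Build the clock identity string ptp4l prints from a MAC address by
# inserting "fffe" between the first and last three bytes.
mac_to_clock_id() {
    # 2c:cf:67:c1:14:d2 -> 2ccf67c114d2 (colons stripped, lower-cased)
    hex=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
    first=$(printf '%s' "$hex" | cut -c1-6)
    last=$(printf '%s' "$hex" | cut -c7-12)
    printf '%s.fffe.%s\n' "$first" "$last"
}

# Succeed if an offset given in nanoseconds is within +/- 1 ms.
offset_within_1ms() {
    ns=${1#-}              # drop a leading minus sign to get the magnitude
    [ "$ns" -lt 1000000 ]
}

mac_to_clock_id 2c:cf:67:c1:14:d2
if offset_within_1ms 4629122817; then
    echo "offset ok"
else
    echo "offset too large"
fi
```

With the MAC address and offset quoted above, this reproduces the clock identity from step 6 and reports the offset as too large.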

This is on an 8GB CM5, with 8 Jan 2025 firmware (as shown by rpi-eeprom-update), kernel 6.6.62+rpt-rpi-2712.

Since the CM4 is known good, this is going to be a hardware, firmware or driver problem on the CM5.

This isn't a problem with phc2sys: you can see the same problem just using phc_ctl eth0 set on the CM5 to do a one-time synchronization of the PHC to the system clock.


JN19aban · 2025-01-15T02:19:40Z

@jclark let me ask something. Which bootloader does the CM5 have? I ask because NUMA is enabled in the latest updates, and this might be a drawback for ptp4l, or the opposite: it might need to be enabled. Almost the same thing happened to me in the last few days on an RPi 5 (not with ptp4l): I was using the new bootloader with an older kernel and the performance was bad, very very bad.

https://github.com/raspberrypi/rpi-eeprom/tree/master/firmware-2712/latest

Have a look at this.

But on the CM4 with the latest bootloader (NUMA enabled) and the latest kernel the whole thing is flying (I mean overall system performance; ptp4l behaves the same).

jclark · 2025-01-15T02:28:01Z

@JN19aban It looks like NUMA can be turned off by setting SDRAM_BANKLOW=0 with rpi-eeprom-config. I will give that a try.

JN19aban · 2025-01-15T02:28:46Z

@jclark update to the latest bootloader first and then disable NUMA. I see that it has updates for the RAM modules, especially for the 8GB versions.

JN19aban · 2025-01-15T02:39:03Z

It has to be on this release:


BOOTLOADER: up to date
   CURRENT: Wed Jan  8 17:52:48 UTC 2025 (1736358768)
    LATEST: Wed Jan  8 17:52:48 UTC 2025 (1736358768)
   RELEASE: latest (/lib/firmware/raspberrypi/bootloader-2712/latest)
            Use raspi-config to change the release.

jclark · 2025-01-15T02:42:28Z

vcgencmd bootloader_version gives me

2025/01/08 17:52:48
version 97facbf492c43a5b6b0e9719860798b7cebfdebb (release)
timestamp 1736358768
update-time 1736908130
capabilities 0x0000007f

What command were you running?

jclark · 2025-01-15T02:45:46Z

With NUMA turned off (SDRAM_BANKLOW=0), the offset is 344096797ns, i.e. 0.34s, which is very different from before.

Do you have any insight as to why NUMA should be affecting ptp4l so much?

JN19aban · 2025-01-15T02:47:51Z

This command:

sudo rpi-eeprom-update

To update use this:

sudo rpi-eeprom-update -a

But you have to change the bootloader release from stable to latest.

To do this you have to run:

sudo apt update
sudo apt upgrade
sudo reboot

1. sudo raspi-config
2. 6 Advanced Options
3. A5 Bootloader Version
4. E1 Latest
5. Hit OK, then Finish, and reboot
6. Run sudo rpi-eeprom-update -a
7. Reboot

jclark · 2025-01-15T02:50:17Z

I'm running latest already and get the same output from rpi-eeprom-update as you.

JN19aban · 2025-01-15T02:51:57Z

"Do you have any insight as to why NUMA should be affecting ptp4l so much? "

Wrong RAM timings is the first one, and you also have to have NUMA enabled in the kernel, otherwise performance is very bad.

Although I do not know precisely what effect it has on ptp4l, because I do not have a CM5 available yet.

JN19aban · 2025-01-15T02:54:03Z

So was the 344096797ns value you posted before with NUMA enabled or disabled?

jclark · 2025-01-15T02:55:35Z

With NUMA disabled

JN19aban · 2025-01-15T02:57:24Z

OK, try with SDRAM_BANKLOW=1

and re-run the test with SDRAM_BANKLOW=2

jclark · 2025-01-15T02:57:59Z

The first result (more than 4 seconds offset) was with latest firmware and nothing added to the firmware config, so NUMA enabled.

JN19aban · 2025-01-15T02:58:58Z

Theoretically it is with SDRAM_BANKLOW=3, but to make sure, run with the values 1 and 2.

One more important thing: after all the tests, try SDRAM_BANKLOW=-1 to disable NUMA and the RAM enhancements.

JN19aban · 2025-01-15T03:48:01Z

@jclark I have been thinking about this some more.

With the NUMA setup (RAM timings etc.) the system is adjusted for performance, compatibility, etc.

In theory the HW timestamps shouldn't be affected anywhere near as much as the previously posted offsets suggest.

Can you test further whether HW timestamping is truly enabled and not somehow "faked" in ptp4l, so that the whole latency you see might actually come from software timestamps?

Another scenario I am thinking of is that the new I/O controller (RP1) might somehow affect the performance of the PHY.

jclark · 2025-01-15T04:35:28Z

With SDRAM_BANKLOW=1, I get -26605883ns (so -26,605,883ns = -0.03s).
With SDRAM_BANKLOW=2, I get 460105087ns (so 460,105,087ns = 0.5s).
With SDRAM_BANKLOW=-1, I get 640659702ns (so 640,659,702ns = 0.6s).

Note that in all cases the value isn't constant: it changes gradually. So I wonder how reproducible these are.

Going back to no SDRAM_BANKLOW entry, I get 469640908ns (so 469,640,908ns = 0.5s). Going back to SDRAM_BANKLOW=1, I get 523767888ns (so 523,767,888ns = 0.5s). So the results are not reproducible at all. I suspect this is not the issue.
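
The nanosecond-to-second conversions are easy to get slightly wrong by hand, so here is a minimal sketch that does them with awk. The function name ns_to_s is hypothetical; the inputs are the raw offsets quoted in this comment.

```shell
# Convert a (possibly negative) nanosecond offset to seconds, 3 decimals.
ns_to_s() {
    awk -v ns="$1" 'BEGIN { printf "%.3f\n", ns / 1000000000 }'
}

# Offsets reported for the different SDRAM_BANKLOW settings.
for ns in -26605883 460105087 640659702 469640908 523767888; do
    printf '%sns = %ss\n' "$ns" "$(ns_to_s "$ns")"
done
```

Every value is well outside the 1 ms target, and the two SDRAM_BANKLOW=1 runs differ by roughly half a second, which is consistent with the conclusion that the setting is not the cause.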

JN19aban · 2025-01-15T05:04:08Z

It might be. I will keep an eye out for more news on the RPi 5 / CM5. I will probably have a CM5 in my hands by the 15th of February if all goes well.

I still strongly suspect the two things from my previous comment, though.

jclark · 2025-01-16T05:27:01Z

Wireshark shows clearly that the problem is that the hardware transmit timestamps from the CM5 are incorrect.

After some quality time with bpftrace, I think I know what is going on. The behaviour is a kernel bug. It is a result of three things:

1. The CM5 has two PTP clocks associated with eth0, each with its own packet timestamper; one is the PHY level one from drivers/net/phy/bcm-phy-ptp.c and the other is the MAC level one from drivers/net/ethernet/cadence/macb_ptp.c (neither the CM4 nor the Pi 5 has two PTP clocks).
2. The macb driver uses a legacy method of controlling hardware timestamping, using ndo_eth_ioctl instead of the newer ndo_hwtstamp_set/_get methods.
3. The code that handles these ioctls (dev_{set,get}_hwtstamp) in net/core/dev_ioctl.c doesn't properly handle the case where there are both MAC and PHY level timestampers but the MAC level timestamper uses the legacy ndo_eth_ioctl.

The overall result is that the hardware transmit timestamping code in the PHY level driver doesn't get called, and I think there is a weird mix of the two PHCs being used, which is why there are wrong timestamps.

In kernel 6.8, the macb driver is updated to use the newer ndo_hwtstamp_set/_get, so this problem should go away. I tried applying the patch for this to the current Raspberry Pi kernel, but I got strange errors from the macb driver ("DMA bus error: HRESP not OK"), and I have absolutely no idea what is causing those.

A simple workaround is to compile the current kernel with CONFIG_MACB_USE_HWSTAMP commented out in the config, which avoids having the two competing clocks.

Unfortunately, after fixing this problem, another appears: ptp4l gives the infamous timed out while polling for tx timestamp error, even after increasing tx_timestamp_timeout to a ridiculously large value. Argh!
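
For anyone trying the CONFIG_MACB_USE_HWSTAMP workaround, it can help to confirm that the rebuilt kernel's config really has the option off. This is a minimal sketch, assuming the config is available as a plain text file; the function name, the KCONFIG_FILE variable and the .config default are assumptions for illustration.

```shell
# Succeed if the MAC-level timestamper option is enabled in the given
# kernel config file; fail if it is unset or commented out.
macb_hwstamp_enabled() {
    grep -q '^CONFIG_MACB_USE_HWSTAMP=y' "$1"
}

# KCONFIG_FILE is a hypothetical variable; point it at your build's config.
cfg=${KCONFIG_FILE:-.config}
if [ -f "$cfg" ]; then
    if macb_hwstamp_enabled "$cfg"; then
        echo "CONFIG_MACB_USE_HWSTAMP is still enabled in $cfg"
    else
        echo "CONFIG_MACB_USE_HWSTAMP is disabled in $cfg"
    fi
fi
```

In a kernel source tree, scripts/config --disable MACB_USE_HWSTAMP is one way to switch the option off before building, rather than editing the file by hand.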

JN19aban · 2025-01-16T05:38:30Z

Does this mean that you see two clocks in /dev/, ptp0 and ptp1?

jclark · 2025-01-16T05:43:14Z

Right: the CM5 has a /dev/ptp0 (like the /dev/ptp0 on the CM4) and a /dev/ptp1 (like the /dev/ptp0 on the Pi5).
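
To see which PTP clocks the kernel has registered, and which driver each belongs to, the names exposed under /sys/class/ptp can be listed. This is a minimal sketch; the function name is hypothetical, and the optional directory argument exists only so that the function can be pointed at a test directory.

```shell
# Print "ptpN: <clock name>" for every PTP clock the kernel exposes.
list_ptp_clocks() {
    base=${1:-/sys/class/ptp}
    for dir in "$base"/ptp*; do
        # Skip the unexpanded glob when no PTP clocks exist.
        [ -r "$dir/clock_name" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/clock_name")"
    done
}

list_ptp_clocks
```

ethtool -T eth0 additionally reports, in its PTP Hardware Clock line, the index of the PHC that the interface is currently using.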

JN19aban · 2025-01-16T05:46:26Z

That is new. So you could, let's say, use two clocks (for example PTP and SyncE together), once it works correctly of course. That sounds nice so far.

I am also wondering whether it supports partitioning (NPAR), or might in future.

lhoward · 2025-01-17T06:43:24Z

It sounds nice but isn't there the issue described here where the kernel can only report one timestamp at a time?

jclark · 2025-01-18T04:55:15Z

I just did an update to next (using sudo rpi-update next), and it is now working. The test above gives a difference of 0.5 milliseconds, which is about right.

uname -a gives

Linux valdeon 6.12.10-v8-16k+ #1840 SMP PREEMPT Fri Jan 17 18:08:09 GMT 2025 aarch64 GNU/Linux

rpi-eeprom-update gives

BOOTLOADER: up to date
   CURRENT: Tue 14 Jan 00:16:48 UTC 2025 (1736813808)
    LATEST: Tue 14 Jan 00:16:48 UTC 2025 (1736813808)
   RELEASE: latest (/lib/firmware/raspberrypi/bootloader-2712/latest)
            Use raspi-config to change the release.

JN19aban · 2025-01-18T05:36:12Z

@jclark So it seems to be fixed in the 6.12.x kernel. Nice.

@lhoward See this https://lwn.net/Articles/859792/

geerlingguy · 2025-01-21T04:01:39Z

Have you raised an issue about this in the Pi kernel repo? They might be willing to backport the fix. Or maybe it's motivation to get to the next LTS kernel sooner, which I would love for many other reasons!

jclark · 2025-01-21T04:16:10Z

@geerlingguy I posted it on the forum thread where they invited feedback on the 6.12 kernel https://forums.raspberrypi.com/viewtopic.php?t=379745&start=100#p2287594. Hopefully that will put it on their radar and encourage them to move to 6.12 soon.

I don't think the backport would be easy. There's the problem with the MAC timestamper being used. The only fix I see for that is to update the macb driver to the new method for setting hardware timestamps. But when I did that, I got mysterious DMA-related errors. Then there's another problem related to the interrupt register in the bcm_phy_ptp driver, and I have no idea what fixed that.

As of Nov 20th, they were talking about moving to 6.12 in a few months, so I'm hoping it will be soon. But sudo rpi-update next is already pretty easy.

Waynechen026 · 2025-02-12T01:04:01Z

Hi @jclark, I have updated to 6.12, but I still cannot use ptp4l on the CM5. The error "timed out while polling for tx timestamp" still occurs. Why is this happening? We are using the TimeProvider TP4100 for time synchronization.

jclark · 2025-02-12T01:58:23Z

@Waynechen026 Are you using the tx_timestamp_timeout option? This is still needed. I usually use 100. If you still get the error, open a new issue with full details of your setup (kernel version, firmware version, logs, hardware etc).
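
The timeout can also be kept in a ptp4l configuration file rather than on the command line. This is a minimal sketch, assuming the file is passed to ptp4l with -f; the function name and file path are hypothetical, and the other settings mirror the -l 6 -m -q flags used in the test at the top of this issue.

```shell
# Write a small ptp4l config; $1 is the output path, $2 the timeout in ms
# (defaults to 100, the value suggested above).
write_ptp4l_conf() {
    cat > "$1" <<EOF
[global]
tx_timestamp_timeout ${2:-100}
logging_level 6
verbose 1
use_syslog 0
EOF
}

conf=${TMPDIR:-/tmp}/ptp4l-cm5.conf   # hypothetical location
write_ptp4l_conf "$conf" 100
cat "$conf"
# Then, for example: sudo ptp4l -i eth0 -f "$conf"
```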
