CM5 does not work with ptp4l #39
Comments
@jclark let me ask something: which bootloader does the CM5 have? I ask because NUMA is enabled in the latest updates, and this might hurt ptp4l, or conversely NUMA might need to be enabled. Almost the same thing happened to me recently on an RPi 5 (not with ptp4l): I used the new bootloader with an older kernel and the performance was very, very bad. https://github.com/raspberrypi/rpi-eeprom/tree/master/firmware-2712/latest Have a look at this. On the CM4, however, with the latest bootloader (NUMA enabled) and the latest kernel, everything flies (I mean overall system performance; ptp4l performance is the same). |
@JN19aban It looks like NUMA can be turned off by setting SDRAM_BANKLOW=0 with rpi-eeprom-config. I will give that a try. |
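For anyone wanting to try the same setting, here is a minimal sketch of editing the bootloader config text. It only rewrites text on stdin; `set_eeprom_option` is a hypothetical helper name, and the assumption is that the config is a list of plain `KEY=VALUE` lines as printed by `rpi-eeprom-config`.

```shell
# Hypothetical helper: set KEY=VALUE in bootloader config text read from stdin.
# An existing KEY line is replaced; otherwise the setting is appended at the end.
set_eeprom_option() {
    awk -F= -v key="$1" -v val="$2" '
        $1 == key { print key "=" val; found = 1; next }
        { print }
        END { if (!found) print key "=" val }
    '
}

# Possible usage (assumption: run on the Pi, then review boot.conf by hand):
#   rpi-eeprom-config | set_eeprom_option SDRAM_BANKLOW 0 > boot.conf
```

The resulting text can then be pasted in via `sudo rpi-eeprom-config --edit`; a reboot is needed before the bootloader picks up the change.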
@jclark update to the latest bootloader first and then disable NUMA. I see that it has updates for the RAM modules, especially for the 8GB versions. |
It has to be on this release....
|
What command were you running? |
With that the offset is Do you have any insight as to why NUMA should be affecting ptp4l so much? |
This command:
To update use this:
But you have to change the bootloader release from stable to latest. To do this you have to run:
1. |
I'm running latest already and get the same output from rpi-eeprom-update as you. |
"Do you have any insight as to why NUMA should be affecting ptp4l so much?" Wrong RAM timings would be my first guess, and NUMA has to be enabled in the kernel as well, otherwise performance is very bad. That said, I do not know precisely what effect it has on ptp4l, because I do not have a CM5 available yet. |
So this value |
With NUMA disabled |
OK try with and re run the test with |
The first result (more than 4 seconds offset) was with latest firmware and nothing added to the firmware config, so NUMA enabled. |
Theoretically is with To add something important. After all the test try with |
@jclark I have been thinking about something else. The NUMA setup (RAM timings etc.) tunes the system for performance, compatibility and so on. In theory it shouldn't affect the hardware timestamps anywhere near as much as the offsets posted earlier suggest. Can you do more testing to check that hardware timestamping is truly enabled in ptp4l and not somehow "faked", so that the latency you see actually comes from software timestamps? Another scenario I am considering is that the new I/O controller (RP1) might somehow affect the performance of the PHY. |
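One rough way to check the first point is to look at what the driver advertises through `ethtool -T`. The sketch below only scans that output for the `hardware-transmit` and `hardware-receive` capability names; `has_hw_timestamping` is an illustrative helper name. It shows what is advertised, not which timestamper the kernel actually ends up using.

```shell
# Hypothetical helper: succeed if `ethtool -T` output on stdin lists both
# hardware transmit and hardware receive timestamping capabilities.
has_hw_timestamping() {
    caps=$(cat)
    printf '%s\n' "$caps" | grep -q 'hardware-transmit' &&
        printf '%s\n' "$caps" | grep -q 'hardware-receive'
}

# Possible usage on the CM5:
#   ethtool -T eth0 | has_hw_timestamping && echo "hardware timestamping advertised"
```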
With SDRAM_BANKLOW=1, I get -26605883ns (so -26,605,883ns, about -0.027s). Note that in all cases the value isn't constant: it changes gradually. So I wonder how reproducible these are. Going back to no SDRAM_BANKLOW entry, I get 469640908ns (so 469,640,908ns, about 0.47s). Going back to SDRAM_BANKLOW=1, I get 523767888ns (so 523,767,888ns, about 0.52s). So not reproducible at all. I suspect this is not the issue. |
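The raw nanosecond values are hard to read at a glance, so here is a small sketch for converting them to seconds. `ns_to_seconds` is an illustrative name, and it assumes an `awk` with ordinary double-precision arithmetic.

```shell
# Hypothetical helper: convert a nanosecond offset to seconds, 3 decimal places.
ns_to_seconds() {
    awk -v ns="$1" 'BEGIN { printf "%.3f\n", ns / 1000000000 }'
}

# ns_to_seconds 469640908   -> 0.470
# ns_to_seconds -26605883   -> -0.027
```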
It might be. I will keep an eye out for more news on the RPi 5 / CM5. I will probably have a CM5 in my hands by the 15th of February if all goes well. I still strongly suspect the two things from my previous comment, though. |
Wireshark shows clearly that the problem is that the hardware transmit timestamps from the CM5 are incorrect. After some quality time with bpftrace, I think I know what is going on. The behaviour is a kernel bug. It is a result of three things:
The overall result is that the hardware transmit timestamping code in the PHY level driver doesn't get called, and I think there is a weird mix of the two PHCs being used, which is why there are wrong timestamps. In kernel 6.8, the macb driver is updated to use the newer ndo_hwtstamp_set/_get, so this problem should go away. I tried applying the patch for this to the current Raspberry Pi kernel, but I got strange errors from the macb driver ("DMA bus error: HRESP not OK"), and I have absolutely no idea what is causing those. So a simple workaround is to compile the current kernel commenting about Unfortunately after fixing this problem, another appears: ptp4l gives the infamous |
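Since this explanation turns on there being two PHCs, it can help to see which PTP clocks the kernel has registered. This is a minimal sketch, assuming the standard Linux sysfs layout where each clock appears as `/sys/class/ptp/ptpN` with a `clock_name` file. `list_phcs` is an illustrative helper name, and the optional directory argument exists only so the function can be tried against a fake tree.

```shell
# Hypothetical helper: print "ptpN <clock_name>" for each registered PTP clock.
# The base directory defaults to the real sysfs location.
list_phcs() {
    base=${1:-/sys/class/ptp}
    for dir in "$base"/ptp*; do
        [ -r "$dir/clock_name" ] || continue
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/clock_name")"
    done
}

# Possible usage on the CM5:
#   list_phcs
```

Comparing this list with the PHC index reported by `ethtool -T eth0` should show which of the clocks the interface claims to use.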
This means that you see on |
Right: the CM5 has a |
That is new. So you could, let's say, use 2 clocks (for example PTP and SyncE together), once it works correctly of course. That sounds nice so far. I am also wondering if it also supports |
It sounds nice but isn't there the issue described here where the kernel can only report one timestamp at a time? |
I just did an update to uname -a gives
rpi-eeprom-update gives
|
@jclark So it seems to be fixed on kernel @lhoward See this https://lwn.net/Articles/859792/ |
Have you raised an issue about this in the Pi kernel repo? They might be willing to backport the fix. Or maybe it would be motivation to get to the next LTS kernel sooner, which I would love for many other reasons! |
@geerlingguy I posted it on the forum thread where they invited feedback on the 6.12 kernel https://forums.raspberrypi.com/viewtopic.php?t=379745&start=100#p2287594. Hopefully that will put it on their radar and encourage them to move to 6.12 soon. I don't think the backport would be easy. There's the problem with the MAC timestamper being used. The only fix I see for that is to update the macb driver to the new method for setting hardware timestamps. But when I did that, I got mysterious DMA-related errors. Then there's another problem related to the interrupt register in the bcm_phy_ptp driver, and I have no idea what fixed that. As of Nov 20th, they were talking about moving to 6.12 in a few months, so I'm hoping it will be soon. But |
Hi @jclark, I have updated to 6.12, but I still cannot use ptp4l on the CM5. The error "timed out while polling for tx timestamp" still occurs. Why is this happening? We are using the TimeProvider TP4100 for time synchronization. |
@Waynechen026 Are you using the tx_timestamp_timeout option? This is still needed. I usually use 100. If you still get the error, open a new issue with full details of your setup (kernel version, firmware version, logs, hardware etc). |
The CM5 doesn't work with ptp4l. Here's how to test without needing any special hardware.
- Put `server ntp.lan iburst` in `/etc/chrony/chrony.conf`. Make sure chrony is successfully syncing CM4 and CM5 using `chronyc sources`.
- Run `sudo apt install linuxptp`.
- Run `sudo phc2sys -q -m -l 6 -c eth0 -s CLOCK_REALTIME -O 0 >phc2sys.log &`, then `tail -f phc2sys.log`, and wait for 30 seconds or so until the sys offsets are consistently small (absolute value < 100).
- Run `sudo ptp4l -i eth0 --tx_timestamp_timeout 100 -l 6 -m -q`. After a few seconds it should display `selected local clock 2ccf67.fffe.c114d2 as best master`, where the first and last parts of the hex number shown correspond to the MAC address of eth0 on the CM5 (`2c:cf:67:c1:14:d2` in this case).
- Run `sudo ptp4l -i eth0 --tx_timestamp_timeout 100 -l 6 -m -q -s >ptp4l.log &`, then `tail -f ptp4l.log`, and wait till the master offset has settled down (absolute value < 100).
- Run `phc_ctl eth0 cmp`. This is the difference in nanoseconds between the system clock and the PHC. If everything is working properly this should certainly be less than a millisecond, i.e. 1,000,000. The value I actually see varies all the time. Right now it is `4629122817ns`, which is about 4.6 seconds. However, I get wildly different results on different days.

This is on an 8GB CM5, with 8 Jan 2025 firmware (as shown by rpi-eeprom-update), kernel 6.6.62+rpt-rpi-2712.
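The clock identity in the `selected local clock` message can be predicted from the MAC address, which makes it easier to confirm which board became master. This sketch assumes the usual EUI-64 style mapping, with `fffe` inserted between the two halves of the MAC, which matches the example in the steps above. `mac_to_clock_id` is an illustrative name.

```shell
# Hypothetical helper: turn a MAC address such as 2c:cf:67:c1:14:d2 into the
# clock identity form printed by ptp4l (first 3 bytes, "fffe", last 3 bytes).
mac_to_clock_id() {
    hex=$(printf '%s' "$1" | tr -d ':' | tr 'A-F' 'a-f')
    first=$(printf '%s' "$hex" | cut -c1-6)
    last=$(printf '%s' "$hex" | cut -c7-12)
    printf '%s.fffe.%s\n' "$first" "$last"
}

# mac_to_clock_id 2c:cf:67:c1:14:d2   -> 2ccf67.fffe.c114d2
```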
Since the CM4 is known good, this is going to be a hardware, firmware or driver problem on the CM5.
This isn't a problem with phc2sys: you can see the same problem just using `phc_ctl eth0 set` on the CM5 to do a one-time synchronization of the system clock.
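For the "wait until the offsets are consistently small" steps in the test procedure, watching `tail -f` by eye works, but a rough automated check is also possible. This sketch assumes log lines contain the word `offset` followed by the value in nanoseconds, as in the `master offset` and `sys offset` lines from ptp4l and phc2sys. `offsets_settled` and its two parameters are illustrative.

```shell
# Hypothetical helper: succeed if the last COUNT offset values in the log on
# stdin all have an absolute value below LIMIT.
# usage: offsets_settled LIMIT COUNT < logfile
offsets_settled() {
    awk -v limit="$1" -v count="$2" '
        {
            # Find the token "offset" and take the field after it as the value.
            for (i = 1; i < NF; i++) {
                if ($i == "offset") {
                    n++
                    v = $(i + 1) + 0
                    last[n % count] = (v < 0 ? -v : v)   # ring buffer of |offset|
                    break
                }
            }
        }
        END {
            if (n < count) exit 1
            for (k = 0; k < count; k++) if (last[k] >= limit) exit 1
            exit 0
        }
    '
}

# Possible usage:
#   offsets_settled 100 10 < ptp4l.log && echo "master offset has settled"
```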