[WIP] Optimize pack() #18524

Draft: wants to merge 3 commits into base: master

Conversation

nielsdos
Member

@nielsdos nielsdos commented May 8, 2025

No description provided.

@staabm
Contributor

staabm commented May 21, 2025

Maybe an interesting benchmark/test case for unpack: stomp-php/stomp-php#184

<?php
$file = file_get_contents('FILE');
echo count(unpack('C*', $file)) . "\n";

vs.

<?php
$file = file_get_contents('FILE');
echo strlen($file) . "\n";

using truncate -s 80M FILE.

the strlen() variant is a lot faster

@nielsdos
Member Author

nielsdos commented Jun 8, 2025

Unpack will always be slower than just strlen. However, your code revealed that repetitions were handled in a slow way where lots of temporary strings were created and then parsed. I opened a PR to fix that particular issue: #18803
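
To illustrate why unpack('C*', ...) cannot match strlen(): a length-prefixed string can report its size without touching the data, while unpacking has to produce one integer per input byte. The following is a rough sketch in C under that assumption only; `sketch_string`, `sketch_strlen` and `sketch_unpack_bytes` are hypothetical names and do not mirror the php-src implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a length-prefixed string. */
typedef struct {
    size_t len;
    const unsigned char *data;
} sketch_string;

/* Constant time: the length is stored, so no byte is inspected. */
static size_t sketch_strlen(const sketch_string *s)
{
    return s->len;
}

/* Linear time and memory: one integer is materialized per input byte.
 * Returns a malloc'ed array of s->len values that the caller must free,
 * or NULL if the string is empty or allocation fails. */
static long *sketch_unpack_bytes(const sketch_string *s)
{
    if (s->len == 0) {
        return NULL;
    }
    long *values = malloc(s->len * sizeof *values);
    if (values == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < s->len; i++) {
        values[i] = s->data[i]; /* unsigned byte value, 0..255 */
    }
    return values;
}
```

For the 80M file from the benchmark, the second function would have to fill roughly 84 million array slots, whereas the first returns immediately.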

@nielsdos nielsdos mentioned this pull request Jun 10, 2025
@divinity76
Contributor

divinity76 commented Jun 10, 2025

could benchmark

static inline void php_pack(const zval *val, size_t size,
                            php_pack_endianness enc, char *out)
{
    zend_long z = zval_get_long(val);

    if ((enc == PHP_LITTLE_ENDIAN) != MACHINE_LITTLE_ENDIAN) {
        z = PHP_LONG_BSWAP(z);
    }
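    /* After the optional swap, z is in the target byte order: its low `size`
     * bytes come first for little-endian output and last for big-endian output. */
    memcpy(out, (char*)&z + (enc == PHP_LITTLE_ENDIAN ? 0 : sizeof(z) - size), size);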
}

might be faster
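
For readers who want to try the swap-then-memcpy idea outside php-src, here is a minimal self-contained sketch. All names (`sketch_pack`, `sketch_endianness`, `host_is_little_endian`, `bswap64`) are hypothetical stand-ins for the php-src macros and types, the host byte order is probed at run time instead of via a build-time macro, and the copy offset is chosen from the target byte order. It is an illustration of the technique, not the code from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for this sketch; not the identifiers used in php-src. */
typedef enum { SKETCH_LITTLE_ENDIAN, SKETCH_BIG_ENDIAN } sketch_endianness;

/* Detect the host byte order by looking at the first byte of the value 1. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the eight bytes of v. */
static uint64_t bswap64(uint64_t v)
{
    v = ((v & 0x00FF00FF00FF00FFULL) << 8)  | ((v >> 8)  & 0x00FF00FF00FF00FFULL);
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFULL);
    return (v << 32) | (v >> 32);
}

/* Write the low `size` bytes of `value` to `out` in the requested byte order. */
static void sketch_pack(uint64_t value, size_t size,
                        sketch_endianness order, unsigned char *out)
{
    assert(size >= 1 && size <= sizeof(value));

    int want_little = (order == SKETCH_LITTLE_ENDIAN);
    if (want_little != host_is_little_endian()) {
        value = bswap64(value);
    }
    /* value is now laid out in the target order: the low bytes are at the
     * start for little-endian output and at the end for big-endian output. */
    size_t offset = want_little ? 0 : sizeof(value) - size;
    memcpy(out, (const unsigned char *)&value + offset, size);
}
```

With this sketch, `sketch_pack(0x1234, 2, SKETCH_BIG_ENDIAN, buf)` writes the bytes 0x12 0x34, and the same call with `SKETCH_LITTLE_ENDIAN` writes 0x34 0x12, regardless of the host byte order.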

@nielsdos
Member Author

Very strangely, my original code with zend_never_inline is slightly faster than master, but your code without zend_never_inline seems to beat that in my test with the 'J' specifier. Testing some more stuff...

@divinity76
Contributor

If the performance difference is insignificant/marginal, as in hardly even benchmarkable, I would recommend just ignoring it.

I like how this makes the pack code much simpler (assuming it actually works on BE).

@nielsdos
Member Author

I think I managed to make the compiler happy and let it make good inlining decisions while keeping the code simple.

For example for this:

for ($i = 0; $i < 10_000_000; ++$i) {
  pack("J", 0x7FFFFFFFFFFFFFFF);
}

On an i7-4790:

Benchmark 1: ./sapi/cli/php pack.php
  Time (mean ± σ):     408.8 ms ±   3.4 ms    [User: 406.1 ms, System: 1.6 ms]
  Range (min … max):   403.6 ms … 413.6 ms    10 runs
 
Benchmark 2: ./sapi/cli/php_old pack.php
  Time (mean ± σ):     451.7 ms ±   7.7 ms    [User: 448.5 ms, System: 2.0 ms]
  Range (min … max):   442.8 ms … 461.2 ms    10 runs
 
Summary
  ./sapi/cli/php pack.php ran
    1.11 ± 0.02 times faster than ./sapi/cli/php_old pack.php

And for this:

for ($i = 0; $i < 4000000; $i++) {
  pack("nvc*", 0x1234, 0x5678, 65, 66);
}

On the same machine:

Benchmark 1: ./sapi/cli/php pack.php
  Time (mean ± σ):     239.3 ms ±   6.0 ms    [User: 236.2 ms, System: 2.3 ms]
  Range (min … max):   233.2 ms … 256.8 ms    12 runs
 
Benchmark 2: ./sapi/cli/php_old pack.php
  Time (mean ± σ):     271.9 ms ±   3.3 ms    [User: 269.7 ms, System: 1.3 ms]
  Range (min … max):   267.4 ms … 279.0 ms    11 runs
 
Summary
  ./sapi/cli/php pack.php ran
    1.14 ± 0.03 times faster than ./sapi/cli/php_old pack.php

Let's hope it's reproducible
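
For reference, the format codes in the two benchmarks produce fixed byte layouts: per the PHP manual, "J" is a 64-bit big-endian integer, "n" a 16-bit big-endian integer, "v" a 16-bit little-endian integer, and "c" a signed char. The sketch below reproduces the layout of the "nvc*" call with plain shifts; `sketch_pack_nvc` is a hypothetical helper written only for illustration and is unrelated to the code in this PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: emulate the byte layout of pack("nvc*", n, v, chars...).
 * `out` must have room for 4 + count bytes. Returns the number of bytes written. */
static size_t sketch_pack_nvc(uint16_t n, uint16_t v,
                              const signed char *chars, size_t count,
                              unsigned char *out)
{
    size_t pos = 0;

    /* 'n': 16-bit, most significant byte first. */
    out[pos++] = (unsigned char)(n >> 8);
    out[pos++] = (unsigned char)(n & 0xFF);

    /* 'v': 16-bit, least significant byte first. */
    out[pos++] = (unsigned char)(v & 0xFF);
    out[pos++] = (unsigned char)(v >> 8);

    /* 'c*': one byte per remaining argument. */
    for (size_t i = 0; i < count; i++) {
        out[pos++] = (unsigned char)chars[i];
    }
    return pos;
}
```

Called with the benchmark's arguments (0x1234, 0x5678, 65, 66), this writes the six bytes 12 34 78 56 41 42, which is the layout pack("nvc*", ...) is documented to produce.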


4 participants