Commit 0ea0b91
Add writeVarintSve for aarch64 - retry
Summary:
Implemented an explicit SVE version of writeVarint.
Throughput for 64-bit types improves by ~15% (u64_any: 2.33us to 1.96us per iteration).
The 16-bit and 32-bit cases show a smaller improvement (u16_any and u32_any: 2.00us to 1.97us per iteration).
All three functions are branch-free; their disassembly is available at https://godbolt.org/z/jG5d8Wfe8
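For readers unfamiliar with the encoding being accelerated, here is a minimal scalar sketch of varint writing. It is not the SVE code from this commit: `varintSize` and `writeVarintScalar` are hypothetical names chosen for illustration, and `__builtin_clzll` assumes GCC or Clang. The size helper is branch-free, while the reference encoder uses a plain loop for clarity.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes the varint encoding of `value` occupies (1..10).
// `value | 1` avoids the undefined clz(0) case, so zero takes one byte.
// Assumption: GCC/Clang builtin __builtin_clzll is available.
inline std::size_t varintSize(std::uint64_t value) {
  unsigned bits = 64u - static_cast<unsigned>(__builtin_clzll(value | 1));
  return (bits + 6u) / 7u;  // ceil(bits / 7)
}

// Scalar reference encoder: 7 payload bits per byte, least-significant
// group first, high bit set on every byte except the last.
// `out` must have room for at least 10 bytes. Returns bytes written.
inline std::size_t writeVarintScalar(std::uint64_t value, std::uint8_t* out) {
  std::size_t n = 0;
  while (value >= 0x80) {
    out[n++] = static_cast<std::uint8_t>((value & 0x7F) | 0x80);
    value >>= 7;
  }
  out[n++] = static_cast<std::uint8_t>(value);
  return n;
}
```

Knowing the output length up front from the bit width is one common way to avoid a data-dependent loop. Whether the SVE version in this commit does so is not stated here, so treat this only as background on the format.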
before (columns: benchmark, relative, time/iter, iters/s):
bench_write(u16_any_branch_free) 110.66% 2.00us 500.10K
bench_write(u32_any_branch_free) 126.90% 2.00us 499.37K
bench_write(u64_any_branch_free) 193.56% 2.33us 429.37K
bench_write(u16_1b_branch_free) 99.562% 1.91us 522.97K
bench_write(u16_2b_branch_free) 114.92% 2.00us 500.59K
bench_write(u16_3b_branch_free) 111.66% 2.00us 500.99K
bench_write(u32_1b_branch_free) 97.918% 1.93us 518.38K
bench_write(u32_2b_branch_free) 113.76% 1.99us 502.29K
bench_write(u32_3b_branch_free) 111.14% 1.99us 503.03K
bench_write(u32_4b_branch_free) 115.72% 1.97us 507.52K
bench_write(u32_5b_branch_free) 122.05% 2.00us 498.82K
bench_write(u64_1b_branch_free) 99.089% 1.95us 511.71K
bench_write(u64_2b_branch_free) 90.484% 2.53us 396.00K
bench_write(u64_3b_branch_free) 93.335% 2.38us 419.63K
bench_write(u64_4b_branch_free) 100.61% 2.24us 446.86K
bench_write(u64_5b_branch_free) 123.18% 2.37us 421.24K
bench_write(u64_6b_branch_free) 120.10% 2.33us 429.84K
bench_write(u64_7b_branch_free) 144.69% 2.36us 423.79K
bench_write(u64_8b_branch_free) 149.44% 2.25us 443.92K
bench_write(u64_9b_branch_free) 174.37% 2.31us 433.60K
bench_write(u64_10b_branch_free) 176.81% 2.28us 438.61K
bench_write(exponential_1b_branch_free) 108.05% 1.91us 522.52K
bench_write(exponential_2b_branch_free) 118.34% 1.98us 504.37K
bench_write(exponential_3b_branch_free) 114.22% 1.99us 501.87K
after (columns: benchmark, relative, time/iter, iters/s):
bench_write(u16_any_branch_free) 115.30% 1.97us 507.43K
bench_write(u32_any_branch_free) 130.06% 1.97us 508.40K
bench_write(u64_any_branch_free) 226.45% 1.96us 509.18K
bench_write(u16_1b_branch_free) 101.37% 1.84us 543.01K
bench_write(u16_2b_branch_free) 116.65% 1.97us 508.51K
bench_write(u16_3b_branch_free) 111.17% 1.96us 510.12K
bench_write(u32_1b_branch_free) 99.679% 1.93us 519.42K
bench_write(u32_2b_branch_free) 115.98% 1.98us 506.04K
bench_write(u32_3b_branch_free) 111.45% 1.98us 503.85K
bench_write(u32_4b_branch_free) 116.04% 1.95us 513.18K
bench_write(u32_5b_branch_free) 124.59% 1.97us 508.35K
bench_write(u64_1b_branch_free) 99.669% 1.91us 522.26K
bench_write(u64_2b_branch_free) 117.53% 1.93us 518.86K
bench_write(u64_3b_branch_free) 111.95% 1.95us 511.77K
bench_write(u64_4b_branch_free) 111.29% 1.98us 504.98K
bench_write(u64_5b_branch_free) 124.53% 1.96us 510.52K
bench_write(u64_6b_branch_free) 145.48% 1.90us 526.18K
bench_write(u64_7b_branch_free) 172.51% 1.97us 506.83K
bench_write(u64_8b_branch_free) 174.92% 1.95us 514.13K
bench_write(u64_9b_branch_free) 202.27% 1.97us 508.08K
bench_write(u64_10b_branch_free) 205.43% 1.96us 510.44K
bench_write(exponential_1b_branch_free) 105.67% 1.91us 523.63K
bench_write(exponential_2b_branch_free) 116.10% 1.95us 512.64K
bench_write(exponential_3b_branch_free) 119.08% 1.95us 513.34K
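The before/after rows can be compared in two ways: as a reduction in time per iteration or as a gain in iterations per second. The two give different percentages for the same change. A small sketch of the arithmetic follows; the helper names are hypothetical, and the sample figures in the usage note are the u64_any rows above.

```cpp
// Percentage reduction in time per iteration (positive means faster).
inline double timeReductionPct(double beforeUs, double afterUs) {
  return (beforeUs - afterUs) / beforeUs * 100.0;
}

// Percentage gain in iterations per second (positive means faster).
inline double throughputGainPct(double beforeIps, double afterIps) {
  return (afterIps / beforeIps - 1.0) * 100.0;
}
```

For u64_any, `timeReductionPct(2.33, 1.96)` is close to 16% and `throughputGainPct(429.37, 509.18)` is close to 19%, which brackets the rough figure quoted in the summary.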
Reviewed By: embg
Differential Revision: D735130031
Parent: ea2855e
2 files changed (+106, -12)
Directories: third-party/thrift/src/thrift/lib/cpp/util, test
Lines changed: 104 additions & 10 deletions
Lines changed: 2 additions & 2 deletions