-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Post about schema coders and benchmarks.
- Loading branch information
Showing
2 changed files
with
127 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
--- | ||
title: "Schemas and Coders and Benchmarks" | ||
date: 2025-01-22T06:24:24-08:00 | ||
tags: | ||
- beam | ||
- go | ||
- hobby sdk | ||
- dev | ||
categories: | ||
- Dev | ||
--- | ||
|
||
This weekend I got nerd snipped into working on a part of my hobby SDK, that | ||
I didn't have too much motivation to do. Beam Schema Row coders. | ||
|
||
But they do need to be done eventually, so I implemented them. | ||
|
||
In particular, what I needed was a way to take the Beam Schema Proto, and turn | ||
it into coder that could produce a dynamic row value. The existing Go SDK in | ||
principle could do it, but it's not easy due to a choice I made years and years | ||
ago. I'm not convinced it's the wrong choice though. Besides, adding the new handling | ||
to the API would take longer than adding it to the hobby SDK, or for what I needed | ||
it for from scratch. | ||
|
||
Perhaps that's not entirely true. I'm being paranoid about continuing to broaden | ||
the API surface of the existing SDK. It's already quite large and complex, and | ||
the interactions are already quite subtle. | ||
|
||
Anyway, I wrote a quick naive implementation of the code, added tests, made them all | ||
pass, and [wrote a benchmark](https://github.com/lostluck/beam-go/blob/9429632fa47a6752671c4a4d6ad0325742485599/internal/schema/schema_test.go#L143). | ||
|
||
```go | ||
func BenchmarkRoundtrip(b *testing.B) { | ||
for _, test := range suite { | ||
b.Run(test.name, func(b *testing.B) { | ||
c := ToCoder(test.schema) | ||
b.ReportAllocs() | ||
b.ResetTimer() | ||
for range b.N { | ||
r := coders.Decode(c, test.data) | ||
if got, want := coders.Encode(c, r), test.data; !cmp.Equal(got, want) { | ||
b.Errorf("round trip decode-encode not equal: want %v, got %v", want, got) | ||
} | ||
} | ||
}) | ||
} | ||
} | ||
``` | ||
|
||
It's a pretty straightforward thing. I took the implementation straight from the | ||
above tests, set it to report allocations, and restart the benchmark timer, and | ||
then get to iterating. | ||
|
||
This gives bad results though. The problem is here: | ||
|
||
```go | ||
if got, want := coders.Encode(c, r), test.data; !cmp.Equal(got, want) { | ||
b.Errorf("round trip decode-encode not equal: want %v, got %v", want, got) | ||
} | ||
``` | ||
|
||
I kept the comparison in, to validate that everything continues to work as | ||
desired through the thousands of runs the coders would be put though. Or I was | ||
lazy about it. `cmp` is not intended to be high performance, it's intended to be | ||
convenient and correct. | ||
|
||
Had I used `bytes.Equal` instead of the general `cmp.Equal`, | ||
I'd have spent less CPU, and probably wouldn't have noticed. | ||
|
||
For this task, I didn't much care about CPU. I cared about allocations, which do | ||
directly affect CPU and memory usage. And `cmp` is allocation heavy by | ||
comparison to a straight byte slice equality check. | ||
|
||
That's not all though. I had used my convenience functions for the benchmark too, | ||
to trivially get the through the decode and encode cycle. | ||
`coders.Decode` and `coders.Encode` are just small wrappers to simplify quick one | ||
off encodings, such as for testing. | ||
|
||
But like `cmp`, they are convenient, not high performance. | ||
|
||
The test body [now looks like this](https://github.com/lostluck/beam-go/blob/53c2c2b073dce1eebbd4090b2d642362f277c895/internal/schema/schema_test.go#L225): | ||
|
||
```go | ||
c := ToCoder(test.schema) | ||
// Mild shenanigans to prevent unnecessary allocations. | ||
enc := coders.NewEncoder() | ||
dec := *coders.NewDecoder(test.data) | ||
n := len(test.data) | ||
want := test.data | ||
|
||
b.ReportAllocs() | ||
b.ResetTimer() | ||
for range b.N { | ||
enc.Reset(n) | ||
dec = *coders.NewDecoder(test.data) | ||
r := c.Decode(&dec) | ||
c.Encode(enc, r) | ||
if got := enc.Data(); !bytes.Equal(got, want) { | ||
b.Errorf("encoding not equal: want %v, got %v", want, got) | ||
} | ||
} | ||
``` | ||
|
||
Moving the `Encoder` and `Decoder` allocations out of the hot loop, removed them | ||
from the profile graph. There's a bit of "fun" with the Decoder to avoid it | ||
getting heap allocated anew for each loop when the test data is being reset. | ||
Having the comparison back in adds a few nanoseconds per run, but helps keep the | ||
code staying robust. | ||
|
||
All of this was a problem largely because I wanted to collect a nice and clean | ||
"before" and "after" versions of the metrics as I cleaned up and changed the | ||
implementation. I like seeing performance improvements. But now the results are | ||
incomparable, and not apples to apples, so I've cleared them away. | ||
|
||
I am happy where this has ended up though. I incorporated the `unsafe` tricks | ||
protocol buffers use to minimize allocations and space in their protoreflect | ||
package: https://github.com/protocolbuffers/protobuf-go/blob/master/reflect/protoreflect/value_unsafe_go121.go. | ||
|
||
The idea is the same, really. Be able to refer to and mutate values and fields | ||
efficiently, against a known schema. I'd use their implementation directly if I | ||
were able to, but they have it quite locked down for the same reasons why I've | ||
put this stuff in an internal package for the time being. I want it to be able | ||
to change. | ||
|
||
This is basically half of a useful article, other than the lesson about | ||
being certain you know what you're measuring in your benchmarks. We'll see if | ||
I turn back the clock and collect clean measurements of the implementations. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,7 +59,6 @@ | |
// hyphens: auto; | ||
|
||
white-space: normal; | ||
max-width: 60rem; | ||
} | ||
} | ||
|
||
|