Open
Labels
arrow, enhancement, good first issue, performance
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
While reviewing #7263 from @zhuqi-lucas I noticed that the JSON reader allocates a new `String` for each primitive value it coerces into a string field:
arrow-rs/arrow-json/src/reader/string_array.rs
Lines 110 to 120 in f4fde76
TapeElement::I32(n) if coerce_primitive => {
    builder.append_value(n.to_string());
}
TapeElement::F32(n) if coerce_primitive => {
    builder.append_value(n.to_string());
}
TapeElement::F64(high) if coerce_primitive => match tape.get(p + 1) {
    TapeElement::F32(low) => {
        let val = f64::from_bits(((high as u64) << 32) | low as u64);
        builder.append_value(val.to_string());
    }
Describe the solution you'd like
I would like to make the JSON reader faster by not allocating in the inner loop.
Describe alternatives you've considered
I think instead of doing
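for ... {
    builder.append_value(n.to_string())
}
A typically faster pattern is
use std::fmt::Write; // needed for `write!` into a String
// temp buffer
let mut s = String::new();
for ... {
    s.clear(); // reuse allocation
    write!(&mut s, "{n}").unwrap(); // formatting a primitive into a String does not fail
    builder.append_value(&s); // borrow, so the buffer is still usable on the next iteration
}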
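To make the pattern concrete without pulling in the arrow crates, here is a minimal std-only sketch. `MiniStringBuilder` and `append_formatted` are hypothetical names invented for illustration; the builder only mimics the idea of copying each value into one contiguous buffer and is not the real arrow builder API. The point is that the scratch `String` is allocated once and reused for every value.

```rust
use std::fmt::{Display, Write};

/// Hypothetical stand-in for a string array builder: every value is copied
/// into one contiguous buffer, and `offsets` records where each value ends.
#[derive(Default)]
struct MiniStringBuilder {
    data: String,
    offsets: Vec<usize>,
}

impl MiniStringBuilder {
    /// Copies `value` into the shared buffer; the caller keeps its own buffer.
    fn append_value(&mut self, value: impl AsRef<str>) {
        self.data.push_str(value.as_ref());
        self.offsets.push(self.data.len());
    }

    /// Returns the `i`-th appended value.
    fn value(&self, i: usize) -> &str {
        let start = if i == 0 { 0 } else { self.offsets[i - 1] };
        &self.data[start..self.offsets[i]]
    }

    /// Number of values appended so far.
    fn len(&self) -> usize {
        self.offsets.len()
    }
}

/// Formats each item into a single reused scratch buffer, then appends it.
fn append_formatted<T: Display>(builder: &mut MiniStringBuilder, items: &[T]) {
    let mut scratch = String::new(); // allocated at most once, grown as needed
    for item in items {
        scratch.clear(); // drops the contents but keeps the capacity
        write!(&mut scratch, "{}", item).expect("writing to a String should not fail");
        builder.append_value(&scratch); // borrow so `scratch` can be reused
    }
}
```

Because `write!` goes through the same `Display` implementation as `to_string()`, the appended text should be identical; only the per-value allocation goes away. Whether this is measurably faster in the reader would still need a benchmark.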
Additional context