
Conversation

@last-genius
Collaborator

@last-genius commented Nov 12, 2025

Opening this to see how it runs in the CI.

This is a very naive implementation where we read the file byte-by-byte, to see how the worst possible case fares in the benchmarks.

Next step: implement our own buffering (most importantly, it needs to support native dup-ing, preserving the buffer and the current seek position).
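For context, the naive version is essentially a read(2) loop that issues one syscall per byte. A minimal sketch with illustrative names, not the actual Oils code:

```cpp
#include <unistd.h>

#include <string>

// Naive readline: one read() syscall per byte, i.e. the worst case that
// this PR benchmarks. Returns the line including the trailing '\n';
// an empty string means EOF. (EINTR handling omitted for brevity.)
std::string ReadLineNaive(int fd) {
  std::string line;
  char c;
  while (read(fd, &c, 1) == 1) {
    line.push_back(c);
    if (c == '\n') {
      break;
    }
  }
  return line;
}
```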

@last-genius force-pushed the asv/posix-io branch 2 times, most recently from 1a746a1 to 9433cf5 on November 12, 2025 at 20:49
@last-genius
Collaborator Author

There are still a few correctness issues left that I can't quite figure out, but benchmarks show that without buffered I/O (reading byte-by-byte), we are definitely slower than bash. Depending on the case, this change makes osh_cpp 5% to 300% slower than before.

@andychu
Contributor

andychu commented Nov 15, 2025

Hmm this is a very good experiment! Thank you

And it highlights that the only thing we really use is readline()

Let me see what other shells do for reading a line

I'll start a thread on Zulip

Magic of POSIX I/O!
Signed-off-by: Andrii Sultanov <[email protected]>
@last-genius force-pushed the asv/posix-io branch 2 times, most recently from 4000146 to 8df12fc on November 19, 2025 at 15:33
@last-genius
Collaborator Author

Implemented buffering; it seems to be mostly functional.

There's this odd error in benchmarks though:

if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then
  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: unrecognized options: $ac_unrecognized_opts" >&5
$as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;}
fi
^~
benchmarks/testdata/configure-coreutils:67955: Couldn't find terminator for here doc that starts here

Some of the mycpp-examples are failing as well, but the cause is not completely clear to me.

The algorithm is as follows (a sketch follows the list):
1. Fill the buffer.
2. Move through the buffer (incrementing pos_), splitting it into
   segments at newlines. Return the next segment on each call.
3. If no newline is found and we are in the middle of the buffer, move
   the segment since the last newline to the start of the buffer, fill
   the rest, and go to step 2.
4. If no newline is found and we are at the beginning of the buffer (no
   newlines seen previously), the line is too long.
5. If EOF is seen while filling the buffer, return the remaining
   contents as a single segment.
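
A minimal sketch of that algorithm, with hypothetical names and a fixed buffer size; the real implementation will differ in error handling and API:

```cpp
#include <unistd.h>

#include <cstring>
#include <stdexcept>
#include <string>

// Sketch of the buffered readline described above. pos_ is the start of
// unread data; end_ is one past the last valid byte.
class LineReader {
 public:
  explicit LineReader(int fd) : fd_(fd) {}

  // Returns the next line including '\n'; empty string at EOF.
  std::string ReadLine() {
    while (true) {
      // Step 2: scan the unread part of the buffer for a newline.
      char* nl =
          static_cast<char*>(memchr(buf_ + pos_, '\n', end_ - pos_));
      if (nl != nullptr) {
        size_t n = nl - (buf_ + pos_) + 1;  // include the '\n'
        std::string line(buf_ + pos_, n);
        pos_ += n;
        return line;
      }
      // Step 4: buffer is full and holds no newline -- line is too long.
      if (pos_ == 0 && end_ == sizeof(buf_)) {
        throw std::runtime_error("line too long");
      }
      // Step 3: move the partial segment to the start of the buffer.
      size_t partial = end_ - pos_;
      memmove(buf_, buf_ + pos_, partial);
      pos_ = 0;
      end_ = partial;
      // Step 1: fill the rest of the buffer.
      ssize_t n = read(fd_, buf_ + end_, sizeof(buf_) - end_);
      if (n < 0) {
        throw std::runtime_error("read() failed");
      }
      if (n == 0) {
        // Step 5: EOF -- return whatever is left as a single segment.
        std::string line(buf_, end_);
        pos_ = end_ = 0;
        return line;
      }
      end_ += static_cast<size_t>(n);
    }
  }

 private:
  int fd_;
  char buf_[4096];
  size_t pos_ = 0;
  size_t end_ = 0;
};
```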

@last-genius
Collaborator Author

Figured out the benchmarks error; only the examples one is left now...

@last-genius
Collaborator Author

Buffering certainly improved performance over the one-byte-read implementation, but there's still some way to go to match libc I/O performance.

Table of ratios to bash (lower is better):

|        Case       | Baseline | One-Byte Read | Buffering |
|:-----------------:|:--------:|:-------------:|:---------:|
| abuild-print-help |   1.30   |      3.95     |    1.45   |
| configure.tcc     |   1.04   |      1.24     |    1.04   |
| hello-world       |   2.45   |      2.66     |    2.61   |

Buffering benchmarks: https://op.oilshell.org/uuu/github-jobs/10880/benchmarks3.wwz/_tmp/osh-runtime/index.html
One-byte read: https://op.oilshell.org/uuu/github-jobs/10810/benchmarks3.wwz/_tmp/osh-runtime/index.html
Baseline: https://op.oilshell.org/uuu/github-jobs/10895/benchmarks3.wwz/_tmp/osh-runtime/index.html

@andychu
Contributor

andychu commented Nov 26, 2025

Ah OK interesting ... I think looking at end-to-end benchmarks is good, but we could also write some more focused benchmarks

Although maybe first, I'm not sure why we can use our own buffering, but we can't use libc's buffering?


BTW buffering may relate to our io.stdin API, which we need to figure out what to do with

@andychu
Contributor

andychu commented Nov 26, 2025

Hm, so at a high level, would this lead to a fix for both

* the `ulimit` bug with descriptor 100 versus 10
* the `set -x` and redirects

? I may have paged the details out of memory ...

@last-genius
Collaborator Author

> Hm, so at a high level, would this lead to a fix for both
>
> * the `ulimit` bug with descriptor 100 versus 10
> * the `set -x` and redirects
>
> ? I may have paged the details out of memory ...

ulimit has a separate fix, but currently dup-ing leads to additional seeking: on dup, we create a new FILE and lose the old buffer, so we need to seek back to our actual position in the file, which was raised as a concern. Having our own buffering implementation would let us implement a CFile.dup() (instead of posix.dup()) that avoids the extra seeking, preserving the buffer while changing the underlying file descriptor. This is not yet implemented, but shouldn't be too hard once we agree on the foundations.
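
To make the idea concrete, here is a hedged sketch of what such a dup could look like; CFile and its fields are hypothetical names here, and fcntl(F_DUPFD, ...) stands in for whatever descriptor-moving the shell actually needs:

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cstddef>

// Sketch: because we own the buffer, dup-ing can swap the underlying
// descriptor without discarding buffered bytes or seeking.
class CFile {
 public:
  // Move the stream onto a descriptor >= min_fd (e.g. when the shell
  // saves descriptors out of the way), keeping the read buffer intact.
  bool DupAbove(int min_fd) {
    int new_fd = fcntl(fd_, F_DUPFD, min_fd);  // like dup(), but >= min_fd
    if (new_fd < 0) {
      return false;
    }
    close(fd_);
    fd_ = new_fd;
    // buf_, pos_, end_ are untouched: the dup'd descriptor shares the
    // kernel file offset, so no extra lseek() is needed.
    return true;
  }

 private:
  int fd_ = -1;
  char buf_[4096];
  size_t pos_ = 0;  // start of unread buffered data
  size_t end_ = 0;  // one past the last valid buffered byte
};
```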

The set -x bug will benefit from the current state of this PR since there's no buffering for writes, so Aidan's /dev/stderr fix works in C++ (but not in Python, which still has buffered writes; it would still require this change: #2541 (comment)). And if the /dev/stderr fix is not taken up, it could benefit from the "persistent" fds work for the ulimit bug.
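
For illustration, unbuffered tracing boils down to a direct write(2), so each trace line reaches the (possibly redirected) descriptor immediately. A hedged sketch, not the actual trace code:

```cpp
#include <unistd.h>

#include <string>

// Each set -x trace line goes out in one write() syscall: nothing is
// left sitting in a userspace buffer when a redirect swaps the
// descriptor. (Partial writes and EINTR are ignored for brevity.)
void WriteTraceLine(const std::string& line) {
  write(STDERR_FILENO, line.data(), line.size());
}
```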
