HADOOP-19400: Expand specification and contract test coverage for InputStream reads. #7367
base: trunk
Hi @steveloughran! This is the follow-up you requested for more specification/test coverage of input stream reads. I've confirmed this is passing on local, HDFS and S3A (GCS as the back-end). I can't easily check the others, unfortunately.
These new tests also pass for `GoogleHadoopFileSystem`.
🎊 +1 overall
This message was automatically generated.
all code is good; comments about the spec. Nice work!
@@ -175,6 +175,24 @@ must block until at least one byte is returned. Thus, for any data source
of length greater than zero, repeated invocations of this `read()` operation
will eventually read all the data.

#### Implementation Notes
- Is there a way to add these specification statements to the Python statements? Those are designed to be what people write tests from.
- Please use SHALL/MUST/MAY rather than other terms such as "is expected to". Yes, yours is the better prose, but we want no ambiguity here.
Thanks for the review, Steve.
- I pushed up a change trying to move some of this up. A few of the implementation notes remain to state the unique implementation choices of HDFS.
- The RFC verbiage is important. Thanks!
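For readers following the thread, the blocking guarantee quoted in the first hunk (each read returns at least one byte, so repeated reads eventually consume all the data) can be illustrated with a plain JDK stream. This is a minimal sketch, not code from the PR: the class `ReadLoopSketch` and the method `readAll` are hypothetical names, and a `ByteArrayInputStream` stands in for a filesystem stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical illustration class; not part of Hadoop or of this PR.
class ReadLoopSketch {

  /** Drains a stream using only read(byte[], int, int). */
  static byte[] readAll(InputStream in, int chunkSize) throws IOException {
    if (chunkSize <= 0) {
      // A zero-length read returns 0 without consuming data, so the loop would never end.
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[chunkSize];
    int n;
    // Each call either returns at least one byte or -1 at end of stream.
    // It may return fewer bytes than requested, so never assume a full buffer.
    while ((n = in.read(buffer, 0, buffer.length)) != -1) {
      out.write(buffer, 0, n);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3, 4, 5, 6, 7};
    byte[] copy = readAll(new ByteArrayInputStream(data), 3);
    System.out.println(Arrays.equals(data, copy));
  }
}
```

The loop terminates only because a read with a positive length cannot return 0, which is the property the specification text relies on.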
`IndexOutOfBoundsException` is thrown.
1. A read of `length` 0 is a no-op, and the returned `result` is 0. No exception is thrown, assuming
all other arguments are valid.
1. Reads through any method are expected to return the same data.
ooh, good one this. It's implicit in the model of data as an array of bytes, but yes, nice to call out.
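The last two statements in the hunk above (a zero-length read is a no-op, and every read method sees the same bytes) can be checked against any stream. The following is a rough sketch using only the JDK; `ReadEquivalenceSketch` and its methods are hypothetical names for illustration, and a `ByteArrayInputStream` stands in for a stream opened from a filesystem.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical illustration class; not part of Hadoop or of this PR.
class ReadEquivalenceSketch {

  /** Reads the whole stream one byte at a time through read(). */
  static byte[] readSingleBytes(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int b;
    while ((b = in.read()) != -1) {
      out.write(b);
    }
    return out.toByteArray();
  }

  /** Fills an array of known length through read(byte[], int, int) at increasing offsets. */
  static byte[] readMultiByte(InputStream in, int length) throws IOException {
    byte[] result = new byte[length];
    int offset = 0;
    while (offset < length) {
      int n = in.read(result, offset, length - offset);
      if (n == -1) {
        throw new EOFException("stream ended after " + offset + " bytes");
      }
      offset += n;
    }
    return result;
  }

  /** Issues a zero-length read on a stream that still has data and returns the result. */
  static int zeroLengthRead(InputStream in) throws IOException {
    byte[] buffer = new byte[8];
    return in.read(buffer, 0, 0);
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[64];
    for (int i = 0; i < data.length; i++) {
      data[i] = (byte) (0x40 + i);
    }
    byte[] single = readSingleBytes(new ByteArrayInputStream(data));
    byte[] multi = readMultiByte(new ByteArrayInputStream(data), data.length);
    System.out.println(Arrays.equals(single, multi));
    System.out.println(zeroLengthRead(new ByteArrayInputStream(data)));
  }
}
```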
dataset(len, 0x40, 0x80));
try (FSDataInputStream is = fs.openFile(path).build().get()) {
  Assertions.assertThatThrownBy(() -> is.read(null, 0, 10))
      .isInstanceOfAny(IllegalArgumentException.class, NullPointerException.class);
not seen this before, interesting.
I wonder if we could extend intercept() to take a list of classes.
I prefer intercept because it does two things assertj doesn't
- returns the assert for future analysis
- if there is no exception raised, returns the result of any lambda-expression which doesn't return void.
That's an interesting idea. I just tried going down the path of writing an `interceptAny` accepting a list of exception classes. The problem we run into though is that we want to parameterize on the exception type `E` and return the specific type. If the list can contain any exception type, then we can't put a meaningful bound on the returned `E`. It seems all we can do is devolve back to `Throwable`.
I also don't currently have the use case (yet) of chaining additional analysis on that returned exception.
I think I'll hold off on this.
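To make the typing problem concrete, here is a rough, self-contained sketch of what such a helper could look like. It is an assumption for illustration only: `interceptAny` does not exist in Hadoop's test utilities, the class name `InterceptAnySketch` is made up, and the stream is a plain `ByteArrayInputStream`. Because the accepted classes share no useful common bound, the return type is `Throwable`, and callers must cast before inspecting anything type-specific.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical illustration class; interceptAny is NOT an existing Hadoop API.
class InterceptAnySketch {

  /**
   * Runs the callable and returns the thrown exception if it is an instance of
   * any expected class. The return type is Throwable because a list of
   * unrelated exception classes gives the compiler no narrower bound.
   */
  static Throwable interceptAny(
      List<? extends Class<? extends Throwable>> expected, Callable<?> eval) {
    Object result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      for (Class<? extends Throwable> clazz : expected) {
        if (clazz.isInstance(t)) {
          return t;
        }
      }
      throw new AssertionError("Unexpected exception: " + t, t);
    }
    // No exception was raised: report the value the callable returned.
    throw new AssertionError("Expected one of " + expected + " but call returned: " + result);
  }

  public static void main(String[] args) {
    InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
    Throwable t = interceptAny(
        Arrays.asList(IllegalArgumentException.class, NullPointerException.class),
        () -> in.read(null, 0, 10));
    System.out.println(t.getClass().getSimpleName());
  }
}
```

With a single expected class the helper could return that class directly, which is why the single-class form keeps its typed result and the list form loses it.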
3a8608b to 617d9b5
Description of PR
Enhance the FS specification and contract tests to cover expected semantics of the InputStream single-byte and multi-byte read methods:
`InputStream` class and HDFS `DFSInputStream`.
How was this patch tested?
I ran all subclasses of `AbstractContractOpenTest` for local, HDFS and S3A (connecting to GCS's S3-compatible XML API).
These new tests also pass for `GoogleHadoopFileSystem`.
I generated the site locally and confirmed the new specification content.