-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-19364. IoStats support for AAL. #8007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
@steveloughran this PR address some of your comments on the original IoStats PR. The way the AAL code works currently means it's quite hard to report on a cache hit accurately, so I've skipped that for now. It's something we should report, but will need a bit of a rewrite on our end. I'll see how we can do that. Also quite hard to report on durations (I couldn't think of a way, but it would be nice to do that). We'll need someway so that when the GET request starts, it creates a duration tracker, and then when it finishes, that tracker is closed. but since these callbacks are implemented at a stream level, it doesn't seem possible to track durations for each individual request. any suggestions? Other than that this PR is now ready for another review. |
though will need an AAL release, corresponding AAL PR: awslabs/analytics-accelerator-s3#358 |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
regarding tracking, we can tune the relevant statistic api. Maybe a factory function which returns and AutoCloseable that passed down and closed(). But that isn't enough to measure failure counts. Maybe:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really like this, especially how the cost tests show the savings in http IO.
some minor changes. regarding duration tracking, it would be nice, but let's not make a blocker.
Key change is to put all new statistics you want the FS to aggregate into Statistics enum, declaring type. Instrumentation scans that, creates fs metrics from it
/** | ||
* Bytes that were prefetched by the stream. | ||
*/ | ||
public static final String STREAM_READ_PREFETCHED_BYTES = "stream_read_prefetched_bytes"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add entries in org.apache.hadoop.fs.s3a.Statistic
; these are scanned and used to create the full filesystem instance stats which the input stream updates in close()
package org.apache.hadoop.fs.s3a.impl.streams; | ||
|
||
import org.apache.hadoop.fs.s3a.statistics.S3AInputStreamStatistics; | ||
import software.amazon.s3.analyticsaccelerator.util.RequestCallback; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: import ordering
import static org.apache.hadoop.fs.s3a.S3ATestUtils.skipForAnyEncryptionExceptSSES3; | ||
import static org.apache.hadoop.fs.contract.ContractTestUtils.*; | ||
import static org.apache.hadoop.fs.s3a.Constants.ANALYTICS_ACCELERATOR_CONFIGURATION_PREFIX; | ||
import static org.apache.hadoop.fs.s3a.S3ATestUtils.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer not the .* except for lots of constants; can you stop the IDE from auto-enabling it.
@MethodSource("params") | ||
public class ITestS3AContractAnalyticsStreamVectoredRead extends AbstractContractVectoredReadTest { | ||
|
||
private static final int ONE_KB = 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
org.apache.hadoop.io.Sizes.S_1K
Configuration conf = super.createConfiguration(); | ||
// Set the coalesce tolerance to 1KB, default is 1MB. | ||
conf.setInt(ANALYTICS_ACCELERATOR_CONFIGURATION_PREFIX + | ||
"." + "physicalio.request.coalesce.tolerance", 10 * ONE_KB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create a new S_10K in Sizes for this, then use.
// request will be is 128KB. Since the file being read is 128KB, we need to use this here to demonstrate that | ||
// separate GET requests are made for ranges that are not coalesced. | ||
conf.setInt(ANALYTICS_ACCELERATOR_CONFIGURATION_PREFIX + | ||
"." + "physicalio.readbuffersize", 32 * ONE_KB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S_32K
protected Configuration createConfiguration() { | ||
Configuration conf = super.createConfiguration(); | ||
// Set the coalesce tolerance to 1KB, default is 1MB. | ||
conf.setInt(ANALYTICS_ACCELERATOR_CONFIGURATION_PREFIX + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add new strings in Constants, and use removeBaseAndBufferOverrides to make sure there's no manual overrrides there to break tests
|
||
// Total file size is: 21511173, and read starts from pos 5. Since policy is WHOLE_FILE, the whole file starts | ||
// getting prefetched as soon as the stream to it is opened. So prefetched bytes is 21511173 - 5 = 21511168 | ||
verifyStatisticCounterValue(ioStats, STREAM_READ_PREFETCHED_BYTES, 21511168); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly do the maths in the code "len - 5" for future maintenance. Leave that explanation. In fact, we should plan for the nightmare scenario of "file goes away" by not having any assumptions. We also need to handle test setups where its on a third-party store.
- grab its length
- if too short, fail the test meaningfully
- calculate the relevant values
inputStream.seek(3 * S_1M); | ||
inputStream.read(new byte[512 * S_1K]); | ||
|
||
verifyStatisticCounterValue(ioStats, ACTION_HTTP_GET_REQUEST, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really nice to see this.
💔 -1 overall
This message was automatically generated. |
Description of PR
Based off of #7763
Adds stats for
How was this patch tested?
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?