-
Notifications
You must be signed in to change notification settings - Fork 253
CLDSRV-780: ServerAccessLogs json schema and functional tests #6020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLDSRV-780: ServerAccessLogs json schema and functional tests #6020
Conversation
Hello leif-scality,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Incorrect fix versionThe
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the |
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
47d0be4 to
51d233b
Compare
schema/server_access_log.schema.json
Outdated
| "type": "string" | ||
| }, | ||
| "startTime": { | ||
| "description": "Timestamp in seconds.milliseconds, recorded when the server first routes the request. Represents the AWS server access log 'Time' field. String type because to be compatible with clickhouse DateTime64(3) type.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "description": "Timestamp in seconds.milliseconds, recorded when the server first routes the request. Represents the AWS server access log 'Time' field. String type because to be compatible with clickhouse DateTime64(3) type.", | |
| "description": "Timestamp in seconds.milliseconds, recorded when the server first routes the request. Represents the AWS server access log 'Time' field. String type to be compatible with clickhouse DateTime64(3) type.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
schema/server_access_log.schema.json
Outdated
| "const": "0" | ||
| }, | ||
| "loggingEnabled": { | ||
| "description": "If true it means that the target bucket has bucket logging configured and enabled.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "description": "If true it means that the target bucket has bucket logging configured and enabled.", | |
| "description": "True if the target bucket has bucket logging configured and enabled.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
schema/server_access_log.schema.json
Outdated
| "minimum": 0 | ||
| }, | ||
| "requester": { | ||
| "description": "Represents the AWS server access log 'Requester' field.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "description": "Represents the AWS server access log 'Requester' field.", | |
| "description": "Represents the AWS server access log 'Requester' field. https://docs.aws.amazon.com/AmazonS3/latest/userguide/LogFormat.html", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
tests/functional/aws-node-sdk/test/serverAccessLogs/testServerAccessLogFile.js
Show resolved
Hide resolved
| const result = tv4.validateResult(logEntries[logEntryIdx], schema); | ||
| assert.strictEqual(result.valid, true, | ||
| `Log entry should match schema: ${JSON.stringify(result.error)}`); | ||
| assert.strictEqual(logEntries[logEntryIdx].operation, operation.expectedOperations[operationIdx], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we can add checks for more fields (maybe not this PR) ?
I am thinking at least bytesReceived, bytesDeleted, objectSize, versionID.
There are some checks that you can probably add easily in this PR: bucketName, bucketOwner (?), objectKey, loggingEnabled, loggingTargetBucket, loggingTargetPrefix, awsAccessKeyID, httpMethod, httpCode.
My point is: let's try to have as much test coverage as possible for the log record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now check more properties.
- Time properties are not tested because we can't control them, the type is checked by the schema
- Bytes*/contentLength are not tested because I think they may change so better do it after. The schema checks their types
- VersionID we don't know what version value is going to be used
- TLS is not tested because we don't have an HTTPS setup
// List of all the log properties.
// Commented properties are set by the test or not tested.
// UNKNOWN: The property is not accessible by the test or is non determistic.
// DYNAMIC: The propery is set depending on the operation.
// STATIC: Shared among all tests.
// TODO: The property need to be tested.
const commonProperties = {
// 'time': '', // UNKNOWN
// 'hostname': '', // UNKNOWN
// 'pid': '', // UNKNOWN
'action': 'REQUIRED', // DYNAMIC
'accountName': 'Bart', // STATIC
'accountDisplayName': 'Bart', // STATIC
'userName': null, // TODO: Add test with IAM user to get a non null userName.
// 'clientPort': '', // UNKNOWN
'httpMethod': 'REQUIRED', // DYNAMIC
// 'bytesDeleted': '', // TODO
// 'bytesReceived': '', // TODO
// 'bodyLength': '', // TODO
// 'contentLength': '', // TODO
// 'elapsed_ms': '', // UNKNOWN
// 'httpURL': '', // TODO
// 'startTime': '', // UNKNOWN
'requester': '79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be', // STATIC
'operation': 'REQUIRED', // DYNAMIC
// 'requestURI': '', // TODO
'errorCode': null, // DYNAMIC
// 'objectSize': '', // TODO
// 'totalTime': '', // UNKNOWN
// 'turnAroundTime': '', // UNKNOWN
'referer': null, // TODO: Add test that sets the referer.
// 'userAgent': // UNKNOWN
// 'versionID': '', // UNKNOWN
'signatureVersion': 'SigV4', // STATIC
'cipherSuite': null, // TODO: Add https tests.
'authenticationType': 'AuthHeader', // STATIC
// 'hostHeader': '', // UNKNOWN
'tlsVersion': null, // TODO: Add https tests.
'aclRequired': null, // TODO: Add https tests.
'bucketOwner': '79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be', // DYNAMIC
bucketName, // DYNAMIC
// 'req_id': '', // UNKNOWN
// 'bytesSent': '', // TODO
// 'clientIP': '', // UNKNOWN
'httpCode': 200, // DYNAMIC
'objectKey': null, // DYNAMIC
'logFormatVersion': '0', // STATIC
'loggingEnabled': false, // DYNAMIC
'loggingTargetBucket': null, // DYNAMIC
'loggingTargetPrefix': null, // DYNAMIC
'awsAccessKeyID': 'accessKey1', // STATIC
'raftSessionID': null, // UNKNOWN
};
Incorrect fix versionThe
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the |
b7b0eff to
e19d4bc
Compare
dvasilas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need an Arsenal bump?
schema/server_access_log.schema.json
Outdated
| "startTime": { | ||
| "description": "Timestamp in seconds.milliseconds, recorded when the server first routes the request. Represents the AWS server access log 'Time' field. String type to be compatible with clickhouse DateTime64(3) type.", | ||
| "type": "string", | ||
| "minimum": 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minimum is probably not needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The startTime cannot be negative, so it is just an extra sanity check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I mean that it may not apply because statTime is a string. Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot it was a string, you are right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
e19d4bc to
acb14ed
Compare
schema/server_access_log.schema.json
Outdated
| @@ -0,0 +1,248 @@ | |||
| { | |||
| "$id": "https://scality.com/cloudserver/server_access_log.schema.json", | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather see a github.com link pointing to the actual file in a given branch of the Cloudserver repository here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "type": "object", | ||
| "properties": { | ||
| "time": { | ||
| "description": "Epoch timestamp in seconds, recorded when the log record is created.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: some descriptions end with a period, some not, they should be consistent. I would think no period is better if they are not complete sentences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added periods to all the lines
| "type": "string" | ||
| }, | ||
| "accountName": { | ||
| "description": "Requester account display name.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "description": "Requester account display name.", | |
| "description": "Requester account name", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to the correct description
schema/server_access_log.schema.json
Outdated
| "enum": ["GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE", "POST", "PATCH", "CONNECT"] | ||
| }, | ||
| "bytesDeleted": { | ||
| "description": "For DeleteObject bytesDeleted is the size in bytes of the deleted object, for DeleteObjects it is the sum of all the deleted objects.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion for a bit less verbosity and more accuracy:
| "description": "For DeleteObject bytesDeleted is the size in bytes of the deleted object, for DeleteObjects it is the sum of all the deleted objects.", | |
| "description": "For DeleteObject: size in bytes of the deleted object, for DeleteObjects: byte size sum of all the deleted objects", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
schema/server_access_log.schema.json
Outdated
| "minimum": 0 | ||
| }, | ||
| "bytesReceived": { | ||
| "description": "Size of the object in bytes of PutObject and UploadPart", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "description": "Size of the object in bytes of PutObject and UploadPart", | |
| "description": "For PutObject and UploadPart: size of the object in bytes", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
schema/server_access_log.schema.json
Outdated
| "type": "string" | ||
| }, | ||
| "req_id": { | ||
| "description": "Represents the AWS server access log 'Request ID' field. Matches the req_id of the stdout logs.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "stdout" is an implementation detail (and it ends up in log files on S3C transparently) so I'd rather just say "matches the req_id field in logs"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
lib/utilities/serverAccessLogger.js
Outdated
| 'listParts': 'OBJECT', | ||
| 'completeMultipartUpload': 'UPLOAD', | ||
| 'initiateMultipartUpload': 'UPLOAD', | ||
| 'listMultipartUploads': 'UPLOADS', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking if it's not a typo and you mean plural here, since others are all singular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is upload in this case, but I fixed other issues in the map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: generally speaking, it should be doable to merge tests for groups of API functions that put, get and delete something, in a single test that would put, then get, then delete the API object. Just to reduce the number of tests since I believe the test coverage would be identical.
Not blocking for me, just a suggestion.
| const properties = operation.expected[operationIdx]; | ||
| for (const [key, val] of Object.entries(properties)) { | ||
| assert.strictEqual(logEntries[logEntryIdx][key], val, | ||
| `Invalid value for ${key}, action ${properties.action}`); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be replaced by:
assert.deepStrictEqual(logEntries[logEntryIdx], operation.expected[operationIdx])
A failed assertion would show a diff, so I think it would give the same amount of debug info, if not more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because some fields we don't control like timestamps.
Output with the suggested change
+ actual - expected
{
accountDisplayName: 'Bart',
accountName: 'Bart',
aclRequired: null,
action: 'DeleteBucket',
authenticationType: 'AuthHeader',
awsAccessKeyID: 'accessKey1',
+ bodyLength: 0,
bucketName: 'xxx',
bucketOwner: null,
+ bytesDeleted: null,
+ bytesReceived: 0,
+ bytesSent: 197,
cipherSuite: null,
+ clientIP: '::ffff:127.0.0.1',
+ clientPort: 52912,
+ contentLength: 0,
+ elapsed_ms: 0.32016,
errorCode: 'NoSuchBucket',
+ hostHeader: '127.0.0.1:8000',
+ hostname: 'xps',
httpCode: 404,
httpMethod: 'DELETE',
+ httpURL: '/xxx',
logFormatVersion: '0',
loggingEnabled: false,
loggingTargetBucket: null,
loggingTargetPrefix: null,
objectKey: null,
+ objectSize: null,
operation: 'REST.DELETE.BUCKET',
+ pid: 10496,
raftSessionID: null,
referer: null,
+ req_id: '5061ab0fdc4b9cf4db78',
+ requestURI: 'DELETE /xxx HTTP/1.1',
requester: '79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be',
signatureVersion: 'SigV4',
+ startTime: '1764951661.826',
+ time: 1764951661,
tlsVersion: null,
+ totalTime: '0',
+ turnAroundTime: '0',
+ userAgent: 'aws-sdk-nodejs/2.1692.0 linux/v22.14.0 promise',
userName: null,
+ versionID: null
}
+ expected - actual
"aclRequired": [null]
"action": "DeleteBucket"
"authenticationType": "AuthHeader"
"awsAccessKeyID": "accessKey1"
- "bodyLength": 0
"bucketName": "xxx"
"bucketOwner": [null]
- "bytesDeleted": [null]
- "bytesReceived": 0
- "bytesSent": 197
"cipherSuite": [null]
- "clientIP": "::ffff:127.0.0.1"
- "clientPort": 52912
- "contentLength": 0
- "elapsed_ms": 0.32016
"errorCode": "NoSuchBucket"
- "hostHeader": "127.0.0.1:8000"
- "hostname": "xps"
"httpCode": 404
"httpMethod": "DELETE"
- "httpURL": "/xxx"
"logFormatVersion": "0"
"loggingEnabled": false
"loggingTargetBucket": [null]
"loggingTargetPrefix": [null]
"objectKey": [null]
- "objectSize": [null]
"operation": "REST.DELETE.BUCKET"
- "pid": 10496
"raftSessionID": [null]
"referer": [null]
- "req_id": "5061ab0fdc4b9cf4db78"
- "requestURI": "DELETE /xxx HTTP/1.1"
"requester": "79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be"
"signatureVersion": "SigV4"
- "startTime": "1764951661.826"
- "time": 1764951661
"tlsVersion": [null]
- "totalTime": "0"
- "turnAroundTime": "0"
- "userAgent": "aws-sdk-nodejs/2.1692.0 linux/v22.14.0 promise"
"userName": [null]
- "versionID": [null]
}
lib/server.js
Outdated
|
|
||
| try { | ||
| try { | ||
| logger.info('ServerAccessLogger config', { 'config': _config.serverAccessLogs }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| logger.info('ServerAccessLogger config', { 'config': _config.serverAccessLogs }); | |
| logger.info('ServerAccessLogger config', { config: _config.serverAccessLogs }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
| "description": "Represents the AWS server access log 'ACL Required' field.", | ||
| "type": ["null"] | ||
| }, | ||
| "bucketOwner": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketOwner and bucketName should never be null.
This is what is breaking integration tests.
For some operations like CreateBucket, deleteObjects, uploadPart, the bucket
here are the server access logs for:
- deleteobjects:
{"time":1764928092,"hostname":"nicolas2be-952-store-4","pid":81,"action":"DeleteObjects","accountName":"jul","accountDisplayName":"jul","userName":null,"clientPort":51762,"httpMethod":"POST","bytesDeleted":null,"bytesReceived":129,"bodyLength":129,"contentLength":129,"elapsed_ms":183.005601,"httpURL":"/bucket0?delete","startTime":"1764928092.696","requester":"b4e50bcfa5a588f74c82becb4476a0d7442892ba27d05baa847f0528b1076f72","operation":"REST.POST.OBJECT","requestURI":"POST /bucket0?delete HTTP/1.1","errorCode":null,"objectSize":null,"totalTime":"182","turnAroundTime":"100","referer":null,"userAgent":"aws-cli/1.38.10 md/Botocore#1.37.10 ua/2.1 os/macos#24.5.0 md/arch#arm64 lang/python#3.13.2 md/pyimpl#CPython m/N cfg/retry-mode#legacy botocore/1.37.10","versionID":null,"signatureVersion":"SigV4","cipherSuite":null,"authenticationType":"AuthHeader","hostHeader":"51.21.225.168:8080","tlsVersion":null,"aclRequired":null,"bucketOwner":null,"bucketName":null,"req_id":"151a943d0741b0e6144c","bytesSent":183,"clientIP":"10.160.120.218","httpCode":200,"objectKey":null,"logFormatVersion":"0","loggingEnabled":false,"loggingTargetBucket":null,"loggingTargetPrefix":null,"awsAccessKeyID":"GRRX6ENBPIXH0XXUPZ1P","raftSessionID":null}
- CreateBucket:
{"time":1764841467,"hostname":"nicolas2be-952-store-1","pid":109,"action":"CreateBucket","accountName":"jul","accountDisplayName":"jul","userName":null,"clientPort":31794,"httpMethod":"PUT","bytesDeleted":null,"bytesReceived":0,"bodyLength":0,"contentLength":0,"elapsed_ms":268.683911,"httpURL":"/bucket0","startTime":"1764841467.173","requester":"b4e50bcfa5a588f74c82becb4476a0d7442892ba27d05baa847f0528b1076f72","operation":"REST.PUT.BUCKET","requestURI":"PUT /bucket0 HTTP/1.1","errorCode":null,"objectSize":null,"totalTime":"268","turnAroundTime":"224","referer":null,"userAgent":"aws-cli/1.38.10 md/Botocore#1.37.10 ua/2.1 os/macos#24.5.0 md/arch#arm64 lang/python#3.13.2 md/pyimpl#CPython m/N cfg/retry-mode#legacy botocore/1.37.10","versionID":null,"signatureVersion":"SigV4","cipherSuite":null,"authenticationType":"AuthHeader","hostHeader":"51.21.225.168:8080","tlsVersion":null,"aclRequired":null,"bucketOwner":null,"bucketName":null,"req_id":"cbfa47daf640b6b6ec00","bytesSent":null,"clientIP":"10.160.120.218","httpCode":200,"objectKey":null,"logFormatVersion":"0","loggingEnabled":false,"loggingTargetBucket":null,"loggingTargetPrefix":null,"awsAccessKeyID":"GRRX6ENBPIXH0XXUPZ1P","raftSessionID":null}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by b361882
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketOwner and bucketName should never be null.
For ListBuckets they are going to be null
acb14ed to
1d89a88
Compare
dvasilas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also include tests about error cases (AccessDenied, NoSuchBucket, ..).
1d89a88 to
a7fbdf4
Compare
I added one test with |
204cfdf to
90dbbab
Compare
|
ping |
Incorrect fix versionThe
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the |
Also added error test for PutObject and GetObject |
anurag4DSB
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is a new standard which will be used by future updates to S3 analytics, we should also write a document to guide engineers on whats the right way to add new fields to server access logs. This will be helpful for everyone to adhere to the standards thought by the creators of server access logs.
| "operation": { | ||
| "description": "AWS server access log 'Operation' field.", | ||
| "type": "string" | ||
| }, | ||
| "requestURI": { | ||
| "description": "AWS server access log 'Request URI' field.", | ||
| "type": "string" | ||
| }, | ||
| "errorCode": { | ||
| "description": "AWS server access log 'Error Code' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "objectSize": { | ||
| "description": "AWS server access log 'Object Size' field.", | ||
| "type": ["integer", "null"] | ||
| }, | ||
| "totalTime": { | ||
| "description": "AWS server access log 'Total Time' field.", | ||
| "type": "string" | ||
| }, | ||
| "turnAroundTime": { | ||
| "description": "AWS server access log 'Turn Around Time' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "referer": { | ||
| "description": "AWS server access log 'Referer' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "userAgent": { | ||
| "description": "AWS server access log 'User Agent' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "versionID": { | ||
| "description": "AWS server access log 'Version ID' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "signatureVersion": { | ||
| "description": "AWS server access log 'Signature Version' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "cipherSuite": { | ||
| "description": "AWS server access log 'Cipher Suite' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "authenticationType": { | ||
| "description": "AWS server access log 'Authentication Type' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "hostHeader": { | ||
| "description": "AWS server access log 'Host Header' field.", | ||
| "type": "string" | ||
| }, | ||
| "tlsVersion": { | ||
| "description": "AWS server access log 'TLS Version' field.", | ||
| "type": ["string", "null"] | ||
| }, | ||
| "aclRequired": { | ||
| "description": "AWS server access log 'ACL Required' field.", | ||
| "type": ["null"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all the "AWS server access logs 'XXX' field." descriptions, have we considered using actual descriptions as AWS docs change without version control? Or is it intentional that we want engineers to refer to https://docs.aws.amazon.com/AmazonS3/latest/userguide/LogFormat.html for what the fields actually represent?
It's worth adding the actual description, and we can add a code comment that these are AWS server access log fields.
Example for aclRequired, AWS server access log 'aclRequired' field indicating whether ACL authorization was required. would be more helpful than "AWS server access log 'ACL Required' field."
Also if we want to be AWS compatible, the proper description will also help us check if we are actually compatible. For example for aclRequired allows null, but the actual AWS schema should accept "Yes" or "-".
Example correction:
"aclRequired": {
"description": "AWS server access log 'aclRequired' field indicating whether ACL authorization was required.",
"type": "string",
"enum": ["Yes", "-"]
}
if we want to allow nulls in our own internal representation, we could explicitly combine them with something like this
"aclRequired": {
"description": "AWS server access log 'aclRequired' field indicating whether ACL authorization was required.",
"type": ["string", "null"],
"enum": ["Yes", "-"]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will let you check and change for others as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inlined descriptions
| try { | ||
| return lines.map(line => JSON.parse(line)); | ||
| } catch (err) { | ||
| // FIXME: readFileSync may read partial lines making JSON.parse fail, so we need to retry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FIXME comments acknowledge that readFileSync may read partial lines, causing JSON.parse to fail. This is a real race condition that will cause flaky tests.
We could use a more robust approach with file position tracking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prevent flaky test by retrying, cloudserver should not take more than 5 seconds to write the logs
I already create a ticket to track this CLDSRV-800
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add the ticket number to the comment please?
(for people just looking at the code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| "type": "string", | ||
| "enum": ["GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE", "POST", "PATCH", "CONNECT"] | ||
| }, | ||
| "bytesDeleted": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bytesDeleted, bytesReceived, bodyLength, contentLength all have type: ["integer", "null"] with minimum: 0, wouldn't null values fail the minimum constraint?
We could do something like this":
{
"bytesDeleted": {
"description": "For DeleteObject: size in bytes...",
"oneOf": [
{"type": "integer", "minimum": 0},
{"type": "null"}
]
}
}
Or better, make these fields optional and omit them when null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The minimum is only applied when the value is an integer
| @@ -0,0 +1,247 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future iterations, we could also add a pattern for these fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
| { | ||
| ...commonProperties, | ||
| operation: 'REST.PUT.BUCKET', | ||
| bucketOwner: null, | ||
| action: 'CreateBucket', | ||
| httpMethod: 'PUT', | ||
| }, | ||
| { | ||
| ...commonProperties, | ||
| operation: 'REST.PUT.CORS', | ||
| action: 'PutBucketCors', | ||
| httpMethod: 'PUT', | ||
| }, | ||
| { | ||
| ...commonProperties, | ||
| operation: 'REST.DELETE.CORS', | ||
| action: 'DeleteBucketCors', | ||
| httpCode: 204, | ||
| httpMethod: 'DELETE', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: There is a lot of code duplication, worth extracting into a helper function.
Something like this
function expectedCreateBucket(overrides = {}) {
return {
...commonProperties,
operation: 'REST.PUT.BUCKET',
bucketOwner: null,
action: 'CreateBucket',
httpMethod: 'PUT',
...overrides,
};
}
// Usage:
expected: [
expectedCreateBucket(),
expectedPutObject({ objectKey }),
expectedDeleteObject({ objectKey, httpCode: 204 }),
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah looking back we can do better, the log format should change so I can add the improvements when I update the tests in the next iteration
| }); | ||
|
|
||
| for (const operation of operations) { | ||
| it(`check ${operation.methodName} log entry`, async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth being more descriptive here
| it(`check ${operation.methodName} log entry`, async () => { | |
| it(`should log correct ${operation.methodName} operation with all required fields`, async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| "awsAccessKeyID", | ||
| "raftSessionID" | ||
| ] | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schema doesn't specify "additionalProperties": false, allowing undocumented fields to be added.
This ensures strict validation and prevents accidental field additions.
{
"properties": { /* ... */ },
"required": [ /* ... */ ],
"additionalProperties": false
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| bu = bucketUtil; | ||
| const s3 = bucketUtil.s3; | ||
| const logFilePath = config.serverAccessLogs.outputFile; | ||
| const bucketName = 'test-server-access-log-bucket'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If tests run in parallel or are interrupted, could this cause conflicts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests expect the server-access.log file to be empty, if tests are run in parallel this is no longer true so they will break.
| httpMethod: req.method === undefined ? null : req.method, | ||
| bytesDeleted: params.analyticsBytesDeleted === undefined ? null : params.analyticsBytesDeleted, | ||
| bytesReceived: req.parsedContentLength === undefined ? null : req.parsedContentLength, | ||
| bodyLength: req.headers['content-length'] === undefined ? null : parseInt(req.headers['content-length'], 10), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nitpicking here: There are some hard-to-read and sometimes inconsistent complex null coalescing here.
Could we standardize them?
example:
accountDisplayName: authInfo?.getAccountDisplayName() ?? null,
clientPort: req.socket?.remotePort ?? null,
bytesDeleted: params.analyticsBytesDeleted ?? null,
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
The following reviewers are expecting changes from the author, or must review again: |
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
|
Removing blocker, as per discussion in daily, might be addressed later in other tickets. |
lib/utilities/serverAccessLogger.js
Outdated
| } | ||
|
|
||
| return len; | ||
| return Number(len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not very nice to encapsulate a primitive into its object constructor instance.
Besides the constructor might convert '' to 0 instead of NaN. Is it the expected behavior to default to 0 for empty string ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to
if (request && response && objectSizeGetMethods[request.apiMethod]) {
const len = Number(response.getHeader('Content-Length'));
if (len === undefined || len === null || len === '') {
return null;
}
const numLen = Number(len);
if (isNaN(numLen)) {
return null;
}
return numLen;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't change the use of Number, this double the usage. And Number cannot return null or undefined. It returns a number
BourgoisMickael
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getBytesSent could be simplfied to
function getBytesSent(res, bytesSent) {
if (bytesSent !== undefined && bytesSent !== null) {
return bytesSent;
}
return res?.getHeader('Content-Length') || null;
}b21fd94 to
a20fa05
Compare
a20fa05 to
f756789
Compare
|
ping |
|
/approve |
|
I have successfully merged the changeset of this pull request
The following branches have NOT changed:
Please check the status of the associated issue CLDSRV-780. Goodbye leif-scality. The following options are set: approve |
Uh oh!
There was an error while loading. Please reload this page.