Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node api audit #1186

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Node api audit #1186

wants to merge 7 commits into from

Conversation

jianoaix
Copy link
Contributor

Why are these changes needed?

  • Enhance the API validation
  • Improve error handling with right gRPC categorization (better UX, more accurate accounting of API availability)
  • Place heavy computing (e.g. auth) after cheap/light validation

Checks

  • I've made sure the tests are passing. Note that there might be a few flaky tests, in that case, please comment that they are not relevant.
  • I've checked the new test coverage and the coverage percentage didn't drop.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@jianoaix jianoaix requested review from samlaf and dmanc January 29, 2025 21:42
Copy link
Contributor

@samlaf samlaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the approach overall, and big fan of having better error handling. That being said the api.errors functions are growing, lots of them are public, and they dont have super clear semantic, so I feel it could proliferate in the code base to very unsound error wrapping when people other than you who aren't as familiar with your abstraction try to use them and just lazy out and use whatever method they can find that looks ok.

Don't have a super clean proposal for making this better though... so will let you decide.

Comment on lines +26 to +28
// WrapAsGRPCError wraps an error with a gRPC code and message. If the error already has a gRPC code,
// it retains the code but combines the messages. Otherwise, it uses the provided code and combined message.
func WrapAsGRPCError(err error, code codes.Code, msg string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When should this happen? Feels weird to me to have to wrap a GRPC error in another one, because a grpc status code wrapping should be terminal and returned to the user. Otherwise don't we need to define the overwrite/precedence logic for which error to return to user (eg 400 comes before 500 or not?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should prob make private so force people to use the other specific ones you wrote for each error code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a server uses other services as backend, this can happen.

If the server calls HandleFooWithServiceX(params) error, and an error returned, this error by default should be INTERNAL error, but it's also possible the params are no valid which results in an error (the caller may not have sufficient information to validate params before calling).

In such case you may want to WrapAsGRPCError(err, INTERNAL, "...."): if HandleFooWithServiceX has an opinion about the error type, then we should use that; otherwise, it should be INTERNAL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! Thanks for the explanation.

So right, if an external service returns a 400, that should be, from our server's client's pov, converted to a 500 (we are doing something wrong).

So I feel like something simpler would be:

  • rename this function something clearer (convertExternalServiceError?) and your explanation should be added as documentation comment)
  • or just get rid of this function altogether, and force people to use the other methods (unwrap the error string from external service's error and use NewErrorInternal(err.Err()))

I personally like to keep any package's public method surface small, so would prefer to remove functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not just external service error, it can be the validation with more info from external service failed

// it retains the code but combines the messages. Otherwise, it uses the provided code and combined message.
func WrapAsGRPCError(err error, code codes.Code, msg string) error {
if err == nil {
return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this just use newErrorGRPC in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explained above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really like this semantic though, would prefer at each call site the more explicit

if err != nil {
   WrapAsGRPCError(err, code, msg)
}

instead of wrapping by default and forcing reader to go make sure that the function returns nil if no error was present

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's defensive coding to handle nil parameter, why is it an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not defensive coding, defensive coding would return an error or panic, this just silently returns nil, which is the opposite of defensive coding.

Comment on lines +35 to +36
// If it's already a gRPC error, use the same code but combine the messages
return status.Error(s.Code(), combinedMessage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use the old code? Why not the new one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explained above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match your description above though?
If externalService returns 400, we want to return 500 to our users, not 400. So you should use the code passed to wrapAsGRPCError always I think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above explanation is basically "if lower level function has opinion of error, preserve it"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that is just wrong...

return status.Error(s.Code(), combinedMessage)
}

return errors.New(combinedMessage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function has weird semantics, its called WrapGRPCError but can accept a non-grpc error and return a non-grpc error. Can we simplify it instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly the desired semantics. If there is a better concise name I'm happy to use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WrapGRPCError indicates that only GRPC errors should be allowed to be wrapped, yet your documentation says that its fine to pass another error and itll just get wrapped normally. Error wrapping also typically means wrapping an error, not just extending the error with some other msg string. So it feels convoluted and unclear to me.

I like when functions have a single small, and clear purpose. Just like golang's errors api, only has .Is() and .As(), and errors.join() for combining errors, thats it. Forces users to use it in the same way, and not too much cognitive overload when trying to use it.

How about we mention that this should ONLY be used for wrapping grpc errors? I don't think any function should return either a GRPC error or not right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golang's api needs to be small as it has no context

the functions are not expected to return grpc error, only the APIs do; and the wrapper here works no matter if a function returns grpc error or not.

node/auth/authenticator.go Show resolved Hide resolved
@@ -120,7 +121,7 @@ func (a *requestAuthenticator) AuthenticateStoreChunksRequest(

key, err := a.getDisperserKey(ctx, now, request.DisperserID)
if err != nil {
return fmt.Errorf("failed to get operator key: %w", err)
return api.WrapGRPCError(err, "failed to get operator key")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know for sure that err here is a grpc error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this wrapper is to make it easier for APIs to return canonical errors, without having to read all underlying code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think this is a bad idea though. It's better to be SUPER clear on the semantics on functions. If a function returns a grpc error, then ALL of its returned errors should be GRPC. If not, then the function is too large and should be broken down to respect this invariant. This way error handling semantics and documentation is simple, and you can't go wrong.

Otherwise if "everything is easy" people just don't know what they are dealing with, and to extend a function or return a new error code they need to go read the entire code path to understand what kind of errors they could be receiving in different cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to code for easy static checkability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue is you cannot ask all functions to return grpc error which fits only at interface level

node/grpc/server_v2.go Show resolved Hide resolved
@jianoaix jianoaix requested a review from anupsv January 30, 2025 19:00
@jianoaix
Copy link
Contributor Author

Like the approach overall, and big fan of having better error handling. That being said the api.errors functions are growing, lots of them are public, and they dont have super clear semantic, so I feel it could proliferate in the code base to very unsound error wrapping when people other than you who aren't as familiar with your abstraction try to use them and just lazy out and use whatever method they can find that looks ok.

Don't have a super clean proposal for making this better though... so will let you decide.

Would it be better if we drop sugars like

func WrapAsInvalidArg(err error, msg string) error {
	return WrapAsGRPCError(err, codes.InvalidArgument, msg)
}

and let the caller directly use WrapAsGRPCError(err, codes.InvalidArgument, msg)?

@samlaf
Copy link
Contributor

samlaf commented Jan 31, 2025

Would it be better if we drop sugars like

func WrapAsInvalidArg(err error, msg string) error {
	return WrapAsGRPCError(err, codes.InvalidArgument, msg)
}

and let the caller directly use WrapAsGRPCError(err, codes.InvalidArgument, msg)?

I personally think so yes. Even the original ones

func NewErrorUnauthenticated(msg string) error {
	return newErrorGRPC(codes.Unauthenticated, msg)
}

I feel like I prefer just having a single function used everywhere, with call site specifying the status code.
Maybe we can define our own code enum to only allow the ones that we want, so the api would look like

api.newErrorGRPC(api.CodeUnauthenticated, msg)

?

@@ -198,13 +201,24 @@ func (s *ServerV2) StoreChunks(ctx context.Context, in *pb.StoreChunksRequest) (

// validateStoreChunksRequest validates the StoreChunksRequest and returns deserialized batch in the request
func (s *ServerV2) validateStoreChunksRequest(req *pb.StoreChunksRequest) (*corev2.Batch, error) {
if req.GetDisperserID() != api.EigenLabsDisperserID {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment why this would result in an invalid store chunks request?

@jianoaix
Copy link
Contributor Author

jianoaix commented Feb 4, 2025

Taking a step back I think most (if not all) of the api/errors.go belongs to common/, they are relevant only for internal uses in error handling; users only needs to know the errors they'd see.

@samlaf
Copy link
Contributor

samlaf commented Feb 5, 2025

@jianoaix what are you proposing here? I feel like we fundamentally agree about a lot of points, and are talking past each other. I propose we have a live discussion to sync on this.

grug update: let's move on, you're the owner of this code, we'll do it your way. Will stamp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants