-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat(grpc): Add protobuf
codegen
#2320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d39e0f5
to
de384b4
Compare
9e45ab4
to
dfc5213
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, lets get these last few things addressed and CI passing then lets merge.
# Share repository cache between workflows. | ||
repository-cache: true | ||
module-root: ./protoc-gen-rust-grpc | ||
- name: Build protoc plugin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this step take a long time? Generally, speaking I would rather that we build the plugin in each test job to ensure we are building the latest code and there is no caching issues. Could you explain the reasoning for breaking it up like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment in the yaml. Building the protoc plugin from scratch takes 6–14 minutes, depending on the OS. This delays the execution of workflows that use the plugin in build.rs files.
Example workflow execution: https://github.com/hyperium/tonic/actions/runs/16217230891/job/45789317103
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay yeah if this works its fine for now, we probably want to improve this in the future..
.unwrap(); | ||
} | ||
let crate_mapping_path = if self.generate_message_code { | ||
self.output_dir.join("crate_mapping.txt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to have multiple codegens override the same file? Do we need to add a warning incase say you do that by accident and its surprising?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to have multiple codegens override the same file?
Yes it's possible. A bigger problem when having multiple protoc
invocations using the same output directory is the generated.rs
file getting overwritten. The generated.rs
file exports the generated message symbols for all the proto files that were part of the input protos list.
Do we need to add a warning incase say you do that by accident and its surprising?
@acozzette wanted to get your views on this. Have you considered this issue?
}; | ||
|
||
// Generate the service code. | ||
let mut cmd = std::process::Command::new("protoc"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to allow users to set the path to protoc via an env var as well. We do this in the prost code so we should follow how it does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the relevant prost docs: https://docs.rs/prost-build/latest/prost_build/index.html#sourcing-protoc
The message codegen also doesn't support setting the path to protoc using an env var. I think protobuf-codegen
should support this before gRPC. Do you want me to file a feature request for protobuf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would be good I think it shouldn't be too contentious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type Error = Status; | ||
|
||
fn encode(&mut self, item: Self::Item, buf: &mut EncodeBuf<'_>) -> Result<(), Self::Error> { | ||
let serialized = item.serialize().map_err(from_decode_error)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a todo here mentioning that we want to figure out how to remove this extra copy that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment. Do you want me to file an issue for the protobuf-rust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be good, we discussed this with @acozzette and co last week but probably good to track it somewhere as well. I would be okay if this was also just a tonic issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened a tonic issue with context about why protobuf doesn't provide the required API: #2345
let item = U::parse(slice).map_err(from_decode_error)?; | ||
buf.advance(slice.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its possible that the incoming buf slice is larger than the actual message, we should see if we can pull the len from the decoded amount of bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, each field in an encoded proto message either has a length prefix or a fixed length. The parser should continue reading fields until it reaches the end of the buffer. Fields not present in the proto descriptor should be ignored to ensure forward compatibility—for example, when a new field is added to the proto, but the receiver is using an older version. A parsing error should be returned if the bytes cannot be parsed. Since there is no length prefix for the entire message, the parser must consume all the bytes it is given.
I think its possible that the incoming buf slice is larger than the actual message
Based on the above, I believe it may be incorrect to pass a slice to a parser but not consume the entire buffer. Can you clarify when this situation arises?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that is a good question, I will look into it, maybe we also want to actually provide some type safe wrapper that enforces this invariant. I wouldn't worry about this for now I think your assumption is probably right on what we actually do.
Co-authored-by: Lucio Franco <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I tested this locally, the absolute paths is what solved it for me. We can merge this but I want to merge #2321 first and then rebase this ontop of that before we merge this.
This check was meant to enforce that the version of the protobuf crate exactly matches the version of the codegen crate, but this is not strictly necessary. Tonic would also like to re-export our crate from the `tonic-protobuf` crate, and the `DEP_UPB_VERSION` check is causing problems with that since the environment variable is set only for crates that directly depend on the `protobuf` crate (see [discussion](hyperium/tonic#2320 (comment))). While I was at it I also removed the code in the protobuf build script that sets the `DEP_UPB_INCLUDE` variable, which is unnecessary now that we no longer generate any C code which would need to know where to find upb's headers. PiperOrigin-RevId: 786745312
This check was meant to enforce that the version of the protobuf crate exactly matches the version of the codegen crate, but this is not strictly necessary. Tonic would also like to re-export our crate from the `tonic-protobuf` crate, and the `DEP_UPB_VERSION` check is causing problems with that since the environment variable is set only for crates that directly depend on the `protobuf` crate (see [discussion](hyperium/tonic#2320 (comment))). While I was at it I also removed the code in the protobuf build script that sets the `DEP_UPB_INCLUDE` variable, which is unnecessary now that we no longer generate any C code which would need to know where to find upb's headers. PiperOrigin-RevId: 786745312
This check was meant to enforce that the version of the protobuf crate exactly matches the version of the codegen crate, but this is not strictly necessary. Tonic would also like to re-export our crate from the `tonic-protobuf` crate, and the `DEP_UPB_VERSION` check is causing problems with that since the environment variable is set only for crates that directly depend on the `protobuf` crate (see [discussion](hyperium/tonic#2320 (comment))). While I was at it I also removed the code in the protobuf build script that sets the `DEP_UPB_INCLUDE` variable, which is unnecessary now that we no longer generate any C code which would need to know where to find upb's headers. PiperOrigin-RevId: 786760264
This check was meant to enforce that the version of the protobuf crate exactly matches the version of the codegen crate, but this is not strictly necessary. Tonic would also like to re-export our crate from the `tonic-protobuf` crate, and the `DEP_UPB_VERSION` check is causing problems with that since the environment variable is set only for crates that directly depend on the `protobuf` crate (see [discussion](hyperium/tonic#2320 (comment))). While I was at it I also removed the code in the protobuf build script that sets the `DEP_UPB_INCLUDE` variable, which is unnecessary now that we no longer generate any C code which would need to know where to find upb's headers. PiperOrigin-RevId: 786760264
This check was meant to enforce that the version of the protobuf crate exactly matches the version of the codegen crate, but this is not strictly necessary. Tonic would also like to re-export our crate from the `tonic-protobuf` crate, and the `DEP_UPB_VERSION` check is causing problems with that since the environment variable is set only for crates that directly depend on the `protobuf` crate (see [discussion](hyperium/tonic#2320 (comment))). While I was at it I also removed the code in the protobuf build script that sets the `DEP_UPB_INCLUDE` variable, which is unnecessary now that we no longer generate any C code which would need to know where to find upb's headers. PiperOrigin-RevId: 786760264
Hi @dfawley, @LucioFranco can you please have another look? |
This PR includes the following:
grpc-build
crate that helps generate code during cargo builds.To keep the CI fast, the protoc plugin binary for each OS is cached. The plugin binaries are re-built only if the plugin's code or build files are updated.
Notes