Skip to content

Channels should be encapsulated by procs in the IR #869

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
grebe opened this issue Feb 23, 2023 · 11 comments
Closed

Channels should be encapsulated by procs in the IR #869

grebe opened this issue Feb 23, 2023 · 11 comments
Labels
enhancement New feature or request ir

Comments

@grebe
Copy link
Collaborator

grebe commented Feb 23, 2023

Right now, channels are global and procs can arbitrarily reference them. This freedom doesn't necessarily buy that much as codegen needs to plumb them up and down the hierarchy eventually anyways. It seems easier to understand what channels do if they were encapsulated by the proc that instantiates them (and would also allow procs to be invoked, and perhaps aid linking, etc.).

Also, it might make sense to make channel declarations nodes- right now channels are kind of their own weird thing in the IR and it's occasionally annoying that they aren't nodes.

@grebe grebe added enhancement New feature or request ir labels Feb 23, 2023
@meheff
Copy link
Collaborator

meheff commented Feb 24, 2023

I like the idea. About channel declarations being nodes, I don't think it makes sense for them to be actual xls::Nodes, rather they could be a proc-scoped construct similar to registers in blocks. Making them xls::Nodes implies a level of dynamism we don't support. For example, can you pass them around? put them in arrays? select among them?

Proc-scoped channels would also help reduce name collisions. Right now channels are global object so no two channels can be named the same. This is annoying and this constraint does not reflect any underlying constraint in the generated code. Theese channels often map to verilog module ports and there it's fine if two different modules have ports with the same name.

@meheff
Copy link
Collaborator

meheff commented Feb 24, 2023

Spit balling a possible syntax:

proc foo<ch: chan(bits[8])>(tok: token, state: State) {
  ...
}

proc bar<ch0: chan(bits[32]), ch1: chan(bits[16])>(tok: token, state: State) {
  // Declare proc-scoped channels and spawning of other procs.  
  chan ch2(bits[8], id=0, kind=streaming, ops=send_receive, flow_control=none);
  spawn foo<ch3>;

  // Normal logic of the proc. ch0, ch1, and ch2 can all be used in send/receives.
  ...
}

@meheff
Copy link
Collaborator

meheff commented Feb 24, 2023

For the purpose of referring to things, it might be good to give spawn statements names:

  baz: spawn foo<ch3>;

This gives you a string path for referring to a specific instantation (spawning) of a proc.

@proppy
Copy link
Member

proppy commented Jun 9, 2023

@tmichalak bf28838 fyi (for some reason it seems that I can't mention @antmicro/xls)

mczyz-antmicro added a commit to antmicro/xls that referenced this issue Dec 22, 2023
Signed-off-by: Michal Czyz <[email protected]>
@mczyz-antmicro
Copy link
Contributor

Hey, I have a few points to contribute to this conversation from a designer's point of view. I am currently engaged in an effort to design new modules in DSLX and have the following experiences:

  1. As previously reported in [IR optimization] Over eager IR optimization that leads to incorrect graph #959, the IR fails to discover all channel connection in top levels with empty next() function. This causes the Verilog generation to produce incomplete implementations and effectively limits designs that can be tested.
  2. It was at some point mentioned that if you select the inner proc as the entrypoint, the proc discovery in IR may perform correctly (workaround). I observed a case, where this does not help and I provide an example on branch, commit. File xls/examples/passthrough.x defines 2 processes: Passthrough and Wrapper. Information about spawned Passthrough is lost at IR level (presumably due to empty next).
  3. A workaround for this issue can be used at the cost of adding dummy channels on top-level. On the same branch, I also prepared a PassthroughDummy process, which adds dummy channels thorough the hierarchy and performs redundant empty send/recvs. This forces the IR tool to recognize the hierarchy and produce valid verilog.
  4. I attach files from IR/Verilog generation to this message.

After analysis of the proc scoped channels proposal, I believe that this approach would solve my problems and help a lot. I wonder, however, if this is the best solution. I can imagine that during analysis of the DSLX code, an algorithm could traverse through the hierarchy of processes (analyze proc,spawn relationships) and build a hierarchy graph, which could be inserted as metadata in the IR. It seems to me that it would be a cheaper solution than "proc scoped channels", unless there are some limitations with IR manipulation that I am not aware of.

I encountered also another problem mentioned in the "proc scoped channels" docs, which is inability to create unconnected ports (equivalent to .clk(clk), .unused_output()). The workaround I am currently using is to create a wrapper process and perform "empty" sends and receives so that tools can detect that each channel has an associated send/recv operation. Even though this approach works, it is inconvenient at scale.

What do you think?

gh_869_example_logs.tar.gz

@meheff
Copy link
Collaborator

meheff commented Dec 22, 2023

@mczyz-antmicro You're correct that proc-scoped channels will solve the issues you describe. Regarding implementing this as metadata, that could be done but I don't think it is the most robust solution. Keeping metadata on the side which must be kept in sync with the IR is a potential source of complexity and bugs (where the metadata does not match the IR) Sometimes it may be necessary, but ideally you avoid it and only have to keep one representation at a time. You're right in that changing channels to being proc-scoped is a significant amount of work and the metadata approach would likely be cheaper, but I think it is worth the investment now to get the right representation. The work is already underway, you can now see that there is a bit on procs ("new_style") which indicates whether it is the new kind of proc which can have channels declared in it. I'll keep adding more and more functionaliity to these "new style" procs until they can do everything the existing procs can do, and then switch existing frontends and designs over and remove the old proc representation.

Regarding the use case of an unused port, I think the proc-scoped channel approach makes this much easier to support. With proc-scoped channels, there is an explicit declaration of the interface of the proc, for example:

proc foo<x: bits[32] in, y: bits[32] out>(t: token, st: ...) {
   ...
}

Here x and y are the interface of the proc. To create an unused port you would just not connect, say, x to any send/receive node. You'd probably also want to annotate x in some way to indicate that it doesn't need flow control (ready/valid) and maybe some other things. Without proc-scoped channels, the interface is implicitly defined on the proc via global channel declaration and a use of that channel in the proc. However, in your example you don't want a use of the channel in the proc so how do you represent it? With proc-scoped channels the proc interfaces, channel declarations, and proc instantiations explicitly define a statically-known (at compile time) network of connectivity throughout the design. This approach makes it easier to specify connectivity between different levels in the hierarch. For example, you may have top-level input port which gives some configuation value that you want to propagate to multiple leaf blocks in the design. Proc-scoped channels makes that easy. Doing the equivalent with global channels is trickier. The compiler has to reconstruct the desired connectivity. With proc-scoped channels the desired connectivity is clearly stated.

Channels are proc-scoped in the frontend (DSLX) and also in the verilog representation where you have a hierarchy of instantations and all connectivity (ie, channels) are encapsulated as reg/wire/logic declarations within a module, so I think it makes sense for the IR to have a similar representation.

@mczyz-antmicro
Copy link
Contributor

Thanks for a great explanation @meheff , I am now convinced that it is worth to wait for the new implementation. Happy to hear that work on new style has already started!

I have one more idea I would like to share: correct me if I'm wrong, but I believe that currently to combine design with an SRAM macro, one needs to expose the RAM channels all the way to the top-level, right? Have you given it thought if proc based channels could improve this situation? It would be neat if we could have something like a proc decorator so that the tool is capable of generating verilog with an exposed interface,e.g:

#[sram_proc]
proc RAM_2048_32<>(){}

@mczyz-antmicro
Copy link
Contributor

Unfortunately, I've found that the workaround I had proposed does not always work:

3. A workaround for this issue can be used at the cost of adding dummy channels on top-level. On the same branch, I also prepared a `PassthroughDummy` process, which adds dummy channels thorough the hierarchy and performs redundant empty send/recvs. This forces the IR tool to recognize the hierarchy and produce valid verilog.

I modified the proc_iota example to use the dummy channels antmicro@7397835 and the IR optimization is still "over eager".

Attaching files from the verilog generation flow
proc_iota_v.tar.gz

@lpawelcz
Copy link
Contributor

lpawelcz commented Jan 5, 2024

Hi, I recently encountered the same problems as @mczyz-antmicro. My attempts to generate verilog for RleBlockDecoder (part of #1213) as well as BlockDecoder (part of #1214) were both unsuccessful.
IRs generated from linked DSLX files contain top procs which look something like this:

top proc __rle_block_dec__RleBlockDecoder_0_next(__token: token, __state: (), init={()}) {                                                    
  literal.17: bits[1] = literal(value=1, id=17)                                                                                               
  tuple.18: () = tuple(id=18, pos=[(0,349,31)])                                                                                               
  after_all.19: token = after_all(__token, id=19)                                                                                             
  next (after_all.19, tuple.18)                                                                                                               
}   

Performing IR optimization and codegen with this default top proc results in empty verilog module, like so:

module RleBlockDecoder(
  input wire clk,
  input wire rst
);

endmodule
  1. It was at some point mentioned that if you select the inner proc as the entrypoint, the proc discovery in IR may perform correctly (workaround).

I decided to follow this approach and I've manually set top level proc for IR opt as one of the internal procs (for RleBlockDecoder example it was BatchPacker proc to be specific)
This caused codegen to generate verilog logic only for the BatchPacker proc.

  1. A workaround for this issue can be used at the cost of adding dummy channels on top-level. On the same branch, I also prepared a PassthroughDummy process, which adds dummy channels thorough the hierarchy and performs redundant empty send/recvs.

I've also tested this workaround and it didn't work for me. It resulted in generation of verilog logic that only receives and sends data through those dummy channels.

I've come up with an idea for a possible workaround which involved:

  • Removing direct connections between top-level proc input/output channels and input/output channels of internal procs which are normally specified in config() function
  • Manually handling receiving input data, sending it to first internal proc, receiving output data from last internal proc and sending it out to top-level output channel in next() function of top-level proc

This can be visualized by the following diagram:
Workaround proc hierarchy drawio

I tested the idea on antmicro@5a72b99 and this caused codegen to generate verilog code for RleBlockDecoder which only receives data from input ports and sends it out on other ports which are the inputs to internal proc which is outside of the generated module. The result for this solution is very similar to the one we get from using dummy channels.

I've also looked at the existing examples in XLS and it seems like in the whole repository there is only one example which is a proc that spawns other procs and has rules for verilog generation. This is Delay32x2048_init3 which is a specification of parameterized Delay.
Here is verilog generation rule: LINK. For IR optimization it uses first internal proc as top proc. That results in generating verilog only for that specific internal proc which happens to be DelayInternal which is not correct.

@lpawelcz
Copy link
Contributor

Just for the record. As stated in #1253 (comment) there is in fact a workaround which allows to generate valid verilog.
For complex procs which spawn multiple internal procs it is important to inline internal procs by passing additional argument to bazel rule for optimizing the IR:

"inline_procs": "true"

For procs which have empty next() function it is also required to explicitly set top argument to the last internal proc in the data flow, e.g.:

"top": "__xls_modules_zstd_dec_mux__BlockDecoder__DecoderMux_0_next"

Additionally, all internal procs must have fifo_depth explicitly set to 1, e.g.:

proc BlockDecoder {
...
    config (...) {
    ...
        let (demux_raw_s, demux_raw_r) = chan<BlockDataPacket, u32:1>;
    ...
    }
...
}

@meheff
Copy link
Collaborator

meheff commented Apr 29, 2025

The IR has supported proc-scoped channels for a while. Optimizations and codegen also generally support the feature though there may be some corner cases or missing features. The big missing component is the front end which I'll file a separate issue for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request ir
Projects
None yet
Development

No branches or pull requests

5 participants