Support for child documents #261

Open
p-gw opened this issue Feb 20, 2025 · 11 comments
@p-gw

p-gw commented Feb 20, 2025

Hi,

I am trying to figure out if child documents are supported by QuartoNotebookRunner.
The following simple example is supported by Quarto's native include mechanism.
I have not been able to get the complex example to work yet.

Simple example

I want to render the top-level document defined below.

---
title: Top-Level document
engine: julia
---

Test.

{{< include _partial.qmd >}}

where _partial.qmd is defined as follows.

Partial document. 

It includes julia code, 1+1=`{julia} 1+1`.

Crucially the partial document has some code in it.
Including it evaluates the code and produces the correct output.

Complex example

In this example I have the same setup as before, but the partial document is included/evaluated several times with different parameters.

For example,

---
title: Top-level document
engine: julia
---

Test.

```{julia}
for i in 1:3
    # code for including the child document
end
```

and the child document references the variable i.

# Heading `{julia} i`

`{julia} 1+i`

Clearly the child documents have to be rendered using the current value of i in the parent document.
I am not sure how I can render the child documents dynamically and include them in the top-level document.
The knitr engine provides the knit_child function that does exactly that.

Maybe the expand mechanism can already be used to achieve this goal?
Happy to get some suggestions.

@MichaelHatherly
Collaborator

> Crucially the partial document has some code in it.
> Including it evaluates the code and produces the correct output.

Yes, so includes splice the source into the main document prior to QNR getting it, so that's all going to work fine and we don't need to do anything about it in QNR at all. Maybe a custom Lua filter that takes a list of files and includes them all? I've not spent too much time looking into the Lua APIs though.

> In this example I have the same setup as before, but the partial document is included/evaluated several times with different parameters.

We don't have a way to do that unless I'm overlooking something. Once QNR gets hold of the document source it doesn't then have a way to recursively call into itself, and I'd prefer that we don't add a mechanism to do that unless absolutely needed, since it would add a lot of additional complexity.

> Maybe the expand mechanism can already be used to achieve this goal?

It may be able to approximate the behaviour you're looking for, though it's not going to be a one-to-one mapping and can't replicate normal qmd source code exactly.

I did just open source https://github.com/PumasAI/QuartoTools.jl which builds on top of expand to provide an official API to the expand function. That might be useful to you perhaps. We'll be registering it in General once it's properly documented.

@jkrumbiegel you have any thoughts on this?

@MichaelHatherly
Collaborator

So for the specific example you gave something like

```{julia}
#| echo: false
QuartoTools.Expand([
    QuartoTools.MarkdownCell(
        """
        # Heading $i

        $i
        """
    ) for i in 1:3
])
```

will work, but perhaps your real use case is more complex than this and QT's current options won't work?
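For reference, the interpolation in that snippet can be tried in plain Julia without QuartoTools. This is only a sketch: `child_markdown` is a hypothetical helper name, and it uses `1 + i` to mirror the child document from the original post. It shows that the strings are built with the parent's current value of i before anything is handed to a MarkdownCell.

```julia
# Plain-Julia sketch; `child_markdown` is a hypothetical helper and is not
# part of QuartoTools or QuartoNotebookRunner.
child_markdown(i) = """
# Heading $i

$(1 + i)
"""

# One markdown source per value of `i`, evaluated in the parent's scope.
sources = [child_markdown(i) for i in 1:3]

foreach(print, sources)
```

Because the values are interpolated as text, anything in the child that needs to run as a real code chunk would not be covered by this approach.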

@p-gw
Author

p-gw commented Feb 20, 2025

Yes, in practice the partial documents are basically fully fledged Quarto reports without front matter. So I need to be able to evaluate inline code and code chunks, and to access variables from the parent document. Like you said, to make this work there probably needs to be some kind of recursive rendering strategy, which seems to be exactly what knitr does.

I'll take a look at QuartoTools and the functions in QuartoNotebookRunner / QuartoNotebookWorker. Maybe I can figure something out to make this work.

@jkrumbiegel
Collaborator

I agree with @MichaelHatherly that because quarto does a transformation step before it even passes markdown to us, we cannot fully recreate inclusion of partial documents within QuartoNotebookRunner. For example, a partial cannot again use {{< include _partial.qmd >}} because resolving that is quarto's job.

But I assume that knitr cannot do this either, so it probably expects markdown without those special commands.

@p-gw
Author

p-gw commented Feb 20, 2025

> For example, a partial cannot again use {{< include _partial.qmd >}} because resolving that is quarto's job.

I am not too concerned about the quarto specific syntax. You are probably right wrt knitr.
This is mostly about executing julia code in the partial document.

I already was able to execute basic inline code as well as code blocks by manually calling QuartoNotebookRunner.render in the parent document. However, I was not able to pass down variables from the parent document to the partial document, probably because QuartoNotebookRunner.render starts a new server/worker? It also takes quite a bit of time to render, presumably because of the same reason.

@MichaelHatherly
Collaborator

Have you looked into trying this with https://quarto.org/docs/extensions/filters.html?

@jkrumbiegel
Collaborator

jkrumbiegel commented Feb 20, 2025

> Once QNR gets hold of the document source it doesn't then have a way to recursively call into itself and I'd prefer that we don't add a mechanism to do that unless absolutely needed, it would add a large additional complexity.

I was wondering if the complexity of this would really be that large. Suppose there was a special type defined in the worker code which, when returned from a cell, would trigger a special branch in the results handling and send the filename(s) to include back to the server, where otherwise it would get the dictionary of resolved MIME outputs. Then within evaluate_raw_cells!, once this special return object is detected, one might be able to call evaluate_raw_cells! again, but with slightly different input data. We'd reuse the metadata of the main file because an included file is not allowed to have its own frontmatter. Instead of the File struct we'd create something like Subfile or so which holds a reference to the same worker as the main File, but to the included file instead of the main one. The methods for refresh! etc. would do nothing on the subfile, and I think the remaining code in evaluate_raw_cells! would then work as is.

All the cells that the inner evaluate_raw_cells! returns would then be appended to the vector being built in the parent and the parent would continue executing the other chunks of the main file. This would also be nestable automatically.

So the difference to knitr would be that one wouldn't expose a function in the worker that can suddenly directly affect the execution pipeline, but instead signal via the type of returned objects that this other branch should be taken in the server.
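A toy model of that control flow, as a rough sketch only: every name here (`IncludeFiles`, `evaluate_cells!`, the `files` dictionary standing in for files on disk) is hypothetical, and none of it is the actual code in server.jl. Cells are modelled as zero-argument functions, and a cell that returns the sentinel type makes the evaluator recurse and splice the child's results into the parent's vector.

```julia
# Toy model of the proposal; all names are hypothetical and this is not
# QuartoNotebookRunner's implementation.

# Sentinel a cell could return to request inclusion of other files.
struct IncludeFiles
    paths::Vector{String}
end

# Evaluate cells in order. When a cell returns the sentinel, recurse into the
# referenced files and append their results to the same vector, then continue
# with the remaining parent cells.
function evaluate_cells!(results::Vector{Any}, cells, files)
    for cell in cells
        value = cell()
        if value isa IncludeFiles
            for path in value.paths
                evaluate_cells!(results, files[path], files)
            end
        else
            push!(results, value)
        end
    end
    return results
end

# Stand-in for the file system: path => cells of that file.
files = Dict("_child.qmd" => Function[() -> "child output"])

parent = Function[
    () -> "before",
    () -> IncludeFiles(["_child.qmd"]),
    () -> "after",
]

results = evaluate_cells!(Any[], parent, files)
```

Since the recursion goes through the same function, a child that itself returns the sentinel would be handled the same way, which is the nesting behaviour described above.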

@MichaelHatherly
Collaborator

That code in server.jl is already more complex than it should be. Here's an alternative approach that helps to refactor and clean up the server code while adding this feature.

We already have a feature that provides arbitrary nesting and creation of fake cells, expand, as noted in #261 (comment). What it is lacking is the ability to parse a .qmd file, since the code that does that is in the server. We don't want to have to reimplement that code.

But we now have QuartoTools open sourced, so we move the notebook parsing code over to that package and then add it as a dep to QuartoNotebookRunner and use it from there.

In QuartoTools we add a type called Notebook that implements expand. It takes a path to a .qmd, parses it, turns it into a list of fake cells that get run directly in the worker and their expanded result is returned in a single trip between worker and server.

That avoids adding complexity directly to the server code for a feature that isn't a core requirement and leaves it as an opt-in extra for the users that do want it.
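To give a rough idea of the "parse a .qmd into a list of cells" step, here is a deliberately naive sketch. `split_cells` is a hypothetical name, not the parser in QuartoNotebookRunner or QuartoTools, and it ignores inline code, cell options, and non-Julia chunks.

```julia
# Naive splitter; `split_cells` is hypothetical and far simpler than the real
# notebook parsing code.
function split_cells(source::AbstractString)
    cells = Tuple{Symbol,String}[]
    buffer = String[]
    in_code = false
    # Push the buffered lines as one cell of the given kind, then reset.
    function flush!(kind)
        isempty(buffer) || push!(cells, (kind, join(buffer, "\n")))
        empty!(buffer)
    end
    for line in split(source, '\n')
        if !in_code && startswith(line, "```{julia}")
            flush!(:markdown)
            in_code = true
        elseif in_code && startswith(line, "```")
            flush!(:code)
            in_code = false
        else
            push!(buffer, line)
        end
    end
    flush!(in_code ? :code : :markdown)
    return cells
end

fence = "`"^3  # built programmatically to avoid a literal fence in this block
source = join(
    ["# Heading", "", "$(fence){julia}", "x = 1", "x + 1", fence, "", "Done."],
    "\n",
)

cells = split_cells(source)
```

Each `(kind, text)` pair would then correspond to one fake cell to be run in the worker.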


I'll note that neither of these approaches seems able to handle quarto shortcodes etc. that run on files prior to us parsing them, so notebook "inclusion" isn't going to be an exact duplication of running the file directly. If we're fine with that, then that's OK, but it's worth noting.

@MichaelHatherly MichaelHatherly added enhancement New feature or request speculative labels Feb 20, 2025
@jkrumbiegel
Collaborator

> In QuartoTools we add a type called Notebook that implements expand. It takes a path to a .qmd, parses it, turns it into a list of fake cells that get run directly in the worker and their expanded result is returned in a single trip between worker and server.

That would also be an option, yes. I thought that it might be nice to just reuse all the machinery so we don't have it in two places, but factoring it out could also work. I guess even inline code would not really be a problem in this case as it could be detected and evaluated on the worker as well.

@p-gw
Author

p-gw commented Feb 21, 2025

> Have you looked into trying this with https://quarto.org/docs/extensions/filters.html?

I am not quite sure how this applies here. There is https://github.com/pandoc/lua-filters/tree/master/include-files but I don't see how that is any different than {{< include ... >}}.

With regards to the proposed feature I obviously don't know the code in QuartoNotebookRunner well enough to make any suggestions for the implementation. But if you want to move forward with it, I am glad to help in some capacity.

@MichaelHatherly
Collaborator

> I am not quite sure how this applies here.

I was asking whether what you wanted to achieve was doable by writing yourself a custom Pandoc filter that would pre-process your document to insert all the extra content you wanted. If what you want to add depends on runtime information that is only available inside the julia process in the notebook then it's not a viable route, but I wanted to check whether you'd looked into that route already.

> With regards to the proposed feature

I've been planning on refactoring that server code anyway, so if I do get around to it, I'll keep this feature request in mind and see what can be done to get it working.

@MichaelHatherly MichaelHatherly self-assigned this Feb 21, 2025