-
Thank you for looking into this. I did a similar profile a while back and optimized the prefix transformer a bit (because it was the slowest part back then), but apart from that, there is still a lot of potential!

I should also clarify my "we do care about the startup speed" statement. What I mean is the following: I really like command line applications that feel snappy. To be honest, most people will probably not care whether the Numbat CLI takes 50 ms or 5 ms. But there is a threshold: startup times tend to become noticeable once you get closer to 100 ms or 200 ms, and 500 ms already feels slow. I personally can feel a difference.

Also, since the startup time is dominated by the interpreter, speeding up startup time means optimizing the interpreter, which is also relevant for executing larger amounts of code.
👍
Okay, here it's probably worth taking a closer look. This consists of two parts: the actual compilation step (AST => bytecode), and then running the bytecode VM.
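Very roughly, the shape is something like this (a simplified sketch with made-up names and opcodes, not the actual implementation):

```rust
// Simplified sketch of a two-stage pipeline: compile an AST to bytecode,
// then run it on a stack-based VM. All names here are made up.

enum Expr {
    Number(f64),
    Add(Box<Expr>, Box<Expr>),
}

enum Op {
    PushConst(f64),
    Add,
}

// Stage 1: flatten the AST into a linear list of instructions.
fn compile(expr: &Expr, out: &mut Vec<Op>) {
    match expr {
        Expr::Number(x) => out.push(Op::PushConst(*x)),
        Expr::Add(lhs, rhs) => {
            compile(lhs, out);
            compile(rhs, out);
            out.push(Op::Add);
        }
    }
}

// Stage 2: execute the instructions on a small value stack.
fn run(ops: &[Op]) -> f64 {
    let mut stack = Vec::new();
    for op in ops {
        match op {
            Op::PushConst(x) => stack.push(*x),
            Op::Add => {
                let rhs = stack.pop().unwrap();
                let lhs = stack.pop().unwrap();
                stack.push(lhs + rhs);
            }
        }
    }
    stack.pop().unwrap()
}

fn main() {
    let ast = Expr::Add(Box::new(Expr::Number(1.0)), Box::new(Expr::Number(2.0)));
    let mut ops = Vec::new();
    compile(&ast, &mut ops);
    println!("{}", run(&ops)); // 3
}
```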
Yeah, that's a special part of Numbat. Maybe it would be possible to integrate this into the parser, but I found it easier to implement as a separate stage. But that also comes at a cost (traversing and rewriting the full AST). The job of the prefix transformer is to do name resolution, and in particular: to distinguish between variables and units. If we see an identifier like `km`, we need to decide whether it refers to a variable or to a prefixed unit (kilo + metre).
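In other words, something along these lines (just an illustrative sketch; the names and the shadowing order are assumptions, not the real code):

```rust
// Illustrative sketch of prefix-based name resolution: an identifier is either
// a known variable, a known unit, or a metric prefix followed by a unit.
// Names and the lookup order are assumptions, not Numbat's actual rules.

use std::collections::HashSet;

#[derive(Debug)]
enum Resolved {
    Variable(String),
    Unit { prefix: Option<&'static str>, unit: String },
    Unknown(String),
}

fn resolve(ident: &str, variables: &HashSet<String>, units: &HashSet<String>) -> Resolved {
    if variables.contains(ident) {
        return Resolved::Variable(ident.to_string());
    }
    if units.contains(ident) {
        return Resolved::Unit { prefix: None, unit: ident.to_string() };
    }
    // Try to split off a metric prefix (only a few prefixes shown here).
    for prefix in ["kilo", "milli", "micro"] {
        if let Some(rest) = ident.strip_prefix(prefix) {
            if units.contains(rest) {
                return Resolved::Unit { prefix: Some(prefix), unit: rest.to_string() };
            }
        }
    }
    Resolved::Unknown(ident.to_string())
}

fn main() {
    let variables: HashSet<String> = ["x".to_string()].into_iter().collect();
    let units: HashSet<String> = ["metre".to_string()].into_iter().collect();
    println!("{:?}", resolve("kilometre", &variables, &units)); // Unit { prefix: Some("kilo"), unit: "metre" }
    println!("{:?}", resolve("x", &variables, &units));         // Variable("x")
}
```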
The typechecker also does a full rewrite of the parse tree (currently: …).
Yes. Like Python's `.pyc` files.
-
Out of curiosity, what're you using for benchmarking in those screenshots?
-
Hey, in #210, you mentioned that "we do care about the startup speed".
I thought about it a little bit. First of all, I profiled the prelude import; here is a quick overview:
![image](https://private-user-images.githubusercontent.com/7032172/277117022-ca8d0969-70a6-4d1f-86f3-aa55712a8ba6.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNzc0MTQsIm5iZiI6MTczOTI3NzExNCwicGF0aCI6Ii83MDMyMTcyLzI3NzExNzAyMi1jYThkMDk2OS03MGE2LTRkMWYtODZmMy1hYTU1NzEyYThiYTYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTFUMTIzMTU0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YjhjNmNiNTNkZmY0ZDM4ZThiNTkyNTFlMTY3YzAxMDlmYzViZWQ5ODJkMTM3MTBkY2Y0ZGMxNTQyZjIzNmM4MSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.mHSV8hbEUXeau-UxzSjJYbQW4F0E9RJ3lXPYfBMEoD4)
As expected, the time is pretty much all spent in the interpreter, and not in the importer or something else unrelated; that's nice.
Now, let’s dive deep into the interpret method:
![image](https://private-user-images.githubusercontent.com/7032172/277117126-2d453161-ce45-49b8-b8d7-7163ba7798c4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNzc0MTQsIm5iZiI6MTczOTI3NzExNCwicGF0aCI6Ii83MDMyMTcyLzI3NzExNzEyNi0yZDQ1MzE2MS1jZTQ1LTQ5YjgtYjhkNy03MTYzYmE3Nzk4YzQucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTFUMTIzMTU0WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YmUzMGM5YjU5Yzc1MGRkYzg3ZTM0OGIyNzZiZWE0YWRiMDgxNGIwMWY4NWU1NWNmNGNiZTRhYjAxYjNiYzc5MyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.5zIVsrv4qudzBzFAIzIDjRGpV0giUP7_LvJkw20hCUo)
The interesting parts are:

- `prefix_transformer`. I have no idea what it is and don't know if it's expected (and I would love to know what it is 👀).

One idea to optimize the startup time could be to pre-compile the prelude to bytecode and add the ability to import bytecode directly into a context.
That should halve the startup time straight away.
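Very roughly, something like this (a hypothetical sketch only; `Program`, the cache file name, and the serde/bincode usage are assumptions for illustration, not an existing Numbat API):

```rust
// Hypothetical sketch: cache the compiled prelude as serialized bytecode and
// load it directly at startup, skipping parse/typecheck/compile on the fast path.
// `Program`, the cache file name, and the serde/bincode usage are all assumptions.

use serde::{Deserialize, Serialize};
use std::fs;
use std::path::Path;

#[derive(Serialize, Deserialize)]
struct Program {
    ops: Vec<u8>,        // placeholder for real bytecode instructions
    constants: Vec<f64>, // placeholder for the constant pool
}

fn load_or_compile_prelude(cache: &Path) -> Program {
    // Fast path: a pre-compiled prelude exists on disk (like Python's .pyc files).
    if let Ok(bytes) = fs::read(cache) {
        if let Ok(program) = bincode::deserialize::<Program>(&bytes) {
            return program;
        }
    }
    // Slow path: compile from source, then write the cache for next time.
    let program = compile_prelude_from_source();
    if let Ok(bytes) = bincode::serialize(&program) {
        let _ = fs::write(cache, bytes);
    }
    program
}

fn compile_prelude_from_source() -> Program {
    // Stand-in for the real parse -> typecheck -> compile pipeline.
    Program { ops: vec![], constants: vec![] }
}

fn main() {
    let prelude = load_or_compile_prelude(Path::new("prelude.nbc")); // hypothetical cache file
    println!("prelude has {} constants", prelude.constants.len());
}
```

The cache would of course also need some kind of version stamp (Numbat version and/or a hash of the prelude sources) so it gets invalidated and recompiled when stale.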