Updating default make flags for make-settings in profiling section. #170

Merged 2 commits on Sep 10, 2024

8 changes: 5 additions & 3 deletions topics/debugging.md
@@ -40,9 +40,11 @@ production environment.

 ## Compiling Valkey without optimizations

-By default Valkey is compiled with the `-O2` switch, this means that compiler
-optimizations are enabled. This makes the Valkey executable faster, but at the
-same time it makes Valkey (like any other program) harder to inspect using GDB.
+By default, Valkey is compiled with the `-O3` optimization flag, which enables
+a high level of compiler optimizations that aim to maximize runtime performance.
+Valkey is also compiled with the `-fno-omit-frame-pointer` flag by default, ensuring that
+the frame pointer is preserved across function calls. This combination allows for
+precise stack walking and call stack tracing, which is essential for debugging.

 It is better to attach GDB to Valkey compiled without optimizations using the
 `make noopt` command (instead of just using the plain `make` command). However,
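The choice described in this hunk (plain `make` for the default optimized build, `make noopt` when the binary will be inspected with GDB) can be sketched as a small dry-run helper. This is only an illustration: `valkey_build_cmd` is a hypothetical name that is not part of Valkey, and it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Valkey): prints the build command that the
# text above suggests for a given purpose. It does not run make itself.
valkey_build_cmd() {
    case "$1" in
        debug)   echo "make noopt" ;;  # unoptimized build, easier to inspect in GDB
        default) echo "make" ;;        # regular optimized build
        *)       echo "usage: valkey_build_cmd debug|default" >&2; return 2 ;;
    esac
}

# Example: which command to use before a GDB session.
valkey_build_cmd debug
```

Printing instead of executing keeps the sketch safe to try outside a Valkey checkout.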
26 changes: 13 additions & 13 deletions topics/performance-on-cpu.md
@@ -37,23 +37,23 @@ For a proper On-CPU analysis, Valkey (and any dynamically loaded library like
 Valkey Modules) requires stack traces to be available to tracers, which you may
 need to fix first.

-By default, Valkey is compiled with the `-O2` switch (which we intent to keep
-during profiling). This means that compiler optimizations are enabled. Many
-compilers omit the frame pointer as a runtime optimization (saving a register),
-thus breaking frame pointer-based stack walking. This makes the Valkey
-executable faster, but at the same time it makes Valkey (like any other program)
-harder to trace, potentially wrongfully pinpointing on-CPU time to the last
-available frame pointer of a call stack that can get a lot deeper (but
-impossible to trace).
+By default, Valkey is compiled with the `-O3` optimization flag (which we intent to keep
+during profiling). This means that compiler optimizations are enabled which significantly
+enhance the performance. Valkey is also compiled with the `-fno-omit-frame-pointer` flag
+by default, ensuring that the frame pointer is preserved across function calls.
+This combination allows for precise stack walking and call stack tracing,
+which is essential for accurate profiling and debugging. Keeping the frame pointer
+intact helps profiling tools like `perf`, `gdb`, and others correctly attribute on-CPU
+time to deeper call stack frames, leading to more reliable insights into performance bottlenecks
+and hotspots. This setup strikes a balance between maintaining a highly optimized executable
+and ensuring that profiling and tracing tools provide accurate and actionable data.
Comment on lines +48 to +49
Contributor

GPT generated :D ?

Member Author

@roshkhatri roshkhatri Sep 9, 2024

Yeah :P for formatting the content better.

Member

So verbose though 🫠


 It's important that you ensure that:
 - debug information is present: compile option `-g`
 - frame pointer register is present: `-fno-omit-frame-pointer`
-- we still run with optimizations to get an accurate representation of production run times, meaning we will keep: `-O2`
+- we still run with optimizations to get an accurate representation of production run times, meaning we will keep: `-O3`

-You can do it as follows within redis main repo:
+You can do it as follows within valkey main repo:

-    $ make SERVER_CFLAGS="-g -fno-omit-frame-pointer"
+    $ make SERVER_CFLAGS="-g"

 ## A set of instruments to identify performance regressions and/or potential **on-CPU performance** improvements

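The three-item checklist in this hunk (debug information, frame pointer, optimizations kept on) can be sketched as a check over a compiler-flag string. This is a minimal sketch under stated assumptions: `check_profiling_flags` is a hypothetical helper, not something Valkey ships, it only matches the exact spellings `-g`, `-fno-omit-frame-pointer`, `-O2` and `-O3`, and the flag string has to be supplied by the caller (for example, copied from the build output).

```shell
#!/bin/sh
# Hypothetical helper (not part of Valkey): checks a compiler-flag string
# against the profiling checklist above. Assumption: only the exact spellings
# -g, -fno-omit-frame-pointer and -O2/-O3 are recognized (variants such as
# -ggdb are not handled in this sketch).
check_profiling_flags() {
    flags=" $1 "          # pad with spaces so each flag matches as a whole word
    missing=""
    case "$flags" in
        *" -g "*) ;;
        *) missing="$missing -g" ;;
    esac
    case "$flags" in
        *" -fno-omit-frame-pointer "*) ;;
        *) missing="$missing -fno-omit-frame-pointer" ;;
    esac
    case "$flags" in
        *" -O2 "*|*" -O3 "*) ;;
        *) missing="$missing -O2/-O3" ;;
    esac
    if [ -z "$missing" ]; then
        echo "ok"
    else
        echo "missing:$missing"
        return 1
    fi
}

# Example: the defaults described in the diff, plus -g passed via SERVER_CFLAGS.
check_profiling_flags "-O3 -fno-omit-frame-pointer -g"
```

With the new defaults described in the diff, only `-g` has to be passed explicitly, which is why the updated command shrinks to `make SERVER_CFLAGS="-g"`.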