adding missing openmp compile guards #1

ct-clmsn · 2023-07-28T17:56:11Z

What

These 3 files are missing compile guards around OpenMP pragmas:

./src/tools/perf/perftest.c
./src/tools/perf/lib/libperf.c
./test/apps/test_init_mt.c
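As a minimal sketch of what such a guard looks like (not the actual diff of this PR), the OpenMP header and pragma can be wrapped in a check on the standard `_OPENMP` macro, which compilers define only when OpenMP is enabled. The helper names `max_threads` and `guarded_sum` are hypothetical and exist only for illustration.

```c
#ifdef _OPENMP
#include <omp.h> /* only available/needed when built with OpenMP */
#endif

/* Hypothetical helper: how many threads a parallel region may use. */
static int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1; /* serial build: no OpenMP runtime to ask */
#endif
}

/* Hypothetical helper: sum 1..n, in parallel only when OpenMP is on. */
static long guarded_sum(long n)
{
    long sum = 0;
    long i;

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 1; i <= n; ++i) {
        sum += i;
    }

    return sum;
}
```

With the guards in place the same source builds with or without OpenMP support; without it, the loop simply runs serially.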

This patch is derived from the work done by Chris Taylor and John Leidel (from this PR: https://github.com/openucx/ucx/pull/8246/commits). It differs from that PR in the following ways:

* Use __clear_cache() for i-cache updates with correct syscall fallback (need to pass 0 for all CPUs) (myself)
* Use appropriate fence instructions for IO and CPU fences (myself)
* Coding standard fixes (myself)
* Removal of unused rv64/atomic.h (myself)
* Header guard fixes (myself)
* Removal of unused UCS_CPU_MODEL_RV64IMAFDC (myself)
* Corrected patch address immediate calculation (Evan Green)
* Handling of zero for clz (Evan Green)
* Use generic time functions (Evan Green)

Original-author: Chris Taylor <[email protected]>
Original-author: John Leidel <[email protected]>
Co-authored-by: Evan Green <[email protected]>
Signed-off-by: Rob Bradford <[email protected]>
Signed-off-by: Evan Green <[email protected]>

Using UCX_MEM_MMAP_HOOK_MODE=reloc crashed on RISC-V because ucm_dl_populate_symbols() walked the symbol table using exactly the value it found in the ELF dynamic section entry. That entry is a virtual address if and only if the image in question has been loaded at address 0 (which is architecture dependent and not the case on RISC-V). Treating it as a virtual address when the image is not loaded at 0 is therefore incorrect (the pointer is dangling). Add dlpi_addr to the dynamic entries that are treated as virtual addresses to compute their correct values.
Signed-off-by: Evan Green <[email protected]>
Co-developed-by: Rob Bradford <[email protected]>
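The address arithmetic described in this commit message can be sketched as below. This is an illustration only, not the UCX code: `dyn_entry_t` is a hypothetical stand-in for an ELF dynamic entry, and `dyn_entry_addr` is a made-up helper. It assumes, as the commit message describes, that the stored pointer is relative to the image's load base (the `dlpi_addr` reported by `dl_iterate_phdr()`).

```c
#include <stdint.h>

/* Hypothetical stand-in for one ELF dynamic entry (tag plus pointer). */
typedef struct {
    int64_t   tag; /* entry type, e.g. a symbol table tag */
    uintptr_t ptr; /* value as stored in the dynamic section */
} dyn_entry_t;

/*
 * Hypothetical helper: turn the stored value into a usable address by
 * adding the load base of the image. When the image is loaded at 0 the
 * result equals the stored value, matching the case that used to work.
 */
static uintptr_t dyn_entry_addr(uintptr_t load_base, const dyn_entry_t *entry)
{
    return load_base + entry->ptr;
}
```

Using the raw `ptr` value directly is only correct when `load_base` is 0, which is the situation the commit message identifies as the source of the crash.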

Rather than relying on RTLD_NEXT to return the symbol and falling back to RTLD_DEFAULT, look up the symbol directly from the current shared library. This resolves an issue where, in some circumstances on RISC-V, the symbol returned by RTLD_NEXT is null and RTLD_DEFAULT then provides a pointer into the PLT stub. The PLT stub is too small to be patched; resolving the symbol directly here avoids the PLT, so the C library function itself can be patched.
Signed-off-by: Rob Bradford <[email protected]>

On RISC-V, an atomic update requires the destination to be naturally aligned to the size of the type. Since the destinations in this case are the addresses of library functions to patch, we cannot control those addresses. Fortunately, instructions must be aligned to 16 bits, which also matches the size of the patch on RISC-V (a compressed J instruction). For other architectures the size of the patch is either 4 bytes (aarch64) or 2 bytes (x86-64), both of which are handled by this patch.
Signed-off-by: Rob Bradford <[email protected]>
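A minimal sketch of a size-dispatched atomic patch write follows. It is not the body of ucm_bistro_apply_patch_atomic; the helper name `patch_store_atomic` and its return convention are hypothetical, and it assumes the GCC/Clang `__atomic_store_n` builtin. It covers the 2-byte and 4-byte patch sizes the commit message mentions and refuses destinations that are not naturally aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write a 2- or 4-byte patch with a single atomic
 * store. Returns 0 on success, -1 if the size is unsupported or dst is
 * not naturally aligned for that size.
 */
static int patch_store_atomic(void *dst, const void *patch, size_t len)
{
    uint16_t half;
    uint32_t word;

    switch (len) {
    case 2:
        if (((uintptr_t)dst % sizeof(half)) != 0) {
            return -1;
        }
        memcpy(&half, patch, sizeof(half)); /* patch may be unaligned */
        __atomic_store_n((uint16_t*)dst, half, __ATOMIC_SEQ_CST);
        return 0;
    case 4:
        if (((uintptr_t)dst % sizeof(word)) != 0) {
            return -1;
        }
        memcpy(&word, patch, sizeof(word));
        __atomic_store_n((uint32_t*)dst, word, __ATOMIC_SEQ_CST);
        return 0;
    default:
        return -1; /* no single-store path for other sizes */
    }
}
```

A real patcher would also have to make the target page writable and flush the instruction cache afterwards; both steps are left out of this sketch.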

Counting the trailing zeroes only works if the input value is strictly a power of two. Signed-off-by: Rob Bradford <[email protected]>

rbradford and others added 12 commits July 27, 2023 17:25

UCS/ARCH/RV64: Remove unnecessary memcpy options

86442ca

The RV64 port does not have a custom memcpy() implementation. Signed-off-by: Rob Bradford <[email protected]>

UCS/ARCH/RV64: Add ucs_ prefix to ctz_safe macro

93dfdd4

Signed-off-by: Rob Bradford <[email protected]>

UCM/UTIL/RELOC: Fix debug messages

bfc6df5

Use ucm_trace over ucm_debug and use lower case throughout. Signed-off-by: Rob Bradford <[email protected]>

UCM/UTIL/RELOC: Make the dlopen flags static const

32cd421

And relocate to top of the function definition. Signed-off-by: Rob Bradford <[email protected]>

UCM/BISTRO/RV64: Fix comments on assembly instructions

defd8c8

The arguments were wrongly labelled. Signed-off-by: Rob Bradford <[email protected]>

UCM/BISTRO/RV64: Enable ucm_bistro_apply_patch_atomic

c1f14ee

Signed-off-by: Rob Bradford <[email protected]>

UCS/ARCH/RV64: Use clz to calculate the log2 of a number

1af06d0

Counting the trailing zeroes only works if the input value is strictly a power of two. Signed-off-by: Rob Bradford <[email protected]>
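To illustrate why the two approaches differ, here is a small sketch using the GCC/Clang builtins `__builtin_clzll` and `__builtin_ctzll`. The helper names are hypothetical and this is not the UCX macro; returning 0 for a zero input is an assumption made here only to keep the helpers total, since both builtins are undefined for 0.

```c
#include <stdint.h>

/* Hypothetical helper: floor(log2(v)) from the leading-zero count. */
static unsigned ilog2_clz(uint64_t v)
{
    if (v == 0) {
        return 0; /* assumption: the builtin is undefined for 0 */
    }
    return 63u - (unsigned)__builtin_clzll(v);
}

/* Hypothetical helper: trailing-zero count, for comparison only. */
static unsigned trailing_zeroes(uint64_t v)
{
    if (v == 0) {
        return 0; /* assumption: the builtin is undefined for 0 */
    }
    return (unsigned)__builtin_ctzll(v);
}
```

For 8 (binary 1000) both helpers return 3. For 12 (binary 1100) the trailing-zero count is 2 while the clz-based result is 3, which is the floor of log2(12).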

adding missing openmp compile guards

e6fc914

rbradford force-pushed the 20230628-riscv-enabling branch 3 times, most recently from 8628a4a to f60b36c Compare August 7, 2023 15:54

rbradford force-pushed the 20230628-riscv-enabling branch from 64f6f4b to c8e9502 Compare August 23, 2023 11:29
