Skip to content

Replace MACROS with hashmap #209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 2, 2025
Merged

Replace MACROS with hashmap #209

merged 1 commit into from
Jun 2, 2025

Conversation

icgmilk
Copy link

@icgmilk icgmilk commented May 23, 2025

This PR introduced 'MACROS_MAP' to replace the original 'MACROS' structure, and refactored related logic with hashmap utilities.

To make this change improves memory flexibility and future extensibility, I set MAX_ALIASES to 128.
The original value (1024) is excessive, as uftrace analysis shows that the number of aliases rarely exceeds 64 in practice(measured via uftrace record ./out/shecc ./src/main.c), and the total number of #defines in shecc is typically below 70.
In shecc's hashmap implementation, a rehash is triggered when the number of entries exceeds 75% of the capacity.
With this load factor of 0.75, a capacity of 128 supports up to 96 entries before rehashing, which comfortably covers observed usage while providing sufficient headroom.
In contrast, a capacity of 64 only allows 48 entries, increasing the likelihood of rehashing.
uftrace confirms this: setting MAX_ALIASES to 64 triggered one more hashmap_rehash compared to 128.
Therefore, I chose 128 over 64 to avoid unnecessary rehashing while still keeping memory usage significantly lower than the original default.

Performance analysis for out/shecc src/main.c

set `MAX_ALIASES` to 128

max_aliases_128

51.233 us 32.695 us 4 hashmap_rehash

set `MAX_ALIASES` to 64

max_aliases_64

56.218 us 35.418 us 5 hashmap_rehash

Summary by Bito

This pull request enhances the code by replacing the static 'MACROS' structure with a dynamic hashmap, 'MACROS_MAP', improving memory usage and lookup performance from O(n) to O(1). The maximum number of aliases has been set to 128 to optimize performance and reduce rehashing.

Copy link
Collaborator

@jserv jserv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address the benefits of the proposed changes in commit message.

@fennecJ
Copy link
Collaborator

fennecJ commented May 24, 2025

The idea of using a hashmap to improve macro lookup is good, but I wonder if it actually makes the program faster.

The original implementation:

MACROS = malloc(MAX_ALIASES * sizeof(macro_t)); // MAX_ALIASES = 1024 by default
  • Preallocates a MACROS pool with a single malloc, while the new implementation calls malloc once for each inserted macro.
  • The MACROS pool is linearly allocated in memory, which is cache-friendly. In contrast, the hashmap uses linked lists, with pointers pointing to scattered memory locations across different cache lines — which is not cache-friendly.

Based on these two points, I don't think the new implementation improves compiler performance.
Perhaps preallocating a MACROS pool and using a hash-based index could be a better compromise.
Alternatively, have you benchmarked the new implementation to confirm it's actually faster?

@icgmilk
Copy link
Author

icgmilk commented May 26, 2025

The idea of using a hashmap to improve macro lookup is good, but I wonder if it actually makes the program faster.

The original implementation:

MACROS = malloc(MAX_ALIASES * sizeof(macro_t)); // MAX_ALIASES = 1024 by default
  • Preallocates a MACROS pool with a single malloc, while the new implementation calls malloc once for each inserted macro.
  • The MACROS pool is linearly allocated in memory, which is cache-friendly. In contrast, the hashmap uses linked lists, with pointers pointing to scattered memory locations across different cache lines — which is not cache-friendly.

Based on these two points, I don't think the new implementation improves compiler performance. Perhaps preallocating a MACROS pool and using a hash-based index could be a better compromise. Alternatively, have you benchmarked the new implementation to confirm it's actually faster?

Thanks for pointing that out, you're right that this implementation doesn't improve compiler performance.
That said, I believe it's still beneficial in terms of memory flexibility and future extensibility.
I'll update the description accordingly. Thanks for the feedback!

@jserv jserv requested a review from ChAoSUnItY May 30, 2025 17:31
@jserv
Copy link
Collaborator

jserv commented May 30, 2025

this implementation doesn't improve compiler performance. That said, I believe it's still beneficial in terms of memory flexibility and future extensibility. I'll update the description accordingly. Thanks for the feedback!

Could you show the real benefits for memory flexibility with specific numbers?

Copy link
Collaborator

@ChAoSUnItY ChAoSUnItY left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall logic looks good to me, but please provide a analysis on adjustment of MAX_ALIASES.

@@ -974,7 +979,7 @@ void global_init(void)
{
elf_code_start = ELF_START + elf_header_len;

MACROS = malloc(MAX_ALIASES * sizeof(macro_t));
MACROS_MAP = hashmap_create(MAX_ALIASES);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide a analysis on MAX_ALIASES's configuration, that is, which power of 2 to represent it would be more efficient in this case? As you stated, this patch provides memory flexibility for macro_t structure allocation, however, it's obvious that 1024 is overkill for this purpose.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used uftrace to analyze the impact of different MAX_ALIASES configurations and found that setting it to 128 resulted in the fewest hashmap_rehash calls during execution while maintaining the smallest stable size.
The detailed analysis and results are documented in the commit message for reference.

Introduced 'MACROS_MAP' to replace the original 'MACROS' structure,
and refactored related logic with hashmap utilities.

To make this change improves memory flexibility and
future extensibility, I set 'MAX_ALIASES' to 128.

The original value (1024) is excessive, as uftrace analysis
shows that the number of aliases rarely exceeds 64
in practice(measured via 'uftrace record ./out/shecc ./src/main.c'),
and the total number of '#defines' in 'shecc' is typically below 70.

In shecc's hashmap implementation, a rehash is triggered when
the number of entries exceeds 75% of the capacity.

With this load factor of 0.75, a capacity of 128 supports up to 96
entries before rehashing, which comfortably covers observed
usage while providing sufficient headroom.

In contrast, a capacity of 64 only allows 48 entries, increasing
the likelihood of rehashing.

uftrace confirms this: setting 'MAX_ALIASES' to 64 triggered one
more 'hashmap_rehash' compared to 128.

Therefore, I chose 128 over 64 to avoid unnecessary rehashing while
still keeping memory usage significantly lower than
the original default.
@jserv jserv requested a review from ChAoSUnItY June 2, 2025 00:40
@jserv jserv merged commit e3154e3 into sysprog21:master Jun 2, 2025
6 checks passed
@jserv
Copy link
Collaborator

jserv commented Jun 2, 2025

Thank @icgmilk for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants