Julia-Garland
Contributor

@Julia-Garland Julia-Garland commented Aug 26, 2025

Motivation

"There are several points in the codebase that use bson_malloc(sizeof(T) * N) to allocate arrays of objects. This should not do an in-situ multiplication, since a large value of N will cause integer overflow and result in either an allocation failure or a bogus allocation size. Also, N = 0 can cause issues since malloc(0) is undefined/unspecified."

Summary

Array allocations of the form bson_malloc(sizeof(T) * N) now use a new function, bson_array_alloc(), which multiplies the two terms internally. Likewise, equivalent calls of the form bson_malloc0(sizeof(T) * N) now use bson_array_alloc0().
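To make the overflow concern concrete, here is a minimal sketch of an array allocator that performs the multiplication internally. The name example_array_alloc is hypothetical and stands in for the PR's bson_array_alloc(); it uses plain malloc rather than the libbson allocator, and returning NULL for empty or overflowing requests is an assumption, not necessarily what the PR does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an overflow-checked array allocator.
 * Assumption: empty or overflowing requests return NULL. */
static void *
example_array_alloc(size_t type_size, size_t num_elems)
{
    if (type_size == 0 || num_elems == 0) {
        return NULL; /* avoid relying on what malloc(0) returns */
    }
    if (num_elems > SIZE_MAX / type_size) {
        return NULL; /* type_size * num_elems would wrap around */
    }
    return malloc(type_size * num_elems);
}

/* Small self-check of the behaviour described above. */
static void
example_array_alloc_self_check(void)
{
    int *xs = example_array_alloc(sizeof *xs, 8);
    assert(xs != NULL);
    xs[7] = 42;
    assert(xs[7] == 42);
    free(xs);

    /* A count this large cannot be represented in bytes. */
    assert(example_array_alloc(sizeof(int), SIZE_MAX) == NULL);
    /* Zero-length requests are rejected explicitly. */
    assert(example_array_alloc(sizeof(int), 0) == NULL);
}
```

The division-based check avoids performing the multiplication at all until it is known to fit in size_t.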

@Julia-Garland Julia-Garland self-assigned this Aug 26, 2025
@Julia-Garland Julia-Garland force-pushed the audit-array-allocations.cdriver-6055 branch from 6cdb790 to b4c8bb8 Compare August 27, 2025 14:04
@Julia-Garland Julia-Garland marked this pull request as ready for review August 27, 2025 16:17
@Julia-Garland Julia-Garland requested a review from a team as a code owner August 27, 2025 16:17
@Julia-Garland Julia-Garland force-pushed the audit-array-allocations.cdriver-6055 branch from 0342bfe to fcd73e6 Compare August 27, 2025 17:33
Comment on lines +51 to +53
bson_array_alloc(size_t type_size, size_t num_elems);
BSON_EXPORT(void *)
bson_array_alloc0(size_t type_size, size_t num_elems);
Contributor

Since this is very similar to calloc, I recommend consolidating these into a single new function bson_calloc that uses the calloc function on the BSON vmemtable, rather than adding two new public APIs. This will also defer the calloc logic to the underlying allocator, which may have more smarts based on object size.
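A rough sketch of what such a wrapper could look like follows. The table type example_mem_vtable_t is a simplified, hypothetical stand-in for the libbson allocator table, and example_calloc is an illustrative name rather than the proposed bson_calloc itself; rejecting zero-sized requests is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified allocator table. The real libbson table
 * has more members; only the ones needed here are modelled. */
typedef struct {
    void *(*malloc_fn)(size_t num_bytes);
    void *(*calloc_fn)(size_t n_members, size_t num_bytes);
    void (*free_fn)(void *mem);
} example_mem_vtable_t;

/* Default table that forwards to the C library. */
static example_mem_vtable_t example_vtable = {malloc, calloc, free};

/* Forward both terms to the table's calloc, so the size computation
 * and the zeroing strategy are left to the underlying allocator. */
static void *
example_calloc(size_t n_members, size_t num_bytes)
{
    if (n_members == 0 || num_bytes == 0) {
        return NULL; /* assumption: empty requests yield NULL */
    }
    return example_vtable.calloc_fn(n_members, num_bytes);
}

/* Small self-check: the returned memory is zero-initialized. */
static void
example_calloc_self_check(void)
{
    int *xs = example_calloc(4, sizeof *xs);
    assert(xs != NULL);
    for (size_t i = 0; i < 4; i++) {
        assert(xs[i] == 0);
    }
    example_vtable.free_fn(xs);
}
```

Because the wrapper never multiplies the two terms itself, any overflow handling is whatever the installed calloc provides.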

@@ -27,7 +27,7 @@ mongoc_set_new(size_t nitems, mongoc_set_item_dtor dtor, void *dtor_ctx)
mongoc_set_t *set = (mongoc_set_t *)bson_malloc(sizeof(*set));

set->items_allocated = BSON_MAX(nitems, 1);
-set->items = (mongoc_set_item_t *)bson_malloc(sizeof(*set->items) * set->items_allocated);
+set->items = (mongoc_set_item_t *)bson_array_alloc(sizeof(*set->items), set->items_allocated);
Contributor

Since the pattern everywhere is almost always (T*)bson_array_alloc(sizeof(T), N), this is a good time to use a function-like macro:

// Plain function
void* _bson_alloc_n_impl(size_t item_size, size_t count);
// Macro that does the right thing every time:
#define bson_alloc_n(Type, Count) \
  ((Type*)_bson_alloc_n_impl(sizeof(Type), (Count)))

This also ensures the returned pointer has the correct type, rather than relying on the implicit conversion from void*.
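For illustration, a self-contained version of this idea might look like the sketch below. The names example_alloc_n_impl, example_alloc_n, and example_item_t are hypothetical, the implementation uses plain malloc, and returning NULL on an empty or overflowing request is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Plain function: checks the product before allocating. */
static void *
example_alloc_n_impl(size_t item_size, size_t count)
{
    if (item_size == 0 || count == 0 || count > SIZE_MAX / item_size) {
        return NULL; /* assumption: reject empty or overflowing requests */
    }
    return malloc(item_size * count);
}

/* Macro: derives the element size from the type and casts the result,
 * so the caller cannot pass a mismatched sizeof. */
#define example_alloc_n(Type, Count) \
    ((Type *)example_alloc_n_impl(sizeof(Type), (Count)))

/* Hypothetical element type used only for this demonstration. */
typedef struct {
    int id;
    double weight;
} example_item_t;

static void
example_alloc_n_self_check(void)
{
    example_item_t *items = example_alloc_n(example_item_t, 16);
    assert(items != NULL);
    items[15].id = 7;
    assert(items[15].id == 7);
    free(items);

    /* Assigning the result to an unrelated pointer type, e.g.
     *   long *wrong = example_alloc_n(example_item_t, 16);
     * would draw an incompatible-pointer diagnostic, which a bare
     * void* return value would not. */
}
```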

Contributor Author

Since most of the original function calls I changed were to bson_malloc, not bson_malloc0, would consolidating into a calloc wrapper, where memory is always zeroed, raise any efficiency concerns?

Collaborator

I would rather keep a distinct non-zeroing alloc for cases that are performance sensitive. For example, allocations in mongoc-set.c might be arbitrarily large. Maybe add a bson_alloc_n0 as the zeroing variant?

4 participants