1 file changed +25 -0 lines changed
#include <ddb_delta.h>
#include <ddb_cmph.h>

+ /* Idea:
+ DiscoDB's memory footprint can be huge in the worst case. Consider e.g.
+
+ DiscoDB((title, str(i)) for i, title in enumerate(file('wikipedia-titles')))
+
+ which is pretty much the worst case: all keys and values are unique, so
+ keys_map and values_map just waste space for nothing. Of course there's no way
+ DiscoDB could know this in advance.
+
+ We could provide an alternative interface where the user can maintain the
+ key/value -> id mapping and hence use all the domain information to conserve
+ memory. The interface could look as follows:
+
+ uint64_t value_id = ddb_cons_new_value(const struct ddb_entry *value);
+ uint64_t key_id = ddb_cons_new_key(const struct ddb_entry *key);
+ int ret = ddb_cons_add_id(struct ddb_cons *db, uint64_t key_id, uint64_t value_id);
+
+ In this scenario DiscoDB does not need to maintain internal mappings at all,
+ only two flat arrays for keys (id -> deltalist) and (id -> key) and one for
+ values (id -> value).
+
+ This would be especially convenient in the situations where keys and/or values
+ are unique or grouped - neither user nor discodb needs to maintain a mapping,
+ just a one-time id would suffice.
+ */

#define BUFFER_INC (1024 * 1024 * 64)
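
The idea in the comment above can be illustrated with a minimal sketch. This is not DiscoDB code: the `sketch_*` names, the `struct sketch_entry` stand-in for `struct ddb_entry`, and the explicit builder argument on every call are all assumptions made for illustration. It only shows how ids handed out in insertion order let the builder keep three flat arrays (id -> key, id -> list of value ids, id -> value) with no internal hash maps.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NO_ID UINT64_MAX   /* returned when an allocation fails */

/* Hypothetical stand-in for struct ddb_entry: borrowed bytes plus length. */
struct sketch_entry { const char *data; uint32_t length; };

/* Value ids attached to one key (plays the role of the deltalist slot). */
struct sketch_idlist { uint64_t *ids; uint64_t num, cap; };

/* Hypothetical builder: flat arrays indexed by id, no key/value maps. */
struct sketch_cons {
    struct sketch_entry *keys;     /* key_id   -> key            */
    struct sketch_idlist *lists;   /* key_id   -> value id list  */
    struct sketch_entry *values;   /* value_id -> value          */
    uint64_t num_keys, cap_keys, num_values, cap_values;
};

/* Double an array's capacity; returns the new block or NULL on failure.
   Overflow of the size computation is not checked in this sketch. */
static void *sketch_grow(void *arr, uint64_t *cap, size_t elem_size)
{
    uint64_t new_cap = *cap ? *cap * 2 : 16;
    void *p = realloc(arr, (size_t)new_cap * elem_size);
    if (p)
        *cap = new_cap;
    return p;
}

/* Store a value and return its id, which is simply its array index. */
uint64_t sketch_new_value(struct sketch_cons *db, const struct sketch_entry *value)
{
    if (db->num_values == db->cap_values) {
        void *p = sketch_grow(db->values, &db->cap_values, sizeof *db->values);
        if (!p)
            return SKETCH_NO_ID;
        db->values = p;
    }
    db->values[db->num_values] = *value;   /* pointer is borrowed, not copied */
    return db->num_values++;
}

/* Store a key with an empty value list and return its id. */
uint64_t sketch_new_key(struct sketch_cons *db, const struct sketch_entry *key)
{
    if (db->num_keys == db->cap_keys) {
        uint64_t cap = db->cap_keys;
        void *k = sketch_grow(db->keys, &cap, sizeof *db->keys);
        if (!k)
            return SKETCH_NO_ID;
        db->keys = k;
        cap = db->cap_keys;                /* grow both arrays to the same size */
        void *l = sketch_grow(db->lists, &cap, sizeof *db->lists);
        if (!l)
            return SKETCH_NO_ID;
        db->lists = l;
        db->cap_keys = cap;
    }
    db->keys[db->num_keys] = *key;
    db->lists[db->num_keys] = (struct sketch_idlist){0};
    return db->num_keys++;
}

/* Attach an existing value id to an existing key id; 0 on success, -1 on error. */
int sketch_add_id(struct sketch_cons *db, uint64_t key_id, uint64_t value_id)
{
    if (key_id >= db->num_keys || value_id >= db->num_values)
        return -1;
    struct sketch_idlist *l = &db->lists[key_id];
    if (l->num == l->cap) {
        void *p = sketch_grow(l->ids, &l->cap, sizeof *l->ids);
        if (!p)
            return -1;
        l->ids = p;
    }
    l->ids[l->num++] = value_id;
    return 0;
}

/* Release all arrays and reset the builder to its empty state. */
void sketch_free(struct sketch_cons *db)
{
    for (uint64_t i = 0; i < db->num_keys; i++)
        free(db->lists[i].ids);
    free(db->keys);
    free(db->lists);
    free(db->values);
    *db = (struct sketch_cons){0};
}
```

In the all-unique case from the comment, a caller would start from a zero-initialized `struct sketch_cons`, call `sketch_new_key` and `sketch_new_value` once per title, and pass the two returned ids straight to `sketch_add_id`; neither side needs a lookup table. When values are shared between keys, the caller would keep whatever domain-specific mapping it already has and reuse the stored value id.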