Tree-based APC storage engine #37
Closed
Initial pass at implementing the directed acyclic graph as described in #36. This code also contains an upper bound on how many failed calls to `apcu_cas()` in a tight loop will be allowed before an exception is thrown, to prevent an infinite loop arising from some unforeseen APCu misbehavior. I'm not strongly attached to this loop-catcher for the general case if anybody is opposed, but for my needs it was important to have as a defensive-coding pattern.

The DAG is a tree structure, starting with a root node that contains a serialized array of all the metadata keys stored in APCu. Each metadata key contains the typical metadata (name, help, metric type) as well as an array of labelNames associated with the metric being measured, plus an array of buckets if the metric is a histogram. The metadata APCu key names end with ":meta".
For each labelNames item, a ":label" APCu key is created, listing every label value seen for that labelName. Then, for each permutation of labels, a ":value" APCu key is created once that label-tuple contains a value. Similarly, for each histogram bucket assigned to a label-tuple, a bucket-specific ":value" APCu key is created once the bucket contains a value. The same applies to the special buckets "sum" and "+Inf", which are not stored as part of the metric's metadata APCu key.
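As a concrete illustration of the scheme above (the `prom:` prefix and exact key syntax are hypothetical, for illustration only), a histogram `http_seconds` with a single labelName `method` might produce keys like:

```
prom:http_seconds:meta                  # name, help, type, labelNames, buckets
prom:http_seconds:method:label          # all values seen for "method", e.g. ["GET","POST"]
prom:http_seconds:["GET"]:0.5:value     # bucket 0.5 for the label-tuple ["GET"]
prom:http_seconds:["GET"]:sum:value     # special "sum" bucket
prom:http_seconds:["GET"]:+Inf:value    # special "+Inf" bucket
```

Every key below the ":meta" level is derivable from the contents of the metadata key, which is what makes enumeration cheap.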
The described structure allows all APCu keys containing values to be quickly enumerated, simply by reading the root node, loading each metadata APCu key referenced by the root node, and programmatically generating every ":label" and ":value" APCu key that could exist based on the contents of that metadata key. Wiping all data follows a similar pattern: enumerate all keys which contain data and delete them, starting at the leaf nodes and working backward toward the root, deleting the root node last of all.
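A sketch of the wipe procedure described above, assuming hypothetical helper and constant names (`ROOT`, `enumerateValueKeys()`, `labelKey()` are placeholders, not necessarily what this PR implements):

```php
// Sketch of wiping the tree leaf-first, as described above.
// ROOT and the helper functions are illustrative names.
const ROOT = 'prom:root';

function wipeStorage(): void
{
    $metaKeys = apcu_fetch(ROOT) ?: [];
    foreach ($metaKeys as $metaKey) {
        $meta = apcu_fetch($metaKey);
        // Delete every ":value" key derivable from the metadata...
        foreach (enumerateValueKeys($meta) as $valueKey) {
            apcu_delete($valueKey);
        }
        // ...then the ":label" keys...
        foreach ($meta['labelNames'] as $labelName) {
            apcu_delete(labelKey($meta['name'], $labelName));
        }
        // ...then the ":meta" key itself.
        apcu_delete($metaKey);
    }
    // The root node is deleted last of all.
    apcu_delete(ROOT);
}
```

Deleting leaf-first means a concurrent reader never follows a pointer from a still-present parent to an already-deleted child's parent metadata.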
Note there is a small race condition in `addItemToKey()`: between reading the array and writing it back, another thread could add an element to the contents of the key. For my purposes this is acceptable, because if a metadata pointer gets deleted from the root node, for instance, the next thread to write to that metric will re-create the missing array pointer and will likely succeed. When hundreds of threads per second are writing metrics, the window during which this pointer is missing will be exceedingly small -- and when traffic is light, chances are good that the race will not be triggered at all. It could be improved, though -- I'm still thinking about ways to serialize access to the critical part of that function -- ideas are welcome!
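One possible way to serialize the critical section is a crude spin lock built on `apcu_add()`, which is atomic and fails if the key already exists. This is only a sketch of one idea, not the PR's implementation; the lock-key suffix, TTL, and sleep interval are all illustrative:

```php
// Illustration of the read-modify-write race described above, plus one
// possible mitigation: a spin lock on apcu_add(). Names are illustrative.
function addItemToKey(string $key, string $item): void
{
    $lockKey = $key . ':lock';
    // Spin until we acquire the lock; the 1-second TTL guards against
    // a crashed holder wedging the key forever.
    while (!apcu_add($lockKey, 1, 1)) {
        usleep(100);
    }
    try {
        // Without the lock, another thread could interleave here,
        // between the fetch and the store, losing one of the writes.
        $items = apcu_fetch($key) ?: [];
        if (!in_array($item, $items, true)) {
            $items[] = $item;
            apcu_store($key, $items);
        }
    } finally {
        apcu_delete($lockKey);
    }
}
```

The trade-off is that every writer to the same key now serializes on the lock, so under heavy write load this would add latency that the current best-effort approach avoids.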