forked from gluster/glusterfs
-
Notifications
You must be signed in to change notification settings - Fork 0
Bitd daemon refactoring new #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
gaurav36
wants to merge
6
commits into
master
Choose a base branch
from
bitd_daemon_refactoring_new
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Change-Id: I747e1c4eef4f147977da40eafceb9fec5d869a1c Signed-off-by: Venky Shankar <[email protected]>
This patch introduces rotational buffers aiming at the classic multiple producer and multiple consumer problem. A fixed set of buffer list is allocated during initialization, where each list consist of a list of buffers. Each buffer is an iovec pointing to a memory region of fixed allocation size. Multiple producers write data the these buffers. A buffer list starts with a single buffer (iovec) and allocates more when required (although this can be preallocatd in multiples of k). rot-buffs allow multiple producers to write data parallely with a bit of extra cost of taking locks. Therefore, it's much suited for large writes. Multiple producers are allowed to write in the buffer parallely by "reserving" write space for selected number of bytes and returning pointer to the start of the reserved area. The write size is selected by the producer before it starts the write (which is often known). Therefore, the write itself need not be serialized -- just the space reservation needs to be done safely. The other part is when a consumer kicks in to consume what has been produced. At this point, a buffer list switch is performed. The "current" buffer list pointer is safely pointed to the next available buffer list. New writes are now directed to the just switched buffer list (the old buffer list is now considered out of rotation). Note that the old buffer still may have producers in progress (pending writes), so the consumer has to wait till the writers are drained. Currently this is the slow path for producers (write completion) and needs to be improved. Currently, there is special handling for cases where the number of consumers match (or exceed) the number of producers, which could result in writer starvation. In this scenario, when a consumers requests a buffer list for consumption, a check is performed for writer starvation and consumption is denied until at least another buffer list is ready of the producer for writes, i.e., one (or more) consumer(s) completed, thereby putting the buffer list back in rotation. [ NOTE: I've not performance tested this producer-consumer model yet but using it changelog for event notification. The list of buffers (iovecs) are directly passed to the RPC layer. Before performance tests are done, the slow path needs to be improved and some other safety checks needs to be added such as write sizes exceeding iovec allocation size. ] Change-Id: I88d235522b05ab82509aba861374a2312bff57f2 Signed-off-by: Venky Shankar <[email protected]>
This patch imports timer-wheel[1] algorithm from the linux
kernel (~/kernel/time/timer.c) with some modifications.
Timer-wheel is an efficent way to track millions of timers for
expiry. This is a variant of the simple but RAM heavy approach
of having a list (timer bucket) for every future second.
Timer-wheel categorizes every future second into a logarithmic
array of arrays. This is done by splitting the 32 bit "timeout"
value into fixed "sliced" bits, thereby each category has a
fixed size array to which buckets are assigned.
A classic split would be 8+6+6+6 (used in this patch) which
results in 256+64+64+64 == 512 buckets. Therefore, the entire
32 bit futuristic timeouts have been mapped into 512 buckets.
[
NOTE:
There are other possible splits, such as "8+8+8+8", but
this patch sticks to the widely used and tested default.
]
Therfore, the first category "holds" timers whose expiry range
is between 1..256, the next cateogry holds 257..16384, third
category 16385..1048576 and so on. When timers are added,
unless it's in the first category, timers with different
timeouts could end up in the same bucket. This means that the
timers are "partially sorted" -- sorted in their highest bits.
The expiry code walks the first array of buckets and exprires
any pending timers (1..256). Next, at time value 257, timers
in the first bucket of the second array is "cascaded" onto
the first category and timers are placed into respective
buckets according to the thier timeout values. Cascading
"brings down" the timers timeout to the coorect bucket
of their respective category. Therefore, timers are sorted
by their highest bits of the timeout value and then by the
lower bits too.
[1] https://lwn.net/Articles/152436/
Change-Id: I1219abf69290961ae9a3d483e11c107c5f49c4e3
Signed-off-by: Venky Shankar <[email protected]>
This patch introduces RPC based communication between the changelog translator and libgfchangelog. It replaces the old pathetic stream based interaction that existed earlier (due to time constraints :-/). Changelog, upon initialization starts a RPC server (rpcsvc) allowing clients to invoke a probe API as a bootup mechanism to request for event notifications. During probe, clients can choose an event filter specifying the type(s) of events they are interested in. As of now there is no way to change the event notification set once the probe RPC call is made, but that is easier to implement. The actual event notifications is done on a separate RPC session. The client (libgfchangelog) itself starts and RPC server which the changelog translator "connects back" during probe. Notifications are dispatched by a bunch of threads from the server (translator) and the client optionally orders them if ordered notifications are requried. FOPs fill in their respective event details in a buffer (rot-buffs to be particular) and a bunch of threads (consumers) swap the buffers out of roatation and dispatch them via RPC. To avoid writer starvation, then number of dispatcher threads is one less than the number of buffer list in rot-buffs.x libgfchangelog becomes purely callback based -- upon event notification from the server (and re-ordering them if required) invoke a callback routine specified by the consumer. A major part of the patch is also aimed at providing backward compatibility for geo-replication, which was one of the main consumer of the stream based API. Also, this patch does not\ "turn on" event notifications for all fops, just a bunch which is currently in requirement. Another pain point is that the server does not filter events before dispatching it to the clients. That load is taken up by the client itself (although it's done at the library layer rather than making it hard on the callback implementor). This needs improvement and care needs to be taken to not load the server up with expensive filtering mechanisms. Change-Id: Ibf60a432b68f2dfa60c6f9add2bcfd37a9c41395 Signed-off-by: Venky Shankar <[email protected]>
Bitrot stub implements object versioning required for identifying signature freshness. More details about versioning is explained as a part of the "bitrot feature doc" patch. Change-Id: I2ad70d9eb109ba4a12148ab8d81336afda529ad9 Signed-off-by: Venky Shankar <[email protected]>
* Implement the skeleton of bit-rot xlator Change-Id: If33218bdc694f5f09cb7b8097c4fdb74d7a23b2d Original-Author: Raghavendra Bhat <[email protected]> Signed-off-by: Venky Shankar <[email protected]> Signed-off-by: Gaurav Kumar Garg <[email protected]> Signed-off-by: Anand nekkunti <[email protected]>
gaurav36
pushed a commit
that referenced
this pull request
Apr 13, 2015
Coverity CID 1288822 (#1 of 2) strncpy executed with a limit equal to the target array size potentially leaves the target string not null terminated. In this case the strncpy is not needed due to the snprintf with the same target buffer which follows immediately. This patch also removes the now unneeded scratch_dir argument to gf_changelog_init_history(), which is semantically correct, since scratch_dir has previously been filled into jnl->jnl_working_dir by the caller, and this is now used to fill hist_scratch_dir. Change-Id: Ib1ed3a1058e80e34191758921b49c29030d6c9db BUG: 789278 Signed-off-by: Michael Adam <[email protected]> Reviewed-on: http://review.gluster.org/10058 Reviewed-by: Kotresh HR <[email protected]> Tested-by: Gluster Build System <[email protected]> Reviewed-by: Vijay Bellur <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.