-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement TCP-like messaging #10
base: main
Are you sure you want to change the base?
Conversation
* Made all errors end in newline * Added macros for important constants
@khollbach , let me know what you think about a60dd7f -- it consists of a lot of minor cosmetic changes, but the main thing is |
Those changes look good to me! Sorry for the delay, been hard to find time during the move. I also stacked some changes on top of these -- in #13 -- mostly docs improvements. Let me know what you think of those, and maybe we can then merge everything into main (even if it's yet-untested ^^') |
Ok; merged those changes into here; hopefully everything still builds ^^' We can look at testing and merging this branch on Wednesday! |
It doesn't yet call the reliable versions of send/recv, but it works with the normal ones at least. Next step is to make this call the TCP code instead and see what breaks.
lib/tcp.c
Outdated
json_object_get(m->msg); | ||
fprintf(stderr, "(send) msg: %s\n", | ||
json_object_to_json_string(m->msg)); | ||
msg_send(m->msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@khollbach , here's what I think is going on re: duplicate self-messages (with an example at the bottom):
- Send a queued "ping" (it was placed into the send queue by the application code)
- Receive the "ping" on L136
- Process it as a data message, so send an "ACK" message, return the data message to application code (prompting "pong" to be enqueued to the send queue).
msg_recv_listener
is called again, "ping" is sent again before the next message (ACK from Step 3) is processed- Process the ACK from Step 3 accordingly (importantly, dequeuing the "ping" from the send queue) and stay in
msg_recv_listener
while
loop... - ...which now sends the enqueued "pong" to itself and receives the ACK for the second "ping" (from Step 4), ignoring it and putting us back at the top of the
msg_recv_listener
while loop... - ...which sends "pong" again, but what get's processed on L136 is the first "pong."
- Process it as a data message, so send an "ACK" message, return the data message to application code (prompting the client response to be sent).
- ...but we still have things in STDIN! Next up is the second "pong", but before that, we send another "pong" before processing the second "pong" (which generates another "ACK" message).
- ...
Here's an example of these steps as seen in The Actual Data:
c12 → n1 {:echo "Please echo 17", :type "echo", :msg_id 7}
n1 → n1 {:in_reply_to 7, :type "ping", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 2, :msg_id 45}
n1 → n1 {:in_reply_to 45, :type "ACK", :seq_msg_index 2, :msg_id 46}
n1 → n1 {:in_reply_to 7, :type "ping", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 2, :msg_id 47}
n1 → n1 {:in_reply_to 45, :type "pong", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 3, :msg_id 48}
n1 → n1 {:in_reply_to 47, :type "ACK", :seq_msg_index 2, :msg_id 49}
n1 → n1 {:in_reply_to 45, :type "pong", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 3, :msg_id 50}
n1 → n1 {:in_reply_to 48, :type "ACK", :seq_msg_index 3, :msg_id 51}
n1 → n1 {:in_reply_to 45, :type "pong", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 3, :msg_id 53}
n1 → c12 {:in_reply_to 7, :type "echo_ok", :echo "Please echo 17", :msg_id 52}
n1 → n1 {:in_reply_to 45, :type "pong", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 3, :msg_id 54}
n1 → n1 {:in_reply_to 50, :type "ACK", :seq_msg_index 3, :msg_id 55}
n1 → n1 {:in_reply_to 45, :type "pong", :original_msg {"id" 274, "src" "c12", "dest" "n1", "body" {"echo" "Please echo 17", "type" "echo", "msg_id" 7}}, :seq_msg_index 3, :msg_id 56}
n1 → n1 {:in_reply_to 53, :type "ACK", :seq_msg_index 3, :msg_id 57}
n1 → n1 {:in_reply_to 54, :type "ACK", :seq_msg_index 3, :msg_id 58}
n1 → n1 {:in_reply_to 56, :type "ACK", :seq_msg_index 3, :msg_id 59}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, excellent sleuthing! It's kind of surprising that only self-messages triggered this problem... thanks for getting to the bottom of it!
As far as how to fix it... I have some ideas (timeouts!) but the details are fuzzy. Let's discuss next time we meet!
* swap out send/recv for their 'reliable' (auto-retrying) counterparts * change re-try timeout to 100ms (from 1 sec) * make conch queue maxlen longer in challenge 2 (5000, from 5)
{ Opening draft PR for tracking discussion; expect this to be heavily revised as we approach completion! }