
Programming Assignment #2 Report

How to run the code:


To run the code, first compile the two source files, node.c and coord.c. The following
commands compile them with pthread support, so that the threads work correctly, and
name the executables coord and node:
gcc -o coord -lpthread coord.c
gcc -o node -lpthread node.c
Run the coordinator first. It takes the port to listen on as its argument (./coord
<port>). Then run each node with the coordinator's IP address and that same port as its
arguments (./node <ip address> <port>). The nodes should then connect successfully to
the coordinator.
How the code works:
The nodes connect to the coordinator over TCP, and each connection runs on its own
thread. The messages the nodes send to the coordinator are stored in an array on the
coordinator. The values in this array determine what the coordinator will do
(GLOBAL_COMMIT or GLOBAL_ABORT).
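The thread-and-array arrangement described above could look roughly like the sketch below. It assumes one thread per node and a mutex around the shared array. The names votes, node_slot, handle_node and collect_votes are hypothetical and not taken from coord.c, and the incoming field stands in for whatever recv() would read from the node's socket.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NUM_NODES 2
#define MSG_LEN   32

/* Shared array of the latest message from each node (hypothetical layout). */
static char votes[NUM_NODES][MSG_LEN];
static pthread_mutex_t votes_lock = PTHREAD_MUTEX_INITIALIZER;

/* One slot per node: its index and the message it "sent". In a real
 * coordinator the message would come from recv() on the node's socket. */
struct node_slot {
    int index;
    const char *incoming;
};

/* Thread body: copy this node's message into its slot of the shared array. */
static void *handle_node(void *arg) {
    const struct node_slot *slot = arg;
    assert(slot->index >= 0 && slot->index < NUM_NODES);
    pthread_mutex_lock(&votes_lock);
    strncpy(votes[slot->index], slot->incoming, MSG_LEN - 1);
    votes[slot->index][MSG_LEN - 1] = '\0';
    pthread_mutex_unlock(&votes_lock);
    return NULL;
}

/* Start one thread per node and wait for all of them to store a message.
 * Returns 0 on success, -1 if a thread could not be created. */
static int collect_votes(struct node_slot slots[NUM_NODES]) {
    pthread_t tids[NUM_NODES];
    int started = 0, rc = 0;
    for (; started < NUM_NODES; started++) {
        if (pthread_create(&tids[started], NULL, handle_node, &slots[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

Calling collect_votes with one slot holding "VOTE_COMMIT" and another holding "VOTE_ABORT" leaves those two strings in votes[0] and votes[1], ready for the coordinator to inspect.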
When the two nodes connect to the coordinator, the coordinator first sends each of them
a VOTE_REQUEST. Each node responds with either VOTE_COMMIT or VOTE_ABORT. If both
nodes commit locally, the coordinator does a GLOBAL_COMMIT. If either node, or both,
does a local abort, the coordinator does a GLOBAL_ABORT.
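The decision rule above fits in one small function. This is a minimal sketch, assuming the votes are kept as strings in an array. The name decide_outcome is hypothetical and not taken from coord.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns "GLOBAL_COMMIT" only if every vote is "VOTE_COMMIT".
 * Any other vote, including a missing one, leads to "GLOBAL_ABORT". */
static const char *decide_outcome(const char *const votes[], size_t n) {
    assert(votes != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (votes[i] == NULL || strcmp(votes[i], "VOTE_COMMIT") != 0)
            return "GLOBAL_ABORT";
    }
    return "GLOBAL_COMMIT";
}
```

With two nodes, the function returns GLOBAL_COMMIT only for the pair (VOTE_COMMIT, VOTE_COMMIT). The other three combinations all yield GLOBAL_ABORT.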
When one node answers the coordinator, the coordinator waits for the other node to
reply. If the other node takes too long, the coordinator times out and does a
GLOBAL_ABORT. Unfortunately, we did not have time to fully implement this part of the
code. Our timeout is buggy and works only some of the time, and we have not had enough
time to fully debug this portion of the project.
If a node sends an empty string back to the coordinator, that node is disconnected
from the coordinator. This is one way we can demonstrate a timeout-like effect.
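One possible way to make the coordinator's wait reliable is to call select() before reading, so that a silent node cannot block the coordinator forever. The sketch below is not our implementation. The name read_vote_with_timeout and the VOTE_TIMED_OUT code are hypothetical. It also assumes that the empty message mentioned above corresponds to read() returning 0, which on a TCP socket means the node closed its end.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define VOTE_TIMED_OUT (-2)  /* hypothetical return code for "no reply in time" */

/* Waits up to timeout_ms for data on fd, then reads one message into buf.
 * Returns the number of bytes read (> 0), 0 if the peer closed the
 * connection, -1 on error, or VOTE_TIMED_OUT if nothing arrived in time. */
static ssize_t read_vote_with_timeout(int fd, char *buf, size_t len, long timeout_ms) {
    assert(fd >= 0 && fd < FD_SETSIZE && buf != NULL && len > 1 && timeout_ms >= 0);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready < 0)
        return -1;              /* select() failed */
    if (ready == 0)
        return VOTE_TIMED_OUT;  /* the node stayed silent */

    ssize_t n = read(fd, buf, len - 1);
    if (n >= 0)
        buf[n] = '\0';          /* keep buf a valid C string */
    return n;
}
```

A coordinator using this helper would treat VOTE_TIMED_OUT, 0 and -1 all as reasons to issue a GLOBAL_ABORT, and only count a vote when the return value is positive.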
We weren't able to fully complete the last two test cases, but if we had had enough time
to complete them, these are the steps we would have taken.
Test Case 4: I would start a timer when the coordinator and the nodes connect. I
would add a segment of code that lets a node time out if it has finished its previous
Global Commit/Abort and has not received the next Vote Request. The node would then
send an Abort to the coordinator, which in turn multicasts a Global Abort. This test
case shows that such a situation usually indicates one of two things: either the
coordinator is taking too long to recover or to execute its operation from the previous
vote, or it has crashed completely, in which case an election algorithm would come into
play to choose a new coordinator from the pool of nodes.
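The node behavior planned for Test Case 4 can be sketched as a small state machine. This is only an illustration of the plan, not code from node.c. The enum names and the node_step function are hypothetical, and the messages are modeled as enum constants rather than strings to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

enum node_state { NODE_INIT, NODE_READY, NODE_COMMITTED, NODE_ABORTED };
enum node_event { EV_VOTE_REQUEST, EV_TIMEOUT, EV_GLOBAL_COMMIT, EV_GLOBAL_ABORT };
enum node_msg   { MSG_NONE, VOTE_COMMIT, VOTE_ABORT };

/* Advances a node by one event. *reply is set to the message the node
 * should send to the coordinator, or MSG_NONE if it sends nothing. */
static enum node_state node_step(enum node_state state, enum node_event ev,
                                 int can_commit, enum node_msg *reply) {
    assert(reply != NULL);
    *reply = MSG_NONE;

    switch (state) {
    case NODE_INIT:
        if (ev == EV_TIMEOUT) {
            /* No Vote Request arrived in time: abort on our own. */
            *reply = VOTE_ABORT;
            return NODE_ABORTED;
        }
        if (ev == EV_VOTE_REQUEST) {
            *reply = can_commit ? VOTE_COMMIT : VOTE_ABORT;
            return can_commit ? NODE_READY : NODE_ABORTED;
        }
        if (ev == EV_GLOBAL_ABORT)
            return NODE_ABORTED;
        break;
    case NODE_READY:
        if (ev == EV_GLOBAL_COMMIT)
            return NODE_COMMITTED;
        if (ev == EV_GLOBAL_ABORT)
            return NODE_ABORTED;
        break;
    default:
        break;  /* COMMITTED and ABORTED are final in this sketch */
    }
    return state;
}
```

A node that is still in NODE_INIT when its timer fires replies with VOTE_ABORT and moves to NODE_ABORTED, which is the step that would prompt the coordinator to multicast a Global Abort.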
Test Case 5: In this case I would again start a timer when the nodes connect to the
coordinator. I would send a Vote Request and receive Commit messages from the nodes.
I would then add a test mode that tricks the coordinator into sending the Global
Commit to only one participant. After a certain time, the other participant would ask
its fellow participants whether they received a decision. If one did, it would pass the
decision along; if none did, the participant would abort. This case shows that the
coordinator sometimes runs slowly and sends Global Commit/Abort messages late, so
participants that notice a long wait since they sent their vote ask their fellow
participants whether the coordinator has gotten back to them. This lets them check that
the coordinator hasn't crashed and allows the system as a whole to function correctly
even when errors or latency occur.
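The recovery step planned for Test Case 5 could be sketched as follows, assuming the waiting participant has already collected one answer from each peer. The names decision, DECISION_UNKNOWN and resolve_after_timeout are hypothetical. The sketch follows the rule as stated in this report, where a participant aborts if no peer knows the decision. Textbook two-phase commit instead has the participant keep waiting in that situation.

```c
#include <assert.h>
#include <stddef.h>

enum decision { DECISION_UNKNOWN, GLOBAL_COMMIT, GLOBAL_ABORT };

/* peer_answers[i] is what peer i reports having received from the
 * coordinator. Returns the first known decision, or GLOBAL_ABORT if
 * no peer has heard anything (the rule described in this report). */
static enum decision resolve_after_timeout(const enum decision peer_answers[], size_t n) {
    assert(peer_answers != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (peer_answers[i] != DECISION_UNKNOWN)
            return peer_answers[i];  /* a peer passes the decision along */
    }
    return GLOBAL_ABORT;             /* nobody knows: abort */
}
```

If one peer reports GLOBAL_COMMIT, the waiting participant adopts it. If every peer reports DECISION_UNKNOWN, the function returns GLOBAL_ABORT.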
