CS 475, Fall 2019, Homework 4, Due November 18, 9am
Start by downloading the handout:
Please post questions on Piazza
The purpose of this assignment is to introduce the concepts of replication and transactions. For this project, you’ll be enhancing your key value server so that keys and values are replicated to client-side caches. We’ll still have a single server, which will be in charge of maintaining locks and ensuring that new writes are replicated to all replicas. Our architecture will now look something like this:
Adding all of these replicas can increase the performance of our system when reading many large keys, since each client will always have every key’s value cached. When clients read keys, they will not need to read them from the server: they can directly read them from their local cache, and we will use a write-invalidate protocol to update those caches when a value is updated. We will be maintaining our traditional notion of consistency: each client will always read the most recent version of each file. This means that the master will need to have some protocol to ensure that a write deemed “successful” at the server level has indeed succeeded on every active client.
Your primary interface for debugging your KVStore will be the same shell driver that we provided for HW3. We have provided stubs for all of the various commands you will need to implement, with a wrapper so that you can interact with VCFS interactively. When you run the compiled jar file, you’ll get a command prompt, like this (again, note that there are fewer commands than previously):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
cs475sh help
AVAILABLE COMMANDS
Built-In Commands
clear: Clear the shell screen.
exit, quit: Exit the shell.
help: Display help about available commands.
script: Read and execute commands from a file.
stacktrace: Display the full stacktrace of the last error.
Command
get: Get a key's value
list-directory: List a directory
list-keys: List all keys
put-all: Update all keys in a directory to have the given content
set: Set a key's value
|
We have also provided a baseline suite of unit tests, which you can execute by calling mvn test
Summary of changes compared to HW3:
- All keys and values are now replicated to all clients with two-phase commit (read operations won’t require any network traffic)
- Removed operations (hooray): getAll, remove, removeDirectory
- While we’ll still use the notion of directories, you will no longer maintain any data structure to keep track of them. Instead, to implement listDirectory and putAll, your client will simply iterate over all of the keys, and find which keys match the same prefix (so these operations will appear to behave the same). The key here is that set will now be guaranteed to touch only a single key at a time (in past HW, calling set might result in creating parent directories too).
Here is a graphical summary showing how reads and writes will now work: when a client (replica) connects, the entire KVStore is copied to the client. Then, the client can read values directly (without going to the server). When a client wants to update a key, that update gets propagated to the master and to each replica.
General requirements:
You must use exclusively reentrant locks (e.g. synchronized or ReentrantReadWriteLock), with the single exception of lockKey and unlockKey, which must use StampedLocks.
Your KV Store will be compiled and tested using apache maven, which automatically executes the provided JUnit tests. Please install Maven on your computer. Unfortunately, Maven is not installed on Zeus, however you can download the most recent version (e.g. apache-maven-3.6.1-bin.zip) and unzip it in your home directory. Then, to run maven, type ~/apache-maven-3.6.1/bin/mvn in the snippets below (instead of just mvn). Note that you can easily import maven projects into eclipse and IntelliJ (we suggest IntelliJ over eclipse).
To compile and run your server, independently of the tests, run mvn -DskipTests package in the top level directory of the handout and then, in the server directory run java -jar target/kvstore-server-2019.3.4-SNAPSHOT.jar <portnumber> to start the server, and to start the client, in the client directory run java -jar target/kvstore-client-2019.3.4-SNAPSHOT.jar <portnumber>. You can specify any free port number on your computer over 1024; we have hardcoded the client to assume that the client runs on the same computer as the server. You’ll notice that the text-mode interface we’ve provided for you has a handy help command. To build the jar file without running the tests, run mvn -DskipTests package.
Your KV store will be automatically graded for correctness (note that there will be a manual grading phase to check hard-to-automatically-catch concurrency issues). Included with your handout is all of the automated tests that we will use to test your assignment. Upon submitting your assignment, our server will automatically compile and test your assignment and provide you with test results. We will also for this assignment use a state-of-the-art race detector to check for races in your program – this will run automatically in Autolab. You can resubmit 50 times before the deadline without penalty. To run these tests, simply execute mvn test (of course, if you do this first off, you’ll see that they all fail!)
Note: Your code must compile and run on the autograder, under Java 8. It is unlikely that you will have any difficulties developing on a Mac, Linux or Windows, but please keep in mind the possibility of portability problems. When you feel satisfied with implementing one phase of the assignment, submit to AutoLab and verify that AutoLab agrees.
General coding/grading requirements:
- You must use exclusively reentrant locks (e.g. synchronized or ReentrantReadWriteLock)
- You should feel free to add whatever additional classes you wish, or any additional methods to the existing edu.gmu.cs475.KeyValueServer and edu.gmu.cs475.KeyValueClient. You must not modify the edu.gmu.cs475.IKeyValueServer interface, the edu.gmu.cs475.AbstractKeyValueClient, any of the tests, or any of the internal classes.
- Your code should be thread-safe: concurrent calls to any of these methods (or any other method in IKeyValueServer) should not have any races. It should now be clearer how this can occur — you will potentially have multiple clients attempting to interact with the server simultaneously. It is OK for you to use coarse-grained synchronization here, and we will not test for concurrent operations on your KVStore (unlike in HW2, where this was the primary focus).
- You must not store any state in static fields
- All concurrency-related grading will account for a total 0f 10% of your grade (see Part 5). We will only consider concurrency-concerns for the first 4 parts to the extent that they are preventing your assignment from passing the given tests under normal circumstances.
Part 1: Client-side cache (35%)
For the very first part, you’ll configure the client and server code so that when they connect, the clients register themselves with the server, and the server provides the clients with a set of all of the keys and their values. This will form the initial cache on the client. Then, whenever the server receives a set request, it will forward that request to all registered clients, who will in turn update their caches. When a client is done, it will notify the server that it’s disconnecting, which will allow the server to stop sending updates to it.
Implement your server in the server project, by implementing the empty methods in edu.gmu.cs475.KeyValueServer, and your client in the client project, by implementing the empty methods in edu.gmu.cs475.KeyValueClient. You should feel free to reuse the code you had from HW3, or write something different (you’ll notice that the API changed slightly).
We’ve provided you a basic client that will automatically call the server’s registerClient method, and which will perform all read operations from its local cache. Our basic implementation of the server side registerClient will return the map to the client.
For part 1, your server’s set should: (1) take out a write-lock on the file, (2) call innerWriteKey on each cache client, passing transaction ID 0, and (3) then update the file locally. There is no need to implement heartbeats like in HW3 – lock and unlock can just use straightforward StampedLocks. Similarly, lockKey and unLockKey can assume that they are only ever used for write locks.
For part 1, in the server, you’ll implement:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
|
/**
* Sets a key to be the given value
*
* Creates a transaction (for part 1, always uses 0 as transaction id), then locks the key, then calls innerWriteKey on each replica.
* If all replicas succeed, commits the transaction and sets the key locally. Otherwise, aborts the transaction and throws an IOException.
*
* @param key key to set
* @param value value to store
* @throws NullPointerException if key or value is null
* @throws IOException if write fails to be replicated
*/
public void set(String key, String value) throws IOException;
/**
* Locks (for writing) the given key and returns the stamp
*
* @param name key
* @return stamp for lock
*/
@Override
public long lockKey(String name);
/**
* Unlocks the specified key, given the specified stamp
*
* @param name
* @param stamp
*/
@Override
public void unLockKey(String name, long stamp);
/**
* Registers that a client is joining the server. Performs whatever bookkeeping is necessary, and returns a copy
* of the current key/values set
*
* @param hostname the hostname of the replica talking to you (passed again at disconnect)
* @param portNumber the port number of the replica talking to you (passed again at disconnect)
* @param replica The RMI object to use to signal to the replica
* @return
*/
@Override
public HashMap registerClient(String hostname, int portNumber, IKeyValueReplica replica);
/**
* Notifies the server that a cache client is shutting down (and hence no longer will be involved in writes)
*
* @param hostname The hostname of the client that is disconnecting (same hostname specified when it registered)
* @param portNumber The port number of the client that is disconnecting (same port number specified when it registered)
*/
@Override
public void cacheDisconnect(String hostname, int portNumber);
|
For part 1 in the client, you’ll implement:
1
2
3
4
5
6
7
8
9
10
|
/**
* Write a key in our *local* map
*
* @param key Key to set
* @param content String representing the content desired
* @param xid Transaction ID, if this write is associated with any transaction, or 0 if it is not associated with a transaction
* If it is associated with a transaction, then this write must not be visible until the replicant receives a commit message for the associated transaction ID; if it is aborted, then it is discarded.
* @return true always, indicating that your client succeeded in making the update. Your server, however, must not assume that all clients return true (perhaps some other clients will return false if they have some error)
*/
public boolean innerWriteKey(String key, String content, long xid);
|
Hint: To prevent replicas from joining or departing during a write (but still allowing concurrent writes to different files), consider using a ReentrantReadWriteLock to guard your list of replicas. Code that is reading the list of replicas (e.g. the set method, and in part 3, setInTransaction) would need to acquire a read lock, while code that is changing the list of replicas (e.g. when registering or departing) would require a write lock.
When you are ready to check your work, you should run just the tests in the test class edu.gmu.cs475.P1Tests. To do so from maven, you should run mvn -Dtest=P1Tests test.
Precise grading breakdown:
- Automated functional tests (
edu.gmu.cs475.P1Tests): 32 points
- 8 JUnit tests, 4 points each
- Manual feedback: 3 points
Part 2: Server-initiated transactions on write (35%)
Next, you will implement a simple two-phase-commit protocol. The motivation for this is that our implementation so far does not guarantee that each client will always see the most recent file. In particular: consider the case where clients C1, C2 and C3 are connected to the server. Client C1 updates a key Foo by telling the server. The server sends the update to C1 (OK) and C2 (OK). When it tries to update C3, it is unable to contact C3 (perhaps the network is being really slow temporarily). At this point: what should we do? If C3 is crashed, then this is probably OK: but if C3 might show up again later, then for the period of time that C3 is out of communication, it has the wrong version of file foo!
We’re going to play it safe: the server will first try to reach all of the clients and tell them that they should get ready to do the update. Then, after all clients says “Yes, I’m ready to update that key” (by returning the value true, from the server’s call to the client’s innerWriteKey), the server will send a commit message (calling the method commitTransaction on each replica), which tells each client that it should perform the commit. If any one client is not able to do the update (by voting no, responding false), then the server will abort the update, canceling the change and returning an error to the original client that wanted to perform the update. Note that although all of the clients that you write will always vote yes (returning true), your server must assume that some clients may vote no. Note also that clients might throw a RemoteException instead of a vote! (which would be a vote not to commit)
To implement this portion, you’ll need to (1) update set on the server to generate a new transaction ID for each time that set is called (any number is fine as long as it doesn’t repeat) and pass that ID to each client’s innerWriteKey, (2) adapt innerWriteKey so that it stores transaction writes into a separate cache, and (3) implement commit and abort on the client to apply or abort that transaction. If every innerWriteKey successfully returns (no exception and returns true), your server should call commit on each client; if not, it should call abort on each client.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
|
/**
* Commit a transaction, making any pending writes that are part of this transaction immediately visible.
*
* If the specified transaction doesn't exist then does nothing
*
* @param id transaction id
*/
@Override
public void commitTransaction(long id);
/**
* Abort a transaction, discarding any pending writes that are associated with it
*
* If the specified transaction doesn't exist then does nothing
*
* @param id transaction id
*/
@Override
public void abortTransaction(long id);
|
We will grade your client and server side implementation separately. We have provided a set of tests, edu.gmu.cs475.P2Tests, which automatically mock a correctly functioning server — running your client against this fake server. To run this from maven, you should run mvn -Dtest=P2Tests test.
Precise grading breakdown:
- Automated functional tests (
edu.gmu.cs475.P2Tests): 32 points
- 4 JUnit tests, 8 points each
- Manual feedback: 3 points
Part 3: Client-initiated transactions on putAll (20%)
Finally, you’ll extend the notion of transactions, allowing clients to define themselves when a transaction will start. In particular, you’ll configure the client so that before a putAll starts, it creates a transaction. This way, even if writing to a single key on a single cache replica fails, the entire putAll can be aborted, and the invariant that putAll is atomic is preserved.
putAll should, in this order:- Acquire locks on all keys being set
- Start a transaction ( startNewTransaction)
- Update each key by telling the server to, passing that transaction ID ( setInTransaction)
- If all writes succeeded, then commit the transaction ( issueCommitTransaction), else, abort it ( issueAbortTransaction)
- Release all locks (regardless of exceptions that may have occurred in the above)
Note 11/6: Do NOT return “0” as an ID from startNewTransaction, since 0 is the special
The spec for all of the remaining server methods:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
|
/**
* Request a new transaction ID to represent a new, client-managed transaction
*
* @return Server-provided ID that will be used in the future to commit or abort this transaction
*/
@Override
public long startNewTransaction();
/**
* Sets a key to be the given value
*
* Does NOT create a transaction, but instead just calls innerWriteKey on each replica. No replica can join or depart
* during a call to setInTransaction.
* If all replicas succeed, returns true, otherwise returns false.
*
* @param key key to set
* @param value value to store
* @throws NullPointerException if key or value is null
*/
@Override
public boolean setInTransaction(String key, String value, long xid);
/**
* Broadcast to all replicas (and make updates locally as necessary on the server) that a transaction should be committed
*
* You must not allow a client to register or depart during a commit.
*
* @param xid transaction ID to be committed (from startNewTransaction)
* @throws RemoteException if any RemoteException occurs in the process of committing
*/
@Override
public void issueCommitTransaction(long xid) throws RemoteException;
/**
* Broadcast to all replicas (and make updates locally as necessary on the server) that a transaction should be aborted
*
* You must not allow a client to register or depart during a commit.
*
* @param xid transaction ID to be committed (from startNewTransaction)
* @throws RemoteException if any RemoteException occurs in the process of committing
*/
@Override
public void issueAbortTransaction(long xid) throws RemoteException;
|
And, the spec for all of the remaining client methods:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
|
/**
/**
* Sets the content for all keys that are directly contained by the given directory.
* Must acquire locks on each key such that each key being set can not be modified by another concurrent
* putAll.
*
* Given two concurrent calls to putAll, it will be indeterminite
* which call happens first and which happens last. But what you can (and
* must) guarantee is that all keys affected will have the *same* value (and not
* some the result of the first, and some the result of the second). Your
* code should not deadlock while waiting to acquire locks.
*
* Your implementation must follow exactly this scheme:
* 1. Acquire all locks
* 2. Start a transaction
* 3. Write all values
* 4. Commit the transaction if all writes succeeded
* 5. Release all locks
*
* @param directory Directory to do updates within
* @param content The content to write out to each key
* @throws IllegalArgumentException if the key does not represent a directory (does not end in a /)
* @throws IllegalArgumentException if the key does not start with a /
* @throws RemoteException if any RemoteException occurs in the underlying RMI operation
*/
public void putAll(String directory, String content) throws RemoteException, IllegalArgumentException;
|
Again, we will grade your client and server side implementation separately. We have provided a set of tests, edu.gmu.cs475.P3Tests, which automatically mock a correctly functioning client — running your server against this fake client and simulating error conditions. To run this from maven, you should run mvn -Dtest=P3Tests test.
Precise grading breakdown:
- Automated functional tests (
edu.gmu.cs475.P3Tests): 15 points
- 5 JUnit tests, 3 points each
- Manual feedback: 5 points
Part 4: Concurrency (10%)
To receive a top score on this assignment, you will also need to be sure that your code has no races. For this assignment, we will be using the tool, RV-Predict to detect races that may occur while running your tests. RV-Predict will give you precise feedback on the races that it detects, for instance:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
Data race on field account.Account.Bank_Total: {{{
Concurrent write in thread T12 (locks held: {})
---- at account.Account.Service(Account.java:98)
at account.BankAccount.Action(BankAccount.java:41)
at account.BankAccount.run(BankAccount.java:56)
T12 is created by T1
at account.Account.go(Account.java:46)
Concurrent read in thread T13 (locks held: {})
---- at account.Account.Service(Account.java:98)
at account.BankAccount.Action(BankAccount.java:41)
at account.BankAccount.run(BankAccount.java:56)
T13 is created by T1
at account.Account.go(Account.java:46)
}}}
|
AutoLab will automatically run RV-Predict on all of your submissions. You can run it on your own computer by downloading and installing it (it’s free for non-commercial use). When you run your tests with maven, use the command mvn -Drvpredict=/path/to/rv-predict.jar test (on Mac this would be mvn -Drvpredict=/Applications/RV-Predict/Java/lib/rv-predict.jar test).
We will not give you a direct equation to correlate from # of reports from RV-Predict -> a grade on this section. We will manually award up to 10 points for concurrency correctness based on no apparent races and no over-synchronization (again, one way to avoid races could be to force every operation to be serial; this would not be ideal or correct). Moreover, note that RV-Predict will find many races, but will not find all races, which we might by hand! Note also that if you make wide use of stamped locks (instead of reentrant locks or synchronized blocks), RV-Predict will report that races are possible because it can not analyze stamped locks.
Grading
Your assignment will be graded on a series of functional tests, making sure that it implements the specification above.
Hand In Instructions
You must turn in your assignment using Autolab (You MUST be on the campus network, or connected to the GMU VPN to connect to Autolab). If you did not receive a confirmation email from Autolab to set a password, enter your @gmu.edu (NOT @masonlive) email, and click “forgot password” to get a new password.
Create a zip file of the root directory in your assignment (please: .zip, not .tgz or .7z etc) — this is the root directory that includes the client, shared, server directories. When you upload your assignment, Autolab will automatically compile and test it. You should verify that the result that Autolab generates is what you expect. Your code is built and tested in a Linux VM. Assignments that do not compile using our build script will receive a maximum of 50%. Note that we have provided ample resources for you to verify that our view of your assignment is the same as your own: you will see the result of the compilation and test execution for your assignment when you submit it.
You can resubmit your assignment up to 50 times before the deadline. Note the course late-submission policy: assignments will be accepted up until 24 hours past the deadline at a penalty of 10%; after 24 hours, no late assignments will be accepted, no exceptions.
Note – You MUST be on the campus network, or connected to GMU VPN to connect to Autolab.
Decoding the output:
Note, AutoLab will run your code on the tests twice: once without RV-Predict (these are the scores used for parts 1-3), and once with RV-Predict (this is informational only). The outcomes should be the same with or without RV-Predict, but we wanted to make 100% sure that adding the tool doesn’t break your otherwise seemingly functioning code.
Questions
Please post questions on Piazza