How Scalaris stores your data
After installing Scalaris, running a few nodes, and playing with the API, I wanted to figure out how it actually stores the data. Using the diagram as a map, I tracked down the source (the db was the obvious clue).
The database is a gen_server that wraps calls to the underlying storage: an Erlang gb_tree (the standard gb_trees module). This means the data is not saved to disk; it lives in memory. It appears that each chordnode has its own database.
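To make the "gen_server wrapping a gb_tree" idea concrete, here is a minimal sketch of that pattern. It is not the Scalaris code: the module name mem_db and the functions store/3 and fetch/2 are made up for illustration. Only the gen_server and gb_trees calls are real OTP APIs.

```erlang
%% Hypothetical module, NOT the Scalaris database: a gen_server whose
%% state is a gb_trees tree held in process memory.
-module(mem_db).
-behaviour(gen_server).

-export([start_link/0, store/3, fetch/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Client API (names are assumptions for this sketch).
store(Pid, Key, Value) ->
    gen_server:call(Pid, {store, Key, Value}).

fetch(Pid, Key) ->
    gen_server:call(Pid, {fetch, Key}).

%% The server state is just an empty tree to begin with.
init([]) ->
    {ok, gb_trees:empty()}.

handle_call({store, Key, Value}, _From, Tree) ->
    %% enter/3 inserts the key or overwrites an existing one.
    {reply, ok, gb_trees:enter(Key, Value, Tree)};
handle_call({fetch, Key}, _From, Tree) ->
    Reply = case gb_trees:lookup(Key, Tree) of
                {value, Value} -> {ok, Value};
                none -> not_found
            end,
    {reply, Reply, Tree}.

handle_cast(_Msg, Tree) ->
    {noreply, Tree}.
```

Because the tree only exists as the process state, stopping the process (or the node) discards everything in it, which is the "lives in memory" point above.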
When you store a Key - Value pair, the database actually records a structure like this in the gb_tree:
Key,{Value,WriteLock,ReadLock,Version}
On a write, “WriteLock” is set to true and the Version is incremented. On a read, “ReadLock” is set to true. Most of this logic appears to be controlled through the transaction layer.
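Here is a small sketch of that record layout, taken as a literal reading of the description above. The module and function names are hypothetical, starting a new key at version 1 is my assumption, and the sketch never releases a lock because I haven't traced where the transaction layer does that.

```erlang
%% Hypothetical sketch of the {Value, WriteLock, ReadLock, Version}
%% entry described above. Not the Scalaris implementation.
-module(db_entry_sketch).
-export([new/0, write/3, read/2]).

new() ->
    gb_trees:empty().

%% Write: set WriteLock to true and increment Version.
%% Assumption: a key that does not exist yet starts at version 1.
write(Key, Value, Tree) ->
    Entry = case gb_trees:lookup(Key, Tree) of
                {value, {_OldValue, _WriteLock, ReadLock, Version}} ->
                    {Value, true, ReadLock, Version + 1};
                none ->
                    {Value, true, false, 1}
            end,
    gb_trees:enter(Key, Entry, Tree).

%% Read: set ReadLock to true and return the value, its version,
%% and the updated tree.
read(Key, Tree) ->
    case gb_trees:lookup(Key, Tree) of
        {value, {Value, WriteLock, _ReadLock, Version}} ->
            NewTree = gb_trees:enter(Key, {Value, WriteLock, true, Version}, Tree),
            {ok, Value, Version, NewTree};
        none ->
            not_found
    end.
```

Note that even a read returns a new tree, since setting the lock flag changes the stored entry.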
Other stuff
- Search? There appears to be the ability to search for keys within a given range via “get_range”, but I haven’t yet found how to call it from the transaction_api.
- No delete. As I mentioned in my earlier write-up on using the API, there is no way to actually “delete” a key once it’s set. But hey, the code is open source, and since gb_tree has a delete function, it should be possible to add.
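For both points, here is roughly what the gb_trees side could look like. This is only a sketch: the module name and both function signatures are my own guesses, not the Scalaris get_range or anything exposed by the transaction_api, and it ignores locks and replication entirely.

```erlang
%% Hypothetical helpers operating directly on a gb_trees tree.
-module(db_extras_sketch).
-export([get_range/3, delete/2]).

%% Return every {Key, Entry} pair with From =< Key =< To, in key order.
%% to_list/1 walks the whole tree, so this costs O(n) however narrow
%% the range is.
get_range(From, To, Tree) ->
    [{K, E} || {K, E} <- gb_trees:to_list(Tree), K >= From, K =< To].

%% Remove Key if it is present. delete_any/2 returns the tree unchanged
%% when the key is missing, whereas delete/2 would crash.
delete(Key, Tree) ->
    gb_trees:delete_any(Key, Tree).
```

Removing the entry from one tree is the easy part. A real delete would also have to go through the transaction layer so that every node holding the key agrees it is gone.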
One other thing of note: if you’ve implemented any distributed Erlang, you know it’s not recommended to run a cluster of Erlang nodes across the Internet using the built-in distribution code. Scalaris implements its own TCP layer (the authors mention this in their videos) for the nodes to communicate. Check out the comm_layer in the source for some ideas if you need to write your own.