[flud-devel] opaque data and non-auditability
alenlpeacock at gmail.com
Mon Jun 18 21:36:19 PDT 2007
It's time to finalize this.
It would indeed take very little extra overhead to move fs-specific
metadata out of the DHT and into the storage layer, and after more
reflection, I can't think of any reason not to proceed. This is the
right thing to do.
Just for the record, this is how it will work:
As before, the STOR primitive will be used to write chunks of encoded
files to other nodes. In addition, it will now also store an encoded
chunk of the file metadata. This will be accomplished by changing the
STOR primitive to simply upload two files instead of one (as supported
by the http protocol). In the case of aggregate STOR ops, the upload
of a single tarball that contains many files will still occur, but
each file in the tarball will have an encoded chunk following it.
(The protocol doesn't dictate how the node stores these two bits of
info, but it is likely that, at least initially, they will be stored
in a single file as a simple tar archive.)
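The single-file tar layout mentioned above might look something like this sketch. The archive member names (chunk.data, chunk.meta) and the helper name are illustrative assumptions, not flud's actual on-disk format:

```python
import io
import tarfile

def build_stor_archive(data_chunk: bytes, metadata_chunk: bytes) -> bytes:
    """Bundle an encoded data chunk and its encoded fs-metadata chunk
    into one tar archive, mirroring how a storing node might keep the
    two pieces in a single file on disk."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in (("chunk.data", data_chunk),
                              ("chunk.meta", metadata_chunk)):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()
```

Retrieval on the storing node is then just the reverse tar extraction, so the protocol's "two bits of info" stay one unit of local storage.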
RETRIEVE will also be modified to use multipart/related http, so that
a single request can return both the data chunk and its fs-metadata chunk.
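As a rough illustration of the multipart/related shape of that response, the stdlib email machinery can build one message carrying both parts. The content types and helper name here are assumptions, not flud's wire format:

```python
from email.message import EmailMessage

def build_retrieve_response(data_chunk: bytes, metadata_chunk: bytes) -> bytes:
    """Pack the data chunk and its fs-metadata chunk into a single
    multipart/related body, so one RETRIEVE reply carries both."""
    msg = EmailMessage()
    # First part: the encoded file data.
    msg.set_content(data_chunk, maintype="application",
                    subtype="octet-stream")
    # Second part: the encoded fs-metadata; this call converts the
    # message to multipart/related.
    msg.add_related(metadata_chunk, maintype="application",
                    subtype="octet-stream")
    return msg.as_bytes()
```

The requesting node splits the reply back into its two parts and feeds each to the decoder separately.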
fs-metadata will be encoded using the exact same scheme as the data
(currently 20 data + 20 parity blocks). As before, file metadata will
be encrypted with the node's public key before being encoded and
stored. Each metadata chunk will be associated with the file data and
with the originating node's ID on the server transparently. The only
other transparent data stored will be the encode scheme, i.e., 5/20/20
to indicate that this chunk is the 5th of 40 blocks, 20 of which are
data and 20 of which are parity. Also stored will be a short
identifier that ties this chunk to its other chunks (still thinking
about this one -- could use a subsequence of the hash of the file CAS
key, or, if we want this info to be opaque, could concatenate the file
CAS key with the chunk sequence number and then encrypt this with the
node's public key).
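To make the transparent pieces concrete, here is a sketch of parsing the 5/20/20 encode tag together with the non-opaque identifier option (a subsequence of the hash of the file's CAS key). The 8-byte truncation length and the names are assumptions for illustration:

```python
import hashlib
from typing import NamedTuple

class EncodeTag(NamedTuple):
    index: int   # position of this chunk among the blocks (e.g. 5)
    k: int       # number of data blocks (e.g. 20)
    m: int       # number of parity blocks (e.g. 20)

def parse_encode_tag(tag: str) -> EncodeTag:
    """Parse the transparent 'index/data/parity' tag, e.g. '5/20/20'."""
    index, k, m = (int(part) for part in tag.split("/"))
    return EncodeTag(index, k, m)

def group_id(cas_key: bytes, length: int = 8) -> str:
    """Non-opaque variant of the chunk-group identifier: a subsequence
    of the hash of the file's CAS key, shared by all chunks of the same
    file so a recovering node can gather them together."""
    return hashlib.sha256(cas_key).hexdigest()[:2 * length]
```

The opaque variant (CAS key concatenated with the sequence number, then encrypted under the node's public key) would replace group_id with a ciphertext only the owner can map back to a file.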
This scheme will allow a node to recover its data even in the unhappy
event that it loses its master metadata record. To recover in this
scenario, the node simply queries other nodes for any data they might
have stored on its behalf, downloads chunks as they are found, and
starts reconstructing files as enough chunks become available. (Since
nodes should tend to limit themselves to a smallish subset of storage
trading partners, they shouldn't need to find too many different nodes
that have stored *some* data before being able to recover most data.)
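The recovery loop described above can be sketched roughly as follows: poll each trading partner for chunks stored under our node ID, bucket them by group identifier, and reconstruct a file once any 20 of its 40 blocks are in hand. The partner interface and the decode callable are hypothetical placeholders, not flud APIs:

```python
from collections import defaultdict

K_DATA = 20  # erasure coding: any K_DATA of the 40 blocks suffice

def recover(partners, node_id, decode):
    """partners: objects whose chunks_for(node_id) yields
    (group_id, index, block) tuples for chunks held on our behalf;
    decode: erasure decoder taking {index: block} once K_DATA blocks
    of one group are present. Returns {group_id: reconstructed file}."""
    found = defaultdict(dict)
    recovered = {}
    for partner in partners:
        for group, index, block in partner.chunks_for(node_id):
            if group in recovered:
                continue  # this file is already reconstructed
            found[group][index] = block
            if len(found[group]) >= K_DATA:
                recovered[group] = decode(found[group])
    return recovered
```

Because any 20 of the 40 blocks are enough, partial answers from a handful of partners already recover most files, which is why the smallish-partner-set assumption keeps this search tractable.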