[flud-devel] DHT justifications (was: DHT Performance? Design?)

Alen Peacock alenlpeacock at gmail.com
Wed Nov 14 09:15:44 PST 2007


I think there is a disconnect here; flud *will* support versioning,
via the method outlined in the link I sent you. And I completely agree
that this is a necessary feature (we talked about this at the
beginning of this thread:
http://flud.org/pipermail/flud-devel_flud.org/2007-October/000058.html),
but it will be implemented *outside* of the flud protocol itself, and
the metadata/storage layer will remain ignorant of whether a chunk of
data is a delta or a complete file.  Because versioning is completely
independent of the storage layer, it can be implemented at a later
date (it isn't a priority while fleshing out other vital
functionality).
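
To illustrate the layering, here is a minimal sketch in Python. It
is not flud's actual code: OpaqueStore, record_version, and the
"full"/"delta" labels are hypothetical names invented for this
example. The point is only that the store sees opaque bytes, while
the knowledge of which chunk is a delta lives in a client-side index.

```python
import hashlib


class OpaqueStore:
    """Hypothetical stand-in for the storage layer: it only sees bytes."""

    def __init__(self):
        self._chunks = {}

    def put(self, data: bytes) -> str:
        # Key each chunk by its content hash; no notion of "delta" or "full".
        key = hashlib.sha256(data).hexdigest()
        self._chunks[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._chunks[key]


def record_version(store, index, path, payload, kind):
    """Client-side bookkeeping (assumed): remember whether payload is a delta."""
    key = store.put(payload)
    index.setdefault(path, []).append((kind, key))
    return key
```

Because the index lives entirely on the client side, a versioning
scheme could be added or swapped later without touching the store.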

As a hack until flud has out-of-the-box support for delta compression,
a user who is really determined to save versions /now/ could simply do
something like http://kitenet.net/~joey/svnhome/ , just backing up
their repo at whatever interval they find appropriate.

Regarding verify ops, flud will also do this statistically to get full
coverage of all peers and a significant portion of files (because,
obviously, doing an op on each block of each file every day is
unworkable), but it will be able to sample from any portion of any
file or file version.
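
As a rough sketch of what sampled verification could look like (in
Python, under stated assumptions): pick a random peer, a random
stored block, and a random byte range, then compare digests. The
function names, the SHA-256 digest, and the 32-byte window are
assumptions for illustration, not flud's actual verify protocol, and
the challenger is assumed to have the data (or a precomputed digest)
at hand.

```python
import hashlib
import random


def pick_verify_challenge(stored, rng, length=32):
    """Pick a random (peer, block, byte range) to challenge.

    stored maps peer -> list of (block_id, block_bytes); hypothetical layout.
    """
    peer = rng.choice(sorted(stored))
    block_id, data = rng.choice(stored[peer])
    # Any portion of any block can be sampled.
    offset = rng.randrange(0, max(1, len(data) - length + 1))
    expected = hashlib.sha256(data[offset:offset + length]).hexdigest()
    return peer, block_id, offset, length, expected


def answer_challenge(data, offset, length):
    """What the storing peer would compute from the bytes it actually holds."""
    return hashlib.sha256(data[offset:offset + length]).hexdigest()
```

A peer that has discarded or corrupted the sampled range cannot
produce the expected digest, so repeated random sampling covers all
peers over time without touching every block every day.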

Regarding data decay when verify ops cease: this is no different than
what you are calling a grace period.  Nodes can't begin to discard
data until they are pretty sure that the owner no longer cares about
it (otherwise they risk failing verify ops and losing trust, and
getting their own data discarded in turn).  Additionally, nodes will
do discards probabilistically (a la Pastiche), which extends the grace
period arbitrarily.
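
A minimal sketch of how a probabilistic discard stretches the grace
period, in Python. The quiet_days threshold and the per-day
p_discard value are made-up parameters for illustration; they are
not taken from flud or from Pastiche.

```python
import random


def should_discard(days_since_last_verify, rng, quiet_days=30, p_discard=0.05):
    """Decide whether to drop a block today (hypothetical policy)."""
    if days_since_last_verify < quiet_days:
        # Owner was seen recently: never discard.
        return False
    # Past the quiet period, discard only with a small daily probability.
    return rng.random() < p_discard


def survival_probability(days_past_quiet, p_discard=0.05):
    """Chance the block is still held this many days after the quiet period."""
    return (1.0 - p_discard) ** days_past_quiet
```

With a small p_discard the data decays gradually rather than
vanishing at a hard deadline, so lowering p_discard lengthens the
effective grace period as far as you like.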

Regarding resource consumption symmetry, i.e., why #2 in
http://www.flud.org/wiki/Architecture#Versioning is important: you may
want to re-examine your assumptions about enforceability of /future/
contractual arrangements.  I think the Samsara paper has a good
discussion of this (if not, ask again and I'll explain an attack or
two).

One other invariant in flud to keep in mind: a node must provide as
many resources as it consumes.  Now, you might say that this is an
artificial requirement, that there are some computers that don't have
enough free space to offer symmetric resources but that want to back up
their data anyway -- and you would be right.  BUT, in a system that
doesn't have such a requirement, you may find yourself dealing with
the opposite problem: many nodes that want storage space, but not
enough storage space to go around.  And, it is actually very easy to
provide resources to nodes that lack them if you assume the flud
invariant, but very difficult if you don't (I can go into more detail
if you don't see how).
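
The invariant itself is simple to state in code. Here is a minimal
Python sketch; NodeLedger and its fields are hypothetical names for
this example, not flud's actual accounting.

```python
from dataclasses import dataclass


@dataclass
class NodeLedger:
    """Per-node accounting (assumed): bytes offered vs. bytes used."""

    provided_bytes: int = 0
    consumed_bytes: int = 0

    def can_store(self, nbytes: int) -> bool:
        # Invariant: a node may never consume more than it provides.
        return self.consumed_bytes + nbytes <= self.provided_bytes

    def store(self, nbytes: int) -> None:
        if not self.can_store(nbytes):
            raise ValueError("node would consume more than it provides")
        self.consumed_bytes += nbytes
```

Since every consumed byte is backed by a provided byte, total demand
in such a system can never exceed total supply.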

Alen



