[flud-devel] private flud network and flud goals

Stuart Langridge sil at kryogenix.org
Wed Sep 5 12:01:06 PDT 2007


> > ...and they know how to tunnel ports through their firewall. This is the
> > big problem; the server isn't in my design because I want
> > centralisation, it's there because the moment you mention "open port
> > 4242 on your firewall and forward it to your flud server" you have lost,
> > in my opinion. I was pretty serious about the "just start the program"
> > thing.
>
>   Yes, that's one caveat I didn't mention, and it is a big one.
> Currently, we don't do any NAT traversal or hole punching, so users
> will have to open up ports.  But I do plan on implementing such at
> some point, likely through a STUNT approach (or, failing that, by
> dropping down to UDP and using STUN/ICE, which can apparently solve
> this issue for >90% of users behind firewalls/nats today).  We could
> also investigate using UPnP (as Bittorrent and others do), but I'm
> more leery of that approach.
>
>   At any rate, this will be fixed one day, because you are absolutely
> right -- it's too much trouble for the average user.  For now, the
> type of user who would be interested in running flud will need to be
> capable of setting up port forwarding etc.
>
>   (STUN/STUNT does require semi-centralized relay servers, btw, but
> these can be distributed as part of the generic flud node codebase).

Yeah...sort of. Put it like this: at least in the early days, before
flud takes over the world, you're going to have to ensure that (a)
someone somewhere is running a STUN server and (b) there is always at
least one live node in the "global flud network". So, for all intents
and purposes, you're going to be putting up a server which everything
can fall back on: if you're a flud client then the "global flud
network" might not consist of anything other than "flud.org", but you
always know it'll be there. So, if that's the case, why not start
building that sort of support into it from the beginning?
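
To make the "always-there server" idea concrete, here's a rough Python
sketch of the bootstrap logic I have in mind. Everything in it is made
up by me for illustration: the function name, the try_connect callback,
and the fallback port (I've just reused the 4242 from the example
above). It is not flud's actual code.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]

# Hypothetical well-known node. "flud.org" is the name used in this mail;
# the port is an assumption borrowed from the firewall example.
FALLBACK_NODE: Address = ("flud.org", 4242)


def bootstrap(known_nodes: Iterable[Address],
              try_connect: Callable[[str, int], bool],
              fallback: Address = FALLBACK_NODE) -> Optional[Address]:
    """Return the first node that answers, trying the fallback last.

    try_connect is injected so the real network code (whatever it ends
    up being) stays out of this sketch.
    """
    for host, port in list(known_nodes) + [fallback]:
        if try_connect(host, port):
            return (host, port)
    return None  # nothing reachable, not even the fallback
```

The point is only that a client with an empty contact list still has
somewhere to go on first run.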

>   In order to join a flud network, you'll need to know the address of
> at least one other node in that network.  Currently, this does need to
> be entered manually.  This adds a step 3.5 to your list.
>
>   In the future, we can use something very simple like distributed
> gwebcaches to get nodes introduced to one another.

Is this gwebcache as in the gnutella thing?

>   Additionally, in the future, nodes will use discovery methods to
> find others on their local network (bonjour style) as step #1, then
> query gwebcaches for contacts as step #2.  Bonjour-style discovery
> doesn't help your use case, but it does facilitate private flud
> networks on, for example, a company LAN.
>
>   Currently, the groupID is the secret password, and although you can
> certainly use "langridge-family," it would be better to choose
> something unguessable (groupIDs are hashed into the sha256 space,
> which is unguessable, but only as unguessable as the input).
>
>   For private flud networks, I lean towards the following:
>
> 1. You install flud and issue invitations to join your private flud
> network from the GUI, which results in an email being sent.
> 2. The recipient of the email is told where they can download flud,
> and given a block of text to cut-n-paste into the GUI on first run.
> The block of text is opaque-looking, but contains at least the
> groupID, and the IP address and nodeID of the sending node.
> 3. That's it!
>
>   If the recipient's node can't connect directly to the IP address
> sent in email, it connects to the flud network at large (via
> bonjour/gwebcaches) and does a lookup for the nodeID, which should
> allow it to connect even if the IP address changes.

This makes sense. It's not quite as neat as "type this group name and
password in", and it means that there have to be specific *invites* to
the group, because the mystic block of text must be generated by a
flud client rather than being something intelligible, but it's
basically the same principle. It fails the telephone test, but that's
not a *huge* issue.
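
For what it's worth, this is roughly how I picture the block of text,
as a Python sketch. The field names, the JSON-plus-base64 encoding and
the function names are all my assumptions; the only things taken from
your description are that groupIDs are hashed with sha256 and that the
block carries the groupID, the IP address and the nodeID.

```python
import base64
import hashlib
import json


def hash_group_id(group_name: str) -> str:
    """Map a human-chosen group name into the sha256 space (hex digest)."""
    return hashlib.sha256(group_name.encode("utf-8")).hexdigest()


def make_invite(group_id: str, ip: str, node_id: str) -> str:
    """Pack the three fields into an opaque-looking, pasteable string."""
    payload = json.dumps(
        {"groupID": group_id, "ip": ip, "nodeID": node_id}, sort_keys=True
    )
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def read_invite(block: str) -> dict:
    """Unpack a pasted invite, tolerating stray whitespace from the email."""
    return json.loads(base64.urlsafe_b64decode(block.strip().encode("ascii")))
```

Note that base64 only makes the block look opaque; it hides nothing.
Anyone who sees the email can join, so the groupID still has to be
treated as the secret.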

>   This needs to be thought out a bit more thoroughly (especially bits
> about disjoint flud networks, non-transitivity, etc), but I think the
> basic scheme is workable, despite what I'm about to say about private
> flud networks in the next section, below.
>
>   Do you think that would be easy enough for most users?

It would, but it basically presupposes the existence of the global
flud network. This is why I'm inclined to suggest that there's at
least one server, run by the flud project, which can always be
assumed to be there, so that every flud client *can* assume that the
global flud network exists (even if it's just one box).

>   All storage resources in flud must be traded symmetrically.  In this
> scenario, Dr. Evil will fail to find enough storage resources unless
> Alice and Bob have decided to be very generous, or have some
> inclination for reserving vast amounts of future storage resources for
> themselves (both unlikely, because by default, flud nodes won't do
> either).  So while Alice and Bob will find plenty of offers for
> trading storage resources and back up their data without problems, Dr.
> Evil will get frustrated.

What does "symmetrically" mean? To take a more realistic example,
imagine that Alice, Bob, and Dr Evil all want to back up 2GB worth of
stuff. Does Alice have to allocate 4GB of space on her machine for
others to back up into? If Charlie now joins the network, does
everyone have to allocate another 2GB?

>   The solution is to not use private flud networks at all, but instead
> use the (not yet existent) public flud network.
>
>   This is understandably a hard argument for many to swallow, but I'll
> continue to make it: nodes are likely to find better trading partners
> in a large anonymous network than they are in a small
> private-but-trusted network.

I pretty much agree with you, except that the global flud network has
to be relatively big for this, because people will drop off it *all
the time*. There are also issues like: each flud server must have a
globally-unique and unchanging nodeID so it can be found again later.
If I reinstall my machine and reinstall flud, do I get the same nodeID
back again? If so, how? If not...then reinstalling flud means removing
yourself as a node, which means that to all intents and purposes
anyone else's backups which were stored on your node will have
vanished.
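
I don't know how flud assigns nodeIDs, so take this purely as a sketch
of the problem: if the nodeID is derived from some secret kept in a
file, then you get the same nodeID back after a reinstall only if that
file survives. The file path, the function name and the
sha256-of-a-random-secret scheme are my assumptions, not flud's design.

```python
import hashlib
import os
import secrets


def load_or_create_node_id(key_path: str) -> str:
    """Return a nodeID derived from a secret stored at key_path.

    If the file exists the same nodeID comes back every time; if it has
    been wiped (say, by a reinstall) a fresh secret and therefore a new
    nodeID is generated.
    """
    if os.path.exists(key_path):
        with open(key_path, "rb") as f:
            secret = f.read()
    else:
        secret = secrets.token_bytes(32)
        with open(key_path, "wb") as f:
            f.write(secret)
    return hashlib.sha256(secret).hexdigest()
```

Which means that, under a scheme like this, the user has to keep that
one small file somewhere safe, and that's another thing to explain to
non-technical people.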

>   As for the time it would take Dr. Evil to back up 800GB, that's an
> issue that can only be solved by fatter pipes. [*1]  Either that, or
> convince Dr. Evil that his Santa Barbara penchant is destructive to
> his soul :)

Yeah, I know. It's a problem, though, because if non-technical people
are to use this then they'll have to understand the ramifications of
ticking "back up this folder" next to their 25GB of photos. This is a
generic issue with all distributed backup systems, not just flud, but
it is an issue.
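
A back-of-the-envelope calculation shows the scale of it. This is a
throwaway Python helper of mine; the uplink speed you feed it is an
assumption about the user's connection, and it ignores any overhead
from redundancy or encoding that flud adds on top.

```python
def upload_hours(size_gb: float, upload_mbit_per_s: float) -> float:
    """Hours needed to push size_gb (decimal GB) through an uplink.

    Ignores protocol overhead and any extra data a backup system
    stores for redundancy, so real times will be longer.
    """
    bits = size_gb * 8 * 1000**3
    seconds = bits / (upload_mbit_per_s * 1000**2)
    return seconds / 3600
```

With an assumed 0.5 Mbit/s uplink, upload_hours(25, 0.5) comes out at
about 111 hours, so well over four days of saturating the connection
for that one tick-box.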

> > Yeah. This sort of thing is not that much of a problem if you're backing
> > up small stuff like config files. It's also not much of an issue for
> > large multimedia collections, since a photo or mp3 or movie doesn't
> > *change* much once it's created. Differential backup within files
> > isn't hugely important, IMO, as long as flud doesn't back up files it's
> > already got, which is already sorted.
>
>   Right.  For most files, this isn't a big deal.  But for db-type
> files, it can be pretty painful, especially as the db grows (Outlook,
> for example, stores all email in a db, which can grow to be many GBs
> in size).  That's a problem that I think we have a good fix for, but
> is not urgent.

Really? What's the fix? That could be a problem, especially for
non-Linux software, which tends to do this sort of thing more -- look
at, say, all the Mac apps, like iPhoto, which store everything in a
black-hole DB.

sil

--
New Year's Day --
everything is in blossom!
I feel about average.
  -- Kobayashi Issa



