Federated Decentralization
2019 May 7
Federation is a type of network decentralization. Let's explore types of networks. I'll be focusing on the internet, which is basically a huge network of computers (a network of networks) that connects each computer to each other computer through some number of links.
Some Terminology
Here are some words I'll be using.
node: A node[1] is a device (think computer) on a network. We'll be focusing here on different models for how nodes connect to each other.
server: From Wikipedia: "[A] server is a computer program or a device that provides functionality for other programs or devices, called 'clients'."
client: A client then is the program or device that connects to a server.
As an example of the client-server relationship, when you visit a website, your web browser on your computer acts as a client and connects to a server - a computer owned by the website host, which sends you the page.
I'll be working here at an abstract level where when I say two nodes are directly connected, I mean they have a logical connection to each other. Data sent from A may pass through other relays on the way to B (A and B need not be physically connected), but A can abstractly connect directly to B.
In a real-world example, consider delivering a letter. Alice has a physical connection with Bob if Alice can hand a letter straight to Bob. On the other hand, consider Charlie and Dave who do not cross paths often but have mutual friends. Charlie has a logical connection to Dave if Charlie can have a friend deliver the letter to Dave. In this case, the friend acts as a relay, not an end-node in this relationship, and the letter is still sent "directly" from Charlie to Dave. It will become important to this metaphor that all mail must be signed for; thus, when I say Alice sends a letter to Bob, Alice must hand the letter to someone, and the letter must be handed to Bob, so he can sign for it. (Note, Alice need not be the person who hands the letter to Bob, but the letter can only be transferred by hand.) In this metaphor, mailboxes can't exist because all mail must be hand-delivered.
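To make the idea of a logical connection concrete, here is a minimal Python sketch. It assumes a made-up table of who can hand a letter to whom (the `links` dictionary and the `relay_path` function are hypothetical names for illustration) and uses a breadth-first search to find a chain of hand-offs. If such a chain exists, the two people are logically connected even if they never meet.

```python
from collections import deque

# Hypothetical "physical" links: who can hand a letter straight to whom.
links = {
    "Charlie": ["Friend"],
    "Friend": ["Charlie", "Dave"],
    "Dave": ["Friend"],
}


def relay_path(links, source, target):
    """Breadth-first search for a shortest chain of hand-offs, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor in links.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Charlie and Dave share no direct link, but a relay connects them.
path = relay_path(links, "Charlie", "Dave")
```

Here `path` lists everyone who touches the letter. The people in the middle are relays, and only the first and last entries are the end-nodes of the conversation.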
Centralized Networks
This is mostly how we operate on the internet nowadays. There's a central server (or collection of servers, but all under the same control) to which each user connects. So consider some big services: most Google services, Facebook, Amazon... In these cases, you go to [company name].com and use the service. There's one central location to access it.
This falls under the common client-server model. There's a single site that you visit, and everyone who uses that service goes to that site (uses that server). In other words, there's one single authoritative node to which all other nodes connect.
Consider this example: Alice, Bob, Charlie, Dave, and Elise all want to use Facebook. Alice connects to Facebook. Bob connects to Facebook. Charlie connects to Facebook. Dave connects to Facebook. Elise connects to Facebook. When Alice sends a Facebook message to Bob, Alice sends the message to Facebook, and Facebook relays the message to Bob. Charlie, Dave, and Elise are in a group chat together. When Charlie sends a message to both Dave and Elise, Charlie sends one message to Facebook, and Facebook relays this message to each of Dave and Elise. In this model, Facebook is the central arbiter of all interactions.
In our real-world letter example, consider our five characters in a group with a leader, Frank, who insists upon processing everyone's mail (perhaps to ensure no one conspires against him). When Alice wants to send a letter to Bob, Alice must send the letter to Frank, and Frank will then send the letter on to Bob. (Alternatively, Frank can collect everyone's mail for the week and deliver it en masse in person when he sees them.)
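The centralized model can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any real service is built: `CentralServer`, its `send` method, and the `log` list are hypothetical names. What matters is that every message goes through one object, which can therefore see and record every interaction.

```python
from collections import defaultdict


class CentralServer:
    """Hypothetical hub: every message passes through this one node."""

    def __init__(self):
        self.inboxes = defaultdict(list)  # recipient name -> delivered messages
        self.log = []                     # the hub sees every interaction

    def send(self, sender, recipients, text):
        # The sender hands over one message; the hub fans it out.
        self.log.append((sender, tuple(recipients), text))
        for recipient in recipients:
            self.inboxes[recipient].append((sender, text))


hub = CentralServer()
hub.send("Alice", ["Bob"], "Hi Bob")
hub.send("Charlie", ["Dave", "Elise"], "Group hello")
```

Charlie makes a single call for the group message and the hub does the fan-out. That is convenient, but it is also why the hub ends up holding a full record of who talked to whom.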
Distributed Networks
This is a type of decentralized network, something I would think of as "truly decentralized". In this model, each node is connected to each other node, either directly or indirectly, but no node is considered to have more authority than another.
Instead of a client-server model, where the server acts as an authority, distributed networks use a peer-to-peer (P2P) model, where all nodes are "peers" - equals.
Consider this example: Alice, Bob, Charlie, Dave, and Elise are all friends with each other on a distributed P2P network like Retroshare. When Alice wants to send Bob a message, Alice sends the message directly to Bob. When Charlie, Dave, and Elise are in a group together and Charlie wants to send a message to both Dave and Elise, Charlie sends the message directly to Dave, and Charlie sends the message directly to Elise.
In our real-world example, these five characters are in a group together (having ousted Frank), and now they just send their letters directly to each other.
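For contrast, here is a minimal Python sketch of the peer-to-peer version. The `Peer` class and its methods are hypothetical and do not reflect how Retroshare or any real P2P software works. Each peer keeps its own inbox, and the sender makes one delivery per recipient with no hub in between.

```python
class Peer:
    """Hypothetical peer: holds its own inbox and talks to others directly."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, recipients, text):
        # No hub: the sender makes one delivery per recipient.
        for peer in recipients:
            peer.receive(self.name, text)

    def receive(self, sender, text):
        self.inbox.append((sender, text))


alice, bob, charlie, dave, elise = (
    Peer(name) for name in ["Alice", "Bob", "Charlie", "Dave", "Elise"]
)
alice.send([bob], "Hi Bob")
charlie.send([dave, elise], "Group hello")
```

No object in this sketch sees all the traffic. The cost is that the sender does the fan-out work, and a recipient has to be reachable at the moment of delivery.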
Federated Networks
It might seem obvious from these examples that a distributed network is the best choice, but there are some advantages to centralization. Consider our real-world example for a moment. Remember that letters can only be delivered by hand. What happens if Alice wants to send a letter to Bob, but Bob isn't home?
Centralization also makes things more efficient. Suppose in our centralization example that Frank collects everyone's mail and gives it to them when he sees them. This solves the problem of "What if Bob is out of town?" while also saving Bob the trouble of having to answer his door each time a piece of mail comes for him. He can go to Frank to get his mail when he has time to deal with it. It makes sense for everyone to get their mail from a central location; the problem is the unbalanced power this gives to that central location (Frank).
Federation is a way to pair some of the advantages of centralization with some of the advantages of distribution.
In our real-world example, consider our group of five. Alice and Bob live near each other, as do Charlie, Dave, and Elise.
Alice and Bob register with a service. We'll call it a local post office, and in this example, all post offices are local; there is no hierarchical system of post offices. A post office always has someone on duty to receive mail and hold it until the recipient comes to pick it up. This allows Alice and Bob to safely be away from home without missing letters. This also saves the delivery person work because when someone wants to send letters to both Alice and Bob, the person delivering these letters only needs to make one trip to the post office.
Because Charlie and Dave live near each other, the same post office is close to both of them, and they both register at that one. Elise could also register at that one, but she doesn't like one of the people who works there and doesn't want to have to see that person every time she goes to pick up her mail, so she sets up her own post office instead.
When Alice wants to send a letter to Bob, Alice addresses it to Bob and brings it to the local post office. Because Bob also uses that post office, the post office just hangs onto it until Bob comes to pick it up.
When Charlie wants to send a letter to Elise, Charlie addresses it to Elise and brings it to the local post office. Because Elise uses a different post office, Charlie's post office has someone take the letter to Elise's post office, where Elise can pick it up.
It's important to note that you don't necessarily have to use the post office nearest you, and someone can make their own post office. In this way, we avoid the mandatory nature of the centralized model ("My way or the highway"). If someone is dissatisfied with one of the post offices, they can use another one because the post offices all work with each other. They can still talk to their friends who use other post offices.
In the computer version of this story, let's look at email because it's a federated protocol most readers will already be familiar with.
There are many email servers, and you can even set up your own if you have the ability to host a server. These servers all interoperate because they use the same email protocol (SMTP) to talk to each other. These servers are always available to receive emails, which relieves users of the burden of having to always be available to receive emails. Some mail servers have features others don't have. Users have a choice of which server to use, and they will be able to talk to users of other servers. If one email provider causes problems, users of that provider's service can move to a different provider without making all of their contacts also move.
So suppose Alice and Bob are users on mailserver1.com, Charlie and Dave are users on mailserver2.net, and Elise runs and is the sole user on mailserver3.info. Alice can email Bob because they're on the same server. Alice drafts the email and sends it, and the server at mailserver1.com puts a copy of the email in alice@mailserver1.com's outbox and in bob@mailserver1.com's inbox.
If Charlie wants to email Elise, Charlie can send an email from charlie@mailserver2.net to elise@mailserver3.info. Charlie's mail client submits the email to mailserver2.net (with SMTP, or through a webmail page over HTTP), which sends the email to mailserver3.info (with SMTP), where Elise's mail client retrieves it (with IMAP, POP3, or a webmail page over HTTP). I only mention the protocols to emphasize that the way clients talk to their server can differ from the way servers communicate with each other.
In this case, instead of a peer-to-peer model, this is a client-server model, or what might be extended to be called client-server-server-client.
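The email walkthrough above can be sketched in Python as well. This is a toy model under my own assumptions, not an implementation of SMTP: `MailServer`, `submit`, `deliver`, and the shared `network` dictionary (standing in for the DNS lookup that real mail servers do) are all hypothetical. Each server holds mailboxes for its own users and hands anything addressed to another domain to the server for that domain.

```python
class MailServer:
    """Hypothetical federated server identified by its domain."""

    def __init__(self, domain, network):
        self.domain = domain
        self.mailboxes = {}     # local user name -> list of (sender, text)
        self.network = network  # domain -> MailServer, a stand-in for DNS
        network[domain] = self

    def register(self, user):
        self.mailboxes[user] = []

    def submit(self, sender, recipient, text):
        """A local client hands a message to its own server."""
        user, _, domain = recipient.rpartition("@")
        if domain == self.domain:
            # Same server: just file it in the local mailbox.
            self.deliver(sender, user, text)
            return "local"
        # Different server: pass it along, server to server.
        self.network[domain].deliver(sender, user, text)
        return "relayed"

    def deliver(self, sender, user, text):
        # The mailbox holds the message until the user checks it.
        self.mailboxes[user].append((sender, text))


network = {}
s1 = MailServer("mailserver1.com", network)
s2 = MailServer("mailserver2.net", network)
s3 = MailServer("mailserver3.info", network)
for server, users in [(s1, ["alice", "bob"]), (s2, ["charlie", "dave"]), (s3, ["elise"])]:
    for user in users:
        server.register(user)

route_ab = s1.submit("alice@mailserver1.com", "bob@mailserver1.com", "Hi Bob")
route_ce = s2.submit("charlie@mailserver2.net", "elise@mailserver3.info", "Hi Elise")
```

Alice's message never leaves mailserver1.com, while Charlie's takes one server-to-server hop. In both cases the recipient's server holds the message, so the recipient doesn't need to be online when it arrives, and no single server sees everyone's mail.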
The Problem with Federation
I might write about this at greater length later on, but the problem with federation is that it tends towards centralization. Federation confuses people. (Try explaining how to choose an email provider to someone who doesn't already understand email, or look at a list of public XMPP servers.) They don't know which server to sign up for.
While a decent way to handle this might be to pick a random server to try to spread out and ensure no single provider has too much power, what ends up happening is one server becomes the default option. ("Want to use email? Just sign up with Gmail!") There's a reason around 44% of people use Gmail as their primary email account (and let's be honest, it's not because Gmail is the best, even privacy violations aside). There's a reason mastodon.social (the Mastodon instance hosted by Eugen, Mastodon's developer) is so big. Federation is hard for people, and this is a problem I don't know how to address. But we need to move to federation where we still need servers (and ideally self-hosting, but at least actually decentralizing our federation, not just all flocking to one provider that happens to support federation with other providers).