negative zero

How Does Tor Work?

2020 November 24

[info] [privacy] [tech] [tor]


This post is intended to be accessible to a not-super-technical audience. (Hopefully...)

Tor (properly stylized as written here with a lowercase o and lowercase r) is a network designed to make you anonymous when you use the internet.

The goal of Tor is basically this: When you use the internet, no one should be able to figure out both who you are and what you're doing.


Postal Service Analogy

The internet is like the postal service. Suppose you want to send a letter to someone, and you want them to be able to respond, but you don't want them to know who you are, and you don't want anyone to know that you're contacting them.


Sending the Letter

Now suppose there's a network of people who have agreed to forward mail for anyone who wants to use this network, no questions asked. You choose three names and addresses from the list of participants (let's say randomly, for simplicity). Let's call them Alice, Bob, and Carol. You take your letter, addressed to its recipient (possibly in an envelope like a normal letter, and possibly without an envelope, like a postcard), and you prepare it in this way:

  1. First, you put the letter into an additional envelope and address that envelope to Carol.
     Supposing we refer to the letter as m and an envelope addressed to Carol as C, you now have C(m), where C contains m.

  2. Second, you put this into another envelope and address it to Bob.
     Following form, you now have B(C(m)).

  3. Finally, you put this into yet another envelope, this time addressed to Alice.
     This gives you A(B(C(m))).

Now, you send the letter off to Alice.
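If it helps to see the envelope notation as code, here is a minimal Python sketch of the preparation steps. The `Envelope` class and the `wrap` and `notation` helpers are names I made up for illustration; they only model the nesting, nothing more.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Envelope:
    to: str                          # name written on the outside
    inside: Union["Envelope", str]   # another envelope, or the letter itself


def wrap(letter: str, route: List[str]) -> Union[Envelope, str]:
    """Seal the letter in one envelope per forwarder, innermost first."""
    packet: Union[Envelope, str] = letter
    for name in reversed(route):     # Carol, then Bob, then Alice
        packet = Envelope(to=name, inside=packet)
    return packet


def notation(packet: Union[Envelope, str]) -> str:
    """Render the nesting in the A(B(C(m))) shorthand used above."""
    if isinstance(packet, Envelope):
        return f"{packet.to[0]}({notation(packet.inside)})"
    return "m"


packet = wrap("Dear recipient ...", ["Alice", "Bob", "Carol"])
print(notation(packet))  # A(B(C(m)))
```

Note that the envelopes are added in the reverse of the order in which they will be opened: Carol's goes on first, Alice's last.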

When Alice gets the letter, A(B(C(m))), she opens only the outermost envelope. She now has an envelope, B(C(m)), addressed to Bob. Alice forwards it on to Bob.

When Bob gets the letter, B(C(m)), he opens only the outermost envelope. He now has an envelope, C(m), addressed to Carol. Bob forwards it on to Carol.

When Carol gets the letter, C(m), she opens only the outermost envelope. She now has the message m, addressed to the recipient. (It may or may not be in an envelope. If it's a postcard without an envelope, Carol can see what is written on it.) Carol forwards the message to the recipient.
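The forwarding steps can be sketched the same way. This is a rough model under my own assumptions: each envelope is a `(name, contents)` pair, and `wrap` and `forward` are hypothetical helpers, not part of any real software.

```python
def wrap(letter, recipient, route):
    """Address the letter to the recipient, then add one envelope per forwarder."""
    packet = (recipient, letter)
    for name in reversed(route):
        packet = (name, packet)
    return packet


def forward(packet, log):
    """Each holder opens only the outermost envelope and passes on what is inside."""
    holder, inside = packet
    while isinstance(inside, tuple):
        next_name = inside[0]        # the only thing the holder learns
        log.append(f"{holder} opens one envelope, forwards to {next_name}")
        holder, inside = inside
    return holder, inside


log = []
packet = wrap("hello", "Recipient", ["Alice", "Bob", "Carol"])
final_holder, letter = forward(packet, log)
for line in log:
    print(line)
# Alice opens one envelope, forwards to Bob
# Bob opens one envelope, forwards to Carol
# Carol opens one envelope, forwards to Recipient
```

At every step the holder reads exactly one name, the next one, which is the whole point of the nesting.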


Receiving a Response

When the recipient writes back, they address their letter to Carol. (Again, they may put it in an envelope, or they may mail it in the clear like a postcard. If it's not in an envelope, Carol can read it.)

Let's call the response r. Carol puts the response into an envelope addressed to Bob, B(r), and she forwards it on to Bob.

When Bob gets the response, B(r), he puts it into an envelope addressed to Alice, A(B(r)), and he forwards it on to Alice.

When Alice gets the response, A(B(r)), she puts it into an envelope addressed to you, U(A(B(r))), and she forwards it to you.

When you get the response, you open all the envelopes, and you have your response.
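The return trip, in the same shorthand, looks like this in Python. The `return_trip` function is a made-up helper that only tracks who adds which envelope; "U" stands for you, as in the text above.

```python
def return_trip(route, sender="U"):
    """List the packet after each forwarder adds an envelope on the way back."""
    # The reply travels Carol -> Bob -> Alice -> you, so the envelopes are
    # addressed to Bob, then Alice, then you.
    addressees = list(reversed(route))[1:] + [sender]
    packet = "r"
    stages = []
    for name in addressees:
        packet = f"{name[0]}({packet})"
        stages.append(packet)
    return stages


print(return_trip(["Alice", "Bob", "Carol"]))
# ['B(r)', 'A(B(r))', 'U(A(B(r)))']
```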


Analysis

Again, the goal is that no one knows both who you are and what you're doing. So let's talk about what each party knows.


Alice

Alice knows who you are because she receives the letter from you, and she sends the response to you. But Alice doesn't know the nature of your letter or the recipient. The letters that pass through her possession are sealed in envelopes which only reveal your address and Bob's.


Bob

Bob knows the least. He only knows that he's forwarding a message between Alice and Carol. He doesn't know the sender, the recipient, or the contents of the letter.


Carol

Carol is the weakest link. Carol knows who the recipient is, and in some situations, she may also know what the letter says. (She also knows that Bob was the last forwarder before her.) But unless you say who you are in a letter/postcard without an envelope (which would also mean you're not anonymous to the recipient), Carol won't know who you are.


Recipient

It will seem to the recipient that they're talking to Carol, rather than to you. They may recognize Carol as part of this anonymity network, but they won't know whom she represents.


How Tor Works

So basically, that's how Tor works. Just replace "envelopes" with encryption, "addresses" with IP addresses (which allow computers to find each other over the internet, like addresses allow people to find each other in real life), and the people with computers. You want to connect to a server (which is another computer somewhere), but you don't want the server to know who you are, and you don't want anyone along the way to know both who you are and what you're doing.

So you wrap your message in three layers of encryption (in addition to the regular encryption you might use on the internet), and the Tor network routes it through three other computers (which we'll call "nodes"). Each node along the way removes one layer of encryption, until the original message is left at the last node (which we'll call the "exit node" because our message is exiting the Tor network), which sends it to the server.
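For the curious, here is a toy Python sketch of the three layers being added and then removed one node at a time. Loud caveats: the "cipher" below (XOR with a SHA-256 keystream) is a stand-in I chose so the example needs only the standard library, and it is not secure and not what Tor uses. The keys, node names, `server.example`, and the helper functions are all hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR with the keystream. Doing it twice with one key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def build_onion(message: bytes, server: str, route, keys) -> bytes:
    """Wrap the message so each node learns only where to send it next."""
    next_hops = route[1:] + [server]
    onion = message
    for node, next_hop in reversed(list(zip(route, next_hops))):
        onion = xor_layer(keys[node], next_hop.encode() + b"|" + onion)
    return onion


def peel(key: bytes, onion: bytes):
    """Remove one layer: returns (next hop, whatever is still wrapped)."""
    next_hop, _, inner = xor_layer(key, onion).partition(b"|")
    return next_hop.decode(), inner


# Hypothetical keys, each shared only between you and one node.
KEYS = {"first": b"k1", "second": b"k2", "exit": b"k3"}
ROUTE = ["first", "second", "exit"]

onion = build_onion(b"hello server", "server.example", ROUTE, KEYS)
hops_seen = []
for node in ROUTE:
    next_hop, onion = peel(KEYS[node], onion)
    hops_seen.append(next_hop)

print(hops_seen)  # ['second', 'exit', 'server.example']
print(onion)      # b'hello server'
```

Each node uses only its own key, so the first and second nodes see a next hop and a blob of unreadable bytes. Only the exit node ends up holding the original message.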

When the server responds, it replies to the exit node. The process then works backwards, with each node adding a layer of encryption, then forwarding it to the next node (in the reverse order from before). It's called "onion" routing because of these layers of encryption. As we know from Shrek, onions have layers.
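The return direction can be sketched in Python as well. As before, this is a toy under stated assumptions: the XOR-with-keystream "cipher" is an insecure stand-in that only needs the standard library, and the keys and node names are invented for the example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR with the keystream. Doing it twice with one key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Hypothetical keys, each shared only between you and one node.
KEYS = {"first": b"k1", "second": b"k2", "exit": b"k3"}
ROUTE = ["first", "second", "exit"]

reply = b"hello client"

# On the way back, each node adds its own layer: exit, then second, then first.
cell = reply
for node in reversed(ROUTE):
    cell = xor_layer(KEYS[node], cell)
wrapped = cell  # what the first node hands to you

# You hold all three keys, so you can remove all three layers.
for node in ROUTE:
    cell = xor_layer(KEYS[node], cell)

print(cell)  # b'hello client'
```

Here the nodes do the wrapping and you do all the unwrapping, which mirrors the outbound trip where you wrapped and the nodes unwrapped.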

The key thing about the encryption is that only you and the intended node will be able to decrypt each layer. The mail metaphor is imperfect because in real life, it's possible for someone to open an envelope that's addressed to someone else. In this case, we're using strong cryptography that should make it impossible (or rather, computationally infeasible) for someone to decrypt a message that's not for them.


Analysis

So let's talk about who knows what.


First Node

Like Alice in our mail metaphor, the first node knows who you are, but not the contents or recipient of your message.


Second Node

Like Bob in our mail metaphor, the second node knows only the identities of the first and third nodes. It doesn't know who you are, to whom you're connecting, or what you're saying. It's simply a relay.


Third Node (Exit Node)

Like Carol in our mail metaphor, the third node is the weakest link. It knows what server is being contacted, and it might know the contents of the message. (HTTPS can be used to prevent the exit node from learning the contents of the message. But if the server doesn't support HTTPS, the exit node will learn what you're doing.) Regardless, it still should not learn who you are (unless your message reveals who you are).


Server

Like the recipient in our mail metaphor, the server will only know the last party in the chain: the exit node. The server may identify this computer as a Tor exit node, but it won't know which Tor user the exit node represents.
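To double-check the "who knows what" summary, here is a small Python sketch. It assumes a simplified model in which each party sees only its immediate neighbors in the chain; the `knowledge` function and the labels are my own invention.

```python
PATH = ["you", "first node", "second node", "exit node", "server"]


def knowledge(path):
    """For each party after you, record whether it learns your identity or the server's."""
    table = {}
    for i in range(1, len(path)):
        party = path[i]
        prev_hop = path[i - 1]
        next_hop = path[i + 1] if i + 1 < len(path) else None
        table[party] = {
            # You are known only to whoever receives traffic directly from you.
            "knows_you": prev_hop == "you",
            # The server is known to itself and to whoever talks to it directly.
            "knows_server": party == "server" or next_hop == "server",
        }
    return table


for party, facts in knowledge(PATH).items():
    print(party, facts)

# No single party learns both ends of the conversation.
assert not any(f["knows_you"] and f["knows_server"] for f in knowledge(PATH).values())
```

This model ignores message contents, so it only holds if nothing in your message reveals who you are.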


Conclusion and Final Notes

Hopefully this post is helpful in understanding how Tor works. Next, I'm planning to write a post comparing and contrasting VPNs with Tor (and primarily arguing that you should use Tor instead of a VPN in most cases).

How to use Tor is outside the scope of this post, but you can go to torproject.org to get started using Tor.