This topic will start as just a dump of my ideas around designing a new tunneling protocol.
“Tunneling” is a generic term. Here I’m specifically referring to the “ngrok case”, i.e. where you have a server running on a local device without a public IP address, and you want to make that server accessible on the internet by tunneling through a public server.
I maintain a list of solutions to this problem here:
https://github.com/anderspitman/awesome-tunneling
There are several ways this is generally implemented; usually it’s one of:
- Use SSH tunnels orchestrated by another server program. This is what I do with boringproxy.
- Use a single TCP socket and multiplex over it with something like yamux, HTTP/2, WebSockets, etc. (see the sketch after this list).
- Use a UDP-based protocol. WireGuard is probably the most common here. Cloudflare recently got a QUIC implementation working, and I expect more solutions to use QUIC going forward. The main advantage of UDP is that it avoids TCP head-of-line blocking.
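To make the second option concrete, here’s a minimal sketch of what the client (local device) side could look like using github.com/hashicorp/yamux to multiplex streams over a single outbound TCP connection. The addresses and ports are placeholders, not part of any real protocol:

```go
// Minimal sketch of the "multiplex over one TCP socket" approach using
// github.com/hashicorp/yamux. The local device dials out to the public
// tunnel server, then accepts multiplexed streams and proxies each one
// to the local service. Addresses here are placeholders.
package main

import (
	"io"
	"log"
	"net"

	"github.com/hashicorp/yamux"
)

func main() {
	// Outbound connection from the local device to the public tunnel server.
	conn, err := net.Dial("tcp", "tunnel.example.com:5223") // placeholder address
	if err != nil {
		log.Fatal(err)
	}

	// The public server opens a new stream for every incoming visitor,
	// so this side acts as the yamux server (the stream acceptor).
	session, err := yamux.Server(conn, nil)
	if err != nil {
		log.Fatal(err)
	}

	for {
		stream, err := session.Accept()
		if err != nil {
			log.Fatal(err)
		}

		go func(stream net.Conn) {
			defer stream.Close()

			// Proxy the stream to the local service being exposed.
			local, err := net.Dial("tcp", "127.0.0.1:8080") // placeholder local server
			if err != nil {
				return
			}
			defer local.Close()

			go io.Copy(local, stream)
			io.Copy(stream, local)
		}(stream)
	}
}
```

The public server would hold the other end of that connection and open a new yamux stream for every visitor that hits the public address.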
The primary piece currently missing, as I see it, is an open protocol that’s intended for interoperability. I think the reason this hasn’t happened yet is that you usually need to run both a server and a client program in order to use one of these services, so each can implement its own protocol that fits its needs. The one big exception is SSH, which specifies how to do tunneling (“forwarding”). And predictably, tunneling solutions on my list above which use SSH often support off-the-shelf SSH clients (boringproxy does).
I’m currently in the process of architecting a way to add a tunneling product to TakingNames.io. However, I want it to be based on a completely open protocol, to properly align my incentives with my customers and avoid vendor lock-in.
The obvious approach for me would be to make a modified version of the way boringproxy does it: a token-authenticated HTTPS API generates an SSH key pair, adds an entry to ~/.ssh/authorized_keys on the server to allow only tunneling on that port, then returns all the information necessary (including the tunnel’s private key) to the client, which then connects over SSH.
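For reference, here’s roughly what the client side of that flow could look like with golang.org/x/crypto/ssh, assuming the HTTPS API has already handed back a private key, a tunnel server address, and a port. Everything specific here (usernames, hosts, ports) is a placeholder, and this is a sketch rather than boringproxy’s actual code:

```go
// Rough sketch of the client side of boringproxy-style SSH tunneling,
// using golang.org/x/crypto/ssh. It assumes the provider's HTTPS API has
// already returned a private key, a tunnel server address, and a port to
// forward; all of those values are placeholders.
package main

import (
	"io"
	"log"
	"net"

	"golang.org/x/crypto/ssh"
)

func main() {
	// Private key as returned by the (hypothetical) HTTPS API.
	signer, err := ssh.ParsePrivateKey([]byte(privateKeyPEM))
	if err != nil {
		log.Fatal(err)
	}

	config := &ssh.ClientConfig{
		User: "tunneluser", // placeholder
		Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
		// A real client should verify the host key it got from the API.
		HostKeyCallback: ssh.InsecureIgnoreHostKey(),
	}

	client, err := ssh.Dial("tcp", "tunnel.example.com:22", config) // placeholder host
	if err != nil {
		log.Fatal(err)
	}

	// Equivalent of `ssh -R`: ask the server to listen on the tunnel port
	// and hand connections back over the SSH connection.
	remote, err := client.Listen("tcp", "0.0.0.0:9001") // placeholder port
	if err != nil {
		log.Fatal(err)
	}

	for {
		conn, err := remote.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go proxyToLocal(conn)
	}
}

// proxyToLocal pipes a forwarded connection to the local server being exposed.
func proxyToLocal(conn net.Conn) {
	defer conn.Close()
	local, err := net.Dial("tcp", "127.0.0.1:8080") // placeholder local server
	if err != nil {
		return
	}
	defer local.Close()
	go io.Copy(local, conn)
	io.Copy(conn, local)
}

var privateKeyPEM = "..." // returned by the HTTPS API in this sketch
```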
There are two main problems with this approach:
- You have to shoehorn HTTPS auth semantics into SSH. This is particularly problematic if you want to create a scalable system that can handle connections from many people in different physical locations. You need to be able to give someone a server close to their location and load balance it with other servers. With SSH you need to maintain state on each of those nodes so they know which private keys to trust. Either that or implement a custom SSH server that fits better (boringproxy currently leverages OpenSSH). But at that point it might be simpler to implement a custom protocol.
- SSH uses a single TCP socket, so it’s susceptible to head-of-line blocking across channels. This is particularly problematic on lossy networks. A more forward-looking approach would be to use something like QUIC. But then there are concerns about networks blocking all UDP traffic, a problem that isn’t likely to be solved completely anytime soon, since HTTP/3 (based on QUIC) falls back to TCP so gracefully that nothing breaks when UDP is blocked.
My tentative plan is to create a simple protocol based either on boringproxy’s SSH approach or on multiplexing over a single TCP socket. In the second case I would probably use WebSockets. Even though that adds some overhead, you’d end up reimplementing most of what it provides anyway in a custom design. WS also gives you some nice attributes, such as not necessarily requiring more than one request (if the tunnel server and auth server are the same) and working in browsers, which opens up some interesting possibilities.
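As a rough illustration of the WebSocket variant, the client could dial the provider over WSS with its token and then run the same yamux multiplexing over the wrapped connection. The library (nhooyr.io/websocket), endpoint, and header below are assumptions on my part, not a settled design; any WS library would work:

```go
// Sketch of the WebSocket variant: the client dials the tunnel provider
// over WSS, authenticates with the token from the OAuth2 flow, and then
// runs the same yamux multiplexing over the wrapped connection. The
// library, URL, and header shown are assumptions, not a defined protocol.
package main

import (
	"context"
	"io"
	"log"
	"net"
	"net/http"

	"github.com/hashicorp/yamux"
	"nhooyr.io/websocket"
)

func main() {
	ctx := context.Background()

	headers := http.Header{}
	headers.Set("Authorization", "Bearer "+tunnelToken) // token from the OAuth2 flow

	// One request: the same endpoint authenticates the token and upgrades
	// to the tunnel connection.
	wsConn, _, err := websocket.Dial(ctx, "wss://tunnel.example.com/tunnel", &websocket.DialOptions{
		HTTPHeader: headers,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Treat the WebSocket as an ordinary net.Conn so yamux can multiplex
	// individual visitor connections over it.
	conn := websocket.NetConn(ctx, wsConn, websocket.MessageBinary)

	session, err := yamux.Server(conn, nil)
	if err != nil {
		log.Fatal(err)
	}

	for {
		stream, err := session.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// Proxy each stream to the local service, as in the earlier sketch.
		go proxyToLocal(stream)
	}
}

// proxyToLocal pipes a multiplexed stream to the local server being exposed.
func proxyToLocal(stream net.Conn) {
	defer stream.Close()
	local, err := net.Dial("tcp", "127.0.0.1:8080") // placeholder local server
	if err != nil {
		return
	}
	defer local.Close()
	go io.Copy(local, stream)
	io.Copy(stream, local)
}

var tunnelToken = "..." // obtained via the provider's OAuth2 flow
```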
The goal with this protocol would be to get it working and deployed as quickly as possible. Once it’s been in production for a while, we can consider what a more performant option might look like.
So, all that said, these are the high-level design goals currently:
- Token-based auth, with tunnel providers implementing a simple OAuth2 flow which clients can use to get a token. The idea is that you start a server application, it prints out an OAuth2 link, you load it in a browser, it takes you to the tunnel provider, you select a domain/subdomain, and then it redirects you back to the server application with a token and information about where the tunnel node is (see the sketch after this list).
- It’s implicit that tunnel providers will need to offer some level of control over DNS, either by being a DNS provider themselves or offering integration with other APIs. TakingNames.io falls into the first category. boringproxy might eventually fall into the second.
- Tokens can have very limited privileges: basically just “connect to a given server and expect TCP connections to start coming through.”
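Here’s one plausible shape of that OAuth2 flow from the server application’s point of view, using a standard authorization-code exchange via golang.org/x/oauth2. The endpoints, client ID, and scope are placeholders; nothing here is a settled spec:

```go
// Sketch of the client side of the proposed OAuth2 flow: the server
// application prints a link, the user authorizes a domain at the tunnel
// provider, and the provider redirects back with a code that is exchanged
// for a narrowly scoped tunnel token. All endpoints, the client ID, and
// the scope name are placeholders.
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"

	"golang.org/x/oauth2"
)

func main() {
	conf := &oauth2.Config{
		ClientID:    "my-server-app",                  // placeholder
		RedirectURL: "http://127.0.0.1:9876/callback", // local redirect handler below
		Scopes:      []string{"tunnel"},               // placeholder scope
		Endpoint: oauth2.Endpoint{
			AuthURL:  "https://takingnames.io/oauth/authorize", // placeholder endpoints
			TokenURL: "https://takingnames.io/oauth/token",
		},
	}

	// The user opens this link, picks a domain/subdomain at the provider,
	// and gets redirected back to the local callback with a code.
	fmt.Println("Visit this link to authorize a tunnel:")
	fmt.Println(conf.AuthCodeURL("random-state-value")) // a real client would generate and verify the state

	http.HandleFunc("/callback", func(w http.ResponseWriter, r *http.Request) {
		code := r.URL.Query().Get("code")

		// Exchange the code for the tunnel token. In this sketch the
		// provider would also return where the tunnel node is, e.g. as
		// extra fields alongside the token.
		tok, err := conf.Exchange(context.Background(), code)
		if err != nil {
			log.Fatal(err)
		}

		fmt.Fprintln(w, "Tunnel authorized, you can close this tab.")
		fmt.Println("tunnel token:", tok.AccessToken)
	})

	log.Fatal(http.ListenAndServe("127.0.0.1:9876", nil))
}
```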