Toward an open tunneling protocol

This topic will start as just a dump of my ideas around designing a new tunneling protocol.

“Tunneling” is a generic term. Here I’m specifically referring to the “ngrok case”, i.e. where you have a server running on a local device without a public IP address, and you want to make that server accessible on the internet by tunneling through a public server.

I maintain a list of solutions to this problem here:

There are several different ways this is generally implemented. Usually it’s one of:

  • Use SSH tunnels orchestrated by another server program. This is what I do with boringproxy.
  • Use a single TCP socket and multiplex over it with something like yamux, HTTP/2, WebSockets, etc.
  • Use a UDP protocol. WireGuard is probably the most common here. Cloudflare recently got a QUIC implementation working, and I expect more solutions to use QUIC going forward. The main advantage of UDP is that it avoids TCP head-of-line blocking.
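
To make the second option concrete, here is a minimal sketch of multiplexing several logical streams over one byte stream. The frame layout (a 4-byte stream ID followed by a 4-byte length) is made up for illustration and is not yamux's or HTTP/2's actual wire format; `encode_frame` and `decode_frames` are hypothetical names.

```python
import struct

# Hypothetical frame header: 4-byte stream ID + 4-byte payload length,
# both big-endian. Real multiplexers (yamux, HTTP/2) use richer headers.
HEADER = struct.Struct("!II")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its stream ID and length."""
    return HEADER.pack(stream_id, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split a buffer into complete (stream_id, payload) frames.

    Returns the decoded frames plus any trailing bytes that do not yet
    form a complete frame, so the caller can prepend them to the next read.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        stream_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((stream_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Because every frame shares the one TCP connection, a lost packet stalls all streams until it is retransmitted, which is the head-of-line blocking the UDP option avoids.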

The primary piece currently missing, as I see it, is an open protocol that’s intended for interoperability. I think the reason this hasn’t happened yet is that you usually need to run both a server and a client program in order to use one of these services, so each service can implement its own protocol that fits its needs. The one big exception is SSH, which specifies how to do tunneling (“forwarding”). And predictably, the tunneling solutions on my list above that use SSH often support off-the-shelf SSH clients (boringproxy does).

I’m currently in the process of architecting a way to add a tunneling product to However, I want it to be based on a completely open protocol, to properly align my incentives with my customers and avoid vendor lock-in.

The obvious approach for me would be to make a modified version of the way boringproxy does it, which is to have a token-authenticated HTTPS API which generates an SSH key pair, adds an entry to ~/.ssh/authorized_keys on the server to allow only tunneling on that port, then returns all the information necessary (including the private key of the tunnel) to the client, which then connects over SSH.
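
As a rough sketch of the `authorized_keys` step, the function below builds an entry that only permits forwarding on one port. It uses OpenSSH's `restrict`, `port-forwarding` and `permitlisten` key options; this is an illustration under my own assumptions (including the loopback listen address and the function name), not necessarily the exact line boringproxy writes.

```python
def authorized_keys_entry(public_key: str, tunnel_port: int, comment: str = "") -> str:
    """Build an authorized_keys line limited to one remote-forwarded port.

    public_key is the "keytype base64" text of the generated key pair's
    public half. The 127.0.0.1 listen address is an assumption.
    """
    if not 1 <= tunnel_port <= 65535:
        raise ValueError(f"invalid port: {tunnel_port}")
    # restrict: disable shell, agent/X11 forwarding, PTY, etc.
    # port-forwarding: re-enable forwarding only.
    # permitlisten: limit remote forwarding to this listen address and port.
    options = f'restrict,port-forwarding,permitlisten="127.0.0.1:{tunnel_port}"'
    return " ".join(part for part in (options, public_key.strip(), comment) if part)
```

A client holding the matching private key could then open the tunnel with a stock client, e.g. `ssh -N -R 127.0.0.1:9001:localhost:8080 user@server`, where the port numbers are placeholders.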

There are primarily 2 problems with this approach:

  1. You have to shoehorn HTTPS auth semantics into SSH. This is particularly problematic if you want to create a scalable system that can handle connections from many people in different physical locations. You need to be able to give someone a server close to their location and load balance it with other servers. With SSH you need to maintain state on each of those nodes, so they know which private keys to trust. Either that or implement a custom SSH server (boringproxy currently leverages OpenSSH) that fits better. But at that point it might be simpler to implement a custom protocol.

  2. SSH uses a single TCP socket, so is susceptible to head-of-line blocking across channels. This is particularly problematic in lossy network scenarios. A more forward-looking approach would be to use something like QUIC. But then there are concerns about networks blocking all UDP traffic, a problem not likely to be solved 100% anytime soon since HTTP/3 (based on QUIC) has excellent fallback to TCP, which means nothing is going to break by blocking UDP.

My tentative plan is to create a simple protocol either based on boringproxy’s SSH approach, or multiplexing over a single TCP socket. I would probably use WebSockets in the second case. Even though it adds some overhead, you need most of it anyway if you do a custom job. WS gives you some nice attributes such as not necessarily requiring more than one request (if the tunnel server and auth server are the same) and working in browsers, which opens up some interesting possibilities.

The goal with this protocol would be to get it working and deployed as quickly as possible. Once it’s been in production for a while, we can consider what a more performant option might look like.

So, all that said, these are the high-level design goals currently:

  • Token-based auth with tunnel providers implementing a simple OAuth2 flow which clients can use to get a token. The idea is that you start a server application, it prints out an OAuth2 link, you load it in a browser, it takes you to the tunnel provider, you select a domain/subdomain, then it redirects you back to the server application with a token and information about where the tunnel node is.
  • It’s implicit that tunnel providers will need to offer some level of control over DNS, either by being a DNS provider themselves or offering integration with other APIs. falls into the first category. boringproxy might eventually fall into the second.
  • Tokens can have very limited privileges. Basically just connect to a given server and expect TCP connections to start coming through.
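
The first goal could look roughly like this from the server application's side. This is a sketch assuming a standard OAuth2 authorization-code request; the provider URL, the client ID and both function names are hypothetical, and a real provider would also need to return the tunnel node information somewhere.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_url, redirect_uri, client_id, state=None):
    """Return (url, state) for the link the server application prints."""
    state = state or secrets.token_urlsafe(16)  # guards against CSRF on the callback
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{authorize_url}?{query}", state


def parse_callback(callback_url, expected_state):
    """Extract the authorization code from the redirect back to the app."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return params["code"][0]
```

The application would print the URL from `build_authorization_url`, listen on `redirect_uri`, and exchange the code from `parse_callback` for a token limited to the selected domain.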

What is the role of the tunneling server (with the public IP address)? Is it a network router or an http/s reverse proxy?

If it’s just a router, redirecting a port to the tunnel, then it seems like WireGuard is already a good protocol for this. It’s easy to set up and there are clients for most operating systems. The downside of this approach is that you only get one port 80 and one port 443, so if you wanted to host multiple sites from the same IP address you would have to use a reverse proxy on the tunneling client. This allows for only one unique user of the IP address.

If it’s an http/s reverse proxy then you can have unlimited unique users per IP address, but now your users have to trust you a lot more. I wouldn’t trust a 3rd party to manage my TLS connections unless it were someone big and reputable like Cloudflare.

The OAuth setup protocol could work with WireGuard: your application redirects you to a VPS/tunneling service which then gives you a WireGuard public key and IP address to peer with. Your application could set up WireGuard locally and give the tunneling service its public key and IP address.

It seems like it could work with just OAuth and WireGuard. What do you think?

Yeah it’s intended to be a reverse proxy that forwards to tunnels instead of addresses.

I wouldn’t trust a 3rd party to manage my TLS connections

Agreed. boringproxy actually has end-to-end encryption. The server acts as an SNI proxy. The ACME/LetsEncrypt certs are retrieved and controlled by the client side of the tunnel, so neither the boringproxy server nor the VPS it’s running on can decrypt the data. There’s one big caveat here, which is that the server operator could get their own cert from LetsEncrypt when you’re not looking, and use it to intercept some or all of the traffic before forwarding it on. I think one solution to this would be for the client to randomly check that the public key provided by the server matches the one it has. But an evil server might be smart enough to always provide the correct cert if the request comes from the same IP as the client, so you have to take that into account as well.
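
A minimal sketch of that spot check, assuming the client has its own certificate as PEM text. For simplicity it compares SHA-256 fingerprints of the whole certificate rather than just the public key, and the function names are hypothetical. As noted above, it is only meaningful if the connection comes from an address the server can't associate with the client.

```python
import hashlib
import socket
import ssl


def fetch_served_cert(host: str, port: int = 443, timeout: float = 10.0) -> bytes:
    """Return the DER certificate presented publicly for host."""
    context = ssl.create_default_context()
    # Validation is disabled on purpose: we only want to see which cert is
    # served, and we compare it against our own copy below.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def served_cert_matches(host: str, local_pem: str, fetch=fetch_served_cert) -> bool:
    """True if the cert served for host is the one the tunnel client holds."""
    local_der = ssl.PEM_cert_to_DER_cert(local_pem)
    return fingerprint(fetch(host)) == fingerprint(local_der)
```

The `fetch` parameter is there so the check can be run through a different network path (or stubbed out in tests) instead of connecting directly from the client's own IP.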

It seems like it could work with just oauth and wireguard.

I like WireGuard. It’s what I initially wanted to use for boringproxy. The one big downside is that you need admin privileges on the client side of the tunnel, in order to set up the network interfaces. Also, being UDP-based I think it’s more likely to be blocked.

Now I believe on some or all of the mainstream OSes like Android, iOS, OSX, Windows, there are ways to give permissions to WireGuard-based apps at install time. Unfortunately on Linux I think this is only possible if you package it for every distro. Either way it precludes the possibility of running a tunnel on a machine where you don’t have root, unless you get the admin to install the package for you.

If you were running a production service behind one of these tunnels, I think you would probably want to use WireGuard, or at least have the option to do so. But for the average self-hosted app, I don’t think it’s necessary. I think it would make sense for the protocol to have a single OAuth2 flow, but let it support different tunneling technologies under the hood.

I just learned what an SNI proxy is. This thought is surprising: doesn’t using a VPS imply the same amount of trust in the provider as using an SNI proxy? Theoretically, DigitalOcean could simply reassign my IP address and then issue a cert for it. They could also get hacked, and the attacker could do that. So maybe it’s no safer than using an SNI proxy. What do you think?

This doesn’t seem like a problem. People who want to self-host things need to be able to install programs/packages anyway. It seems unlikely that you would want to self-host a service on a device for which you don’t have root access (or, in the case of a phone, the ability to install apps).

This is a great point. You might want to set up a service at work, or behind some residential NAT, and those networks could kill your UDP traffic. So it makes sense to use a protocol that supports TCP. Since the TCP protocol would be a fallback, it would make sense to have the client connect over port 80 or 443 so that it’s unlikely to be blocked by a firewall. WebSockets probably make sense for that use case. I don’t know the protocol well enough to have a strong opinion on them. Would it be anything other than a re-invented VPN?

This thought is surprising: doesn’t using a VPS imply the same amount of trust in the provider as using an SNI proxy.

Yep. Not only could they get a cert and take over your traffic, they could also switch back and forth between your cert and theirs, strategically using theirs only when you’re less likely to notice.

This doesn’t seem like a problem. People who want to self-host things need to be able to install programs/packages anyway.

This is a common response I get from the self-hosting community. For people self-hosting today, I agree, they’re all nerds. But what about people who want to self-host but don’t have the expertise? What about bringing self-hosting to the masses? I agree that needing admin privileges is a pretty small issue, and if the gains are big enough it’s worth it, but it does make things a little less simple.

Would it be anything other than a re-invented VPN?

Anything we do is going to be a reinvention. The key is that there isn’t an open protocol that I’m aware of, other than SSH.

That said, over the last couple of days I have been leaning more towards using something off the shelf, not specifying a wire protocol, and allowing support for multiple transports. Basically libp2p, but with a much smaller surface area.

So maybe we only specify the high level “flow” of setting up a tunnel. For example:

  1. Client does OAuth2 flow to server, and selects a subdomain to map to the tunnel, and client gets back a token.
  2. Client makes an HTTPS request to a known endpoint. The client can use URL params to indicate what types of tunnel it wants/supports, e.g. ?type=ssh&type=tcp-yamux&type=wireguard.
  3. The server selects the first transport it supports and returns a JSON response indicating the type selected, and any supplemental information for connecting to the tunnel. For example, boringproxy essentially already works this way and returns the server address, port, private key, etc.
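
Steps 2 and 3 could be sketched like this on the server side. I'm assuming the `type` params are in the client's preference order and that the server picks the first one it also supports; the endpoint, the JSON field names and the error shape are all made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlparse

# Hypothetical per-transport connection details the server can hand out.
SERVER_TRANSPORTS = {
    "ssh": {"address": "tunnel.example.com", "port": 22},
    "tcp-yamux": {"address": "tunnel.example.com", "port": 443},
}


def negotiate(request_url: str, supported=SERVER_TRANSPORTS) -> str:
    """Pick the first requested transport the server supports.

    Returns the JSON response body as a string.
    """
    # parse_qs keeps repeated ?type= values in the order they appear.
    requested = parse_qs(urlparse(request_url).query).get("type", [])
    for transport in requested:
        if transport in supported:
            return json.dumps({"type": transport, **supported[transport]})
    return json.dumps({"error": "no_supported_transport"})
```

For example, a client asking for `?type=wireguard&type=tcp-yamux&type=ssh` against this table would be told to use `tcp-yamux`, since WireGuard isn't offered. Each transport's tiny spec would then define which extra fields (keys, ports, and so on) appear in the response.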

This is pretty simple, and flexible. For example the extra request lets you easily load balance.

You would need tiny specs for each transport to describe how they map to the flow and JSON representation.

Would it be anything other than a re-invented VPN?

This is a common opinion I’ve heard about localhost tools. The problem is that some people don’t want to set anything up on the visitors’ side. They want anyone to be able to access their services from anywhere at any time. So these tools do have their niche.
