Buffering and EventSource streams

I seem to be having a buffering problem with boringproxy. The web sites I’m proxying with BP use an HTML5 EventSource to send a real-time data stream back to the browser. It isn’t a constant stream of data, just new sensor values as they change; this might be every half second for a noisy sensor, or every 30 seconds if the sensor’s world is quiet. Multiple sensors send new values over the same EventSource.
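
Each update is just a line or two of text on the stream, something like the following (the payload here is made up; the data: lines terminated by a blank line are simply how EventSource frames each message):

    data: sensor=7 value=21.4

    data: sensor=12 value=0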

This works great when connected directly to the web server, whether that’s in my lab or over the internet, but it’s not so good when connecting through BP. It seems to me that somewhere along the path there is a 4K buffer: the real-time stream going over the EventSource connection is held until that 4K is full, and then the whole 4K is sent in one big lump. This negates the whole idea of a real-time stream, unfortunately.

Is this a known issue, a design feature, or am I completely wrong? If it is a feature, could a flag be added to turn the buffering off?

To add some detail: the BP server is running Ubuntu on Intel, the clients are Raspberry Pi Zero Ws running Raspberry Pi OS, and the web browsers are Safari, Firefox and Chrome on a couple of iMacs and Minis. The web server on the Pi Zero Ws is apache2, and the target of the EventSource is C++ code written by me, running as a cgi-bin under apache2; it flushes its buffers after writing each new data line to the output stream.

Comments and help greatly appreciated.

Hey @sharkshead, thanks for moving your post here from GitHub.

This is an important use case. I use Server-Sent Events (EventSource) myself quite a bit. boringproxy currently doesn’t have any intentional buffering, but since it’s built on OpenSSH for the tunneling, it’s possible that’s where the buffer is. I’ll do a little experimenting with SSE through boringproxy and get back to you tomorrow.

A couple points of clarification:

  1. You have multiple sensors sending their readings to a central location, and then a C++ program collects the data and makes it available over SSE behind Apache/CGI, which is then tunneled by boringproxy, correct?

  2. Are you self-hosting the boringproxy server or using the demo instance?

The sensors send their values via a small radio to a process on the Pi Zero W that collects all the data and writes it to shared memory. Processes like the cgi-bin code read the shared memory and send the data back to the browser; multiple browsers can read the same data, each with its own cgi-bin process. A mutex controls access to the data structures in the shared memory, and semaphores are used so that the main process writing to the shared memory can wake up all the cgi-bin processes when there’s new data to be had. There are other processes that can read the shared memory too, but they are of no importance here.

I’m self-hosting the server.

If we can get some sort of DM happening I can send you links to a couple of these systems, one of which is accessible both directly over the internet and via BP.

I’ve done a little digging to see whether I can show that there’s a buffering issue or not. I have one BP client that is very prone to whatever this problem is, so I ran tcpdump on both ends of the tunnel. On the client I watched all packets going to or from the lo interface, and on the server I watched all packets going to or from the random port allocated to this tunnel. What I saw, detailed below, is the client sending 137 bytes into the tunnel every 10 to 20 seconds, and the server end receiving 4,127 bytes every 8 minutes or so. The numbers 137 and 4,127 were consistent over the hour or so I was watching.
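
(These were plain captures, roughly "tcpdump -i lo" on the client and "tcpdump -i lo port <tunnel port>" on the server; nothing fancier than that.)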

Here’s what the server saw. I’ve cut lots of lines out because they are virtually identical, and have included two examples where it received 4,127 bytes.

21:30:51.669974 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1838887825 ecr 1838449798], length 0
21:31:06.773981 IP localhost.52668 > localhost.33689: Flags [.], ack 1, win 512, options [nop,nop,TS val 1838902929 ecr 1838887825], length 0
21:31:06.774087 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1838902929 ecr 1838449798], length 0
21:31:18.604007 IP localhost.33689 > localhost.52668: Flags [P.], seq 1:4128, ack 1, win 512, options [nop,nop,TS val 1838914759 ecr 1838449798], length 4127
21:31:18.604033 IP localhost.52668 > localhost.33689: Flags [.], ack 4128, win 491, options [nop,nop,TS val 1838914759 ecr 1838914759], length 0
21:31:33.653909 IP localhost.52668 > localhost.33689: Flags [.], ack 4128, win 512, options [nop,nop,TS val 1838929809 ecr 1838914759], length 0
21:31:33.653971 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1838929809 ecr 1838914759], length 0

---

21:38:36.566137 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1839352721 ecr 1838914759], length 0
21:38:51.669887 IP localhost.52668 > localhost.33689: Flags [.], ack 4128, win 512, options [nop,nop,TS val 1839367824 ecr 1839352721], length 0
21:38:51.669932 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1839367825 ecr 1838914759], length 0
21:39:00.578433 IP localhost.33689 > localhost.52668: Flags [P.], seq 4128:8255, ack 1, win 512, options [nop,nop,TS val 1839376733 ecr 1838914759], length 4127
21:39:00.578467 IP localhost.52668 > localhost.33689: Flags [.], ack 8255, win 491, options [nop,nop,TS val 1839376733 ecr 1839376733], length 0
21:39:15.733960 IP localhost.52668 > localhost.33689: Flags [.], ack 8255, win 512, options [nop,nop,TS val 1839391889 ecr 1839376733], length 0
21:39:15.734096 IP localhost.33689 > localhost.52668: Flags [.], ack 1, win 512, options [nop,nop,TS val 1839391889 ecr 1839376733], length 0

Here’s the tcpdump from the client covering the same times as the 4,127-byte arrivals on the server.

21:30:48.521112 IP localhost.34046 > localhost.81: Flags [.], ack 3288, win 1014, options [nop,nop,TS val 3049409429 ecr 3049409429], length 0
21:31:00.499564 IP localhost.81 > localhost.34046: Flags [P.], seq 3288:3425, ack 1, win 1024, options [nop,nop,TS val 3049421407 ecr 3049409429], length 137
21:31:00.499749 IP localhost.34046 > localhost.81: Flags [.], ack 3425, win 1014, options [nop,nop,TS val 3049421407 ecr 3049421407], length 0
21:31:18.521390 IP localhost.81 > localhost.34046: Flags [P.], seq 3425:3562, ack 1, win 1024, options [nop,nop,TS val 3049439429 ecr 3049421407], length 137
21:31:18.521564 IP localhost.34046 > localhost.81: Flags [.], ack 3562, win 1014, options [nop,nop,TS val 3049439429 ecr 3049439429], length 0
21:31:33.520933 IP localhost.81 > localhost.34046: Flags [P.], seq 3562:3699, ack 1, win 1024, options [nop,nop,TS val 3049454429 ecr 3049439429], length 137
21:31:33.521114 IP localhost.34046 > localhost.81: Flags [.], ack 3699, win 1014, options [nop,nop,TS val 3049454429 ecr 3049454429], length 0
21:31:48.520666 IP localhost.81 > localhost.34046: Flags [P.], seq 3699:3836, ack 1, win 1024, options [nop,nop,TS val 3049469428 ecr 3049454429], length 137

---

21:38:33.526327 IP localhost.34046 > localhost.81: Flags [.], ack 7535, win 1014, options [nop,nop,TS val 3049874434 ecr 3049874434], length 0
21:38:48.526355 IP localhost.81 > localhost.34046: Flags [P.], seq 7535:7672, ack 1, win 1024, options [nop,nop,TS val 3049889434 ecr 3049874434], length 137
21:38:48.526532 IP localhost.34046 > localhost.81: Flags [.], ack 7672, win 1014, options [nop,nop,TS val 3049889434 ecr 3049889434], length 0
21:39:00.501535 IP localhost.81 > localhost.34046: Flags [P.], seq 7672:7809, ack 1, win 1024, options [nop,nop,TS val 3049901409 ecr 3049889434], length 137
21:39:00.501724 IP localhost.34046 > localhost.81: Flags [.], ack 7809, win 1014, options [nop,nop,TS val 3049901409 ecr 3049901409], length 0
21:39:18.526281 IP localhost.81 > localhost.34046: Flags [P.], seq 7809:7946, ack 1, win 1024, options [nop,nop,TS val 3049919434 ecr 3049901409], length 137
21:39:18.526457 IP localhost.34046 > localhost.81: Flags [.], ack 7946, win 1014, options [nop,nop,TS val 3049919434 ecr 3049919434], length 0
21:39:33.525958 IP localhost.81 > localhost.34046: Flags [P.], seq 7946:8083, ack 1, win 1024, options [nop,nop,TS val 3049934434 ecr 3049919434], length 137

The client doesn’t see anything different at the times when the server receives the 4,127 bytes; it just keeps sending its 137-byte packets.

I do notice that the client has the PUSH flag set on all its 137-byte packets, so maybe something is not honoring this flag.

Thanks for the detailed information and debugging. This is very helpful.

You say one client in particular seems to have this problem more? Based on my understanding of your architecture, I would expect you to be running only a single BP client on the Pi Zero W. Can you explain more about where the multiple BP clients are running?

My current best guess is that the OpenSSH server is to blame, but I would be very surprised if OpenSSH didn’t honor TCP PUSH properly. What version are you running?

Also can you share the settings you’re using when creating the tunnel? In particular I’m interested in TLS termination.

One additional thing you might try for debugging is using Server HTTPS termination, setting Allow External TCP to true, and then connecting directly to the tunnel port on the server over HTTP (not HTTPS). This essentially cuts the boringproxy server’s proxying out of the equation.

Ok, I did a few quick tests. I made a minimal SSE server that sends small messages. With a naive approach I observed buffering both through BP and when connecting directly over localhost. When flushing manually (using http.Flusher.Flush in Go), it works as expected both through boringproxy and over localhost.
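
For reference, a minimal test server of that kind looks roughly like this in Go. It’s not the exact code I ran, and the /events path and one-second interval are just for illustration, but it shows the http.Flusher.Flush call that makes the difference:

    package main

    import (
        "fmt"
        "log"
        "net/http"
        "time"
    )

    func sse(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Type", "text/event-stream")
        w.Header().Set("Cache-Control", "no-cache")

        flusher, ok := w.(http.Flusher)
        if !ok {
            http.Error(w, "streaming unsupported", http.StatusInternalServerError)
            return
        }

        for i := 0; ; i++ {
            // Each event is only a few dozen bytes.
            fmt.Fprintf(w, "data: reading %d\n\n", i)
            // Without this Flush the events sit in the response buffer;
            // with it, each one goes out straight away.
            flusher.Flush()

            select {
            case <-r.Context().Done():
                return
            case <-time.After(time.Second):
            }
        }
    }

    func main() {
        http.HandleFunc("/events", sse)
        log.Fatal(http.ListenAndServe("127.0.0.1:8080", nil))
    }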

So I’m fairly confident whatever is causing this issue exists outside boringproxy.

Hey @anders, thanks for the replies. Let’s see if I can answer your questions adequately.

I have 5 tunnels on the BP server and 5 PiZWs as clients. The first 4 PiZWs have quite a bit of data going over the stream, and although I noticed that the real-time data was slow to arrive, taking 5 to 20 seconds, I put it down to “one of those things”.

When I added the 5th PiZW the problem was unmistakable. This 5th PiZW had no sensors attached, so the only traffic going out was from a couple of internal timers sending the occasional bit of data. On this one it was taking 7 or 8 minutes for the stream to get through to the browser, and so I started digging.

All the tunnels were created by a script, so they all have the same makeup. Each has a unique username, token and client-name, and hence a unique tunnel. All the tunnels have the same options: the target is 127.0.0.1:81, TLS termination is client, external TCP is true, and password protection is true (each tunnel has its own account/password pair).

The tunnel creation script uses curl to create the user and token on the BP server. It passes the token to the PiZW (through MQTT), which then starts the BP client end. The script waits until it sees the client-name appear on the BP server, then creates the tunnel, again using curl.

The unique user, client-name, account and password are all hashes of a unique serial number assigned to each PiZW. Each PiZW knows its serial number, as does the tunnel creation script, so they can generate these hashes themselves, which is why only the token needs to be passed to the PiZW.

Topology-wise, the server I create the tunnels from is in a Sydney datacentre, the BP server is in Paris, one of the PiZWs is on a boat in Sydney harbour, and the other 4 PiZWs are in my lab.

SSH versions are:

BP server: OpenSSH_8.3p1 Ubuntu-1ubuntu0.1, OpenSSL 1.1.1f 31 Mar 2020

PiZW: OpenSSH_7.9p1 Raspbian-10+deb10u2, OpenSSL 1.1.1d 10 Sep 2019

They both look a little old, so I’ll upgrade them to the latest in their respective distributions, then I’ll try it all again, along with Server HTTPS termination. I’ll try the other termination options as well, just because.

Following on from my previous post, ssh on both the BP server and the PiZWs is now at its latest distribution release. Should I fetch the latest source and try that?

I tried Server HTTPS termination, and there was no change to the buffering issue. I get a connection refused message when going directly to the port, even though there is a listener on ::1 on that port. Weird.

I could not connect with either Client raw TLS or Passthrough termination. Client raw TLS gave me back this message:

Get "http://localhost:33985/": EOF

Once again, thanks for the details. At this point I think the best thing to do is go all the way to one extreme and work our way back from there, adding pieces in small increments until we identify where the buffering is happening.

So I think step 1 is to cut boringproxy out of the equation entirely, and try a plain SSH remote forward. That’s what BP uses under the hood. If you’re not familiar with how to do that manually using OpenSSH, here’s a guide:

https://www.ssh.com/academy/ssh/tunneling/example
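
For your setup that would be something along the lines of running "ssh -R 2345:localhost:81 user@your-bp-server" from one of the PiZWs (2345 is an arbitrary public port, and the user and host here are just placeholders), then pointing a browser at http://your-bp-server:2345/. You may also need GatewayPorts yes in the server’s sshd_config for the forwarded port to be reachable from outside.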

If your SSEs work as expected with plain SSH tunnels, then we can continue trying to debug boringproxy.

One caveat to raw tunnels is that you lose TLS. Unless the data is extremely private I wouldn’t worry about that. If you absolutely must encrypt it over the internet then you can set up a reverse proxy like Caddy or nginx, but then you’ll be introducing another piece which might itself be buffering…

Hi @anders,

I enabled GatewayPorts in the BP server’s sshd_config, then from one of the PiZWs I started a remote tunnel, 2345:localhost:81. A plain HTTP request to port 2345 on the BP server then takes me straight to the PiZW’s web server, and the SSE stream appears immediately.

Hi @anders,

I was able to connect directly to the tunnel port on the BP server, thus cutting the BP server’s proxying out of the path, but the buffering issue remains.

I didn’t mention it in my previous post, but I’m testing this on the PiZW that takes 7 to 8 minutes for the stream to appear because of the buffering. In the HTTP-to-port-2345 test, with BP cut out entirely, the stream appeared immediately; in today’s test, where I cut out only the BP server, the stream was back to 7 or 8 minutes.

Hope this info helps.

So you’re saying even with BP completely out of the picture you’re seeing the buffering sometimes? If that’s the case I really don’t know what the problem could be. You’ve already approached it pretty much exactly the way I would. You need to somehow identify exactly which stage is doing the buffering.

Hi @anders, it seems I’m not being very clear.

  1. Tunnel through BP server - buffering, always.

  2. Tunnel bypassing BP server but through BP client - buffering, always.

  3. Plain ssh tunnel with no BP involvement - no buffering, ever.

It seems to me that there are buffered reads or writes somewhere, either in the BP code or in the Go libraries it uses.
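
I don’t know the Go side at all, so take this with a grain of salt, but I gather Go’s bufio.NewWriter defaults to a 4096-byte buffer, which is suspiciously close to the 4,127-byte lumps in my tcpdump. Here’s a toy sketch of the behaviour I mean; nothing in it is actual boringproxy code:

    package main

    import (
        "bufio"
        "bytes"
        "fmt"
    )

    // chunkLogger stands in for the tunnel connection and just reports
    // the size of every write that actually reaches it.
    type chunkLogger struct{}

    func (chunkLogger) Write(p []byte) (int, error) {
        fmt.Printf("underlying write: %d bytes\n", len(p))
        return len(p), nil
    }

    func main() {
        w := bufio.NewWriter(chunkLogger{}) // default buffer size is 4096 bytes

        event := bytes.Repeat([]byte("x"), 137) // same size as my SSE packets

        // Without any Flush calls, nothing reaches chunkLogger until the
        // 4096-byte buffer fills, so the data arrives in ~4K lumps.
        for i := 0; i < 60; i++ {
            w.Write(event)
        }
        w.Flush() // push out whatever is left at the end
    }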

Hi @anders,

I’ve been redoing my tests to confirm my previous post, and point 2 in that post is wrong. It should read: no buffering, ever.

I made a simple goof-up in my earlier tests. I’ve now rebuilt the tunnel with server termination and GatewayPorts yes in sshd_config, and talking HTTP directly to the random port that BP allocated now shows no buffering at all.

As for points 1 and 3, I’ve rechecked them and they remain unchanged.

And I can add a point 4:

  4. Tunnel through BP server with “server” TLS termination - buffering, always.

Hope this helps.

Hi @anders,

This issue is a show-stopper for me. When I first became aware of it I opened this dialog with you, and started looking at other solutions. I now have a custom solution using Apache, ssh and certbot, with a bit of MQTT thrown in for good measure, and lots of my own code.

My solution lives within a closed environment under my control, and it doesn’t, and can’t, solve the general problem that boringproxy solves. So hats off to you for making a really good piece of infrastructure. Well done.

Cheers,
Graeme.

Hey @sharkshead, sorry we weren’t able to find a solution yet. It’s tricky to debug when SSE seems to work fine on my end. I’m glad you ended up with an alternative that works for you. Thanks for reporting the problem. We’ll definitely keep a close eye on this and hopefully be able to determine the root cause eventually.