tjll.net

Tyblog | Building my ideal router for $50

First let me thank you for that great blog post!

Over the weekend I tried to follow your guide, but I came across a problem that is curious, at least to me. IP forwarding only seems to work if I start the network service after the shorewall service. When I traceroute any web address from the clients, I’m stuck at the router; after stopping and starting systemd-networkd.service, the route completes. Is this normal behaviour and I just did something wrong, or did you take care that the network service starts after shorewall?

Hmm, I haven’t modified either the systemd-networkd service or the shorewall service. I believe that indicating that the wan and br0 interfaces should be configured for IPv4 forwarding is sufficient. Here are my relevant files:

[root@router ~]$ cat /etc/systemd/network/br0.network
[Match]
Name=br0

[Network]
Address=192.168.1.1/24
IPForward=ipv4
IPMasquerade=yes
ConfigureWithoutCarrier=yes
[root@router ~]$ cat /etc/systemd/network/wan.network
[Match]
Name=wan

[Network]
IPv6AcceptRA=no
DHCP=ipv4
BindCarrier=eth0
IPForward=ipv4
[root@charon ~]#

As far as I know, this should set the interfaces into a forwarding state.
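One quick way to confirm whether the kernel is actually forwarding, before and after restarting systemd-networkd, is to read the global IPv4 forwarding switch directly. This is only a minimal sketch: the `forwarding_state` helper name is made up for illustration, and its optional file argument exists only so it can be pointed at a test file.

```shell
#!/bin/sh
# Report whether IPv4 forwarding is on, based on the kernel's global switch.
# forwarding_state is a hypothetical helper name, not part of any package.
forwarding_state() {
    file="${1:-/proc/sys/net/ipv4/ip_forward}"
    # If the file cannot be read (for example on a non-Linux host), say so.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi
    case "$(cat "$file")" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Check the live system.
forwarding_state
```

Running it once right after boot and once after restarting systemd-networkd.service would show whether the setting changes between the two. Per-interface switches also exist under /proc/sys/net/ipv4/conf/<interface>/forwarding and can be read the same way.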

Any particular reason you went for a classful qdisc rather than e.g. cake?

The piece-of-cake recipe from openwrt works very well for me, but, otoh, I don’t really know what it’s doing.

I read up on traffic shaping strategies for a long time before finally just settling on using eqhmcow’s strategy primarily because I felt like I understood the class choices and it seemed to fit my use case.

I suspect that the latest-and-greatest traffic shaping algorithms (used to be codel, now cake) may achieve the same results, honestly. My understanding is that the tradeoff is router load, but after my time with the espressobin, I certainly think that it could handle the additional processing load.

My hope is that I can provide a revisit of my router setup after a while and spend some time benchmarking cake versus other options to verify whether it works, but if anyone wants to experiment, my understanding is that the module(s) are readily available from the Arch repositories/AUR.
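For anyone who does want to experiment, a cake setup is usually a single `tc` invocation. The sketch below only prints the command instead of running it; the `cake_cmd` helper is hypothetical, and the interface name and rate are placeholder values rather than settings taken from the post.

```shell
#!/bin/sh
# Print (do not run) a tc command that would put cake on an interface.
# cake_cmd is a hypothetical helper; interface and rate are placeholders.
cake_cmd() {
    iface="$1"   # interface to shape, e.g. the WAN-facing one
    mbit="$2"    # shaped rate in Mbit/s, normally a bit below the real uplink rate
    echo "tc qdisc replace dev $iface root cake bandwidth ${mbit}mbit"
}

# Example values only: "wan" matches the interface name used in this thread,
# and 50 Mbit/s is an arbitrary figure to be replaced with your own uplink rate.
cake_cmd wan 50
```

After reviewing the printed line, it can be run as root. This assumes a kernel and iproute2 with cake support, and it only shapes traffic leaving that interface.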

Thanks for a nice post!

I did something similar - built my router using an Espressobin. I used Ubuntu rather than Arch.
Works very nicely. Today I’ve had fiber-based Internet service installed - 1Gb/s downstream (replacing the meager 50-60Mb/s I had previously). Suddenly I hit a performance wall. The board goes to high CPU when I speedtest it, and hits a ceiling at ~125Mb/s.
Looking at what’s going on I see pppoe using very high CPU. Tried to upgrade rp-pppoe, but to no avail.
Any thoughts on that one?

EDIT: Typically, after giving up on solving it and posting, you hit the solution. So I did. I’ve been using userland pppoe - and sure enough, it hits userland limits. Once I moved to kernel pppoe, performance skyrocketed to what I believe is the ISP limit - at this moment, well over 500Mb/s.

It seems I am also having speed issues with LAN to WAN traffic. LAN to LAN I am getting gigabit, from the Espressobin OS console outbound I am getting gigabit, but from a machine plugged into the LAN to a machine hooked directly into the WAN port I am only getting about a third of a gig. Have you tried a bandwidth test on your router with something capable of gigabit speeds? I am getting these speeds from stock installs of Arch and Armbian, with no mods and no FW…

I saw doron’s comment, but I do not believe the issue is pppoe on mine; at least I am not seeing that in the logs. Any ideas?

Sorry - didn’t reply to doron correctly, reposting.
If you don’t mind me asking, what did you do to upgrade it? I am running into the exact same issue, as in hitting a ceiling…

@RickS - what I did is change my pppoe configuration template. It had a “pty” line calling /usr/sbin/pppoe with some flags, which translates to running pppoe as a userland process. (When I was on VDSL, it worked just fine; when I connected to FTTH, CPU went through the roof and performance was blocked.)

Instead, I:

  1. Made sure my kernel is compiled with PPP and PPPOE (as modules, in my case, but that’s unimportant)
  2. Removed the pty lines from my pppoe configuration
  3. Added “plugin rp-pppoe.so” to that configuration

This makes pppoe use the kernel module. Performance went up 5-7 times once I did that.
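Steps 2 and 3 can be sketched as a small filter over the pppd configuration file. This is an illustration under assumptions rather than the exact change described above: `to_kernel_pppoe` is a made-up helper name, and the sample `pty` line (including its `-I eth0` flag) is only an example input.

```shell
#!/bin/sh
# Rewrite a pppd configuration from stdin to stdout:
#   - drop any line starting with "pty" (the userland /usr/sbin/pppoe call)
#   - drop an existing rp-pppoe.so plugin line so the result has exactly one
#   - append the kernel-mode plugin line
# to_kernel_pppoe is a hypothetical helper name.
to_kernel_pppoe() {
    grep -v -e '^[[:space:]]*pty[[:space:]]' \
            -e '^[[:space:]]*plugin[[:space:]]*rp-pppoe\.so'
    echo 'plugin rp-pppoe.so'
}

# Illustrative input only; a real configuration will have more options.
printf 'pty "/usr/sbin/pppoe -I eth0"\nnoauth\n' | to_kernel_pppoe
```

pppd still has to be told which Ethernet interface to use once the pty line is gone, and some newer ppp releases may ship the plugin under a different file name, so check the documentation for your ppp version before applying the result.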

Hope this helps!

@Andy : I’m now getting slightly over 500 Mb/s from lan1 to wan routed by the EB. But, read on.

There’s another, more “subtle” performance limit, which is based on the architecture of the EB.
As you probably know, the board has a SOC with one 1Gb/s port, and an Ethernet switch (topaz) with 4 1Gb/s ports, one of which is internally connected to the SOC.

Now, as long as LAN switching can be done inside the switch, you can get full 1Gb in / 1Gb out performance between, say, wan and lan1. The board and, mainly, the kernel code are very smart in offloading functionality into the switch. So stuff like basic routing, and even some iptables filtering, can mostly be offloaded into the switch, so there’s little performance impact.

But: when you consider traffic between two of the switch’s ports, with processing that must be done at the CPU level, all that traffic needs to flow into the CPU and then back into the switch. This will now hit the limits of both the internal port (each packet needs to go over it twice - in and out), and the kernel, which now needs to deal with double the packet handling interrupts (on a single core). While the port is full duplex, this translates to high load processing those packets, and on Linux you may see Soft IRQ overload.

Case in point (mine): I needed to terminate the ISP PPPOE tunnel on the EB. This means that the board needs to perform both PPPOE tunneling and point-to-multipoint NAT (aka masquerading). These can’t be offloaded to the switch chip; hence all my wan port traffic flows into the CPU and then down to lan port. High interrupt load on CPU. Net effect: tops at somewhat over 500Mb/s.

If you terminate your tunnel (and do your inevitable NATing) outside of (i.e. in front of) your EB wan port, you will probably get much closer to the 1Gb/s theoretical limit.

Hope this helps!

@doron - I actually don’t think I have the same issue as you. I wouldn’t mind 2/3rds of a gig’s performance, but mine is far lower. Also, I have a fiber box that I connect my router to, so I am not using PPPOE, and just to be sure, I made sure it wasn’t even installed. I did some searching, and my problem accurately matches this guy’s issue:

I’ve played around with the IRQ, but I can’t get it to balance; it’s always running on either CPU 0 or 1, and at nearly 100% during a load, which is probably why it’s at 1/3rd performance. Again, I’m not even sure if that is the issue, I just don’t know how to go about troubleshooting this…
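One way to start troubleshooting this is to look at how network receive softirqs are spread across the CPUs. The sketch below reads the per-CPU NET_RX counters from /proc/softirqs; `net_rx_per_cpu` is a hypothetical helper name, and the optional file argument is there only so it can be tried against a saved copy.

```shell
#!/bin/sh
# Print the NET_RX softirq counter for each CPU, one "CPUn count" pair per line.
# net_rx_per_cpu is a hypothetical helper; the file argument exists for testing.
net_rx_per_cpu() {
    file="${1:-/proc/softirqs}"
    # Nothing to report if the file is not readable (e.g. on a non-Linux host).
    [ -r "$file" ] || return 0
    awk '$1 == "NET_RX:" {
        for (i = 2; i <= NF; i++) printf "CPU%d %s\n", i - 2, $i
    }' "$file"
}

# Show the counters on the live system.
net_rx_per_cpu
```

Running it twice, a few seconds apart, during a speed test shows which counters are growing. If only one CPU’s count moves, packet processing is landing on a single core, which would be consistent with the behaviour described above.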

Hi, can you share your 3D print file?

Thanks

@m7mdcc the STL file from the picture of my router is this one from Thingiverse.

I’ve also printed this case, which houses a drive, as I’ve started experimenting with running the espressobin with a drive attached as the root volume instead of an SD card, and the setup works fairly well.

Does anyone have any advice for what to do in case of a power outage? My espressobin was running great until a storm rolled through and killed the power for a few hours. My UPS isn’t large enough to run my network forever and eventually it all came down while I was away at work. I came back to a corrupt SD card and my attempts at fixing it resulted in kernel panics. I’m hoping I can recover my config and set it all back up again. Is there a clever way to shut down cleanly?

APC makes an Arch Linux package for a clean shutdown!

What version of Espressobin did you do this on? My guess is that your build was the V5, just because of the timing of your post … I want to replicate what you have done. It seems to me that the only major difference between V5 and V7 is DDR3 => DDR4 … I think I will purchase the V7 Espressobin.

BTW, this was an amazing blog post! I have been reading about building my own firewall for a while and you seem to be the best resource for doing that :)

Hi!
I bought an Espressobin V7. The newest 64-bit Arch Linux ARM seems to have dnsmasq as the default DHCP and DNS server… and now I’m failing to set it up! Could someone be so friendly as to post the whole dnsmasq.conf of their router?

Thanks Mates

Yep! This was on a V5. I think a V7 would be a great option for a router; the additional CPU horsepower means that you could probably use more intelligent QoS schedulers like CAKE without hitting any bottlenecks.

Hi @Morta! Here are the relevant parts of my dnsmasq.conf. I also have a couple of extra cname= and similar entries for some other names I have set up. Note that there are a few things specific to my setup here:

  • I start the DHCP range at 5 since I have a few reserved static IPs.
  • The last few lines are there only because I have dnsmasq configured for PXE booting; they’re not necessary for the functionality described in this blog post.
  • The conf-dir= option is also optional; I have some dynamically-generated files there.
  • server=/consul/127.0.0.1#8600 lets any host in my LAN resolve requests for names like foo.service.consul.

domain-needed
bogus-priv
server=/consul/127.0.0.1#8600
no-hosts
domain=<my domain>
expand-hosts
conf-dir=/etc/dnsmasq.d
interface=br0
dhcp-range=192.168.1.5,192.168.1.250,255.255.255.0,24h
dhcp-option=option:router,192.168.1.1
address=/router/192.168.1.1
dhcp-boot=ipxe.a56af4e6a9a9.pxe
enable-tftp
tftp-root=/srv/ftp

Hi Tyler.
Very good article indeed. I am already running a router on Arch Linux myself on an APU1D4 and a custom Intel motherboard with dnsmasq, unbound and shorewall.
I came across your article while searching for a way to visualize network data.
I tried the instructions on elasti.co but I am unable to set it up and getting nowhere.
Would it be possible for you to post how you set it up on Arch Linux? The second machine where I want to send netflow data is also running Arch.
Thank you.

I need a dual-WAN solution. Think it’s possible to use this as a router, but just treat the “wan” interface as LAN and use eth0 and eth1 for my PPPoE connections?

Is there any reason why that wouldn’t work? Otherwise I suppose I could buy a USB-to-Ethernet adapter and use it for the second WAN port.