Yet Another NixOS Router on the APU2
on nevi.dev
Motivation
I was having trouble with my previous all-in-one router-switch-AP (TP-Link Archer C7 running OpenWRT):
- I was running into this issue, which occasionally killed the network, but I could not troubleshoot it because the router could only be accessed through the network, which was dead.
- It could not run NixOS.
- I was dropping connections due to conntrack limits.
Hardware
After a bit of research, I settled on the APU2E5 with the wle600vx WiFi card, for the following reasons:
- low-powered and fanless
- x86_64 and plenty of memory
- tons of reports of people running NixOS successfully
I ordered the router pre-assembled, but it was damaged during shipping, which bent the case and the board inside. The router itself still worked, so I left it as-is. I also wanted to use a 2.5" SATA SSD for storage, but it would not fit in the case,[1] so I bought a separate SD card.
I could not screw in one of the corners because the holes no longer aligned without uncomfortably bending the board.
Initial configuration
I performed the initial installation with `nixos-install --flake .#funi --root /mount/point` after partitioning the SD card. Later deployments (once the network was up) were done with `nix copy --to ssh://funi.nevi.network` before a `nixos-rebuild switch` on the APU.
I could have built the configuration on the router itself with just `nixos-rebuild switch`, but my configuration requires rebuilding some large packages, including the kernel, which would have taken forever on the low-power SoC.
Update (2023-09-22): I came across this blog post which made me aware of the `--target-host` flag. My deployments now look like `nixos-rebuild switch --flake .#funi --target-host root@funi.nevi.network`.
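For context, the `.#funi` in these commands refers to a `nixosConfigurations.funi` flake output. A minimal sketch of what such a flake could look like follows. The input branches, the `./configuration.nix` path, and the use of `specialArgs` are assumptions for illustration, not the actual repository layout.

```nix
{
  # Assumed inputs: a stable base plus unstable for selected modules/packages.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-23.05";
  inputs.nixpkgs-unstable.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, nixpkgs-unstable, ... }: {
    # `--flake .#funi` selects this attribute.
    nixosConfigurations.funi = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      # Make the flake inputs available as module arguments,
      # including inside `imports` and `disabledModules`.
      specialArgs = { inherit self nixpkgs nixpkgs-unstable; };
      modules = [ ./configuration.nix ]; # hypothetical path
    };
  };
}
```

With `--target-host`, the system is built on the machine running the command and then copied to and activated on the router, which avoids compiling the kernel on the APU.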
I also updated the firmware to the latest recommended version following the instructions on the TekLager website (method 4, using my existing NixOS system instead of a live USB). For the serial console, I used picocom (`picocom -b 115200 /dev/ttyUSB0`).
Network configuration
DHCP: dnsmasq
I configured dnsmasq to handle local DHCP (both v4 and v6), router advertisements (RA), and DNS. I am not using SLAAC for IPv6, because I wanted proper DNS resolution of IPv6 addresses for hosts in my local network. This way, I can configure static assignments on the router, and let every host automatically configure itself with DHCP.
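A rough sketch of what such a setup could look like with the NixOS dnsmasq module is shown below. The interface name `lan0`, the address ranges, the MAC address, the DUID, and the hostname are all made-up placeholders, not values from my actual configuration.

```nix
{
  services.dnsmasq = {
    enable = true;
    settings = {
      interface = "lan0"; # hypothetical LAN interface

      dhcp-range = [
        # DHCPv4 pool
        "192.168.1.100,192.168.1.199,12h"
        # Stateful DHCPv6 pool; the prefix is taken from the address on lan0
        "::100,::1ff,constructor:lan0,64,12h"
      ];

      # Send router advertisements. Without an ra-* mode on the range,
      # dnsmasq sets the "managed" flag and does not advertise SLAAC.
      enable-ra = true;

      # Static assignments; the same name resolves over both IPv4 and IPv6.
      dhcp-host = [
        "aa:bb:cc:dd:ee:ff,desktop,192.168.1.10"
        "id:00:01:00:01:2a:3b:4c:5d:aa:bb:cc:dd:ee:ff,desktop,[::10]"
      ];
    };
  };
}
```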
DNS: dnsmasq & unbound
Dnsmasq handles any local domains, overrides, filtering, and ad blocking. Unbound then receives any upstream queries, acting as a caching, recursing, validating resolver. By letting dnsmasq handle the local side, I get DHCP address resolution for free, both IPv4 and IPv6. At the same time, Unbound can handle DNSSEC (without having to worry about conflicts with local blocklists) and recursion (not supported by dnsmasq).
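The split described above could be wired up roughly as follows. This is only a sketch: the port `5335`, the `lan` domain, and the choice to bind Unbound to localhost are assumptions for illustration.

```nix
{
  # Unbound: caching, recursing, validating resolver, reachable only locally.
  services.unbound = {
    enable = true;
    # Leave the router's own resolver setup to dnsmasq.
    resolveLocalQueries = false;
    settings.server = {
      interface = [ "127.0.0.1" ];
      port = 5335; # assumed port, chosen to stay clear of dnsmasq on 53
    };
  };

  # dnsmasq: answers local names itself and forwards everything else.
  services.dnsmasq.settings = {
    no-resolv = true;              # ignore /etc/resolv.conf for upstreams
    server = [ "127.0.0.1#5335" ]; # send upstream queries to Unbound
    domain = "lan";                # hypothetical local domain
    local = "/lan/";               # never forward queries for the local domain
    expand-hosts = true;           # append the domain to DHCP hostnames
  };
}
```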
DNS blocking
I implemented DNS ad blocking by adding the hosts blocklist to dnsmasq. I packaged the hosts list itself here.
Initially, I had configured this by simply using `addn-hosts`. However, I ran into an issue where queries for record types other than A or AAAA (such as HTTPS, used by Apple systems) were still forwarded upstream and answered.[2] To work around this, I set both the `local=` and `address=` options for each host in the blocklist.
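```nix
{
  services.dnsmasq.settings = {
    # Generate a dnsmasq config fragment from the packaged hosts blocklist.
    conf-file = (pkgs.runCommand "dnsmasq-hosts" { } ''
      # Extract the blocked hostnames, dropping the first matching entry.
      < ${self.packages.${pkgs.system}.hosts}/hosts \
        grep ^0.0.0.0 \
        | awk '{print $2}' \
        | tail -n+2 \
        > hosts
      # local= keeps queries of any type from being forwarded upstream;
      # address= answers A queries with 0.0.0.0.
      awk '{print "local=/" $0 "/"}' hosts >> $out
      awk '{print "address=/" $0 "/0.0.0.0"}' hosts >> $out
    '').outPath;
  };
}
```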
Firewall: nft
I wanted to have fine-grained control over my firewall, so I configured it manually instead of using the NixOS firewall module. My firewall (configuration) is a straightforward stateful firewall, but I tried using flowtables to let existing flows bypass the firewall once accepted.
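To illustrate the flowtable idea, here is a heavily simplified sketch. It is not my actual ruleset: the interface names `wan0` and `lan0`, the table and flowtable names, and the single forwarding rule are placeholders.

```nix
{
  # Assumption: the stock NixOS firewall module is turned off
  # in favour of a hand-written ruleset.
  networking.firewall.enable = false;

  networking.nftables = {
    enable = true;
    ruleset = ''
      table inet filter {
        # Flows added here are forwarded at ingress, bypassing the chains below.
        flowtable f {
          hook ingress priority 0;
          devices = { wan0, lan0 };
        }

        chain forward {
          type filter hook forward priority 0; policy drop;

          # Offload TCP/UDP flows once conntrack considers them established.
          meta l4proto { tcp, udp } flow add @f
          ct state established,related accept

          # New connections still go through the normal rules.
          iifname "lan0" oifname "wan0" accept
        }
      }
    '';
  };
}
```

Disabling offloading amounts to removing the `flow add` rule, after which every packet of an established flow goes through the `forward` chain again.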
You can see the effects of offloading here:
The middle period (where the graph is non-zero) is when I disabled flow offloading. However, despite the very obvious effect seen on the graph, it did not measurably improve network performance or reduce CPU usage in my testing. I kept the flowtable configuration enabled on my system regardless, for the good feelings.
While configuring the firewall, I referenced the following resources:
- RFC 4890 sections 4.3 and 4.4
- Kernel flowtable documentation
- nftables flowtable documentation
- nft(8)
WiFi: hostapd
WiFi is cursed and so is hostapd. My configuration can be found here.
- The NixOS 23.05 hostapd module is severely lacking. In particular, it puts the WPA password in the world-readable Nix store by default.
- There is a rewrite on nixpkgs-unstable with significant improvements, with easier configuration and an alternative to the world-readable password. The rewrite also enables a few important features such as OCV in the hostapd package.
For these reasons, I pulled in the unstable hostapd module and package into my otherwise mostly 23.05 system:
```nix
{
  disabledModules = [ "${nixpkgs}/nixos/modules/services/networking/hostapd.nix" ];
  imports = [ "${nixpkgs-unstable}/nixos/modules/services/networking/hostapd.nix" ];
  services.hostapd.package = pkgs.pkgsUnstable.hostapd;
}
```
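With the unstable module in place, an access point definition might look roughly like the sketch below. The option names reflect my understanding of the rewritten module and should be checked against its documentation. The interface name, SSID, channel, country code, and secret path are placeholders rather than my real settings.

```nix
{
  services.hostapd = {
    enable = true;
    radios.wlan0 = {        # hypothetical interface name
      band = "5g";
      channel = 36;         # placeholder channel
      countryCode = "JP";   # assumed regulatory domain
      networks.wlan0 = {
        ssid = "example-ssid";
        authentication = {
          mode = "wpa3-sae";
          # Read at runtime, so the password stays out of the Nix store.
          saePasswordsFile = "/run/secrets/wifi-sae-passwords";
        };
      };
    };
  };
}
```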
To get WiFi working properly on this hardware, I also had to apply a couple of kernel configuration changes:
- The `ATH_USER_REGD` patch adapted from OpenWRT overrides the buggy firmware to allow changing the regulatory domain.
  - Without this change, I am unable to use the 5GHz band in AP mode, forcing me to use the 2.4GHz band.
```nix
{
  networking.wireless.athUserRegulatoryDomain = true;
}
```
- Enabling `ATH10K_DFS_CERTIFIED` allows me to use the DFS channels.
```nix
{
  boot.kernelPatches = [{
    name = "enable-ath-DFS-JP";
    patch = null;
    extraStructuredConfig = with lib.kernel; {
      EXPERT = yes;
      CFG80211_CERTIFICATION_ONUS = yes;
      ATH10K_DFS_CERTIFIED = yes;
    };
  }];
}
```
Things to investigate
- VLANs
[1] If I had done a tiny bit more research, I would have known that there was no way to fit a full 2.5" SSD in there.
[2] This behavior is quite recent, since dnsmasq 2.86, and is documented in the manpage since 2.87:
"Note that the behaviour for queries which don't match the specified address literal changed in version 2.86. Previous versions, configured with (eg) --address=/example.com/1.2.3.4 and then queried for a RR type other than A would return a NoData answer. From 2.86, the query is sent upstream. To restore the pre-2.86 behaviour, use the configuration --address=/example.com/1.2.3.4 --local=/example.com/"