Bug story: getaddrinfo(3) and PBR

Bug story: getaddrinfo(3) and PBR January 01, 2024 [c, posix, openwrt, bugs]

A while ago I was working on wireless access points (APs) based on OpenWrt. One day I discovered that remote logging wasn't working, and the debugging that followed had some surprises.

On OpenWrt, there's a process called logread responsible for shipping the logs to another device via the syslog protocol. These APs don't persist their logs between boots, so sending logs to a system that can store them was essential for diagnosing problems. I noticed logread wasn't running, though it starts on boot, so I added something to the init script to restart logread if it crashed, and was going to call it a day. But I went to test it, and the logs weren't showing up; sometimes, the logs would show up right after the AP booted, but then at some point, it would stop working.

I had already spent a lot of time on the other side of this, the syslog that receives the logs, and was pretty sure the setup was correct there. So I ran logread by hand, and it failed with

failed to connect: Permission denied

What? Permission denied? I read the code to find out where this was happening, and it was in usock(), which is some socket code that's used all over OpenWrt, and there were no obvious calls that could fail with EACCES in it.

After checking some ACLs, making sure this couldn't possibly be a permission problem (it's running as root), I decided to strace logread (this required rebuilding the flash image for the AP, which is why I didn't do it earlier), and saw:

socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_UDP) = 8
connect(8, {sa_family=AF_INET, sin_port=htons(65535), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
close(8)                                = 0
socket(AF_INET6, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_UDP) = 8
connect(8, {sa_family=AF_INET6, sin6_port=htons(65535), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EACCES (Perm
ission denied)
close(8)                                = 0

What the heck? First off, the connection logread is trying to make in this case is a TCP connection, and we're giving it an IP address; why is it making UDP connections to localhost? And why are those connections failing?

I had a guess on why this started happening – a little while before, IPv6 had been disabled on these devices. Maybe it hadn't been done thoroughly enough? I checked ip addr, and lo definitely did not have ::1 as an address, and IPv6 was disabled through the disable_ipv6 sysctl.

I decided that it was probably a call to getaddrinfo() making UDP connections – maybe it's trying to resolve DNS – but why port 65535? Is that just an ephemeral port it's choosing every single time?

I tested getaddrinfo from Lua (the only interpreted language on the device), but it worked fine, so there had to be something about how usock was calling it; did it want IPv6 addresses specifically or something?

musl is the libc of choice on these devices. Checking its implementation of getaddrinfo, we see this block of code near the top:

if (flags & AI_ADDRCONFIG) {
    /* Define the "an address is configured" condition for address
     * families via ability to create a socket for the family plus
     * routability of the loopback address for the family. */
    // …
    static const struct sockaddr_in6 lo6 = {
        .sin6_family = AF_INET6, .sin6_port = 65535,
        .sin6_addr = IN6ADDR_LOOPBACK_INIT
    };
    const void *ta[2] = { &lo4, &lo6 };
    // …
    for (i=0; i<2; i++) {
      // …
      int s = socket(tf[i], SOCK_CLOEXEC|SOCK_DGRAM,
                     IPPROTO_UDP);
      if (s>=0) {
        int cs;
        pthread_setcancelstate(
                               PTHREAD_CANCEL_DISABLE, &cs);
        int r = connect(s, ta[i], tl[i]);
        pthread_setcancelstate(cs, 0);
        close(s);
        if (!r) continue;
      }
      switch (errno) {
      case EADDRNOTAVAIL:
      case EAFNOSUPPORT:
      case EHOSTUNREACH:
      case ENETDOWN:
      case ENETUNREACH:
        break;
      default:
        return EAI_SYSTEM;
      }
      // …

And sure enough, usock always sets AI_ADDRCONFIG on flags. So this is a kind of probing connect musl is using to check the validity of IPv4 or IPv6 on the system. The connect is returning EACCES, but musl isn't handling it as part of the errors it considers "normal". It bails out early, and leaves errno set to EACCES where logread prints it out to mystify us.

But why would connect fail with EACCES? The man page doesn't list anything that makes sense for this.¹ Weirder still, I decide to check if there are any IPv6 addresses at all – and there is one, but for eth0, not lo. I delete it, and suddenly logread works.

At this point I start looking for information about musl's getaddrinfo and this issue, and find a patch posted to the mailing list with no replies, never applied.

Sweet! I head over to #musl on IRC and ask them if it wasn't applied for a reason, and they say it must have been overlooked. But then someone tries to reproduce with the instructions in the patch, and can't.

I dive into the kernel source trying to figure out what actually returns EACCES here. There is a lot of code under ip6_datagram_connect so I tried to grep and pray, but there were still too many possibilites to know for sure. This is an opportunity to use ftrace! I had to rebuild the kernel, since these are stripped down images for embedded devices, and I was worried trace-cmd might actually crash the device, but I got a capture fine. I could see clearly that the last useful function called under ip6_datagram_connect was fib6_rule_action, which can return EACCES, but why? What even are these rules?

I spent a while even trying to figure out what these rules are and how to manipulate them. It turns out they're for "policy-based routing" (PBR), which I hadn't really explored before. I didn't even realize some of these firewall-like policies could be handled at this level.

I was running the ip rule command and not seeing anything interesting, until I finally read the source for ip and noticed it defaults to IPv4 – I needed to run ip -6 rule, but that flag is in the ip(8) manpage, not the ip-rule(8) manpage for the subcommand. But running it on the AP, I saw:

# ip -6 rule
0:      from all lookup local
32766:  from all lookup main
4200000001:     from all iif lo lookup unspec 12
4200000002:     from all iif eth0 lookup unspec 12
4200000003:     from all iif eth1 lookup unspec 12

I'm not sure I fully understand these rules now, and it took a bit of looking (and strace to confirm the netlink message being sent) to see that this is action "12", which isn't one of the actions in the (mainline) kernel. But it was enough info to demonstrate that the issue could be reproduced on any Linux system with

ip -6 rule add from all iif lo lookup unspec prohibit

Some discussion on the musl mailing list revealed that action 12 is a special rule OpenWrt adds in their kernel. I discovered that netifd, which manages interfaces and rules on OpenWrt, was setting IPv6 policies like this, even when IPv6 was disabled, so I patched that out. And finally remote logging worked again.

This was a surprising set of interactions. Figuring it out was tractable thanks to having all the source for everything, and reasonable tools for introspection. Is there a moral to this story? Perhaps a few tidbits: strace and ftrace are good; getaddrinfo is bad; maybe don't disable IPv6; and blessed are those who update manpages.

Footnotes:

At the time this happened, the man pages on my system didn't list this, but looking now, I see Stefan Puiu added a note about this, debugging much the same situation as mine. What a time saver this would have been.

— JS