[pkg-netfilter-team] Bug#1032844: nftables / netfilter: catastrophic bug: totally wrong packet data logged in arp - filter - output chains
Binarus
lists at binarus.de
Sun Mar 12 17:30:12 GMT 2023
Package: nftables
Version: 0.9.8-3.1+deb11u1
Possible further package: linux-image-amd64
Version: 5.10.162-1
OS: Debian bullseye, amd64, vanilla installation, up to date at the time of writing.
NICs: Only one NIC, device enp3s0, working correctly and configured with a static IP address.
IPv6 is disabled at the kernel command line, IPv4 is fully enabled.
Dear nftables / netfilter maintainer,
this message is closely related to the two messages I sent you a few minutes ago. As explained there, a ruleset like
table arp t_ARP
delete table arp t_ARP
table arp t_ARP {
    chain output-filter {
        type filter hook output priority -800; policy drop;
        oifname enp3s0 arp ptype 0x0800 log prefix "Foo:" accept;
        log prefix "arp-output-filter:" drop;
    }
}
stops the machine from answering ARP broadcasts because the accept rule is never executed, probably due to a bug in the nft userspace program or in the kernel.
But this time, let's examine what appears in /var/log/messages due to the drop rule. In my case, as soon as I tried to connect to the PC in question from another box that did not have the respective MAC address in its ARP cache, multiple instances of the following line appeared:
[...] arp-output-filter:IN= OUT=enp0s3 ARP HTYPE=37 PTYPE=0x90bd OPCODE=21
Well, OK. The next 8 hours of that Sunday were spent researching what hardware type that could be (HFI hardware?), what opcode that could be (MARS-grouplist-reply) and what protocol type that could be. Of course, that led nowhere and was a waste of time.
Then I noticed something interesting. The ARP response packet that *should have been sent* (but had been dropped) would have begun with the following bytes:
00 25 90 bd b0 db 00 15 17 75 b2 04 08 06 00 01
The first six bytes are the destination MAC address the response should be sent to, and the next six bytes are the source MAC address of the NIC in question. I then observed that 0x25 = 37 (the HTYPE from the log entry) is the second byte of the destination MAC, 90bd (the PTYPE) is its third and fourth bytes, and 0x15 = 21 (the OPCODE in the log entry) is the second byte of the source MAC.
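The byte comparison above can be reproduced with a short Python sketch. It assumes (this is only one reading that fits the numbers, not a confirmed diagnosis) that the logging code decodes the fixed ARP header fields starting at the beginning of the Ethernet header instead of 14 bytes further in. The helper name parse_arp_fixed_header is made up for this illustration; the frame bytes are the ones quoted above.

```python
import struct

# First 16 bytes of the Ethernet frame quoted in this report.
frame = bytes.fromhex("00 25 90 bd b0 db 00 15 17 75 b2 04 08 06 00 01")

def parse_arp_fixed_header(buf, offset):
    """Decode the 8 fixed bytes of an ARP header (RFC 826 layout) at `offset`.

    Hypothetical helper for illustration; fields are in network byte order:
    htype (2 bytes), ptype (2), hlen (1), plen (1), opcode (2).
    """
    htype, ptype, hlen, plen, opcode = struct.unpack_from("!HHBBH", buf, offset)
    return {"htype": htype, "ptype": ptype, "hlen": hlen,
            "plen": plen, "opcode": opcode}

# Assumption: the ARP header is read at offset 0, i.e. on top of the
# destination and source MAC addresses.
misread = parse_arp_fixed_header(frame, 0)
print("HTYPE=%d PTYPE=0x%04x OPCODE=%d"
      % (misread["htype"], misread["ptype"], misread["opcode"]))
# prints: HTYPE=37 PTYPE=0x90bd OPCODE=21

# For comparison, the fields at their proper offsets in the same frame:
(ethertype,) = struct.unpack_from("!H", frame, 12)   # EtherType, 0x0806 = ARP
(real_htype,) = struct.unpack_from("!H", frame, 14)  # ARP HTYPE, 1 = Ethernet
print("EtherType=0x%04x real HTYPE=%d" % (ethertype, real_htype))
# prints: EtherType=0x0806 real HTYPE=1
```

Under that assumption the three logged values match the log entry exactly, which would point to a wrong header offset in the logging path rather than to random data.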
This clearly shows that something catastrophic is going on there. I am now absolutely sure that nftables or the kernel uses totally wrong data to create the log entry. The log entry shown above doesn't make any sense, and it is impossible that it is correct for the system in question. Instead, the kernel outputs arbitrary data taken from *bytes of the packet's MAC addresses as HTYPE, PTYPE and OPCODE.*
Wow, that's really an epic fail that can hardly be excused. It took me the whole day to find out, and it has totally destroyed any trust in nftables. It might even impact security: If we write a firewall rule and the kernel uses arbitrary data to check packets against the rule, packets that should be dropped will be accepted, and vice versa.
It would be great if you could let me know what you think about the situation as soon as your time allows (we are currently under pressure with a project that aims to replace all our iptables-based firewall scripts with nftables-based ones, but that's not possible now). We really don't know how to proceed. Dump Debian? Dump Linux? Stick with iptables, dump nftables and revisit it in five years? Not really good options ... Any advice would be greatly appreciated.
Thank you very much again, and have a nice Sunday (it's already evening here),
Binarus