[pkg-netfilter-team] Bug#1053564: Acknowledgement (nftables: nft freeze after some times, probably as a result of excessive use of named set)

Daniel Haryo Sugondo sugondo at hlrs.de
Fri Nov 24 10:21:39 GMT 2023


Hi,

with the RT kernel from backports, systemd went into the [D] (uninterruptible sleep) state:

[327659.296273] general protection fault, probably for non-canonical address 0xdef367ea15824066: 0000 [#1] PREEMPT_RT SMP PTI
[327659.296278] CPU: 2 PID: 874 Comm: bash Not tainted 6.5.0-0.deb12.1-rt-amd64 #1  Debian 6.5.3-1~bpo12+1
[327659.296281] Hardware name: FUJITSU PRIMERGY RX1330 M2/D3375-A1, BIOS V5.0.0.11 R1.31.0 for D3375-A1x                    02/22/2023
[327659.296282] RIP: 0010:___slab_alloc+0x5f0/0xaf0
[327659.296287] Code: 48 48 8b 74 24 38 48 89 46 10 e8 0b 04 81 00 41 8b 54 24 28 48 8b 74 24 38 4c 01 f2 48 89 d0 48 0f c8 49 33 84 24 b8 00 00 00 <48> 33 02 48 81 46 08 00 20 00 00 48 89 06 49 8b 1c 24 48 83 c3 20
[327659.296289] RSP: 0018:ffffc3be41d57af0 EFLAGS: 00010286
[327659.296292] RAX: c8fdfcb812544c17 RBX: 0000000000039b60 RCX: ffffffffb12093a9
[327659.296293] RDX: def367ea15824066 RSI: ffffa0f28fcb9b40 RDI: ffffffffb203a026
[327659.296295] RBP: ffffc3be41d57bc8 R08: ffffa0f28fcb9b40 R09: 0000000000000048
[327659.296296] R10: ffffa0ebc025f800 R11: 0000000000000000 R12: ffffa0eb40044d00
[327659.296297] R13: ffffa0eb40044d00 R14: def367ea15824036 R15: fffff58dc4046500
[327659.296298] FS:  00007f13340d3740(0000) GS:ffffa0f28fc80000(0000) knlGS:0000000000000000
[327659.296300] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[327659.296302] CR2: 00005616386d6f78 CR3: 0000000152b6a004 CR4: 00000000003706e0
[327659.296303] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[327659.296304] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[327659.296306] Call Trace:
[327659.296308]  <TASK>
[327659.296310]  ? die_addr+0x36/0x90
[327659.296315]  ? exc_general_protection+0x1c5/0x430
[327659.296319]  ? memcg_slab_post_alloc_hook+0x17d/0x250
[327659.296324]  ? asm_exc_general_protection+0x26/0x30
[327659.296329]  ? alloc_fdtable+0xb9/0x110
[327659.296334]  ? ___slab_alloc+0x5f0/0xaf0
[327659.296337]  ? alloc_fdtable+0xb9/0x110
[327659.296341]  ? migrate_enable+0xd9/0x150
[327659.296345]  ? alloc_fdtable+0xb9/0x110
[327659.296348]  __kmem_cache_alloc_node+0xf7/0x270
[327659.296352]  ? alloc_fdtable+0xb9/0x110
[327659.296355]  __kmalloc_node+0x50/0x1a0
[327659.296360]  alloc_fdtable+0xb9/0x110
[327659.296363]  dup_fd+0x210/0x2d0
[327659.296366]  copy_process+0x1046/0x1ce0
[327659.296372]  kernel_clone+0xc3/0x4a0
[327659.296374]  ? wp_page_reuse+0x4d/0x60
[327659.296378]  __do_sys_clone+0x66/0x90
[327659.296382]  do_syscall_64+0x5c/0xc0
[327659.296385]  ? __count_memcg_events+0x86/0xe0
[327659.296388]  ? memcg_stats_unlock+0xf/0x50
[327659.296391]  ? count_memcg_events.constprop.0+0x1a/0x30
[327659.296393]  ? handle_mm_fault+0x9e/0x350
[327659.296397]  ? do_user_addr_fault+0x18c/0x640
[327659.296400]  ? fpregs_assert_state_consistent+0x26/0x50
[327659.296402]  ? exit_to_user_mode_prepare+0x40/0x1d0
[327659.296406]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[327659.296409] RIP: 0033:0x7f13341aa193
[327659.296411] Code: 00 00 00 00 00 66 90 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00
[327659.296413] RSP: 002b:00007fff93202428 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[327659.296415] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f13341aa193
[327659.296416] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[327659.296417] RBP: 0000000000000000 R08: 0000000000000000 R09: 0031263e32206c6c
[327659.296418] R10: 00007f13340d3a10 R11: 0000000000000246 R12: 0000000000000001
[327659.296420] R13: 00007fff93202660 R14: 0000561637cb5b08 R15: 000056163870c210
[327659.296424]  </TASK>
[327659.296424] Modules linked in: xfrm_user xfrm_algo bridge 8021q garp stp mrp llc nfnetlink_log nft_log nft_limit nft_ct nf_tables nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink binfmt_misc ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ghash_clmulni_intel sha512_ssse3 sha512_generic aesni_intel crypto_simd cryptd rapl mgag200 intel_cstate iTCO_wdt drm_shmem_helper intel_pmc_bxt mei_me iTCO_vendor_support drm_kms_helper intel_uncore watchdog ee1004 pcspkr mei acpi_ipmi intel_pch_thermal ie31200_edac ipmi_si ipmi_devintf ipmi_msghandler evdev joydev intel_pmc_core acpi_pad acpi_power_meter button sg drm fuse loop dm_mod efi_pstore configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid0 multipath linear csiostor raid1 md_mod sd_mod hid_generic t10_pi usbhid hid crc64_rocksoft crc64 crc_t10dif
[327659.296487]  crct10dif_generic ahci libahci xhci_pci libata xhci_hcd scsi_transport_fc cxgb4 igb crct10dif_pclmul crct10dif_common scsi_mod crc32_pclmul usbcore crc32c_intel i2c_i801 tls i2c_smbus i2c_algo_bit dca scsi_common usb_common video wmi
[327659.296519] ---[ end trace 0000000000000000 ]---
[327659.602320] pstore: backend (erst) writing error (-28)
[327659.602323] RIP: 0010:___slab_alloc+0x5f0/0xaf0
[327659.602368] Code: 48 48 8b 74 24 38 48 89 46 10 e8 0b 04 81 00 41 8b 54 24 28 48 8b 74 24 38 4c 01 f2 48 89 d0 48 0f c8 49 33 84 24 b8 00 00 00 <48> 33 02 48 81 46 08 00 20 00 00 48 89 06 49 8b 1c 24 48 83 c3 20
[327659.602369] RSP: 0018:ffffc3be41d57af0 EFLAGS: 00010286
[327659.602371] RAX: c8fdfcb812544c17 RBX: 0000000000039b60 RCX: ffffffffb12093a9
[327659.602372] RDX: def367ea15824066 RSI: ffffa0f28fcb9b40 RDI: ffffffffb203a026
[327659.602373] RBP: ffffc3be41d57bc8 R08: ffffa0f28fcb9b40 R09: 0000000000000048
[327659.602374] R10: ffffa0ebc025f800 R11: 0000000000000000 R12: ffffa0eb40044d00
[327659.602375] R13: ffffa0eb40044d00 R14: def367ea15824036 R15: fffff58dc4046500
[327659.602393] FS:  00007f13340d3740(0000) GS:ffffa0f28fc80000(0000) knlGS:0000000000000000
[327659.602395] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[327659.602396] CR2: 00005616386d6f78 CR3: 0000000152b6a004 CR4: 00000000003706e0
[327659.602397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[327659.602397] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Regards,

Daniel.


----- Original Message -----
From: "Daniel Haryo Sugondo" <sugondo at hlrs.de>
To: "1053564" <1053564 at bugs.debian.org>
Sent: Tuesday, October 24, 2023 5:03:58 PM
Subject: Re: Bug#1053564: Acknowledgement (nftables: nft freeze after some times, probably as a result of excessive use of named set)

Hi,

just a status update: the backports kernel 6.5.0-0.deb12.1-amd64 still has the bug.

Should I contact the kernel maintainers to report this?

# uptime 
 17:00:46 up  5:48,  1 user,  load average: 1.00, 1.00, 0.80

# ps aux | grep nft
root      118228  0.0  0.0      0     0 ?        D    16:38   0:00 [nft]


Oct 24 16:37:31  nftfqdn.sh[117820]: /dev/shm/fqdn.nft:6:39-63: Error: Could not process rule: File exists
Oct 24 16:37:31  nftfqdn.sh[117820]: add element inet firewall fq4-acc-o { 143.204.98.10 . tcp . 443 }
Oct 24 16:37:31  nftfqdn.sh[117820]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:37:31  nftfqdn.sh[117820]: /dev/shm/fqdn.nft:10:39-63: Error: Could not process rule: File exists
Oct 24 16:37:31  nftfqdn.sh[117820]: add element inet firewall fq4-acc-o { 143.204.98.14 . tcp . 443 }
Oct 24 16:37:31  nftfqdn.sh[117820]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:37:31  nftfqdn.sh[117820]: /dev/shm/fqdn.nft:14:39-63: Error: Could not process rule: File exists
Oct 24 16:37:31  nftfqdn.sh[117820]: add element inet firewall fq4-acc-o { 143.204.98.24 . tcp . 443 }
Oct 24 16:37:31  nftfqdn.sh[117820]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:37:49  nftfqdn.sh[117922]: /dev/shm/fqdn.nft:2:39-62: Error: Could not process rule: File exists
Oct 24 16:37:49  nftfqdn.sh[117922]: add element inet firewall fq4-acc-o { 143.204.98.3 . tcp . 443 }
Oct 24 16:37:49  nftfqdn.sh[117922]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:38:06  nftfqdn.sh[118024]: /dev/shm/fqdn.nft:2:39-62: Error: Could not process rule: File exists
Oct 24 16:38:06  nftfqdn.sh[118024]: add element inet firewall fq4-acc-o { 143.204.98.3 . tcp . 443 }
Oct 24 16:38:06  nftfqdn.sh[118024]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:38:23  nftfqdn.sh[118126]: /dev/shm/fqdn.nft:2:39-62: Error: Could not process rule: File exists
Oct 24 16:38:23  nftfqdn.sh[118126]: add element inet firewall fq4-acc-o { 143.204.98.3 . tcp . 443 }
Oct 24 16:38:23  nftfqdn.sh[118126]:                                       ^^^^^^^^^^^^^^^^^^^^^^^^
Oct 24 16:38:41  kernel: general protection fault, probably for non-canonical address 0x2bdf9ea774ac39fc: 0000 [#1] PREEMPT SMP PTI
Oct 24 16:38:41  kernel: CPU: 3 PID: 118228 Comm: nft Tainted: G            E      6.5.0-0.deb12.1-amd64 #1  Debian 6.5.3-1~bpo12+1
Oct 24 16:38:41  kernel: Hardware name: FUJITSU PRIMERGY RX1330 M2/D3375-A1, BIOS V5.0.0.11 R1.31.0 for D3375-A1x                    02/22/2023
Oct 24 16:38:41  kernel: RIP: 0010:__kmem_cache_alloc_node+0x1cd/0x310
Oct 24 16:38:41  kernel: Code: f7 44 24 08 00 08 08 00 74 91 44 89 ea c1 ea 08 21 d0 eb 87 41 8b 44 24 28 4d 8b 0c 24 49 8d 88 00 20 00 00 48 01 f8 48 89 c2 <48> 8b 00 49 33 84 24 b8 00 00 00 48 0f ca 48 31 d0 4c 89 c2 48 89
Oct 24 16:38:41  kernel: RSP: 0018:ffffa49642a57530 EFLAGS: 00010206
Oct 24 16:38:41  kernel: RAX: 2bdf9ea774ac39fc RBX: 0000000000400dc0 RCX: 000000000634e003
Oct 24 16:38:41  kernel: RDX: 2bdf9ea774ac39fc RSI: ffffffffacc50147 RDI: 2bdf9ea774ac39dc
Oct 24 16:38:41  kernel: RBP: ffffa49642a57580 R08: 000000000634c003 R09: 0000000000038580
Oct 24 16:38:41  kernel: R10: 0000000000000000 R11: ffffffffffffffff R12: ffff919d80044c00
Oct 24 16:38:41  kernel: R13: 0000000000400dc0 R14: ffff919d83542140 R15: 00000000ffffffff
Oct 24 16:38:41  kernel: FS:  00007fcca6a70740(0000) GS:ffff91a4cfcc0000(0000) knlGS:0000000000000000
Oct 24 16:38:41  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 24 16:38:41  kernel: CR2: 00007ffdc47ab0b8 CR3: 000000010650c006 CR4: 00000000003706e0
Oct 24 16:38:41  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 24 16:38:41  kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 24 16:38:41  kernel: Call Trace:
Oct 24 16:38:41  kernel:  <TASK>
Oct 24 16:38:41  kernel:  ? die_addr+0x36/0x90
Oct 24 16:38:41  kernel:  ? exc_general_protection+0x1c5/0x430
Oct 24 16:38:41  kernel:  ? asm_exc_general_protection+0x26/0x30
Oct 24 16:38:41  kernel:  ? __kmem_cache_alloc_node+0x1cd/0x310
Oct 24 16:38:41  kernel:  ? nft_set_elem_init+0x54/0x200 [nf_tables]
Oct 24 16:38:41  kernel:  ? nft_set_elem_init+0x54/0x200 [nf_tables]
Oct 24 16:38:41  kernel:  __kmalloc+0x4d/0x150
Oct 24 16:38:41  kernel:  nft_set_elem_init+0x54/0x200 [nf_tables]
Oct 24 16:38:41  kernel:  nft_add_set_elem+0xb5b/0x12c0 [nf_tables]
Oct 24 16:38:41  kernel:  nf_tables_newsetelem+0x1a1/0x240 [nf_tables]
Oct 24 16:38:41  kernel:  nfnetlink_rcv_batch+0x7d6/0x970 [nfnetlink]
Oct 24 16:38:41  kernel:  nfnetlink_rcv+0x179/0x1a0 [nfnetlink]
Oct 24 16:38:41  kernel:  netlink_unicast+0x19e/0x290
Oct 24 16:38:41  kernel:  netlink_sendmsg+0x254/0x4d0
Oct 24 16:38:41  kernel:  sock_sendmsg+0x93/0xa0
Oct 24 16:38:41  kernel:  ____sys_sendmsg+0x285/0x310
Oct 24 16:38:41  kernel:  ? copy_msghdr_from_user+0x7d/0xc0
Oct 24 16:38:41  kernel:  ___sys_sendmsg+0x9a/0xe0
Oct 24 16:38:41  kernel:  ? sk_getsockopt+0x72b/0x1230
Oct 24 16:38:41  kernel:  __sys_sendmsg+0x7a/0xd0
Oct 24 16:38:41  kernel:  do_syscall_64+0x5c/0xc0
Oct 24 16:38:41  kernel:  ? fpregs_assert_state_consistent+0x26/0x50
Oct 24 16:38:41  kernel:  ? exit_to_user_mode_prepare+0x40/0x1d0
Oct 24 16:38:41  kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
Oct 24 16:38:41  kernel:  ? do_syscall_64+0x6b/0xc0
Oct 24 16:38:41  kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
Oct 24 16:38:41  kernel:  ? do_syscall_64+0x6b/0xc0
Oct 24 16:38:41  kernel:  ? exit_to_user_mode_prepare+0x40/0x1d0
Oct 24 16:38:41  kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Oct 24 16:38:41  kernel: RIP: 0033:0x7fcca6cb7930
Oct 24 16:38:41  kernel: Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 66 2e 0f 1f 84 00 00 00 00 00 90 80 3d b1 fc 0c 00 00 74 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 54
Oct 24 16:38:41  kernel: RSP: 002b:00007ffdc47ab0b8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
Oct 24 16:38:41  kernel: RAX: ffffffffffffffda RBX: 00007ffdc47bc2b0 RCX: 00007fcca6cb7930
Oct 24 16:38:41  kernel: RDX: 0000000000000000 RSI: 00007ffdc47bc160 RDI: 0000000000000003
Oct 24 16:38:41  kernel: RBP: 00007ffdc47bc260 R08: 00007ffdc47ab094 R09: 000055b302903520
Oct 24 16:38:41  kernel: R10: 00007fcca6e9ff00 R11: 0000000000000202 R12: 000055b3028d9b50
Oct 24 16:38:41  kernel: R13: 0000000000010000 R14: 00007ffdc47ab0d0 R15: 0000000000000001
Oct 24 16:38:41  kernel:  </TASK>
Oct 24 16:38:41  kernel: Modules linked in: bridge(E) 8021q(E) garp(E) stp(E) mrp(E) llc(E) nfnetlink_log(E) nft_log(E) nft_limit(E) nft_ct(E) nf_tables(E) nf_conntrack_netlink(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nfnetlink(E) binfmt_misc(E) inte>
Oct 24 16:38:41  kernel:  async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid0(E) multipath(E) linear(E) csiostor(E) raid1(E) md_mod(E) sd_mod(E) t10_pi(E) hid_generic(E) crc64_rocksoft(E>
Oct 24 16:38:41  kernel: ---[ end trace 0000000000000000 ]---
Oct 24 16:38:41  kernel: RIP: 0010:__kmem_cache_alloc_node+0x1cd/0x310
Oct 24 16:38:41  kernel: Code: f7 44 24 08 00 08 08 00 74 91 44 89 ea c1 ea 08 21 d0 eb 87 41 8b 44 24 28 4d 8b 0c 24 49 8d 88 00 20 00 00 48 01 f8 48 89 c2 <48> 8b 00 49 33 84 24 b8 00 00 00 48 0f ca 48 31 d0 4c 89 c2 48 89
Oct 24 16:38:41  kernel: RSP: 0018:ffffa49642a57530 EFLAGS: 00010206
Oct 24 16:38:41  kernel: RAX: 2bdf9ea774ac39fc RBX: 0000000000400dc0 RCX: 000000000634e003
Oct 24 16:38:41  kernel: RDX: 2bdf9ea774ac39fc RSI: ffffffffacc50147 RDI: 2bdf9ea774ac39dc
Oct 24 16:38:41  kernel: RBP: ffffa49642a57580 R08: 000000000634c003 R09: 0000000000038580
Oct 24 16:38:42  kernel: R10: 0000000000000000 R11: ffffffffffffffff R12: ffff919d80044c00
Oct 24 16:38:42  kernel: R13: 0000000000400dc0 R14: ffff919d83542140 R15: 00000000ffffffff
Oct 24 16:38:42  kernel: FS:  00007fcca6a70740(0000) GS:ffff91a4cfcc0000(0000) knlGS:0000000000000000
Oct 24 16:38:42  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 24 16:38:42  kernel: CR2: 00007ffdc47ab0b8 CR3: 000000010650c006 CR4: 00000000003706e0
Oct 24 16:38:42  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 24 16:38:42  kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


regards,

----- Original Message -----
From: "Daniel Haryo Sugondo" <sugondo at hlrs.de>
To: "Arturo Borrero Gonzalez" <arturo at debian.org>
Cc: "1053564" <1053564 at bugs.debian.org>
Sent: Tuesday, October 24, 2023 11:22:50 AM
Subject: Re: Bug#1053564: Acknowledgement (nftables: nft freeze after some times, probably as a result of excessive use of named set)

Hi Arturo,

thank you for your answer. I'll give 6.5.0-0.deb12.1-amd64 a try now.

On the 1st of October I tested linux-image-6.4.0-0.deb12.2-amd64, but
the problem still existed, so on the 2nd of October I reverted to the
default Debian 12 kernel.

regards.

----- Original Message -----
From: "Arturo Borrero Gonzalez" <arturo at debian.org>
To: "Daniel Haryo Sugondo" <sugondo at hlrs.de>
Cc: "1053564" <1053564 at bugs.debian.org>
Sent: Tuesday, October 24, 2023 10:36:42 AM
Subject: Re: Bug#1053564: Acknowledgement (nftables: nft freeze after some times, probably as a result of excessive use of named set)

On 10/24/23 10:20, Daniel Haryo Sugondo wrote:
> Dear maintainer
> 
> the problem with named sets makes the system unusable.
> 
> I would be very thankful if you could give me some hints about what
> has gone wrong with this behavior since Debian 12.
> 


Hi Daniel,

this sounds to me like a bug in the nf_tables Linux kernel subsystem.

I don't have the info at hand at the moment on whether this has already been
fixed. I would try using a newer kernel, either stable or backports.

regards.


