Provided by: bpftune_0.0~git20250314.8fd59cc-1_amd64

NAME

       BPFTUNE-NEIGH - Neighbor table bpftune plugin for managing neighbor table sizing

DESCRIPTION

          The neighbor table contains layer 3 -> layer 2 mappings and reachability information on remote
          systems. A lookup by layer 3 address (IP address) finds the associated layer 2 address (MAC
          address).

          The table is populated with both static and garbage-collected values. When adding entries we can
          specify that they should be PERMANENT, in which case they are exempt from garbage collection.

          Periodic garbage collection happens for non-permanent failed or expired entries; it is run immediately
          if we cannot allocate a new neighbor table entry.

          There are a few pathologies we want to avoid here, principally:

          • neighbor table full: if we see "Neighbour table overflow." in /var/log/messages, we have run out
            of space. This can occur if garbage collection isn't run quickly enough or if the table is full
            of entries not subject to garbage collection.

            In the former case, we could auto-tune by reducing gc_thresh2, since this makes GC run more quickly.

            In the latter case, with a large number (75% or more) of entries exempt from GC, garbage
            collection won't help, so we have to increase gc_thresh3. This is done on a per-table basis via
            netlink, so the resource cost is limited compared with setting a system-wide tunable. Size is
            increased by 25% of the current value (so 1024 -> 1280, etc).

            Note that by increasing gc_thresh3 only, garbage collection gets more time to run between the
            table reaching gc_thresh2 entries and reaching gc_thresh3. So it effectively helps with both
            scenarios.

          • neighbor table thrashing: too-aggressive GC eviction might lead to excessive overhead in
            re-establishing L3->L2 reachability information. TBD.
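
          The sizing rule described for the table-full case can be sketched as follows. This is a minimal
          illustration in Python, not bpftune's implementation (which acts on the table via netlink); the
          function name and its parameters are hypothetical, and it models only the stated rule that a
          table with 75% or more GC-exempt entries has gc_thresh3 grown by 25% of its current value.

```python
def next_gc_thresh3(current_thresh3, total_entries, gc_exempt_entries):
    """Return the gc_thresh3 to use after a table-full event.

    Hypothetical helper: models only the rule stated in the text above.
    """
    if total_entries <= 0:
        # Nothing in the table, so there is no basis for growing it.
        return current_thresh3
    exempt_fraction = gc_exempt_entries / total_entries
    if exempt_fraction >= 0.75:
        # GC cannot reclaim enough entries; grow by 25% of the current value.
        return current_thresh3 + current_thresh3 // 4
    # Otherwise leave the hard maximum alone in this simplified model.
    return current_thresh3


print(next_gc_thresh3(1024, 1024, 800))  # 800/1024 of entries are exempt -> 1280
print(next_gc_thresh3(1024, 1024, 100))  # mostly collectable entries -> 1024
```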

          Tunables:

          • gc_interval: how often garbage collection should happen; default 30 seconds.

          • gc_stale_time: how often to check for stale entries. If a neighbor goes stale, it is resolved again
            before sending data; defaults to 60 seconds.

          • base_reachable_time_ms: how long a neighbor entry is considered reachable for; defaults to 30 seconds.

          • gc_thresh1: with a table size below this value, no GC happens; default 128.

          • gc_thresh2: soft max of entries in table; if we exceed this value, GC is done after waiting
            5 seconds; default 512.

          • gc_thresh3: hard max for table size; GC will run if more entries than this exist; default 1024.
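
          How the three thresholds partition the table size can be sketched as below. This is a
          simplified model of the descriptions above, not kernel code; the function name and the returned
          labels are hypothetical, and the defaults are the ones listed in this page.

```python
def gc_behaviour(entries, thresh1=128, thresh2=512, thresh3=1024):
    """Classify a table size against the gc_thresh* tunables.

    Hypothetical helper following the tunable descriptions above.
    """
    if entries < thresh1:
        return "no GC"
    if entries <= thresh2:
        return "periodic GC"
    if entries <= thresh3:
        # Above the soft maximum: GC follows after a 5 second wait.
        return "GC after 5 second wait"
    # Above the hard maximum: GC runs.
    return "GC runs"


for size in (100, 300, 800, 1100):
    print(size, gc_behaviour(size))
```

          Raising thresh3 alone widens the third band, which is why a larger gc_thresh3 gives garbage
          collection more time to run before the hard maximum is reached.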

          Note: to set the table size we need to use the equivalent of "ip ntable", e.g. "ip ntable change
          name arp_cache dev eth0 thresh3 1024" (this is done directly in bpftune via netlink).
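
          For experimenting by hand, the equivalent "ip ntable" invocation can be assembled as in the
          sketch below. bpftune itself does not shell out to ip(8); it makes the change via netlink. The
          helper name is hypothetical, and the sketch only builds and prints the command rather than
          running it, since applying it requires root privileges.

```python
import shlex


def ip_ntable_thresh3_cmd(table, dev, thresh3):
    """Build the argv for a per-table, per-device thresh3 change.

    Hypothetical helper mirroring the "ip ntable change" example above.
    """
    return ["ip", "ntable", "change", "name", table,
            "dev", dev, "thresh3", str(thresh3)]


cmd = ip_ntable_thresh3_cmd("arp_cache", "eth0", 1280)
print(shlex.join(cmd))
```

          The list form can be passed to subprocess.run() unchanged if the change should actually be
          applied.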

          Contrast this approach with simply choosing a large net.ipv4.neigh.default.gc_thresh3. If thresh2
          and thresh3 are far apart, we may over-garbage collect, whereas if they are close we may end up
          keeping around too many entries. In either case, we're mistuned because we've had to choose
          coarse-grained defaults rather than adapting on a per-table basis as the need arises.

SEE ALSO

          bpf(2), bpftune(8)

                                                                                                BPFTUNE-NEIGH(8)