Provided by: asncounter_0.5.0_all 

NAME
asncounter — collect hits per ASN and netblock
DESCRIPTION
Count the number of hits (HTTP requests, packets, etc.) per autonomous system number (ASN) and related network blocks. This is useful when a server gets a lot of traffic and you need to figure out which network is responsible, to direct abuse complaints or block whole networks, or on core routers to figure out who your peers are and with whom you might want to seek particular peering agreements.
SYNOPSIS
asncounter [OPTIONS] [ADDRESS ...]
OPTIONS
-h, --help
       show this help message and exit
--cache-directory, -C CACHE_DIRECTORY
       where to store pyasn cache files, default: ~/.cache/pyasn
--no-prefixes
       disable prefix count
--no-asn
       disable ASN count
--no-resolve-asn
       disable ASN to name resolution in output
--top, -t N
       only show top N entries, default: 10
--input, -i INPUT
       input file, default: stdin
--input-format, -I {line,tuple,tcpdump,scapy}
       input format, default: line
--scapy-filter SCAPY_FILTER
       BPF filter to apply to incoming packets, default: ip and not src host 0.0.0.0 and not src net 192.168.0.0/24
--interface [INTERFACE]
       open an interface instead of stdin for packets, implies -I scapy, auto-detects by default
--output, -o OUTPUT
       write stats or final prometheus metrics to the given file, default: stdout
--output-format, -O {tsv,prometheus,null}
       output format, choices: tsv, prometheus, null, default: tsv
--port, -p [PORT]
       start a prometheus server on the given port, default disabled, port 8999 if unspecified
--refresh, -R
       download a recent RIB cache file and exit
--repl
       run a REPL thread in the main loop
--manhole
       set up a REPL socket with manhole
--debug
       more debugging output
ADDRESS
       zero or more IP addresses to parse directly from the command line, before the input stream is read; this disables the default stdin reading, and --input-format cannot be changed
INPUT FORMATS
The --input-format option warrants more discussion.

line
       The line input format treats each line in the stream as an IP address, counting each as one hit. Empty lines are skipped, and comments (whatever follows the pound sign, #) are trimmed. Anything that cannot be parsed as an IP address is logged as a warning and skipped. This, for example, counts as a hit on two different IP addresses, for a total of two hits, and also yields a warning:

           192.0.2.1 # comment
           2001:DB8::1 # comment
           garbage that generates a warning

tuple
       Same as the line input format, except the count is specified in a second, whitespace-separated field. This, for example, counts one hit for the first IP address and two for the second, and generates a warning:

           192.0.2.1 1 # comment
           2001:DB8::1 2 # comment
           garbage that generates a warning

       The "count" field can represent anything: counts, but also sizes or timings; asncounter doesn't care. Counts are actually parsed as floats, as Python understands them. The default output format (tsv) rounds the numbers to integers, rounding halves to the nearest even integer. This, for example, adds up to 5, which might be surprising to some (because Python rounds 2.5 to 2, not 3):

           192.0.2.2 3.4
           192.0.2.2 2.5

       This is known as the "round half to even" rule, the default rounding mode in the IEEE 754 standard. If --output-format is set to prometheus, floats are recorded as accurately as Python allows. In that case, the above correctly sums up to 5.9.

tcpdump
       The tcpdump format is a bit of an oddball: it parses a tcpdump(1) line with a regular expression to extract the source IP address, and counts that. It could be extended to count packet sizes, but currently does not do so. Likewise, it only tracks the left (source) side of packets, not the destination, but could be extended to track both. This approach likely can't deal with a multi-gigabit per second small-packet attack (2 million packets per second or more).
       But in a real production environment, it could easily deal with regular 100-200 megabit per second traffic, where tcpdump and asncounter each took about 2% of one core to handle about 3-5 thousand packets per second.

scapy
       The scapy input format is also special: instead of parsing text lines, it parses packets. With the --interface flag, it opens the default interface unless one is provided (e.g. --interface is generally equivalent to --interface eth0 if eth0 is the primary interface). This requires elevated privileges. It is much slower than the tcpdump parser (close to 100% CPU usage) in a 100-200 mbps scenario like the above, but could eventually be leveraged to implement byte counts, which are harder to extract from tcpdump because of the variability of its output. It only counts packets, regardless of direction, and, like tcpdump, only keeps track of source IP addresses. Like tcpdump, it could also be improved by tracking sizes instead of counts, but does not currently do so.
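The tcpdump line parsing described above can be illustrated with a short sketch. This is a hypothetical approximation, not asncounter's actual regular expression; the LINE_RE pattern and source_ip() helper are invented for illustration:

```python
import re

# Hypothetical sketch of extracting the source address from a
# `tcpdump -n -q` line; this is NOT asncounter's actual regex.
# The source field follows "IP" or "IP6" and precedes " > ", with
# the port appended to the address after a final dot.
LINE_RE = re.compile(r"\bIP6? (\S+) >")

def source_ip(line):
    match = LINE_RE.search(line)
    if not match:
        return None
    host_port = match.group(1)
    # strip the ".port" suffix tcpdump appends to the address
    host, _, _port = host_port.rpartition(".")
    return host or host_port

print(source_ip("12:34:56.789 IP 192.0.2.1.54321 > 198.51.100.7.443: tcp 60"))
# 192.0.2.1
```

Counting is then just a matter of feeding each extracted address into a counter, as the line format does.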
OUTPUT FORMATS
The --output-format argument also warrants a little more discussion.

tsv
       TSV stands for Tab-Separated Values. It's a poorly designed output format that dumps two tables, where rows are separated by newlines and columns by tabs. One table shows per-ASN counts, the other per-prefix counts. As mentioned in the tuple section above, counts are rounded when recorded in tsv mode. This is to simplify the display; in theory, the underlying Counter (https://docs.python.org/3/library/collections.html#collections.Counter) supports floats as well. If more precision, long-term storage, or alerting is needed, the prometheus output format is preferred. This format is useful because it doesn't require any dependency outside of the standard library (other than, obviously, pyasn).

prometheus
       The prometheus output format keeps track of counters inside Prometheus data structures. With the --port flag, it opens a port (defaulting to 8999) where metrics are exposed over HTTP, without any special security, on all interfaces. Otherwise, upon completion, results are written in a textfile collector-compatible format.

null
       The null output format doesn't display anything. It can be used for debugging, but internally uses the same recorder as the tsv format.
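The stdlib-only tallying behind the tsv format can be sketched with collections.Counter. This is an illustrative approximation, not asncounter's actual recorder code; the variable names are invented:

```python
from collections import Counter

# Illustrative sketch (not asncounter's actual recorder): tally float
# counts per ASN, then print tsv-style rows with rounded totals.
asn_counter = Counter()
for asn, count in [(66496, 3.4), (66496, 2.5), (66497, 1.0)]:
    asn_counter[asn] += count  # Counter values may be floats

total = sum(asn_counter.values())
for asn, count in asn_counter.most_common():
    # tab-separated: rounded count, percentage, ASN; round() rounds
    # halves to even, as the tuple section explains
    print(f"{round(count)}\t{100 * count / total:.2f}\t{asn}")
```

Note that the percentages are computed from the unrounded floats; only the displayed count is rounded.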
EXAMPLES
Simple web log counter

This extracts the IP addresses from current access logs and reports ratios:

    > awk '{print $2}' /var/log/apache2/*access*.log | asncounter
    INFO: using datfile ipasn_20250527.1600.dat.gz
    INFO: collecting addresses from <stdin>
    INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
    INFO: finished reading data
    INFO: loading /home/anarcat/.cache/pyasn/asnames.json
    count   percent ASN     AS
    12779   69.33   66496   SAMPLE, CA
    3361    18.23   None    None
    366     1.99    66497   EXAMPLE, FR
    337     1.83    16276   OVH, FR
    321     1.74    8075    MICROSOFT-CORP-MSN-AS-BLOCK, US
    309     1.68    14061   DIGITALOCEAN-ASN, US
    128     0.69    16509   AMAZON-02, US
    77      0.42    48090   DMZHOST, GB
    56      0.3     136907  HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
    53      0.29    17621   CNCGROUP-SH China Unicom Shanghai network, CN
    total: 18433
    count   percent prefix  ASN     AS
    12779   69.33   192.0.2.0/24    66496   SAMPLE, CA
    3361    18.23   None
    298     1.62    178.128.208.0/20        14061   DIGITALOCEAN-ASN, US
    289     1.57    51.222.0.0/16   16276   OVH, FR
    272     1.48    2001:DB8::/48   66497   EXAMPLE, FR
    235     1.27    172.160.0.0/11  8075    MICROSOFT-CORP-MSN-AS-BLOCK, US
    94      0.51    2001:DB8:1::/48 66497   EXAMPLE, FR
    72      0.39    47.128.0.0/14   16509   AMAZON-02, US
    69      0.37    93.123.109.0/24 48090   DMZHOST, GB
    53      0.29    27.115.124.0/24 17621   CNCGROUP-SH China Unicom Shanghai network, CN

This can also be done in real time, of course:

    tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter

The above report will be generated when the process is killed. Send SIGHUP to show a report without interrupting the parser:

    pkill -HUP asncounter

You can count sizes with --input-format=tuple as well. Assuming the size field is in the 10th column, this sums sizes instead of just counting hits:

    tail -F /var/log/apache2/*access*.log | awk '{print $1, $10}' | asncounter --input-format=tuple

If logs hold that information, you can also add up processing times, for example.
tcpdump parser

Extract IP addresses from incoming TCP/UDP packets on eth0 and report the top 5:

    > tcpdump -c 10000 -q -i eth0 -n -Q in "(udp or tcp)" | asncounter --top 5 --input-format tcpdump
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
    INFO: collecting IPs from stdin, using datfile ipasn_20250523.1600.dat.gz
    INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...
    INFO: loading /root/.cache/pyasn/asnames.json
    ASN     count   AS
    136907  7811    HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
    8075    254     MICROSOFT-CORP-MSN-AS-BLOCK, US
    62744   164     QUINTEX, US
    24940   114     HETZNER-AS, DE
    14618   82      AMAZON-AES, US
    prefix  count
    166.108.192.0/20        1294
    188.239.32.0/20 1056
    166.108.224.0/20        970
    111.119.192.0/20        951
    124.243.128.0/18        667

A query similar to the HTTP log parser might be:

    tcpdump -q -i eth0 -n -Q in "tcp and (port 80 or port 443)" | grep 'Flags \[S\]' | asncounter --input-format=tcpdump --repl

... otherwise you will get different results from a pure packet count, as various connections yield different numbers of packets. The above counts connection attempts, which is still different from an actual HTTP hit, as the connection could be refused before it reaches the web server, or aborted before it gets logged properly. It's still a good estimate, and is especially useful if you do not log IP addresses, for example on high-traffic caching servers. Note that we use grep above because tcpdump's tcp[tcpflags] & tcp-syn != 0 only works for IPv4 packets, a disappointing (but understandable) limitation.
scapy parser

Extract IP addresses directly from the network interface, bypassing tcpdump entirely:

    asncounter --interface

REPL

With --repl, you will drop into a Python shell where you can interactively get real-time statistics:

    > awk '{print $2}' /var/log/apache2/*access*.log | asncounter --repl --top 2
    INFO: using datfile ipasn_20250527.1600.dat.gz
    INFO: collecting addresses from <stdin>
    INFO: starting interactive console, use recorder.display_results() to show current results
    INFO: recorder.asn_counter and .prefix_counter dictionaries have the full data
    Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    (InteractiveConsole)
    >>> INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
    INFO: finished reading data
    >>> recorder.display_results()
    INFO: loading /home/anarcat/.cache/pyasn/asnames.json
    count   percent ASN     AS
    13008   69.38   66496   SAMPLE, CA
    3422    18.25   None    None
    total: 18748
    count   percent prefix  ASN     AS
    13008   69.38   192.0.2.0/24    66496   SAMPLE, CA
    3422    18.25   None
    total: 18748
    >>> recorder.asn_counter
    Counter({66496: 13008, None: 3422, [...]})
    >>> recorder.prefix_counter
    Counter({'192.0.2.0/24': 13008, None: 3422, [...]})

So you can get the actual number of hits for an AS, even if it's not listed in the --top entries, with:

    >>> recorder.asn_counter.get(66496)
    13008

Blocking whole networks

asncounter does not block anything: it only counts. Another mechanism needs to be used to actually block attackers or act on the collected data. If you want to block networks, you can use the shown netblocks directly in (say) Linux's netfilter firewall, or Nginx's access or geo modules.
For example, this will reject traffic from a network with iptables:

    iptables -I INPUT -s 192.0.2.0/24 -j REJECT

or with nftables:

    nft insert rule inet filter INPUT 'ip saddr 192.0.2.0/24 reject'

This will likely become impractical with large numbers of networks; look into IP sets to scale that up. With Nginx, you can block a network with the deny directive:

    deny 192.0.2.0/24;

This will return a 403 status code. If you want to be fancier, you can return a tailored status code and build a larger list with the geo module:

    geo $geo_map_deny {
        default 0;
        192.0.2.0/24 1;
    }
    if ($geo_map_deny) {
        return 429;
    }

Many networks can be listed in the geo block relatively effectively. pyasn doesn't (unfortunately) provide an easy command line interface to extract the data you need to block an entire AS. For that, you need to resort to some Python. From inside the --repl loop:

    print("\n".join(sorted(recorder.asn_all_prefixes(64496))))

This will give you the list of ALL prefixes associated with AS64496, which is actually empty in this case, as AS64496 is an example AS from RFC5398. Note that the list of prefixes is not aggregated by default. If netaddr is installed, you can pass aggregate=True to reduce the set.

Aggregating results

It might be worth aggregating large numbers of netblocks for performance reasons. Network block announcements can be spread over multiple contiguous blocks for various reasons, and can often be unified into smaller sets.
For IPv4-only, iprange is good (and fast) enough:

    > grep -v :: networks > networks-ipv4
    > iprange < networks-ipv4 > networks-ipv4-filtered
    > wc -l networks*
    588 networks
    495 networks-ipv4
    181 networks-ipv4-filtered

If you have it installed, the netaddr Python package can also do that for you, and it supports IPv6:

    import netaddr
    print("\n".join([str(n) for n in netaddr.cidr_merge(recorder.asndb.get_as_prefixes(64496))]))

Note that asncounter can aggregate those results directly now, for example:

    print(recorder.asn_all_prefixes_str(66496, aggregate=True))

... but, as above, it requires the netaddr package to be available.

Selective blocking

A more delicate approach is to block only the network blocks from a specific ASN that have been found in the result sets, instead of blocking the entire netblock. The recorder.asn_prefixes and recorder.asn_prefixes_str functions can do this for you, merging multiple ASNs and aggregating with netaddr as well:

    print(recorder.asn_prefixes_str(66496, 66497, aggregate=True))

Note that the asn_prefixes selectors are not implemented in Prometheus mode. Remember that you can also extract the list of current ASNs and prefixes just by looking at the dictionary keys:

    print("\n".join(recorder.asn_counter.keys()))
    print("\n".join(recorder.prefix_counter.keys()))
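If netaddr is not available, the standard library's ipaddress module can also merge contiguous networks, one address family at a time. A minimal sketch, with made-up example prefixes:

```python
import ipaddress

# Merge contiguous/overlapping networks with the standard library.
# collapse_addresses() only accepts one address family per call.
nets = [ipaddress.ip_network(n) for n in
        ("192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24")]
merged = [str(n) for n in ipaddress.collapse_addresses(nets)]
print(merged)  # ['192.0.2.0/24', '198.51.100.0/24']
```

Unlike netaddr's cidr_merge, mixing IPv4 and IPv6 in one call raises a TypeError, so the two families must be collapsed separately.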
FILES
~/.cache/pyasn/
       Default storage location for pyasn cache files.

/run/$UID/asncounter-manhole-$PID or ~/.local/state/asncounter-manhole-$PID
       Default location for the debugging manhole socket, if enabled.
LIMITATIONS
• only counts; does not calculate bandwidth, but could be extended to do so
• does not actually do any sort of mitigation or blocking; it is purely an analysis tool. If you want such mitigation, hook up asncounter to Prometheus and Alertmanager with web hooks; this is not a fail2ban rewrite
• test coverage is relatively low, 37% as of this writing; most critical paths are covered, although not the scapy parser or the RIB file download procedures
• requires downloading RIB files; could be improved by talking directly with a BGP router daemon like Bird or FRR
• only a small set of tcpdump outputs have been tested
• the REPL shell does not have proper readline support (keyboard arrows and control characters like "control a" do not work)

Note that this documentation and the test code use sample AS numbers from RFC5398, IPv4 addresses from RFC5737, and IPv6 addresses from RFC3849. Some more well-known entities (e.g. Amazon, Facebook) have not been redacted from the output, for clarity.

Performance considerations

As mentioned above, this is unlikely to tolerate multi-gigabit denial of service attacks. The tcpdump parser, however, is pretty fast and should be able to sustain a saturated gigabit link under normal conditions. The scapy parser is slower. Memory usage seems reasonable: on startup, asncounter uses about 250MB of memory, and a long-running process tracking about 40,000 blocks was using about 400MB. By extrapolation, data on the full routing table (currently 1.2 million entries) could be expected to fit within 12GB of memory, although that would be a rare condition, only occurring on a core router seeing traffic from literally the entire internet.

Security considerations

There's an unknown in the form of the C implementation of a Radix tree in pyasn. asncounter itself should be fairly safe: it does not trust its inputs, and the worst it can do is likely resource exhaustion under high traffic.
It can run completely unprivileged as long as it has access to the input files, although in many scenarios people will not bother to drop privileges before calling it, and it will not, itself, attempt to do so. Privileges can be dropped with systemd-run, for example:

    systemd-run --pipe --property=DynamicUser=yes \
        --property=CacheDirectory=asncounter \
        --setenv=XDG_CACHE_HOME=/var/cache/asncounter \
        -- asncounter

This interacts poorly with the --repl option, as it tries to reopen the tty for stdin. You might have better luck sharing a debug socket with --manhole:

    systemd-run --pipe --property=DynamicUser=yes \
        --property=CacheDirectory=asncounter \
        --setenv=XDG_CACHE_HOME=/var/cache/asncounter \
        -- asncounter --manhole=/var/cache/asncounter/asncounter-manhole

Then you can open a Python debugging shell for further diagnostics with:

    nc -U /var/cache/asncounter/asncounter-manhole
AUTHOR
Antoine Beaupré <anarcat@debian.org>
SEE ALSO
tcpdump(8), fail2ban(1)