Tuesday, April 23, 2024

Getting fail2ban Working [with my weird choices] on Ubuntu 22.04 (jammy)

To put the tl;dr up front:

  1. The systemd service name may not be correct
  2. The service needs to be logging enough information for fail2ban to process
  3. Unrelatedly, Apple Mail on iPhone is really bad at logging into Dovecot
  4. Extended Research

[2024-04-26: Putting the backend in the DEFAULT section may not actually work on all distributions.  One may need to copy it into each individual jail (sshd, postfix, etc.) for it to take effect.]

A minimalist /etc/fail2ban/jail.local for a few services, based on mine:

[DEFAULT]
backend = systemd
[sshd]
enabled = true
journalmatch = _SYSTEMD_UNIT=ssh.service + _COMM=sshd
[postfix]
enabled = true
journalmatch = _SYSTEMD_UNIT=postfix@-.service
[pure-ftpd]
enabled = true
journalmatch = _SYSTEMD_UNIT=pure-ftpd.service

(The journalmatch for pure-ftpd removes the command/_COMM field entirely.)

Using systemd

I decided not to have redundant local logs on my VPS, so I got rid of rsyslogd completely.  Instead, I installed python3-systemd (unless it was already installed), and changed the backend of fail2ban to systemd.

Because of this, I barked up a lot of wrong trees, trying to understand if the message matching was broken by using systemd.  It’s not.  fail2ban formats a journal entry as “hostname command[pid]: content”, and then uses the prefix regex to skip the non-content stuff.
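
To make that formatting step concrete, here is a minimal Python sketch.  It is not fail2ban's actual code: the choice of journal fields (SYSLOG_IDENTIFIER with _COMM as a fallback), the simplified PREFIX_RE, and the sample entry are all assumptions for illustration.

```python
import re

def format_entry(entry):
    """Render a journal entry dict as 'hostname command[pid]: content'."""
    # Assumption: prefer SYSLOG_IDENTIFIER and fall back to _COMM.
    command = entry.get("SYSLOG_IDENTIFIER") or entry.get("_COMM", "")
    return "{} {}[{}]: {}".format(
        entry.get("_HOSTNAME", ""),
        command,
        entry.get("_PID", ""),
        entry.get("MESSAGE", ""),
    )

# Simplified stand-in for a filter's prefix regex (hypothetical, not the real one):
# it skips "hostname command[pid]: " and captures the rest as the content.
PREFIX_RE = re.compile(r"^\S+ (?P<command>[^\[\s]+)\[(?P<pid>\d+)\]: (?P<content>.*)$")

def strip_prefix(line):
    """Return only the content part of a formatted line, or None if it does not fit."""
    match = PREFIX_RE.match(line)
    return match.group("content") if match else None

# Made-up sample entry using a documentation IP address.
sample = {
    "_HOSTNAME": "vps",
    "SYSLOG_IDENTIFIER": "sshd",
    "_PID": "1234",
    "MESSAGE": "Failed password for root from 203.0.113.5 port 51234 ssh2",
}

line = format_entry(sample)
print(line)
print(strip_prefix(line))
```

The point is only that the failure regexes see the message content after the prefix has been skipped, whether the line came from a log file or from the journal.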

(Normally, I would not have considered this possibility very hard.  The distribution should have the correct configuration… right?  But for the service name, it definitely didn’t, so I felt I had to double-check everything related to processing the systemd journal.)

The service name

The default configuration for sshd in /etc/fail2ban/filter.d/sshd.conf had the following line:

journalmatch = _SYSTEMD_UNIT=sshd.service + _COMM=sshd

But, that service name is an alias to ssh.service, which is apparently the real name that appears in the journal.  In fact, if you ask journalctl to filter by that unit name, it doesn’t find anything:

$ journalctl -u sshd.service
-- No entries --

Weird, huh?  I managed to figure it out by looking at the JSON output from the past few minutes (there were a lot of bots showing up):

$ journalctl --since -5m --output=json-pretty
{
    "_SYSTEMD_UNIT" : "ssh.service",
    ...
}
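
Reading JSON by eye gets tedious with many entries.  A small sketch like the following could tally which unit names actually appear.  It assumes the non-pretty --output=json format, which prints one JSON object per line; the function name unit_counts and the sample lines are made up for illustration.

```python
import json
from collections import Counter

def unit_counts(lines):
    """Count journal entries per _SYSTEMD_UNIT from `journalctl --output=json` lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        # Entries without a unit (kernel messages, for example) are grouped together.
        counts[entry.get("_SYSTEMD_UNIT", "(no unit)")] += 1
    return counts

# Made-up sample entries, one JSON object per line.
sample_lines = [
    '{"_SYSTEMD_UNIT": "ssh.service", "_COMM": "sshd", "MESSAGE": "example 1"}',
    '{"_SYSTEMD_UNIT": "ssh.service", "_COMM": "sshd", "MESSAGE": "example 2"}',
    '{"_SYSTEMD_UNIT": "postfix@-.service", "_COMM": "smtpd", "MESSAGE": "example 3"}',
]

for unit, count in unit_counts(sample_lines).most_common():
    print(unit, count)
```

To run it against a real journal, one could pass sys.stdin to unit_counts and pipe in the output of journalctl --since -5m --output=json.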

Similarly, I dug around and found some problems with the other services in my jail.local above.  The postfix service is not just postfix.service, and in my pure-ftpd configuration, the command is not simply pure-ftpd.

The error logging

For some reason, fail2ban wasn’t processing bans for sshd.

I could see lines being logged from PAM like:

pam_unix(sshd:auth): authentication failure; ... rhost=IP.ADDR.HERE

This matched a regex, which confused me endlessly, until I understood that the regex has <F-NOFAIL> in it.  A match like that does not count as a failure on its own; it only adds context to whatever matches on another line.

A long time ago, I had been offended by the amount of logging that sshd did for failed connections, when I wasn’t going to do anything about them.  I turned down the verbosity by configuring LogLevel ERROR for it instead.

Now, that boomerang came back and hit me.  Without all the normal logging, not enough lines are produced for fail2ban to successfully detect failures and process a ban!  There was no other line to match.

I reverted the LogLevel, and things started working.
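
As a rough illustration of why the quieter log level produced no bans, here is a toy model in Python.  It is not how fail2ban is implemented: the two regexes are simplified stand-ins for filter rules, and the sample messages are made up.  It only shows the idea that a "nofail" match records context without counting as a failure.

```python
import re

# Simplified stand-ins for filter rules; these are not fail2ban's real patterns.
# A "nofail" rule only records context (here, the remote host) for a process.
NOFAIL_RE = re.compile(
    r"pam_unix\(sshd:auth\): authentication failure; .*rhost=(?P<host>\S+)"
)
# A "failure" rule is what actually counts toward a ban.
FAILURE_RE = re.compile(r"Failed password for .* from (?P<host>\S+) port \d+")

def count_failures(entries):
    """entries: iterable of (pid, message).  Returns {host: counted failures}."""
    context = {}   # pid -> host remembered from a nofail line
    failures = {}  # host -> number of counted failures
    for pid, message in entries:
        nofail = NOFAIL_RE.search(message)
        if nofail:
            context[pid] = nofail.group("host")  # remembered, but never counted
            continue
        failure = FAILURE_RE.search(message)
        if failure:
            host = context.get(pid, failure.group("host"))
            failures[host] = failures.get(host, 0) + 1
    return failures

# Made-up messages: what might be logged at the default level...
verbose = [
    ("1234", "pam_unix(sshd:auth): authentication failure; logname= uid=0 "
             "euid=0 tty=ssh ruser= rhost=203.0.113.5"),
    ("1234", "Failed password for root from 203.0.113.5 port 51234 ssh2"),
]
# ...and with the sshd message suppressed, leaving only the PAM line.
quiet = verbose[:1]

print(count_failures(verbose))
print(count_failures(quiet))
```

With only the PAM line present, the toy counter never registers a failure, which mirrors what I saw with LogLevel ERROR.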

Apple Mail

Previously, I had seen four-connection bursts to Dovecot from Apple Mail, so I had the maxretry at 10 to allow for some slop.  While debugging everything else, I noticed there was still a ban placed on Dovecot for my IP.

I don’t really know why Mail needs more than two connections, or why it needs to fail once before succeeding on every single connection that it makes.

I’d get someone else’s mail app, but then they’d be doing analytics and AI training on it.  The whole point of not sending my email to Google is to keep it somewhat more private.

Extended Research

[This section was added to the post, and also updated, on 2024-04-26.]

In theory, the main [sshd] jail definition sets the backend to “the default backend”, which, according to the fail2ban team, is not the backend set in the [DEFAULT] section.

In my setup, it appears to be affected only by this set of definitions:

default_backend = %(default/backend)s
sshd_backend = %(default_backend)s

(Lines 10 and 34, respectively, of my /etc/fail2ban/paths-common.conf file.)

My jail.conf includes paths-debian.conf, which does not do anything with the backends.  There is no paths-override.local to come after it, so this should be everything.

What does it mean?  As far as I can tell from the code in the package on the system, fail2ban has extended the Python ConfigParser syntax to recognize interpolations of the form section/value.  ConfigParser still lower-cases everything internally; thus, %(default/backend)s ends up being the backend from my [DEFAULT] section, making these the rules that “make it all work” for me.
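
The lookup chain can be imitated with the standard library.  The following is a sketch of the idea only, not fail2ban's implementation: REF_RE, resolve, and the sample configuration are assumptions for illustration, and the "default" section name is matched case-insensitively as a simplification.

```python
import configparser
import re

# Matches %(option)s or %(section/option)s.
REF_RE = re.compile(r"%\((?:(?P<section>[\w-]+)/)?(?P<option>[\w-]+)\)s")

def resolve(cfg, section, option, max_rounds=10):
    """Return the value of section/option with both reference forms expanded."""
    def lookup(match):
        ref_section = match.group("section")
        ref_option = match.group("option").lower()  # option names are stored lower-cased
        if ref_section is None:
            return cfg.get(section, ref_option)     # falls back to [DEFAULT]
        if ref_section.lower() == "default":
            return cfg.defaults()[ref_option]
        return cfg.get(ref_section, ref_option)

    value = cfg.get(section, option)
    for _ in range(max_rounds):
        if not REF_RE.search(value):
            break
        value = REF_RE.sub(lookup, value)
    if REF_RE.search(value):
        raise ValueError("references nested too deeply")
    return value

# Made-up configuration mirroring the two definitions quoted above.
# interpolation=None disables the built-in expansion so resolve() does all the work.
cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string("""
[DEFAULT]
backend = systemd
default_backend = %(default/backend)s
sshd_backend = %(default_backend)s

[sshd]
enabled = true
""")

print(resolve(cfg, "sshd", "sshd_backend"))
```

Here sshd_backend expands to default_backend, which in turn expands to the backend option of [DEFAULT].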

If I didn’t set the backend, then the code would ultimately fall back to reading log files, I think.  systemd is not the default source.

The scenario is pretty much the same in git; jail.conf doesn’t include the paths-debian.conf file, instead noting it should be customized by the distribution, but otherwise, the general structure is the same.  (They have added sshd_backend = systemd to paths-debian.conf.)

Regardless of all this theory…

I can verify experimentally that it works: fail2ban-client banned shows IPs being banned, and the rules performing them are visible in iptables -vn -L.

I have also verified that having the [DEFAULT] rule doesn’t actually break any of my other jails.  It is systemd which listens for syslog messages, and forwards those (in addition to the journal) to the actual syslog daemon.  In other words, rsyslogd receives from the journal, and the journal is the authoritative source.  I stopped rsyslog.service and fail2ban is still able to act on mail logs.
