Bespoke solution to monitor power outages at home


When I came home from a five-day family trip this summer, I immediately realized the power was off in our flat. The main switch in the electricity panel had tripped, together with one other switch. Everything appeared to have happened a few days before we arrived, so a few things in the fridge were ruined and most of the freezer contents had to be discarded. We do have relatives living close by with an emergency set of keys but, as we were completely unaware of the events, we couldn’t ask them to go check.

I thought about what had happened and decided I wanted to set something up so I would get warned if power fails while I’m away. My first thought was to use something available off the shelf, but I failed to find anything cheap and easy. Fortunately, I already had a couple of things that could help: a small cloud server (the one that hosts this blog) and a permanently-connected RPi4 that I use as a Pi-Hole at home. To be warned of a power failure, I wanted the RPi to ping (somehow) the cloud server from time to time and, on the cloud server, to periodically check if a recent ping has been received. If too much time goes by without a ping from home, we can assume something’s wrong: either a power outage or an Internet service outage.

The implementation would need the following things:

  1. The cloud server had to be able to send me an email.

  2. The cloud server could have a CGI script that, when accessed, would write a timestamp somewhere.

  3. The RPi would access that CGI script once every minute, for example.

  4. The cloud server would have something to check timestamps periodically, then email me if it’s been too long without a ping.
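At its core, step 4 is a dead man’s switch: alert once when pings go stale, and go back to normal when they recover. A minimal in-memory sketch of that decision logic (the names are mine; the real checker script appears later):

```python
MAX_DIFF = 1000  # seconds without a ping before assuming an outage

def should_alert(now, last_ping, already_alerted):
    # Fire exactly once when pings go stale; the caller flips
    # already_alerted to True after sending the email.
    stale = (now - last_ping) > MAX_DIFF
    return stale and not already_alerted

print(should_alert(2000, 500, False))   # 1500s of silence: alert
print(should_alert(2000, 1500, False))  # recent ping: no alert
print(should_alert(2000, 500, True))    # already alerted: stay quiet
```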

The difficulty is that I’m not a web developer. In addition, I’m using nginx on the cloud server, and nginx doesn’t support CGI scripts, which complicates things a bit. However, I made all of this work and wanted to share my scripts in case someone finds them useful.

Sending emails from the server

This one is easy because I was already using something similar to monitor disks on a few computers with smartd. When smartd detects a disk may be about to fail, it can be told to email root, and $HOME/.forward can redirect that email to a script. The script, as in this case, can use msmtp, a nice program that lets you send emails from the command line through an SMTP server. Thanks to Fastmail, I generated a new set of credentials for SMTP access, installed msmtp on the cloud server and created a config file for it at /etc/msmtprc. Note that running msmtp --version will report the right system configuration file name. The configuration file looks like this:

account default
host SERVER
port PORT
auth on
user USERNAME
password PASSWORD
tls on
tls_certcheck on
tls_starttls off
tls_trust_file /etc/ssl/certs/ca-bundle.crt
syslog on
timeout 30

In my case, SERVER is smtp.fastmail.com, PORT is 465 and USERNAME and PASSWORD are the ones I created. The TLS trust file has that path in Fedora, but it may be different on other distributions.

With that configuration all set, I created the following script as /usr/local/bin/pingmonitor-mail:

#!/usr/bin/env bash
FROM=YOUR_EMAIL_ADDRESS
TO=YOUR_EMAIL_ADDRESS
DATE="$( TZ=Z date -R )"
SUBJECT="$1"
BODY="$2"

msmtp -f "$FROM" "$TO" <<EOF
From: $FROM
To: $TO
Date: $DATE
Subject: $SUBJECT

$BODY
EOF

It expects the subject of the email as the first argument and typically a sentence for the body as the second argument. I ran it a few times from the command line and verified it worked perfectly.
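For reference, the heredoc expands into a complete email message: headers first, then a blank line, then the body. This standalone sketch (with placeholder addresses) renders the same message without sending anything:

```shell
#!/usr/bin/env bash
# Build the same message the mail script pipes into msmtp, but only
# print it. The addresses are placeholders, not real ones.
FROM=me@example.com
TO=me@example.com
DATE="$( TZ=Z date -R )"
SUBJECT="[pingmonitor] Test"
BODY="If you can read this, the message renders correctly."

MESSAGE="$( cat <<EOF
From: $FROM
To: $TO
Date: $DATE
Subject: $SUBJECT

$BODY
EOF
)"
printf '%s\n' "$MESSAGE"
```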

CGI script to record ping timestamps

As mentioned before, nginx does not support CGI. It only supports FastCGI, so this is slightly more complicated than expected. After a few tries, I settled on using /var/run/pingmonitor as the main directory containing the FastCGI socket (more on that later) and /var/run/pingmonitor/pings for the actual pings.

I thought a bit about how to record the ping timestamps. My initial idea was to save them to a file, but then I started overthinking it. If I used a file to store the timestamps (either appending to it or overwriting its contents), I wanted to make sure the checker would always read a full timestamp and would never get partial file contents. If the CGI script wrote the timestamp to the file, it would need to lock it somehow in the improbable case that the checker was reading the file at the same time. To avoid that complication, I decided to let the file system handle it for me: /var/run/pingmonitor/pings would be a directory instead. When the CGI script runs, it creates a new empty file in that directory, with the timestamp as the file name. The checker lists the files in the directory, converts their names to timestamps and checks the most recent one. I think that works because, when you list the directory contents, each file either exists or it does not, so it’s atomic. If you know it’s not atomic, please leave a comment or email me with a reference.
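The scheme can be sketched on its own, outside the CGI machinery. This uses a temporary directory instead of /var/run/pingmonitor/pings, but the writer and reader sides work the same way as in the real scripts:

```python
import os
import tempfile

# Stand-in for /var/run/pingmonitor/pings.
pings_dir = tempfile.mkdtemp()

def write_ping(epoch_seconds):
    # Writer side: create one empty file per ping, with the zero-padded
    # timestamp as its name.
    name = '%016d' % epoch_seconds
    open(os.path.join(pings_dir, name), 'w').close()

def newest_ping():
    # Reader side: a file either appears in the listing or it does not,
    # so the reader can never see a half-written timestamp.
    names = sorted(os.listdir(pings_dir))
    return int(names[-1], base=10) if names else None

write_ping(1700000000)
write_ping(1700000060)
print(newest_ping())  # the later of the two pings
```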

For the FastCGI script itself, I installed the fastcgi Python module using pip. This allowed me to create a script that easily provides a FastCGI process that launches before nginx, runs as the nginx user and creates the timestamp files when called. Take a look below:

#!/usr/bin/env python
import os
import fastcgi
import sys
import pwd
import grp
import time
import pathlib

RUN_DIR = '/var/run/pingmonitor'
PINGS_DIR = os.path.join(RUN_DIR, 'pings')
USER='nginx'
GROUP='nginx'
ONE_SECOND_NS = 1000000000

# Create run and pings directory. Not a problem if they exist.
os.makedirs(RUN_DIR, mode=0o755, exist_ok=True)
os.makedirs(PINGS_DIR, mode=0o755, exist_ok=True)

# Get UID and GID for nginx.
uid = pwd.getpwnam(USER).pw_uid
gid = grp.getgrnam(GROUP).gr_gid

# Make the directories be owned by the nginx user, so it can create the socket
# and ping files.
os.chown(RUN_DIR, uid, gid)
os.chown(PINGS_DIR, uid, gid)

# Switch to the run (base) directory to create the socket there.
os.chdir(RUN_DIR)

# Become the nginx user.
os.setgid(gid)
os.setuid(uid)

@fastcgi.fastcgi()
def pingmonitor():
    timestamp = time.time_ns() // ONE_SECOND_NS
    filename = '%016d' % (timestamp,)
    path = os.path.join(PINGS_DIR, filename)
    pathlib.Path(path).touch()
    sys.stdout.write('Content-type: text/plain\n\n')
    sys.stdout.write('OK\n')

Apart from the directory creation and user switching logic at the beginning, the interesting part is the pingmonitor function. It obtains the epoch in nanoseconds and converts it to seconds. The file name is a zero-padded version of that number, which is then “touched”, and a reply is served to the HTTP client.

Not pictured is that, by decorating the function with @fastcgi.fastcgi(), a socket named fcgi.sock is created in the current directory (/var/run/pingmonitor). That is the FastCGI socket nginx will use to redirect requests to the FastCGI process. Also, if you run that file as a script, the decorator creates a main loop for you.

I saved the script to /usr/local/bin/pingmonitor.cgi and set up a systemd service file to start it. The systemd unit file is called /etc/systemd/system/pingmonitor.service:

[Unit]
Description=FastCGI Ping Monitor Service
After=network.target

[Service]
Type=simple
Restart=always
RestartSec=1
ExecStart=/usr/local/bin/pingmonitor.cgi

[Install]
WantedBy=nginx.service

To hook it up with nginx, I created a block in its configuration file:

        location /cgi-bin/RANDOM_STRING-pingmonitor.cgi {
            # Document root
            root DOCUMENT_ROOT;
            # Fastcgi socket
            fastcgi_pass unix:/var/run/pingmonitor/fcgi.sock;
            # Fastcgi parameters, include the standard ones
            include /etc/nginx/fastcgi_params;
            # Adjust non standard parameters (SCRIPT_FILENAME)
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }

I used a StackOverflow question as a reference for this.

In the nginx configuration block you can see I’m using RANDOM_STRING, which stands for a long random string, as part of the CGI script URL. This is because I didn’t want the URL to be easily discoverable: its location is basically a secret between my server and my RPi.
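The post doesn’t say how the random string was generated; any long, high-entropy string will do. One possible recipe (my assumption, not necessarily what was used):

```shell
#!/usr/bin/env bash
# Read 32 random bytes and print them as 64 lowercase hex characters,
# suitable as a hard-to-guess URL path component.
SECRET="$( head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' )"
echo "$SECRET"
```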

After setting everything up I accessed the URL with my browser multiple times, confirmed the timestamp files were being created, etc.

Accessing the CGI script periodically

This is the easy part, and it runs on the RPi. I could’ve used a systemd timer but went with a service instead (like the guy pushing all shapes through the same hole), so the main piece is a script that pings the URL once a minute, saved as /usr/local/bin/pingmonitor-pinger.sh.

#!/usr/bin/env bash
while true; do
    sleep 60
    curl --silent --max-time 30 -o /dev/null URL
done

And the corresponding systemd service file called /etc/systemd/system/pingmonitor-pinger.service:

[Unit]
Description=Ping Monitor Pinger Service
After=network.target

[Service]
Type=simple
Restart=always
RestartSec=1
ExecStart=/usr/local/bin/pingmonitor-pinger.sh

[Install]
WantedBy=multi-user.target

Checking timestamps periodically

This part goes on the cloud server again. The script sends a single email when it detects pings are too old (older than 1000 seconds, a more or less reasonable limit chosen arbitrarily), and another one if the pings come back. It’s also in charge of removing old ping files. I could have removed all existing files with each check, but I arbitrarily decided to keep the last 10 in case they were useful for something. To send emails, it uses /usr/local/bin/pingmonitor-mail as described above. I saved it as /usr/local/bin/pingmonitor-checker.py.

#!/usr/bin/env python
import glob
import os
import time
import subprocess
import sys

PINGS_DIR = '/var/run/pingmonitor/pings'
MAIL_PROGRAM = '/usr/local/bin/pingmonitor-mail'
MAX_DIFF = 1000 # Seconds.
SLEEP_TIME = 60 # Seconds.
MAX_FILES = 10
ONE_SECOND_NS = 1000000000

def get_epoch():
    return time.time_ns() // ONE_SECOND_NS

def print_msg(msg):
    print('%s' % (msg,), file=sys.stderr)

os.makedirs(PINGS_DIR, mode=0o755, exist_ok=True)
os.chdir(PINGS_DIR)

start_time = get_epoch()
ping_missing = False

while True:
    now = get_epoch()

    # List of files with a numeric name.
    filenames = glob.glob('0*')

    # Check the last timestamp. If no files exist yet, wait at least from the start
    # of the script.
    if len(filenames) == 0:
        last_timestamp = start_time
    else:
        filenames.sort()
        most_recent = filenames[-1]
        last_timestamp = int(most_recent, base=10)

    current_diff = now - last_timestamp

    # Remove old files.
    if len(filenames) > MAX_FILES:
        kept_files = filenames[-MAX_FILES:]
        for fn in filenames:
            if fn not in kept_files:
                os.remove(fn)

    if current_diff > MAX_DIFF and (not ping_missing):
        ping_missing = True
        subject = '[pingmonitor] No pings for %d seconds' % (MAX_DIFF,)
        body = 'Last timestamp: %s' % (time.ctime(last_timestamp),)
        print_msg('%s; %s' % (subject, body))
        subprocess.run([MAIL_PROGRAM, subject, body])

    elif current_diff < MAX_DIFF and ping_missing:
        ping_missing = False
        subject = '[pingmonitor] Ping recovered'
        body = 'Last timestamp: %s' % (time.ctime(last_timestamp),)
        print_msg('%s; %s' % (subject, body))
        subprocess.run([MAIL_PROGRAM, subject, body])

    time.sleep(SLEEP_TIME)

Again, such a script could be run from a systemd timer, but I decided to write it as a loop and use a service instead, called /etc/systemd/system/pingmonitor-checker.service.

[Unit]
Description=Ping Monitor Checker Service
After=pingmonitor.service
Wants=pingmonitor.service

[Service]
Type=simple
Restart=always
RestartSec=1
ExecStart=/usr/local/bin/pingmonitor-checker.py

[Install]
WantedBy=multi-user.target

Final thoughts

After setting all that up, I checked it works by experimenting with a few timeouts and by stopping and starting the pinger service on the RPi. I’m pretty happy with how things turned out, given that this sits outside my usual domain. Note that the setup cannot tell a power outage from an Internet outage, so if your home Internet connection is unreliable and all you’re interested in are the actual power outages, what I did may not be suitable for you. In my case, Internet outages are very infrequent, so I’m willing to live with a few false positives if that means I won’t waste the contents of my fridge and freezer again.
