Restart a failed service - Marc-Olivier Meunier


Sometimes my MariaDB database crashes. I am not sure why, and I am not sure I want to find out: I am guessing it would require some investigation that may be above my pay grade.

It happens quite rarely, but when it does, my monitoring pings me on Slack and sends emails until it’s fixed.

One really bad option to hide this problem would be to automatically restart the database when it fails. I could run a cron job every minute to check that the database is up.

It’s a very bad option. Pushed to its extreme, I could find myself in a position where the database crashes every few minutes and the cron job restarts it right away, before the monitoring detects anything, so I would never be aware of the problem. Needless to say, a database crashing every few minutes is suboptimal…

But we could still do it for now, as the crashes seem to be really rare.

The easiest option I’ve found is to use systemctl.

Running this in a cron will do the job:

* * * * * systemctl is-failed --quiet mariadb.service && systemctl start mariadb.service
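If the worry is that restarts happen silently, one variation is to have the cron job leave a trace each time it acts. Below is a minimal sketch, not a definitive setup: the function name `restart_if_failed` and the log tag `restart-if-failed` are made up for illustration, and the `SYSTEMCTL` and `LOGGER` variables exist only so the script can be dry-run with stand-in commands.

```shell
#!/bin/sh
# Hypothetical helper: restart a unit only if systemd marks it as failed,
# and write a syslog/journal entry each time that happens.

# Overridable for dry runs; by default the real tools are used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
LOGGER="${LOGGER:-logger}"

restart_if_failed() {
    unit="$1"
    # "is-failed --quiet" exits 0 only when the unit is in the failed state
    if "$SYSTEMCTL" is-failed --quiet "$unit"; then
        "$LOGGER" -t restart-if-failed "$unit was in a failed state, starting it"
        "$SYSTEMCTL" start "$unit"
    fi
}

# Default to the MariaDB unit from the cron line above
restart_if_failed "${1:-mariadb.service}"
```

Saved somewhere like /usr/local/bin/restart-if-failed.sh (a path I am assuming, adjust to taste), the cron line would call that script instead of the one-liner. On a machine where syslog messages end up in the journal, `journalctl -t restart-if-failed` should then show how often the restart actually kicked in.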

And maybe I could get a notification that this happened by sending myself an email… except emailing from a server is hard these days… If you know, you know.
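Since my monitoring already talks to Slack, a Slack incoming webhook might be an easier notification channel than email. This is a swap of technique, not what I run today, and only a rough sketch: `SLACK_WEBHOOK_URL` is a placeholder for a webhook URL you would have to create in your own workspace, and the function names `slack_payload` and `notify_slack` are invented for the example.

```shell
#!/bin/sh
# Sketch: restart a failed unit and post a short message to Slack.
# SLACK_WEBHOOK_URL is a placeholder; nothing is sent when it is empty.
SLACK_WEBHOOK_URL="${SLACK_WEBHOOK_URL:-}"
CURL="${CURL:-curl}"

# Build the JSON body. Assumes the unit name needs no JSON escaping.
slack_payload() {
    printf '{"text":"%s was in a failed state and has been started again"}' "$1"
}

# POST the message to the webhook; a no-op if no URL is configured.
notify_slack() {
    [ -n "$SLACK_WEBHOOK_URL" ] || return 0
    "$CURL" --silent --show-error --max-time 10 \
        -H 'Content-Type: application/json' \
        --data "$(slack_payload "$1")" \
        "$SLACK_WEBHOOK_URL" >/dev/null
}

unit="${1:-mariadb.service}"
if systemctl is-failed --quiet "$unit"; then
    # Only notify when the start command itself succeeded
    systemctl start "$unit" && notify_slack "$unit"
fi
```

The webhook URL is effectively a secret, so it is better passed through the environment or a root-only file than written into the crontab.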


By Marc Olivier Meunier
