A necessary post-mortem

TL;DR: This is a post-mortem for the degraded performance of miniature.photography that occurred between December 9th and December 10th.

Preface

When I set up miniature.photography, I used the version of PHP that came with the default repositories of Debian stable, 7.4. While that worked at the time, some of the dependencies Pixelfed relies on soon began requiring PHP 8.x. To be able to keep applying security fixes, and because the new version of Pixelfed contains features that are of interest, a timely upgrade was required.

Incident

I dutifully performed the upgrade to PHP 8.1 on the morning of Friday, December 9th 2022, and then upgraded Pixelfed, following the steps noted in the Pixelfed documentation:

git pull origin dev
composer install
php artisan config:cache
php artisan route:cache
php artisan migrate

No errors were reported back; however, it was no longer possible to upload pictures. For some reason Pixelfed created the files with the wrong set of permissions, effectively prohibiting the webserver from accessing them. I initially worked around that by performing some fuckery with file permissions.

When I wanted to report that bug, I found out that someone else had luckily beaten me to it. This issue shouldn’t bite anyone else.

Uploading worked again, stories worked, all the other local functionality seemed to work. I could also find accounts on other instances through the search mask, which - wrongly, as is obvious in hindsight - gave me the impression that federation was working as well. That in turn made me consider the upgrade to be successful and complete, and I communicated this in a post.

I didn’t really suspect anything, because people were able to post new pictures, until I noticed that an account (on another instance) that usually posts at least once per day hadn’t posted since the upgrade.

Manually checking the profile revealed that said account had in fact posted since the upgrade, but for some reason the posts never showed up in my timeline. Further research confirmed that no federation was happening at all.

I started looking into the issue, but even after two hours of checking every potential cause I could think of I was no closer to understanding what was happening. I communicated this frustrating status yesterday evening, December 10th 2022.

After going through the documentation for the umpteenth time I realized that the processing of the queue was managed by a separate systemd service. Checking its unit file revealed the following line:

ExecStart=/usr/bin/php7.4 /var/www/pixelfed/artisan horizon

This is obviously a problem: the application was already using PHP 8.1, so the queue worker, still running under PHP 7.4, was failing its jobs because of incompatibilities. Fixing the path and restarting the services immediately caused the server load to spike, up to 20, which is a good thing, because it meant that the backlog of jobs was being processed. After roughly 30 hours everything was working properly again.
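
A hard-coded interpreter path like the ExecStart= line above is easy to miss, so a small search over the unit files can help after a PHP upgrade. The following is only a sketch: the function name find_stale_php_units is made up for this post, and it assumes the binaries are named like php7.4 / php8.1 (as on Debian) and that the units live in a directory such as /etc/systemd/system.

```shell
#!/bin/sh
# find_stale_php_units DIR WANTED_VERSION   (hypothetical helper)
# Prints every ExecStart= line below DIR that calls a versioned PHP binary
# other than WANTED_VERSION. Exits 0 if at least one such line was found,
# 1 if there was nothing to report.
find_stale_php_units() {
    unit_dir="$1"
    # Escape the dot so that "8.1" is matched literally.
    wanted_re=$(printf '%s' "$2" | sed 's/\./\\./g')

    # First grep: all ExecStart= lines that mention a binary like php7.4.
    # Second grep: drop the ones that already use the wanted version.
    grep -rHE '^ExecStart=.*/php[0-9]+\.[0-9]+' "$unit_dir" 2>/dev/null \
        | grep -vE "/php${wanted_re}([[:space:]]|\$)"
}

# Example call (path is an assumption, adjust to your setup):
#   find_stale_php_units /etc/systemd/system 8.1
```

Any line it prints is a unit that still points at an old interpreter. After fixing the path, systemd needs a daemon-reload and the service a restart before the change takes effect.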

Lessons identified / learned

  • Updating Pixelfed doesn’t just involve the application itself; it involves the queueing as well. I added this to my personal upgrade “guide” so that I will look there when troubleshooting, should it become necessary in the future.

  • In the case of Pixelfed, webserver and PHP logs alone are not enough to triage issues. For Laravel applications that use Horizon, the application logs are crucial as well. I noted this to make sure I check there when debugging in the future.

  • Simply checking that posting to the local instance works and that other instances can be found through the search mask does not suffice to ensure full functionality. For future upgrades I’ll make sure to include various tests of federation.
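
As a starting point for such a post-upgrade check, here is a rough sketch of two helpers. Both function names are made up for this post. The first relies on Horizon's horizon:status command printing a message that contains "running" when the queue worker is up, the second on Laravel's default log format (lines like "[date] production.ERROR: ..."); treat both as assumptions to verify against your own installation. Neither proves that federation works end to end; they only cover the queue and the application log.

```shell
#!/bin/sh
# horizon_is_running ARTISAN_CMD...   (hypothetical helper)
# Runs "<ARTISAN_CMD> horizon:status" and succeeds only if the output
# mentions "running". Assumes Horizon reports "inactive" or "paused"
# otherwise.
horizon_is_running() {
    "$@" horizon:status 2>&1 | grep -qi 'running'
}

# count_log_errors LOGFILE   (hypothetical helper)
# Prints the number of ERROR-level entries in a Laravel log file,
# assuming the default "[timestamp] env.LEVEL: message" line format.
count_log_errors() {
    grep -c '\.ERROR:' "$1"
}

# Example calls (paths are assumptions, adjust to your setup):
#   horizon_is_running php /var/www/pixelfed/artisan || echo "Horizon is not running"
#   count_log_errors /var/www/pixelfed/storage/logs/laravel.log
```

A real federation test would still involve an account on another instance, for example checking that a post made there shows up locally within a few minutes.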


I apologize for the inconvenience, especially for the time it took to identify that things weren’t working as intended. This upgrade went significantly worse than I had anticipated. I’m confident that the next upgrade won’t cause issues this severe - or, at least, that I’m better prepared for it.