Hello all! We’re back with another “What’s cooking”, after another too-long hiatus since September. We did promise to resume monthly updates, but in hindsight that seems a bit ambitious given everything on our plate. For now we’re going to aim for the more modest ambition of publishing updates quarterly.
Drew’s updates
My work has been proceeding as expected since September, with significant disruptions due to the LLM bot crisis, which to my consternation has occupied a substantial amount of time I would have rather spent on more important priorities. Nonetheless, I am happy to report that we have more or less got the bot crisis under control – though it consumed nearly all of my attention for several weeks, I can now occasionally enjoy up to several days in a row without one of our services being knocked out by LLM scrapers.
In any case, I finished the git webhook upgrades I mentioned in September, though for now there is no option for a webhook recipient to intervene to prevent a push from proceeding. I’ve used these and other GraphQL-native webhooks to make inroads towards updating our internal users of the legacy API, notably by moving the project hub to the new API and porting over the first webhooks from the legacy system to GraphQL webhooks.
A matter of greater interest to most users is the upgrades I’ve been gradually rolling out for the billing system. There are numerous improvements in the pipeline, including support for more payment methods (e.g. iDEAL, SEPA bank transfers, etc.) and more currencies (in particular Euro). But this is a time-consuming process of gradual improvements and migrations. The billing system has been largely untouched since it was originally written in 2018, and there are thousands of paying users whose payments we’re processing every day; we need those payments to flow correctly to keep the lights on. Each improvement is as small as possible and is carefully reviewed and monitored as it’s rolled out to ensure that the system works correctly without interruption.
This work is part of the bigger-picture effort to move SourceHut entirely from the US to the EU, in particular by processing payments through our European entity and in accordance with European laws and regulations. We’re getting there! I hope to continue this work through the rest of March, with the ambitious goal of accepting our first payments in Euro in Q1 (but, more realistically, early Q2… it’s a lot of work).
In addition to billing work, my main focus for the next quarter will be paying down more tech debt, and in particular finishing the deprecation of the legacy API. In the past couple of weeks we have made more aggressive moves to get rid of it, removing the legacy API documentation from man.sr.ht and disabling some features which are no longer being used by anyone. I have some flashy end-user-facing features coming up soon as well – but let’s make it a surprise.
Conrad’s updates
Before I start, I’ll let you in on a little secret. Like many infrastructure engineers, I have evolved to have two questionable superpowers. One is stumbling across very weird issues that take days to debug only to be fixed eventually by a single-line code change. The second is to perform major refactorings, often coördinated across multiple codebases, resulting in plenty of code changes but no discernible difference in functionality of any of the touched components. Both these superpowers are greatly under-appreciated by managers (“Last week I stared at packet traces, then disabled a single sysctl on all servers, that’s all, really…”). Fortunately, there are no managers at SourceHut, because I have plenty of the second category for you!
All our Python packages are finally PEP 440-compliant! 🎉 This took so long and added zero value for our users, yet it was inevitable. And it did have the pleasant side effect of finally unbreaking builds for submitted patches again.
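For anyone wondering what PEP 440 compliance looks like in practice, here is a minimal sketch built on the canonical-form regular expression given in the appendix of PEP 440. The version strings below are hypothetical examples, not versions taken from our packages, and the pattern deliberately ignores local version labels (the `+local` suffix).

```python
import re

# Canonical-form pattern from the appendix of PEP 440. It covers public
# versions only; local version labels such as "1.0+local" are not handled.
_CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version: str) -> bool:
    """Return True if the string is a canonical PEP 440 public version."""
    return _CANONICAL.match(version) is not None


# Hypothetical examples: plain releases pass, while "git describe"-style
# suffixes and a leading "v" do not.
for candidate in ["0.75.2", "1.0rc1", "2024.3.post1", "0.75.2-3-gabc123", "v1.0"]:
    print(candidate, is_canonical(candidate))
```

A check like this is strict by design: PEP 440 also defines normalization rules for looser spellings, which this sketch does not attempt to apply.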
Another epic but mostly invisible undertaking was the recent “shared asset refactoring”. This one, however, brought significant quality-of-life improvements for developers. It also drastically changed the process for setting up a dev environment. The generic instructions have already been updated, but I am also working on a more detailed write-up. Piggy-backing on this big change, we also renamed all executables to have a consistent naming scheme, which should help developers and instance admins alike.
And yet more from the “big changes you didn’t notice” department: we finally consolidated all SSH key handling in meta.sr.ht. Previously, services requiring SSH access (like git and hg) would have their own copy of the SSH key table, which was prone to stale data and other inconsistencies. This unlocked the removal of various chunks of legacy API surface, with even more cleanup yet to come. Specifically, the entire SSH dispatching - which currently has very similar implementations in various services - will get a face-lift, improving code re-use and reducing complexity.
Okay, I’ll give you a break - here is something very tangible: in lists.sr.ht, we removed the instance-wide configuration that blocks emails with HTML parts, putting list owners in full control to make their own choices about blocking such parts.
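To make “HTML parts” concrete, here is a minimal sketch of how a per-list policy could detect them using only Python’s standard `email` package. This is not the lists.sr.ht implementation: `has_html_part`, `should_reject`, and the `reject_html` flag are hypothetical names chosen for illustration.

```python
from email.message import EmailMessage, Message


def has_html_part(msg: Message) -> bool:
    """Return True if any MIME part of the message is text/html."""
    return any(part.get_content_type() == "text/html" for part in msg.walk())


def should_reject(msg: Message, reject_html: bool) -> bool:
    """Apply a hypothetical per-list setting rather than an instance-wide one."""
    return reject_html and has_html_part(msg)


# A multipart/alternative message carrying both a plain-text and an HTML body.
mixed = EmailMessage()
mixed["Subject"] = "[PATCH] example"
mixed.set_content("plain text body")
mixed.add_alternative("<p>HTML body</p>", subtype="html")

# A message with a plain-text body only.
plain = EmailMessage()
plain.set_content("plain text only")

print(should_reject(mixed, reject_html=True))
print(should_reject(mixed, reject_html=False))
print(should_reject(plain, reject_html=True))
```

The point of the sketch is that the decision takes the list’s own setting as input, so two lists on the same instance can treat the same message differently.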
As for what’s cooking right now: besides the above-mentioned SSH dispatching, I am working on refactoring log file locations for all services and providing log rotation policies with our packages. We are also preparing to extend our Ceph cluster, and we’ll have to retry botched Alpine upgrades on two hosts which had an unfortunate combination of packages installed, triggering some very sneaky breakage in Alpine’s ceph packages. So, you know, lots of epic stuff that no one will notice. Unless it breaks something.