Autolint on Build

In my last role I was hired to start the company's quality engineering department. If you're not familiar, quality engineering is not quality assurance or quality control: the goal is not just to catch bugs before they go out, but to set up a system so that the bugs never exist in the first place. This system might be a process that every feature has to go through before it can be released, or it might be a set of requirements the codebase itself must meet before code can ever make it to a testing environment. As the name suggests, I was tasked with engineering quality into an organization.

From that experience, the ones that have come since, and now looking back on my own behavior as a software engineer, I believe the fastest way to improve the quality of your software is to automatically lint code during normal build steps.

You can just test more

Testing by hand is a process. I'm not a fan of process.

While the Checklist Manifesto did make something in my brain click about making manual processes more reliable, I almost always groan, often audibly, whenever I have to do anything manual and repetitive. Others may not have this same reaction, but I regularly witness software engineers work around processes. In my mind, the only way to improve the quality of the software we shipped was to shift left and automate the quality controls.

Shift-left testing is an approach to software testing and system testing in which testing is performed earlier in the lifecycle (i.e. moved left on the project timeline). It is the first half of the maxim "test early and often".

The furthest left you can shift is the developer's computer, before it ever reaches the eyes of a peer. This is also the perfect place to write automated tests, since the developer is in the best position to know what the expected behavior of the code is.

This isn't to say manual testing isn't necessary. Engineers can write the best set of checks, but if the logic itself is incorrect, those tests will just validate incorrect behavior. A human does need to use the system at some point to make sure it not only behaves consistently, but also correctly.

You can't always write tests

One of the perpetually hot topics in the software community is automated testing. Various nuances here were extremely controversial even within my career, less than 10 years ago! But the biggest thing to know: if your tests don't pass, your code cannot be deployed.

Writing tests is a process. They almost always take as much time as, if not more than, writing the functional portions of your code. And if you don't remember, software engineers are very good at not doing processes. What if you didn't write tests? Then nothing would check your code and you could just ship it! What if the tests take a long time, or one of them fails occasionally for reasons you can't explain? Delete the tests! Especially in large, old codebases, this is actually common practice. When no one knows what a test is supposed to test, or whether its assumptions still hold, it's easier to get rid of the test than to dig up the reasoning for its existence.

What if you require the tests? What if a person, say an engineer with a focus on quality, were required to check every code change and verify the tests were sufficient to validate the feature or fix? That's been proposed several times in my career, and every time I've had to shoot it down. That's not engineering! If I'm asked to improve quality in a system, I have to integrate into the system, not force everything to funnel through me.

Even further left

If you can't guarantee tests get written, what else can you do for quality? How can you shift even further left? Could you shift testing to before tests get written? It turns out, there's a lot more you can do to make the developer experience so nice that no one would dare skip best practices. Things like automatically formatting code, or running an analyzer to catch bad practices and potential errors outside of the actual logic in your program. This latter group of tools is called linters, and they might be the most valuable tools in the development workflow.

Why? Linters are (or tend to be) fast. There's no process to think about when execution time is measured in tens of milliseconds. They also tend to be quite good, aggregating decades of best practices and gotchas into extremely compact tools. Things like Error Prone for Java or Ruff for Python add basically no overhead to build times but catch things that compilers don't.
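
For a concrete taste of what that means, here's a classic Python gotcha that the interpreter happily accepts but Ruff flags (as B006, from the bugbear ruleset, when those rules are enabled); the function is made up for illustration:

    # The default list is created once, at function definition time,
    # and then shared across every call.
    def add_tag(tag, tags=[]):
        tags.append(tag)
        return tags

    add_tag("a")  # ["a"]
    add_tag("b")  # ["a", "b"] -- state leaked between calls

    # Running `ruff check --select B .` reports B006 (mutable-argument-default)
    # in milliseconds; no test required.

No compiler error, no exception, just a subtle bug waiting for production traffic. That's the class of problem linters erase for free.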

The issue is that there are so many linters out there that you may want to run. A single repository might want to lint and format its Python code, lint shell scripts, verify its build file is up to date, generate an image for its database schema, and lint a container file. What was not a process is now 5 or 6 commands to run, and every codebase might have a slightly different set of commands. This is now a process again.
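
To make that concrete, a hypothetical repository's ritual might look something like this (the tools are real; the exact set, and the schema script, are made up):

    ruff format . && ruff check .   # format and lint the Python code
    shellcheck scripts/*.sh         # lint the shell scripts
    buildifier -r .                 # keep the (Bazel) build files tidy
    hadolint Dockerfile             # lint the container file
    ./bin/generate-schema-diagram   # hypothetical project script

Nobody runs all of these by hand, every time, in every repository. Some subset gets skipped, and quality leaks out through the gaps.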

Lint on build

At some point at this job I found pre-commit, a utility that makes it easy to run a suite of automated checks before every git commit. The genius of the tool is that a hook fails if it modifies any files, which means you can format and lint all in one go. All of a sudden, pre-commit run -a runs all checks over your codebase in one command, and if you've set it up correctly, this will be fast. And if you pre-commit install, you can even make it so these all run before code can even be committed. Suddenly every commit is of at least some minimum quality.
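
A minimal .pre-commit-config.yaml covering the grab bag above might look like this (the hooks and repositories are real; the rev pins are illustrative, so use whatever is current):

    repos:
      - repo: https://github.com/pre-commit/pre-commit-hooks
        rev: v4.6.0  # illustrative pin
        hooks:
          - id: trailing-whitespace
          - id: end-of-file-fixer
      - repo: https://github.com/astral-sh/ruff-pre-commit
        rev: v0.4.4  # illustrative pin
        hooks:
          - id: ruff          # lint
          - id: ruff-format   # format
      - repo: https://github.com/shellcheck-py/shellcheck-py
        rev: v0.10.0.1  # illustrative pin
        hooks:
          - id: shellcheck

One file, checked into the repository, and the whole process collapses back into a single command.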

When I joined my next company, one of the first things I did was implement a similar setup for the Java codebases I worked in. As much as possible, I wanted build steps to be encapsulated in a single command that would run by default. With some rather unfortunate Gradle wizardry in a couple of places, I added code generation, auto formatters, and linters to the codebase with no human-discernible difference in build times.
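
I can't reproduce that setup here, but a sketch of the idea in Gradle, using the real Spotless and Error Prone plugins (versions illustrative, wiring simplified), looks roughly like:

    // build.gradle -- a minimal sketch, not the actual setup
    plugins {
        id 'java'
        id 'com.diffplug.spotless' version '6.25.0'  // auto formatting
        id 'net.ltgt.errorprone' version '3.1.0'     // linting inside javac
    }

    repositories { mavenCentral() }

    dependencies {
        // Error Prone hooks into javac, so every compile is also a lint pass
        errorprone 'com.google.errorprone:error_prone_core:2.27.0'
    }

    spotless {
        java {
            googleJavaFormat()  // no more formatting debates in review
        }
    }

    // The "wizardry": format before compiling, so a plain `./gradlew build`
    // fixes formatting and fails on lint findings by default. (Watch for
    // task-dependency cycles in more complex builds.)
    tasks.named('compileJava') { dependsOn 'spotlessApply' }

The point isn't this exact configuration; it's that the default command a developer already runs does the linting for them.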

Similar to the introduction of pre-commit, there was an immediate jump in code quality. Linters catch things that humans may not, and auto formatters make spotting differences in code reviews significantly easier. The effects, perhaps unlike in economics, trickle down almost immediately: with less effort required to give quality code reviews, reviewers could suggest test cases. By suggesting test cases, developers started writing more tests ahead of time to reduce comments on code reviews. By writing more tests, features shipped with higher quality.

tl;dr: make it trivial to lint and format your codebase.