Python code quality with Ruff, one step at a time - Part 1

Introduction

When I started to work on this post, I had first planned to present several Python static analysis tools and how they suit our specific needs: linting, code formatting, software composition analysis, code metrics, vulnerability and bug detection.

However, after a bit of bibliographical research, it appeared there were already excellent resources that do just that; see for instance this article, that one, or this comprehensive list of tools. So I ended up changing my mind and decided to offer something different: I am going to walk you through introducing a static analysis tool on a real, already well-sized codebase.

The naive way to perform such a task would be as follows:

  • install the tool locally,
  • run it on the codebase with its default configuration,
  • realize the, possibly large, number of warnings it flags (after all, many software projects have been alive for several years in the hands of teams of several developers and can easily end up weighing thousands or even millions of lines of code),
  • despair a bit,
  • fix a few defects,
  • run the tool as a non-blocking step in the continuous integration pipeline (if there is one, if there is enough time and energy, if the tool and the infrastructure make it easily possible),
  • get interrupted by another, more urgent, task and stop,
  • forget about it all and ignore the warnings in the CI pipeline, until some code auditor comes back a few years later and mentions it in a paragraph of their 807-page report.

That’s, unfortunately, how we often get owned IRL, time and time again. Trapped in this conundrum of a limited time budget and an overwhelming workload, we can easily resign ourselves and never make any progress along the quality axis. Slowly, entropy kicks in and the codebase degrades. I’d like to propose a working strategy out of this very real trap.

Let’s first state the objectives I would like to reach:

  • I have a limited time budget, maybe only half an hour at the end of each week. I know I can be interrupted at any time to solve more pressing matters (such as adding a shiny new feature or incorporating a bit of AI). So I would like to be able to invest any amount of time, gradually.
  • I would still like to capitalize on all the work done. I want to avoid any of my code improvements, which take time and effort to produce, being cancelled by a later code modification. After all, time is money, or rather energy. I would like to raise the level of code quality permanently: ensure that all new code follows the same stylistic rules from now on, and preclude any bug the tool is able to hunt from creeping back into the codebase later.
  • Lastly, I do not want to put the existing behaviour of the software at risk. It would be a shame to imperil some functionality for the sake of code quality, wouldn’t it? Any code modification made for the sake of quality should be semantically lossless.

To remember these objectives, let us give them nifty names: time constraints, capitalist mindset and delicate chinaware. Interestingly enough, by setting myself more constraints, I will make better use of my time. This still requires a bit of methodology, a bit of ingenuity and, more than anything else, assiduity, maybe at a mildly obsessive level. A grain of peacefulness may counterbalance the psychological strain of managing it all.

Ruff

For the tool, I picked Ruff. It provides a linter and a code formatter for Python, and it is programmed in Rust. The recent releases on GitHub show the project is active, at least at the time of writing this article. With 33k stars, more than Pylint or Flake8, it seems quite popular. It’s very clearly documented. As you will see, it’s extremely easy to install and use.

The development team made many sound design choices in terms of customization. Rules can be switched on and off, down to the granularity of a single rule. Violations can be suppressed by inserting noqa directives directly into the code, or alternatively through the configuration. In this respect, the --add-noqa option particularly drew my attention: it automatically adds specific noqa directives to silence all existing warnings. This may enable one possible strategy to gradually fix all violations in an existing large codebase.
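For reference, here is what such directives look like (a generic illustration of mine, not code from DFIR-IRIS): a noqa comment scoped to a rule code silences only that rule on its line, while a bare noqa silences everything flagged on the line.

# A generic illustration, not code taken from DFIR-IRIS.
import json  # noqa: F401

value = None
# A bare noqa (no rule code) silences every warning on the line, here E711:
if value == None:  # noqa
    print("value is unset")

Running ruff check --add-noqa would sprinkle comments of this kind over every line that currently triggers a warning.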

The Ruff linter can automatically fix some of the violations it reports. Fixes come in two flavours, safe and unsafe; unsafe fixes may change behaviour in corner cases. The integration with some popular continuous integration environments (such as GitHub Actions) is also documented. The list of rules is quite extensive, even though it does not match Pylint’s yet.
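Back to the fixes for a moment. To make the safe/unsafe distinction concrete, here is a small example of my own (not taken from DFIR-IRIS): mechanically rewriting a == None comparison into is None looks harmless, yet the two are only equivalent as long as nothing on the left-hand side overrides __eq__. This is exactly the kind of corner case an unsafe fix can trip over.

class AlwaysEqual:
    # Contrived class whose equality is deliberately permissive.
    def __eq__(self, other):
        return True

x = AlwaysEqual()
print(x == None)  # True: the comparison goes through AlwaysEqual.__eq__
print(x is None)  # False: an identity check, so a blind rewrite changes behaviour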

On the cons side, Ruff is not yet at version 1.0.0, which may result in breaking changes at some point. And, unlike Flake8, it does not provide a plugin system to write extensions and leverage the power of the community. Let us hope this issue gets resolved soon.

DFIR-IRIS

I will demonstrate the methodology on the codebase of the open-source software DFIR-IRIS. DFIR-IRIS is a collaborative platform for incident response. It allows security analysts to collect, assemble and share different pieces of information (alerts, assets, indicators of compromise, notes...) during the investigation of complex security incidents. It can be plugged into various feeds, such as VirusTotal, MISP, IntelOwl... Automation is made straightforward by calls to its REST API. The DFIR-IRIS backend is written in Python. As such, it is an ideal candidate for this walkthrough.
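To give a taste of that automation (a hedged sketch: the instance URL, the endpoint path and the authentication header are illustrative assumptions of mine, not checked against the DFIR-IRIS documentation), a few lines of Python are enough to query the platform over HTTP:

import requests

IRIS_URL = "https://iris.example.org"  # assumption: the address of your own instance
API_KEY = "change-me"                  # assumption: an API key generated in DFIR-IRIS

# Hypothetical endpoint path, used purely for illustration.
response = requests.get(
    f"{IRIS_URL}/manage/cases/list",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
response.raise_for_status()
print(response.json())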

Let’s get started. It’s going to be fun. No harm in a dose of positive thinking to encourage oneself: it’s always better than falling into negative obsessive loops of the mind.

From its GitHub interface, I start by forking the project with all its branches. As indicated in the coding style, development happens on the develop branch. So, once the fork is cloned locally, I immediately switch to this branch:

git switch develop

At the time I am writing this sentence, the latest commit on the branch is the one with sha b3a36cd4f1e465f20f517590061d3a5bb7973bca, in case you want to reproduce the approach. To avoid installing unnecessary Python packages in my home directory, I immediately create a virtual environment.

python -m venv static-analysis-venv
source static-analysis-venv/bin/activate

A quick glance at the directory hierarchy tells me that the Python source files are all located in the source/ sub-directory. Out of curiosity, I immediately wonder how many lines of code are present. Over the course of my career, this has become a reflex to quickly gauge the scale of the project I am about to engage with. Like weight classes in fighting sports, I tend to rank software projects by size. Let us install pygount, a Python tool in the lineage of sloccount:

pip install pygount
pygount source/ --format summary

Almost 25K lines of Python code.

Continuous integration

It is time to install Ruff and launch a first check:

pip install ruff
ruff check source

The run yields 123 errors, of which 48 can be fixed automatically and safely, and 46 more only in an unsafe way. Interesting: this already gives me some ideas on how to proceed gradually.
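I will not apply any fix for now, but for reference the command line makes the distinction explicit: --fix applies the safe fixes, --unsafe-fixes opts into the remaining ones, and --diff previews the changes without writing anything (commands based on my reading of the Ruff CLI):

ruff check --diff source                 # preview the safe fixes without touching any file
ruff check --fix source                  # apply the safe fixes
ruff check --fix --unsafe-fixes source   # also apply the unsafe ones, to be reviewed carefully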

Remember, I want to capitalize immediately. So my first move is to set up the continuous integration pipeline so that Ruff is run after every push. I know that there are still unfixed warnings reported by the tool, but that’s the point: I want to check that the pipeline blocks whenever defects are found in the codebase. Then, I will come up with a way to bring the error count down to 0 and have the pipeline pass (I will cheat, but for good reasons).

In a way, this approach is inspired by the TDD philosophy underlying the red-green-refactor cycle. Since I don’t want to break the main development branch, I first isolate my work on a new branch. This will give me time to take any detour I like and to decide when to propose a pull request:

git switch -c ruff_step_by_step
git push --set-upstream origin ruff_step_by_step

Next, I add the Ruff GitHub Action, as documented here, in a new step of the existing CI pipeline:

- name: Check code with ruff
  uses: astral-sh/ruff-action@v2
  with:
    args: check --output-format=github
    src: ./source

Yes, the pipeline fails! To set it back to green, I am going to craft a configuration file which explicitly excludes all the rules triggered in this run. Alternatively, I could have added noqa directives automatically with the --add-noqa option. I don’t do this, because I plan (sworn promise) to put the rules back one by one, correcting all violations of a given rule within a single commit. Since the total number of violations is not extremely high, this seems like a plausible plan.
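For a single rule, the future loop I have in mind might look roughly like this (a sketch only; the actual fixing work is the subject of part 2):

# 1. Remove the rule, say F841, from the ignore list in pyproject.toml
# 2. List the remaining violations for that rule only
ruff check --select F841 source
# 3. Fix them, by hand or with a safe automatic fix, then lock in the gain
git commit -am "Fix all F841 violations and re-enable the rule"   # illustrative message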

If I ask the tool to list the violations in a concise way, I count 123 errors:

ruff check --output-format=concise source

The last reported error violates rule F841, so let’s craft a minimal pyproject.toml file to configure Ruff so that it ignores this rule:

[tool.ruff.lint]
ignore = ["F841"]

We are down to 109 errors. Let’s continue this loop, next ignoring E712, and so on, until all checks pass. This is the configuration file I end up with:

[tool.ruff.lint]
ignore = ["E402", "E711", "E712", "E721", "E722",
         "F401", "F403", "F541", "F821", "F841"]

I commit and push. Mission Accomplished. I just Made the Codebase Great Again. I told you I was going to cheat. Sometimes, this is the easiest and most practical path forward. After all, one can still recognize that some value has already been added. We can print the settings Ruff uses to lint the code:

ruff check --show-settings

The output includes the list of all enabled rules:

multiple-imports-on-one-line (E401),
multiple-statements-on-one-line-colon (E701),
multiple-statements-on-one-line-semicolon (E702),
useless-semicolon (E703),
not-in-test (E713),
not-is-test (E714),
lambda-assignment (E731),
ambiguous-variable-name (E741),
ambiguous-class-name (E742),
ambiguous-function-name (E743),
io-error (E902),
import-shadowed-by-loop-var (F402),
late-future-import (F404),
undefined-local-with-import-star-usage (F405),
undefined-local-with-nested-import-star-usage (F406),
future-feature-not-defined (F407),
percent-format-invalid-format (F501),
percent-format-expected-mapping (F502),
percent-format-expected-sequence (F503),
percent-format-extra-named-arguments (F504),
percent-format-missing-argument (F505),
percent-format-mixed-positional-and-named (F506),
percent-format-positional-count-mismatch (F507),
percent-format-star-requires-sequence (F508),
percent-format-unsupported-format-character (F509),
string-dot-format-invalid-format (F521),
string-dot-format-extra-named-arguments (F522),
string-dot-format-extra-positional-arguments (F523),
string-dot-format-missing-arguments (F524),
string-dot-format-mixing-automatic (F525),
multi-value-repeated-key-literal (F601),
multi-value-repeated-key-variable (F602),
expressions-in-star-assignment (F621),
multiple-starred-expressions (F622),
assert-tuple (F631),
is-literal (F632),
invalid-print-syntax (F633),
if-tuple (F634),
break-outside-loop (F701),
continue-outside-loop (F702),
yield-outside-function (F704),
return-outside-function (F706),
default-except-not-last (F707),
forward-annotation-syntax-error (F722),
redefined-while-unused (F811),
undefined-export (F822),
undefined-local (F823),
unused-annotation (F842),
raise-not-implemented (F901),

This means the source code already conforms to 49 rules, and the pipeline now ensures that, from this point on, it will remain so. I have just started fastening the seat belt. If I were, like the song says, out of time, I could stop at this point and propose a pull request. That would already be some kind of improvement. But don’t panic, there’s still some more music to come.

In the following, every time I signal that I commit and push, I could alternatively have stopped working and proposed a pull request. If you replicate my steps, you can evaluate roughly how much time elapses from one such point to the next. This may convince you of the progressive nature of this incremental approach.

 

The second part of this article will be published on the 23rd of October!

 
