Jiby's toolbox

Jb Doyon’s personal website

My python toolbox

Posted on — Apr 1, 2024

The packaging workflow of Python has historically been a bit messy, starting with pip being just good enough, and virtualenvs being a great idea but a bit unwieldy. Installation was difficult, and tools like anaconda filled the gaps.

Lately, there’s been a renaissance of package tooling in Python, thanks to other languages innovating in the area, inspiring and cross-pollinating the ecosystem, improving the comfort of development.

I want to present here the tools I use to work with Python, usually the latest iterations of well-known concepts, to show that Python can be both fun to play with and safe enough, if we use modern tools.

This post continues the series on “my way of working”: see my git workflow, my presentation on Makefiles for documentation and shorthand. And as presented previously, all of the below patterns can be seen in action in my python template.

In a nutshell

I use and recommend:

- A system install of Python, not pyenv
- Pipx for CLI installs, not pip
- Poetry as package manager
- Ruff as linter and formatter
- Mypy for type checking
- Pre-commit to run all linters consistently
- Editor support via LSP
- Pytest (and a few plugins) for testing
- Sphinx for docs
- Make to shorten it all

We’ll elaborate on each of these tools next, but again, the best way to try using these tools together is to use my personal python project template, which generates a new project with my ideal setup ready to run make in.

System install of Python, not pyenv

I have never really needed multiple simultaneous versions of Python per project, so I never found pyenv useful: system versions do fine.

So I recommend against using pyenv, which I find to be overkill for almost every project I have ever worked on, adding a layer of complexity, slowness, and brittleness over every Python command, for no benefit.

Similarly, I don’t use or recommend anaconda: I have never needed it, and the last time I used it (a long time ago), it took over just a bit too much of the tooling at once for my taste.

Instead, pick your system’s Python installation (apt install python3 on Ubuntu, brew install python@3.12 on macOS via Homebrew, whatever works), make sure it’s reasonably supported (no end-of-life versions), and ensure you have pip and virtualenv available to start with.

Remember that Python is quite good at compatibility around minor versions, so if your project says it needs Python 3.10, there usually is not much issue running it locally from 3.12. Just ensure your CI matches the version of Python you target officially to spot any surprises.

Pipx for CLI installs, not pip

With the above Python installed, do this one and only pip invocation:

pip install --user pipx
Code Snippet 1: Your last pip command ever

Pipx is a wrapper for pip + virtualenv for Python command line tools. It will manage any package just like pip would, exposing the commands, but setting up internally a virtualenv per command/package, ensuring tools never conflict with each other.

Note you will need to run the following to get the commands exposed in shell properly:

pipx ensurepath
Code Snippet 2: Set up the $PATH for pipx

This allows us to install any CLI by replacing pip install <my-package> with pipx install <my-package>.

Npm users may recognize the pattern of npx vs npm regarding running commands in an isolated fashion.

Now we can install any CLI tool we want, and have it installed locally for our user without mangling system packages.

Pipx can even be (ab)used to install packages from git repositories without published packages, which is convenient, though not something I encourage at scale.

Poetry package manager

Poetry takes over the development process, replacing the need for requirements.txt files and manually managed virtualenvs. Instead, we specify the dependencies and version ranges we accept, plus dependency groups, and get version pinning for free.

I like Poetry: it fits my mental model of how to work with code, doing the same job as cargo, and its TOML file (pyproject.toml) is very similar.

A few sample commands to get inspired by:

poetry install
Code Snippet 3: Installs all dependencies of a project, generating virtualenv if needed
poetry shell  # Activate the venv
# Alternatively, just run a single command in venv:
poetry run <mycmd>
Code Snippet 4: Running arbitrary commands, as shell or one-off
# Tree of dependencies incl. transitive:
poetry show --tree
# Table of outdated packages:
poetry show --outdated
# Update outdated packages within specified ranges:
poetry update
Code Snippet 5: Show and manipulate dependencies
# Prints current package version
poetry version --short
# Bump the package:
poetry version <major-minor-patch-or-number>
poetry version minor
poetry version 1.2.3
Code Snippet 6: Checking or bumping package's version

Note that, though poetry shell works fine as a venv activation command, I recommend scripts and users get used to poetry run <cmd> for all commands, to make explicit that the command runs inside the virtualenv. Once past the mental hurdle of adding this prefix, it’s really neat.

There’s just one piece of setup I force on all my fellow Poetry users: run the following:

poetry config virtualenvs.in-project true
Code Snippet 7: Ensures the virtualenv is created in project folder under .venv/, making it easy to play with the venv without poetry, and easy to wipe via find .venv/ -delete

I personally don’t understand why this config is not the default, rather than creating virtualenvs somewhere difficult to pin down.

Ruff, the mega linter, formatter

Once upon a time, we used a combination of isort, black, and flake8, and needed plugins for each of them for compatibility with the others.

But nowadays, ruff just takes over it all.

Ruff is, according to its docs:

An extremely fast Python linter and code formatter, written in Rust […]

10-100x faster than existing linters (like Flake8) and formatters (like Black) […]

Ruff can be used to replace Flake8 (plus dozens of plugins), Black, isort, pydocstyle, pyupgrade, autoflake, and more, all while executing tens or hundreds of times faster than any individual tool.

There’s nothing else to say: install this one tool, learn to tweak some of it (like enabling auto-fix mode for most rules), and just forget about linting ever again!

In particular, I enforce the rules requiring docstrings at the top level of modules and functions, but that’s just my preference, again to force myself to write a basic explanation of purpose.

Mypy for type checking

Modern Python has type hints. These are obviously not as thorough as statically typed languages, but preserve the benefit of both explaining types to humans reading the code, and being verifiable statically by tooling (just remember that at runtime, the interpreter does not care about these!).

So we use a type-checker, explicitly checking that the types are consistent.

I strongly recommend starting to use types for even the most simple business-domain interpretations of primitives: a tool for managing software deployment will likely benefit from defining a custom alias over str called Environment, another for Datacenter, etc, so as to clarify the meaning of parameters being passed via typing. This allows return types like tuple[Environment, Datacenter] as opposed to tuple[str, str].
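As a minimal sketch of that idea (the names and return values here are invented for illustration):

```python
# Distinct domain aliases over str: typing.NewType makes mypy treat
# Environment and Datacenter as incompatible types, at zero runtime cost.
from typing import NewType

Environment = NewType("Environment", str)
Datacenter = NewType("Datacenter", str)


def locate_service(service: str) -> tuple[Environment, Datacenter]:
    """Return where a service runs (hardcoded here for illustration)."""
    return Environment("production"), Datacenter("eu-west-1")


env, dc = locate_service("billing-api")
```

With this in place, mypy flags any call passing a Datacenter where an Environment is expected, even though both are plain strings at runtime.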

I suggest enforcing, through code review, the rule that “any function should have types for its inputs and output”. The escape hatch typing.Any is a valid type matching anything, a fallback for when we genuinely do not know, but it should generally be avoided, as it goes against the point of narrowing down types.

And for the typing pedants who say Python’s duck typing is supposed to enforce not type constraints but “shape” constraints: they can enjoy being smug, and then define typing.Protocol classes for type matching instead; the idea remains the same.
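A minimal sketch of that Protocol approach (the Shippable protocol and Parcel class are made up for illustration):

```python
# Structural typing: anything with a matching ship() method satisfies
# the protocol, with no inheritance relationship required.
from typing import Protocol, runtime_checkable


@runtime_checkable
class Shippable(Protocol):
    def ship(self) -> str: ...


class Parcel:  # note: does not subclass Shippable
    def ship(self) -> str:
        return "parcel shipped"


def dispatch(item: Shippable) -> str:
    return item.ship()


result = dispatch(Parcel())
```

mypy accepts dispatch(Parcel()) because Parcel has the right “shape”; runtime_checkable additionally allows isinstance checks against the protocol.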

Pre-commit to run all linters consistently

Pre-commit (the specific python tool from https://pre-commit.com, not the concept of having git hooks run before committing, though the tool linked does that and is named after it) is a tool to define “hooks” to run against the files of the repo, mostly for linters, but also for formatters.

The tool supports a truly impressive number of build systems1, isolating each tool in a virtualenv equivalent of the hook’s language, enabling hooks to be run repeatedly while staying isolated.

The default pre-commit-hooks repo defines a slew of useful checks, like “never allow committing a private SSH key” or “trim trailing whitespace in files”. These are inexpensive to run and helpful to enforce.

So linters like ruff and mypy above are managed by this tool, which is built to hook into git pre-commit hooks, giving a kind of cheap local “CI-lite” check before committing.
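As a sketch, a .pre-commit-config.yaml wiring these together might look like the following (the rev pins are placeholders; pin the tags you actually want):

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0  # placeholder tag
    hooks:
      - id: trailing-whitespace
      - id: detect-private-key
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.3.0  # placeholder tag
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.9.0  # placeholder tag
    hooks:
      - id: mypy
```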

Error exit codes in hooks fail the checks, but the tool also diffs the repo before and after running the hooks, so a formatter fixing format issues fails the pre-commit run because files were modified. For me, this usually means running the checks via make once, letting a formatter autofix some code (failing the hooks this time), then running make a second time, hopefully passing now, just to be sure there’s no remaining issue.

Scripts local to the repo can also be defined as checks, so we can enforce repo-specific consistency, and even export our checks for others. This is perfect for a team trying to set up validators across their organisation.

I highly recommend configuring CI servers to run the linters independently, to keep standards high even if devs forget to run the checks locally. That way, any merge request requires the linters to pass as well as the tests, and lint failures never creep into the codebase: zero tolerance for lint!

Editor support via LSP

I’m not preachy about editors, but I want to make sure everybody gets a consistent set of tooling, even if their editor choice is not very common (I’m an Emacs user after all).

So I recommend that people who may not know about it look up the Language Server Protocol (LSP), a technology that enables editor-independent code analysis; you only need to install an LSP client plugin for your favourite editor.

For Python, a few LSP servers exist; I recommend using python-lsp-server, and integrating your editor with it.

In particular, it has a plugin for ruff, called python-lsp-ruff, which exposes in code all the checks that ruff would do when you run it as a pre-commit hook, but interactively during your editing for faster feedback.

Per above, these are Python CLIs, so we recommend installing them via pipx:

pipx install python-lsp-server
# Install the ruff plugin inside the LSP's venv
pipx inject python-lsp-server python-lsp-ruff
Code Snippet 8: Installing the Python LSP Server

Note again that LSP is just a nice-to-have in editors; maybe your favourite tool already has good Python support, like PyCharm or VSCode, but just in case, LSP is a really good tool to level the playing field.

Pytest for testing

Pytest is simply a nice testing framework. Remember that, since pre-commit above manages its own virtualenvs, pytest is the first tool in this list that needs to be included in Poetry’s explicit dependencies. We thus define a dedicated Poetry dependency group for tests, in which we include pytest and a few others, as a couple more amazing plugins complement pytest well.
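In pyproject.toml, such a test group might look like this sketch (the version ranges are illustrative, not recommendations):

```toml
[tool.poetry.group.test.dependencies]
pytest = "^8.0"
pytest-cases = "^3.8"
testcontainers = "^4.0"
faker = "^24.0"
hypothesis = "^6.0"
```

Groups are installed by default with poetry install; pass --without test to skip them.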

Parametrized test cases

One such plugin is pytest-cases, which enables complex test fixtures, perfect for parametrizing tests that use complex structures like NumPy arrays. Since these parametrized test “dimensions” are defined as functions or classes, we can use function/class naming, and expansive docstrings, to describe exactly how a specific case differs from the rest, or why it’s useful to explore.

Integration tests using docker

For integration testing, I like to use the Python version of testcontainers, which enables anyone with a docker socket to run containerized services like postgres, spun up and down on the fly as test fixtures. This enables tests that validate features using a clean database, persisted (or not, configurable) across tests.

Fake data with Faker

The well-known faker library is also amazing for generating realistic-sounding data in an easy way, with good localization too. Perfect when shipping a library with new types that need believable sample data for effective testing.

Property-based testing

Finally, I’m keen to get the opportunity to use the hypothesis library soon, to have property-based tests checking the software’s invariants against generated data.

Sphinx for docs

For documentation, I like to define a separate poetry dependency group.

I’ve reviewed a lot of docs-as-code systems, and despite the innovations, I rely on good old Sphinx docs for documentation-generation, with a twist. Its various output formats like HTML and PDF via LaTeX are hard to beat, and the extensive plugin list makes up for its peculiarities.

In particular, though the original docs source format of Sphinx is a custom thing called reStructuredText (extension .rst), we replace it with myst-parser, a plugin that parses regular Markdown instead of RST. So we write Markdown in the docs, and still get lovely documentation.

I default to using the sphinx-rtd-theme package to get the classic “ReadTheDocs” look, which looks serious enough, if a little unoriginal. Use your own taste to choose a theme; pick one from the sphinx themes gallery!

I strongly believe that code documentation should contain an API Reference detailing packages and their contents. Sphinx does have an autodoc system, but I recommend sphinx-autodoc2 as a replacement, to enable the same Markdown parsing as myst-parser inside code docstrings.
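The relevant conf.py lines for this combination might look like the sketch below (the src/my_package path is a placeholder for your own package):

```python
# conf.py (excerpt): Markdown sources via MyST, plus a Markdown-rendered
# API reference generated by sphinx-autodoc2.
extensions = [
    "myst_parser",
    "autodoc2",
]

# Package(s) for which autodoc2 generates the API reference
autodoc2_packages = ["../src/my_package"]
autodoc2_render_plugin = "myst"  # render the generated pages as Markdown
```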

Like for pytest, there are extras that are useful only in some cases, but worth pointing out in general to show how accommodating our toolbox can be.

Requirements as code

Starting with sphinx-needs2, a requirements-as-code extension enabling the writing up of requirements, specifications, and the relationships between them as part of the docs.

Remember that we can mix in links to actual code (functions, classes, etc.) within our prose, so we can trace requirements from the source, to specifications, implementation, and even down to feature acceptance tests. When the alternative is tracking this in hundreds of pages of Word documents or spreadsheets, being able to manage the history in version control, alongside the implementing code, and get change requests as git diffs of actual code is a wonderful improvement.
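A minimal sketch of sphinx-needs directives (the IDs, titles, and linked function are made up):

```rst
.. req:: Users can authenticate
   :id: REQ_001

.. spec:: Login rate limiting
   :id: SPEC_001
   :links: REQ_001

   Implemented by :py:func:`myapp.auth.rate_limit`, covered by the
   login acceptance tests.
```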

Page generation from template

To generate templated docs pages, there’s a little plugin called sphinx-jinja, which exposes Sphinx’s Jinja template engine, with custom programmable data. I’ve used this in the past to render a Markdown page full of images stored in a folder, with section separators titled after the image filenames. We never think we need this sort of tool, but documentation often benefits from automatically including a summary of data/code in the repo, without having to manually update the docs when the source changes.

The Sphinx conf.py file is where we can run Python code to generate the context data, such as my example of listing images on the filesystem and generating a dict of data.
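A sketch of that pattern (the docs/images folder and the gallery context name are hypothetical):

```python
# conf.py (excerpt): list images on disk and expose them as context
# data for a ``.. jinja:: gallery`` block in the docs.
from pathlib import Path

image_files = sorted(Path("docs/images").glob("*.png"))

jinja_contexts = {
    "gallery": {"images": [p.name for p in image_files]},
}
```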

Diagrams as code

Finally, I love diagrams in docs, in particular diagrams-as-code. The tool that stole my heart a long time ago is PlantUML, a Java tool built on top of GraphViz. Note it recently shed its notoriously ugly default theme and is now quite pretty by default. So we have another Sphinx plugin, sphinxcontrib-plantuml, which integrates PlantUML with Sphinx, rendering diagrams from code to images.

Note though that a Java program is harder to package in a Python world, so sometimes I just render diagrams manually and commit the resulting image, not using this plugin. There’s always a docker image available to render these as one-offs. Remember to go for SVG if you can afford it, it makes diagrams nicer when zoomed in.

I’ll likely come back to this concept of X-as-code in other posts, as these successive movements overall have been very influential in my career, and keep on surprising me.

And ‘Make’ to shorten it all

As I’ve presented before, the Makefile is a convenient shorthand: a way to chain tasks together and override default values on the fly. The file is also handy to show/teach people what the common dev commands are.

So of course all the above tooling should be wrapped in a Makefile to run everything in a simple invocation:

# alias for:
make all
# which expands to:
make install lint test docs build install-hooks
Code Snippet 9: 90% of my work during heavy development

These commands break down to:

- make install: install all dependencies via poetry install
- make lint: run the pre-commit hooks (ruff, mypy, etc.) over the repo
- make test: run the test suite via pytest
- make docs: build the Sphinx documentation
- make build: build the package artifacts via poetry build
- make install-hooks: set up the pre-commit hooks in git3

The sum of all these tasks runs in under 10 seconds total, with the slowest and most variable part being the tests, depending on the project. It helps to just run unit tests there, and have a separate target for heavier testing.

I’ve skipped here a few less common targets like make docker-build-dev, but the idea is still to document and automate common dev tasks. See a sample Makefile from a project I’m working on, for reference of what to expect.

Note that the ideal use of make is to have CI run the exact same commands as local devs do, to the point where CI pipelines run exclusively the make commands listed above: make install, make lint, make test, make docs; a 100% match between devs’ workspaces and CI.

In projects that need configuration or secrets, I usually throw in .env file loading from the Makefile, to allow config to be set via file, overridden by a local .env file, and overridden again by any command-line invocation of make.

I’ve recently been experimenting with a make help command that reads Makefile comments marked with two hashes (##), as well as Makefile target names, which provides nice documentation for people who don’t want to read Makefiles!


There are a great many tools that are worth talking about but were left out, like the fascinating import-linter to enforce architecture via package-import contracts. I may revisit some of these in new posts as new patterns evolve, tools get replaced, etc.

This was an overview of the core tools I enjoy while working with modern Python. Ideally, this can serve as a guide for curious devs looking for a solid set of tools, and I hope this list proves that by using nice utilities, we can get a toolbox that makes Python reasonably safe to work in, while keeping its fun aspects.

As mentioned before, I am continually porting new concepts and ideas into my personal project template; have a look sometime to see what new ideas made it in!

  1. Aside from the obvious ones like Python, Java, Golang, Rust, etc., it also has a Docker (Dockerfile) system, and a docker-image one, which respectively allow clone+docker-build+docker-run, or docker run of an upstream image, so these checks can be pinned via containers too! Truly a wonderful tool, though others have noted that its auto-update policy is a little odd.

  2. While we’re talking about sphinx-needs, shout out to my ex-colleague and amazing developer Daniel Eades for his plugin sphinx-graph, which aims to reimplement the ideals of sphinx-needs from scratch, coming at it from an explicitly graph-based perspective. It may be early days, but I’m keeping my eye on that tool!

  3. Yes, it may surprise an unsuspecting user that running make will set up pre-commit hooks. It’s almost backdoor-like, but I found it very useful to ensure newcomers to the repo don’t have to think about these checks, so setting up the hooks for their next git commit is supremely useful, even if it does take over their environment. Of course, on some repos I choose not to run that command, as the “surprise” may be seen as adversarial.