After a recent Christchurch Python meetup, I was asked to provide a list of Python libraries and tools that I tend to gravitate towards when working on Python projects. The idea is to provide a narrower set of options to guide newer Pythonistas and help them avoid getting bogged down by the sheer number of Python libraries and tools available these days.
This is my attempt at such a list. I've done a lot of Python in my career but that doesn't mean this list necessarily represents the best Python tool for each job. This list is highly subjective, and is coloured by my personal journey. Still, I hope it's useful as a starting point for some. I also don't think there are any big surprises here.
Many programming language communities are waking up to the fact that having a tool handle code formatting for you is a productivity booster and avoids pointless arguments within programming teams. Go can take much of the credit for the recent interest in code formatters with the gofmt tool that ships with the Go toolchain. Rust has rustfmt, JavaScript projects seem to prefer prettier and a formatter seems to be expected now for any new language.
A few formatters exist for Python but Black seems to be fast becoming the default choice, and rightly so. It makes sensible formatting decisions (the way it handles line length is particularly smart) and has few configuration options so everyone's code ends up looking the same across projects.
All Python projects should be using Black!
Python has a number of options for processing command line arguments. I prefer good old argparse which has been in the standard library since Python 3.2 (and 2.7). It has a logical API and more than enough capabilities for most programs.
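To give a feel for the API, here is a minimal sketch of an argparse-based command line. The `greet` program and its options are made up purely for illustration, and the argument list is passed in explicitly so the example runs without a real command line.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 'greet' tool (all names are illustrative)."""
    parser = argparse.ArgumentParser(
        prog="greet", description="Print a greeting a number of times."
    )
    parser.add_argument("name", help="who to greet")
    parser.add_argument(
        "-n", "--count", type=int, default=1, help="how many times to greet"
    )
    parser.add_argument("-s", "--shout", action="store_true", help="use upper case")
    return parser


def greetings(args):
    """Turn parsed arguments into the lines to print."""
    text = f"Hello, {args.name}!"
    if args.shout:
        text = text.upper()
    return [text] * args.count


# In a real script you would call parse_args() with no arguments so that
# it reads sys.argv; an explicit list is used here to keep the sketch
# self-contained.
args = build_parser().parse_args(["Ada", "--count", "2", "--shout"])
for line in greetings(args):
    print(line)
```

Help output (`-h`/`--help`), type conversion and error messages for bad input all come for free from the parser definition.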
The standard library has a perfectly fine xUnit style testing package in the form of unittest but pytest is more efficient - and dare I say, fun - to use. I really like the detailed output when tests fail and the fixtures mechanism, which provides a more powerful and clearer way of reusing common test functionality than the classic setup and teardown methods.
The plugin mechanism is great too. We have a custom test plugin at The Cacophony Project which includes recent API server logs in the output when tests fail.
So much data ends up being available in CSV or similarly formatted files and I've done my fair share of extracting data out of them or producing CSV files for consumption by other software. The csv package in the standard library is a well designed and flexible workhorse that deserves more praise.
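As a rough sketch of typical usage, the following reads some CSV data into dictionaries and writes a subset of the columns back out. The sample data is invented, and `io.StringIO` stands in for real files so the example is self-contained.

```python
import csv
import io

# Made-up sample data. With a real file you would use
# open(path, newline="") instead of io.StringIO.
raw = io.StringIO(
    "name,city,visits\n"
    "Ana,Christchurch,3\n"
    'Bo,"Wellington, NZ",5\n'
)

# DictReader uses the header row as the keys for each record and
# copes with quoted fields that contain commas.
rows = list(csv.DictReader(raw))

# Values always come back as strings, so convert where needed.
total_visits = sum(int(row["visits"]) for row in rows)

# Write only some of the columns back out; extrasaction="ignore"
# tells DictWriter to skip keys that are not in fieldnames.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "visits"], extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)

print(total_visits)
print(out.getvalue())
```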
Dates and times
The datetime package from the standard library is excellent and ends up getting used in almost every Python program I work on. It provides convenient ways to represent and manipulate timestamps, time intervals and time zones. I frequently pop open a Python shell just to do some quick ad hoc date calculations.
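The kind of quick calculation I mean looks something like this. It is only a sketch: the dates are arbitrary and the fixed-offset "NZST" zone is defined by hand for illustration.

```python
from datetime import date, datetime, timedelta, timezone

# Arbitrary example values, chosen only for illustration.
start = datetime(2019, 3, 1, 9, 30, tzinfo=timezone.utc)
end = datetime(2019, 3, 4, 17, 0, tzinfo=timezone.utc)

# Subtracting two datetimes gives a timedelta.
elapsed = end - start
hours = elapsed.total_seconds() / 3600

# Adding a timedelta to a datetime gives a new datetime.
deadline = start + timedelta(weeks=2)

# Fixed-offset time zones are supported out of the box.
nzst = timezone(timedelta(hours=12), "NZST")
local_start = start.astimezone(nzst)

# Plain dates work the same way.
days_left = (date(2019, 12, 25) - date(2019, 3, 1)).days

print(elapsed)
print(hours)
print(deadline.isoformat())
print(local_start.isoformat())
print(days_left)
```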
datetime intentionally doesn't try to get too involved with the vagaries of time zones. If you need to represent timestamps in specific time zones or convert between them, the pytz package is your friend.
There are times when you need to do more complicated things with timestamps and that's where dateutil comes in. It supports generic date parsing, complex recurrence rules and relative delta calculations (e.g. "what is next Monday?"). It also has a complete timezone database built in so you don't need pytz if you're using dateutil.
Shell scripts are great for what they are but there are also real benefits to using a more rigorous programming language for the tasks that shell scripts are typically used for, especially once a script gets beyond a certain size or complexity. One way forward is to use Python for its expressiveness and cleanliness and the plumbum package to provide the shell-like ease of running and chaining commands together that Python lacks on its own.
Here's a somewhat contrived example showing plumbum's command chaining capabilities combined with some Python to extract the first 5 lines:
    from plumbum.cmd import find, grep, sort

    # Build a shell-style pipeline, then run it by calling it.
    output = (find['-name', '*.py'] | grep['-v', 'python2.7'] | sort)()
    for line in output.splitlines()[:5]:
        print(line)
In case you're wondering, the name is Latin for lead, which is what pipes used to be made from (and also why we have plumbers).
attrs offers Python class creation with a lot less boilerplate. It turns up all over the place and with good reason - you end up with classes that require fewer lines to define and that are correct in terms of Python's magic comparison methods.
Here's a quick example of some of the things that attrs gives you:
    >>> import attr
    >>> @attr.s
    ... class Point:
    ...     x = attr.ib(default=0)
    ...     y = attr.ib(default=0)
    ...
    >>> p0 = Point()      # using default values
    >>> p1 = Point(0, 0)  # specifying attribute values
    >>> # equality implemented by comparing attributes
    >>> p0 == p1
    True
    >>> p2 = Point(3, 4)
    >>> p0 == p2
    False
    >>> repr(p2)          # nice repr values
    'Point(x=3, y=4)'
There's a lot more to attrs that I haven't covered here. Most of the default behaviour is customisable and there are many extra features.
It's worth noting that data classes in Python 3.7 and later offer some of the features of attrs so you could use those if you want to stick to the standard library. attrs offers a richer feature set though.
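For comparison, here is a rough standard library equivalent of the Point class using data classes. The type annotations are my own addition; data classes need them to identify fields.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Data classes discover fields through type annotations.
    x: int = 0
    y: int = 0


p0 = Point()      # using default values
p1 = Point(0, 0)  # specifying attribute values
p2 = Point(3, 4)

# Equality is generated for you and compares the fields.
print(p0 == p1)
print(p0 == p2)

# A readable repr is generated as well.
print(repr(p2))
```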
If you're making HTTP 1.0/1.1 requests with Python then you should almost certainly be using requests. It can do everything you need and then some, and has a lovely API.
As far as HTTP 2.0 goes, it seems that requests 3 will have that covered, but it's a work in progress at time of writing.
Effective use of virtual environments is crucial for a happy Python development experience. After trying out a few approaches for managing virtualenvs, I've settled on pew as my tool of choice. It feels clean and fits the way I work.
So that's what's in my Python toolbox. What's in yours? I'd love to hear your thoughts in the comments.