Dead Batteries Included

Recharging the Python standard library

It’s unfortunate that the official About Python page still describes Python’s standard library as having “batteries included.” Sure, some of those old standbys will keep your project going and going, but many of them are leaking acid all over the place. Guido van Rossum, head developer of Python, has said “the stdlib offerings … are not very convenient and may not support popular idioms very well.” Five years ago, I assumed the Python library contained the “best of breed” for all packages. These days, I tend to think the opposite.

To counteract this minor flaw, I keep a small “personal standard library”: a pip requirements file listing all the packages I use in every project. A simple script automatically installs that file whenever I create a virtualenv for a new project. With the pip download cache enabled, this is a near-painless process.
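
The bootstrap step described above could look roughly like the following. This is a minimal sketch under assumptions, not the actual script: it uses the standard library venv module in place of virtualenv, and the function names and the requirements file path are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def pip_install_command(env_dir, requirements):
    """Build the command that installs a requirements file into env_dir."""
    # The environment's interpreter lives in Scripts on Windows, bin elsewhere.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    python = Path(env_dir) / bin_dir / "python"
    return [str(python), "-m", "pip", "install", "-r", str(requirements)]


def bootstrap(env_dir, requirements):
    """Create a virtual environment, then install the personal standard library."""
    venv.create(env_dir, with_pip=True)
    subprocess.check_call(pip_install_command(env_dir, requirements))
```

Calling bootstrap("env", "personal-stdlib.txt") would create the environment and install every package listed in the (hypothetical) personal-stdlib.txt file.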

My favorite third-party standby is the relatively unknown module. It provides a beautiful object-oriented interface to file manipulation operations. This example illustrates several of its nifty features:

Notice the division operator overloading and how makedirs returns the path object so it can be used in further path construction. This syntax is much more readable than the older os.path module in the standard library. I like to have all my projects depend on it from the start so that I don’t have to make a development-time decision as to whether to add a dependency. Third-party dependencies are very easy to include if you’re working with setuptools.
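
The two ideas above, an overloaded division operator and a makedirs that returns the path, can be sketched with a toy class. This is a minimal illustration under assumptions, not the module's actual implementation: SketchPath and the file names are hypothetical, and it targets Python 3, where the operator hook is __truediv__.

```python
import os
import tempfile


class SketchPath(str):
    """Hypothetical str subclass illustrating the idea; not a real library class."""

    def __truediv__(self, other):
        # path / "child" joins components the way os.path.join does.
        return SketchPath(os.path.join(self, other))

    def makedirs(self):
        # Create the directory tree, then return self so calls can chain.
        os.makedirs(self, exist_ok=True)
        return self


base = SketchPath(tempfile.mkdtemp())

# Overloaded style: build, create, and keep building in one expression.
config = (base / "project" / "settings").makedirs() / "app.cfg"

# Equivalent os.path style for comparison.
settings_dir = os.path.join(base, "project", "settings")
old_style = os.path.join(settings_dir, "app.cfg")
```

Both styles produce the same path string; the difference is purely in how the construction reads.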

Speaking of setuptools, it’s another required module in my personal standard library. Last summer’s merging of the distribute / setuptools fork makes it the obvious best way to do package management in Python. In contrast to its history, the project is now well-documented and easy to use.

The famous requests module is probably the most obvious package to be included in any web developer’s personal standard library. Its author, Kenneth Reitz, prefers to keep it as a third-party module, boldly stating that the Python standard library is where Python modules go to die.

Another key dependency is the pytz library. The standard datetime library has support for timezones, but accidentally encourages use of so-called naive datetimes, which do not include timezone information. Naive datetimes are a heinous contrivance, yet it is impossible, using just the standard library, to work effectively with timezones other than UTC.
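
The difference between naive and aware datetimes can be seen with the standard library alone. This is a small sketch, assuming Python 3 and using only the built-in UTC timezone; the variable names and the date are arbitrary.

```python
from datetime import datetime, timezone

# A naive datetime carries no timezone information at all.
naive = datetime(2014, 3, 9, 12, 0)

# An aware datetime is pinned to a specific timezone, here UTC.
aware = datetime(2014, 3, 9, 12, 0, tzinfo=timezone.utc)

# Ordering a naive value against an aware one is refused outright,
# because Python cannot know which instant the naive value refers to.
try:
    naive < aware
    mixed_comparison_allowed = True
except TypeError:
    mixed_comparison_allowed = False
```

The refused comparison is the safe outcome; the subtler risk with naive datetimes is that two of them compare happily even when they were recorded in different timezones.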

There is, however, a very good reason that pytz is not included in the standard library. Worldwide timezone information changes on a regular basis. Considering the slower release cycle of Python itself, it is important to keep pytz as a third-party library so it can react quickly to new decisions about daylight saving time or changing timezone boundaries. Regardless, it is vital to include this package in any project that manipulates dates.

And then there’s testing and documentation. Python ships the robust unittest module with its standard distribution, but I prefer to use the much more agile py.test library. The Python language has built-in support for inline documentation, but without the incredible sphinx documentation engine, the feature is seriously crippled. Even the reference Python interpreter and debugger have superhero third-party versions, and I use ipython and ipdb everywhere I work.
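
The stylistic difference between the two testing libraries shows up even in a trivial test. The sketch below is illustrative only: slugify is a hypothetical function under test, and the second test is written the way py.test would discover and run it, as a plain function with a bare assert.

```python
import unittest


def slugify(title):
    """Hypothetical function under test: lowercase and hyphenate a title."""
    return "-".join(title.lower().split())


# unittest style: a TestCase subclass and assertion methods.
class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(
            slugify("Dead Batteries Included"), "dead-batteries-included"
        )


# py.test style: a plain function and a bare assert statement.
def test_spaces_become_hyphens():
    assert slugify("Dead Batteries Included") == "dead-batteries-included"
```

Both tests check the same behavior; the second simply needs less ceremony.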

The good news is that the Python developers are aware of the shortcomings of the Python Standard Library and are actively discussing them. They composed PEP 411 to address the dead and dying batteries issue. The proposal introduces a provisional stage that allows packages to go into the standard library without enforcing the hard guarantees of backwards compatibility and API stability that previously made standard library inclusion undesirable. Further, Python has had a solid deprecation policy for many years; even longer than it’s had a style guideline!

A robust standard library allows new developers to get up and running quickly without having to understand the intricacies of packaging. However, Python packages are easily able to specify, download, and install their own dependencies, so there is little call for packages to be included in the standard library. Therefore, in production systems, Python programmers should never restrict themselves to the standard library and should be open, even eager, to depend on third-party packages that provide the APIs and functionality they need.


  • noisybit

    Thank you a lot for these insights! Your post will save me hours of wondering, suffering and searching for solutions!

  • mikemike

    It’s common to overload bitwise operators, because they’re hardly used (at least for enterprise software and web apps), but overloading the division operator seems like a bad idea…

    • h6o6

      I agree, mostly from a readability standpoint. If you weren’t familiar with the division overload, it is a bit startling. Great article though, esp. on pytz.

  • Christopher Mahan

    Pip doesn’t work so well behind the company firewall. I end up downloading the gs and running install

  • cm

    I’ve recently read very good things about the arrow module as a replacement for the various time related modules in Python (datetime, time, dateutil, etc). Arrow’s web page also lists pytz as one that it obviates, so maybe arrow kills several important birds with one importable stone. I haven’t checked it out myself yet, though.

    • Pierre Villeneuve

      I played with Arrow last night and I think it’s great. Some parts are a little unclear, but overall it’s nice to have all time and date stuff in one nice package.

  • Not Richard Shea

    I’d be interested to hear more details on what you consider the benefits over the standard library to be. Thanks.

  • Paddy3118

    “I always assumed the Python library contained the ‘best of breed’ for all packages.”
    I see you found your problem and have proposed a solution that works for you. That’s great, Dusty.

    As Guido states in that link of yours, the stdlib can only change when there is a release of Python (or even less often than that for continuity). The standard library did, and probably still does, set a high standard for libraries that are part of a language’s distribution. The community has spent time improving access to third-party libraries, which leads to your ability to install even better libraries built on the shoulders of the standard lib.