I’m a Windows Person. How can I contribute to Python? [Part I]

If I might essay a generalisation: Windows-based users of Python are less likely to have become involved in the development of Python itself. This might be for the same reasons that other people don’t get involved: a lack of time, of inclination, or of confidence in their ability to help. But it might be because Windows users on the whole aren’t so accustomed to delving into the source code of the things they use, nor are they used to checking out from source control, testing fixes and rebuilding. It’s not so much part of the culture.

I’m keen to see more Windows users & developers active in core CPython development. Python has been supported on Windows for a long time by the core development team but it can sometimes feel a little bit like a poor cousin. The majority of Python users (and of key Python developers) are Unix-based: traditionally Linux & BSD, these days often OS X. Even if they’re not unsympathetic to the needs of Windows users, they may not be be in a position to help. But Python is a volunteer effort and if Windows-savvy developers aren’t available, Windows issues will remain unfixed and untested and better support for Windows-specific issues will not be forthcoming.

There are practicalities involved: Subversion or Mercurial, Visual Studio, being able to build Python from scratch and so on, but let’s start from the other end: you’re well-intentioned and motivated, but what should you do? I’m focusing on Windows-specific issues, but some of what I say here applies to anyone who’s interested in helping Python’s core development.

How should I start?

You don’t have to be an über-hacker or even a developer at all to contribute to Python. (Although you can do more if you are). You can start by submitting bug reports or requests for improvements to the issue tracker:

  • Good: Submit a bug report: send an email to bugs@python.org detailing what version of Python you’re using and on which version of Windows, what you did and what went wrong. Or what you think should happen. If you’re able to re-test with a newer version of Python that will help, in case the bug’s already been identified and fixed. If it’s a documentation issue you’re describing, be clear what alternative or additional text you’re proposing.
  • Better: Submit a bug report or feature request and a failing test case: send an email as above, but attach the simplest possible self-contained script which will cause the failure you’re describing or which will illustrate what should happen.
  • Best: Submit a bug report with a failing test and add a patch which you believe will solve the issue plus a patch to the relevant Python test module plus a doc patch if applicable. If this is a documentation issue, the ideal is a patch against the reStructuredText docs; at the least, a clear plain text version of what you’re proposing.

Things to bear in mind:

  • Don’t be disappointed if no-one takes any action. Ideally – and often – someone quickly notices the new issue, does a little categorisation on it and assigns it to a developer who’s known to be the owner of the area where the problem occurred. At the least, one or more developers may be added to the nosy list, and one of them may take ownership. Even if that doesn’t happen, you’ve definitely done the right thing by reporting the issue and offering a patch. If it’s Windows-related, feel free to add me to the nosy list. But even if a developer is assigned, developers are volunteers with their own availability, pressures and interests.
  • Don’t be surprised if the issue isn’t quite as straightforward as it appears at first blush, particularly if backwards compatibility is at stake. There may be some history of which you’re unaware, or some ramifications in other areas of the codebase which aren’t obvious.
  • Don’t give up in despair if the issue is rejected. This isn’t a hard-hearted manoeuvre by the core developers to reject all outside help. It just means that things aren’t always as simple as they might appear and some bigger-picture issue might have put the kibosh on your bright idea. If you feel that a valid point has genuinely been misunderstood you’re free to ask that the issue be reopened; but please don’t do this frivolously.
  • Not all versions of Python are under active development. At the time of writing, the 2.x series is at the end of its life with 2.7 the final release. Clear bugs may be fixed in that branch, but little else. It’s unlikely that anything categorised as a new feature will make it in there. The same goes for 3.1. The latest development branch is version 3.2 where new features will be targetted. Any earlier releases will either see no changes whatsoever or only critical security fixes.

I’m a Windows developer, but where are the problems?

Being a core Python developer on Windows can mean several things: doing any kind of Python core development when Windows is your primary platform; working on Windows-specific modules, such as msvcrt or winreg, or the Windows-specific aspects of other modules, such as os and subprocess; or assessing and maintaining platform-specific subtleties such as compiler-sensitive datatype sizes or line-ending issues.

A lot depends on your own inclinations, motivations and areas of expertise. If you’re an experienced Visual Studio user you might focus more on some of the compiler-specific issues or build file configurations. If you have experience of sysadmin-type work you might want to look at the os module which wraps some of the Win32 API in posix-like calls or the winreg module which exposes the registry functions. If you’re a general purpose C coder who just uses Windows as a platform then you could look at the core Python code or the parts of the stdlib written in C. If you’re an MSI guru, look at some of the outstanding issues in Python’s own MSI and the bdist_msi distutils subsystem.

Ultimately involvement probably comes from one of three things:

  • Scratching your own itch: is there something in the docs which you’ve never really understood because it relies on knowledge of Unix-specifics? Or just because it’s confusing? Have you been tripped up by a stdlib function which assumes that Windows paths always act like Unix paths? Have you tried to use os.access only to discover that it’s essentially useless on Windows? Do you think you could reimplement the signal module using the Win32 API? Or sponsor the merge of a 3rd-party Windows port of the curses module?
  • Scratching someone else’s itch: have a look at bugs.python.org and choose an issue which is unresolved. The issue tracker has recently had new quick-searches added to it down the left and you can easily build your own searches. This search, for example, looks for unclosed issues which include Windows in the component list. It can be a useful exercise if you simply confirm that an unclear issue still obtains on current versions of Python and, possibly, on recent versions of Windows. Or if you produce a failing test case for a merely textual or vague report. (Something which can take quite some time). But be wary of producing “me-too” noise as people following that particular issue will receive notification emails which don’t add usefully to the known state of affairs.
  • Finding an itch to scratch: if you’re keen to contribute but haven’t experienced any problems which need fixing, and haven’t found any existing bugs which take your fancy, then why not just browse the source code seeing how things are done and looking for ways to improve. Be careful here since cosmetic, trivial or frivolous changes are generally frowned upon. But, for example, the subprocess and multiprocessing modules both involve some reasonably serious Windows-specific plumbing. The module maintainers aren’t necessarily Windows experts and even if they are, the more eyes the better. There are other modules which have Windows-specific elements if only in small ways.

Where should I be helping?

  • The issue tracker: bugs.python.org hosts the Python issue tracker, an instance of roundup which keeps track of all bug reports and feature requests. It’s broadly followed by core developers, but a small team keeps track of incoming bugs and tries to categorise them appropriately and to assign them to a known developer if possible, or to add to the issue’s nosy list developers known to be interested in that area of the system. By helping out in this way, you get a fairly good idea of the issues being reported and the work being done on them. And on which core developers have expertise in which areas of the system. It’s also relatively easy to get higher privileges on the tracker than he default logged-in user has so that you can help classify and manage open issues.
  • Documentation: The documentation is maintained as reStructuredText files within Python’s version control repository and is published using Sphinx. It’s very easy to change and patch and there’s a batch file to rebuild it on Windows although you may need to tweak it a little. Doc patches are welcome, whether for small fry such as typos or slight clarifications, or for weightier rewrite.
  • stdlib Python: The Python-only parts of the standard library. A lot of the stdlib is written in Python itself; that’s what makes the whole package so portable. Even if you’re not a C developer, you can still debug and test issues in the stdlib itself. It’s especially possible to do this without a C compiler or even a Subversion checkout: you only need to put an altered version of the module in a test directory from which you run the Python interpreter and the interpreter will see the local version first.
  • stdlib Python & C: Some of the standard library relies partly or wholly on extension modules, written in C against Python’s C-API. The C is reasonably readable and is often concerned only with moving Python objects to and from some system library, which makes it a fairly restricted subset of C. To do anything useful with this, you will need a C compiler and, ideally, a version-controlled checkout.
  • Core Python: The core of CPython is written in C: the objects, the interpreter, the underlying data structures, key built-in modules. You’ll need to be a fairly good C programmer to work here, not least because the impact is high if a mistake is made.
  • Infrastructure: CPython needs a certain amount of scaffolding to build, especially on Windows where the standard autoconf/make tools don’t apply. This scaffolding tends to be under-supported, not least because it’s not very public-facing and doesn’t solve most people’s problems. But it still needs to be done. This includes Visual Studio solution and proj files, msi build mechanisms, buildbot scripts and documentation make files.

This is Part I; Part II should include information about checkout out the source, building it, creating & applying patches and working within the development community.