Underscores in Python

This post discusses the use of the _ character in Python. Like with many things in Python, we’ll see that different usages of _ are mostly (not always!) a matter of convention.

Single Lone Underscore (_) #

This is typically used in 3 cases:

  1. In the interpreter: The _ name points to the result of the last executed statement in an interactive interpreter session. This was first done by the standard CPython interpreter, and others have followed too.

    >>> _
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name '_' is not defined
    >>> 42
    >>> _
    42
    >>> 'alright!' if _ else ':('
    'alright!'
    >>> _
    'alright!'
    
  2. As a name: This is somewhat related to the previous point. _ is used as a throw-away name. This will allow the next person reading your code to know that, by convention, a certain name is assigned but not intended to be used. For instance, you may not be interested in the actual value of a loop counter:

    n = 42
    for _ in range(n):
        do_something()
    
  3. i18n: One may also see _ being used as a function. In that case, it is often the name used for the function that does internationalisation and localisation string translation lookups. This seems to have originated from and follow the corresponding C convention.
    For instance, as seen in the Django documentation for translation, you may have:

    from django.utils.translation import ugettext as _
    from django.http import HttpResponse
    
    def my_view(request):
        output = _("Welcome to my site.")
        return HttpResponse(output)
    

The second and third purposes can conflict, so one should avoid using _ as a throw-away name in any code block that also uses it for i18n lookup and translation.

Single Underscore Before a Name (e.g. _shahriar) #

A single underscore before a name is used to specify that the name is to be treated as “private” by a programmer. It’s kind of* a convention so that the next person (or yourself) using your code knows that a name starting with _ is for internal use. As the Python documentation notes:

a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.

* I say kind of a convention because it actually does mean something to the interpreter; if you from <module/package> import *, none of the names that start with an _ will be imported unless the module’s/package’s __all__ list explicitly contains them. See “Importing * in Python” for more on this.

Double Underscore Before a Name (e.g. __shahriar) #

The use of double underscore (__) in front of a name (specifically a method name) is not a convention; it has a specific meaning to the interpreter. Python mangles these names and it is used to avoid name clashes with names defined by subclasses. As the Python documentation notes, any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped.

Take the following example:

>>> class A(object):
...     def _internal_use(self):
...         pass
...     def __method_name(self):
...         pass
... 
>>> dir(A())
['_A__method_name', ..., '_internal_use']

As expected, _internal_use doesn’t change but __method_name is mangled to _ClassName__method_name. Now, if you create a subclass of A, say B (argh, bad, bad names!) then you can’t easily override A‘s __method_name:

>>> class B(A):
...     def __method_name(self):
...         pass
... 
>>> dir(B())
['_A__method_name', '_B__method_name', ..., '_internal_use']

The intended behaviour here is almost equivalent to final methods in Java and normal (non-virtual) methods in C++.

Double Underscore Before and After a Name (e.g. __init__) #

These are special method names used by Python. As far as one’s concerned, this is just a convention, a way for the Python system to use names that won’t conflict with user-defined names. You then typically override these methods and define the desired behaviour for when Python calls them. For example, you often override the __init__ method when writing a class.

There is nothing to stop you from writing your own special-method-looking name (but, please don’t):

>>> class C(object):
...     def __mine__(self):
...         pass
...
>>> dir(C)
... [..., '__mine__', ...]

It’s easier to stay away from this type of naming and let only Python-defined special names follow this convention.


Discussion on hackernews and reddit.

 
2,292
Kudos
 
2,292
Kudos

Now read this

Understanding write-through, write-around and write-back caching (with Python)

This post explains the three basic cache writing policies: write-through, write-around and write-back. Although caching is not a language-dependent thing, we’ll use basic Python code as a way of making the explanation and logic clearer... Continue →