Mypy Overview

This is an overview of some of the main new features of mypy. Familiarity with Python is assumed.

The basic idea of mypy is to add static typing to Python with fairly minimal language changes, while making both dynamic and static typing practical, elegant, efficient and interoperable. For type system features, mypy tends to use established, well-understood techniques, except in cases where the current state of the art is not sufficient. New syntax is mostly borrowed from the C family of languages, since many (perhaps most) Python programmers are familiar with Java, C, C++ or C#.

As this was written as an accessible tutorial and introduction for programmers, not only computer scientists, many technical details have been omitted for clarity.

Mypy and this overview are a work in progress. The details of the mypy language have not been finalised yet and are likely to change in the future. Some of these features might never see the light of day.

Function signatures

A function without a type signature is dynamically typed. You can declare the signature of a function using a syntax similar to Java and C/C++ (omit the def keyword). This makes the function statically typed (the type checker reports type errors within the function):

def greeting(name):         # Dynamically typed (similar to Python)
    return 'Hello, {}'.format(name)

str greeting2(str name):    # Statically typed
    return 'Hello, {}'.format(name)

A void return type indicates a function that always returns None. Using a void result in a statically typed context results in a type check error:

void p():
    print('hello')

a = p()   # Type check error: p has void return value
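
As a rough illustration, assuming the type signatures are simply dropped (as the Translation to Python section describes), the two examples above might correspond to the plain Python below. Nothing is checked statically in this form, so the assignment from p() runs and binds None.

```python
def greeting2(name):
    # The declared str types are gone; any argument is accepted at runtime.
    return 'Hello, {}'.format(name)

def p():
    print('hello')

a = p()            # No static checker involved, so this runs and binds None
print(a is None)   # True
```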

Built-in types

These are examples of some of the most common built-in types:

int            # integer objects of arbitrary size
float          # floating point number
bool           # boolean value
str            # unicode string
bytes          # 8-bit string
str[]          # list of str objects
dict<str, int> # dictionary from str to int
object         # the common superclass
any            # dynamically typed value

The type dict is a generic type, signified by type arguments within <...>. For example, dict<int, str> is a dictionary from integers to strings, and dict<any, any> is a dictionary of dynamically typed (arbitrary) values and keys. List is also a generic type, but list types use a more compact syntax. This is useful as list types are very common and are also often nested: list<list<list<float>>> would be ugly.

User-defined types

Each class is also a type. Any instance of a subclass is also compatible with all superclasses. All values are compatible with the object type (and also the any type).

class A:
    int f(self):        # Type of self inferred (A)
        return 2

class B(A):
    int f(self):
        return 3
    int g(self):
        return 4

A a = B()     # OK (explicit type for a; no type inference)
a = A()       # OK
a.f()         # 2 (a now refers to an A instance)
a.g()         # Type check error: A has no method g
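
The runtime side of this example can be sketched in plain Python, with the type declarations dropped. The static check on a.g() has no equivalent here, so the sketch only shows which method body runs and that an A instance lacks g.

```python
class A:
    def f(self):
        return 2

class B(A):
    def f(self):
        return 3

    def g(self):
        return 4

a = B()
print(a.f())              # 3: the override in B is used
print(isinstance(a, A))   # True: a B instance is compatible with A
a = A()
print(a.f())              # 2
print(hasattr(a, 'g'))    # False: A has no method g
```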

The any type

A value with the any type is dynamically typed. Any operations are permitted on the value, and the operations are checked at runtime, similar to Python. If you do not define a function return value type or argument types, these default to any. Also, a function defined using def is dynamically typed and each local variable implicitly has the type any.

Any is compatible with every other type, and vice versa. An implicit type check is inserted when assigning a value of type any to a variable with a more precise type:

any x = 2
int i = x      # OK
str s = x      # Runtime error: 2 not compatible with str

Runtime type checks are only inserted when using the mypy VM. When translating to Python, type declarations are treated as comments, and thus the above code would not generate a runtime error.
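
To make the idea of an inserted check concrete, here is a minimal Python sketch. The helper checked_assign is hypothetical and is not how the mypy VM implements the check; it only mimics the behaviour described above using isinstance.

```python
def checked_assign(value, expected_type):
    """Hypothetical helper: return value if it matches expected_type, else raise TypeError."""
    if not isinstance(value, expected_type):
        raise TypeError('{!r} not compatible with {}'.format(value, expected_type.__name__))
    return value

x = 2
i = checked_assign(x, int)        # OK
try:
    s = checked_assign(x, str)    # Fails: 2 is not a str
except TypeError as e:
    error = str(e)

print(error)   # 2 not compatible with str
```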

Tuple types

The type tuple<t, ...> represents a tuple with the item types t, ...:

tuple<int, str> t = 1, 'foo'

Since mypy supports type inference, the type declaration above is actually optional.

Type inference

The initial assignment defines a variable. If you do not explicitly specify the type of the variable, mypy infers the type based on the static type of the rvalue:

i = 1           # Infer type int
l = [1, 2]      # Infer type int[]

Type inference is also used for arbitrary expressions. Type inference takes context into account. For example, the following is valid:

object[] l = [1, 2]      # Infer type object[] for [1, 2]

Type inference is not used in dynamically typed functions (those defined using the def keyword); every local variable type defaults to any.

Explicit types for collections

The type checker cannot always infer the type of a list or a dictionary. This often arises when creating an empty list or dictionary and assigning it to a new variable without an explicit variable type. In these cases you can give the type explicitly using a < ... > type prefix before the literal:

l = <int> []       # Create int[] object
d = <str, int> {}  # Create dict<str, int> (from string to int)

The above example is equivalent to this code:

int[] l = []
dict<str, int> d = {}

You can also give an explicit type when creating an empty set and when calling other functions that construct collections, such as dict:

s = set<int>()
d = dict<str, int>()

Data members

Data members of a class not defined in the class body must be initialized in the __init__ method. This way the mypy runtime can detect an assignment to an undefined attribute and report a type error or raise an exception. Mypy uses type inference to determine the types of data members:

class A:
    void __init__(self, int x):
        self.x = x     # Attribute x of type int

any a = A(1)
a.x = 2       # OK
a.y = 3       # Runtime error

This is similar to each class having an implicitly defined __slots__ attribute. You can selectively define a class as dynamic; dynamic classes have Python-like semantics:

class A(dynamic):
    pass

a = A()
a.x = 2     # OK
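
The comparison with __slots__ can be tried directly in ordinary Python. This sketch shows a class with an explicit __slots__ next to a regular class (named D here, a name chosen only for illustration); it approximates the attribute behaviour described above and says nothing about attribute types.

```python
class A:
    __slots__ = ('x',)       # Only attribute x may be assigned

    def __init__(self, x):
        self.x = x

class D:                     # No __slots__: ordinary dynamic Python class
    pass

a = A(1)
a.x = 2                      # OK
try:
    a.y = 3                  # y is not listed in __slots__
except AttributeError:
    failed = True
else:
    failed = False

d = D()
d.x = 2                      # OK: attributes can be added freely
print(failed)                # True
```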

You can also declare data members explicitly, using a member definition (tentative syntax):

class A:
    member int[] x     # Declare attribute x of type int[]

a = A()
a.x = [1]     # OK

The __slots__ member is not sufficient for mypy since the compiler must also have access to the types of the slots, and maintaining types with __slots__ would look ugly.

It's still not obvious what to do when a variable is defined in the class body:

class A:
    x = 1      # Class variable or member variable?

Some Python code would use x above as a member variable (1 being the default value), but in other cases it is used as a class variable. Mypy will probably adopt a policy that is compatible with most existing Python code. Perhaps the above would define both a class variable and a member variable; the class variable gives the default value for the member variable when creating an instance. To define specifically a member or a class variable, you could use a member or class (static) prefix:

class A:
    x = 1         # Class variable acts as the default for the member variable
    member x = 1  # Only member variable
    static x = 1  # Only class variable

Overriding statically typed methods

You can override a statically typed method with a dynamically typed one. This allows dynamically typed code to override methods defined in library classes without worrying about their type signatures, similar to Python.

This may result in a runtime type error when calling the method, if the override does not return a value that is compatible with the original return type. This happens only if the method is called via an instance that has the base class as the static type:

class A:
    int inc(self, int x):
        return x + 1

class B(A):
    def inc(self, x):       # Override
        return 'x'

B b = B()
b.inc(1)    # OK
A a = b
a.inc(1)    # Runtime type error; expected int return but it was str
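
The following Python sketch imitates the check on the return value. The helper call_inc_as_A is hypothetical: it stands in for a call made through a variable whose static type is A, and it is an assumption about the observable behaviour, not a description of the mypy runtime.

```python
class A:
    def inc(self, x):
        return x + 1

class B(A):
    def inc(self, x):        # Dynamically typed override
        return 'x'

def call_inc_as_A(obj, x):
    """Hypothetical helper: call inc as if obj had static type A (int return expected)."""
    result = obj.inc(x)
    if not isinstance(result, int):
        raise TypeError('expected int return but it was {}'.format(type(result).__name__))
    return result

b = B()
print(b.inc(1))                  # x: no check when called directly on B
print(call_inc_as_A(A(), 1))     # 2
try:
    call_inc_as_A(b, 1)
except TypeError as e:
    error = str(e)
print(error)                     # expected int return but it was str
```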

Declaring multiple variable types on a line

You can declare more than a single variable at a time. In order to nicely work with multiple assignment, you must give each variable a type separately:

int n, str s                   # Declare an integer and a string
int i, bool found = 0, False

This differs somewhat from C and Java, perhaps because they don't have multiple assignment.

Dynamically typed code

As mentioned earlier, function bodies declared using the def keyword are dynamically typed (operations are checked at runtime). Code outside functions is statically typed by default, and types of variables are inferred. This usually does the right thing, but you can also make any variable dynamically typed by defining it explicitly with the type any:

s = 1      # Statically typed (type int)
any d = 1  # Dynamically typed (type any)
s = 'x'    # Type check error
d = 'x'    # OK

Additionally, there will be a per-file option to regard all code outside functions as dynamically typed by default. This mode makes it easy to run existing Python code that is not trivially compatible with static typing.

Interface types

Interface or protocol types are explicit in mypy. There are several built-in interface types (for example, Sequence, Iterable and Iterator). You can define interface types using the new 'interface' construct. In addition to abstract methods, interfaces can also contain default method implementations.

interface A:
    void foo(self, int x)
    str bar(self)

Unlike in Python, interfaces are likely to play a significant role in complex mypy programs. That's why mypy has a more convenient interface syntax than what Python uses for abstract methods and abstract base classes.
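
For comparison, this is roughly what the interface A above looks like with Python's standard abc module. The implementing class Impl and its total attribute are made up for the example.

```python
from abc import ABC, abstractmethod

class A(ABC):
    @abstractmethod
    def foo(self, x):
        ...

    @abstractmethod
    def bar(self):
        ...

class Impl(A):               # Hypothetical implementation of the interface
    def __init__(self):
        self.total = 0

    def foo(self, x):
        self.total += x

    def bar(self):
        return 'total={}'.format(self.total)

impl = Impl()
impl.foo(2)
print(impl.bar())            # total=2
```

Instantiating A directly raises TypeError in Python, since its abstract methods have no implementation.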

A class can inherit one arbitrary class and any number of interfaces. As with overrides, a dynamically typed method can implement a statically typed one defined in an interface.

There are also plans to support more Python-style "duck typing" in the type system. However, we need more experience with non-trivial mypy code in order to select the right approach.

Function overloading

You can define multiple instances of a function with the same name but different signatures. The first matching signature is selected at runtime when evaluating each individual call. This also enables a form of multiple dispatch.

int abs(int n):
    return n if n >= 0 else -n

float abs(float n):
    return n if n >= 0.0 else -n

abs(-2)     # 2 (int)
abs(-1.5)   # 1.5 (float)

Overloaded function variants still define a single runtime object; the following code is valid:

my_abs = abs
my_abs(-2)      # 2 (int)
my_abs(-1.5)    # 1.5 (float)

The overload variants must be adjacent in the code. This makes code clearer, and otherwise there would be awkward corner cases such as partially defined overloaded functions that could surprise the unwary programmer.
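
Plain Python has no direct counterpart to overloading, but functools.singledispatch from the standard library gives a rough feel for runtime selection by argument type. Note the differences: it dispatches on the first argument only and picks the most specific registered type rather than the first matching signature. The name my_abs is used here to avoid shadowing the built-in abs.

```python
from functools import singledispatch

@singledispatch
def my_abs(n):
    # Fallback for types with no registered variant
    raise TypeError('unsupported type: {}'.format(type(n).__name__))

@my_abs.register(int)
def _(n):
    return n if n >= 0 else -n

@my_abs.register(float)
def _(n):
    return n if n >= 0.0 else -n

f = my_abs           # Still a single runtime object
print(f(-2))         # 2
print(f(-1.5))       # 1.5
```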

Callable types and lambdas

You can pass around function objects and bound methods as you would in Python. The type of a function that accepts arguments a1, ..., an and returns rt is func<rt(a1, ..., an)>. Example:

int twice(int i, func<int(int)> next):
    return next(next(i))

int add(int i):
    return i + 1

print(twice(3, add))   # 5

Lambdas are also supported. The lambda argument types can often be inferred based on context:

l = map(lambda x: x + 1, [1, 2, 3])   # infer x as int and l as int[]
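
For comparison only, the twice example can be written with annotations from Python's standard typing module, where typing.Callable plays a role similar to func<...>. This notation is brought in for illustration and is not the syntax described in this overview; the parameter is named next_fn here to avoid shadowing the built-in next.

```python
from typing import Callable

def twice(i: int, next_fn: Callable[[int], int]) -> int:
    return next_fn(next_fn(i))

def add(i: int) -> int:
    return i + 1

print(twice(3, add))                # 5
print(twice(3, lambda x: x * 2))    # 12
```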

Casts

Mypy supports type casts, similar to Java and C#. Unlike Java, however, mypy keeps track at runtime of the type variables of generic objects such as lists and dicts and checks these for compatibility when performing a cast (mypy uses reified generics).

Example:

object o = [1]
l = (int[])o   # OK
ll = (str[])o  # Runtime type error

When using the Python back end, however, casts are erased and have no runtime effect (they can never fail). Supporting reified generics in Python would degrade performance and make the generated code difficult to read. Your programs should not rely on casts being checked at runtime. For example, you should only catch exceptions raised by failed casts to report errors, not for making arbitrary runtime decisions.
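
An erased cast can be illustrated with typing.cast from Python's standard library, which returns its argument unchanged and performs no runtime check. This is an analogy for the behaviour described above, not the code the mypy translator produces.

```python
from typing import List, cast

o: object = [1]
l = cast(List[int], o)     # Returns o unchanged
ll = cast(List[str], o)    # Also returns o unchanged: no runtime error
print(ll is o)             # True
print(ll)                  # [1]
```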

You don't need casts if the source or target type is any, as we explained earlier.

Notes about writing statically typed code

Statically typed function bodies are often identical to normal Python code, but sometimes you need to do things slightly differently. This section introduces some of the most common cases which require different conventions in statically typed code.

First, you need to specify the type when creating an empty list or dict, as mentioned earlier:

a = <int> []      # <int> required in statically typed code
a = []            # Fine in a dynamically typed function, or if a has type any

Sometimes you can avoid the explicit list item type by using a list comprehension:

# An explicit type annotation is required.
l = <int> []
for i in range(n):
    l.append(i * i)

# No type annotation needed.
l = [i * i for i in range(n)]

However, in more complex cases the explicit type annotation can improve the clarity of your code, whereas a complex list comprehension can make your code difficult to understand.

Second, each name within a function only has a single type. You can reuse for loop indices etc., but if you want to use a variable with multiple types within a single function, you may need to declare it with the 'any' type. The reason for disallowing this is clarity and simplicity: if a variable can have multiple types, it gets increasingly tedious to track the inferred type in different parts of a larger function. Be explicit: use different names for different variables. Even better, don't write large functions so you have less temptation to reuse local names.

void f():
    n = 1
    ...
    n = 'str'        # Type error: n has type int

Third, sometimes the inferred type is a subtype of the desired type. The type inference uses the first assignment to infer the type of a name:

# Assume Shape is the base class of both Circle and Triangle.
shape = Circle()    # Infer shape to be Circle
...
shape = Triangle()  # Type error: Triangle is not a Circle

You can just give an explicit type for the variable in cases such as the above example:

Shape shape = Circle()    # The variable shape can be any Shape, not just Circle
...
shape = Triangle()        # OK

Fourth, if you use isinstance tests or other kinds of runtime type tests, you may have to add casts (this is similar to instanceof tests in Java):

void f(object o):
    if isinstance(o, int):
        n = (int)o
        n += 1    # o += 1 would be an error
        ...

Note that the object type used in the above example is similar to Object in Java: it only supports operations defined for all objects, such as equality and isinstance(). The type any, in contrast, supports all operations, even if they may fail at runtime. The cast above would have been unnecessary if the type of o was any.

Some consider casual use of isinstance tests a sign of bad programming style. Often a method override or an overloaded function is a cleaner way of implementing functionality that depends on the runtime types of values. However, use whatever techniques work for you. Sometimes isinstance tests are the cleanest way of implementing a piece of functionality.

Type inference in mypy is designed to work well in common cases, to be predictable and to let the type checker give useful error messages. More powerful type inference strategies have complex and difficult-to-predict failure modes and would often result in very confusing error messages.

Defining generic classes

The built-in collection types are generic types. Generic types have one or more type parameters, which can be arbitrary types. For example, dict<int, str> has the type parameters int and str, and int[] has a type parameter int.

You can define your own generic classes, using a syntax similar (but not identical) to Java and C#. Here is a very simple generic class that represents a stack:

class Stack<t>:
    void __init__(self):
        self.items = <t> []   # Create an empty list with items of type t

    void push(self, t item):
        self.items.append(item)

    t pop(self):
        return self.items.pop()

    bool empty(self):
        return not self.items

The Stack class can be used to represent a stack of any type: Stack<int>, Stack<tuple<int, str>>, etc.

Using Stack is similar to built-in container types:

stack = Stack<int>()   # Construct an empty Stack<int> instance
stack.push(2)
stack.pop()
stack.push('x')   # Type error

Type inference works for user-defined generic types as well:

void process(Stack<int> stack): ...

process(Stack())   # Argument has inferred type Stack<int>
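
For comparison only, a similar generic Stack can be written with Generic and TypeVar from Python's standard typing module. This is not the syntax described in this overview, and in plain Python the type argument is not enforced at runtime.

```python
from typing import Generic, List, TypeVar

T = TypeVar('T')

class Stack(Generic[T]):
    def __init__(self) -> None:
        self.items: List[T] = []   # Empty list with items of type T

    def push(self, item: T) -> None:
        self.items.append(item)

    def pop(self) -> T:
        return self.items.pop()

    def empty(self) -> bool:
        return not self.items

stack = Stack[int]()
stack.push(2)
print(stack.pop())     # 2
print(stack.empty())   # True
stack.push('x')        # Not rejected at runtime; only a static checker would flag it
```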

Translation to Python

The current implementation translates a mypy program to Python before running it, and this will also be supported as an alternative back end in the future. The translation mostly preserves the original program structure. These are the main changes introduced during translation:

The generated Python code is stored in the directory __mycache__ under the program directory (a different directory is used if there is no write access). You can always consult the generated code to see what's going on.

Let's go back to the first code example in this document:

def greeting(name):         # Dynamically typed (similar to Python)
    return 'Hello, {}'.format(name)

str greeting2(str name):    # Statically typed
    return 'Hello, {}'.format(name)

It gets translated into the Python 3 code below. Note that comments are preserved and the only change is the removal of the type signature of greeting2:

def greeting(name):         # Dynamically typed (similar to Python)
    return 'Hello, {}'.format(name)

def greeting2(name):    # Statically typed
    return 'Hello, {}'.format(name)

The result is just ordinary Python code. The translation did not introduce any library dependencies. Some mypy features may generate some additional standard library imports or other minor additions to the code, but these only happen infrequently.

Supported Python features and modules

Lists of supported Python features and standard library modules are maintained in the mypy wiki:

Additional features

Several mypy features are not currently covered by this tutorial, including the following:

Planned features

This section introduces some language features that are still work in progress.

None

Currently, None is a valid value for each type, similar to null or NULL in many languages. However, it is likely that some types will not allow None values, including int, float and bool. The types int?, float? etc. would be corresponding types that allow None values.

int a
a = None      # Error
int? b
b = None      # OK

The decision to allow None values by default is at least somewhat controversial, and this might change in the future. Another option is to always require an explicit '?' to enable None values (and perhaps support it as an optional default at class level). Alternatively, explicit type variants that don't allow None values could be supported for all types.
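
As a point of comparison, Python's standard typing module spells a type that also admits None as Optional. The sketch below uses that notation, which is an assumption brought in for illustration and not the planned int? syntax; the function describe is hypothetical.

```python
from typing import Optional

def describe(b: Optional[int]) -> str:
    # The None case is handled explicitly before b is used as an int.
    if b is None:
        return 'no value'
    return 'value {}'.format(b + 1)

print(describe(None))   # no value
print(describe(2))      # value 3
```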

Runtime redefinition of methods and functions

By default, mypy will not let you redefine arbitrary methods at runtime. All non-dynamic classes behave like built-in or extension types in CPython in this respect. It will also be possible to explicitly declare individual methods as runtime modifiable.

In a similar fashion, functions defined in modules (including builtins) cannot be redefined at runtime by default. However, modules may override this for some (or all) functions.

These changes from Python semantics seem to be necessary for getting good performance in many important scenarios. For example, they make it possible to efficiently inline calls to built-in functions such as len() and ord().

When using the Python back end, ordinary Python semantics are supported, and you can add and redefine methods at runtime. The type checker only catches these in statically typed code, and you can get around even this by using setattr. However, you need to be careful if you decide to do this. You will have difficulty using static typing, since the type checker cannot see functions defined at runtime. Additionally, your code will require extra work to port to the native mypy back end.
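
The kind of runtime modification discussed here looks like this in ordinary Python. The class Greeter and its methods are invented for the example; a static type checker looking at the class body would not see the wave method at all.

```python
class Greeter:
    def greet(self):
        return 'hello'

def loud_greet(self):
    return 'HELLO'

g = Greeter()
before = g.greet()

setattr(Greeter, 'greet', loud_greet)          # Redefine an existing method at runtime
setattr(Greeter, 'wave', lambda self: 'o/')    # Add a new method at runtime

print(before)      # hello
print(g.greet())   # HELLO
print(g.wave())    # o/
```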

More general type inference

It may be useful to support type inference also for variables defined in multiple locations in an if/else statement, even if the initializer types are different:

if x:
    y = None     # First definition of y
else:
    y = 'a'      # Second definition of y

In the above example, both of the assignments would be used in type inference, and the type of y would be str. However, it is not obvious whether this would be generally desirable in more complex cases.

Revision history

List of major changes to this document: