Type system concepts

Static, dynamic, and gradual typing

A statically typed programming language runs a type checker before running a program. The program is required to be well typed according to the language’s type system. The type system assigns a type to all expressions in the language and verifies that their uses obey the typing rules. Normally, a program that is not well typed (i.e., one that contains a type error) will not run. Java and C++ are examples of statically typed object-oriented languages.

A dynamically typed programming language does not run a type checker before running a program. Instead, it checks the types of values before performing operations on them at runtime. This is not to say that the language is “untyped”. Values at runtime have a type and their uses obey typing rules. Not every operation will be checked, but certain primitive operations in the language such as attribute access or arithmetic are. Python is a dynamically typed language.

Gradual typing is a way to combine static and dynamic typing. Type-annotated Python allows opting in to static type checking at a fine level of granularity, so that some type errors can be caught statically, without running the program. Variables, parameters, and returns can optionally be given static type annotations. Even within the type of a single data structure, static type checking is optional and granular. For example, a dictionary can be annotated to enable static checking of the key type but only have dynamic runtime checking of the value type.

A gradual type system is one in which a special “unknown” or “dynamic” type is used to describe names or expressions whose types are not known statically. In Python, this type is spelled Any. Because Any indicates a statically unknown type, the static type checker can’t check type correctness of operations on expressions typed as Any. These operations are still dynamically checked, via the Python runtime’s usual dynamic checking.

The Python type system also uses ... within Callable types and within tuple[Any, ...] (see Tuples) to indicate a statically unknown component of a type. The detailed rules for these usages are discussed in their respective sections of the specification. Collectively, along with Any, these are gradual forms.

This specification describes a gradual type system for Python.

Fully static and gradual types

We will refer to types that do not contain a gradual form as a sub-part as fully static types.

A gradual type can be a fully static type, Any itself, or a type that contains a gradual form as a sub-part. All Python types are gradual types; fully static types are a subset.

Fully static types

A fully static type denotes a set of potential runtime values. For instance, the fully static type object is the set of all Python objects. The fully static type bool is the set of values { True, False }. The fully static type str is the set of all Python strings; more precisely, the set of all Python objects whose runtime type (__class__ attribute) is either str or a class that inherits directly or indirectly from str. A Protocol denotes the set of all objects which share a certain set of attributes and/or methods.

If an object v is a member of the set of objects denoted by a fully static type T, we can say that v is a “member of” the type T, or v “inhabits” T.

Gradual types

Any represents an unknown static type. It denotes some unknown set of runtime values.

This may appear similar to the fully static type object, which represents the set of all Python objects, but it is quite different.

If an expression has the type object, a static type checker should ensure that operations on the expression are valid for all Python objects, or else emit a static type error. This allows very few operations! For example, if x is typed as object, x.foo should be a static type error because not all Python objects have an attribute foo.

An expression typed as Any, on the other hand, should be assumed to have _some_ specific static type, but _which_ static type is not known. A static type checker should not emit static type errors on an expression or statement if Any might represent a static type which would avoid the error. (This is defined more precisely below, in terms of materialization and assignability.)

Similarly, a type such as tuple[int, Any] (see Tuples) or int | Any (see Union types) does not represent a single set of Python objects; rather, it represents a (bounded) range of possible sets of values.

In the same way that Any does not represent “the set of all Python objects” but rather “an unknown set of objects”, tuple[int, Any] does not represent “the set of all length-two tuples whose first element is an integer”. That is a fully static type, spelled tuple[int, object]. By contrast, tuple[int, Any] represents some unknown set of tuple values; it might be the set of all tuples of two integers, or the set of all tuples of an integer and a string, or some other set of tuple values.

In practice, this difference is seen (for example) in the fact that we can assign an expression of type tuple[int, Any] to a target typed as tuple[int, int], whereas assigning tuple[int, object] to tuple[int, int] is a static type error. (Again, we formalize this distinction in the below definitions of materialization and assignability.)

In the same way that the fully static type object is the upper bound for the possible sets of values represented by Any, the fully static type tuple[int, object] is the upper bound for the possible sets of values represented by tuple[int, Any].

The gradual guarantee

Any allows gradually adding static types to a dynamically typed program. In a fully dynamically typed program, a static checker assigns the type Any to all expressions, and should emit no errors. Inferring static types or adding type annotations to the program (making the program more statically typed) may result in static type errors, if the program is not correct or if the static types aren’t able to fully represent the runtime types. Removing type annotations (making the program more dynamic) should not result in additional static type errors. This is often referred to as the gradual guarantee.

In Python’s type system, we don’t take the gradual guarantee as a strict requirement, but it’s a useful guideline.

Subtype, supertype, and type equivalence

A fully static type B is a subtype of another fully static type A if and only if the set of values represented by B is a subset of the set of values represented by A. Because the subset relation on sets is transitive and reflexive, the subtype relation is also transitive (if C is a subtype of B and B is a subtype of A, then C is a subtype of A) and reflexive (A is always a subtype of A).

The supertype relation is the inverse of subtype: A is a supertype of B if and only if B is a subtype of A; or equivalently, if and only if the set of values represented by A is a superset of the values represented by B. The supertype relation is also transitive and reflexive.

We also define an equivalence relation on fully static types: the types A and B are equivalent (or “the same type”) if and only if A is a subtype of B and B is a subtype of A. This means that the set of values represented by A is both a superset and a subset of the values represented by B, meaning A and B must represent the same set of values.

We may describe a type B as “narrower” than a type A (or as a “proper subtype” of A) if B is a subtype of A and B is not equivalent to A. In the same scenario we can describe the type A as “wider” than B, or a “proper supertype” of B.

Nominal and structural types

For a type such as str (or any other class), which describes the set of values whose __class__ is str or a direct or indirect subclass of it, subtyping corresponds directly to subclassing. A subclass MyStr of str is a subtype of str, because MyStr represents a subset of the values represented by str. Such types can be called “nominal types” and this is “nominal subtyping.”

Other types (e.g. Protocols and TypedDict) instead describe a set of values by the types of their attributes and methods, or the types of their dictionary keys and values. These are called “structural types”. A structural type may be a subtype of another type without any inheritance or subclassing relationship, simply because it meets all the requirements of the supertype, and perhaps adds more, thus representing a subset of the possible values of the supertype. This is “structural subtyping”.

Although the means of specifying the set of values represented by the types differs, the fundamental concepts are the same for both nominal and structural types: a type represents a set of possible values and a subtype represents a subset of those values.

Materialization

Since Any represents an unknown static type, it does not represent any known single set of values (it represents an unknown set of values). Thus it is not in the domain of the subtype, supertype, or equivalence relations on static types described above.

To relate gradual types more generally, we define a materialization relation. Materialization transforms a “more dynamic” type to a “more static” type. Given a gradual type A, if we replace zero or more occurrences of Any in A with some type (which can be different for each occurrence of Any), the resulting gradual type B is a materialization of A. (We can also materialize a Callable type by replacing ... with any type signature, and materialize tuple[Any, ...] by replacing it with a determinate-length tuple type.)

For instance, tuple[int, str] (a fully static type) and tuple[Any, str] (a gradual type) are both materializations of tuple[Any, Any]. tuple[int, str] is also a materialization of tuple[Any, str].

If B is a materialization of A, we can say that B is a “more static” type than A, and A is a “more dynamic” type than B.

The materialization relation is both transitive and reflexive, so it defines a preorder on gradual types.

Consistency

We define a consistency relation on gradual types, based on materialization.

A fully static type A is consistent with another fully static type B if and only if they are the same type (A is equivalent to B).

A gradual type A is consistent with a gradual type B, and B is consistent with A, if and only if there exists some fully static type C which is a materialization of both A and B.

Any is consistent with every type, and every type is consistent with Any. (This follows from the definitions of materialization and consistency but is worth stating explicitly.)

The consistency relation is not transitive. tuple[int, int] is consistent with tuple[Any, int], and tuple[Any, int] is consistent with tuple[str, int], but tuple[int, int] is not consistent with tuple[str, int].

The consistency relation is symmetric. If A is consistent with B, B is also consistent with A. It is also reflexive: A is always consistent with A.

The assignable-to (or consistent subtyping) relation

Given the materialization relation and the subtyping relation, we can define the consistent subtype relation over all types. A type B is a consistent subtype of a type A if there exists a materialization A' of A and a materialization B' of B, where A' and B' are both fully static types, and B' is a subtype of A'.

Consistent subtyping defines “assignability” for Python. An expression can be assigned to a variable (including passed as an argument or returned from a function) if its type is a consistent subtype of the variable’s type annotation (respectively, parameter’s type annotation or return type annotation).

We can say that a type B is “assignable to” a type A if B is a consistent subtype of A. In this case we can also say that A is “assignable from” B.

In the remainder of this specification, we will usually prefer the term assignable to over “consistent subtype of”. The two are synonymous, but “assignable to” is shorter, and may communicate a clearer intuition to many readers.

For example, Any is assignable to int, because int is a materialization of Any, and int is a subtype of int. The same materialization also shows that int is assignable to Any.

The assignable-to relation is not generally symmetric, however. If B is a subtype of A, then tuple[Any, B] is assignable to tuple[int, A], because tuple[Any, B] can materialize to tuple[int, B], which is a subtype of tuple[int, A]. But tuple[int, A] is not assignable to tuple[Any, B].

For a gradual structural type, consistency and assignability are also structural. For example, the structural type “all objects with an attribute x of type Any” is consistent with (and assignable to) the structural type “all objects with an attribute x of type int”.

Summary of type relations

The subtype, supertype, and equivalence relations establish a partial order on fully static types. The analogous relations on gradual types (via materialization) are “assignable-to” (or “consistent subtype”), “assignable-from” (or “consistent supertype”), and “consistent with”. We can visualize this analogy in the following table:

Fully static types

Gradual types

B is a subtype of A

B is assignable to (or a consistent subtype of) A

A is a supertype of B

A is assignable from (or a consistent supertype of) B

B is equivalent to A

B is consistent with A

We can also define an equivalence relation on gradual types: the gradual types A and B are equivalent (that is, the same gradual type, not merely consistent with one another) if and only if all materializations of A are also materializations of B, and all materializations of B are also materializations of A.

Attributes and methods

In Python, we can do more with objects at runtime than just assign them to names, pass them to functions, or return them from functions. We can also get/set attributes and call methods.

In the Python data model, the operations that can be performed on a value all desugar to method calls. For example, a + b is (roughly, eliding some details) syntactic sugar for either type(a).__add__(a, b) or type(b).__radd__(b, a).

For a static type checker, accessing a.foo is a type error unless all possible objects in the set represented by the type of a have the foo attribute. (We consider an implementation of __getattr__ to be a getter for all attribute names, and similarly for __setattr__ and __delattr__. There are more complexities; a full specification of attribute access belongs in its own chapter.)

If all objects in the set represented by the fully static type A have a foo attribute, we can say that the type A has the foo attribute.

If the type A of a in a.foo is a gradual type, it may not represent a single set of objects. In this case, a.foo is a type error if and only if there does not exist any materialization of A which has the foo attribute.

Equivalently, a.foo is a type error unless the type of a is assignable to a type that has the foo attribute.

Union types

Since accepting a small, limited set of expected types for a single argument is common, the type system supports union types, created with the | operator. Example:

def handle_employees(e: Employee | Sequence[Employee]) -> None:
    if isinstance(e, Employee):
        e = [e]
    ...

A fully static union type T1 | T2, where T1 and T2 are fully static types, represents the set of values formed by the union of the sets of values represented by T1 and T2, respectively. Thus, by the definition of the supertype relation, the union T1 | T2 is a supertype of both T1 and T2, and T1 and T2 are both subtypes of T1 | T2.

A gradual union type S1 | S2, where S1 and S2 are gradual types, represents all possible sets of values that could be formed by union of the possible sets of values represented by materializations of S1 and S2, respectively.

For any materialization of S1 to T1 and S2 to T2, S1 | S2 can likewise be materialized to T1 | T2. Thus, the gradual types S1 and S2 are both assignable to the gradual union type S1 | S2.

If B is a subtype of A, B | A is equivalent to A.

This rule applies only to subtypes, not assignable-to. The union T | Any is not reducible to a simpler form. It represents an unknown static type with lower bound T. That is, it represents an unknown set of objects which may be as large as object, or as small as T, but no smaller.

Equivalent gradual types can, however, be simplified from unions; e.g. list[Any] | list[Any] is equivalent to list[Any]. Similarly, the union Any | Any can be simplified to Any: the union of two unknown sets of objects is an unknown set of objects.

Union with None

One common case of union types are optional types, which are unions with None. Example:

def handle_employee(e: Employee | None) -> None: ...

Either the type Employee or the type of None are assignable to the union Employee | None.

A past version of this specification allowed type checkers to assume an optional type when the default value is None, as in this code:

def handle_employee(e: Employee = None): ...

This would have been treated as equivalent to:

def handle_employee(e: Employee | None = None) -> None: ...

This is no longer the recommended behavior. Type checkers should move towards requiring the optional type to be made explicit.

Support for singleton types in unions

A singleton instance is frequently used to mark some special condition, in particular in situations where None is also a valid value for a variable. Example:

_empty = object()

def func(x=_empty):
    if x is _empty:  # default argument value
        return 0
    elif x is None:  # argument was provided and it's None
        return 1
    else:
        return x * 2

To allow precise typing in such situations, the user should use a union type in conjunction with the enum.Enum class provided by the standard library, so that type errors can be caught statically:

from enum import Enum

class Empty(Enum):
    token = 0
_empty = Empty.token

def func(x: int | None | Empty = _empty) -> int:

    boom = x * 42  # This fails type check

    if x is _empty:
        return 0
    elif x is None:
        return 1
    else:  # At this point typechecker knows that x can only have type int
        return x * 2

Since the subclasses of Enum cannot be further subclassed, the type of variable x can be statically inferred in all branches of the above example. The same approach is applicable if more than one singleton object is needed: one can use an enumeration that has more than one value:

class Reason(Enum):
    timeout = 1
    error = 2

def process(response: str | Reason = '') -> str:
    if response is Reason.timeout:
        return 'TIMEOUT'
    elif response is Reason.error:
        return 'ERROR'
    else:
        # response can be only str, all other possible values exhausted
        return 'PROCESSED: ' + response

References

The concepts presented here are derived from the research literature in gradual typing. See e.g.: