Polymorphism & Type Inference ============================= Intro ----- `Slides `_ **Quizzes** cse116-nanotype-ind -> D1 cse116-typed-ind -> B cse116-subst-ind -> B cse116-unify-ind -> C, D, E cse116-infer-ind -> E Type System ----------- A type system defines what types an expression can have To define a type system, we need to define: - the syntax of types: what do types look like? - the static semantics of our language (i.e. the typing rules): assign types to expressions Syntax of Types ^^^^^^^^^^^^^^^ .. code-block:: haskell T ::= Int -- integers | T1 -> T2 -- function types Now, we define a typing relation ``e :: T`` ("e has type T"), inductively thru typing rules: .. code-block:: haskell [T-Num] n :: Int e1 :: Int e2 :: Int -- premises [T-Add] ---------------------- e1 + e2 :: Int -- conclusions [T-Var] x :: ??? Type Environment ^^^^^^^^^^^^^^^^ An expression has a type in a given type environment (or context), which maps all its free variables to their types: .. code-block:: haskell G = x1:T1, x2:T2, ..., xn:Tn -- now, our typing relation should include G: G |- e :: T -- e has type T in G Typing Rules ^^^^^^^^^^^^ An expression e has type T if we can derive ``G |- e :: T`` using these rules An expression e is well-typed in G if we can derive ``G |- e :: T`` for some type T .. code-block:: haskell -- typing rules using G [T-Num] G |- n :: Int G |- e1 :: Int G |- e2 :: Int [T-Add] -------------------------------- G |- e1 + e2 :: Int [T-Var] G |- x :: T if x:T in G G,x:T1 |- e :: T2 [T-Abs] ------------------------ G |- \x -> e :: T1 -> T2 G |- e1 :: T1 -> T2 G |- e2 :: T1 [T-App] ------------------------------------ -- modus ponens! G |- e1 e2 :: T2 G |- e1 :: T1 G,x:T1 |- e2 :: T2 [T-Let] ----------------------------------- G |- let x = e1 in e2 :: T2 .. note:: examples: .. code-block:: haskell -- 1 [] |- (\x -> x) 2 :: Int [T-Var] ------------------- [x:Int] |- x :: Int [T-Abs] ------------------- -------------- [T-Num] [] |- \x -> x :: Int -> Int [] |- 2 :: Int [T-App] ----------------------------------------------- [] |- (\x -> x) 2 :: Int -- 2 [] |- let x = 1 in x + 2 :: Int [T-Var] ----------------- -----------------[T-Num] x:Int |- x :: Int x:Int |- 2 :: Int [T-Num] -------------- ------------------------------------[T-Add] [] |- 1 :: Int x:Int |- x + 2 :: Int [T-Let] ----------------------------------- [] |- let x = 1 in x + 2 :: Int ``[] |- (\x -> x x) :: T`` is underivable, because T has to be equal to ``T -> T`` According to these rules, an expression can have zero, one, or many types. e.g. ``1 2`` has no types, ``1`` has 1 type, ``\x -> x`` has many types. One problem with this system: there's no generics. Polymorphic Types ----------------- We can formalize a type ``a -> a`` as a polymorphic type: ``forall a . a -> a`` - where ``a`` is a bound type variable - also called a type scheme - haskell has polymorphic types, but forall isn't usually required We can instantiate this scheme into different types by replacing ``a`` in the body with some type, e.g. instantiating with ``Int`` yields ``Int -> Int``. .. note:: Similar to lambda expression at type level With polymorphic types, we can derive ``e :: Int -> Int`` where ``e`` is .. code-block:: haskell let id = \x -> x in let y = id 5 in id (\z -> z + y) Inference works as follows: 1. When we have to pick a type T for x, we pick a fresh type variable a 2. So the type of ``\x -> x`` comes out as ``a -> a`` 3. We can generalize this type to ``forall a . a -> a`` 4. When we apply id the first time, we instantiate this polymorphic type with ``Int`` 5. When we apply id the second time, we instantiate this polymorphic type with ``Int ->Int`` Type System 3 ^^^^^^^^^^^^^ Types: .. code-block:: haskell -- Mono-types T ::= Int | T1 -> T2 | a -- type variables -- Poly-types S ::= T -- mono | forall a . S -- polymorphic -- where a ∈ TVar, T ∈ Type, S ∈ Poly Type Environment """""""""""""""" The type environment now maps variables to poly-types: ``G : Var -> Poly`` - example, ``G = [z: Int, id: forall a . a -> a]`` Type Substitutions """""""""""""""""" We need a mechanism for replacing all type variables in a type with another type: A type substitution is a finite map from type variables to types: ``U : TVar -> Type`` - example: ``U1 = [a / Int, b / (c -> c)]`` To apply a substitution U to a type T means replace all type vars in T with whatever they are mapped to in U - example 1: ``U1 (a -> a) = Int -> Int`` - example 2: ``U1 Int = Int`` Typing Rules """""""""""" We need to change the typing rules so that: .. code-block:: haskell -- 1. variables and their definitions can have polymorphic types [T-Var] G |- x :: S if x:S in G G |- e1 :: S G, x:S |- e2 :: T [T-Let] ------------------------------------ G |- let x = e1 in e2 :: T -- 2. we can instantiate a type scheme into a type G |- e :: forall a . S [T-Inst] ---------------------- G |- e :: [a / T] S -- 3. we can generalize a type with free type variables into a type scheme G |- e :: S [T-Gen] ---------------------- if not (a in FTV(G)) -- FTV = Free Type Variables G |- e :: forall a . S -- the rest of the rules are the same: [T-Num] G |- n :: Int G |- e1 :: Int G |- e2 :: Int [T-Add] -------------------------------- G |- e1 + e2 :: Int G,x:T1 |- e :: T2 [T-Abs] ------------------------ G |- \x -> e :: T1 -> T2 G |- e1 :: T1 -> T2 G |- e2 :: T1 [T-App] ------------------------------------ -- modus ponens! G |- e1 e2 :: T2 Examples """""""" .. code-block:: haskell -- derive: [] |- \x -> x :: forall a . a -> a [T-Var] --------------- [x:a] |- x :: a [T-Abs] ----------------------- [] |- \x -> x :: a -> a [T-Gen] ---------------------------------- not (a in FTV([])) [] |- \x -> x :: forall a . a -> a -- derive: [x:a] |- x :: forall a . a -- not derivable, since a is not in FTV([x:a]) -- derive: G1 |- id 5 :: Int where G1 = [id : (forall a . a -> a)] [T-Var] ----------------------------- G1 |- id :: forall a . a -> a [T-Inst]---------------------- -------------- [T-Num] G1 |- id :: Int -> Int G1 |- 5 :: Int [T-App] --------------------------------------------- G1 |- id 5 :: Int -- see slides page 12 for example 3 Representing Types ^^^^^^^^^^^^^^^^^^ The eventual goal is to create a function ``infer``, which: - given a context G and an expression e, - returns a type T s.t. ``G |- e :: T`` - or reports a type error .. code-block:: haskell data Type = TInt -- int | Type :=> Type -- T1 -> T2 | Var String -- a, b, c data Poly = Mono Type | Forall TVar Poly type TVar = String type TEnv = [(Id, Poly)] -- type environment type Subst = [(String, Type)] -- type sub **Main idea**: let's implement infer like this: 1. Depending on the kind of expression, find the typing rule that applies to it 2. If the rule has premises, recursively call ``infer`` to obtain the types of subexpressions 3. Combine the types of subexpressions according to the conclusion of the rule 4. If no rule applies, report a type error .. code-block:: haskell -- | This is not the final version!!! infer :: TypeEnv -> Expr -> Type infer _ (ENum _) = TInt infer tEnv (EVar var) = lookup var tEnv infer tEnv (EAdd e1 e2) = if t1 == TInt && t2 == TInt then return TInt else throw "type error: + expects Int operands" where t1 = infer tEnv e1 t2 = infer tEnv e2 The problem is, some of our typing rules are nondeterministic (see slides pg. 13) 1. guessing type .. code-block:: haskell infer tEnv (ELam x e) = tX :=> tBody where tEnv' = extendTEnv x tX tEnv tX = ??? -- ?????? tBody = infer tEnv' e 2. guessing when to generalize solution: 1. whenever we need to guess a type, don't. just return a fresh type variable 2. whenever a rule imposes a constraint on a type, try to find the right substitution for the free type vars to satisfy the constraint (unification) Unification ^^^^^^^^^^^ The unification problem: given two types T1 and T2, find a type substitution U s.t. ``U T1 = U T2``. Such a substitution is called a unifier of T1 and T2. e.g.: 1. The unifier of ``a`` and ``Int`` is ``[a/Int]`` 2. ``a -> a`` and ``Int -> Int`` is ``[a/Int]`` 3. ``a -> Int`` and ``Int -> b`` is ``[a/Int, b/Int]`` 4. ``Int`` and ``Int`` is ``[]`` 5. ``a`` and ``a`` is ``[]`` 6. ``Int`` and ``Int -> Int`` is invalid 7. ``Int`` and ``a -> a`` is invalid 8. ``a`` and ``a -> a`` is invalid 9. ``b`` and ``a -> a`` is ``[b/a -> a]`` Infer 2 """"""" To add constraint-based typing, we need to keep track of the current substitution: .. code-block:: haskell -- | Now has to keep track of current substitution! infer :: Subst -> TypeEnv -> Expr -> (Subst, Type) infer sub _ (ENum _) = (sub, TInt) infer sub tEnv (EVar var) = (sub, lookup var tEnv) -- Lambda case: simply generate fresh type variable! infer sub tEnv (ELam x e) = (sub1, tX' :=> tBody) where tEnv' = extendTEnv x tX tEnv tX = freshTV -- we'll get to this (sub1, tBody) = infer sub tEnv' e tX' = apply sub1 tX -- Add case: recursively infer types of operands -- and enforce constraint that they are both Int infer sub tEnv (EAdd e1 e2) = (sub4, TInt) where (sub1, t1) = infer sub tEnv e1 -- 1. infer type of e1 sub2 = unify sub1 t1 Int -- 2. constraint: t1 is Int tEnv' = apply sub2 tEnv -- 3. apply subst to context (sets in scope) (sub3, t2) = infer sub2 tEnv' e2 -- 4. infer e2 type in new ctx sub4 = unify sub3 t2 Int -- 5. constraint: t2 is Int .. note:: **Fresh Type Variables** How do you create a new fresh type variable every time? You'll have to pass an argument along. Polymorphism ^^^^^^^^^^^^ When do we generalize a type like ``a -> a`` to ``forall a . a -> a``? When do we instantiate a polymorphic type and to what? **Generalization and Instantiation** - Whenever we infer a type for a let-defined variable, generalize it - It's safe, even when not necessary - Whenever we see a variable with polymorphic type, instantiate it with a fresh type variable