Final Appendix (Wiki, Student-Generated; credits at end in print version)

Formalizing Constraint Generation

Expression at Node	Generated Constraints
n, where n is a numeral	n = number
true	true=boolean
false	false=boolean
(add1 e)	(add1 e)=number e=number
(+ e1 e2)	(+ e1 e2)=number e1 = number e2 = number
(zero? e)	(zero? e)=boolean e = number
(ncons e1 e2)	(ncons e1 e2)=list(num) e1 = number e2 = list(num)
(nﬁrst e)	(nﬁrst e)=number e = list(num)
(nrest e)	(nrest e)=list(num) e = list(num)
(nempty? e)	(nempty? e)=boolean e = list(num)
nempty	nempty=list(num)
(if c t e)	(if c t e)=t (if c t e)=e c=boolean
(lambda (x) b)	(lambda (x) b)= x→b
(f a)	f = a→(f a)

Unification Algorithm

Begin with an empty substitution. Push all the constraints onto a stack. If the stack is empty, return the substitution; otherwise, pop the constraint X = Y off the stack:

If X and Y are identical identiﬁers, do nothing.
If X is an identiﬁer, replace all occurrences of X by Y both on the stack and in the substitution, and add X 7→ Y to the substitution.
If Y is an identiﬁer, replace all occurrences of Y by X both on the stack and in the substitution, and add Y 7→ X to the substitution.
If X is of the form C(X1,...,Xn) for some constructor C, 4 and Y is of the form C(Y1,...,Yn) (i.e., it has the same constructor), then push Xi = Yi for all 1 ≤ i ≤ n onto the stack.
Otherwise, X and Y do not unify. Report an error

Midterm 1 Material

Multiple Argument Function Conversion Example

(with (f (fun (x y z) (+ x y)))
  (f 1 2))

==>

 (with (f (fun (x) (fun (y) (fun (z) (+ x y)))))
   ((f 1) 2)

Definitions

Abstract syntax - idealized syntax, designed for convenient use by the underlying interpreter/compiler; result of parsing a concrete syntax expression
Concrete syntax - syntax written by the user; consumed by the parser
Closure - a wrapper that contains an expression (function/ binding) and its environment; expression saved until it needs to be used
CPS stands for Continuation-Passing Style
Scope: a section of the program code, either as it appears in text or during execution
Static Scope: a subtree of the abstract syntax tree; an entire expression in the code as it appears in the program text
Dynamic Scope: a subtree of the dynamic execution tree; all the code which is dynamically evaluated "underneath" (in the stack) a given expression
First-order Functions are not values in the language. They can only be defined in a designated portion of the program, where they must be given names for use in the remainder of the program.
Higher-order Functions can return other functions as values and take other functions as parameters.
First-class Functions are values with all the rights of other values. In particular, they can be supplied as the value of arguments to functions, returned by functions as answers, constructed as values by expressions, and stored in data structures.
Deferred Substitution Every time we encounter a substitution (in the form of a with or application), we augment the repository with one more entry, recording the identiﬁer's name and the value (if eager) or expression (if lazy) it should eventually be substituted with. We continue to evaluate without actually performing the substitution. To prevent dynamic scoping, use closures to store the environment in which the functions are created
Cache calculated substitution values are saved the first time they are calculated. Can be implemented by adding a mutable field to thunkVs to store the cached values.

Eager vs. Lazy Evaluation

Lazy With- we reduce the named expression to a value only when we need to
Lazy Functions- we do not reduce the argument to a value until necessary
Strictness Points- the points where the implementation of a lazy language forces an expression to reduce to a value (example: in a function application, need to evaluate exactly what function to invoke, can't just leave it as a function closure)
Eager- the named expression is reduced to a value before binding it in the environment.

Recursion implementation by patching the environment

Fixed-point of a function: a value that, when input to a function, is also output from the function.

interp the bound expression with a fake environment binding, and then patch that binding to point to the result of said interp afterwards using set! mutation. This creates a cyclic environment that addresses the needs of recursion. The cyclicity ensures that there is always "one more binding" for the recursive function when we need it.

(rec (bound-id named-exp body)
  (local ([define new-env (anEnv bound-id (numV 0) env)]
          [define named-value (interp named-exp new-env)])
    (begin
      (set-anEnv-val! new-env named-value)  ;; Patches false out for the correct value
      (interp body new-env))))

Racket syntax and General Examples

BNF

 <CFWAE> ::= <num>
   | {+ <CFWAE> <CFWAE>}
   | {- <CFWAE> <CFWAE>}
   | {* <CFWAE> <CFWAE>}
   | {/ <CFWAE> <CFWAE>}
   | <id>
   | {if0 <CFWAE> <CFWAE> <CFWAE>}
   | {with {<id> <CFWAE>} <CFWAE>}
   | {fun {<id> ...} <CFWAE>}
   | {<CFWAE> <CFWAE> ...}
   | {rec {<symbol> <CFWAE>} <CFWAE>}

Conditional chains

 (cond
     [(number? v) ...]
     [(boolean? v) ...]
     [(string? v) ...]
     [else false])

Map performs a procedure on all elements of the list in order

 (map (lambda (x) (* 2 x)) [1, 2, 3]) 
 outputs [2, 4, 6]

Pattern matching

 '...' or '___' means 0 or more

 '..n' or '__n" means n or more

 '_' matches anything

==>

 (match data
  [pattern  action]
     [(? number?) (foo)]              ;; check if data is number 
     [(list a b c) (foo a b c) ]      ;; back-reference of list elements
     [((and n (? number?))) (foo n)]  ;; check data and back-reference it as n
     [else (foo)])

Begin evaluate expressions in order. Results are ignored except for that of last expression, which is the evaluated value. In this case, the side-effect of printing "Hello World" occurs, however the expression overall evaluates to 4.

 (begin
   (+ 5 2) (printf "Hello World!\n") (+ 3 1))

Functions
(lambda (x) x) ; first-class value, not used here
(define f (lambda (x) x))
(define (f x) x)
Type-case is a conditional construct specific to a particular define-type

 (type-case WAE expr
   [num (n) expr]
   ...)

Racket code specific to implementation of our languages

Environment: - repository of deferred substitutions

(define-type Env
     [mtEnv]
     [anEnv (name symbol?)(value CFWAE-Value?)])

Expression Closures:

(define-type CFWAE-Value
     [numV (num number?)]
     [exprV (body CFWAE?)(env Env?)] ;; used to hold an unevaluated expression and its environment (used in lazy interp)
     [closureV (param symbol?)(body CFWAE?)(env Env?)]) ;; used to close over the existing environment to ensure static scope

Define-type (From assignment 1)

(define-type WAE
 [num (n number?)]
 [binop (op procedure?) (lhs WAE?) (rhs WAE?)]
 [with (b Binding?) (body WAE?)]
 [with* (lob (listof Binding?)) (body WAE?)]
 [id (name symbol?)])

Strict code (from lecture)

(define (strict val)
 (if (expV? val)
     (strict (interp (expV-exp val) (expV-env val)))
     val))

Midterm 2 Material

State

Entails mutation
State is "threaded through" a program in the sense that the state that results from evaluating a given function is the state in which the next sequential function is to be evaluated in.
State is inherently dynamic and so isn't supported by solely the environment, as it has static scope in our implementations.
- The environment protects static scope while the store tracks the dynamic changes
- The environment maps identifiers to either locations (or values if there is no store) while the store maps locations to values.
Closures do not remember state as function are to be applied using the state that exists where they are applied.

Mutability implementation data structures

To support mutability in a language, we need two data structures:

Environment: This is our regular substitution list required to support static scope.

Maps either:
- Identifiers to values: Makes sense when we are differentiating between mutable and immutable identifiers. If it is immutable, it will simply map to a regular value. If it is mutable, it will map to a boxV value with a store location stored within.
- Identifiers to store locations: Only store locations are stored within the environment. Thus, even for an immutable value, there is always a two-step lookup: the environment maps the identifier to a store location, and the store maps the store location to a value.
Identifiers in the environment have lexical scope: a particular environment belongs to a particular expression scope.

Store: This is where we keep track of dynamic variable values.

Maps location to a value (which can be a boxV pointing to another location itself).
The store is "threaded through" the evaluation across sub-expressions.
Mutation occurs when we change a value in the store.

"Whereas the interpreter employs the same environment for both arms of an addition, for instance, it cascades the store from one arm to the next and then back out alongside the resulting value."

Meta vs Syntactic Interpreters

Syntactic Interpreter - An interpreter that uses the interpreting language to represent only terms of the interpreted language, implementing all the corresponding behavior explicitly

doesn't use Scheme's implementation of things (like numbers)
doesn't matter how well the interpreting and interpreted languages correspond

Meta Interpreter - An interpreter that uses language features of the interpreting language to directly implement behavior of the interpreted language

easy to write when there is a strong match between interpreted and interpreting language
uses Scheme's implementation of things (closures, procedure applications, numbers, etc)

Meta-Circular Interpreter - A meta interpreter in which the interpreting and interpreted language are the same

Only uses Scheme implementation of things

Continuations

What are they?

stack is represented procedurally
remember only the result and what is left to do (passes the state along or back to caller)
in the presence of tail call optimization, shifts used space from the stack onto the heap (hard to tell whether a particular program requires more heap space or stack space)

Why?	Why Not?
Saves on memory and computation (ex. tail recursion summation vs augmenting recursion summation) A way of simulating continuations in a language that lacks native support (useful for web applications, for instance) Makes consumption of stack space explicit; much easier to tell if a function is tail recursive	get rid of machine optimizations by forcing the structure of stack into continuation memory wastage if no tail call optimization (so useless frames are retained on stack even though all calls are tail calls) need access to the source of entire program (in particular, the caller of every CPS function must be CPS'ed) security: what if someone outside knew how to read the stack procedure?

Continuation Passing Style (end in /k)

Main Ideas

every function takes an extra argument (its continuation)
every argument in a function call must be an identifier, a primitive (such as if), or a lambda expression (not a call to a non-primitive function).
every call is a tail call

Continuation Passing Style rules

(1) Convert each function f to f/k and take an extra continuation argument.

(2) Find the "next step to take".

(3) If that step is in tail position, great! If it's a value, apply the continuation to it. If it's a function call, pass the continuation to the function call.

(4) If that step is not in tail position, cut it from its current location, replace it with a placeholder value (it!), write it in tail position (at the front), and give it a lambda that takes that placeholder as a parameter and includes the whole rest of the function as its body. Then, repeat from (2) on the whole rest of the function!

Example from midterm 2
We have the following, and want to convert it to CPS (assuming (all-pos/k) and (range/k) are correctly implemented, and represent CPS versions of all-pos and range, respectively:

(define (test-all-pos)
 (if (all-pos (range -5 5))
     'fail
     'pass))

==>

(define (test-all-pos/k k)
 (range/k -5 5
  (lambda (range-result)
   (all-pos/k range-result
    (lambda (allpos-result)
     (k (if allpos-result
           'fail
           'pass))))))

Note that this assumes if is a primitive that need not be converted into CPS.

(define (map f l)
 (if (empty? l)
    empty
    (cons (f (first l)) 
             (map f (rest l)))))

==>

(define (map/k f/k list k)
 (if (empty? list)
     (k empty)
     (f/k (first list)
          (lambda (f-result)
            (map/k f/k (rest list)
                   (lambda (r-result)
                     (k (cons f-result r-result))))))))

Note that this assumes empty, first, rest, and cons are all primitives that do not need to be converted.

Variables

Call-by-value

Call-by-reference

evaluated argument is held in a new location,
- Therefore, changes to the content of that location in the store don't affect the actual parameter

pass a reference to the actual argument, not the value
updates to the reference within the called procedure will become visible to the calling context
cheaper to use (no additional allocation) but introduce problems
to implement:
- create a closure and give it the location of the actual argument
- uses l-value: env lookup without store lookup
- any mutations to the formal parameter are now changes to the same location as the actual parameter

Stateful	Stateless
server maintains state information easier to program don't need setup and breakdown of state at each interaction ex: FTP interp of each command is relative to history of past commands	does not retain record of prior communication Web application must completely restore state of the computation for each interaction server can handle higher loads server can ignore clients who don't appear to be active must transmit enough data to resume computation

Web Programs

Receiver - a procedure of one argument representing the pending computation

any computation not mentioned in the receiver never gets performed because of the program's termination after each iteration

Lifting - make nested procedures into top-level procedures

e.g. lambdas become defines

Make a program Web-Ready

1. Generate receivers that capture pending computations
2. Pass values to receivers instead of returning them

Implications

1. Order of evaluation
2. Transformation is global
- all procedures in program must consume an extra receiver
3. Sequentializes the program

Escaper Lambdas Example

From assignment 7 (midterm 2 review), we want to describe the continuation of the bolded expression (namely (* (- x 32) 5) ) below:

(define (fahrenheit-celcius-converter type x)
  (case type
    ['Fahrenheit (/ (* (- x 32) 5) 9)]
    ['Celcius    (+ 32 (/ (* x 9) 5))]))

  (+ (fahrenheit-celcius-converter 'Fahrenheit 32) 
     (fahrenheit-celcius-converter 'Fahrenheit 212))

We can use escaper lambdas to do this. The continuation of the bolded expression in the first evaluation of the function is:

(lambda^ (value) (+ (/ value 9) (fahrenheit-celcius-converter 'Fahrenheit 212)))

And the second continuation is: (lambda^ (value) (+ 0 (/ value 9)))

Post-Midterm 2 Material

Lambda Calculus

Consists of:

1-parameter functions
1-argument function applications
identifiers

Shrinking of a language is achieved by showing that a given component of the language may be replaced by a combination of other components making the component unnecessary. (ex. numbers are unnecessary as they can be represented by a function that represents zero and a successor function) By creating a simplified language it is easier to construct proofs showing that the language has desired properties. (ex. correctly implements tail-call optimization, never accesses an array out of bounds, ...)

Lists:

cons_ = (lambda (f) (lambda (r) (lambda (selector) ((selector f) r))))
head_ = (lambda (p) (p (lambda (f) (lambda (r) f))))
rest_ = (lambda (p) (p (lambda (f) (lambda (r) r))))

Booleans:

if_ = (c t e) ;;only works in lazy eval
if_eager = (c (lambda (d) t) (lambda (d) e)) ;;works in eager
false = (lambda (t f) f)
true = (lambda (t f) t)

e.g. (if false (/1 0) 2)

    => (false (/1 0) 2)
    => ((lambda (t f) f) (/1 0) 2) ;; gets applied to function
    => 2

Representing zero, succession and sum using lambda calculus:

zero_ = (lambda (f) (lambda (x) x)) 
one_ = (lambda (f) (lambda (x) (f x))) 
two_ = (lambda (f) (lambda (x) (f(f x)))) ;; the f's keep on increasing with the number
succ_ = (lambda (n) (lambda (f) (lambda (x) (f ((n f) x)))))   ;; apply the function f n times
sum_ = (lambda (m)  (lambda (n)  ((n succ) m)))
dec_ = (lambda (n) (first ((n (lambda (p) (cons (rest p) (succ (rest p))))) (cons zero zero)))))
sub_ = (lambda (x) (lambda (y) ((y dec_) x)))
mul_ = (lambda (x) (lambda (y) ((x (sum_ y)) zero)))

Eliminating Recursion

While reducing the language closer to the lambda calculus, we needed a fixed-point function to avoid the problem of having unbound errors in recursion definitions (eventually would get to some expression of the form (f f))

These are the lecture notes that show how we created the fixed-point function and used it in implementing factorial:

(define _fixmaker
  (lambda (fixmaker)
    (lambda (f)
      (f (lambda (x) (((fixmaker fixmaker) f) x))))))

;; Then, we'll give it itself:
(define _fix
  ((lambda (f) (f f))
   _fixmaker))

;; Now, we can use fix to define recursive functions:
(define _fac
  (_fix (lambda (fac)
          (lambda (n)
            (((_if (_zero? n))
              (lambda (IGNORE) _one))
             (lambda (IGNORE) ((_mul n) (fac (_pred n)))))))))

Here is the Y-combinator all of its lambadic glory:

(lambda (p)
  ((lambda (f)
     (f f))
   (lambda (f)
     (p (f f)))))

Type Checking

Advantages

More Advantages

Disadvantages

Safety: A type checker can guarantee certain errors won't happen in the program before we run it.
Readability: Type declarations give information about the code to a programmer, and is never outdated in a program that compiles (unlike comments). This in itself is a form of Documentation.

Exploit types to make programs run faster, use less space, etc

Efficiency: Type declarations document static properties at compile time, so we don't have to check them each time code is visited during run time.

Reduce time spent debugging.
Catch errors in code that is not executed by the programmer (good if test suite is weak)
Prohibits certain kinds of terrible coding styles

Loses Expressiveness: some things will never produce errors but can't be type-checked will be rejected.
More Work at Start: upfront work is necessary before runtime testing.

Type - any property that can be established without executing the program. Types record kind of value not the precise value

A type system is that part of a programming language (definition and implementation) that concerns itself with making sure that no operations are performed on inappropriate (types of) arguments.

A language is “Type Safe” (or just “Safe”) if it has a type system that ensures that no operations can be performed on inappropriate arguments without this being detected. The property of type safety can be achieved by static checking (=detecting before runtime), dynamic checking (=detecting at runtime) or a combination of both!

Type Judgements - Collection of rules. One type rule for every syntactic construct. At least one type rule applies to every sub-term. A type error occurs when we are unable to construct a type judgment tree

Relation between type judgements for functions and applications:
- Function declaration: assume arg has right type, guarantee that the body will have the promised type
- Function application: guarantee the argument has the right type for function, assume the result will have the type the function promises

Function	Application	Recursion
Г[i <- τ1] \|- b : τ2 Г \|- fun {i : τ1} : τ2 b} : (τ1 ->τ2	Г \|- e1: τ1 -> τ2, Г \|- e2: τ1 Г \|- {e1 e2}: τ2	Г[i <- τi] \|- b : τ, Г[i <- τi] \|- v : τi Г \|- {rec {i : τi v} b} : τ

Note: Read "Г |- exp: τ" as "Exp has type τ."

Type Procedures (^) (polymorphic function)

e.g. map : (for all) a, b. list(a) × (a -> b) -> list(b)

Soundness:

Def: A type system is called “type-sound” or simply “sound” iff for every program that passes the type checker and for every possible execution of an expression in that program, it is true that the (runtime) value of that expression is an element of the (static) type of the expression.
A program P typechecks without error --> No possible runs of P have "forbidden errors"
Typechecker says P has an error <-- Some possible runs of P have "forbidden errors"
Soundness = "The type system gives an F to all invalid programs"

;; this is type sound
(define (type-of exp)
  (error "I don't know how to type-check!"))

Metavariables - Not program variables. Stand in for program text

Typing Control

Cannot write an infinitely long type
Strongly Normalizing: No matter what program you write in a strongly normalized language, it will always terminate
Typing Recursion: extend the environment for body to initiate recursion, extend again for v to sustain it

Datatype Variant Tags -- With static type checker, only need to store variant since type is guaranteed by type checker. Better space conception

Type Inferencing

Constraints are generated by traversing the abstract syntax tree and determining what types expressions are required to have by the operations that act with them.

-> Label expressions & add all constraints for the type of that particular expression

The unification algorithm is applied on the generated constraints: If the constraints unify a type can be determined for the expression otherwise type inferencing fails on the expression.

Unification Algorithm

1. If X and Y are identical identifiers, do nothing.
2. If X is an identifier, replace all occurrences of X by Y both on the stack and in the substitution, and
   add X |-> Y to the substitution.
3. If Y is an identifier, replace all occurrences of Y by X both on the stack and in the substitution, and
   add Y |-> X to the substitution.
4. If X is of the form C(X1,...,Xn) for some constructor C, and Y is of the form C(Y1,...,Yn) (i.e., it
   has the same constructor), then push Xi=Yi for all 1 <= i <= n onto the stack.
5. Otherwise, X and Y do not unify. Report an error.

Occurs Check - Throw an error if the replacee is in the replacer (ie - recursion)

Principal Type - Imposes only enough constraints needed for type soundness, and no more

Explicit Polymorphism

(define length
  < Λ (τ)
   (lambda (l : list (τ)) : number
     (cond
       [(Empty? <τ> l) 0]
       [(Cons? <τ> l) (add1 (length <τ> (Rest <τ>  l)))]))>)

The expression (Rest <τ> l) first applies Rest to τ, resulting in an actual rest procedure that applies to lists of values of type τ
This procedure consumes l as an argument and proceeds as it would in the type system free case
Every type-parameterized procedure, such as Rest or length, is a generator of infinitely many procedures that each operate on specific types.
Type Elaborator : The phase that performs type applications.

Implicit Polymorphism

Want to get new identifiers for the ones bound by the closures but not the ones in its lexical scope (lexical scope variables are shared between all applications of the closure). Only get fresh type variable for types introduced by let or letrec

Г|- v : τ' Г[x <- CLOSE(τ', Г)]|- b : τ
Г |- (let ([xv])b):τ

to get fresh type variables:

Г|- e : CLOSE(τ', Г')
Г |- e : τ

where τ is the same as τ', except the renaming applies only to type variables in τ' that are not bound by Г