Course:CPSC311/2010WT1/Final Exam Appendix
Do not remove: This reference sheet is the appendix for the CPSC 311 2010W1 Final Exam. Only the first 10 printed pages are guaranteed to be printed; so be compact! Deadline for edits: 0900 Monday 20 December.
Final Appendix (Wiki, Student-Generated; credits at end in print version)
Midterm 1 Material
Multiple Argument Function Conversion Example
(with (f (fun (x y z) (+ x y))) (f 1 2)) |
==> |
(with (f (fun (x) (fun (y) (fun (z) (+ x y))))) ((f 1) 2) |
Definitions
Abstract syntax - idealized syntax, designed for convenient use by the underlying interpreter/compiler; result of parsing a concrete syntax expression
Concrete syntax - syntax written by the user; consumed by the parser
Closure - a wrapper that contains an expression (function/ binding) and its environment; expression saved until it needs to be used
CPS stands for Continuation-Passing Style (a.k.a. Tail Call Elimination)
Static Scope: In a language with static scope, the scope of an identifier's binding is a syntactically delimited region. A typical region would be the body of a function or other binding construct.
Dynamic Scope: The scope of an identifier's binding is the entire remainder of the execution during which that binding is in effect.
First-order Functions are not values in the language. They can only be defined in a designated portion of the program, where they must be given names for use in the remainder of the program.
Higher-order Functions can return other functions as values and take other functions as parameters.
First-class Functions are values with all the rights of other values. In particular, they can be supplied as the
value of arguments to functions, returned by functions as answers, constructed as values by expressions, and stored in data structures.
Deferred Substitution Every time we encounter a substitution (in the form of a with or application), we augment the
repository with one more entry, recording the identifier's name and the value (if eager) or expression (if
lazy) it should eventually be substituted with. We continue to evaluate without actually performing the
substitution. To prevent dynamic scoping, use closures to store the environment in which the functions are created
Cache calculated substitution values are saved the first time they are calculated. By including a 3rd field, that is either false or contains the calculated value
Eager vs. Lazy Evaluation
Lazy With- we reduce the named expression to a value only when we need to
Lazy Functions- we do not reduce the argument to a value until necessary
Strictness Points- the points where the implementation of a lazy language forces an expression to reduce to a value (example: in a function application, need to evaluate exactly what function to invoke, can't just leave it as a function closure)
Eager- the named expression is reduced to a value before being substituting it.
Recursion implementation by patching the environment
Fixed-point of a function: a value that, when input to a function, is also output from the function.
interp the bound expression with a fake environment binding, and then patch that binding to point to the result of said interp afterwards using set! mutation. This creates a cyclic environment that addresses the needs of recursion. The cyclicity ensures that there is always "one more binding" for the recursive function when we need it.
(rec (bound-id named-exp body) (local ([define new-env (anEnv bound-id (numV 0) env)] [define named-value (interp named-exp new-env)]) (begin (set-anEnv-val! new-env named-value) ;; Patches false out for the correct value (interp body new-env))))
Racket syntax and General Examples
BNF <CFWAE> ::= <num> | {+ <CFWAE> <CFWAE>} | {- <CFWAE> <CFWAE>} | {* <CFWAE> <CFWAE>} | {/ <CFWAE> <CFWAE>} | <id> | {if0 <CFWAE> <CFWAE> <CFWAE>} | {with {<id> <CFWAE>} <CFWAE>} | {fun {<id> ...} <CFWAE>} | {<CFWAE> <CFWAE> ...} | {rec {<symbol> <CFWAE>} <CFWAE>} |
Conditional chains (cond [(number? v) ...] [(boolean? v) ...] [(string? v) ...] [else false]) |
Map performs a procedure on all elements of the list in order (map (lamda (x) (* 2 x)) [1, 2, 3]) outputs [2, 4, 6] |
Pattern matching '...' or '___' means 0 or more |
==> |
(match data [pattern action] [(? number?) (foo)] ;; check if data is number [(list a b c) (foo a b c) ] ;; back-reference of list elements [((and n (? number?))) (foo n)] ;; check data and back-reference it as n [else (foo)]) |
Begin evaluate expressions in order. Results are ignored except for that of last expression, which is the evaluated value. In this case, the side-effect of printing "Hello World" occurs, however the expression overall evaluates to 4. (begin (+ 5 2) (printf "Hello World!\n") (+ 3 1)) |
Functions (type-case WAE expr [num (n) expr] ...) |
Racket code specific to implementation of our languages
Environment: - repository of deferred substitutions
(define-type Env [mtEnv] [anEnv (name symbol?)(value CFWAE-Value?)])
Expression Closures:
(define-type CFWAE-Value [numV (num number?)] [exprV (body CFWAE?)(env Env?)] ;; used to hold an unevaluated expression and its environment (used in lazy interp) [closureV (param symbol?)(body CFWAE?)(env Env?)]) ;; used to close over the existing environment to ensure static scope
Define-type (From assignment 1) (define-type WAE [num (n number?)] [binop (op procedure?) (lhs WAE?) (rhs WAE?)] [with (b Binding?) (body WAE?)] [with* (lob (listof Binding?)) (body WAE?)] [id (name symbol?)]) |
Strict code (from lecture) (define (strict val) (if (expV? val) (strict (interp (expV-exp val) (expV-env val))) val)) |
Midterm 2 Material
State
- Entails mutation
- State is "threaded through" a program in the sense that the state that results from evaluating a given function is the state in which the next sequential function is to be evaluated in.
- State is inherently dynamic and so isn't supported by solely the environment, as it has static scope in our implementations.
- The environment protects static scope while the store tracks the dynamic changes
- The environment maps identifiers to either locations (or values if there is no store) while the store maps locations to values.
- Closures do not remember state as function are to be applied using the state that exists where they are applied.
Mutability implementation data structures
To support mutability in a language, we need two data structures:
Environment: This is our regular substitution list required to support static scope.
- Maps either:
- Identifiers to values: Makes sense when we are differentiating between mutable and immutable identifiers. If it is immutable, it will simply map to a regular value. If it is mutable, it will map to a boxV value with a store location stored within.
- Identifiers to store locations: Only store locations are stored within the environment. Thus, even for an immutable value, there is always a two-step lookup: the environment maps the identifier to a store location, and the store maps the store location to a value.
- Identifiers in the environment have lexical scope: a particular environment belongs to a particular expression scope.
Store: This is where we keep track of dynamic variable values.
- Maps location to a value (which can be a boxV pointing to another location itself).
- The store is "threaded through" the evaluation across sub-expressions.
- Mutation occurs when we change a value in the store.
"Whereas the interpreter employs the same environment for both arms of an addition, for instance, it cascades the store from one arm to the next and then back out alongside the resulting value."
Meta vs Syntactic Interpreters
Syntactic Interpreter - An interpreter that uses the interpreting language to represent only terms of the interpreted language, implementing all the corresponding behavior explicitly
- doesn't use Scheme's implementation of things (like numbers)
- doesn't matter how well the interpreting and interpreted languages correspond
Meta Interpreter - An interpreter that uses language features of the interpreting language to directly implement behaviour of the interpreted language
- easy to write when there is a strong match between interpreted and interpreting language
- uses Scheme's implementation of things (closures, procedure applications, numbers, etc)
Meta-Circular Interpreter - A meta interpreter in which the interpreting and interpreted language are the same
- Only uses Scheme implementation of things
Continuations
What are they?
- stack is represented procedurally
- remember only the result and what is left to do (passes the state along or back to caller)
- roughly the same idea as tail recursion
Why? | Why Not? |
---|---|
|
|
Continuation Passing Style (end in /k)
Main Ideas
- every function takes an extra argument (its continuation)
- every argument in a function call must be either a variable or a lambda expression (not a more complex expression).
- every call is a tail call
Example from midterm 2
We have the following, and want to convert it to CPS
(assuming (all-pos/k) and (range/k) are correctly implemented,
and represent CPS versions of all-pos and range, respectively:
(define (test-all-pos) (if (all-pos (range -5 5)) 'fail 'pass)) |
==> |
(define (test-all-pos/k k) (range/k -5 5 (lambda (range-result) (all-pos/k range-result (lambda (allpos-result) (k (if allpos-result 'fail 'pass)))))) |
Note that this assumes if is in CPS. |
(define (map f l) (if (empty? l) empty (cons (f (first l)) (map f (rest l))))) |
==> |
(define (map/k f/k list k) (if (empty? list) (k empty) (f/k (first list) (lambda (f-result) (map/k f/k (rest list) (lambda (r-result) (k (cons f-result r-result)))))))) |
Note that this assumes empty, first, rest, and cons are all in CPS. |
Variables
Call-by-value | Call-by-reference |
---|---|
|
|
Stateful | Stateless |
---|---|
|
|
Web Programs
Receiver - a procedure of one argument representing the pending computation
Lifting - make nested procedures into top-level procedures
|
Make a program Web-Ready
Implications
|
Escaper Lambdas Example
From assignment 7 (midterm 2 review), we want to describe the continuation of the bolded expression (namely (* (- x 32) 5) ) below:
(define (fahrenheit-celcius-converter type x) (case type ['Fahrenheit (/ (* (- x 32) 5) 9)] ['Celcius (+ 32 (/ (* x 9) 5))])) (+ (fahrenheit-celcius-converter 'Fahrenheit 32) (fahrenheit-celcius-converter 'Fahrenheit 212))
We can use escaper lambdas to do this. The continuation of the bolded expression in the first evaluation of the function is:
(lambda^ (value) (+ (/ value 9) (fahrenheit-celcius-converter 'Fahrenheit 212)))
And the second continuation is: (lambda^ (value) (+ 0 (/ value 9)))
Post-Midterm 2 Material
Lambda Calculus
Consists of:
- 1-parameter functions
- 1-argument function applications
- identifiers
Shrinking of a language is achieved by showing that a given component of the language may be replaced by a combination of other components making the component unnecessary. (ex. numbers are unnecessary as they can be represented by a function that represents zero and a successor function) By creating a simplified language it is easier to construct proofs showing that the language has desired properties. (ex. correctly implements tail-call optimization, never accesses an array out of bounds, ...)
Representing zero, succession and sum using lambda calculus:
zero_ = (lambda (f) (lambda (x) x)) ;; apply the function f zero times succ_ = (lambda (n) (lambda (f) (lambda (x) (f ((n f) x))))) ;; apply the function f n times sum_ = (lambda (m) (lambda (n) ((n succ) m)))
Eliminating Recursion
While reducing the language closer to the lambda calculus, we needed a fixed-point function to avoid the problem of having unbound errors in recursion definitions (eventually would get to some expression of the form (f f))
These are the lecture notes that show how we created the fixed-point function and used it in implementing factorial (_mul is multiplication, _pred is predecessor (i.e. "decrement"):
(define _fixmaker (lambda (fixmaker) (lambda (f) (f (lambda (x) (((fixmaker fixmaker) f) x)))))) ;; Then, we'll give it itself: (define _fix ((lambda (f) (f f)) _fixmaker)) ;; Now, we can use fix to define recursive functions: (define _fac (_fix (lambda (fac) (lambda (n) (((_if (_zero? n)) (lambda (IGNORE) _one)) (lambda (IGNORE) ((_mul n) (fac (_pred n))))))))) |
Here is the Y-combinator all of its lambadic glory: (lambda (p) ((lambda (f) (f f)) (lambda (f) (p (f f))))) |
Type Checking
Advantages | More Advantages | Disadvantages |
---|---|---|
Safety: A type checker can guarantee certain errors won't happen in the program before we run it.
|
Efficiency: Type declarations document static properties at compile time, so we don't have to check them each time code is visited during run time.
|
Loses Expressiveness: some things will never produce errors but can't be type-checked will be rejected. |
Type - any property that can be established without executing the program. Types record kind of value not the precise value
A type system is that part of a programming language (definition and implementation) that concerns itself with making sure that no operations are performed on inappropriate (types of) arguments.
A language is “Type Safe” (or just “Safe”) if it has a type system that ensures that no operations can be performed on inappropriate arguments without this being detected. The property of type safety can be achieved by static checking (=detecting before runtime), dynamic checking (=detecting at runtime) or a combination of both!
Type Judgements - Collection of rules. One type rule for every syntactic construct. At least one type rule applies to every sub-term. A type error occurs when we are unable to construct a type judgment tree
- Relation between type judments for functions and applications:
- Function declaration: assume arg has right type, guarantee that the body will have the promised type
- Function application: guarantee the argument has the right type for function, assume the result will have the type the function promises
Function | Application | Recursion |
---|---|---|
Г[i <- τ1] |- b : τ2 Г |- fun {i : τ1} : τ2 b} : (τ1 ->τ2 |
Г |- e1: τ1 -> τ2, Г |- e2: τ1 Г |- {e1 e2}: τ2 |
Г[i <- τi] |- b : τ, Г[i <- τi] |- v : τi Г |- {rec {i : τi v} b} : τ |
- Type Procedures (^) (polymorphic function)
e.g. map : (for all) a, b. list(a) × (a -> b) -> list(b)
Soundness:
- A program P typechecks without error --> No possible runs of P have "forbidden errors"
- Typechecker says P has an error <-- Some possible runs of P have "forbidden errors"
Metavariables - Not program variables. Stand in for program text
Typing Control
- Cannot write an infinitely long type
- Strongly Normalizing: No matter what program you write in a strongly normalized language, it will always terminate
- Typing Recursion: extend the environment for body to initiate recursion, extend again for v to sustain it
- Datatype Variant Tags -- With static type checker, only need to store variant since type is guaranteed by type checker. Better space conception
Type Inferencing
- Constraints are generated by traversing the abstract syntax tree and determining what types expressions are required to have by the operations that act with them.
- The unification algorithm is applied on the generated constraints: If the constraints unify a type can be determined for the expression otherwise type inferencing fails on the expression.
Unification Algorithm
1. If X and Y are identical identifiers, do nothing. 2. If X is an identifier, replace all occurrences of X by Y both on the stack and in the substitution, and add X |-> Y to the substitution. 3. If Y is an identifier, replace all occurrences of Y by X both on the stack and in the substitution, and add Y |-> X to the substitution. 4. If X is of the form C(X1,...,Xn) for some constructor C, and Y is of the form C(Y1,...,Yn) (i.e., it has the same constructor), then push Xi=Yi for all 1 <= i <= n onto the stack. 5. Otherwise, X and Y do not unify. Report an error.
Occurs Check - Throw an error if the replacee is in the replacer (ie - recursion)
Principal Type - Imposes only enough constraints needed for type soundness, and no more
Garbage Collection
Why manual?
1. Automatic GC is a slow and expensive process, therefore let people handle it and only do as much as is necessary. 2. Soundness over completeness -> Automatic GC cannot detect all correct instances of GC (can fail to reclaim, or reclaim too early)
Why not manual?
1. People make mistakes when de-allocating memory. (i.e. Reclaiming something that is in use, or failing to reclaim unused memory) 2. Loss of structural simplicity (e.g. multiple versions of the same code that differ only in memory management) 3. Loops often lose tail-calling behaviour
Algorithms for GC:
1. Pointer Counting -> Collect anything with 0 things referencing it (Cyclic Pointers!) 2. Mark and sweep 0) System is low on memory 1) Mark root set as always live, and recursively Mark down any reachable children of root set as live. 2) anything that can't be reached is trash and is Sweeped.
GC should demonstrate:
- Utility: - must reclaim enough garbage to actually help computation
- Soundness: - never reclaim something that is still needed
- Efficiency: - Run quickly
Space Leakage: - Collector not reclaiming space that we know is no longer needed (eg. global variables that are no longer needed)
Explicit Polymorphism
(define length < Λ (τ) (lambda (l : list (τ)) : number (cond [(Empty? <τ> l) 0] [(Cons? <τ> l) (add1 (length <τ> (Rest <τ> l)))]))>)
- The expression (Rest <τ> l) first applies Rest to τ, resulting in an actual rest procedure that applies to lists of values of type τ
- This procedure consumes l as an argument and proceeds as it would in the type system free case
- Every type-parameterized procedure, such as Rest or length, is a generator of infinitely many procedures that each operate on specific types.
- Type Elaborator : The phase that performs type applications.
Implicit Polymorphism
- Want to get new identifiers for the ones bound by the closures but not the ones in its lexical scope (lexical scope variables are shared between all applications of the closure). Only get fresh type varaibale for types introduced by let or letrec
Г|- v : τ' Г[x <- CLOSE(τ', Г)]|- b : τ
Г |- (let ([xv])b):τ
- to get fresh type variables:
Г|- e : CLOSE(τ', Г')
Г |- e : τ
- where τ is the same as τ', except the renaming applies only to type variables in τ' that are not bound by Г