
 = 

 ∈ 
 
 ⊂ 

 = 

 = 

\(
, \)
special characters make it possible to include parentheses in expressions in a way that does not cause them to interpreted as regular expression grouping symbols;\s
matches a single whitespace character;[09]
matches a single numeric digit;[az]
matches a single lowercase letter;[AZ]
matches a single uppercase letter;[azAZ]
matches a single letter;[azAZ09]
matches a single alphanumeric character.re
module provides functionality for automatically checking whether a string matches a particular regular expression. In order to check whether a string exactly matches a regular expression, it is necessary to wrap the regular expression in parentheses and then add special markers to ensure that the regular expression matches from the beginning of the string to the end of the string.
None
is returned.
 = 

 = 

 = 

 
 
 
 
 

 ::= 
 
 
 
⋮  
 

 = 

 ::= 

 = 

 = 

 ::= 
 
 

 = 

 = 

 ::= 
 
 
 
 
 
 
 
 

 = 

 
 
 
 
 

 = 

 ::= 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

 = 

 ::= 
 
 
 
 ::= 
 
 
 
 ::= 
 
 
 
 
 
 
 
 

 ::= 
 
 

 ::= 
 
 

character string (concrete syntax) 
⇒  lexer/ tokenizer 
⇒  token sequence 
⇒  parser  ⇒  parse tree (abstract syntax) 
character string (concrete syntax) 
⇒ 
parser 
⇒  parse tree (abstract syntax) 
 ::= 
 
 

 ::= 
 
 

 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 

condition()
function with a generic parser for base cases that contain an arbitrary number of terminals in each sequence.
Any call to condition(...)
can then be replaced with a call to parseBaseCases(seqsCondition, ...)
.
 ::= 
 
 
 
 
 
 
 
 
 
 

 ::= 
 
 
 
 
 
 ::= 
 
 
 
 
 
 

formula()
uses backtracking because the formula production rule has three choices that all begin with the same nonterminal left. It would be possible to perform left factoring on the formula production rule as follows, which would then make it possible to implement formula()
as a predictive parser:
 ::= 
 
 ::= 
 
 
 
 
hw1/hw1.py
. An incorrect file path and/or file name will result in an automatic 10point deduction. Please follow the gsubmit directions.
hw1tests.py
. You should be able to place it in the same directory as hw1.py
and run it. Feel free to modify or extend it as you see fit.
regexp()
that takes a list of tokens and returns a single string. This string should be a regular expression that matches only the tokens in that list and nothing else.
tokenize()
that takes two arguments: a list of terminals and a concrete syntax character string to tokenize. It should build a regular expression, use that regular expression to convert the second string into a token sequence (represented in Python as a list of strings), and then return that token sequence. You should not assume the tokens will always be separated by spaces in the concrete syntax.
tree()
that takes a token sequence (represented in Python as a list of strings) and parses that token sequence into a parse tree according to the following grammar definition:
 ::= 
 
 
 
 

"Two"
, "One"
, or "Zero"
. An example input and output are provided below. Note that your implementation of tree()
must return a tuple containing the parse tree as well as the token list containing the remaining unparsed tokens (the token list may be empty).
json
module. Your implementation does not need to format its results in this way and can produce it in the above format.
#
character, and these special characters could be separated by whitespace from the actual variable or number):
 ::= 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 

program()
, formula()
, term()
, variable()
, and number()
. Each function corresponds to one of the productions. Below is an implementation of number()
:
Your algorithm must parse all programs in this programming language correctly. However, it need not return useful error messages if the input provided does not conform to the grammar definition.
variable()
that takes a single token list as an argument and, if the token sequence conforms to the grammar definition, returns a tuple containing two values: an instance of the abstract syntax (i.e., parse tree) with the root node label "Variable"
, and the remaining, unparsed suffix of the input token sequence.term()
that takes a single token list as an argument and, if the token sequence conforms to the grammar definition, returns a tuple containing two values: an instance of the abstract syntax (i.e., parse tree) with an appropriate root node label ("Plus"
, "Max"
, "If"
, "Variable"
, or "Number"
), and the remaining, unparsed suffix of the input token sequence.
formula()
that takes a single token list as an argument and, if the token sequence conforms to the grammar definition, returns a tuple containing two values: an instance of the abstract syntax (i.e., parse tree) with an appropriate root node label ("True"
, "False"
, "Not"
, "Xor"
, "Equal"
, "Less"
, or "Greater"
), and the remaining, unparsed suffix of the input token sequence.program()
that takes a single token list as an argument and, if the token sequence conforms to the grammar definition, returns a tuple containing two values: an instance of the abstract syntax (i.e., parse tree) with an appropriate root node label ("Print"
, "Input"
, "Assign"
, or "End"
), and the remaining, unparsed suffix of the input token sequence.parse()
that takes a single string as an argument. If the input string conforms to the grammar of the programming language, the function should return a parse tree corresponding to that input string.
 ::= 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 ::= 
 
 ::= 

 ::= 

[NameofInferenceRule] 

[Example] 

[AlgorithmCase] 

[AlgorithmCase] 

expressions (abstract syntax) 
⇒  evaluation algorithm 
⇒  values (abstract syntax) 
 ::= 
 
 ::= 

[True] 

[False] 

[NotTrue] 

[NotFalse] 

[AndTrueTrue] 

[AndTrueFalse] 

[AndFalseTrue] 

[AndFalseFalse] 

[OrTrueTrue] 

[OrTrueFalse] 

[OrFalseTrue] 

[OrFalseFalse] 

 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 

[True] 

[False] 

[Not] 

[And] 

[Or] 

vnot
, vand
, and vor
correspond to ¬, ∧, and ∨, respectively, in the operational semantics.
statement sequence (abstract syntax) 
⇒  execution algorithm 
⇒  output (record of side effects) 
 ::= 
 
 ::= 
 
 ::= 

[StatementPrint] 

[StatementEnd] 

[FormulaTrue] 

[FormulaFalse] 

[FormulaNot] 

[FormulaAnd] 

[FormulaOr] 

evaluate()
from a previous example.
print()
function and turn this into an interpreter that uses the Python runtime system directly to interpret the program. However, we must be careful about where we put the call to print()
: it must be invoked in a way that conforms to a preorder traversal of the syntax tree.

⇒  execution algorithm 
⇒  output (record of side effects) 
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

[StatementAssign] 

[StatementPrint] 

[StatementEnd] 

[FormulaTrue] 

[FormulaFalse] 

[FormulaNot] 

[FormulaAnd] 

[FormulaOr] 

[FormulaVariable] 

vnot
, vand
, and vor
are the same as those in a previous example.

⇒  execution algorithm 
⇒ 

 ::= 
 
 
 
 
 
 
 
⋮ 
[StatementAssign] 

[StatementPrint] 

[StatementRepeatTwice] 

[StatementEnd] 

evaluate()
is the same as the one in a previous example, while the implementation of execute()
from that example must be modified so that it also returns an updated version of the environment data structure.
[StatementWhileFalse] 

[StatementWhileTrue] 

[StatementUntilTrue] 

[StatementUntilFalse] 

program (abstract syntax) 
⇒ 

⇒  output and/or value(s) 
character string (concrete syntax) 
⇒ 
interpreter 
⇒  output and/or value(s) 


hw2/parse.py
and hw2/interpret.py
. Please follow the gsubmit directions and remember to put your files in the hw2
directory.
hw2tests.py
. Feel free to modify or extend it as you see fit.
parse.py
. You may adapt the techniques and/or code from Assignment #1 where appropriate to complete this problem. Recall that all parsers take a sequence of tokens (and possibly a flag) as their input, and return a tuple containing a parse tree and a sequence of remaining tokens.variable()
and number()
that can parse token sequences that conform to the following two productions. The functions should return a string or integer, respectively, without wrapping the result in a dictionary. If the first token in the token sequence supplied to a function does not correspond to the production, the function should return None
.
 ::= 
 
 ::= 

formula()
that can parse token sequences that conform to the following production. You may need to perform leftrecursion elimination on a production before you can implement a working parser. You should use the abstract syntax tree node labels "And"
, "Nonzero"
, "Not"
, "Variable"
, "True"
, and "False"
. If the token sequence is not in the language defined by the production, the function should return None
(this will be tested, and is necessary to receive full credit). Note that to complete this part, you will need to assume that a parser for term is already implemented (you will then implement it in the next part of the problem, immediately below).
 ::= 
 
 
 
 
 
 
 
 
 
 

term()
that can parse token sequences that conform to the following productions. These productions ensure that the traditional order of operations for addition and multiplication is preserved in a parse tree. However, you may need to perform leftrecursion elimination on a production before you can implement a working parser. You should use the abstract syntax tree node labels "Plus"
, "Mult"
, "Parens"
, "If"
, "Variable"
, and "Number"
. If the token sequence is not in the language defined by the production, the function should return None
(this will be tested, and is necessary to receive full credit).
 ::= 
 
 
 
 ::= 
 
 
 
 
 
 
 
 

program()
that can parse token sequences that conform to the following production. You may want to implement a helper parser function expression()
. You should use the abstract syntax tree node labels "Print"
, "Assign"
, "If"
, "DoUntil"
, and "End"
. Note that the last choice in the production defining program is meaningful: it means that an empty token sequence is a valid program. The parse tree corresponding to an empty program should be "End"
. If the token sequence is not in the language defined by the production, the function should return None
(this will be tested, and is necessary to receive full credit).
 ::= 
 
 
 
 
 
 
 
  
 ::= 
 
 
 
interpret.py
. Your interpreter must conform to the operational semantics for this language, as presented in each of the problem parts below.evalTerm(env, t)
that takes an environment env
and a parse tree t
as its two arguments. The function should return the value that corresponds to the evaluation of the parse tree t
, as determined by the operational semantics below.[TermNumber] 

[TermVariable] 

[TermParens] 

[TermIfTrue] 

[TermIfFalse] 

[TermPlus] 

[TermMult] 

evalFormula(env, f)
that takes an environment env
and a parse tree f
as its two arguments. The function should return the value that corresponds to the evaluation of the parse tree f
, as determined by the operational semantics below.[FormulaTrue] 

[FormulaFalse] 

[FormulaVariable] 

[FormulaNonzeroTrue] 

[FormulaNonzeroFalse] 

[FormulaNot] 

[FormulaAnd] 

execProgram(env, s)
that takes an environment env
and a parse tree s
as its two arguments. The function should return a tuple containing the output (represented as a list of values) and an updated environment that represent the result of the execution of the program s
as determined by the operational semantics below. Note that end ; corresponds to the empty token sequence and is used as a placeholder to make the notation unambiguous. You should not use the Python while
or for
control construct, or any other looping construct, anywhere in your implementation of the [StatementUntil*] cases; you will only receive credit if you use recursion as determined by the inference rules.[StatementPrint] 

[StatementAssign] 

[StatementIfFalse] 

[StatementIfTrue] 

[StatementDoUntilTrue] 

[StatementDoUntilFalse] 

[StatementEnd] 

interpret()
that takes a single string as its argument. The function should tokenize the string and parse it to generate a parse tree. If parsing fails, it should return None
; otherwise, it should execute the parse tree to obtain an output. It should then return the output. Your implementation only needs to interpret entire programs (you do not need to interpret standalone terms or formulas).Equal
, Less
, and Greater
for the tree node labels).program (e.g., input/configuration, instructions, etc.) 
computing device (e.g., difference engine, modern CPU, quantum computer, etc.) 
physical laws (mechanics, electricity, chemistry, biology, quantum mechanics, etc.) 
memory


central processing unit (CPU) 
machine program  
machine languauge  

FORTRAN program  
FORTRAN compiler  
machine program  
machine languauge  

application  
highlevel program  
compiler  
machine program  
machine languauge  

user A

user B


highlevel program 
highlevel program 
highlevel program 
highlevel program 

compiler  
machine program 
machine program 
machine program 
machine program 

machine languauge  
operating system  

application  application  application  application 
operating system  
hardware 
app.  app.  
web app. platform  
browser  app.  app.  
API  API  OS  OS  
web service  web service  virtual machine  virtual machine  
dist. platform  web server  virtualization layer  
controller  controller  OS  host operating system  
node  node  node  node  node  node  node  node  server  hardware (web/cloud server) 
source language abstract syntax 
⇒  compiler  ⇒  target language abstract syntax 
source language concrete syntax 
libraries  
⇓  ⇓  
parser  linker  ⇒  target language concrete syntax 

⇓  ⇑  
source language abstract syntax 
⇒  compilation algorithm 
⇒  target language abstract syntax 
source language abstract syntax 
⇒  comp. alg. A 
⇒  IR #1 
⇒  comp. alg. B 
⇒  IR #2 
⇒  comp. alg. C 
⇒  target language abstract syntax 
 ::= 
 
  
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 

 ::= 
 
  
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 

copy()
function below can be viewed as a macro that generates a specific machine language program.
increment()
that takes a single argument, a memory address, and generates a machine program that increments the value at the address specified memory address. We could then implement the following macro, which will set an entire region of memory to a particular constant.
setAllToInRegion()
on specific inputs generates a machine language program for a particular region.
15
). Note that #increment#
is a macro provided by the simulator and not part of the machine language itself; you cannot use #increment#
when working on an assignment because simulator provided to you will not support such macros.
 ::= 

compileFormula()
that takes as its input the parse tree, and a variable heap
that represents the highest address in memory that contains a computed value (in our case, 1
can represent true and 0
can represent false). The function returns a list of machine language instructions that computes that value, as well as the memory address where that value would be stored if anyone actually executed (or simulated) the machine language program.
true
is represented as 1, then our compiler would first emit the instruction set 8 1 and would store inside the environment a mapping {x ↦ 8}. It would then use this information to compile the expression condition and assignment statement in the body of the loop.
 ::= 
 
 
 
 
 
 
 
  
 ::= 

call printProc;
statement, or to the point corresponding to the second call printProc;
statement?
call printProc;
are compiled into machine code that stores the index of the current instruction at each call point at a particular memory location (in this example, suppose it is memory address 7
). Then, the block of machine code into which the procedure body is compiled can always consult the same memory location to determine to which point it should pass control back after the machine code for the procedure body finishes executing. Because control should not go back to the goto instruction but beyond it, it may be necessary to increment the value representing the program location before using the jump instruction. We provide machine language pseudocode below that does just that.
7
, making it impossible to determine the program location to which control must return after the outermost procedure body finishes. To address this, it is possible to have the machine language program maintain a stack starting at memory location 7
that will keep track of all the program locations to which control must return after each nested procedure call finishes. We do so in the machine language pseudocode below.
hw3/parse.py
(there is no need to modify this file);hw3/interpret.py
;hw3/machine.py
;hw3/compile.py
.hw3
directory.
hw3tests.py
. You should be able to place it in the same directory with the other assignment files and run it. Feel free to modify or extend it as you see fit.
hw3/interpret.py
. The abstract syntax for the language is presented below:
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 
 
 
 
 
 
 
 
 
 
 
 
  
!=
):[TermVariable] 

[FormulaVariable] 

[FormulaNot] 

[FormulaXor] 

[FormulaAndShort] 

[FormulaAnd] 

[StatementAssign] 

[StatementProcedure] 

[StatementCall] 

evalTerm(env, t)
that takes an environment env
and a parse tree t
as its two arguments. The function should return the value that corresponds to the evaluation of the parse tree t
, as determined by the operational semantics.evalFormula(env, f)
that takes an environment env
and a parse tree f
as its two arguments. The function should return the value that corresponds to the evaluation of the parse tree t
, as determined by the operational semantics.execProgram(env, s)
that takes an environment env
and a parse tree s
as its two arguments. The function should return a tuple containing an updated environment and the output (represented as a list of values) that represent the result of the execution of the program s
as determined by the operational semantics.hw3/machine.py
already contains a simulator for the machine language with which you will be working; the machine language is defined in detail in a previous example in the lecture notes.
Although any correct implementation is acceptable, it is suggested that you follow the conventions below:
1
, for the stack;7
to store the memory address of the top of the stack;8
and higher for the heap (i.e., results of computations).hw3/machine.py
.increment(addr)
that takes a single integer argument addr
. The function should return a list of instructions (in which each instruction is represented as a string in the list). These instructions should correspond to a machine language program that increments by 1
the integer stored in the memory location addr
and cleans up any memory addresses it used in the process by setting them back to 0
.decrement(addr)
that takes a single integer argument addr
. The function should return a sequence of instructions (represented as a Python list of strings). These instructions should correspond to a machine language program that decrements by 1
the integer stored in the memory location addr
and cleans up any memory addresses it used in the process by setting them back to 0
.call(name)
that takes a single argument: name
is a string corresponding to the name of a procedure. The function should return a sequence of instructions that performs the following operations:
name
supplied;procedure(name, body)
that takes two arguments: name
is a string corresponding to the name of a procedure, and body
is a sequence of machine language instructions (represented as a Python list of strings). The function should return a sequence of instructions that includes:
hw3/compile.py
. The source language of the compiler will be the language for which you implemented an interpreter in Problem #1. The target language will be the machine language with which you worked in Problem #2.compileTerm(env, t, heap)
that takes three arguments: env
is a mapping from variables to memory addresses, t
is a term parse tree, and heap
is the memory address of the current top of the heap. The function should return a tuple (insts, addr, heap)
in which insts
is a sequence of machine language instructions (represented as a Python list of strings) that perform the computation represented by the parse tree, addr
is the address of the result, and heap
is an integer representing the memory of the top of the heap after the computation is performed.compileFormula(env, f, heap)
. The requirements for this function are the same as those for compileTerm(env, t, heap)
, except that it must handle formula parse trees.compileProgram(env, s, heap)
that takes three arguments: env
is a mapping from variables to memory addresses, s
is a program parse tree, and heap
is the memory address of the current top of the heap. The function should return a tuple (env, insts, heap)
in which env
is an updated environment, insts
is a sequence of machine language instructions (represented as a Python list of strings) that perform the computation represented by the parse tree, and heap
is an integer representing the memory of the top of the heap after the computation is performed.compile(s)
that takes a single string s
that is a concrete syntax representation of a program in the source programming language and returns its compiled form: a sequence of instructions in the target machine language.hw3/parse.py
has been updated to support this operator and to emit the 'Equal'
node when this operation appears in the concrete syntax. You must extend both the interpreter and compiler to support this operator. The inference rule for this operation is provided below.[FormulaEqual] 

source language abstract syntax 
⇒ ⇐ 
optimization algorithm #1 
⇓  
compilation algorithm A 

⇓  
intermediate representation (IR) #1 
⇒ ⇐ 
optimization algorithm #2 
⇓  
compilation algoritm B 

⇓  
intermediate representation (IR) #2 
⇒ ⇐ 
optimization algorithm #3 
⇓  
compilation algorithm C 

⇓  
target language abstract syntax 
⇒ ⇐ 
optimization algorithm #4 
i
corresponds to memory address 123
).
{'Mult':[{'Number':[3]}, {'Number':[4]}]}
can be evaluated in a finite amount of time, and then replace it with the value to which it evaluates. The overall tree would then look as follows.
[StmtFor] 

[ExpComprehension] 

highlevel source language abstract syntax 
⇒ ⇐ 
loop unrolling 
⇓  
SSA conversion  
⇓  
IR in SSA form  ⇒ ⇐ 
constant folding 
⇓  
compilation  
⇓  
lowlevel IR  ⇒ ⇐ 
register allocation 
⇓  
instruction selection 

⇓  
lowlevel target machine language abstract syntax 
source language abstract syntax 
⇒  compilation algorithm 
⇒  target language abstract syntax 
⇓  ⇓  
source language operational semantics (interpreter) 
target language operational semantics (simulator) 

⇓  ⇓  
source language abstract syntax for values/outputs 
⇒  simple conversion algorithm for values/outputs 
⇒  target language abstract syntax for values/outputs 
 = 

 ::= 

 ::= 
 
 
 
 
 
 
 
 

 ::= 
 
 

[FormulaTrue] 

[FormulaFalse] 

[FormulaNot] 

[FormulaAnd] 

[FormulaOr] 

[TermNumber] 

[TermPlus] 

 ::= 
 
  
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 
 
 
 
 

[StatementAssign] 

[StatementEnd] 

[Variable] 

[FormulaTrue] 

[FormulaFalse] 

[FormulaNot] 

[FormulaAnd] 

[FormulaOr] 

[TermNumber] 

[TermPlus] 

[TermMult] 

[StatementWhileEvalTrue] 

[StatementWhileType] 

 ::= 
 
 
 
  
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 
 
 
 
 

[StatementAssign] 

[StatementFunction] 

[StatementEnd] 

[TermVariable] 

[TermNumber] 

[TermPlus] 

[TermApply] 

 ::= 
 
 
 
  
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 
 
 
 
 
 
 ::= 
 
 
 
[StatementAssign] 

[StatementFunction] 

[StatementEnd] 

[TermVariable] 

[TermString] 

[TermConcat] 

[TermApply] 

 ::= 
 
 
 
 
 
 
 
  
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

[ExpressionNumber] 

[ExpressionMult] 

[StatementPrint] 

[StatementFor] 

[StatementProcedure] 

[StatementCall] 

[StatementEnd] 

 ::= 
 
 
 
 
 
 
 
  
 ::= 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

[DurationMinutes] 

[DurationHours] 

[DurationDays] 

[StatementReserve] 

[StatementFor] 

[StatementProcedure] 

[StatementCall] 

[StatementEnd] 

 ::= 
 
 
 
 
 
  
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

[StatementAssignSecret] 

[StatementAssignPublic] 

[StatementPrint] 

[ExpressionVariable] 

[ExpressionNumberSecret] 

[ExpressionNumberPublic] 

[ExpressionPlusSecretSecret] 

[ExpressionPlusSecretPublic] 

[ExpressionPlusPublicSecret] 

[ExpressionPlusPublicPublic] 

midterm/parse.py
(Problem #1);midterm/interpret.py
(Problem #2);midterm/analyze.py
(Problem #3);midterm/optimize.py
(do not modify);midterm/machine.py
(do not modify);midterm/compile.py
(Problem #4);midterm/validate.py
(Problem #5).midterm
directory (the directory and file names are casesensitive).
 ::= 
 
 
 
 
 
  
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 

parse.py
, implement a function tokenizeAndParse(s)
that takes a single string s
as its one input. For any string that conforms to the above grammar, the function should return an abstract syntax instance (i.e., a parse tree). Your tokenizeAndParse(s)
implementation should perform both the tokenization and parsing steps; you will need to determine what kind of parser is needed for each production, and whether any adjustments must be made to the production before a parser can be implemented. Please adhere to the following guidelines:
'Assign'
for the assignment statement (e.g., the concrete syntax @ a := [1+2,4,6];
would be {'Assign':[{'Variable':['a']}, {'Plus':[{'Number':[1]},{'Number':[2]}]}, {'Number':[4]}, {'Number':[6]}, 'End']}
);'Loop'
for the loop case;'Element'
for the @ variable [ expression ] case;'True'
and 'False'
, and trees of the form {'Number':[...]}
, respectively):
 ::= 
 
 
 
 

[ExpBooleanTrue] 

[ExpBooleanFalse] 

[ExpNumber] 

[ExpVariable] 

[ExpPlus] 

[ExpElement] 

[StatementAssign] 

[StatementPrint] 

[StatementLoopZero] 

[StatementLoop] 

[StatementEnd] 

interpret.py
, implement a function interpret(s)
that takes a single string as its argument. The function should tokenize the string, parse it to generate a parse tree for a program, and then execute the parse tree it to obtain an output (i.e., a list of values that can be printed). It should then return the output. Please adhere to the following guidelines:
evalExpression(env, e)
helper function should be used to evaluate expression parse trees;execProgram(env, p)
helper function should be used to execute program parse trees;analyze.py
, complete the implementation of a monomorphic type checking algorithm for this programming language, and modify your implementation of interpret()
in interpret.py
to only execute programs that are welltyped.typeExpression()
and typeProgram()
should either return the types of the parse trees supplied to them if they conform to the type checking inference rules, or they should return None
. There are four possible types in this programming language (represented in your Python implementation as "TyNumber"
, "TyBoolean"
, "TyArray"
, and "TyVoid"
):
 ::= 

[ExpBooleanTrue] 

[ExpBooleanFalse] 

[ExpNumber] 

[ExpVariable] 

[ExpPlus] 

[ExpElement] 

[StatementAssign] 

[StatementPrint] 

[StatementLoop] 

[StatementEnd] 

compile.py
, the function compile(s)
takes a single string as its argument, tokenizes and parses it to generate a parse tree for a program, performs type checking and optimizations, and uses the functions compileExpression(env, e, heap)
and compileProgram(env, p)
to compile the parse tree into a machine language program (i.e., a Python list of strings in which each string is a machine language instruction) conforming to the machine language defined in detail in a previous example. You must complete the implementations of the functions compileExpression()
, compileProgram()
, and compile()
. Please adhere to the following guidelines:
compileExpression(env, e, heap)
function should be used to compile expression parse trees;compileProgram(env, p)
function should be used to compile program parse trees, and it should work even when it is given only two arguments (you may add more arguments, but you must supply default values for them);'Assign'
, the machine language code generated by your compiler should:
a
, a+1
, and a+2
)'Element'
case (of the form @ x [ e ]), the machine language code generated by your compiler should:
compile()
function should parse the input concrete syntax string into an abstract syntax tree, and then should call the type checking, constant folding, and dead code elimination algorithms (the latter two are found in optimize.py
) before compiling it (to receive full credit, you must call these in an order that minimizes the amount of work being done by each function).p
that have at most some specified depth k:
validate.py
, there exists an incomplete bounded exhaustive testing algorithm for validating your compiler implementation against your interpreter implementation. Complete the convertValue()
function that converts a single value abstract syntax tree into its corresponding machine language representation. Finally, complete the recursive functions programs()
and expressions()
so that they generate an exhaustive list of program and expression parse trees, respectively, of the depth specified in the parameter. Please adhere to the following guidelines:
'a'
(though it can occur any number of times);'x'
(though it can occur any number of times);1
(though it can occur any number of times);'Plus'
nodes ever occur in any parse tree (this is to avoid array outofbounds errors for larger values of n, since arrays always have three elements).imperative  procedural  object oriented 
declarative  functional  static type checking 
garbage collection 
side effects 

machine languages 
some support (indexed memory, branching) 
no support  no support  no support  no support  none  no  yes  
FORTRAN (initial release) 
full support  no support  no support  no support  no support  very limited 
no  yes  
C  full support  full support  no support  no support  some support (function pointers) 
very limited 
no  yes  
C++  full support  full support  full support  class declarations, some type checking 
function pointers and some extensions (e.g., STL) 
limited  no  yes  
Java  full support  full support  full support  type system, class declarations, abstract classes, interfaces, type checking, and type inference 
generics, interfaces, objects as wrappers, anonymous classes 
extensive  yes  yes  
PHP  full support  full support  full support  no support  some support  none  yes (≥ 5.3) 
yes  
Python  full support  full support  full support  no support  full support  none  yes  yes  
JavaScript  full support  full support  full support  no support  full support  none  yes  yes  
Haskell  metasupport (monads) 
metasupport (functions) 
no support  full support  full support  extensive  yes  some (IO monads) 

SQL  some support (insertion/ deletion; sequences of statements in versions like MSSQL) 
some support (procedures in versions like MSSQL) 
no support  full support (definition of schemas and tables; ability to evaluate queries over tables) 
some support (custom map and aggregation procedures in MSSQL) 
some  yes  some  
Amazon Web Services APIs 
some support  no support  no support  some support  some support  none  n/a  yes 
 ::= 
 
 ::= 

 = 

 = 

 = 

 = 

 = 

 ≠ 

 ::= 
 
 ::= 

 = 

 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
 = 
 
==
operator), and Haskell supports the automatic derivation of this algorithm for any data type (using the deriving Eq
directive).
Tree
algebraic data type. However, in order to actually display instances of the data type, it is necessary to ask the Haskell interpreter to derive a definition of a show
function automatically from the definition of the algebraic data type (something that is done implicitly in, for example, Python). Haskell requires that this be requested explicitly because it also allows users to define their own variants of such functions.
_
) in patterns makes it possible to write even more concise definitions.
==
. To have the Haskell interpreter or compiler derive it automatically, you can use deriving Eq
. These can be combined with other deriving
directives, such as in the example below.
hw4/parse.py
(there is no need to modify this file);hw4/interpret.py
;hw4/Tree.hs
;hw4/Interpret.hs
.hw4
directory.
hw4tests.py
. You should be able to place it in the same directory with the other assignment files and run it. Feel free to modify or extend it as you see fit.
hw4/interpret.py
.subst(s, a)
that takes two arguments: a substitution s
(represented as a Python dictionary), and an abstract syntax tree a
. The function should return a new abstract syntax tree in which every variable in the tree that is in the domain of s
has been substituted according to the substitution s
. You may assume that variables are always represented using a subtree of the form {"Variable":[ ... ]}
, as in all previous examples and assignments.unify(a, b)
that takes two arguments: two syntax trees a
and b
. The function should return the smallest substitution s
that satisfies the following equation:
hw4/interpret.py
. The abstract syntax for the language is presented below; a complete parser that parses the concrete syntax for this language into the abstract syntax below can be found in hw4/parse.py
. Notice that in the concrete syntax, constructor names always begin with an uppercase letter, while variable names always begin with a lowercase letter.
 ::= 
 
  
 ::= 
 
 
 
 
 
 
 
 
 
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
 ::= 
 
 ::= 
 
 ::= 

build(m, d)
that takes a finite map m
(i.e., a Python dictionary mapping names to lists of (pattern, expression) tuples) and a declaration parse tree d
. The function should return the finite map m
that represents the module definition that is assembled according to the operational semantics presented below:[DeclarationFunctionFirst] 

[DeclarationFunctionMore] 

[DeclarationEnd] 

 ::= 
 
 
 
 

evaluate(m, env, e)
that takes a module m
, an environment env
, and an expression abstract syntax tree e
as its three arguments. The function should return the value that corresponds to the evaluation of the abstract syntax tree e
, as determined by the operational semantics presented below. Note that the [ExpressionApply] requires using a unification algorithm to obtain a substitution σ that unifies p and v_{1}.[ExpressionConstructorArgs] 

[ExpressionConstructorNoArgs] 

[ExpressionNumber] 

[ExpressionVariable] 

[ExpressionMult] 

[ExpressionApply] 

build()
and evaluate()
are implemented, you can use the interact()
function in hw4/interpret.py
to query modules.hw4/Tree.hs
. You will be working with the following data type definition, which has already been included in the file:
Tree
in Tree.hs
so that the functions (==)
and show
are automatically generated for this data type.twigs :: Tree > Integer
that returns an integer representing the total number of Twig
values in the input tree.
branches :: Tree > Integer
that returns an integer representing the total number of branches in the input tree.
width :: Tree > Integer
that returns an integer representing the width of the input tree. Trees consisting of a Leaf
or Twig
are defined to have a width of 1
; all other trees are as wide as the sum of the widths of their children.
perfect :: Tree > Bool
that returns True
if the input tree is perfect and False
otherwise.
degenerate :: Tree > Bool
that returns True
if the tree supplied is degenerate, and False
otherwise.
infinite :: Tree
that is infinitely large, but in which every branch has a left Leaf
child, a right Leaf
child, and a Branch
middle child.hw4/Interpret.hs
.evaluate :: Term > Value
function so that it evaluates abstract syntax trees of the type Term
to obtain values (i.e., integers).
execute :: Stmt > Output
function so that it executes abstract syntax trees of the type Stmt
to obtain an output.
 ::= 
 
 
 
 
 
 
 
 
 
 
 
 
 
⋮  
 
 
 
 
 
 
⋮  
[Integer]
.
Tree
data type that we have already seen. If we were to define our own algebraic data type for lists in Haskell, for example, we could do so as follows (the terms "nil" and "cons" are commonly used in the functional language community for the empty list and for the list node constructor, respectively):
[]
, and the list node constructor is the operator (:)
, which can also be written as an infix operator :
. The following three syntaxes are equivalent, and evaluate to the same list [1,2,3]
:
String
; strings in Haskell are just lists of characters:
GameStrategyAlgorithm
is equivalent to the type [[(Integer, Integer)]] > String
(i.e., the type for functions that take lists of lists of integer pairs and return a string):
1
in every position:
(:)
list node constructor:
n
elements of a list (note that this function is already included in the Prelude
; we use take_
for our examples to avoid a name collision).
take
.
addInfiniteLists
function to add two infinite lists. Suppose we have the following definitions:
3
:
g
is a function that takes an integer as an argument. It then throws this argument away and returns the function f
. Its type reflects this. Thus, the result of evaluating an expression such as g(4)
is a function of type Integer > Integer
, and this function can be applied to another argument.
g
that can be partially applied to only some of its arguments, but that still uses all of its arguments? We can do so by listing multiple, spaceseparated parameters in the definition of g
.
g
to arguments one at a time, as before.
f
of type Integer > Integer
and returns a function f'
that is its reflection across the xaxis (i.e., for all x
, f(x)
= f'(x)
).
(Integer > Integer) > (Integer > Integer)
because >
is rightassociative. This makes it more obvious that reflect
takes a function of type Integer > Integer
as an input and returns a function of type Integer > Integer
as an output.
f
of type Integer > Integer
and returns an integer approximation of its definite integral starting at x = 0.
f
of type Double > Double
and returns an approximation of its definite integral starting at x = 0 for a given precision.
f
of type Double > Double
and returns an approximation of its derivative.
user()
and owns()
either as tables in a database (with each declaration for the function acting as a row in the table) or as logical predicates.
file()
and user()
must take place when the program is running. We can make the code more efficient by instead declaring distinct types. Since type checking is performed only at compile time in Haskell, this will still ensure that there are no type errors, but it will be more efficient because fewer function invocations will occur when the program is running.
0
, 1
, 2
, and so on) and builtin operators (such as +
, *
, and so on).
Meters
.
Meters
to strings.
$
syntax to easily write down abstract syntax trees.
parseTree
will yield the actual algebraic data type instance (i.e., a parse tree).
if ... then ... else ...
for writing conditional expressions:
if ... then ... else ...
syntax (or function, as we have seen above) can be useful in certain situations.
String
and a value (e.g., an Integer
). For example:
lookup
. However, since it returns a result that is wrapped in a Maybe
constructor, we can unwrap it using, e.g.:
Maybe
in more detail in a subsequent section.
/=
on booleans):
Red
has the lowest frequency and Blue
has the highest frequency, we can define an ordering on all instances of this data type by making the Color
type a member of the Ord
type class. Note that we could use wildcards (_
) to make the definition more concise.
<
, <=
, >
, and >=
to compare colors, as well as library functions such as max :: Color > Color > Color
, min :: Color > Color > Color
, maximum :: [Color] > Color
, and minimum :: [Color] > Color
.
compose
?
compose
; let's substitute one the rightmost Function
synonym with its definition:
>
type operator is rightassociative; this means we can remove the rightmost set of parentheses without changing the meaning of the type:
compose
: it takes two Function
arguments, and one Integer
argument:
(.)
. So we could also have defined compose
as follows:
<
can be any pattern. If a pattern appears on the lefthand side of <
, then each element in the list on the righthand side will be checked against the pattern. Only those elements that unify with the pattern will be considered, and for each of them, the subsequent conditions in the comprehension, as well as the body of the comprehension, will be evaluated.
example
. The second expression builds a list containing the right subtrees that are not leaves. The last expression builds a list of subtree pairs for each nonleaf tree.
()
for each matching entry and then checking the length of the list resulting from the comprehension, or by comparing it to the empty list.
X
or O
somewhere on the board. We can model this using a function that, given a move, transforms one board into another board. We use a list comprehension that updates only the cell with the corresponding position.
X
and O
players take alternating turns.
Board
), and can have any number of child trees (including zero).
nextMoves
takes a Board
and a Cell
(to indicate which player is taking a turn), and uses a list comprehension to choose only the board cells that are empty. For each empty cell, it adds a new board state to the list in which that position is filled with the specified player's symbol.
win :: Cell > Board > Bool
that checks if a given board is already a win for a particular player. We can then add an Ord
instance to compare boards from player X
's perspective: a board is "greater" (or "better") than another board if either the board is a win for X
, or the other board is a loss for O
.
<
, <=
, >
, and >=
to compare boards, as well as library functions such as max :: Board > Board > Board
and maximum :: [Board] > Board
to choose the "best" boards.
len
for computing lengths of lists of integers.
Bool
values because the type [Integer]
does not unify with the type [Bool]
. We could write an alternative type to address this.
len
on lists of type [Bool]
, but we can no longer use it on lists of type [Integer]
. Notice that the only thing that has changed is the type of the function; the body is the same in both cases. Haskell allows us to take advantage of this by employing a type variable.
len
function to a list containing any type of element. This is because the type variable a
unifies with any other type (including Bool
and Integer
), so the input type [a]
will unify with both [Integer]
and [Bool]
, as well as types such as [String > String]
(a list of functions that take String
inputs and return String
outputs).
len
is a function that can be applied to different input types, but has only a single definition that works for all input types, it is an example of parametric polymorphism.
Set
data structure for holding sets of Integer
values that has an underlying list implementation.
Type > Type
, and so on).
Set
data structure for holding sets of values of any type a
.
Ord a
constraint on the element type a
in case we want to compare them (e.g., if we are using a binary search tree implementation of sets).
Set
data structure for holding sets of values of any type a
that has a corresponding size :: Set a > Int
function.
<
operator by making the type Set a
(for any possible a
) a member of the Ord
type class.
a
must be in the Eq
type class. This is because the definition of Set a
contains a deriving Eq
clause, which means that a type Set a
can only be used in a module if the parameter type a
is a member of the Eq
type class (otherwise, Haskell would not be able to compare two sets for equality using the automatically derived definition of (==)
generated by the deriving Eq
clause).
Integer
lists that does nothing.
(:)
and []
) in the input list with a binary operator and a base case (in this case, (+)
and 0
). This way, the fold
function now computes the sum of the elements in the list.
(:)
, so it must be of type Integer > Integer > Integer
, and a replacement for []
, which must be of type Integer
.
fold
implementation to define many common functions on lists of integers.
(:)
and []
) is called foldr
.
foldr
function can be used to implement many of the usual library functions one might find useful when working with lists, and this is what is done in the Haskell libraries. Below are some examples (some of these are simplified versions of what is actually found in the libraries).
1
to every element in an Integer
list. For example, we want to achieve the following using a call to foldr
:
1
to its first argument, then builds the list node:
foldr
:
addOneThenCons
:
foldr
:
consIfPositive
:
(Name, Age)
:
Name
entries in the table; this query amounts to a map operation:
Age
range; this query amounts to a filter operation:
Age
fields of all the entries; this query amounts to fold operation (it is often called an aggregation operation):
map
, filter
, foldr
, and λ abstractions instead of list comprehensions (the Haskell compiler simply converts list comprehensions into some combination of these during compilation):
map
, filter
, reduce
(similar to Haskell's foldr
), and lambda abstractions:
li
elements and updates the text content of each to indicate the item number corresponding to that element. Notice that the .text()
function takes a function as an argument (the function supplied as an argument takes an index as an input and returns a string):
.each()
function.
hw5
directory):
hw5tests.hs
. You should be able to place it in the same directory with the other assignment files and load it. Feel free to modify or extend it as you see fit.
Graph.hs
and Algorithm.hs
will contain libraries for building state space graphs and state space search algorithms. The modules BinPacking.hs
and SuperString.hs
will use these libraries to solve specific optimization problems.
BinPacking.hs
is defined as follows: given a list of items, each of a certain integer size, put each item into one of two bins so that once all items have been allocated, the sum of the items in the first bin is as close as possible to the sum of the items in the second bin (this problem is NPcomplete and is equivalent to the subset sum problem, so no efficient solution for the problem is believed to exist).BinPacking
, in which the two bin totals so far and remaining list of items are captured under the constructor BinPack
:
b :: BinPacking
is better than another packing b' :: BinPacking
if the difference between the two bin totals in b
is less than the difference between the two bin totals in b'
. Add an instance declaration for the BinPack
type so that it is possible to use the builtin Haskell infix operators <
and <=
to compare two bin packings according to their differences. You may use the Haskell library function abs :: Integer > Integer
to compute the absolute value of an integer. Note that the list of items is ignored, since we are only comparing the two packed bins. If you're getting a stack overflow when testing <
, make sure you also define <=
explicitly.
BinPacking
type a member of the State
type class by adding an instance
declaration defining the outcome :: BinPacking > Bool
and choices :: BinPacking > (BinPacking, BinPacking)
functions (in an instance
declaration, Haskell does not require or allow type annotations, so use them for reference but do not add them explicitly).BinPacking
represents an outcome only if there are no more items left to allocate; each nonoutcome BinPacking
value has two possible choices on how to proceed next: by adding the item at the top of the item list to the left bin, or by adding the item at the top of the list to the right bin.
SuperString.hs
is defined as follows: find the shortest superstring that can be built from a sequence of arriving strings (a superstring is a string that contains all the strings in the sequence, with overlapping allowed). Variants of the shortest superstring problem are encountered in both DNA sequencing and data compression, among many other areas; the problem in its general form is NPcomplete.SuperString
, in which the current superstring and the remaining string sequences are captured under the constructor SuperStr
:
s :: SuperString
is better than another state s' :: SuperString
if the superstring built in s
is shorter. Add an instance declaration for the SuperString
type so that it is possible to use the builtin Haskell infix operators <
and <=
to compare two states according to superstring lengths.
SuperString
a member of the State
type class by adding an instance
declaration defining the outcome :: SuperString > Bool
and choices :: SuperString > (SuperString, SuperString)
functions. You are given a function merge :: String > String > String
for merging strings (order matters):
SuperString
represents an outcome only if there are no more strings left to merge; each nonoutcome SuperString
value has two possible choices on how to proceed next: by merging the string at the top of the string sequence to the superstring on the left side, or by merging it to the superstring on the right side.
Graph.hs
module is used to build state space graphs (finite or infinite). We represent the graph of possible states that an algorithm can traverse given a starting state using the polymorphic Graph a
data type:
mapTuple :: (a > b) > (a, a) > (b, b)
that takes a function of type a > b
and applies it to both elements of a tuple.
state :: Graph a > a
that takes a state space graph and returns the state at the root node of that graph. We provide an example of how this function can be used from inside the BinPacking.hs
module:
g :: Graph a
is better (i.e., less than) than another graph g' :: Graph a
if according to the ordering on states, the contents of g
are less than the contents of g'
. Add an instance declaration for the Graph a
type so that it is possible to use the builtin Haskell infix operators <
and <=
to compare two graphs (regardless of what the type a
of the states in the graph might be).
graph :: State a => a > Graph a
that takes a starting state and returns the full graph (possibly infinite) of all possible state space traversals starting at that state. An example of this function being used from inside the SuperString.hs
module is provided (spacing has been adjusted for legibility):
depths :: Integer > Graph a > [Graph a]
that returns a list of all the graphs that are at depth exactly n
within the supplied state space graph (the root node is at depth 0
).fold :: State a => (a > b) > (a > (b, b) > b) > Graph a > b
that takes two functions o :: a > b
and c :: a > (b, b) > b
and performs a fold operation over the last (third) argument: a state space graph g :: Graph a
.outcomes :: State a => Graph a > [Graph a]
that returns the list of all the leaf nodes of the supplied state space graph. Solutions must use fold
and must fit on one line; all other solutions will receive no credit.Algorithm.hs
, you will implement a small library for creating optimization algorithms; a function of type Algorithm a
takes a graph, chooses some descendant node in the graph, and returns the subgraph for which that descendant node is the root.
greedy :: Ord a => Algorithm a
that takes a graph as an input. It should choose and return the better of the two children of the root.patient :: Ord a => Integer > Algorithm a
that takes an integer argument n
and a graph as its two inputs, and chooses and returns the best descendant of the supplied graph that can be found at depth n
(an allocation is best if it is better than all others at that depth). If n
is 0
, the function should simply return the graph supplied to it as an input.optimal :: (State a, Ord a) => Algorithm a
that takes a state space graph, and returns the leaf node in the state space graph that has the best allocation. Hint: you can define this function in one lines.metaCompose :: Algorithm a > Algorithm a > Algorithm a
that takes two algorithms as arguments. It should apply the first algorithm to obtain a new subgraph, and then it should apply the second algorithm to that subgraph and return the result.
metaRepeat :: Integer > Algorithm a > Algorithm a
that repeats a given algorithm a specified number of times. Your definition of metaRepeat
should take three arguments having types Integer
, Algorithm a
, and Graph a
. If the integer is 0
, it should simply return the graph unchanged. If the integer is positive, it should apply the algorithm the specified number of times to the graph and then return the result.metaGreedy :: Ord a => Algorithm a > Algorithm a > Algorithm a
that takes two algorithms as its inputs. It should apply the two algorithms to its input graph and should return the better of the two results that it obtains.impatient
algorithm below is superior to patient
, and one way in which it is inferior.
fit :: Graph a > [Algorithm a] > Algorithm a
that takes a graph and a list of algorithms as inputs; this metaalgorithm should choose the algorithm in the list of algorithms that has the best performance on the given graph, and return it. Hint: define a way to compare strategies and use minimumBy
.interpret :: (State a, Ord a) => Alg > Algorithm a
function for embedded algorithm descriptions of type Alg
. Running the interpreter should produce exactly the algorithm described by the supplied data type value of type Alg
.Time
a member of the Num
type class so that numeric literals, +
, and *
can all be used to operate on values of type Time
. Adding infinity or multiplying by infinity should yield infinity.performance :: Alg > Time
function, which should compute the number of steps (i.e., state comparisons) that an algorithm might perform in the worst case. You may use the ^
operator for exponentiation of integers.Tree a
:
foldTree
function for values of type Tree a
, we first list all the constructors for Tree a
(we could also get this information using the :t
command in GHCi):
Node
and Leaf
in a tree with different value or function (for example, let's call Node
's replacement n
and Leaf
's replacement l
); this means that the replacement values and functions must have a type that has the same structure (i.e., they must take the same number of arguments) as the constructors they replace. However, they will fold into some new type of value b
, so we replace every instance of Tree a
with the new result type b
:
foldTree
function to take the above two arguments, and then the tree that is being folded as the last argument. Every Node
will be replaced with n
and every Leaf
will be replaced with l
. Notice that in the case of Node
, we also recursively fold the trees, since we need to turn them into the result type b
before we can apply n :: a > b > b > b
.
mapTree
function is just a special case of fold in which n x t1 t2 = Node (f x) t1 t2
for some function f
, and where l = Leaf
, since we want to change each value of type a
inside the tree, but not the nodes of the tree themselves. This also means we no longer need n
and l
as arguments; we only need f
.
mapTree f = foldTree (\x > Node (f x)) Leaf
(and we could have defined mapTree
in this way):
Node
is partially applied to its result, which means Node (f x)
is a function still waiting for two more arguments (the two subtrees). Alternatively but equivalently, we could have written the below definition:
Formula
(we could also get this information using the :t
command in GHCi):
And
's replacement a
and T
's replacement t
); this means that the replacement values and functions must have a type that has the same structure (i.e., they must take the same number of arguments) as the constructors they replace. However, they will fold into some new type of value b
, so we replace every instance of Formula
with the new result type b
:
foldFormula
function to take the above five arguments, and then the formula tree that is being folded as the last argument.
Bool
result, and one that evaluates the tree as a native Haskell Int
result:
foldFormula
function acts as a form of encapsulation for the Formula
data structure. Suppose we want to change the data type definition to the following (using a terminology for formulas that corresponds to terminology from the study of logic):
foldFormula
once, and would not need to change our implementations of evalAsBool
and evalAsInt
at all.
log2
is that the output 1
is ambiguous: it may be an actual result or an error. We can document that this result is exceptional by introducing a new parametric polymorphic data type that allows us to return either a result or a constructor indicating that there is no result. In the Haskell Prelude
, the Maybe a
data type is defined for such situations (although we could create our own, as well).
log2
.
neg
directly to the result of log2
because Maybe Integer
and Integer
do not unify, so the Haskell type inference rule for function application is not satisfied. To address this, we can create a parametric polymorphic map function for Maybe a
.
neg
to an input of type Maybe Integer
.
mapMaybe
can be described as "lifting" a function of type a > b
to a more powerful function Maybe a > Maybe b
, another common name for the map operation is lift.
Maybe a
.
 = 

 = 

"cat"
is the subject?
 ::= 
 
 
 
 

[???] 

[???] 

 ::= 
 
 ::= 
 



x
and y
, has no delimiters between statements, and is casesensitive):
 ::= 
 
 
 
  
 ::= 
 
 ::= 
 
 ::= 
 
 ::= 

Error.hs
, add an Eq
instance declaration for ErrorOr a
so that values of any type ErrorOr a
can be compared against each other using the ==
operator. Errors are never equivalent, even if they are identical; only results can be equivalent, and they are equivalent only if they are equal with respect to their own definition of equality.
Error.hs
, complete the definitions of liftErr :: (a > b) > (ErrorOr a > ErrorOr b)
and joinErr :: ErrorOr (ErrorOr a) > ErrorOr a
so that they allow users to create functions that can handle inputs or errors.
AbstractSyntax.hs
, complete the definitions of fold :: (Var > b) > (Bool > b) > (b > b > b) > (b > b > b) > Exp > b
and size :: Exp > Integer
to make it possible to calculate the size of an expression (expressions consisting of only a value or variable are of size 1
, and every operator node adds 1
to the sum of the sizes of its children). You must used fold
to implement size
in order to receive credit.
AbstractSyntax.hs
, implement a foldStmt
function for abstract syntax trees of type Stmt
. You must specify both the correct type for the function and its definition.Parse.hs
, identify how the functions in the module correspond to the concrete syntax definition, and what cases are missing in the parser definitions. Add the missing cases to the instance declarations so that the parsers can parse all concrete syntax String
inputs that correspond to syntactically valid expressions and programs according to the BNF notation.
Take advantage of left recursion elimination, the fact that Haskell unification gives you backtracking "for free", and the higherorder parser
and parsers
functions to write very short cases (about 5060 characters per line/case). See the hint below.
Parse.hs
, complete the definitions of tokenizeParse :: String > ErrorOr Stmt
. Remember to use the Error
module and see the hint below.
TypeCheck.hs
, complete the instance declarations that make it possible to use the check :: TypeEnvironment > a > ErrorOr Type
function to check the type of a Stmt
or an Exp
abstract syntax tree according to the type checking rules below. Extra credit: implement the check
function for expressions using fold
.


Interpret.hs
, complete the definitions of the evaluate :: ValueEnvironment > Exp > Value
and execute :: ValueEnvironment > Stmt > (ValueEnvironment, Output)
functions so that they implement an interpreter that conforms to the operational semantics specified below. You must use fold
to implement evaluate
in order to receive credit.


Interpret.hs
, complete the definition of interpret :: Stmt > ErrorOr Output
. Your definition should first check that the input program is welltyped according to the type system. If it is welltyped, it should be executed and the resulting output should be returned. Otherwise, interpret
should return the type error that prevented execution. Hint: write a oneline definition by "lifting" your own functions as well as library functions such as snd :: (a, b) > b
.
Exp
and Stmt
abstract syntax trees into machine code instruction lists of type [Instruction]
(defined in Machine.hs
). This machine language is a subset of the one used on previous assignments and exams (only including the instructions set a i, copy, and add), with the exception of the additional instruction mul, which works just like add (it takes the two arguments stored in memory addresses 1 and 2, and stores the product in address 0). Use the Program
wrapper to get a friendly representation of a machine language program that is compatible with the machine simulator.
Compile.hs
, complete the instance declarations to make Exp
and Stmt
members of the Compilable
type class by implementing compilation algorithms compile
for both types of abstract syntax trees. You do not need to keep track of the heap with auxiliary variables in this case because you can predict how much memory any particular expression or subexpression needs using the size
function. Also, remember that you have only two variables in total, so you may want to permanently assign memory addresses to each one.Compile.hs
, implement the function compileSimulate :: Stmt > ErrorOr Buffer
that type checks the Stmt
abstract syntax tree, and if that succeeds, it compiles and simulates it using the simulator in Machine.hs
.Validate.hs
a bounded exhaustive testing validation algorithm for the interpreter and compiler. Because this language has a fixed number of possible variables, it is possible to generate every abstract syntax tree of a certain height. Every leaf node (values, variables, or End
) is of height 1
, and every other node that has children is one level higher than its maximumheight child.Compile.hs
, complete the implementation of converts :: Output > Buffer
so that it correctly converts outputs into corresponding machine state buffers. You must use the Haskell function map :: (a > b) > [a] > [b]
to receive credit.Validate.hs
, complete the instance declarations to make Exp
and Stmt
members of the Exhaustive
type class. The exhaustive :: Integer > a
function should take an integer and return a list containing every abstract syntax tree having height equal to or less than that integer. You may find the concat :: [[a]] > [a]
function helpful.Validate.hs
, complete the implementation of the validate :: Height > Quantity > Bool
function. This function should compare the behavior of interpret
and compileSimulate
on at most k :: Quantity
welltyped inputs that have height as most n :: Height
. If they have different behaviors on any one of these inputs, the function should return False
; otherwise, it should return True
. Hint: a list comprehension is the easiest and most concise approach; add appropriate condition expressions and use pattern matching unificiation to only consider the cases you want.Validate.hs
, complete the implementation of complete :: String > ErrorOr Buffer
, which should parse the concrete syntax input string, check the type of the resulting abstract syntax tree, and compile and simulate it if it is welltyped.gsubmit
. This section reproduces and extends some of the instructions already made available by the BU Computer Science Department.csa
machines maintained by the CS Dept. You will need to physically visit the undergraduate computing lab located at 730 Commonwealth Avenue, on the third floor in room 302.csa
home directory. If you are using Windows, you will also need an SSH client (such as PuTTY).
local device

your csa2 /csa3 home directory 
your gsubmit directory for CS 320 
csa2
or csa3
using an SCP or SSH client and create a directory for your submission in your CS account home directory. Note that in the examples below %>
represents a terminal prompt, which may look different on your system.
local device

your csa2 /csa3 home directory

your gsubmit directory for CS 320 
csa2
or csa3
using an SCP client and copy your completed file(s) into that directory.
local device

⇒ 
your csa2 /csa3 home directory

your gsubmit directory for CS 320 
csa2
or csa3
using an SSH client and run the gsubmit
commands to copy the files from your CS account home directory to the gsubmit
directories to which the course staff has access.
local device

⇒ 
your csa2 /csa3 home directory

⇒ 
your gsubmit directory for CS 320

csa2/csa3
..py
. For example, suppose the following is contained within a file called example.py
:
python3
, python3.2
, python3.3
, etc. depending on the Python installation you're using). Note that in the examples below %>
represents a terminal prompt, which may look different on your system.
True
and False
.and
, or
, and not
.+
, *
, 
, and /
. The infix operator //
represents integer division, and the infix operators **
represents exponentiation. Negative integers are prefixed with the negation operator 
.==
, !=
, <
, >
, <=
, >=
are available.int()
function can convert a string that looks like an integer into an integer.'
or "
characters. Strings can be treated as lists of singlecharacter strings. Another way to look at this is that there is no distinction between a character and a string: all characters are just strings of length 1. Multiline strings can be delimited using """
or '''
(i.e., three quotation mark characters at the beginning and end of the string literal).''
or ""
.+
.len()
returns the length of a string.s[i]
). These characters are also strings themselves.[
and ]
, with the individual list entries separated by commas. Lists cannot be members of sets.[]
.+
.len()
returns the length of a list.a[i]
).in
relational operator.(
and )
, with entries separated by commas. The main distinction between lists and tuples is that tuples are hashable (i.e., they can be members of sets).()
.x
is denoted using (x, )
.+
.list()
function.tuple()
function.len()
returns the length of a tuple.t[i]
).in
relational operator.set()
..union()
and .intersect
correspond to the standard set operations.set()
function.list()
or list()
function, respectively.len()
returns the size of a set.in
relational operator.frozenset()
function.
{}
.list(d.keys())
.list(d.values())
.len()
returns the number of entries in the dictionary.d[key]
).example()
is defined as follows:
*
operator, the scatter operator, or the splat operator) involves providing a list to the function, preceded by the *
symbol; the arguments will be drawn from the elements in the list.
**
operator) involves providing a dictionary to the function, preceded by the **
symbol; each named paramter in the function definition will be looked up in the dictionary, and the value associated with that dictionary key will be used as the argument passed to that parameter.
for
clause. This will filter which combinations are actually used to add a value to the resulting list.
type()
can be used to determine the type of a value. Below, we provide examples of how to check whether a given expression has one of the common Python types:
if
, else
, and/or elif
), an iteration construct (i.e., a for
or while
loop), a return
statement, or a break
or continue
statement.def
keyword, followed by the name of the function or procedure, and then by one or more arguments (delimited by parentheses and separated by commas).
else
). The body of each branch is an indented sequence of statements.
if
). The body is executed over and over until the expression in the condition evaluates to False
, or a break
statement is encountered.
csa2/csa3
.Example.hs
should contain the Example
module definition). The module body consists of a series of declarations, which can appear in any order; the only exception is that if there are multiple declarations for a function, they must all be grouped together.
True
and False
.&&
, 
, and not
.if ... then ... else ...
is also available (similar to Python's ... if ... else ...
and C/C++/Java's ternary operator ... ? ... : ...
.Int
and an unboundedsize type Integer
.+
, *
, 
, /
, and ^
. The Prelude
function div
represents integer division. The Prelude
functions min
and max
make it possible to compute the minimum and maximum of two integers, respectively, and the function minimum
and maximum
make it possible to compute the minimum and maximum of a list of integers. Negative integers are prefixed with the negation operator 
.==
, /=
(not equal), <
, >
, <=
, >=
are available."
characters (individual characters are always delimited using '
). Strings can be treated as lists of characters.""
.++
.Prelude
function length
returns the length of a string.[
and ]
, with the individual list entries separated by commas.[]
.:
.++
.Prelude
function length
returns the length of a list.Prelude
function elem
(note that this relies on the ==
operation being defined on the type of the elements in the list).(
and )
, with entries separated by commas.()
.(e)
is equivalent to e
for any Haskell expression e
.