Logic and CombinatoricsIntroductory Logic and Combinatorics for Computer Science Applications


[link]  1. Introduction, Background, and Motivation

[link]  1.1. Overview

The physical laws in our universe appear to be predictable (i.e., they are consistent across time and space, and under similar circumstances, we can expect matter and energy to behave in similar ways). We can exploit this to make models and predictions about how events might unfold at a certain time and place in the universe, and we can test these predictions (we might call this the scientific method).
We can also exploit the predictability of physical laws by identifying what conditions match a particular goal we may wan to bring about, or task that we may want to accomplish, and then setting up initial conditions (such as inside a mechanical, electronic, chemical, biological, and/or computational device) that will eventually lead to those circumstances.
Many physical processes that currently understand and use to accomplish our tasks are based on only a small collection of building blocks (typically, these are collections of rules that are consistently followed by systems). Just as humans have developed symbolic systems (i.e., natural languages such as English) to name, describe, and think about day-to-day objects and processes in their environment, communities of people within the mathematics, applied mathematics, computer science, and related disciplines have developed symbolic systems for describing the simple building blocks according to which phenomena in our physical environment (including everything from mechanical devices, to physical computers, to software) seem to operate.
In this course, we will study a particular mathematical language that is well-suited for describing certain structures (as well as their properties and behaviors) that are routinely and widely employed within mathematical and computational modelling, data representation and storage, data processing, and other software applications. These structures include logical formulas, proofs, sets, trees, and graphs. We will also study the combinatorial properties of these structures, including how to define them recursively, how to quantify and compute their size, as well as how to categorize, count, and measure the ways in which they can be constructed, combined, and explored. We will become familiar with induction and recursive and self-referential definitions for these structures and their properties. We will also look at how these structures and associated techniques can be used in some practical applications.

[link]  1.2. Expected background, prerequisites, and course format

In covering the material for this course, we will use the standard language and conventions for discussing mathematical structures that have been developed by the mathematics and computer science communities. You should be familiar with high school level algebra and calculus (including mathematical functions of a single variable, e.g., f(x) = x2). We will introduce the terminology and concepts associated with boolean arithmetic, logic, set theory, graph theory, and combinatorics as part of the course material, but you should have some familiarity with logical operators (i.e., "and", "or", and "not"), at least within the context of a programming language, and the mathematical notion of an abstract set that contains zero or more elements.
The homework assignments in this course will involve the assembly of programs in the Python programming language, as well as the manual assembly of solutions to mathematical problems. A basic familiarity with programming is expected, but no previous knowledge of the Python programming language is expected. Relevant features of the Python programming language will be introduced as needed within the course material.

[link]  2. Boolean Arithmetic, Boolean Algebra, and Logical Formulas

In this section, we will introduce the fundamental building blocks of boolean arithmetic and logic, including the symbolic language used to represent logical formulas and proofs. Simultaneously, we will employ the modern programming language Python, together with some specialized libraries, to assemble and work with logical formulas and proofs.
As with most human languages that have developed organically over time, mathematics has a rich and often redundant vocabulary. We introduce terminology in this section that we will use consistently in this course. However, keep in mind that there are often other synonyms within mathematics and computer science for the concepts we encounter.
Those who may wish to review and practice the material in this section in the form of a board game can be accommodated.

[link]  2.1. Boolean formulas and boolean arithmetic

Definition: A boolean formula or formula is a string of symbols that follow a certain syntax. If the formula is written using a correct syntax, we say it is a well-formed formula (or WFF). The symbols ∨ (which represents the binary or operation, also called disjunction), ∧ (which represents the binary and operation, also called conjunction) , ¬ (which represents the not operation, also called negation), (which represents the implies relationship), and (which represents the if and only if or iff relationship) are boolean operators.
boolean formula conditions
always a well-formed formula
always a well-formed formula
f1f2 is a formula if f1 and f2 are both well-formed formulas
f1f2 is a formula if f1 and f2 are both well-formed formulas
¬ f is a formula if f is a well-formed formula
f1f2 is a formula if f1 and f2 are both well-formed formulas
f1 f2 is a formula if f1 and f2 are both well-formed formulas
f1 f2 is a formula if f1 and f2 are both well-formed formulas
( f ) is a formula if f is a well-formed formula
Notice that we can "build up" formulas from the basic building blocks ⊤, ⊥, and by combining existing formulas using the operators ∨, ∧, ¬, ⊕, , and . We can say that formulas are defined in a self-referential way, or inductively, because a formula is built up of other, smaller formulas. We will introduce a more formal definition of induction later, but it is accurate to say that the definition for well-formed formulas is inductive.
In the same way that we can assign meaning to sequences of symbols such as "sky", "tree", or "the girl threw the frisbee" by pointing to the physical objects and events they represent, we can assign meaning to each sequence of symbols that happens to be a well-formed formula.
Definition: There are only two possible meanings for a well-formed formula: true and false. The following table specifies how we can determine the meaning of a given well-formed formula.
boolean
formula
meaning
(true or false)
example using the representation
of the Python programming language
always true True
always false False
f1f2 true if f1 or f2 (or both) are true True or (False and True)
f1f2 only true if both f1 and f2 are true True and False
¬ f true if f is false not (True or (False and True))
f1f2 true if f1 or f2,
but not both, are true
True != False
f1 f2
  • if f1 is true, then f2 must be true
  • f1 is false, or f2 is true
  • f1 is "less than or equal to" f2
    (if ⊥ is 0 and ⊤ is 1)
True <= False
f1 f2
  • f1 is true if and only if f2 is true
  • f1 and f2 are either
    both true or both false
True == False
( f ) true if f is true (True and False)
Notice that like the definition of well-formed formulas, the process for assigning a meaning to a formula is self-referential, or recursive: to determine the meaning of a formula, we determine the meanings of the smaller formulas from which is it built and combine the results. We will introduce a more formal definition of recursion later, but it is accurate to say that the process of assigning a meaning to a formula is recursive.
Definition: The process of converting a well-formed formula to its meaning is often called evaluation (especially within the context of a programming language), because the sequence of symbols that comprises the formula is converted into one of two possible values: true or false.
It is worth noting that the terms inductive and recursive are often used interchangeably to refer to self-referential definitions, whether the definitions specify a method for "building up" a self-referential structure, or an algorithm for "breaking down" a self-referential structure.
Example: We can determine the meaning of a formula by working our way up from the innermost subformulas, replacing each subformula with its meaning. This process is guaranteed to terminate and produce the meaning of the formula. For example:
(¬ ⊥)
(⊤ ∨ (⊥ ⊤))
(⊤ ∨ (⊥ ⊤))
(⊤ ∨ )
    
Example: For each of the following formulas, determine its meaning.
  1. ⊤ ∨ ⊥
  2. ⊤ ∧ ⊤ ∧ ⊤
  3. ⊤ ⊕ ⊤
  4. (⊤ ⊤) ∧ (⊥ ⊤)
  5. (⊤ ∨ ⊥)
  6. (⊤ ∨ ⊥) (⊤ ⊕ ⊤)
  7. ⊤ ∧ (¬ ⊤)

[link]  2.2. Boolean algebra

Algebra of the integers and real numbers is often introduced as a method for solving equations with one or more unknown variables (in fact, this was historically the purpose for which some of the techniques we now call algebra were developed). Equations are either true or false; they are, in fact, formulas. Solving an equation over the integers, such as the one below, usually involves finding a value with which x can be replaced so that the formula is true.
x + 4
=
7
In the above example, if x is replaced with the integer 3, the equation is true. If we instead only allow our variables to have the values ⊤ and ⊥, we can ask the same question about any formula that contains a variable.
Definition: We can extend our definition of a well-formed boolean formula to also allow variables:
boolean formula conditions
x a well-formed formula if x is a variable name
Sometimes, we will want to write very long variables names; in those cases, we will write the variable in underlined italics. For example, a formula with two variables the sun is visible and it is daytime might look as follows:
the sun is visible it is daytime
Example: For each of the following formulas, determine what combinations of values for x and y ensure that the formula is true (in some cases, there may be no solution for x and y that would make the formula true).
  1. x ∨ ⊥
  2. yx ∧ ⊤
  3. ⊤ ⊕ y
  4. (x y) ∧ (y x)
  5. x y
  6. x x
  7. x ∧ (¬ x)
Example: Suppose we want to find what values for x and y make the following formula false:
(xy)
Note that the if and only if operator behaves just like equality; thus, we could instead find the values for x and y that make the following "equation" true:
((xy) ⊤)
Equivalently, we could find the values for x and y that make the logical negation of the original formula true:
¬ (xy)
Definition: When we assign some value, such as ⊤, to a variable x, we say that the value ⊤ is assigned to x or that x maps to ⊤, and we denote this using mathematical notation as:
x ↦ ⊤
Definition: Suppose we have a mapping that assigns multiple variables to values, such as the following:
{x1 ↦ ⊤, x2 ↦ ⊥, x3 ↦ ⊥, ..., xk ↦ ⊤}
We call such a mapping an assignment or a model.
Definition: Suppose that a formula f contains some variables x1, ..., xk. If we can find some assignment m of the values ⊥ and ⊤ to each of the variables such that substituting each variable in the formula with its assigned value results in a formula with the meaning true, we say that this particular assignment of values to x1, ..., xk solves or satisfies the formula f. We denote this fact using the following notation:
mf
Example: Consider the formula x ∨ (yz) and the assignment {x ↦ ⊤, y ↦ ⊥, z ↦ ⊥}. Replacing x with ⊤, y with ⊥, and z with ⊥ in x ∨ (yz) results in the formula:
⊤ ∨ (⊥ ∧ ⊥)
Since the above formula has the meaning true, we can say that {x ↦ ⊤, y ↦ ⊥, z ↦ ⊥} satisfies x ∨ (yz), which we denote:
{x ↦ ⊤, y ↦ ⊥, z ↦ ⊥} ⊨ x ∨ (yz)
When equations involving logical formulas are used to model possible facts about some system, the variables will often represent some state or property of the system.
Example: Suppose we have the following formula involving the variables the sun is visible and it is daytime:
the sun is visible it is daytime
This formula might describe a property of our real-world experience of a person that is in a particular fixed location on the surface of the Earth. We could state that the above formula is always true (i.e., it is always an accurate description of the system it describes). For every possible assignment of values to each variable, the above formula is indeed accurate, in that it is true exactly in those situations that might occur on Earth, and false in any situation that cannot occur:
the sun is visible it is daytime meaning interpretation
true a sunny day
false
true a cloudy day
true nighttime
In particular, only one set of values causes the formula to be false: if the sun is in the sky, but it is not daytime. This is indeed impossible; all the others are possible (it may be day or night, or it may be cloudy during the day). In other words, the assignments that represent a sunny day, a cloudy day, and nighttime all satisfy the formula the sun is visible it is daytime. For example:
{the sun is visible ↦ ⊤, it is daytime ↦ ⊤} ⊨ the sun is visible it is daytime
One way to find all the solutions for a boolean formula with one or more variables is to build a truth table that enumerates every possible combination of values that can be assigned to the variables, and then shows how the formula (and possibly its subformulas) evaluate under that assignment.
Example: Suppose we want to solve the following formula:
(zx) (yz)
We can build a truth table for the above formula.
x y z zx yz (zx) (yz)
Thus, there are four possible solutions for the formula (zx) (yz). For example, the assignment x ↦ ⊤, y ↦ ⊤, and z ↦ ⊥ is one possible solution.
Just as operators on the real numbers or integers have algebraic properties (such as x + y = y + x, or 2 ⋅ x = x + x, and so on), boolean operators also have algebraic properties that can be derived from their meaning. In fact, every formula that contains variables and is true for every possible assignment of values to those variables is technically an algebraic property. However, we will list several common algebraic properties.
Fact: We define ≡ to be equivalence of formulas: if fg for two formulas f and g, this means that f and g have the same meaning (i.e., under any given assignment of variables to values, both formulas are either true or false, but never different; in other words, their truth tables are identical).
Note that we do not need ≡, since would suffice for a presentation of the properties below. However, we introduce ≡ to emphasize that we are defining algebraic properties that govern formulas which may themselves contain .
Fact: The ∨, ∧, and ¬ operators have some familiar algebraic properties:
xy
yx
    
(commutativity of ∨)
(xy) ∨ z
x ∨ (yz)
    
(associativity of ∨)
x ∨ ⊥
x
    
(identity of ∨)
xy
yx
    
(commutativity of ∧)
(xy) ∧ z
x ∧ (yz)
    
(associativity of ∧)
x ∧ ⊤
x
    
(identity of ∧)
x ∧ (yz)
(xy) ∨ (xz)
    
(distributivity of ∧ across ∨)
¬(¬(x))
x
    
(invertibility of ¬)
Furthermore, ∨ and ∧ can be expressed in terms of one another with the help of  ¬:
¬(xy)
(¬ (x)) ∨ (¬ (y))
    
(De Morgan's Law)
Using the above, we can define ∧ in terms of ∨ by negating both sides:
xy
¬((¬ (x)) ∨ (¬ (y)))
Alternatively, we could define ∨ in terms of ∧ by negating x and y (e.g., negating x everywhere should not change anything, because the formula is true for any value of x, whether it is ⊤ or ⊥):
¬(¬(x) ∧ ¬(y))
xy
Example: If an algebraic property holds, this means that the final rows of the truth tables for the two formulas are exactly the same. For example, consider the following algebraic law:
¬(xy)
¬(x) ∨ ¬(y)
If we build truth tables for both formulas, we see that the final row in both tables is the same:
x y xy ¬(xy)
x y ¬(x) ¬(y) ¬(x) ∨ ¬(y)
Fact: The ⊕, , and operators have many algebraic properties. We present a few important properties. In particular, these operators can be defined using ∨, ∧, and ¬:
xy
(xy) ∧ (¬(xy))
x y
(¬(x)) ∨ y
(x y)
(xy) ∨ (¬(x) ∧ ¬(y))
Furthermore, can be defined in terms of , and is an equivalence relation:
x y
(x y) ∧ (y x)
x x
    
(reflexivity of )
x y
y x
    
(symmetry of )
(x y) ∧ (y z) (x z)
    
(transitivity of )
One important property of ⊕ is that taking ⊕ of a formula with itself always yields ⊥:
xx
Since ⊕, , and can all be defined in terms of ∨, ∧, and ¬, we could actually eliminate them completely without affected the power of our boolean formula language. Since ∨ and ∧ are also related, we could also eliminate one of these. The fact that ∧ and ¬ would be sufficient to express all possible logical formulas makes ¬ and ∧ functionally complete (i.e., capable of expressing all possible truth tables on any number of variables).
In fact, we can go even further; the following single boolean operator (also called the Sheffer stroke, or NAND) is functionally complete all by itself, because it can be used to define ∧ and ¬:
There are other such operators, such as NOR.
Note on operator precedence. Because different communities of mathematicians and computer scientists sometimes use different precedence conventions for logical operators, in this course we will include parentheses in formulas where the order of operations might otherwise be ambiguous. One common convention is that ¬ has highest precedence, followed by ∧ (since it can corresponding to multiplication when ⊤ is represented with 1 and ⊥ is represented with 0), then by ∨ (since it can corresponding to addition), then by (since it is often used to organize relationships and dependencies between "simpler" formulas involving operations with higher precedence), and then by ⊕.
Example: Using the programming language Python, we can represent a formula with variables as a function that takes boolean arguments and returns a boolean result. For example, suppose we have the following formula:
x ∧ (y ∨ (¬ (z)))
We could represent this as the following Python function:

def f(x, y, z):
  return x and (y or (not(z)))
        
We can then evaluate the above for a given assignment by calling the function f on some arguments. For example, if we want to evaluate the formula under the assignment {x ↦ ⊤, y ↦ ⊥, z ↦ ⊥}, we can make the following function call:

>>> f(TrueFalseFalse)
True
        
We can also use the Python unpacking operators to explicitly represent our assignment as a Python list or dictionary. For example:

>>> m = [TrueFalseFalse]   # Python list.
>>> f(*m)                      # Argument unpacking (also known as splat) operator.
True
>>> m = {'x'True'y'False'z'False}   # Python dictionary.
>>> f(**m)                                    # Keyword argument unpacking operator.
True
        

[link]  2.3. Boolean satisfiability, recursive backtracking, and applications

Finding an assignment or model that satisfies a boolean formula is a fundamental and ubiquitous problem in computer science. An immense variety of different real-world and theoretical constructs, systems, devices, concepts, and problems can be represented as boolean formulas (where the solutions to the formula might provide a solution to the problem, or might answer some question about the properties of whatever is being represented). In this section we define two versions of this problem, and describe one standard algorithmic approach for solving the problem.
Definition: We define an instance of the boolean satisfiability problem as follows: given a formula f, find an assignment (or model) m such that m satisfies f:
mf
Definition: We define an instance of the maximum boolean satisfiability problem as follows: given a formula f, find an assignment (or model) m such that m satisfies f, and such that m has at least as many variables assigned to ⊤ as any other model that satisfies f.

In other words, if we list all the models that satisfy f, we want the one that has the most variables mapped to ⊤ (or at least as many as any other, since more than one model may map the same number of variables to ⊤).
Unfortunately, unless something is known about the particular structure of the formula f, the problem of finding a model that satisfies a formula cannot be solved efficiently. Essentially, the best known algorithm is as fast as any algorithm that builds the entire truth table for the formula.
One standard way to implement an algorithm that exhaustively tries all possible combinations of assignments to a set of variables is to use recursive backtracking.
Algorithm (recursive backtracking): The recursive backtracking algorithm is a way to exhaustively try all possible solution to a problem by nesting an arbitrarily large number of loops inside each other using recursion. The following pseudocode describes a generic version of the recursive backtracking algorithm for some problem f for which we are seeking a candidate solution s.
  1. inputs: some function f, and some partial candidate solution s
    1. if s is a full-size candidate solution
      1. if s solves the problem f
        1. return the result s
      2. otherwise
        1. return nothing
    2. otherwise s is not a full-size solution and must be extended
      1. for every possible way that s can be extended to some s' that is closer to a full-size solution
        1. call this algorithm recursively on f and s'
Example: A simple example is the problem of opening a safe. Suppose that a safe combination consists of some sequence of the letters a, b, and c. We do not know the length of the sequence, or what sequence will unlock the safe. Suppose that there is a function that checks if the sequence is correct and unlocks the safe (we do not know how it works, but we know its name and that if it unlocks the safe, it returns True):

def unlock(password):
   # ... returns True of password is correct, False otherwise ...
   if password == 'abcbac':
     return True
   else:
     return False
        
We want to write a Python function that tries passing every possible string containing a, b, and c to unlock(). If we know the length of the safe combination (and if the length were short, e.g., three letters), then we could simply use nested for loops to try all the combinations and print the combination that unlocks the safe:

def three():
   for x in {'a''b''c'}:
     for y in {'a''b''c'}:
       for z in {'a''b''c'}:
         if unlock(x + y + z):
           print(x + y + z)
        
However, if we do not know the length of the sequence, we might want to make an arbitrary hierarchy of nested for loops to find the password:

def any():
   for x1 in {'a''b''c'}:
     for x2 in {'a''b''c'}:
       # ...
         for xN in {'a''b''c'}:
           # ...
             if unlock(x1 + x2 + ... + xN + ...):
               print(x1 + x2 + ... + xN + ...)
        
Of course, it is not possible to write the above explicitly. However, by using recursion, we can write a function that behaves exactly in this way: as an arbitrarily nested hierarchy of for loops. First, suppose we simply want to print all the combinations. We could do the following:

def any(guess = ""):
  print(guess)
  for c in {'a''b''c'}:
    newGuess = guess + c
    any(newGuess)
        
The above algorithm actually keeps trying combinations of all lengths. If we run the above, we'll see that it keeps printing combinations as they get longer and longer. How can we make this process stop at a certain length of password? We would need to check the length of our guess and not make a recursive call if the guess is too long. We can do this by using an if statement:

def any(guess = ""):
  if len(guess) <= 4:
    print(guess)
    for c in {'a''b''c'}:
      newGuess = guess + c
      any(newGuess)
        
Suppose we only want to print the guess if it actually unlocks the safe. We can do so in the following way:

def any(guess = ""):
   if unlock(guess):
     print(guess)
   else:
     for c in {'a''b''c'}:
       newGuess = guess + c
       any(newGuess)
        
Suppose that you knew the length of the combination would be supplied as an integer argument to the function. How would you modify the above to actually return the correct guess instead of just printing it? One way to do the latter is presented below.

def any(guess = ""):
   if unlock(guess):
     return guess
   else:
     for c in {'a''b''c'}:
       newGuess = guess + c
       result = any(newGuess)
       if result != None:
         return result
        
How could you modify the above to only try combinations of a specific length, and nothing longer?
Example: As a simpler example, suppose a password consists of a string of some length containing only the letter a, but we do not know its length.

def unlock(password):
   if password == "aaaaa":
     return True
   else:
     return False
        
We can write a recursive function that tries every possible length, and returns the working password once it is found:

def find(guess = ""):
   if unlock(guess):
     return guess
   else:
     return find(guess + "a")   # Return result of recursive call.
        
Notice that the above function find() will recursively call itself and "build up" its parameter guess until it is long enough to match the true password.
Now, suppose we are worried that the password might contain letters other than a, in which case the above function will keep calling itself recursively forever. We may want to set an upper bound on how long the guess can become, and then allow the function to fail gracefully by returning None:

def find(guess = ""):
   if unlock(guess):
     return guess
   elif len(guess) > 100:
     return None   # Give up.
   else:
     return find(guess + "a")
        
Example: Suppose we want to write a recursive algorithm that returns the largest integer in a list. We can do so by first splitting the list two two halves, and then finding the largest element in each half. We can then take the larger of these two results. This is guaranteed to give us the largest element in the list.

def largest(xs):
  if len(xs) == 1:
    return xs[0]
  else:
    # All elements from index 0 to len(xs)/2 - 1.
    leftHalf = xs[:len(xs)/2] 
    
    # All elements from index len(xs)/2 to the last element.
    rightHalf = xs[len(xs)/2:]
    
    return max(largest(leftHalf), largest(rightHalf))
        
Many real-world situations and devices, or systems, can be modeled mathematically by identifying a set of discrete system states in which that system can be (e.g., a lamp can either be "on" or "off").
In many cases, it is possible to model the various interesting, discrete properties of the system as variables (e.g., the variable the lamp is on can be used to keep track of whether the lamp is actually on or off by assigning the values ⊤ and ⊥ to it, respectively). Then, a given state can be modeled using assignments (or models) of values to variables (e.g., the model {the lamp is on ↦ ⊤} is a model of a system state in which the lamp is on, and {the lamp is on ↦ ⊤} is a model of the system state in which the lamp is off).
If a system can be described using a set of models of possible system states, we can then use formulas to describe logical constraints over (and relationships between) the various parts of the system. For example, we can then say that the only "allowed" or "possible" system states are those that satisfy a particular formula.
Example: Suppose we want to model a system consisting of a single light switch that can either be in an "on" position or an "off" position, and a light controlled by the switch, which may be either "on" or "off". We can introduce two predicates for describing this system state: switch and light, where the first represents the state of the switch, and the second represents the state of the light (in both cases, ⊤ is "on" and ⊥ is "off").
While there are 22 = 4 possible ways to assign values to these two variables, not all would correspond to a possible state of the system if the light switch works correctly. In particular, any model m of a state must adhere to the following:
mswitch light
In other words, the switch and light should always correspond to the same setting. The two system states that satisfy the above formula are:
m1
{switch ↦ ⊤, light ↦ ⊤}
m2
{switch ↦ ⊥, light ↦ ⊥}
Example: Note that there is no single correct way to model a situation using formulas. In an alternative form of the light switch example, we can introduce four predicates for describing system states: switch is off, switch is on, light is off, light is on.
While there are 24 = 16 possible ways to assign values to these four variables, not all would correspond to a possible state of the system if the light switch works correctly. For example, any model m of a state must adhere to the following:
mswitch is offswitch is on
In other words, the switch cannot be in two positions at once. Likewise, the light can only be in one state at a time:
mlight is offlight is on
Finally, the switch and light should always correspond:
m ⊨ (switch is off light is off) ∧ (switch is on light is on)
Thus, the two system states that satisfies the conjunction (i.e., ∧) all these formulas are:
m1
{switch is off ↦ ⊤, switch is on ↦ ⊥, light is off ↦ ⊤, light is on ↦ ⊥}
m2
{switch is off ↦ ⊥, switch is on ↦ ⊤, light is off ↦ ⊥, light is on ↦ ⊤}
Example: Suppose we want to model a system consisting of an airlock (i.e., a room with two doors, one leading outside and one leading into an enclosure). If the airlock's outer door is open, the inner door must be closed.
First, suppose we use outer door and inner door to represent whether the doors are open (⊤) or closed (⊥). A naive programmer implemented the airlock software so that every possible state m of the airlock door adheres to the following requirement:
mouter doorinner door
Does the above constraint satisfy the practical requirements of an airlock? If the airlock needs to allow someone to depressurize or pressurize before they exit the airlock, it should be possible to have both doors of the airlock closed at the same time. Is this possible under the given constraint? We can answer this question by determining if any state of the airlock satisfies the following formula:
m ⊨ (outer doorinner door) ∧ ¬(outer door) ∧ ¬(inner door)
In other words, is it possible for the constraint guaranteed by the programmer to be satisfied, and for the two doors to both be locked? If we try to solve the above boolean satisfaction problem for m, we will find that it has no solution (states that satisfy the formula are in green, and states that do not satisfy the formula are in red):
outer
door
inner
door
¬(outer door) ¬(inner door) outer door

inner door
overall
formula
⊤ (open)⊤ (open)
⊤ (open)⊥ (closed)
⊥ (closed)⊤ (open)
⊥ (closed)⊥ (closed)
How can this be fixed? Instead of using the ⊕ operator, the programmer should have used , so that every state of the airlock m should satisfy:
mouter door ¬(inner door)
Then, we can try to solve the following formula:
m ⊨ (outer door ¬(inner door)) ∧ ¬(outer door) ∧ ¬(inner door)
We see that the above formula does have a solution:
outer
door
inner
door
¬(outer door) ¬(inner door) outer door

¬(inner door)
overall
formula
⊤ (open)⊤ (open)
⊤ (open)⊥ (closed)
⊥ (closed)⊤ (open)
⊥ (closed)⊥ (closed)
Example: Suppose we want to improve our airlock by installing a fire detector inside the enclosure. The fire detector should ensure that someone in the airlock cannot go inside the enclosure if there is a fire detected inside.
A new programmer comes along and incorporates the functionality of the fire detector into the software, and now the airlock's software adheres to the following constraint:
m ⊨ (outer door ¬(inner door)) ∧ (fire ¬(inner door))
In other words, in addition to the first constraint, the inner door locks if there is a fire. Is there a potentially dangerous flaw in the software? One problem is that if there is a fire, it should not be possible for both doors to be closed because this may be dangerous (e.g., due to heat accumulation in the airlock). For concision, let us call the formula to which the software adheres f:
f
(outer door ¬(inner door)) ∧ (fire ¬(inner door))
We can then check whether the dangerous situation might occur by looking for states that adhere to the following:
mffire ∧ ¬(outer door) ∧ ¬(inner door)
We can examine the truth table to determine if there is a solution (states that satisfy the formula are in green, and states that do not satisfy the formula are in red, and we abbreviate some of the variable names):
outer inner ¬(outer) ¬(inner) fire outer
¬(inner)
fire
¬(inner)
⊤ (open)⊤ (open)⊥ (safe)
⊤ (open)⊥ (closed)⊥ (safe)
⊥ (closed)⊤ (open)⊥ (safe)
⊥ (closed)⊥ (closed)⊥ (safe)
⊤ (open)⊤ (open)⊤ (fire)
⊤ (open)⊥ (closed)⊤ (fire)
⊥ (closed)⊤ (open)⊤ (fire)
⊥ (closed)⊥ (closed) ⊤ (fire)
Thus, there exists an unsafe state. One way to avoid this is to modify the software so that it ensure that the outer door is open during a fire (i.e., that the formula fire outer door is always true).
Example: You are running a cloud computing service and receive several customer requests for a particular computing resource that fall into specific, fixed time windows:
  • request a: from 2 AM - 4 AM
  • request b: from 3 AM - 7 AM
  • request c: from 6 AM - 10 AM
  • request d: from 2 AM - 8 AM
  • request e: from 8 AM - 9 AM
  • request f: from 9 AM - 10 AM
Since the resource can only be allocated to one customer at a time, you cannot satisfy all the requests. How can you satisfy the maximum number of requests?
One approach is to assemble a collection of logical formulas that capture the fact that certain requests are mutually exclusive. If each request has a corresponding variable, then a model that assigns ⊤ to that variable would indicate that the request will be allocated, while a model that assigns ⊥ to that variable would indicate that the request will not be allocated. The formulas in this case would be:
a (¬(b))
a (¬(d))
b (¬(c))
b (¬(d))
c (¬(d))
c (¬(e))
c (¬(f))
If we take the conjunction (i.e., ∧) of all of the above formulas to make a new formula h, then any model m that satisfies the formula h would correspond to a possible allocation of the requests. If we choose the model m that allocates the maximum number of requests (i.e., if we solve the maximum boolean satisfiability problem), we can determine how to allocate the maximum number of requests.


[link]  2.4. Assignment #1: Boolean Arithmetic, Algebra, and Satisfiability

In this assignment you will define several Python functions for solving problems involving boolean algebra.

Your solution may not import any modules or employ any external library functions (unless the problem statement explicitly permits this). The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. Convert each of the following boolean formulas into their corresponding Python syntax. For each problem part, define an uppercase single-letter variable and assign the formula to that variable in a single assignment statement. For example, suppose the formula for part (f) were specified as follows:
    ⊥ ∧ (⊤ ∨ (¬ (⊥)))
    You would then add the following line to your Python file:

    F = False and (True or (not (False)))
              
    1. (⊤ ∨ ⊥)
    2. (⊤ (⊤ ∨ ⊥)) ⊕ ⊥
    3. (⊤ ⊕ (¬(⊤))) (¬ (⊤))
    4. (⊤ ⊥) ⊕ (⊥ ⊤)
    5. (⊤ ⊥) ∧ (¬(⊥) ¬(⊤))
  2. Convert each of the following boolean formulas that contain variables into a corresponding Python function. For each problem part, define a lowercase single-letter named function that takes one or more named arguments (one corresponding to each variable in the formula) and returns the result of evaluating that formula given those arguments. For example, suppose the formula for part (f) were specified as follows:
    z ∧ (x ∨ (¬ (y)))
    You would then add the following function definition to your Python file:

    def f(x, y, z):
      return z and (x or (not (y)))
              
    1. x y
    2. x (yz)
    3. ((x y) ∧ x) y
    4. (x y) (¬(x) ¬(y))
    5. ((x y) ∨ (w v)) ((xw) (z ∧ (yv)))
  3. In this problem, you will implement functions for printing truth tables for formulas with variables. You may use the following helper function, which prints a tab-delimited list of values.

    def prints(values):
      print("\t".join([str(value) for value in values]))
              
    The above function can be used as follows:

    >>> prints([TrueFalseTrue])
    True   False  True
              
    You may also use the following helper function, which returns a list of the argument names of a function:

    def variables(f):
        return list(f.__code__.co_varnames)
              
    The above function can be used as follows:

    >>> def h(x,y,z): return (y or x) and (not(z) <= x)

    >>> variables(h)
    ['x''y''z']
              
    1. Implement a function truthtableXY(f) that takes as its input a single function f (i.e., a Python function corresponding to a formula such as those you defined in Problem #2 above). You may assume f takes two boolean arguments x and y. The function should print a truth table for f.

      >>> def f(x,y): return x and y

      >>> truthtableXY(f)
      y      x      formula
      True   True   True
      True   False  False
      False  True   False
      False  False  False
                    
    2. Implement a recursive function truthtable(f) that takes as its first argument a single function f (i.e., a Python function corresponding to a formula). The function f may take any non-zero quantity of arguments. The function should print a truth table for f.

      >>> def h(x,y,z): return (y or x) and (not(z) <= x)

      >>> truthtable(h)
      x       y       z       formula
      True    True    True    False
      True    True    False   False
      True    False   True    False
      True    False   False   False
      False   True    True    True
      False   True    False   False
      False   False   True    False
      False   False   False   False
                    
      Your truthtable() function should employ the recursive backtracking approach, and can be organized as follows:
      • the function should have a second parameter values with a default value of [], which will be the list of values the function builds up and eventually passes to f;
      • if the list values is empty, the function should print a row containing all the variable names (one column header per variable);
      • if the list values is the same length as the list of variables of f, the function should print a row of values containing all the values in values, as well as the result of applying f to that list of values (use the *-operator to apply f to the list of arguments);
      • if the list values is shorter than the list of variables of f, the function should make recursive calls to truthtable(), with approprate changes to the arguments of truthtable().
    3. Implement a function rows(f) that takes as its first argument a single function f (i.e., a Python function corresponding to a formula). The function should return the number of rows in the truth table for f.

      >>> def h(x,y,z): return (y or x) and (not(z) <= x)

      >>> rows(h)
      8
                    
  4. In this problem you will implement an algorithm for solving the maximum boolean satisfiability problem. You will then use it to solve a simple instance of the maximum boolean satisfiability problem. You may use the following helper function, which counts the number of True values in a list:

    def count(values):
      return len([v for v in values if v == True])
              
    The above function can be used as follows:

    >>> count([TrueFalseTrueFalse])
    2
              
    1. Implement a recursive function solve(f) that takes as its first argument a single function f (i.e., a Python function corresponding to a formula). The function f may take any non-zero quantity of arguments. The function solve(f) should return a list of values that satisfies f and has the largest possible number of True values (as many as any other list of values that satisfies f).
    2. Suppose you need to transport some animals (a rhino, a lion, a zebra, a hyena, and a snake) across a river in a boat. Since certain animals are predators or may poison or step on each other, there are certain constraints in this situation that you want to follow, and you model these using boolean formulas:
      • if a snake is in the boat, then the hyena is not in the boat;
      • if a lion is in the boat, then the hyena is not in the boat;
      • if a rhino is in the boat, then the snake is not in the boat;
      • if a lion is in the boat, then the snake is not in the boat;
      • if a lion is in the boat, then the zebra is not in the boat;
      • if a snake is in the boat, then the zebra is not in the boat.
      Define a formula transport() containing five variables (rhino, lion, zebra, hyena, and snake). If a variable corresponding to a particular animal is assigned the value True, this represents that the animal is in the boat. The formula should capture all of the constraints.
      Define a variable maximum to be an integer that represents the largest number of animals that can be in the boat at the same time. You must use the solve() function in your definition of maximum.



[link]  3. Set Theory and its Relationships to Boolean Algebra

Set theory and boolean algebra were both derived from intuitive processes and techniques that humans developed for solving problems in their environment. Thus, both set theory and boolean algebra are languages for describing (in an abstract way that eliminates many of the concrete details) the same universal principles we observe in our environment. This also means that set theory and boolean algebra can be defined in terms of one another: it is possible to define boolean algebra using set theory, and it is possible to define set theory using boolean algebra. In this section, we will study and explore both directions of this relationship.

[link]  3.1. Sets, set operations, and their algebraic properties

Sets are an intuitive mathematical concept that can capture the idea of a collection of distinct items. Since every item in a set must be distinct, there can be no duplicates. Thus, it is not a good way to represent collections containing various quantiities of each distinct item; it is more suitable for representing collections of more "abstract" things, like concepts.
As with formulas, we will use sequences of certain symbols to describe sets. We define this symbolic language below.
Definition: The following table specifies various notations for sets, what they mean, and how they can be represented within the Python programming language.
set notation name meaning example in
Python
{a, b, c} set a set containing
four distcint elements
a, b, and c
{'a', 'b', 'c'}
empty set an empty set that
contains no elements
set()
AB union of
A and B
the set containing all
elements in A and all
elements in B
A.union(B)
A | B
AB intersection of
A and B
the set containing only
elements that are both
in A and in B
A.intersection(B)
A & B
AB set difference the set containing all
elements in A that are
not in B
A - B
A complement
of A
the set containing only
elements that are not in A
U - A
U universe the set containing
all possible elements
(depends on context)
must be defined
by user
Typically (but not always, so it is always a good idea to confirm this), all the set operations have the same level of precedence, and should be performed left-to-right (i.e., they are left-associative). For example:
ABCD
=
((AB) ∩ C) ∪ D
Example: Determine what elements are contained in each of the following sets:
  1. ∅ ∩ {a, b, c}
  2. ({a, b, c} ∪ {c, d, e}) - {c}
  3. {a, b, c} where U = {a, b, c, d, e, f}
  4. {a, b, c} ∩ {b, d, f} where U = {a, b, c, d, e, f}
  5. U where U = {a, b, c, d, e, f}
  6. A where A = {d, e, f} and U = {a, b, c, d, e, f}
Fact: The set operations have many algebraic properties. We list some of the ones we encounter frequently below:
AB
=
BA
    
(commutativity of ∪)
(AB) ∪ C
=
A ∪ (BC)
    
(associativity of ∪)
A ∪ ∅
=
A
    
(identity of ∪)
AB
=
BA
    
(commutativity of ∩)
(AB) ∩ C
=
A ∩ (BC)
    
(associativity of ∩)
AU
=
A
    
(identity of ∩)
A ∩ (BC)
=
(AB) ∪ (AC)
    
(distributivity of ∩ across ∪)
The ∪ and ∩ operations can be expressed in terms of one another with the help of the set complement operation:
AB
=
AB
    
(De Morgan's Law)

[link]  3.2. Sets of models

In this section we introduce a way to interpret (or, equivalently, understand, represent, or think of the meaning of) formulas as sets, and particularly sets of models. For the purposes of this discussion, we assume that the number of variables that can appear in our formulas is finite and fixed (i.e., it never changes once we agree on what the variables are). Having a finite, fixed number of variables immediately puts an upper bound on the number of distinct models that can exist, because each model is just a distinct way to assign a value (i.e., ⊤ or ⊥) to every one of the variables.
Definition: Suppose that we have a collection of k different models m1,...,mk (remember that each model is an assignment of variables to values that looks something like {x ↦ ⊤, y ↦ ⊥, ...}). Suppose also that each model satisfies some formula f, so that we can write:
m1
f
m2
f
m3
f
mk-1
f
mk
f
Then we say that the set of models {m1,...,mk} satisfies f, and we denote this using the following notation:
{m1,...,mk}
f
Example: Suppose that we have only two variables: x and y. There are 22 = 4 distinct models if there are only two variables:
m1
=
{x ↦ ⊤, y ↦ ⊤}
m2
=
{x ↦ ⊤, y ↦ ⊥}
m3
=
{x ↦ ⊥, y ↦ ⊤}
m4
=
{x ↦ ⊥, y ↦ ⊥}
Suppose we set our universe U to be the set of models {m1, m2, m3, m4}. Then consider the following three formulas contianing x and y:
f
xy
g
xy
h
x y
Notice that m1, m2, and m3 all satisfy f, but only m1 satisfies g, and only m1 and m4 satisfy h. We could write the following:
{m1, m2, m3}
xy
{m1}
xy
{m1, m4}
x y
But this means that we could interpret xy as the set of models {m1, m2, m3}, we could interpret xy as the set of models {m1}, and we could interpret x y as the set of models {m1, m4}. In other words, each formula gives us a way to specify which models we want to include in a set.

[link]  3.3. Set algebra of model sets and boolean algebra

Fact: Assume we are only working with models and formulas that contain a certain set of variables x1,...,xk. Suppose we have two model sets A and B that satisfy the two formulas f and g, respectively. In other words:
A
f
B
g
Then all of the following are true:
AB
fg
AB
fg
A
¬ f
In other words:
  • the union of two model sets satisfies the disjunction (or) of two corresponding formulas;
  • the intersection of two model sets satisfies the conjunction (and) of two corresponding formulas;
  • the complement of a model set satisfies the negation (not) of a corresponding formula.
Example: Suppose we have the following sets of models:
A
=
{{x ↦ ⊤, y ↦ ⊥}, {x ↦ ⊥, y ↦ ⊤}}
B
=
{{x ↦ ⊤, y ↦ ⊤}}
Use A, B, and set operations (union, intersection, difference, and compelement, but not ∅ or U) to build the largest model set you can that satisfies each of the following formulas:
  1. xy
  2. xy
  3. ¬(xy)
  4. x
There is exactly one possible set for each part, because there is always a largest set of models that satisfy the formula (if there were two largest sets that both satisfy a formula, they could be combined into a larger set that still satisfies the formula, since every model found in either set satisfies the formula). However, there may be more than one way to build that set using A and B. For each part below, we provide one or two of the ways to build that set.
  1. The set A satisfies xy.
  2. The set B satisfies xy.
  3. The set B satisfies ¬(xy). We could also say AB satisfies it (since it refers to the same set of models).
  4. The set can only contain models in which x is assigned ⊤. However, there is no way to isolate the model in A where x ↦ ⊤, so the best we can do is the set B, which contains one of the two models in which x is assigned ⊤.
  5. No models can satisfy ⊥, so we need to build the empty set. One example of an empty set is AB.
  6. All models satisfy ⊤, so we can obtain all the model sets using AA, or BB, among others.
Example: Suppose we are considering formulas with two variables x and y, and we have the following sets of models:
A
=
{{x ↦ ⊤, y ↦ ⊥}, {x ↦ ⊥, y ↦ ⊥}}
B
=
{{x ↦ ⊤, y ↦ ⊤}}
C
=
{{x ↦ ⊤, y ↦ ⊥}, {x ↦ ⊥, y ↦ ⊤}}
Find a formula that corresponds to each of the following model sets.
  1. B
  2. B
  3. BC
  4. ABC
  5. ACB
There are multiple solutions to each of the parts above. Below, we provide a few for each.
  1. Since both x and y are ⊤, some formulas that are satisfied include xy, as well as (x ⊤) ∧ (y ⊤).
  2. One formula is ¬(xy), and another is ¬(x) ∨ ¬(y).
  3. One formula is (xy) ∨ (xy) because xy is satisfied by B and xy is satisfied by C. Another formula is xy, since at least one of the two variables is always true in B or C.
  4. Since ABC contains all possible models, one formula that is satisfied is ⊤ (all models satisfy ⊤ because any assignment of values to variables results in ⊤ being true). Another formula could be xy ∨ ¬(x) ∨ ¬(y).
  5. A formula that is always false is not satisfied by any models. This can include formulas such as ⊥, as well as x (¬(x)) because a variable must always be equal to itself.
  6. First, we can break down the notation to determine what set it describes:
    AC
    =
    {{x ↦ ⊤, y ↦ ⊥}}
    AC
    =
    {{x ↦ ⊥, y ↦ ⊤}, {x ↦ ⊤, y ↦ ⊤}, {x ↦ ⊥, y ↦ ⊥}}
    ACB
    =
    {{x ↦ ⊥, y ↦ ⊤}, {x ↦ ⊤, y ↦ ⊤}, {x ↦ ⊥, y ↦ ⊥}}
    At this point, we see the set only omits {x ↦ ⊤, y ↦ ⊥}, so possible formulas include x y, as well as ¬(x) ∨ y.
Fact: Given a formula f, there is a unique (i.e., single) largest set S of models that satisfies f.
To understand why, let us assume that this is not true. Then there is some formula f for which there exist two sets A and B that are distinct, and yet we have:
A
f
B
f
But because all the models in A and B satisfy f, the above implies that:
AB
f
Thus, either A and B are the same set, or AB is larger than both of them, contradicting our assumption that A and B are each as large as any other set of models that satisfies f.
The boolean satisfiability problem is solved by finding any one model that can satisfy a formula, and the maximum boolean satisfiability problem is solved by finding any model that satisfies a formula, and has as many variables mapped to ⊤ as any other formula that satisfies a formula. What if we are interested in knowing the number of models that satisfy a formula?
Definition: The size of a set S is the number of elements in that set, and is denoted using |S|, where |S| is always an integer that is 0 or greater.
Example: Below, we show the sizes of some sets:
|{a, b, c, d, e}|
=
5
|∅|
=
0
Fact: Given a formula f, there is a unique (i.e., single) largest set S of models that satisfies f.
To understand why, let us assume that this is not true. Then there is some formula f for which there exist two sets A and B that are distinct, and yet we have:
A
f
B
f
But because all the models in A and B satisfy f, the above implies that:
AB
f
Thus, either A and B are the same set, or AB is larger than both of them, contradicting our assumption that A and B are each as large as any other set of models that satisfies f.
Definition: We define an instance of the counting boolean satisfiability problem as follows: given a formula f, find the total number of models that satisfy f. In other words, find the size |S| of the unique largest set S that satisfies the formula:
S
f
Knowing the number of models that satisfy a formula can be useful. It can allow us to get a better idea of how easy it might be to find a model that satisfies (i.e., solves) the formula, or how restrictive a formula is.
Fact: Suppose that we are working with formulas that can only contain any of the variables x1, ..., xk (so there are k distinct variables). Then, the size of the universe U of models that may or may not satisfy a formula has size 2k:
|U|
=
2k
Recall that for a given number of variables, we can determine the number of rows a truth table may have for a formula. We also know that the number of rows is equivalent to the largest possible set U that may satisfy a formula for that number of variables. But what about the number of formulas that only use variables from a collection of variables x1, ..., xk?
The number of well-formed formulas is infinite, because we can always build a larger formula. But for a given collection of variables x1, ..., xk, the number of distinct truth tables is finite. This is because all truth tables are of size 2k, so they are finite, and there is only a finite number of ways that we can assign a value to the formula for every one of those 2k rows in the truth table.
Example: Suppose we are working with the variables x and y. How many truth tables are there involving these two variables? We can list them all (there are two variables, so there are 22 = 4 rows in each table, and so there are 2# rows = 2(22) = 222 = 24 = 16 truth tables).
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
xy
Notice that no matter what well-formed formula we assemble that uses the variables x and y, we can always evaluate that formula to determine its meaning for each of the 22 = 4 possible assignments of values to variables (i.e., models), and since the meaning of the formula must be either ⊤ or ⊥, we will in the end have one of the truth tables above.
How can we compute the number of truth tables for a given number of variables? Notice that each truth table corresponds to a particular set of models that satisfy the formula (each row for which the formula is ⊤ corresponds to a model in the set of satisfying models).
Definition: Given a set B, we say that A is a subset of B if every element in A is also in B. In other words, if A - B = ∅, or if AB = A, then A is a subset of B. We denote this as follows:
A
B
Example: Below are some examples of subset relationships:
{b, c, d}
{a, b, c, d, e, f}
{a, b, c, d, e, f}
A
AB
AB
A
AB
B
Definition: Given a set S, the power set of S, which we denote as ℘(S), is the set of subsets of S. In other words, every possible subset of S is inside the set ℘(S), and each element in ℘(S) is a set.
Fact: Given a set S of size |S|, the size of the powerset is denoted |℘(S)|, where:
|℘(S)|
=
2|S|
This is because each possible element inside S is either in a subset, or it is not in a subset. Thus, since each of the |S| elements have two possible states (in and out), there are 2 ⋅ 2 ⋅ ... ⋅ 2 states in total (where we have |S| factors of 2 in the product).
Example: Below are some examples of power sets of specific sets:
℘({a, b, c})
=
{∅, {a}, {b}, {c}, {a,b}, {b,c}, {a,c}, {a,b,c}}
℘({a, b})
=
{∅, {a}, {b}, {a,b}}
℘(∅)
=
{∅}
Notice that the empty set is a subset of the empty set, so the power set of the empty set is 20 = 1.
Fact: Suppose we are working with formulas that contain the variables x1, ..., xk. Every truth table for these variables corresponds to a particular set of models (i.e., those that satisfy the formula and correspond to rows in the table for which the formula is ⊤). Every set of models must be a subset of the universe U for this collection of variables. Thus, all the possible sets of models correspond exactly to the elements of ℘(U), the power set of the universe U:
row in a truth table
    
corresp. to
    
a model (assignment of values to variables)
all rows in a truth table
    
corresp. to
    
the universe of models U
a particular truth table
    
corresp. to
    
a set of models, subset of U (rows for which formula is ⊤)
set of all truth tables
    
corresp. to
    
set of all sets of models, power set of U
Thus, the number of truth tables is exactly the size of ℘(U):
# truth tables
=
|℘(U)|
=
2|U|
=
2# rows
=
2(2k)
=
22k.
Example: Suppose we are working with the variables x and y. Each possible truth table for these variables corresponds to a particular subset of the rows (i.e., the rows that have a value of ⊤ in the formula column), and thus it corresponds to a particular subset of the universe. We can pair each truth table with the subset of the universe to which it corresponds.
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊤}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊤, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊤}}
xy
{{x ↦ ⊤, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊤}}
xy
{{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊤}}
xy
{{x ↦ ⊤, y ↦ ⊥},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊤, y ↦ ⊥}}
xy
{{x ↦ ⊥, y ↦ ⊤},
{x ↦ ⊥, y ↦ ⊥}}
xy
{{x ↦ ⊥, y ↦ ⊤}}
xy
{{x ↦ ⊥, y ↦ ⊥}}
xy
         
Example: Suppose we want to know the number of truth tables for the set of formulas that have up to 4 variables. We can compute it:
# truth tables
=
2# rows
=
2(24)
=
224
=
216
=
65536.
Example: Suppose you are implementing or maintaining a small social networking application with five members: Alice, Bob, Carl, Dan, and Eve. For any given instance (i.e., session) of the application running in a single browser window, the following constraints must be satisfied:
  • if Alice is logged in, Bob is visible and Dan is visible;
  • if Bob is logged in, Alice is visible and Dan is visible;
  • if Carl is logged in, Eve is visible;
  • if Dan is logged in, Alice is visible and Bob is visible;
  • if Eve is logged in, Carl is visible;
  • only one person can be logged in at a time.
Answer each of the following.
  1. How can we represent the above constraints using a formula? What is the formula?
  2. How many distinct models should satisfy the formula? In other words, in how many rows in the truth table should the formula column contain ⊤?
  3. Suppose you are running the server and database for this application, and the members are all allowed to either make themselves visible or not visible to one another. From your perspective, at any given time, any of the members might be logged in or logged out, and the visibility properties between the members might have different settings depending on their current preferences. The database keeps track of everything: whether each member is logged in, and what the visibility preferences are. How many different database configurations can exist?
  4. The above constraints suggest a particular set of relationships between the members: Alice, Bob, and Dan are all friends, and Carl and Eve are friends with each other, but no one else is a friend with anyone else. Suppose we wanted to use a single variable to represent each pair-wise relationship between members. How many variables would we need?
We know that each user must be either logged in or not logged in, and each user must be either visible or not visible. Since there are five users and two states to track for each user, we will use 2 ⋅ 5 variables: Alice logged in, Bob logged in, Carl logged in, Dan logged in, Eve logged in, Alice visible, Bob visible, Carl visible, Dan visible, and Eve visible. Thus, the size of our universe is 210 = 1024. We answer each part, with some discussion (there is sometimes more than one right answer).
  1. How we represent the constraints using a formula depends on what the unspecified constraints might be. If we assume that anything that is not mentioned explicitly in the constraints should not be restricted, then the above constraints would allow for the possibility that everyone can be visible to, e.g., Alice, it is just required that Bob and Dan must be visible. Then, we can represent each of the first five constraints as follows using a formula:
    • Alice logged in (Bob visibleDan visible)
    • Bob logged in (Alice visibleDan visible)
    • Carl logged in Eve visible
    • Dan logged in (Alice visibleBob visible)
    • Eve logged in Carl visible
    On the other hand, this is not necessarily what the person who wrote the constraints meant. Maybe by saying "Bob and Dan are visible", the person implied that the others should not be visible. Then we would need to write the constraints as follows:
    • Alice logged in (Bob visible ∧ ¬(Carl visible) ∧ Dan visible ∧ ¬(Eve visible))
    • Bob logged in (Alice visible ∧ ¬(Carl visible) ∧ Dan visible ∧ ¬(Eve visible))
    • Carl logged in (¬(Alice visible) ∧ ¬(Bob visible) ∧ ¬(Dan visible) ∧ Eve visible)
    • Dan logged in (Alice visibleBob visible ∧ ¬(Carl visible) ∧ ¬(Eve visible))
    • Eve logged in (¬(Alice visible) ∧ ¬(Bob visible) ∧ Carl visible ∧ ¬(Dan visible))
    The only way to determine which of the above is the "correct" interpretation of the constraints written in English is to ask the original author of the constraints. If we imagine an application in which everyone can set their own privacy settings, we might imagine the first interpretation is more correct, because we want everyone to be able to make their profile public if they wish (e.g., Alice must be able to see her friends Bob and Dan, but she might also be able to see Eve if Eve made her profile public).

    The last constraint is difficult to represent using only the well-formed formulas we have seen so far; we must do so by enumerating all the possibilities:
    • Alice logged in (¬(Bob logged in) ∧ ¬(Carl logged in) ∧ ¬(Dan logged in) ∧ ¬(Eve logged in))
    • Bob logged in (¬(Alice logged in) ∧ ¬(Carl logged in) ∧ ¬(Dan logged in) ∧ ¬(Eve logged in))
    • Carl logged in (¬(Alice logged in) ∧ ¬(Bob logged in) ∧ ¬(Dan logged in) ∧ ¬(Eve logged in))
    • Dan logged in (¬(Alice logged in) ∧ ¬(Bob logged in) ∧ ¬(Carl logged in) ∧ ¬(Eve logged in))
    • Eve logged in (¬(Alice logged in) ∧ ¬(Bob logged in) ∧ ¬(Carl logged in) ∧ ¬(Dan logged in))
  2. If users are not allowed to make their profile public, the only possibilities are that one of the five users is logged in, or that no one is logged in. In that case, there are 5 + 1 = 6 possible rows in the truth table.

    However, if users are allowed to set their profiles to be public or private, then computing the number of rows that satisfy the formula is more complicated (unless we do it automatically by simply submitting the appropriate formula to a solver for the counting boolean satisfiability problem).

    If no one is logged in but everyone controls their own public profile privacy settings independently, then there are actually 25 = 32 possible combinations (there are five users, and the profile for each user is either public or private). Then, if any of the users are logged in, it must be that there are a number of valid configurations. For example, consider the case in which Alice is logged in: Carl and Eve may or may not be visible to Alice, and all of these are allowed:
    Alice logged in Bob visible Carl visible Dan visible Eve visible
    Thus, for Alice, Bob, and Dan, there are 22 = 4 possible configurations of Carl's and Eve's profile privacy settings. From Carl and Eve's perspective, there are 23 = 8 possible configurations for Alice's, Bob's, and Dan's profile privacy settings. If we add up all the possibilities, we get:
    1 ⋅ 25 + 3 ⋅ 22 + 2 ⋅ 23
    =
    32 + 12 + 16
    =
    60
    Thus, there would be 60 rows in the table in which the column corresponding to the overall formula has the value ⊤.
  3. This question is simply asking about the number of possible configurations of the ten variables we have considered, which is the same as the size of the universe. We know that in this problem, the size of the universe is 210 = 1024, so there are 1024 database configurations.
  4. Since there are five users, we would have 4 + 3 + 2 + 1 = 10 pair-wise relationships: four relationships between Alice and the others, three more between Bob and the others (excluding his relationship with Alice, which we already counted), two more between Carl and the others (excluding relationships with Alice and Bob, which we already counted), and so on. Thus, we would need 10 variables if we had one variable for each pair-wise relationship.


[link]  3.4. Assignment #2: Algebraic Properties of Formulas and Sets

In this assignment you will define a collection of Python variables and functions involving formulas and sets.

Your solution may not import any modules or employ any external library functions (unless the problem statement explicitly permits this). The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. For each of the following formula definitions, use De Morgan's Law and other algebraic properties of boolean operators to convert it into an equivalent formula that uses only ∧ and ¬, and no other operators. You only need to include the new versions of these definitions (that use only ∧ and ¬); you do not need to include the original definitions.

    1. def oneA(x, y):
        return (x or (not y))
                    

    2. def oneB(x, y):
        return ((not x) <= (not y))
                    

    3. def oneC(x, y, z):
        return ((x and y) <= z)
                    

    4. def oneD(w, x, y, z):
        return (x or y) and (w or z)
                    

    5. def oneE(w, x, y, z):
        return ((not w) or (not x)) or (not (y and z))
                    

    6. def oneF(v, w, x, y, z):
        return (((v <= (not w)) and x) <= (y <= z))
                    
  2. Include the following set definitions in your Python file:

    U = {'a''b''c''d''e''f''g''h'}
    X = {'a''b''c'}
    Y = {'c''d''e''f'}
    Z = {'f''g'}
              
    For each of the following sets, use some number of set intersection, union, and difference operations to build the specified set using U, X, Y, and Z. You can only use set operations and the four provided set variables; you may not include any sets built using the brace notation { ... }. There may be more than one correct solution to each part; you only need to supply one correct solution for each part.
    1. Define a set twoA in your Python file that is equivalent to the set { 'f' }.

      twoA = ?
                    
      Your definition cannot contain any explicit sets; it must use set operations to build the set { 'f' } using U, X, Y, Z. It should then be possible to test your definition as follows:

      >>> twoA == { 'f' }
      True
                    
    2. Define a set twoB in your Python file that is equivalent to the set {'a', 'b'}.
    3. Define a set twoC in your Python file that is equivalent to the set {'d', 'e'}.
    4. Define a set twoD in your Python file that is equivalent to the set {'a', 'b', 'g', 'h'}.
    5. Define a set twoE in your Python file that is equivalent to the set {'c', 'd', 'e', 'h'}.
    6. Define a set twoF in your Python file that is equivalent to the set {'a', 'b', 'd', 'e', 'g', 'h'}.
    7. Define a set twoG in your Python file that is equivalent to the empty set set().
  3. For each of the following functions, use De Morgan's Law for sets to convert each function on sets into a function that returns the same result, but uses only set union and set difference operations, and does not use set intersection. You only need to include the new versions of these definitions (that use only set union and set difference); you do not need to include the original definitions. You may assume that U is always the set corresponding to the universe from which the elements in all the other sets are drawn.

    1. def threeA(U, X, Y):
        return X & Y
                    

    2. def threeB(U, X, Y):
        return (U - X) & (U - Y)
                    

    3. def threeC(U, X, Y, Z):
        return U - ((U - X) & (Y & Z))
                    

    4. def threeD(U, X, Y, Z):
        return (X & Y) | (Y & Z)
                    
  4. In this problem, you will implement Python functions for working with sets of models that satisfy particular formulas. You may include the following functions in your Python file: variables(f) returns the list of variables in the definition of a formula f, and universe(n) returns the set of all boolean value tuples of a certain length n.

    def variables(f):
        return list(f.__code__.co_varnames)
        
    def universe(n, values = tuple()):
        if len(values) == n:
            return {values}
        else:
            return universe(n, values + (True,)) | universe(n, values + (False,))
              
    1. Implement a function models(f) that takes as its input a single function f (i.e., a Python function corresponding to a formula). The function should return a set containing all the tuples that satisfy f. Hint: you could define a local variable M to be an empty set, then use a for loop to look through all the formulas in an appropriate universe, and add each tuple that satisfies the formula to the set using M.add().
    2. Implement a function satisfies(S, f) that takes as its input a set of tuples S and a function f that represents a formula. The function should return True if every tuple in S satisfies f, and False, otherwise.
    3. Implement a function equivalent(f, g) that takes as its inputs two functions f and g that represent formulas. If both formulas take the same number of variables and are equivalent (i.e., they have the same truth tables), then equivalent(f, g) should return True; otherwise, it should return False. Hint: use models() to write a one- or two-line definition for this function.
    4. Implement a function both(f, g) that takes as its inputs two functions f and g that represent formulas. It should return a set of tuples that contains only those tuples that satisfy both f and g. Your implementation must be at most two lines.
    5. Implement a function either(f, g) that takes as its inputs two functions f and g that represent formulas. It should return a set of tuples that contains only those tuples that satisfy either f or g. Your implementation must be at most two lines.
    6. Implement a function neither(f, g) that takes as its inputs two functions f and g that represent formulas. It should return a set of tuples that contains only those tuples that do not satisfy either f or g. Your implementation must be at most two lines.
  5. Suppose that Alice, Bob, Carl, Dan, and Eve are all trying to schedule a one-hour appointment with you. They each have one of two possible times that they can meet:
    • Alice can meet at 2 PM or 4 PM;
    • Bob can meet at 1 PM or 3 PM;
    • Carl can meet at 1 PM or 5 PM;
    • Dan can meet at 2 PM or 3 PM;
    • Eve can meet at 4 PM or 5 PM.
    How many different schedules exist in which everyone is scheduled for an appointment, and no appointments overlap? Define a single variable schedules to be an integer that corresponds to the number of different schedules that will satisfy everyone:

    schedules = ?   # Replace ? with appropriate function calls.
              
    Hints:
    • define a new formula function that takes ten arguments (one for each possible time and person) and uses logical operations (you only need ⊕ and ∧) to determine whether a schedule is possible by returning True or False;
    • by calling the functions you implemented in Problem #4 above, compute the set of tuples that satisfies the formula you assembled and use this to determine the number of possible schedules.



[link]  4. Logic and Formulas as Proofs

In this section we will formally define the notion of a proof and a theorem (a.k.a., a fact, lemma, and so on) using concepts that have been introduced in previous sections. In particular, we will define these as a certain kind of formula: a formula that is always true. We will also introduce logical rules that allow us to build such formulas.

[link]  4.1. Building true formulas

Suppose we are interested in building boolean formulas that are always true. In other words, we want to build formulas that are true under every possible assignment of the variables to truth values. We could say that such formulas f are satisfied by the universe (i.e., Uf); that is, these formulas can accurately model the properties of the universe of assignments.
One naive, brute force strategy might be to begin listing all possible well-formed formulas, and then checking their truth tables. We could throw away formulas that are not always true, and keep those that are always true. However, this process would be extremely inefficient.
Another approach might be to assemble a particular collection of algebraic laws that identify exactly those algebraic manipulations that can grow a formula but keep the formula's meaning true. In other words, we might want algebraic laws that explain to us how we can expand the simplest true formulas, such as ⊤, into larger and larger true formulas. We can then use these laws as building blocks to build larger and larger formulas that are always true, or even to more efficiently check if a particular formula is always true.
We will call these particular algebraic rules for formulas logical rules, or just logic. We can begin by introducing two formulas that are always guaranteed to be true.
Fact: The formula ⊤ is always true.
Fact: For any variable x, the following formula is always true:
x x
We can confirm this by consulting the truth table for this formula:
x x x
In fact, for any formula f, the following formula must be true:
f f
We can confirm this by observing that no matter what assignment of variables we consider, the meaning of f is the same on both the left and right side of the operator:
x1 x2 ... xk f f f
...?
...?
Regardless of what the meaning of f is in a particular row, f f will always be true.
Once we have a way to build simple formulas that are always true, we can expand these into larger formulas that are always true by converting them into formulas using transformations that are guaranteed not to make a formula that is already true into a formula that is false. Note that any transformation of the formula into an equivalent formula using algebraic laws satisfies this condition, but there are other transformations that can satisfy this condition, as well.
Definition: A logical inference rule (or simply inference rule) is a generalization of the notion of an algebraic law for equivalent formulas. Instead of requiring that the truth table for two formulas f and g must be the same, an inference rule only requires that for any row in the truth table for f in which f has a meaning of ⊤, the same row in the truth table for g must also be true. An inference rule is written using the following notation:
 f 
 g 
The formula f above the line is called the premise, and the formula g below the line is called the conclusion.
Example: The following is a valid inference rule:
 xy 
 xy 
We can confirm it is a valid inference rule by consulting the truth tables for the two formulas:
x y xy xy
Notice that in some rows, if the premise has the meaning ⊥, the conclusion has the meaning ⊤. But there is no row in which a premise has the meaning ⊤ but the conclusion has the meaning ⊥.
Note that if an inference rule holds in both directions for two formulas, then the formulas are equivalent.
Fact: Suppose that for two formulas f and g, both of the following inference rules are true:
 f 
 g 
 g 
 f 
Then we can be certain that:
f
g
This is because in any row in which f has meaning ⊤, g must also have the meaning ⊤ (according to the first inference rule), and in any row in which g has meaning ⊤, f must have the meaning top (according to the second inference rule):
f g
allowed by both inference rules
violates first inference rule
violates second inference rule
allowed by both inference rules
We introduce inference rules in addition to equivalence of formulas because we are working with special formulas in this section: formulas that are always true. For these kinds of formulas, what seems like a weaker restriction imposed by an inference rule is actually not weaker.
Fact: Suppose we have two formulas f and g and f is always true, and suppose that the following inference rule holds:
 f 
 g 
Then we can be certain that:
f
g
This is because f has the meaning ⊤ in all of the rows in its truth table, which means g must also have the meaning ⊤ (according to the inference rule) in all of its rows:
f g
allowed
violates the inference rule
violates condition that f is always true
violates condition that f is always true
We can now introduce some additional logical inference rules that can be useful for converting formulas that are always true into larger formulas that are always true.
Fact: For any three formulas f, g, and h, we have:
 f h 
 (fg) h 
We can confirm this by consulting the truth table for the two formulas above:
f g h f h (fg) h
Note that because ∧ is commutative, this inference rule can be combined with algebraic laws to yield:
 f h 
 (gf) h 
Furthermore, because ∧ is associative, this inference rule means that we can always introduce any number of formulas into a conjunction on the left-hand side of the operator:
 (f1 ∧ ... ∧ fm) h 
 (f1 ∧ ... ∧ fmg1 ∧ ... ∧ gn) h 
Example: Suppose that we believe the following relationships to hold between a light source and the time:
  • if the sun is in the sky, then it is daytime.
We can represent the relationship above using the formula:
sun is in the sky
it is daytime
Notice that if the meaning of the above formula is true, introducing additional constraints on the left-hand side of the operator can never make new formula false. In some cases, it would just be additional, irrelevant information:
(giraffes are yellowsun is in the sky)
it is daytime
What if the additional formula is false? This does not matter, either, because then the left-hand side of the operator is false, which automatically makes the whole formula true (i.e., ⊥ ⊤ and ⊥ ⊥ are always true):
(giraffes are redsun is in the sky)
it is daytime
This holds even in the following extreme example, where the left-hand side becomes contradictory because of what we introduce:
(¬(sun is in the sky) ∧ sun is in the sky)
it is daytime
Fact: For any three formulas f, g, and h, we have:
(fg) h     
     (fg) (hg)
In the form of inference rules, this would be:
 (fg) h 
 (fg) (hg
 (fg) (hg
 (fg) h 
We can confirm this by consulting the truth table for the two formulas above:
f g h (fg) h (fg) (hg)
Intuitively, this means that as long as a subformula appears on the left-hand side of the operator, we are free to remove or add it to the right-hand side of the operator. In other words, "we can always conclude that what we assumed is true, and we are also free to remove that conclusion".
Example: Suppose that we believe the following relationships to hold between a light source and the time:
(sun is in the sky it is daytime) ∧ sun is in the sky
it is daytime
Then we can add the fact that the sun is in the sky on the right-hand side of the outermost operator without changing the meaning of the formula:
(sun is in the sky it is daytime) ∧ sun is in the sky
it is daytimesun is in the sky
Fact (modus ponens): For any two formulas f and g, we have:
 (f g) ∧ f 
 (f g) ∧ fg 
The above can be simplified further into either or both of the following:
 (f g) ∧ f 
 fg 
 (f g) ∧ f 
 g 
We can confirm this by consulting the truth table for the two formulas above:
f g (f g) ∧ f (f g) ∧ fg fg g (as conclusion)
In fact, after inspecting the truth table, we may notice that in this case, we can make a more restrictive claim:
(f g) ∧ f
(f g) ∧ fg
fg
Example: Suppose that we believe the following relationships to hold between a light source and the time:
  • if the sun is in the sky, then it is daytime.
Then we intuitively know that if sun is in the sky, it must be daytime. The modus ponens inference rule captures this intuition (we introduce wider in this instance spacing for legibility):

 (sun is in the sky it is daytime) ∧ sun is in the sky
 
 
(sun is in the sky it is daytime) ∧ sun is in the skyit is daytime 

Below are the variants of the rule with more concise conclusions:
 (sun is in the sky it is daytime) ∧ sun is in the sky 
 sun is in the skyit is daytime 
 (sun is in the sky it is daytime) ∧ sun is in the sky 
 it is daytime 
Example: Suppose that we believe the following relationships to hold between a light sensor, a light source, and the time:
  • if the light sensor is activated, the sun is in the sky;
  • if the sun is in the sky, then it is daytime.
We can represent the two relationships above using two formulas:
sensor
sun
sun
daytime
We can use the facts we introduced in this subsection to build a formula that is always true, no matter what values we assign to the three variables in this formula. We can start with a formula that must always be true:
daytime
daytime
Using a known fact, we can introduce our two assumptions about the world into our formula, as well as sun, all on the left-hand side of the operator:
((sensor sun) ∧ (sun daytime) ∧ sundaytime)
daytime
Modus ponens and associativity of ∧ tell us that we can simplify a subformula in the above:
(sun daytime) ∧ sundaytime
(sun daytime) ∧ sun
Then we have the following formula, which is still always true:
((sensor sun) ∧ (sun daytime) ∧ sun)
daytime
Using a fact once more, we can introduce one more subformula on the left-hand side of the operator:
((sensor sun) ∧ sensorsun ∧ (sun daytime))
daytime
Using modus ponens and associativity of ∧ once again, we get:
((sensor sun) ∧ (sun daytime) ∧ sensor)
daytime
We have now built a formula that is still always true, but it expresses a relationship that is not as direct. In English, the above formula says: "given our two assumptions about the relationship between the light sensor, the sun, and the time of day, if the sensor is activated, it must be daytime". This formula is always true; there is no condition under which it can be false. This formula can be called a theorem. We can confirm this by building the truth table:
sensor sun daytime ((sensor sun) ∧ (sun daytime) ∧ sensor) daytime
Notice that in the rows where daytime is true, the overall formula is true, as expected. If daytime is false, then either sensor does not imply sun, or sun does not imply daytime, so the left-hand side of the operator is false, making the overall formula true.
In the example above, we used inference rules to ensure that the formula we were constructing would always be true. However, our approach might seem "backwards" in that particular example because the order of the transformations we performed on the formula do not correspond to the order in which we might intuitively reason about the relationships being modeled. Furthermore, copying the entire formula at every stop is burdensome. To address this, we will need to use a slightly different notation that can keep track of how our formula grows without requiring that we copy the entire formula over each time.

[link]  4.2. Proofs and theorems

In the previous subsection, we saw that proofs and theorems are both formulas. The distinction between a proof and a theorem is that it is possible to efficiently verify that a proof is indeed always true, while it may not be possible to verify a theorem is true without building the entire truth table for it (i.e., solving the counting boolean satisfiability problem and confirming that the total count is equal to the size of the universe).
We will introduce a slightly different notation for formulas that are proofs. This notation will make it easier to think of proofs as something we can "build" in a step-by-step manner.
Definition: A proof is a well-formed formula written as a vertical list of subformulas f1,...,fk where that looks as follows:
f1
f2
fk
This notation is equivalent to the following notations (note that these are all algebraically equivalent):
(f1f2 ∧ ... ∧ fk-1)
fk
f1 (f2 (f3 ( ... ( fk-1
fk) ... )))
If the above formula is always true (i.e., all rows in its truth table have the value ⊤ in the column corresponding to the overall formula), then we say the sequence of formulas constitutes a valid proof.
Typically, in a proof sequence f1,...,fk, it is required that after some point, each subformula fj must be derived using a simple inference rule from the formulas f1, ..., fj-1 that come before it. Effectively, this all means that the following formulas must also all be true. For example, the following might be the case for a proof:
f1
f2
(f1f2)
f3
(f1f2f3)
f4
(f1f2 ∧ ... ∧ fk-1)
fk
Sometimes, this is restricted even further: every fj must either be exactly the same as some previous fi where i < j, or it must be the direct conclusion of an inference rule for which the premises are the one, two, or three preceding formulas fj-1, fj-2, and fj-3. Effectively, this means all the following formulas might also be true:
(f1f2)
f3
(f2f3)
f4
(f3f4)
f5
(fk-2fk-1)
fk
If either of the above collections of formulas are true, we can be certain the proof is efficiently verifiable.
To see how each "subsequence" of formulas can correspond to a true formula in this way, we can consider the following. Suppose we have formula f, g, and h, and we are sure that the following is true (e.g., due to modus ponens):
fg
(fg) ∧ h
The following series of logical and algebraic manipulations ensures the formula being manipulated (found on the first line) remains true all the way to the end:
f
g
f g
g
    
using
    
a known fact
(fg) ∧ h
g
    
because
    
fg ≡ (fg) ∧ h
(fg) ∧ h
g h
    
using
    
a known fact
(fg) ∧ h
h
    
using
    
associativity of ∧ and a known fact
fg
h
    
because
    
fg ≡ (fg) ∧ h
Notice that we can summarize this as follows in English: "if f and g can be used to derive h, then the proof f g can be extended into the proof fg h".
Definition: An axiom is any subformula within a proof that we assume to be true. In other words, we do not require that the axiom is always true. For example, suppose we have three axioms a1, a2, and a3 in the following proof:
a1
a2
a3
f1
f2
fk
This means that we do not require that a1, a2, or a3 must always be true; they will only ever appear on the left-hand side of the operator. All the following would be required to be true, however:
a1a2a3
f1
(a1a2a3f1)
f2
(a1a2a3f1f2)
f3
(a1a2a3 f1f2 ∧ ... ∧ fk-1)
fk
Notice that because a1, a2, and a3 always appear on the left-hand side of , they do not necessarily need to always be true. If they are false, then all of the above formulas are true, because ⊥ ⊤ is always true.
Example: Suppose we have the following axioms governing several components on a Mars rover:
light sensor
sun is visible
sun is visible
battery charging
battery charging
communications work
battery charging
motors work
motors work
antenna can turn
((antenna can turncommunications work)
can call Earth)
Because these are axioms, we will assume they are always true and will not require that they be proven. Equivalently, we can say we simply will ignore any rows in a truth table in which any of the above are false.
Suppose we encode all of the above axioms within a single formula a (that is, f is a conjunction of all the axioms). Then we can begin to assemble some simple proofs.
a
light sensor
light sensor
The above is a simple proof corresponding to the formula (alight sensor) light sensor. It is true because of this fact and this fact. We can keep adding more to the proof:
a
    
axioms (no proof necessary, assumed to be true)
light sensor
    
light sensor sun is visible
    
exactly the same as one of the axioms
sun is visible
    
by modus ponens of the two subformulas immediately above
sun is visible battery charging
    
exactly the same as one of the axioms
battery charging
    
by modus ponens of the two subformulas immediately above
The above, slightly longer proof represents a formula that is always true. We note that each formula along the way (i.e., each subsequence of steps in the list) is also a true formula:
    
asensor           (sensor sun)
    
asensor ∧ (sensor sun)           sun
    
asensor ∧ (sensor sun) ∧ sun           (sun battery)
    
asensor ∧ (sensor sun) ∧ sun ∧ (sun battery)           battery
What is interesting about such proofs is that we can then use the equivalence version of modus ponens (i.e., (f g) ∧ fg ≡ (f g) ∧ f) along with the fact that a lot of things repeat (i.e., ff is equivalent to f) to get rid of steps while keeping the formula true. Suppose we have the following very long proof:
a
    
axioms (no proof necessary, assumed to be true)
light sensor
    
light sensor sun is visible
    
exactly the same as one of the axioms
sun is visible
    
by modus ponens of the two subformulas immediately above
sun is visible battery charging
    
exactly the same as one of the axioms
battery charging
    
by modus ponens of the two subformulas immediately above
battery charging motors work
    
exactly the same as one of the axioms
motors work
    
by modus ponens of the two subformulas immediately above
motors work antenna can turn
    
exactly the same as one of the axioms
antenna can turn
    
by modus ponens of the two subformulas immediately above
We can eliminate many of the steps above because they are redundant, or because they do not change the meaning of the formula:
a
    
light sensor
    
antenna can turn
    
This is very interesting, because it means that the formula represented by the above proof has the same meaning as the original proof, so like the proof, it must always be true. Thus, the following formula is always true:
(alight sensor)
antenna can turn
However, it is not immediately obvious that the above formula is always true. In fact, we would need to either rebuild the proof and check that each step was a valid logical or algebraic manipulation, or we would need to build the entire truth table and check all 27 = 128 rows (i.e., solve the counting boolean satisfiability problem for this proof). Thus, the above formula is a theorem (because it is always true), but it is not necessarily a proof, anymore (depending on what we consider to be an efficiently verifiable proof).
Fact: We can provide sufficient conditions for an efficiently verifiable proof as follows: a formula f is an efficiently verifiable proof if it can be converted into a vertical list in which each non-axiom step is either (1) an exact duplicate of an axiom or another step above it, or (2) a direct conclusion of applying modus ponens to the two steps immediately before it.
Example: For each of the following formulas, determine if it conforms to the definition of a proof:
  1. (abcde) c
  2. (a ∧ (b c) ∧ b) c
  3. ((b c) ∧ (a b) ∧ a) c
We discuss each of the formulas.
  1. Since c is an exact duplicate of a formula on the left-hand side of the operator, if the formulas on the left-hand side of are axioms then this is a proof.
  2. By modus ponens, we can obtain c from (b c) ∧ b (because (b c) ∧ b ≡ (b c) ∧ bc is an algebraic law). Thus, c is a direct conclusion of applying modus ponens to the two steps immediately before c, so this formula is a proof.
  3. This formula is a theorem, but it is not a proof. The final step c cannot be derived directly from the two previous steps using modus ponens, and it does not appear on the left-hand side of the operator.
Example: Suppose we adopt the same collection of axioms a is in the example above. However, suppose we are given a theorem and are asked to assemble a proof for it. For example, suppose we are given the following theorem:
(alight sensor)
communications work
In the form of a vertical list, this theorem would look as follows:
a
    
light sensor
    
communications work
    
Our task is to find a list of subformulas that conforms to our definition of a proof: each step is either an axiom, a duplicate of a previous step, or the consequence of applying modus ponens to the two previous steps.
Under these circumstances, one strategy we can employ is to work backwards. In particular, we know that because there is no exact duplicate of communications work anywhere in the theorem, this must be derived using modus ponens. Thus, we know that whatever the proof might contain, it must at least contain the following;
a
    
light sensor
    
    
battery charging communications work
    
exactly the same as one of the axioms
communications work
    
by modus ponens
However, the above tells us that we also need battery charging for modus ponens to work:
a
    
light sensor
    
    
battery charging
    
battery charging communications work
    
exactly the same as one of the axioms
communications work
    
by modus ponens of the two subformulas immediately above
But where can we get battery charging? Also from modus ponens, since it is again not an axiom or anywhere else in the vertical list of subformulas:
a
    
light sensor
    
    
sun is visible battery charging
    
exactly the same as one of the axioms
battery charging
    
by modus ponens of the two subformulas immediately above
battery charging communications work
    
exactly the same as one of the axioms
communications work
    
by modus ponens of the two subformulas immediately above
At this point, we again notice that sun is visible can only be obtained using modus ponens:
a
    
light sensor
    
light sensor sun is visible
    
exactly the same as one of the axioms
sun is visible
    
by modus ponens of the two subformulas immediately above
sun is visible battery charging
    
exactly the same as one of the axioms
battery charging
    
by modus ponens of the two subformulas immediately above
battery charging communications work
    
exactly the same as one of the axioms
communications work
    
by modus ponens of the two subformulas immediately above


[link]  4.3. Assignment #3: Theorems, Proofs, and Modus Ponens

In this assignment you will assemble a simple proof verifier and a short proof.

Your solution may not import any modules or employ any external library functions (unless the problem statement explicitly permits this). The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. For each of the following formulas, determine whether it is a theorem, an efficiently verifiable proof, or neither. In your Python file, define the variables oneA, oneB, oneC, oneD, oneE, oneF, and oneG to be either 'Theorem', 'Proof', or 'Neither'. For example, if the question had a part (h) and the formula for that part were a proof, you would include in your Python file the following line:

    oneH = 'Proof'
              
    In the above manner, answer the following.
    1. (xy) x
    2. ((y z) ∧ (x y) ∧ (w x) ∧ w) z
    3. x x
    4. (uvwxy) z
    5. ((u v) ∧ (w x) ∧ u ∧ (u v) ∧ v) (w x)
    6. (xy) (xy)
    7. ¬(x) ∨ (yz)
  2. For the remainder of this assignment, we will represent proofs in Python usings lists. Each step in the proof will either be a list such as ['x', 'y'] containing two items (representing an implication such as x y), or a single string such as 'z' (representing a predicate or variable such as z). For example, suppose we have the following proof:
    (zx ∧ (x y)) y
    We would represent this as the following Python list:

    ['z''x', ['x''y'], 'y']
              
    In this problem, you will implement a Python function that can verify that a proof supplied using the above representation is actually valid (i.e., that the proof represents a formula that is always true). Implement a Python function valid(axioms, proof) that takes two lists are arguments: axioms is a list of implications that are assumed to be true, while proof is a list of implications and predicates where each entry is:
    • an exact copy of something that appears in axioms or earlier in the list proof, or
    • a conclusion of modus ponens where the two entries in the list immediately before it are the two premises of modus ponens (the order of the two premises does not need to matter).
    If the input proofs satisfies the above conditions, then valid(axioms, proofs) should return True. Othewise, it should return False. Some example inputs and outputs are found below:

    >>> valid(['a'], ['a''a'])
    True
    >>> valid(['a''b'], ['c''d'])
    False
    >>> valid(['x', ['x''y']], ['x', ['x''y'], 'y'])
    True
    >>> valid([['a''b'], ['b''c'], 'a'], ['a', ['a''b'], 'b', ['b''c'], 'c'])
    True
    >>> valid([['a''b'], ['b''c'], 'a'], ['a''c'])
    False
              
    Hint: You can write a for loop that traverses a list xs as follows:

    for i in range(0, len(xs)):
      print(xs[i])
              
  3. You may include the following axiom list in your Python file:

    axiomsThree = [\
        ['user can log in as admin''user can change admin password'],\
        ['user can own files''user can delete files'],\
        ['user can log in as admin''user can set security policy'],\
        ['user can log in as admin''user can see files'],\
        ['user can set security policy''user can set file ownership'],\
        ['user knows admin password''user can log in as admin'],\
        ['user can set file ownership''user can own files'],\
        'user knows admin password'\
      ]
              
    Define a proof (i.e., a Python list) called proofThree in your file that concludes with the predicate 'user can delete files':

    proofThree = [\

        # Fill this in with valid proof steps.

        'user can delete files'\
      ]
              
    Your proof should be valid according to valid():

    >>> valid(axiomsThree, proofThree)
    True
              


[link]  4.4. Predicates, constants, and universal quantification

In this subsection, we generalize the notion of a predicate in order to make it possible to describe many properties of a system more concisely using logical formulas.
Recall that we can treat variables as predicates that represent particular binary (true or false) states of a system, object, or other entity being modelled. This is particularly reasonable within the context of proofs: proofs and theorems are formulas that are always true, no matter how we assign truth values to predicates. Thus, while a predicate might still technically "vary" in terms of its meaning (true or false), we might also work with it (e.g., in assembling proofs) without ever considering its meaning. In that case, the proofs would encode relationships between predicates, and not constraints on what values they can be assigned.
Definition: A domain of discourse, or simply domain (which we usually denote using D) is a set of distinct objects. Usually, predicates represent relationships between or properties of elements of the domain of discourse. We sometimes call the elements in the domain of discourse constants.
So far, we have used variables to represent particular states of a system whose properties we might be modelling using formulas. However, we often had variable names that shared structure. For example, Alice has an appointment at 1 PM and Bob has an appointment at 2 PM are two variables that have a common structure: ... has an appointment at .... We can make our reuse of such patterns within our formula variables more explicit by introducing predicates.
Definition: We can extend our definition of a well-formed boolean formula to also allow predicates that talk about elements in a domain of discourse. When elements from a domain of discourse appear inside a predicate, we usually call them terms.
boolean formula comments
example a predicate without any terms
example x example a predicate with one term x from D
x example y a predicate with two terms x, y from D
Typically, predicate, predicate instance, and predicate instantiation will refer to a predicate in which the slots are occupied by particular terms. We can think of each predicate instance as a distinct variable that may have the meaning ⊤ or ⊥.
Example: Suppose our domain of discourse is D = {1 PM, 2 PM, Alice, Bob} and we have as predicates x has appointment at y and x is busy where x and y are from the domain of discourse. We can build well-formed formulas such as the following (each of the examples below is a formula consisting of a predicate instance):
Alice has appointment at 2 PM
Bob has appointment at 1 PM
Bob is busy
However, notice that the following are also well-formed formulas given our definitions so far:
2 PM has appointment at 1 PM
1 PM has appointment at Bob
1 PM is busy
The above is not a flaw in our definitions. It is perfectly valid to ask whether 1 PM has appointment at Bob is a true formula; it is just desirable in this particular example to ensure that 1 PM has appointment at Bob must have the meaning ⊥ under any circumstances that we might consider to be an accurate model of a possible state of reality.
Fact: If we know the number of elements in the domain of discourse D and the predicates that are available, we can compute the total number of predicate instantiations (i.e., variables) that can be constructed. For example, for each predicate that has one slot, there are |D| possible instantiations; for each predicate that has two slots, there are |D| ⋅ |D| different predicate instantiations, and so on.
In general, of pi represents the number of distinct predicates with i slots, then we can say that the total number of predicate instantiations is:
# variables
=
p0 + (|D| ⋅ p1) + (|D|2p2) + (|D|3p3) + ... + (|D|ipi) + ...
We could write the above using summation notation:
# variables
=
Σi = 0 (|D|ipi)
We can go further and say that there are 2# variables possible models (i.e., rows in a truth table) for a given D and collection of predicates.
Example: Suppose our domain of discourse is D = {2 PM, Alice, Bob} and we have as predicates ... has appointment at ..., ... is a person, and ... is a time. That means there is one predicate that has two slots, and two predicates that have one slot. Thus, the total number of predicate instantiations (a.k.a., variables) we can have is:
# variables
=
(|D| ⋅ 2) + (|D|2 ⋅ 1)
=
3 ⋅ 2 + 9 ⋅ 1 = 15
We can confirm this by listing all the instantiations:
2 PM is a person
Alice is a person
Bob is a person
2 PM is a time
Alice is a time
Bob is a time
2 PM has appointment at 2 PM
2 PM has appointment at Alice
2 PM has appointment at Bob
Alice has appointment at 2 PM
Alice has appointment at Alice
Alice has appointment at Bob
Bob has appointment at 2 PM
Bob has appointment at Alice
Bob has appointment at Bob
We also know that there are 2# variables = 215 possible models (i.e., rows in a truth table) for any formula involving the above variables.
By introducing predicates, we see that a model (i.e., a particular assignment of values to instantiated predicates) becomes a way to specify the properties and relationships between the constants in a domain of discourse. Thus, models become a way to represent knowledge (i.e., concepts and their relationships), or information. For example, a database consisting of one or more tables can be represented mathematically as a collection of predicates, and a given model can represent which entries actually appear in a database (thus, each individual model would be one possible state of the database).
Example: Suppose our domain of discourse is D = {1 PM, 2 PM, Alice, Bob} and we have as predicates x is a person , x is a time and x has appointment at y where x and y are from the domain of discourse. We can use universal quantification to encode the fact that the predicate x has appointment at y can only be true for some x and y from D if it is also true that x is a person and that y is a time by requiring that the following universally quantified formula must always be true:
x, y from D,      x has appointment at y (x is a persony is a time)
If we call the above formula f, we have the following truth table:
x has appointment at y x is a person y is a time f meaning
a person with an appointment
meaningless in context, so false
meaningless in context, so false
meaningless in context, so false
a person without an appointment
meaningless in context, so false
meaningless in context, so false
meaningless in context, so false
Notice that in the cases in which the formula f is true, it is certain that in order for x has appointment at y to be true, x is a person and y is a time must both be true.
Definition: We can extend our definition of a well-formed boolean formula to also include univerally quantified formulas in which a variable appears that can be substituted with any element in the domain of discourse D.
boolean formula comments
x, f f is a formula with predicates in
which x can appear as a term
We read the ∀ symbol as "forall", and if a formula or subformula begins with ∀, we say the formula is universally quantified. The meaning of a universally quantified formula can be defined using conjunction. Suppose that D = {a, b, c, ...} is the domain of discourse. Then we have that:
f where x is substituted with af where x is substituted with b ∧ ...
x, f
Fact (∀-instantiation): For any domain of discourse D, any element in the domain of discourse a, and any formula f that might contain some predicates in which x is a term, we have the following inference rule:
 ∀ x, f 
 f with all occurrences of x substituted with a 
Example: Suppose our domain of discourse is D = {crow, pigeon} and we have as predicate ... is a bird. The the following are two examples of the ∀-instantiation inference rule being used:
 ∀ x, x is a bird 
 crow is a bird 
 ∀ x, x is a bird 
 pigeon is a bird 
Notice that the meaning of ∀ x, x is a bird is determined by the value assigned to the two possible variables that can be built: crow is a bird and pigeon is a bird:
pigeon is a bird crow is a bird x, x is a bird
The above suggests that we can simply view ∀ x, x is a bird as a short way of writing a conjunction of every possible instantiation of the predicate ... is a bird:
pigeon is a birdcrow is a bird
x, x is a bird
Example: Suppose our domain of discourse is D = {crow, pigeon, wug, ...} and we have the predicates ... is a bird and ... has wings. Suppose we have the following formula as an axiom:
    
x, x is a bird x has wings
    
wug is a bird
We can use ∀-instantiation together with modus ponens to assemble a proof stating that wugs have wings.
x, x is a bird x has wings
    
axiom
wug is a bird
    
axiom
wug is a bird wug has wings
    
by ∀-instantiation
wug has wings
    
by modus ponens
Predicates, universal quantification, and implication can allow us to represent category relationships in a more abstract (and a more concise) manner. Instead of having separate predicates such as ... is a bird or ... is an animal for each category, we can include the category names in the domain of discourse and use a single universally quantified formula that makes it possible to use category relationships to derive conclusions.
Example: Suppose our domain of discourse is D = {crow, pigeon, wug, animal, bird, ...} and we have as predicates all ... are ... and ... is a .... Suppose we have the following formula as an axiom:
    
x, a, b, (all a are bx is a a) x is a b
    
all bird are animal
    
wug is a bird
We can use ∀-instantiation, modus ponens, and associativity of ∧ to assemble a proof stating that wugs are animals.
x, a, b, (all a are bx is a a) x is a b
    
axiom
wug is a bird
    
axiom
all bird are animal
    
axiom
all bird are animalwug is a bird
    
exact duplicates of axioms, grouped using associativity of ∧
(all bird are animalwug is a bird) wug is a animal
    
∀-instantiation
wug is a animal
    
by modus ponens of the two steps immediately above

[link]  4.5. Infinitely large domains and logical induction

In the subsection above, we have seen how a collection of predicates can be used to represent the properties and relationships between a set of entities (i.e., the domain of discourse). However, writing down formulas that capture all the assumptions we may want to make (e.g., in a theorem or proof) about all the elements in a very large (or even an infinitely large) domain of discourse can be impractical or impossible. In this subsection, we look at one way in which we can address this difficulty.
Example: Suppose the domain of discourse is a set of strings (where every string consists of some number of copies of the letter a) and the set of all positive integers:
D = {a, aa, aaa, ...} ∪ {1, 2, 3, 4, 5, ...}
We introduce the predicates ... is a number and ... is a string. The following axioms allow us to derive that anything in the subset {a, aa, aaa, ...} is a string:
    
a is a string
    
s, s is a string sa is a string
The following axioms allow us to derive that anything in the subset {1, 2, 3, 4, 5, ...} is a number:
    
1 is a number
    
n, n is a number n+1 is a number
We could now constrain what constants from the domain of discourse are allowed in each slot of a predicate that specifies the length of a string: ... is of length .... We can do so by requiring that the following formula must always be true:
s, n, s is of length n (s is a stringn is a number)
Example: Suppose the domain of discourse is a set of strings (where every string consists of some number of copies of the letter a) and the set of all positive integers:
D = {a, aa, aaa, ...} ∪ {1, 2, 3, 4, 5, ...}
We want to use the predicate ... is of length ... to keep track of the sizes of strings. We can introduce two axioms that allow us to build up a proof about the length of any string:
a is of length 1
    
s, n, s is of length n sa is of length n+1
    
Notice that the above two cases actually correspond exactly to the two cases of a recursive implementation of a length function for strings:

def length(s):
  if s == 'a':
    return 1
  else:
    n = length(s[0:-1])
    return n+1
        
We can assemble a proof as follows to show that the particular string aaaa has the length 4:
a is of length 1
    
axiom
s, n, s is of length n sa is of length n+1
    
axiom
a is of length 1
    
duplicated step
a is of length 1 aa is of length 2
    
∀-instantiation
aa is of length 2
    
modus ponens
aa is of length 2 aaa is of length 3
    
∀-instantiation
aaa is of length 3
    
modus ponens
aaa is of length 3 aaaa is of length 4
    
∀-instantiation
aaaa is of length 4
    
modus ponens


[link]  4.6. Assignment #4: Review of Boolean Algebra and Logic

In this assignment you will complete a variety of problems that review the material we have seen so far in the course.

There is no programming in this assignment. However, please use Python syntax for boolean and set operations in order to make grading easier and to avoid any confusion or grading issues:
  • use and for ∧;
  • use or for ∨;
  • use not for  ¬;
  • use & for ∩;
  • use | for ∪;
  • use U - X for X.
You must show your work in your solutions.
  1. For each of the formulas in this problem, do the following:
    1. (xy) x
    2. xy
    3. x (¬(x))
    4. (xy) z
    5. ⊤ ∧ (¬(x))
    6. (xy) (xy)
  2. For each of the algebraic laws and inference rules below, either indicate that the law holds or the rule holds and explain why by referring to known facts (e.g., distributivity, De Morgan's laws, modus ponens, and so on), or find a single model that shows that the law or rule does not hold.
    1. x  ≡  ¬(x)
    2. ¬(x ∧ ¬(y))  ≡  x y
    3. x y  ≡  y x
    4. (xy) ∨ (zx)  ≡  ¬(¬(x) ∨ ¬(yz))
    5. The following inference rule:
       (¬(x) ∨ y) ∧ x 
       y 
  3. Suppose tha f, g, and h are well-formed formulas and that A, B, and C are three sets of models (where U is the universe) such that each set is the largest possible model set that satisfies its respective formula:
    A
    f
    B
    g
    C
    h
    Using A, B, C, U, and the set operations ∪, ∩, and −, write out the largest model set that satisfies each of the following formulas. Hint: in some cases, you may need to apply some algebraic laws for logical operations first.
    1. f ∨ ¬(h)
    2. f ∧ (gh)
    3. ¬(fgh)
    4. f g
    5. f ¬(f)
  4. Complete each of the following proofs so that each step in the proof is one of the following:
    • an exact duplicate of an axiom or a previous step;
    • an application of modus ponens using the two steps immediately above it;
    • a grouping of subformulas using associativity of ∧;
    • a ∀-instantiation;
    • a conversion of a base case and inductive step into a single universally quantified formula via the axiom of induction.
    Label each step you add with one of the above (you may use the roman numerals to make things more concise).
    1. Complete the following proof:
          
      b c
          
      d e
          
      d b
          
      c d
          
      c a
          
      a b
          
      a
          
          
      e
    2. Suppose the domain of discourse is D = {products, boxes, containers, warehouses}. Complete the following proof:
          
      x, y, z, (x are in yy are in z) x are in z
          
      products are in boxes
          
      boxes are in containers
          
      containers are in warehouses
          
          
      products are in warehouses
    3. Suppose the domain of discourse is the set of strings that are made using the two letters a and b. Complete the following proof:
          
      a is a string
          
      b is a string
          
      s, s is a string sa is a string
          
      s, s is a string sb is a string
          
          
      abaa is a string
    4. Suppose the domain of discourse is the set of positive integers D = {0, 0+1, 0+1+1, 0+1+1+1, ...}. Complete the following proof:
          
      0 is less than or equal to 0
          
      n, 0 is less than or equal to n 0 is less than or equal to n+1
          
          
      0 is less than or equal to 0+1+1+1+1
    5. Extra credit: Suppose the domain of discourse is the set of positive integers D = {0, 0+1, 0+1+1, 0+1+1+1, ...}. Complete the following proof:
          
      k, k is less than or equal to k
          
      m, n, m is less than or equal to n m is less than or equal to n+1
          
          
      0+1+1+1+1 is less than or equal to 0+1+1+1+1+1+1+1+1



[link]  Review #1. Boolean Algebra, Logic, and Induction

The following is a breakdown of what you should be able to do at this point in the course (and of what you may be tested on in an exam). Notice that many of the tasks below can be composed. This also means that some problems can be solved in more than one way.
  • boolean formulas, their meanings, and truth tables
    • determine whether a formula is a well-formed formula
    • determine the meaning of a formula without variables
    • solve for an assignment of values to variables that makes a formula true
    • build a truth table for a formula with variables
    • for a given model (assignment of variables to values), find the meaning of a formula
  • boolean algebra
    • common algebraic laws, and applying algebraic laws
      • commutativity, associativity, and identity of ∨ and ∧
      • distributivity of ∧ and invertibility of  ¬
      • De Morgan's laws
      • representing ⊕, , and using ∨, ∧, and  ¬
  • models, model sets, satisfaction of formulas, and common problems involving formulas
    • determine whether a given model satisfies a given formula
    • given a formula, find the set of all models that satisfy a formula
    • given a formula, solve the boolean satisfiability problem
    • given a formula, solve the maximum boolean satisfiability problem
    • given a formula, solve the counting boolean satisfiability problem
    • counting problems
      • determine the number of rows in a truth table for a formula, or a given number of variables
      • determine the number of possible truth tables for a given number of variables
  • using formulas to represent properties of "real-world" systems and entities
    • representing states of a system using boolean variables
    • representing constraints and relationships between system states using formulas
    • answering questions about systems and entities being modeled
      • finding a system state that satisfies a formula
      • finding a system state with the maximum number of variables assigned to ⊤
      • counting the number of system states that satisfy a formula
  • set theory, algebra of sets, and relationships to boolean algebra
    • perform computations involving set union, intersection, difference, and complement
    • find the size of a set
    • find the power set (or the size of a power set) of a set
    • apply algebraic laws of set operations
    • convert logical operations on formulas into corresponding set operations on model sets (and vice versa)
  • theorems, proofs, predicates, universal quantification, and induction
    • given a domain of discourse and collection of predicates, count the number of predicate instantiations possible
    • determine whether a well-formed formula is a theorem
    • determine whether an inference rule holds (i.e., ⊥ rows can become ⊤, but ⊤ rows must stay ⊤)
    • identify or apply common inference rules and logical (algebraic) laws
      • modus ponens
      • ∀-instantiation
      • induction
    • assemble a proof given axioms and a goal (using the representation involving a vertical list of subformulas)
    • determine whether a proof is valid
Exercise: Solve the maximum satisfiability and the counting satisfiability problem for the following formulas:
  1. x ∧ (¬(y))
  2. x (yz)
  1. The model {x ↦ ⊤, y ↦ ⊥} is the only model that satisfies this formula. Since this is the only satisfying model, this is also the one with the maximum number of variables assigned to ⊤, so it is the solution to the maximum satisfiability problem. Since there is only one satisfying formula, the solution to the counting satisfiability problem is 1.
  2. There are no models that satisfy ⊥, since ⊥ has the meaning false no matter what values are assigned to any collection of variables. Thus, there is no solution to the maximum satisfiability problem for this formula, and the solution to the counting satisfiability problem is 0.
  3. We can write out the truth table for this formula:
    x y z x (yz)
    From the above, we see that any model that contains two variables with the value ⊤ is a solution to the maximum satisfiability problem, such as {x ↦ ⊥, y ↦ ⊤, z ↦ ⊤}, and we see that the solution to the counting satisfiability problem is the number of rows for which the formula's meaning is true, which is 4.
Exercise: Suppose the domain of discourse is D = {Alewife, Kenmore, Park St.}. Complete the following proof:
    
x, y, z, (x can be reached from yy can be reached from z) x can be reached from z
    
Park St. can be reached from Alewife
    
Kenmore can be reached from Park St.
    
    
Kenmore can be reached from Alewife
The complete proof is as follows:
x, y, z, (x can be reached from yy can be reached from z) x can be reached from z
    
axiom
Park St. can be reached from Alewife
    
axiom
Kenmore can be reached from Park St.
    
axiom
(Kenmore can be reached from Park St.Park St. can be reached from Alewife)
    
associativity of ∧
(Kenmore can be reached from Park St.Park St. can be reached from Alewife) Kenmore can be reached from Alewife
    
∀-instantiation
Kenmore can be reached from Alewife
    
modus ponens
For the sake of the exercise, we could extend the problem with an additional axiom that shows that the reachability predicate is symmetric and prove the relationship in the other direction:
x, y, x can be reached from y y can be reached from x
    
axiom
x, y, z, (x can be reached from yy can be reached from z) x can be reached from z
    
axiom
Park St. can be reached from Alewife
    
axiom
Kenmore can be reached from Park St.
    
axiom
(Kenmore can be reached from Park St.Park St. can be reached from Alewife)
    
associativity of ∧
(Kenmore can be reached from Park St.Park St. can be reached from Alewife) Kenmore can be reached from Alewife
    
∀-instantiation
Kenmore can be reached from Alewife
    
modus ponens
Kenmore can be reached from Alewife Alewife can be reached from Kenmore
    
∀-instantiation
Alewife can be reached from Kenmore
    
modus ponens
Exercise: Suppose that we have the following relationships between two sets of models A and B (but subsets of the universe U), and three formulas f and g:
A
f
B
g
Determine what set of models satisfies the following formula:
f ⊕ (¬(g))
To solve this problem, we must first convert the formula into a form that uses only the logical operators ∧, ∨, and ¬, because there is no direct correspondence between ⊕ and a set operation. Thus, using the algebraic law that relates ⊕ to these three operators, we have:
f ⊕ (¬(g))
(f ∨ (¬(g))) ∧ (¬(f ∧ (¬(g))))
We can now convert each of the logical operators in the new formula (f ∨ (¬(g))) ∧ (¬(f ∧ (¬(g)))) to the corresponding set operations, being careful to also use the corresponding model setss:
(A ∪ (U - B)) ∩ (U - (A ∩ (U - B)))
Using the set complement notation, we could write:
(AB) ∩ (U - (AB))

[link]  5. Trees, Graphs, Measurements, and Probabilities

Trees and graphs are abstract concepts that are ubiquitous in computer science and applied mathematics (including many peripheral scientific disciplines that involve applied mathematics, such as engineering, biology, chemistry, physics, neuroscience, and so on). Trees and graphs allow us to represent networks of relationships that may exist between entities (whether they are neurons, computers, people, locations, abstract concepts, or other entities). They can also be used to represent flows (of materials, products, people, data, and so on) or other processes (such as growth of plants, how diseases propagate, how resources are allocated within an economy, or how algorithms explore a space of possibities to find a solution). As a result of their general applicability, graphs and trees are commonly found as data structures within software applications.
In this section we will use the concepts we have defined in previous sections on logic and induction to formally define the concept of a tree and a graph. We will also use logical induction to define different ways to measure trees and graphs along dimensions of interest (such as size, height, and so on). We will use recurrence relations to recursively define measurements of trees and graphs, and we will learn how to solve recurrence relations to obtain closed-form functions that can be used to compute these measurements.

[link]  5.1. Defining trees and their properties using logical induction

In this subsection, we demonstrate how logical induction can be used to define trees, and we show how these definitions can be converted into corresponding Python code.
Definition: We can define a binary tree using the following two induction axioms (one base case and one inductive step):
    
•   is a binary tree
    
t1, t2,      (t1 is a binary treet2 is a binary tree)           t1 /\ t2  is a binary tree
In the above, we omitted the usual underlined notation for predicates to improve legibility. Any tree that fits the above definition is called a binary tree because every node • in the tree has exactly two children, which are also known as subtrees. The top node in the tree is called the root node.
Example: We can convert the definition of a binary tree into a Python function that checks whether a given Python value is a binary tree. Suppose we represent a tree node using a Python tuple ( ... ) and subtrees as nested tuples. Then the following function checks whether a value consisting of nested tuples is a binary tree:

def is_binary_tree(t):
   if type(t) == tuple and len(t) == 0:
     return True
   elif type(t) == tuple and len(t) == 2:
     return is_binary_tree(t[0]) and is_binary_tree(t[1])
   else:
     return False
        
Definition: We can define a perfect binary tree using the following two induction axioms (one base case and one inductive step):
    
•   is a perfect binary tree
    
t,      t is a perfect binary tree           t /\ t  is a perfect binary tree
Notice that in a perfect binary tree, every node has two subtrees that are exactly the same.
Definition: We can define a degenerate binary tree using the following three induction axioms (one base case and one inductive step):
    
•   is a degenerate binary tree
    
t,      t is a degenerate binary tree           t /\  is a degenerate binary tree
    
t,      t is a degenerate binary tree           /\ t  is a degenerate binary tree
Notice that in a degenerate binary tree, every node has at most one subtree child and one leaf child.

[link]  5.2. Measurements of trees and recurrence relations

Given a definition for a set of trees, we can use induction to define various ways to measure the trees in that set along certain dimensions, sugh as size, height, width, and so on.
Definition: The following is the definition of the size (i.e., number of nodes) of a binary tree:
    
•   has size 1
    
t1, t2, n1, n2      (t1 has size n1t2 has size n2)           t1 /\ t2 has size n1 + n2 + 1
The following is the definition of the height of a binary tree:
    
•   has height 1
    
t1, t2, n1, n2      (t1 has height n1t2 has height n2)           t1 /\ t2 has height max(n1, n2) + 1
The following is the definition of the width of a binary tree:
    
•   has width 1
    
t1, t2, n1, n2      (t1 has width n1t2 has width n2)           t1 /\ t2 has width n1 + n2
Definition: The following is the definition of the size (i.e., number of nodes) of a perfect binary tree:
    
•   has size 1
    
t, n      t has size n           t /\ t has size n + n + 1
The following is the definition of the height of a perfect binary tree:
    
•   has height 1
    
t, n      t has height n           t /\ t has height n + 1
The following is the definition of the width of a perfect binary tree:
    
•   has width 1
    
t, n      t has width n           t /\ t has width n + n
Definition: The following is a recursive definition (i.e., a recurrence relation) of the relationship between the height of a tree and its size:
height_to_size(1)
=
1
height_to_size(n)
=
height_to_size(n-1) + height_to_size(n-1) + 1
The above can be simplified as follows:
height_to_size(1)
=
1
height_to_size(n)
=
2 ⋅ height_to_size(n-1) + 1
Example: We can implement a Python function that computes the size of a binary tree (i.e, the number of nodes):

def size(t):
   if type(t) == tuple and len(t) == 0:
     return 1
   elif type(t) == tuple and len(t) == 2:
     return size(t[0]) + size(t[1]) + 1
        
Example: We can implement a Python function that computes the height of a binary tree (the max() operator returns the larger of its two integer arguments):

def height(t):
   if type(t) == tuple and len(t) == 0:
     return 1
   elif type(t) == tuple and len(t) == 2:
     return max(height(t[0]), height(t[1])) + 1
        
Example: We can implement a Python function that computes the size of a tree given its height:

def height_to_size(n):
   if n == 1:
     return 1
   elif n > 1:
     return 2 * height_to_size(n-1) + 1
        

[link]  5.3. Solving recurrence relations for closed-form functions

Example: How can we find a closed-form function that defines the size of a degenerate binary tree in terms of its height? Recall that the relationship between height and size can be defined using the following recurrence relation:
f(1)
=
1
f(n)
=
f(n − 1) + 2
Thus, we are looking for a non-recursive definition of a function f that satisfies the above two equations for all positive integers n. One such f is:
f(x)
=
2 ⋅ x − 1
We can confirm this by substituting the above f into the recurrence relation:
f(1)
=
2 ⋅ 1 − 1
=
1
f(n)
=
2 ⋅ n − 1
=
2 ⋅ n − 2 − 1 + 2
=
(2 ⋅ n − 2) − 1 + 2
=
(2 ⋅ (n − 1) − 1) + 2
=
f(n − 1) + 2
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
b
f(n)
=
f(n − 1) + a
The the closed-form function that satisfies the two equations above is:
f(x)
=
(ax) + (ba)
Example: How can we find a closed-form function that defines the size of a perfect binary tree in terms of its height? Recall that the relationship between height and size can be defined using the following recurrence relation:
f(1)
=
1
f(n)
=
2 ⋅ f(n − 1) + 1
Thus, we are looking for a non-recursive definition of a function f that satisfies the above two equations for all positive integers n. One such f is:
f(x)
=
2x − 1
We can confirm this by substituting the above f into the recurrence relation:
f(1)
=
21 - 1
=
1
f(n)
=
2n − 1
=
2n − 2 + 1
=
2 ⋅ (2n-1 − 1) + 1
=
2 ⋅ f(n − 1) + 1
Example: How can we find a closed-form function that defines the size of a perfect ternary tree in terms of its height? For perfect ternary trees, the relationship between height and size can be defined using the following recurrence relation:
f(1)
=
1
f(n)
=
3 ⋅ f(n − 1) + 1
Thus, we are looking for a non-recursive definition of a function f that satisfies the above two equations for all positive integers n. One such f is:
f(x)
=
(1/2) ⋅ (3x − 1)
We can confirm this by substituting the above f into the recurrence relation:
f(1)
=
(1/2) ⋅ (31 - 1)
=
1
f(n)
=
(1/2) ⋅ (3n − 1)
=
(1/2) ⋅ (3n − 3 + 2)
=
(1/2) ⋅ (3n − 3) + 1
=
3 ⋅ ((1/2) ⋅ (3n-1 − 1)) + 1
=
3 ⋅ f(n − 1) + 1
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
cf(n − 1)
The the closed-form function that satisfies the two equations above is:
f(x)
=
cx − 1
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
cf(n − 1) + 1
The the closed-form function that satisfies the two equations above is:
f(x)
=
(1/(c − 1)) ⋅ (cx − 1)
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
cf(n − 1) + k
The the closed-form function that satisfies the two equations above is:
f(x)
=
cx − 1 + k ⋅ (1/(c − 1)) ⋅ (cx - 1 − 1)
Recall that logb(x/y) = logb(x) − logb(y), and that logb(1) = 0 because b0 = 1.
Example: How can we find a closed-form function that defines the height of a perfect binary tree in terms of its width? The relationship between width and height can be defined using the following recurrence relation:
f(1)
=
1
f(n)
=
f(n/2) + 1
Thus, we are looking for a non-recursive definition of a function f that satisfies the above two equations for all positive integers n. The f that satisfies this equation is:
f(x)
=
log2(x) + 1
We can confirm this by substituting the above f into the recurrence relation:
f(1)
=
log2(1) + 1
=
0 + 1
=
1
f(n)
=
log2(n) + 1
=
(log2(n) − 1 + 1) + 1
=
(log2(n) − log2(2) + 1) + 1
=
(log2(n/2) + 1) + 1
Example: How can we find a closed-form function that defines the height of a perfect ternary tree in terms of its width? The relationship between width and height can be defined using the following recurrence relation:
f(1)
=
1
f(n)
=
f(n/3) + 1
Thus, we are looking for a non-recursive definition of a function f that satisfies the above two equations for all positive integers n. The f that satisfies this equation is:
f(x)
=
log3(x) + 1
We can confirm this by substituting the above f into the recurrence relation:
f(1)
=
log3(1) + 1
=
0 + 1
=
1
f(n)
=
log3(n) + 1
=
(log3(n) − 1 + 1) + 1
=
(log3(n) − log3(3) + 1) + 1
=
(log3(n/3) + 1) + 1
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
f(n/c) + 1
The the closed-form function that satisfies the two equations above is:
f(x)
=
logc(x) + 1
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
af(n/c) + 1
The the closed-form function that satisfies the two equations above is:
f(x)
=
(1/(a − 1)) ⋅ (a1 + logc x − 1)
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
0
f(n)
=
cf(n/c) + n
The the closed-form function that satisfies the two equations above is:
f(x)
=
n ⋅ logc n
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
nf(n − 1)
The the closed-form function that satisfies the two equations above is the factorial function:
f(x)
=
x!
The number of edges that can exist in a graph with a certain number of nodes can be determined by defining a recurrence relation, as well.
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
0
f(n)
=
f(n − 1) + (n − 1)
The the closed-form function that satisfies the two equations above is:
f(x)
=
(x ⋅ (x − 1)) / 2
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
1
f(n)
=
f(n − 1) + 2 ⋅ (n − 1) + 1
The the closed-form function that satisfies the two equations above is:
f(x)
=
x2
Fact: Suppose we have a recurrence relation of the following form:
f(1)
=
c
f(n)
=
f(n − 1) + (an) + b
The the closed-form function that satisfies the two equations above is:
f(x)
=
a ⋅ ((((x + 1) ⋅ x)/2) − 1) + (b ⋅ (x − 1)) + c

[link]  5.4. Summations and recurrence relations

The mathematical notation for summation of terms parameterized by an integer index is a concise way to describe a mathematical term. It represents an algorithm (effectively, a loop) that can be run to obtain a result. Thus, it can be viewed as an alternative, iterative version of some recurrence relations. Any function defined using a summation over a finite range of integers can be converted into a recurrence relation.
Fact: Suppose we have some function g over the integers, and we also have a function f defined in terms of g in the following way:
f(x)
=
Σxi = 1 g(i)
Then we can convert the above into the following recurrence relation:
f(1)
=
g(1)
f(n)
=
f(n − 1) + g(n)
Fact: We have the following identity for any constant integer c:
Σxi = 1 c
=
cx
Fact: We have the following identity:
Σxi = 1 i
=
(x ⋅ (x + 1)) / 2
Fact: We have the following identity for any constant integer c:
Σxi = 0 ci
=
(1/(c − 1)) ⋅ (cx+1 − 1)
We can see this is true by multiplying the sum 1 + c + c2 + c3 + ... + cx by (c − 1):
(c − 1) ⋅ (1 + c + c2 + c3 + ... + cx)
=
(c + c2 + c3 + ... + cx + cx+1) − (1 + c + c2 + c3 + ... + cx)
=
cx + 1 − 1
Example: In the Python programming language, it is possible to write an expression that corresponds to mathematical summation notation. For example, suppose we have the following summation:
Σxi = 1 ci
The above summation corresponds to the following Python expression:

sum(c**i for i in range(1,x+1))
        
For a particular instantiated example, suppose we have the following:
Σ10i = 1 2i
The above summation corresponds to the following Python expression:

>>> sum(2**i for i in range(1,11))
2046
>>> 2**1 + 2**2 + 2**3 + 2**4 + 2**5 + 2**6 + 2**7 + 2**8 + 2**9 + 2**10
2046
        


[link]  5.5. Assignment #5: Trees, Graphs, and Recurrence Relations

In this assignment you will work with recurrence relations that describe measurements of trees and graphs.

You may import the following library function in your module:

from math import log
        
Your solution may not import any other modules or employ any external library functions. The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. Implement a Python function check(f, g) that takes two functions f and g as its two arguments. The function check() should return True if both f and g return exactly the same output on all integer inputs from 1 to 100 (inclusive). Otherwise, it should return False. You may use the check() function to check the correctness of your solutions to for the remaining problems.
  2. For each of the following recurrence relations, define a closed-form function using only variables, integers, +, -, *, // (integer division), ** (exponentiation), and log() (logarithm). In your Python file, define the functions twoA(x), twoB(x), and so on. For example, if the solution to part (h) were the closed-form function f(x) = x2, you would include in your Python file the following definition:

    def twoH(x):
      return x**2
              
    In the above manner, solve the following problem parts.
    1. The following recurrence relation:
      f(1)
      =
      7
      f(n)
      =
      f(n − 1) + 4
    2. The following recurrence relation:
      f(1)
      =
      4
      f(n)
      =
      f(n − 1) + (6 ⋅ n) − (3 ⋅ n)
    3. The following recurrence relation:
      f(1)
      =
      1
      f(n)
      =
      4 ⋅ f(n/2) + 1
    4. The following recurrence relation:
      f(1)
      =
      0
      f(n)
      =
      f(n − 1) + 10
    5. The following recurrence relation:
      f(1)
      =
      1
      f(n)
      =
      5 ⋅ f(n − 1)
    6. The following recurrence relation:
      f(1)
      =
      0
      f(n)
      =
      f(n − 1) + 5 ⋅ (n + 1)
    7. The following recurrence relation:
      f(1)
      =
      1
      f(n)
      =
      3 ⋅ f(n − 1) + 2
  3. For each of the following summations or descriptions, define a recurrence relation as a Python function. In your Python file, define the functions threeA(n), threeB(n), and so on. For example, if the solution to part (h) were the recurrence relation f(1) = 1, f(n) = f(n-1) + 2, you would include in your Python file the following definition:

    def threeH(n):
      if n == 1:
        return 1
      elif n > 1:
        return threeH(n-1) + 2
              
    In the above manner, solve the following problem parts.
    1. The following summation:
      f(x)
      =
      Σxi = 1 13
    2. The following summation:
      f(x)
      =
      Σxi = 1 i
    3. The following summation:
      f(x)
      =
      Σxi = 1 (2 ⋅ i + 3)
    4. Suppose we have the following relationship between the height and size of a perfect 6-ary (or "senary") tree in which each non-leaf node has six children:
      • a perfect senary tree of height 1 has 1 node;
      • a perfect senary tree of height n has six times as many nodes as a perfect senary tree of height n − 1, plus one more root node.
      Define the recurrence relation that captures that size of a senary tree in terms of its height n.
    5. Suppose a username must start with the @ character, but can contain some sequence of any of the 26 letters of the alphabet after that. Thus, @abc would be a username of length four. Define a recurrence relation that specifies how many possible usernames of length n are possible. Keep in mind your solution must be a recurrence relation, not a closed form function.
    6. Suppose we have the following relationship between the height and width of a perfect 3-ary (or "ternary") tree in which each non-leaf node has three children:
      • a perfect ternary tree of height 1 has width 1;
      • a perfect ternary tree of height n has three times the width of a perfect ternary tree of height n − 1.
      Define the recurrence relation that captures that width of a perfect ternary tree in terms of its height n.
    7. Define a recurrence relation that specifies the size (i.e., number of nodes) of a perfect ternary tree in terms of its width (that is, for an input tree width n, the value f(n) should be the size of the tree having that width).
  4. For parts (a) to (g) of Problem #3 above, determine the closed-form function that solves the recurrence relation you defined. Specify the function definitions in your Python module under the names fourA(x), fourB(x), fourC(x), and so on.


[link]  5.6. Permutations with repetition, permutations, and combinations

In this subsection we review a few common counting problems, and present algorithms that can be used to solve them.
Definition (permutation with repetition): A permutation of n distinct objects is a particular ordering of these objects. Given n distinct objects (e.g., characters or letters), if we are allowed to use any number of copies of each of those n objects, then the number of distinct sequences of length k consisting of these n objects that can be built is nk. This is known as the number of permutations with repetition.
Example: Suppose we are given an alphabet consisting of the characters a, b, and c. How many passwords of length 5 can we make if we are allowed to use any number of copies of each character?
There are 5 possible positions in the password, and each position can be filled with one of three distinct characters. Thus, the total number of passwords is:
3 ⋅ 3 ⋅ 3 ⋅ 3 ⋅ 3
=
35
=
243
Example: Suppose we are given an alphabet consisting of the characters a, b, and c. How many passwords of length at least 1 and at most 5 can we make if we are allowed to use any number of copies of each character?
We can take the sum of the number of 1-character passwords, the number of 2-character passwords, and so on:
31 + 32 + 33 + 34 + 35
=
Σ5i = 1 3i
Using a known fact about summations, we can find a closed form expression for the above:
Σ5i = 1 3i
=
5i = 0 3i) − 30
=
((1/(3 − 1)) ⋅ (36 − 1)) − 1
=
(1/2) ⋅ (729 − 1) − 1
=
363
Thus, we can use three distinct characters to make 363 distinct passwords between length 1 and length 5, inclusive. We can confirm this is true:
31 + 32 + 33 + 34 + 35
=
3 + 9 + 27 + 81 + 243
=
363
Definition (permutation): A permutation of n distinct objects is a particular ordering of these objects. There are n! (read as "n factorial") permutations of a collection of n distinct objects, where:
n!
=
n ⋅ (n − 1) ⋅ (n − 2) ⋅ (n − 3) ⋅ ... ⋅ 3 ⋅ 2 ⋅ 1
Example: Suppose we are given an alphabet consisting of the characters a, b, c, and d. How many passwords of length 4 can we make if each character must appear exactly once in the password?
We are in a situation where we must list all possible combinations, and we cannot reuse any character. Thus, if we have four possible character positions in our password, then the first position has one of four possible characters (that is, a, b, c, or d), and the next position has one of three characters (since we have used one up), and so on. Thus, the total number of passwords that we can build while satisfying this constraint is:
4 ⋅ 3 ⋅ 2 ⋅ 1
=
4!
Example: Suppose we are given an alphabet consisting of 26 characters. How many passwords of length 5 can we make if each character must appear at most once in the password?
We are in a situation where we must list all possible combinations, and we cannot reuse any character, but we are limited to 5 character positions. Thus, if we have 26 possible character positions in our password, then the first position has one of 26 possible characters, and the next position has one of 25 characters (since we have used one up), and so on. Thus, the total number of passwords that we can build while satisfying this constraint is:
26 ⋅ 25 ⋅ 24 ⋅ 23 ⋅ 22
=
 
26 ⋅ 25 ⋅ 24 ⋅ 23 ⋅ 22 ⋅ 21 ⋅ ... ⋅ 2 ⋅ 1
                    21 ⋅ ... ⋅ 2 ⋅ 1
 
=
 
26!
21!
 
Definition (combination): A combination of k distinct objects from a collection of n distinct objects is a particular subset of k objects (such that order does not matter). The number of of combinations of k objects from a collection of n objects is denoted and defined as:
(
 
n
k
 
)
=
 
          n!
(nk)! ⋅ k!
 
The above notation is read in English as "n choose k".
Example: Given a set of objects {A, B, C, D}, how many different subsets of size 2 are there?
Here are all ordered sequences of two elements without repetition:
AB, BA, AC, CA, AD, DA, BC, CB, BD, DB, CD, DC
The number of ordered sequences of the above form is:
 
4!
2!
 
=
4 ⋅ 3
=
12
Notice that we do not care about the order, so AB and BA should only count once. Thus, we divide the above by 2! to account for this (the number of ways to permute two elements) and get the distinct subsets:
{A,B}, {A,C}, {A,D}, {B,C}, {B,D}, {C,D}
The number of subsets is then:
 
4!     
(4-2)! ⋅ 2!
 
=
(
 
4
2
 
)
=
 
4 ⋅ 3 ⋅ 2 ⋅ 1
(2 ⋅ 1) ⋅ (2 ⋅ 1)
 
=
6
Example: Suppose we have a group of n people. How many different subgroups of k people can form? How many subgroups of people (including a group containing no people) form?
We know that the number of subgroups of size k is:
(
 
n
k
 
)
We also know that the number of subgroups is just the number of subsets of a set of size n, which is 2n. This also tells us that the following identity must hold:
2n
=
Σni = 0 (
 
n
i
 
)
Example: Suppose we have a graph with n nodes. How many different edges can be drawn between any two nodes in the graph if we are only allowed to draw one edge between each pair of nodes?
Since we can only draw one edge between each pair of nodes, we need to count the number of ways in which we can choose two nodes from a graph of n nodes:
(
 
n
2
 
)
=
 
          n!
(n − 2)! ⋅ 2!
 
=
 
n ⋅ (n − 1) ⋅ (n − 2) ⋅ ... ⋅ 1
         (n − 2)! ⋅ 2!
 
=
 
n ⋅ (n − 1)
     2!
 
=
 
n ⋅ (n − 1)
     2
 
Notice that the above corresponds to a recurrence relation solution and a summation identity we saw earlier.
Example: Suppose we have a collection of characters with some duplicate characters: a, a, a, b, b, c, d. If we do not distinguish between copies of the same character, how many distinct permutations (without repetition) are there of these 7 characters?
We know there are 7! permutations if we do distinguish between tha duplicate copies of a and b that we have. However, the total of 7! overcounts the actual number of distinct permutations. For example (if we relabel the b copies as b1 and b2 for clarity), the following two sequences are counted as two distinct permutations:
a, a, a, b2, b1, c, d
a, a, a, b1, b2, c, d
But the above is true for every possible sequence. Thus, we are counting 2! as many sequences (since there are 2! ways to arrange b1 and b2) as we want to count because we do not distinguish between the two copies of b. Likewise for the three copies of a, we are counting 3! as many sequences. Thus, we can divide these factors out of our overall computation:
 
7!     
3! ⋅ 2!
 
=
 
7 ⋅ 6 ⋅ 5 ⋅ 4 ⋅ 3 ⋅ 2 ⋅ 1
     (3 ⋅ 2 ⋅ 1) ⋅ (2 ⋅ 1)
 
=
7 ⋅ 6 ⋅ 5 ⋅ 2
=
420
Example: Suppose we want to implement a Python function groups(n, k, m) that takes three integers n, k, and m, and computes how many groups of people can be chosen from n people, where the size of the group is between k and m (inclusive).
Since a group of people always has some specific size (i.e., the number of people in the group), we know that we can simply add the counts for each possible size of group (e.g., the number of groups of size three plus the number of groups of size four, and so on). Thus, we can use the sum(...) function to implement our function as follows:

from math import factorial

# Helper function for combinations, to make things more concise.
def cmb(n, k):
  return factorial(n) // (factorial(n-k) * factorial(k))

def groups(n, k, m):
  return sum(cmb(n, i) for i in range(k, m+1))
        

[link]  5.7. Generating functions

Generating functions are polynomials used for counting particular kinds of combinations. In order to solve a particular counting problem, we choose an appropriate collection of generating functions and multiply them so that the desired solution is a particular coefficient in the resulting polynomial.
Definition: Consider any polynomial of the following form for some particular collection of coefficients a0, a1, a2, ...:
a0x0 + a1x1 + ... + anxn
We can also consider a polynomial that is an infinite sum for some particular collection of coefficients a0, a1, a2, ...:
a0x0 + a1x1 + ... + anxn + ...
Any such polynomial is called a generating function.
Example: The following is a simple generating function in which all coefficients are 1:
1 + x + x2 + x3
Example: Suppose you must are allowed to choose any three coins from an unlimited collection of quarters, nickels, and dimes. In how many different ways can you choose coins so that their total value is 40 cents?
The following generating function captures the values that a given coin can have:
x5 + x10 + x25
We are allowed to choose three coins, which means the coefficient of x50 in the following product of generating functions will represent our solution:
(x5 + x10 + x25)3
We can compute the above explicitly:
...
=
(x10 + 2x15 + x20 + 2x30 + 2x35 + x50) ⋅ (x5 + x10 + x25)
...
=
x15 + 3x20 + 3x25 + x30 + 3x35 + 6x40 + 3x45 + 3x55 + 3x60 + x75
Thus, there are 6 ways in which we can choose three coins from this collection so that their total value is 40.
Example: Suppose you throw two 6-sided dice (with labels 1, 2, 3, 4, 5, and 6). In how many different ways can the total from the dice be 9? If all possible rolls are equally likely (i.e., both dice are fair), what is the probability that the total is 9?
We can use the following generating function to represent each of the individual dice:
x + x2 + x3 + x4 + x5 + x6
Then, the coefficient of the x9 term in the following product would be our solution:
(x + x2 + x3 + x4 + x5 + x6)2
We can compute the above product explicitly:
...
=
(x + x2 + x3 + x4 + x5 + x6) ⋅ (x + x2 + x3 + x4 + x5 + x6)
...
=
x2 + 2x3 + 3x4 + 4x5 + 5x6 + 6x7 + 5x8 + 4x9 + 3x10 + 2x11 + x12
Thus, there are 4 different ways in which the resulting throw can add up to 9.
The total number of possible rolls is permutation with repetition of length two using the six numbers in the range from 1 to 6. Thus, there are 62 = 36 distinct rolls of the dice. Out of these, 4 can add up to 9, so the probability of getting a 9 is 4/36 = 1/9.
Fact (binomial theorem): The following identity holds:
(1 + x)n
=
(
 
n
0
 
)x0   +   (
 
n
1
 
)x1   +   (
 
n
2
 
)x2   +   ...   +   (
 
n
n
 
)xn
=
Σni = 0 (
 
n
i
 
)xi
Example: Suppose we want to count the number of ways in which we can choose any combination of two items from a set of three items.
Each item from the collection of three items can either be in the chosen combination (i.e., 1) or not in the chosen combination (i.e., 0). There are three such items. Thus, one way to restate the question is to say the following: suppose we must set three switches to either 0 or 1:
  • switch A: 0 or 1;
  • switch B: 0 or 1;
  • switch C: 0 or 1.
In how many different ways can we set the switches so that their total adds up to 2? We can model the possible switch settings using exponents in a generating function: x0 corresponds to 0, and x1 corresponds to 1. Then, note that x0 = 1 and x1 = x and consider the following product of generating functions:
(1 + x)(1 + x)(1 + x)
If we multiply the above out, we get:
111   +   11x   +   1x1   +   1xx   +   x11   +   x1x   +   xx1   +   xxx
Rewriting the above using exponents, we get:
x0 + 0 + 0   +   x0 + 0 + 1   +   x0 + 1 + 0   +   x0 + 1 + 1   +   x1 + 0 + 0   +   x1 + 0 + 1   +   x1 + 1 + 0   +   x1 + 1 + 1
If we collect up xk terms with the same powers (e.g., x2 + x2 + x2 = 3 x2), we get:
1 + 3 x + 3 x2 + x3
Notice that the fact that 3 is the coefficient of x2 means that there are three ways in which we can set the switches (i.e., exponents) so that they add up to 2. Thus, the coefficient of x2 is exactly the solution we wanted to our counting problem, which in this case is:
(
 
3
2
 
)
=
3
Fact: The following identity holds:
 
1     
(1 − x)
 
=
1 + x + x2 + x3 + ...
=
Σi = 0 xi
We can see this is true by multiplying the infinite sum 1 + x + x2 + x3 + ... by (1 − x):
(1 − x) ⋅ (1 + x + x2 + x3 + ...)
=
(1 + x + x2 + x3 + ...) − x ⋅ (1 + x + x2 + x3 + ...)
=
(1 + x + x2 + x3 + ...) − (x + x2 + x3 + ...)
=
1
Notice that this is just a variant of the summation identity and recurrence relation solution we saw earlier.
Fact (negative binomial theorem): The following identity holds:
 
1     
(1 − x)n
 
=
Σi = 0 (
 
n + i − 1
     i
 
)xi
=
Σi = 0 (
 
n + i − 1
n − 1
 
)xi
Example: Consider the following equation involving three non-negative integer variables a, b, and c:
a + b + c
=
120
For how many different assignments of non-negative integers to the variables a, b, and c is the above equation true?
Suppose we model the possible values of a using the exponents of the generating function 1 + x + x2 + x3 + ..., and we do the same for b and c. Then this counting problem can be solved by finding the coefficient of the x120 term in the result of the following product of generating functions:
(1 + x + x2 + x3 + ...)3
Using an identity, we can convert the above into:
(
 
1 − x
 
)3
=
 
     1
(1 − x)3
 
Using the negative binomial theorem, we then know that the coefficient of the x120 term is:
(
 
3 + 120 − 1
3 − 1     
 
)
=
(
 
122
2
 
)
=
 
     122!
(122 − 2)! ⋅ 2!
 
=
 
122 ⋅ 121
     2 ⋅ 1
 
=
61 ⋅ 121
=
7381
Thus, there are 7381 different ways to assign non-negative integers to a, b, and c so that a + b + c = 120.
Example: Suppose you throw three 100-sided dice (with labels 0, 1, ..., 99). In how many different ways can the total from the dice be 150? If every outcome is equally likely (i.e., the dice are fair), what is the probability that the total is 150?
We can use the following generating function to represent each of the individual dice:
1 + x + x2 + ... + x99
Writing out the above in its entirety is not practical, but we can use a known identity to rewrite the above more concisely:
1 + x + x2 + ... + x99
=
 
     1
(1 − x)
 
 
     x100
(1 − x)
 
=
(1 − x100) ⋅
 
     1
(1 − x)
 
If we want to count the number of ways in which the three dice can have a total of 150, we need to find the coefficient of x150 in the following product of generating functions:
(1 + ... + x99)3
=
(1 − x100)3
 
     1
(1 − x)
 
3
We can expand the first term:
...
=
(1 − 3x100 + 3x200x300) ⋅
 
     1
(1 − x)
 
3
We see that the only possible way to obtain x150 terms in the above is either when the 1 term or the -3x100 term is multiplied by an appropriate term in the right-hand side of the above product. Thus, we have the following coefficient:
1 ⋅ (coefficient of x150 in
 
1     
(1 − x)
 
3) − 3 ⋅ (coefficient of x50 in
 
1     
(1 − x)
 
3)
Using the negative binomial theorem to fill in the above, we get:
...
=
1 ⋅ (
 
3 + 150 − 1
3 − 1
 
) − 3 ⋅ (
 
3 + 50 − 1
3 − 1
 
)
=
11476 − 3 ⋅ 1326
=
7498
The total number of possible rolls is permutation with repetition of length two using the six numbers in the range from 0 to 99. Thus, there are 1003 = 1,000,000 distinct rolls of the dice. Out of these, 7498 can add up to 150, so the probability of getting a 150 is 7498/1,000,000 ≈ 0.007.
In the previous two examples, we used an infinite generating function in one case, but a finite generating function in the other case. Why did we make this distinction?
Fact: Suppose we have the following two generating functions (one is infinite, the other is finite and has a largest exponent of k):
f
=
1 + x + x2 + x3 + ...
g
=
1 + x + x2 + x3 + ... + xk
Suppose we take some exponent n of each of the two functions (that is, we compute fn and gn). Notice that for exponents less than or equal to k, the coefficients for the two functions will actually be exactly the same. However, for any exponent greater than k, the coefficients will be different (the coefficients in fn will be larger). As an example, consider the following:
(1 + x + x2 + x3 + ...)2
=
1 + 2x + 3x2 + 4x3 + ...
(1 + x)2
=
1 + 2x + x2
Notice that the coefficients diverge when the exponent is greater than the largest exponents of the finite generating function.
Thus, in situations in which we must model some choice of a number of integer values, we must ask: is the exponent corresponding to the coefficient we need greater than the largest possible integer we can choose? If it is, then we must use a finite generating function. If it is not, we can take a shortcut and just use the infinite generating function.
Example: Suppose we have a tree in which each non-leaf can have either two or three branches. Given this constraint, how many possible trees of height 3 have width 6? We know that there are at least two such trees: (1) a root node with three branches, with a two-branch tree at the end of each branch, and (2) a root node with two branches, with a three-branch tree at the end of each branch. Is there a more systematic way for us to count the number of trees with a certain width?
We can use products of generating functions to keep track of how many trees there are of each width. For example, a tree of height 2 can have either width 2 or width 3, and we can represent this using the following generating function:
x2 + x3
The above indicates that there is one tree having width 2 (the coefficient 1 of the term x2), and one tree having width 3 (the coefficient of the term x3). The number of trees having width 1 is represented using the following generating function:
x
Suppose we have a generating function g that represents the number of trees of each width of some height n. How can we build a generating function that represents the number of trees of each width having height n+1? The following generating function will accomplish this:
g2 + g3
In other words, we can introduce a new root node with two subtrees, or a new root node with three subtrees. Suppose we start with x, which represents the number of trees having width 1:
g
=
x
Then, using the formula g2 + g3, we obtain:
x2 + x3
If we set g = x2 + x3 and apply the same calculation again, we get:
(x2 + x3)2 + (x2 + x3)3
=
x4 + 2x5 + 2x6 + 3x7 + 3x8 + x9
Thus, we see that there are indeed only 2 trees of height 3 having width 6.
If we want to implement a Python function that builds the appropriate generating function given some tree height n, we can do so in the following way (using a small Python class for representing generating functions):

def treesOfHeight(n):
  g = x
  for i in range(1, n):
    g = g**2 + g**3
  return g
        
To determine the number of trees of a certain height having a certain width, we can access the appropriate coefficient of the resulting generating function:

>>> treesOfHeight(3)
1*x**4 + 2*x**5 + 2*x**6 + 3*x**7 + 3*x**8 + 1*x**9
>>> g = treesOfHeight(3)
>>> g[6]
2
        

[link]  5.8. Counting by measurements, and computing probability of measurements

In the previous subsections we have seen how to count the number of possible ways that objects from a collection can be selected and assembled into a sequence or new collection, and how generating functions can be used to count the number of possible ways that integer quantities can be combined to yield a particular total. All of these techniques are tools for counting the number of situations (similar to system states) that have a certain measurement value along certain dimensions. In this section, we will use logic to explicitly model the problem of counting only those situations that have a certain measurement. We will then use this framework to define a notion of the probability of a given situation occurring (i.e., being true), and we will show how we can use the counting techniques we have presented in the previous subsections to compute such probabilities.
Example: Suppose we have a domain of discourse that consists of two sets: the objects {elephant, hospital, car, book, mouse, library, hammer} and the sizes {small, medium, large}. We also have the predicate ... is of ... size, and the following are all predicate instantiations (i.e., variables) that map to true in our model (in the technical sense we have seen before) of the world, while all other instantiations map to false:
elephant is of medium size
hospital is of large size
car is of medium size
book is of small size
mouse is of small size
library is of large size
hammer is of small size
Suppose we wish to count the number of objects in {elephant, hospital, car, book, mouse, library, hammer} that have each of the possible sizes {small, medium, large}. We could do so as follows:
|{ x | x is of small size }|
=
|{book, mouse, hammer}|
=
3
|{ x | x is of medium size }|
=
|{car, elephant}|
=
2
|{ x | x is of large size }|
=
|{hospital, library}|
=
2
Suppose we want to count for each possible size the fraction of all the objects that have that size. We can do so as follows:
 
|{ x | x is of small size }|                         
|{elephant, hospital, car, book, mouse, library, hammer}|
 
=
 
3
7
 
 
|{ x | x is of medium size }|                         
|{elephant, hospital, car, book, mouse, library, hammer}|
 
=
 
2
7
 
 
|{ x | x is of large size }|                         
|{elephant, hospital, car, book, mouse, library, hammer}|
 
=
 
2
7
 
Definition: We use the following notation to denote that the probability that a formula f is true is p, where p is a real number in the range 0 ≤ p ≤ 1:
Pr[ f ]
=
p
Suppose f is a formula that contains a variable x (such as "¬(x) ∨ ⊤" or "x is green"). Assuming that each possible assignment of a constant or value from a set S to the variable x is equally likely, we use the following notation to denote that when we choose an element of S and assign it to x, the probability of our choosing an replacement for x that makes the formula f true is p:
Prx S [ f ]
=
p
Example: Suppose we have a domain of discourse that consists of three sets: the objects O = {elephant, whale, car, book, mouse, library}, the sizes {small, medium, large}, and the descriptions {animate, inanimate}. We also have the predicates ... is of ... size and ... is ..., and the following are all predicate instantiations (i.e., variables) that map to true in our model (in the technical sense we have seen before) of the world, while all other instantiations map to false:
elephant is of medium size
    
elephant is animate
whale is of large size
    
whale is animate
car is of medium size
    
car is inanimate
book is of small size
    
book is inanimate
mouse is of small size
    
mouse is animate
library is of large size
    
library is inanimate
The two dimensions along which we can measure objects are size and animacy. These dimensions are independent, so we can represent them using orthogonal axes (i.e., rows and columns), as in the diagram below:
animate inanimate
{elephant, mouse, whale}
3 / 6
{book, car, library}
3 / 6
small {book, mouse}
2 / 6
{mouse}
6 / 36 = 1 / 6
{book}
6 / 36 = 1 / 6
medium {elephant, car}
2 / 6
{elephant}
6 / 36 = 1 / 6
{car}
6 / 36 = 1 / 6
large {library, whale}
2 / 6
{whale}
6 / 36 = 1 / 6
{library}
6 / 36 = 1 / 6
Assuming that the chances of choosing each of the objects in O are equally likely, we can say the following:
Prx O [ x is of small size ] =
 
2
6
 
Prx O [ x is of medium size ] =
 
2
6
 
Prx O [ x is of large size ] =
 
2
6
 
Prx O [ x is animate ] =
 
3
6
 
Prx O [ x is inanimate ] =
 
3
6
 
Definition: Suppose we know that for some logical formulas f and g, the following is true:
Pr[ f ]
=
p
Pr[ g ]
=
q
The probabilities that f and g are true are independent if the following holds:
  • if we restrict ourselves to the cases where f is true, it is still the case that Pr[ g ] = q;
  • if we restrict ourselves to the cases where f is false, it is still the case that Pr[ g ] = q;
  • if we restrict ourselves to the cases where g is true, it is still the case that Pr[ f ] = p;
  • if we restrict ourselves to the cases where g is false, it is still the case that Pr[ f ] = p.
Definition: Two logical formulas f and g are mutually exclusive if the following holds:
  • if we restrict ourselves to the cases where f is true, it is still the case that g is false;
  • if we restrict ourselves to the cases where g is true, it is still the case that f is false.
Example: Let the domain of discourse be O = {elephant, whale, car, book, mouse, library}. Are the probabilities for the dimensions in the following table independent?
animate
{elephant, mouse, whale}
inanimate
{book, car, library}
small
{book, mouse}
{mouse}
6 / 36 = 1 / 6
{book}
6 / 36 = 1 / 6
medium
{car, elephant}
{elephant}
6 / 36 = 1 / 6
{car}
6 / 36 = 1 / 6
large
{whale, library}
{whale}
6 / 36 = 1 / 6
{library}
6 / 36 = 1 / 6
Yes. If we restrict ourselves to any measurement along the size dimension, the probability of an element being animate is 1/2, which is also the overall probability (across all sizes) of an object being animate. Likewise, if we restrict ourselves to a particular animacy column, the probability that an object is of a particular size (e.g., small) is always 1/3, which is also the overall probability of an object being that size.
Example: Are the probabilities for the dimensions in the following table independent?
can fly
{flying squirrel, bat, robin}
cannot fly
{wolf, bear}
has fur
{flying squirrel, bat, wolf, bear}
{bat, flying squirrel}
2 / 5
{wolf, bear}
2 / 5
has no fur
{robin}
{robin}
1 / 5

0 / 5
No, they are not independent because the overall probability that an element can fly is 3/5, while the probability that an element can fly if it has no fur is 1 and the probability that an element can fly if it does have fur is 1/2.
Example: Are the probabilities for the dimensions in the following table independent?
can fly
{flying squirrel, bat, robin, crow}
cannot fly
{bear, crocodile}
has fur
{flying squirrel, bat, bear}
{bat, flying squirrel}
2 / 6
{bear}
1 / 6
has no fur
{robin, crow, crocodile}
{robin, crow}
2 / 6
{crocodile}
1 / 6
Yes, the overall probability that an element has fur is 1/2, which is also the probability that an animal that can fly has fur, and the the probability that an animal that cannot fly has fur. Likewise, the overall probability that an animal can fly is 2/3, which is also the probability that an animal with fur can fly, and the probability that an animal without fur can fly.
Fact: Suppose we know that for some logical formulas f and g, the following is true:
Pr[ f ]
=
p
Pr[ g ]
=
q
If the probabilities for f and g are independent, then we have that:
Pr[ fg ]
=
pq
Fact: Suppose we know that for some logical formulas f, the following is true:
Pr[ f ]
=
p
Then we have that:
Pr[ ¬(f) ]
=
1 − p
Pr[ fg ]
=
pq
As long as dimensions are independent and different measurements are mutually exclusive, the facts above are sufficient to determine the probability of any logical formula because the logical operators ¬ and ∧ are functionally complete (i.e., they can be used to represent all other logical operations).
Fact: Suppose we know that for some logical formulas f and g, the following is true:
Pr[ f ]
=
p
Pr[ g ]
=
q
Then if f and g are mutually exclusive (i.e., can never both be true at the same time), we have:
Pr[ fg ]
=
p + q
One common situation in which this arises is if the two formulas represent different measurements of an object along the same dimension.
Fact: Suppose we know that for some logical formulas f and g, the following is true:
Pr[ f ]
=
p
Pr[ g ]
=
q
If the probabilities that f and g are true are independent, then we have that:
Pr[ fg ]
=
p + qpq
=
1 − (1 − p) ⋅ (1 − q)
Notice that 1 − (1 − p) ⋅ (1 − q) can be derived as the probability of the formula ¬(¬(f) ∧ ¬(g)); the latter formula is equivalent to fg via De Morgan's Law.
Example: Suppose we have a domain of discourse that consists of three sets: the objects {elephant, whale, car, book, mouse, library}, the sizes {small, medium, large}, and the descriptions {animate, inanimate}. We also have the predicates ... is of ... size and ... is ..., and the axioms as exactly like those in the related example above.
animate inanimate
{elephant, mouse, whale}
3 / 6
{book, car, library}
3 / 6
small {book, mouse}
2 / 6
{mouse}
6 / 36 = 1 / 6
{book}
6 / 36 = 1 / 6
medium {elephant, car}
2 / 6
{elephant}
6 / 36 = 1 / 6
{car}
6 / 36 = 1 / 6
large {library, whale}
2 / 6
{whale}
6 / 36 = 1 / 6
{library}
6 / 36 = 1 / 6
  • What is the probability that a replacement for x chosen uniformly at random from O = {elephant, whale, car, book, mouse, library} is small?
    The following is a visual representation of the set in question:
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    We can compute the probability as follows:
    Prx O [ x is of small size ]
    =
     
                             |{ x | x is of small size }|
    |{elephant, whale, car, book, mouse, library}|
     
    =
     
                             |{book, mouse}|
    |{elephant, whale, car, book, mouse, library}|
     
    =
     
    2
    6
     
  • What is the probability that a replacement for x chosen uniformly at random from O = {elephant, whale, car, book, mouse, library} is medium in size and animate?
    The following is a visual representation of the possibilities specified by the formula:
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    Since medium and animate are measurements along two independent measurement dimensions (i.e., size and animacy), any x we choose must have exactly one measurement along each dimension. Thus, we must multiply the probability that x is medium in size by the probability that x is animate:
    Prx O [ x is of medium sizex is animate ]
    =
    Prx O [ x is of medium size ] ⋅ Prx O [ x is animate ]
    =
     
    2
    6
     
     
    1
    2
     
    =
     
    2
    12
     
    =
     
    1
    6
     
  • What is the probability that a replacement for x chosen uniformly at random from O = {elephant, whale, car, book, mouse, library} is small or large in size?
    The following is a visual representation of the possibilities specified by the formula:
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    Since small and large are both mutually exclusive measurements of x along the same measurement dimension (i.e., size), x can never have both measurements at the same time. Thus, we can add the probabilities that x is of each of these two sizes:
    Prx O [ x is of small sizex is of large size ]
    =
    Prx O [ x is of small size ] + Prx O [ x is of large size ]
    =
     
    2
    6
     
    +
     
    2
    6
     
    =
     
    4
    6
     
    =
     
    2
    3
     
  • What is the probability that a replacement for x chosen uniformly at random from O = {elephant, whale, car, book, mouse, library} is small or medium in size, and is inanimate?
    The following is a visual representation of the possibilities specified by the formula:
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    Since small and medium are both possible measurements of x along the same measurement dimension (i.e., size), x can never have both measurements at the same time. Thus, we can add the probabilities that x is of each of these two sizes. However, if we also want to consider its animacy, we need to multiply the probability that its size is small or medium by the probability that it is inanimate.
    Prx O [ (x is of small sizex is of medium size) ∧ x is inanimate ]
    Note the parentheses in the formula above. These are important, as they determine the order of operations when we compute the probability:
              ...
    =
    (Prx O [ x is of small size ] + Prx O [ x is of medium size ]) ⋅ Prx O [ x is inanimate ]
    =
    (
     
    2
    6
     
    +
     
    2
    6
     
    ) ⋅
     
    1
    2
     
    =
     
    4
    6
     
     
    1
    2
     
    =
     
    2
    6
     
    =
     
    1
    3
     
    The above is equivalent to the following (by the distributive property of ∧ across ∨, and the corresponding distributive property of multiplication across addition):
    Prx O [ (x is of small sizex is inanimate) ∨ (x is of medium sizex is inanimate) ]
    We can compute the corresponding probability:
              ...
    =
    (Prx O [ x is of small size ] ⋅ Prx O [ x is inanimate ]) + (Prx O [ x is of medium size ] ⋅ Prx O [ x is inanimate ])
    =
    (
     
    2
    6
     
     
    1
    2
     
    ) + (
     
    2
    6
     
     
    1
    2
     
    )
    =
     
    1
    6
     
     
    1
    6
     
    =
     
    2
    6
     
    =
     
    1
    3
     
  • What is the probability that a replacement for x chosen uniformly at random from O = {elephant, whale, car, book, mouse, library} is medium in size or animate?
    The following is a visual representation of the possibilities specified by the formula. Notice that we must be careful not to "double count" the intersection of the regions:
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    Since medium and animate are independent, we can add the probabilities, but we must subtract the probability of the intersection occurring out of the total.
    Prx O [ x is of medium sizex is inanimate ]
    We then compute the probability:
              ...
    =
    Prx O [ x is of medium size ] + Prx O [ x is animate ] − (Prx O [ x is of medium size ] ⋅ Prx O [ x is animate ])
    =
     
    3
    6
     
    +
     
    2
    6
     
    − (
     
    3
    6
     
     
    2
    6
     
    )
    =
     
    5
    6
     
     
    6
    36
     
    =
     
    4
    6
     
    =
     
    2
    3
     
    By De Morgan's Law, the above is equivalent to the following:
    Prx O [ ¬(¬(x is of medium size) ∧ ¬(x is inanimate)) ]
    The visual representation can be adjusted to reflect the above (in the diagram below, we show the intersection of the complements; this is then subtract from the overall set or from the probability of 1 to obtain the desired subset):
    animate
    {elephant, mouse, whale}
    3 / 6
    inanimate
    {book, car, library}
    3 / 6
    small
    {book, mouse}
    2 / 6
    {mouse}
    6 / 36 = 1 / 6
    {book}
    6 / 36 = 1 / 6
    medium
    {elephant, car}
    2 / 6
    {elephant}
    6 / 36 = 1 / 6
    {car}
    6 / 36 = 1 / 6
    large
    {library, whale}
    2 / 6
    {whale}
    6 / 36 = 1 / 6
    {library}
    6 / 36 = 1 / 6
    We can compute the corresponding probability:
              ...
    =
    1 − ((1 − Prx O [ x is of medium size ]) ⋅ (1 − Prx O [ x is animate ]))
    =
    1 − ((1 −
     
    2
    6
     
    ) ⋅ (1 −
     
    3
    6
     
    ))
    =
    1 − (
     
    4
    6
     
     
    3
    6
     
    )
    =
    1 −
     
    12
    36
     
    =
    1 −
     
    1
    3
     
    =
     
    2
    3
     
Example: Suppose we have the following assumptions about the probabilities that two boolean variables are true:
Pr[ x ≡ ⊤ ]
=
 
1
2
 
Pr[ y ≡ ⊤ ]
=
 
1
2
 
If we count the total number of models involving these two boolean variables, we see that each individual model is 1/4 of the total number of models:
x y
1/4
1/4
1/4
1/4
Since x and y are independent dimensions of measurement (each variable can independently either have one of the two distinct "measurements", or values, ⊤ or ⊥), we can compute the probability of the particular model {x ↦ ⊤, y ↦ ⊤} being selected as:
Pr[ (x ≡ ⊤) ∧ (y ≡ ⊤) ]
=
 
1
2
 
 
1
2
 
=
 
1
4
 
However, we cannot directly compute the following because the values of x and y are not distinct measurements along the same dimension:
Pr[ (x ≡ ⊤) ∨ (y ≡ ⊤) ]
=
...
Instead, we need to use algebraic laws of logical operators (in particular, De Morgan's Law) to convert the formula above into one that uses only the negation and conjunction operators (¬ and ∧):
Pr[ (x ≡ ⊤) ∨ (y ≡ ⊤) ]
=
Pr[ ¬(¬(x ≡ ⊤) ∧ ¬(y ≡ ⊤)) ]
Once we have the formula above, we can apply the known algebraic facts for computing probabilities of formulas:
Pr[ ¬(x ≡ ⊤) ]
=
1 − Pr[ x ≡ ⊤ ]
=
1 −
 
1
2
 
=
 
1
2
 
Pr[ ¬(y ≡ ⊤) ]
=
1 − Pr[ y ≡ ⊤ ]
=
1 −
 
1
2
 
=
 
1
2
 
We can now combine the above:
Pr[ ¬(x ≡ ⊤) ∧ ¬(y ≡ ⊤) ]
=
Pr[ ¬(x ≡ ⊤) ] ⋅ Pr[ ¬(y ≡ ⊤) ]
=
 
1
2
 
 
1
2
 
=
 
1
4
 
Finally, we have:
Pr[ ¬(¬(x ≡ ⊤) ∧ ¬(y ≡ ⊤)) ]
=
1 − Pr[ ¬(x ≡ ⊤) ∧ ¬(y ≡ ⊤) ]
=
1 −
 
1
4
 
=
 
3
4
 
Thus, we have computed:
Pr[ (x ≡ ⊤) ∨ (y ≡ ⊤) ]
=
 
3
4
 
Example: Suppose we have the following assumptions about the probabilities that two boolean variables are true (note that Pr[ x ] and Pr[ x ≡ ⊤ ] have the same meaning):
Pr[ x ]
=
 
1
3
 
Pr[ y ]
=
 
3
5
 
We want to compute the probability that the following formula is true:
x y
We can do so by first converting the formula using algebraic laws so that it only contains the logical operators ¬ and ∧:
x y
¬(x) ∨ y
¬(¬(¬(x)) ∧ ¬(y))
¬(x ∧ ¬(y))
We compute the probability of ¬(y):
Pr[ ¬(y ≡ ⊤) ]
=
1 −
 
3
5
 
=
 
2
5
 
We also compute the probability of x ∧ ¬(y):
Pr[ x ∧ ¬(y) ]
=
Pr[ x ] ⋅ Pr[ ¬(y) ]
=
 
1
3
 
 
2
5
 
=
 
2
15
 
Finally, we can compute the probability of the overall formula:
Pr[ ¬(x ∧ ¬(y)) ]
=
1 −
 
2
15
 
=
 
13
15
 
Thus, we can say that:
Pr[ x y ]
=
1 −
 
2
15
 
=
 
13
15
 
Example: Suppose we have the following assumptions about the probabilities that two boolean variables are true, and that a particular formula is true:
Pr[ x ]
=
p
Pr[ y ]
=
 
3
4
 
Pr[ x y ]
=
 
9
10
 
Note that we do not know exactly what the probability is of x being true, but we know it is some real number p such that 0 ≤ p ≤ 1.
We note that x y ≡ ¬(x ∧ ¬(y)), and compute the probability that ¬(x ∧ ¬(y)) is true (using p wherever we need the probability that x is true):
Pr[ x y ]
=
Pr[ ¬(x ∧ ¬(y)) ]
=
1 − (p ⋅ (1 −
 
3
4
 
))
At this point, since Pr[ x y ] = 9/10, we can set up an equation with a single unknown real number variable p:
 
9
10
 
=
1 − (p ⋅ (1 −
 
3
4
 
))
 
9
10
 
− 1
=
− (p ⋅ (1 −
 
3
4
 
))
1 −
 
9
10
 
=
p ⋅ (1 −
 
3
4
 
)
 
1
10
 
=
p
 
1
4
 
p
=
 
4
10
 
p
=
 
2
5
 
Thus, we have determined the probability Pr[ x ] (i.e., the probability that x is true) to be 2/5.
Example: Suppose that that we are working with perfect trees in which each non-leaf node is equally likely to have two or three branches (i.e., the probability that a non-leaf node has two branches is 1/2, and the probability it has three branches is also 1/2). Then we could write the following generating function for a tree of height 2; in this generating function, in each term cxk the coefficient c represents the probability that a tree would have width k:
 
1
2
 
x2 +
 
1
2
 
x3
The generating function for a tree of height 3 would then be as follows:
 
1
2
 
⋅ (
 
1
2
 
x2 +
 
1
2
 
x3)2 +
 
1
2
 
⋅ (
 
1
2
 
x2 +
 
1
2
 
x3)3
In the above generating function, the coefficient of x4 would be the probability that a tree of height 3 has width 4.


[link]  5.9. Assignment #6: Permutations, Combinations, and Generating Functions

In this assignment you will solve problems requiring counting and computation of probabilities.
You may import the following library function in your module:

from math import factorial
from fractions import Fraction
        
Your solution may not import any other modules or employ any external library functions. The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. For each of the following descriptions, define a mathematical expression for computing the specified count by using multiplication *, integer division //, addition +, and/or the factorial() function. In your Python file, define the variables oneA, oneB, oneC, and so on. For example, a solution to part (h) might look as follows:

    oneH = factorial(5) // factorial(2)
              
    In the above manner, solve the following problem parts.
    1. You are given the collection of characters {a, b, c, d, e, f, g}, and you must count the number of ordered sequences of length four that can be made using these characters in which no character repeats.
    2. You are given the collection of characters {a, a, b, b, c, c, d, d}, and you must count the number of distinct ordered sequences of length eight that can be made using these characters in which each of the characters from the collection can be used exactly once.
    3. You must count the number of different ways that ten people can split into two groups consisting of four people and six people. Hint: once the group of four people has been chosen, this automatically means the remaining six people are also in a group.
    4. Given a set of size 15, how many distinct subsets of size five does it have?
    5. Suppose there are seven people and five pets. In how many different ways can we choose three people and two pets to invite to an event (the people and pets are chosen independently).
    6. How many different groups of size two, three, four, or five can we choose from a group of seven people?
    7. Suppose there are nine people and four pets. In how many different ways can we choose a group of three, four, or five people and a group of two, three, or four pets to invite to an event? The number of people we invite and the number of pets we invite are independent.
  2. For each of the following descriptions, use Python to define a generating function and then specify using the index notation [...] which coefficient of the generating function corresponds to the correct count. In your Python file, you should include the class definition for generating function found in the appendix. For your solutions, define the variables twoA, twoB, twoC, and so on. For example, a solution to part (h) might look as follows:

    twoH = ((x**2 + x**3 + x**4)**2)[3]  # Third coefficient is the solution.
              
    In the above manner, solve the following problem parts.
    1. In how many ways can you choose a combination of four items from a set of six items? Hint: use the binomial theorem.
    2. Suppose you have four six-sided dice, where each dice is labelled with the numbers 0, 2, 4, 6, 8, 10. In how many ways can you roll the four dice so that the total is 20?
    3. You must build a fence of length 20. You are allowed to use any number of fence sections, and each fence section can be of length 2, 3, 5, or 7. You decide to buy a particular bundle of fence sections (e.g., four section of length 5, or ten sections of length 2). How many distinct bundles of fence sections could you buy that would allow you to build a fence of the desired length without having any extra fence sections left over (it's fine to count different arrangements of the sections as different bundles, so that 3 + 3 + 7 + 7 and 7 + 7 + 3 + 3 count as two different bundles). Hint: split the problem into cases based on the number of sections you buy, and then use sum(...).
    4. Suppose you have five 50-sided dice, where each dice is labelled with the numbers 0, 1, 2, ..., 49. In how many ways can you roll the five dice so that the total is 158? Hint: use the sum(...) function as in this example, except with powers of x to build an appropriate generating function.
    5. Suppose you have five 100-sided dice, where each dice is labelled with the numbers 1, 2, 3, ..., 100. In how many ways can you roll the five dice so that the total is 406? Hint: be careful when applying a theorem; you will need to do a little additional work because the labels start from 1 and not 0.
    6. If non-leaf nodes of a tree can have either one, two, or three branches, how many trees of height 3 have width 7? You must write out the generating function; you may not use an algorithm.
  3. Implement a function sequences(cs) that takes a list of characters cs as its input and computes the number of distinct ordered sequences of length len(cs) that can be made using these characters. Hint: use set() and the .count() method.

    >>> sequences(['a','a','a'])
    1
    >>> sequences(['a','b','c'])
    6
    >>> sequences(['a','b','b','c'])
    12
    >>> sequences(['a','a','a','b','b','c','c'])
    210
              
  4. Implement a function solutions(k, n) that takes two integer inputs k and n. The function should compute the number of possible positive integer solutions to the following equation:
    y1 + y2 + ... + yk
    =
    n
    Each solution to the above would be some assignment of a positive integer to each of the yi so that they add up to n, and your function must return the total count of all such solutions. Hint: note that the integer must be positive and thus non-zero, so you'll need to think carefully about what generating functions to use; if you use appropriate theorems, your implementation should be very short (two or three lines).

    >>> solutions(2, 10)
    9
    >>> solutions(3, 100)
    4851
    >>> solutions(5, 20)
    3876
    >>> solutions(6, 10)
    126
              
  5. Implement a function hgtToSizes(h) that takes an integer input h that represents a tree height. The function should return the generating function that represents how many height-h trees there are of each possible size (i.e., total number of all nodes in the tree) if each non-leaf node in the tree can have either one, two, or three branches.

    >>> hgtToSizes(2)
    1*x**2 + 1*x**3 + 1*x**4
    >>> hgtToSizes(3)
    1*x**3 + 1*x**4 + 2*x**5 + 2*x**6 + 4*x**7 + 5*x**8 + 7*x**9 + 7*x**10 + 6*x**11 + 3*x**12 + 1*x**13
              
  6. Extra credit: Implement a function hgtToWidthProb(b, h, w) that take three integer inputs:
    • b represents the maximum number of branches each non-leaf node in the tree might have (so each node can have 1, 2, ..., b-1, or b branches);
    • h represents tree height;
    • w represents a tree width.
    The function should return the probability (represented using a Python Fraction() object) that a tree of height h has width w. Hint: build a generating function by using Python Fraction(...) objects as the coefficients.


[link]  5.10. More applications of counting and probability

Example: Suppose we are modelling how bacterial cells multiply over time. To simplify the problem, assume that time is divided into discrete time steps, and at a given time step, each cell has a 50% chance of dividing into two cells, and a 50% chance of not dividing into two cells. After 5 time steps, what is the probability that there are at least 10 cells?
We approach the problem by observing that all we need to do is compute the probabilities of the possible widths of trees of height 5 when each node can have either one or two branches (each with probability 1/2). Thus, we would start with:
g
:=
x
Next, we would "update" our generating function four times using the following rule:
g
:=
 
1
2
 
g +
 
1
2
 
g2
Finally, to find the probability that there are at least 10 cells, we would compute the sum of all coefficients of the generating function for the terms x10, x11, x12, ..., x16 (the maximum width of any tree of height 5 must be 16).
Example: It is your job to allocate memory for an artificial intelligence algorithm that will model a game. You are not told the details of the game, but you know the following: during each turn, a player rolls a fair six-sided die that has the labels 1, 2, ..., 6. The result of the dice roll determines how many distinct options the player has in making the next move (e.g., if the roll results in a 2, the player has two possible ways they can make a move during that turn).
The AI algorithm needs to represent and store in memory the entire collection of possible ways the game can progress. Thus, it will use a tree to represent this: each node is a possible turn, and each branch is one of the possible moves a player can make. The number of children that each node has will be determined by the roll of a fair dice.
Suppose games always last exactly four turns. You decide to allocate enough memory to store exactly 20 nodes. What is the probability that this will be enough memory to store the entire tree of possible ways the game can progress?
Fact (generalized binomial theorem): The following identity holds:
(a + b)n
=
(
 
n
0
 
)anb0   +   (
 
n
1
 
)an − 1b1   +   (
 
n
2
 
)an − 2b2   +   ...   +   (
 
n
n
 
)a0bn
=
Σni = 0 (
 
n
i
 
)anibi
=
Σni = 0 (
 
n
i
 
)aibni
Example: Suppose that a robot is standing in the center of a room. Every second, the robot moves one meter forward with probability 1/3, or it moves one meter backward with probability 2/3. What is the probability that after 10 seconds, the robot will end up in the center of the room?
We can model this situation using a generating function by representing the forward motion of the robot using a positive exponent, and the backward motion using a negative exponent. Thus, the behavior at any particular time step is:
 
2
3
 
x-1 +
 
1
3
 
x1
After 10 seconds, the probability that the robot moved some distance d is given by the coefficient of the xd term in the following generating function:
(
 
2
3
 
x-1 +
 
1
3
 
x1)10
In this particular problem, we are interested in the coefficient of x0. One way to simplify the above is to introduce a factor of x into the generating function, so that there are no negative exponents:
 
1
x10
 
⋅ (
 
2
3
 
x0 +
 
1
3
 
x2)10
We still want to find the coefficient of x0 of the above generating function. However, we see that this corresponds to the coefficient of x10 of the second term ((2/3)x0 + (1/3)x2)10. We can then use the generalized binomial theorem to compute this:
(
 
2
3
 
x0 +
 
1
3
 
x2)10     
=
     Σ10i = 0      (
 
10
i
 
) ⋅ (
 
2
3
 
)10 − i ⋅ (
 
1
3
 
)i ⋅ (x2)i
The above shows us that the coefficient will correspond to the i = 5 case, since 2 ⋅ 5 = 10. Thus, our desired coefficient is:
(
 
10
5
 
) ⋅ (
 
2
3
 
)10 − 5 ⋅ (
 
1
3
 
)5
=
(
 
10
5
 
) ⋅ (
 
2
3
 
)5 ⋅ (
 
1
3
 
)5
Notice we can explain the above formula in a different way. Each of the 10 time steps is an indepent choice the robot must make between moving forward or backward. The only way the robot will end up at the center of the room is if the robot chooses to move forward during exactly 5 of those time steps (and moves backward during the other five). This explains the combination term in the above. Then, it is only necessary to compute the probability of choosing to move forward five times, and choosing to move backward five times. This explains the two fraction terms in the above.
Example: Suppose that there is an equal likelihood that on a given weekday, a stock price will fall 3, 2, or 1 dollars in value, will stay the same, or will rise 1, 2, or 3 dollars in value. What is the probability that after five weekdays, the stock has risen in value or stayed the same?
We can model the behavior of the stock price on a given day using a generating function:
 
1
7
 
(x−3 + x−2 + x−1 + x0 + x1 + x2 + x3)
Given these assumptions, the maximum that the stock price can rise over five days is 3 ⋅ 5 = 15. Thus, we then need to compute the sum of the coefficients of xi for i ≥ 0 for the following generating function:
(
 
1
7
 
(x−3 + x−2 + x−1 + x0 + x1 + x2 + x3))5
We can do so using Python:

>>> g = (Fraction(1,7) * sum(x**i for i in range(-3,4)))**5
>>> sum(g[i] for i in range(0, 15))
Fraction(1304, 2401)
          
Thus, the overall probability that the stock price will be the same or higher after five days is 1304/2401.
Example: Suppose that passwords for a certain device must be sequences that consist of some number of a and b characters. Assuming that the password for the device is aabab, answer each of the following questions.
  • Alice's password guessing algorithm is equally likely to choose any 5-character password that can be assembled from the five characters a, a, a, b, and b. What is the probability that it will guess the password on its first try?
    The total number of possible 5-character passwords is:
     
    5!     
    2! ⋅ 3!
     
    Notice this corresponds to a combination because we are "choosing" which of the five password character positions will be a (or, equivalently, b):
    (
     
    5
    3
     
    )
    =
    (
     
    5
    2
     
    )
    =
     
         5!
    2! ⋅ 3!
     
    .
    Since Alice's algorithm is equally likely to choose any of these, the probability it will choose exactly the correct one is:
    (
     
    5!     
    2! ⋅ 3!
     
    )−1
    =
    (
     
    2! ⋅ 3!
         5!
     
    )
    =
     
    1
    10
     
  • Bob's password guessing algorithm builds the password from left to right, one character at a time. At each step, the probability that it will choose a is 2/5, and the probability that it will choose b is 3/5. What is the probability that it will guess the correct 5-character password on its first try?
    Each character in the 5-character password is an independent dimension consisting of two mutually exclusive choices (a and b). The probability that Bob's algorithm will choose the particular sequence is then the product of probabilities over these five independent dimensions:
     
    2
    5
     
     
    2
    5
     
     
    3
    5
     
     
    2
    5
     
     
    3
    5
     
    =
    (
     
    2
    5
     
    )3 ⋅ (
     
    3
    5
     
    )2
    =
     
    72
    3125
     
     
    1
    50
     
    Notice that this intuitively makes sense: there are 25 = 32 possible 5-character passwords that can be built, and if Bob's algorithm was equally likely to choose a or b for each character position, the overall probably would be 1/32. However, since Bob's algorithm is biased in favor of choosing a b while the actual password has more a character instances than b character instances, the probability that Bob's algorithm will guess the correct password is actually lower.
  • What is the probability that it will guess a 5-character password that is not necessarily correct, but has the correct number of a and b characters?
    We know that the probability that Bob's algorithm will choose a password containing three a characters and two b characters is (2/5)3 ⋅ (3/5)2. However, the number of ways in which this can happen is:
    (
     
    5
    3
     
    )
    =
    (
     
    5
    2
     
    )
    =
     
         5!
    2! ⋅ 3!
     
    .
    Thus, the probability would be:
    (
     
    5
    3
     
    ) ⋅ (
     
    2
    5
     
    )3 ⋅ (
     
    3
    5
     
    )2
    Notice that this is exactly the coefficient of x2 in the generalized binomial theorem expansion of the following generating function:
    ((
     
    2
    5
     
    ) ⋅ x0 + (
     
    3
    5
     
    ) ⋅ x1)5
  • Suppose Alice runs her algorithm three times. What is the probability that at least one of the three guesses will be the correct password?
    We want to measure the probability that at least one of the guesses generated by Alice's algorithm is correct. However, we also want to count any situation in which two or three of the guesses are correct. Thus, we want the probability of the following logical formula:
    first guess is correctsecond guess is correctthird guess is correct
    The three situations that correspond to the three predicates above are not mutually exclusive (e.g., the first and third guesses might both be correct). Thus, we cannot simply add the probabilities. Fortunately, the three attempts are completely independent (for example, the probability of possible outcomes in the second attempt is the same whether or not we restrict ourselves to a particular outcome of the first attempt). Thus, we can instead use De Morgan's law and compute the probability of the following:
    ¬(¬(first guess is correct) ∧ ¬(second guess is correct) ∧ ¬(third guess is correct))
    In other words, we instead multiply the complements of the probabilities to find the probability that none of the guesses are correct. Then take the complement of that to determine the probability that at least one guess works. We can can now compute:
    1 − ((1 −
     
    1
    10
     
    )3)
    =
    1 − (
     
    9
    10
     
    )3
    =
    1 −
     
    729
    1000
     
    =
     
    271
    1000
     
    Thus, the probability of at least one of the guesses being correct is 271/1000.
  • Suppose Alice runs her algorithm three times. What is the probability that exactly one of the three guesses will be the correct password?
    In this case, we are considering three mutually exclusive possibilities: the first guess is correct and they other two are not, the second guess is correct and the others are not, and the third guess is correct and the others are not. We can compute each of these probabilities separately and then add them:
    (
     
    1
    10
     
     
    9
    10
     
     
    9
    10
     
    ) + (
     
    9
    10
     
     
    1
    10
     
     
    9
    10
     
    ) + (
     
    9
    10
     
     
    9
    10
     
     
    1
    10
     
    )
    =
    3 ⋅ (
     
    81
    1000
     
    )
    =
     
    243
    1000
     
  • Suppose Bob runs his algorithm three times. What is the probability that all three guesses will be the correct password?
    If each guess is an independent dimension, we want the probability of a particular measurement along each dimension, so we multiply:
    ((
     
    2
    5
     
    )3 ⋅ (
     
    3
    5
     
    )2)3
    =
    (
     
    8
    125
     
     
    9
    25
     
    )3
    =
    (
     
         72
    15625
     
    )3
    =
     
         373248
    3814697265625
     


[link]  5.11. Assignment #7: Generating Functions, Probability, and Logical Formulas

In this assignment you will solve problems involving logical formulas and the computation of probabilities using generating functions and other counting methods.
You may import the following library function in your module (and you will want to include at the top of your Python file the latest version of the Python class for working with generating functions):

from math import factorial
from fractions import Fraction
        
Your solution may not import any other modules or employ any external library functions. The different problems and problem parts rely on the lecture notes and on each other; carefully consider whether you can use functions from the lecture notes, or functions you define in one part within subsequent parts.
  1. Suppose we are given the following data about which predicates apply to elements in the collection of objects O = {tomato, frog, robin, parrot, fox, tree} (this particular collection of concepts and properties is adapted from examples in Semantic Cognition by Rogers & McClelland):
    object predicates that apply (i.e., instantiations that are true)
    tomato tomato is red,      tomato cannot eat,      tomato cannot fly
    frog frog can eat,      frog is green,      frog cannot fly
    robin robin can eat,      robin can fly,      robin is red
    parrot parrot can eat,      parrot is green,      parrot can fly
    fox fox is red,      fox cannot fly,      fox can eat,
    tree tree cannot eat,      tree is green,      tree cannot fly
    Answer the following questions about the above information.
    1. Determine whether the following formula is true for all objects x in O. Indicate your answer by including in your Python file either oneA = True or, if it is not true, providing a counterexample such as oneA = 'penguin':
      x,      (x is redx can eat) x can fly
    2. Determine whether the following formula is true for all objects x in O. Indicate your answer by including in your Python file either oneB = True or, if it is not true, providing a counterexample such as oneB = 'penguin':
      x,      x cannot fly x cannot eat
    3. Separate the predicates into separate dimensions of mutually exclusive properties. Use Python strings such as 'cannot fly' to represent each predicate (avoid typos to make grading easier), and define oneC = ... to be a list of sets in which each set contains all the strings for a particular dimension. As an example, here is one way this could be done for the predicates ... is hot, ... is cold, ... is wide, and ... is narrow:

      oneC = [{'is hot''is cold'}, {'is wide''is narrow'}]
                    
    4. For each pair of dimensions in part (c) above, indicate whether for the given data the pair of dimensions is independent (for purposes of probability computation). Indicate your answer by putting every pair of independent dimensions together into a set. For example, if {'is hot', 'is cold'} and {'is wide', 'is narrow'} are independent, and {'is sweet', 'is sour'} and {'is wide', 'is narrow'} are independent, but no other pair of dimensions is independent, then you would write:

      oneD = [\
        { {'is hot''is cold'}, {'is wide''is narrow'} },\
        { {'is sweet''is sour'}, {'is wide''is narrow'} }\
        ]
                    
    5. If an object x is selected uniformly at random from O (any object has the same probability of being selected as any other), what is the probability that x can fly is true? Indicate your answer by including in your Python file using the line oneE = Fraction(..., ...) (filling in the ... appropriately).
    6. Use Python Fraction() objects to write out the computation oneF = ... for the probability that the following formula is true (for an object x selected uniformly at random from O):
      Prx O[ x cannot flyx is red ]
    7. Use Python Fraction() objects to write out the computation oneG = ... for the probability that the following formula is true (for an object x selected uniformly at random from O):
      Prx O[ x is greenx cannot eat ]
    8. Use Python Fraction() objects to write out the computation oneH = ... for the probability that the following formula is true (for an object x selected uniformly at random from O):
      Prx O[ x is green x cannot fly ]
    9. Use Python Fraction() objects to write out the computation oneI = ... for the probability that a new predicate ... is slithy applies to an object x chosen from O given the following information (assuming "slithiness" is independent from all the other dimensions):
      Prx O[ x is slithyx cannot fly ]
      =
       
      1
      3
       
  2. Suppose you are modelling a drunken walk along a bridge that is 4 steps wide. The drunk person starts from one end of the bridge (at the center of the bridge's walkway) and at each time step, they will step forward. They will also weave either one step to the right, one step to the left, or will only move forward without moving left or right (each possibility has an equal probability of occurring). They will fall of the bridge if they move 3 or more steps to the left, or 3 or more steps to the right.
    1. What is the minimum number of steps until there is a non-zero probability of falling off the bridge? Indicate your answer using a line of the form twoA = ....
    2. Assuming your answer to part (a) is the integer twoA, define the probability twoB = ... (using addition and/or multiplication of Python Fraction() objects) of falling off the bridge after either twoA or after twoA + 1 time steps. Hint: you cannot fall off the bridge again after you have already fallen off the bridge; compute your probability accordingly.
    3. What is the probability twoC = ... that after five steps have been taken, no fall has occurred?
  3. Suppose we have the following model for the way a tree grows: it begins growing from a single root node (which exists at time step t = 1), and at each time step t (starting with time step t = 2), every node that appeared at time step t1 behaves as follows:
    • with probability 1/3, the node does not grow any branches (and never again grows any branches);
    • with probability 1/3, the node grows one branch;
    • with probability 1/3, the node grows two branches.
    Implement a Python function probHgtAtTime(h, t) that uses generating functions to compute the probability that the tree has height h at time step t.
  4. You are modelling how two different types of bacteria reproduce. At each time step, each cell of type A that exists will behave as follows:
    • with probability 2/3, the cell will split into two cells;
    • with probability 1/3, the cell will not split.
    Each cell of type B that exists at a given time step will behave as follows:
    • with probability 1/3, the cell will split into three cells;
    • with probability 2/3, the cell will not split.
    Implement a Python function moreB(t) that uses generating functions to compute the probability that at time step t, there will be strictly more cells of type B than cells of type A. Hint: for each possible number of cells of type A, and for each possible number of cells of type B, we can compute the probability of that particular pair of quantities occuring at the same time; each of these possibilities is mutually exclusive.
  5. Suppose you are advising a company about the amount of funds they should have available to repay potential debt at a future point in time (e.g., to be kept in a separate account from which they can only withdraw at the time they need to pay the debt). The goal is to ensure that the probability the company will go bankrupt at that point in time is below 10%. Assume that the mathematical model of the company's performance each month is as follows: given some positive integer c, each month the company will either earn c in funds with probability p, or it will lose an amount c in funds with probability 1p.
    1. Implement a Python function probInRange(c, p, t, a, b) that computes the probability that the company's net loss or gain after t months will be between a and b (inclusive). You may assume the following (you do not need to check these conditions in your code):
      • c > 0;
      • 0 < p < 1;
      • t > 0;
      • ab.
      Your code should work for large values of c and t, and you may assume that p is a Python Fraction() object. Hint: do not build a generating function; instead, use a theorem to determine how to compute the desired coefficient.
    2. At time t, the company's performance will be assessed; if its cumulative losses (i.e., negative earnings) at that time are greater than the amount of funding the company allocated to repay its debt, the company will go bankrupt. Implement a Python function minAlloc(c, p, t) that computes the minimum amount of funds the company should allocate (as an integer) so that the probability that the company will go bankrupt is below 10%.



[link]  Review #2. Logic, Counting, and Probability

The following is a breakdown of what you should be able to do at this point in the course (and of what you may be tested on in an exam). Notice that many of the tasks below can be composed. This also means that some problems can be solved in more than one way.
  • boolean formulas, their meanings, and truth tables
    • determine whether a formula is a well-formed formula
    • determine the meaning of a formula without variables
    • solve for an assignment of values to variables that makes a formula true
    • build a truth table for a formula with variables
    • for a given model (assignment of variables to values), find the meaning of a formula
  • boolean algebra
    • common algebraic laws, and applying algebraic laws
      • commutativity, associativity, and identity of ∨ and ∧
      • distributivity of ∧ and invertibility of  ¬
      • De Morgan's laws
      • representing ⊕, , and using ∨, ∧, and  ¬
  • models, model sets, satisfaction of formulas, and common problems involving formulas
    • determine whether a given model satisfies a given formula
    • given a formula, find the set of all models that satisfy a formula
    • given a formula, solve the boolean satisfiability problem
    • given a formula, solve the maximum boolean satisfiability problem
    • given a formula, solve the counting boolean satisfiability problem
    • counting problems
      • determine the number of rows in a truth table for a formula, or a given number of variables
      • determine the number of possible truth tables for a given number of variables
  • using formulas to represent properties of "real-world" systems and entities
    • representing states of a system using boolean variables
    • representing constraints and relationships between system states using formulas
    • answering questions about systems and entities being modeled
      • finding a system state that satisfies a formula
      • finding a system state with the maximum number of variables assigned to ⊤
      • counting the number of system states that satisfy a formula
  • set theory, algebra of sets, and relationships to boolean algebra
    • perform computations involving set union, intersection, difference, and complement
    • find the size of a set
    • find the power set (or the size of a power set) of a set
    • apply algebraic laws of set operations
    • convert logical operations on formulas into corresponding set operations on model sets (and vice versa)
  • theorems, proofs, predicates, universal quantification, and induction
    • given a domain of discourse and collection of predicates, count the number of predicate instantiations possible
    • determine whether a well-formed formula is a theorem
    • determine whether an inference rule holds (i.e., ⊥ rows can become ⊤, but ⊤ rows must stay ⊤)
    • identify or apply common inference rules and logical (algebraic) laws
      • modus ponens
      • ∀-instantiation
      • induction
    • assemble a proof given axioms and a goal (using the representation involving a vertical list of subformulas)
    • determine whether a proof is valid
  • inductively defining trees and measurements (height, width, size) on trees
    • defining recurrence relations for...
      • tree width
      • tree size
      • tree height
    • converting between different ways of representing measurements:
      • recurrence relations
      • closed-form functions
      • summations
  • counting and probability
    • permutations and combinations
    • treating counts (possible/total) as probabilities
    • combining counts and probabilities
      • multiplying counts or probabilities for independent simultaneous choices
      • adding counts or probabilities for mutually exclusive choices
    • using generating functions for breaking up counts and probabilities by measurement dimensions
      • defining generating functions that represent various scenarios
        • counting trees by width, size, and height
        • throwing dice, adding coins, and so on
        • random walks and sequences of independent events
      • binomial theorem
      • negative binomial theorem
      • generalized binomial theorem
    • reasoning about data represented as a set of predicates defined over a domain of discourse
      • identifying independent dimensions
      • computing the probability of a logical formula being true
        • logical ∨ applied to two mutually exclusive measurements
        • logical ∨ applied to two independent properties
        • logical ∧ applied to two independent properties
        • logical ¬ applied to any property

[link]  Appendix A. Python

The Python programming language will be among the languages we use in this course. This language supports the object-oriented, imperative, and functional programming paradigms, has automatic memory managememt, and natively supports common high-level data structures such as lists and sets. Python is often used as an interpreted language, but it can also be compiled.

[link]  A.1. Obtaining Python

The latest version of Python 3 can be downloaded at: http://www.python.org/getit/. In this course, we will require the use if Python 3, which has been installed on all the CS Department's undergraduate computing lab machines, as well as on csa2/csa3.

[link]  A.2. Assembling a Python module

The simplest Python program is a single file (called a module) with the file extension .py. For example, suppose the following is contained within a file called example.py:

# This is a comment in example.py.
# Below is a Python statement.
print("Hello, world.")
      
Assuming Python is installed on your system, to run the above program from the command line you can use the following (you may need to use python3, python3.2, python3.3, etc. depending on the Python installation you're using). Note that in the examples below %> represents a terminal prompt, which may look different on your system.

%> python example.py
Hello, world.
      
If you run Python without an argument on the command line, you will enter Python's interactive prompt. You can then evaluate expressions and execute individual statements using this prompt; you can also load and execute a Python module file:

%> python
Python 3.2 ...
Type "help""copyright""credits" or "license" for more information.
>>> exec(open("example.py").read()) # Load example.py module.
Hello, world.
>>> x = "Hello." # Execute an assignment statement.
>>> print(x)     # Execute a print statement.
Hello.
>>> x            # Evaluate a string expression.
'Hello.'
>>> 1 + 2        # Evaluate a numerical expression.
3
      

[link]  A.3. Common data structures (i.e., Python expressions)

Python provides native support for several data structures that we will use throughout this course: integers, strings, lists, tuples, sets, and dictionaries (also known as finite maps). In this subsection, we present how instances of these data structures are represented in Python, as well as the most common operations and functions that can be applied to these data structure instances.
  • Booleans consist of two constants: True and False.
    • The usual logical operations are available using the operators and, or, and not.

    >>> True                                      # A boolean constant.
    True
    >>> False                                     # A boolean constant.
    False
    >>> True and False or True and (not False)    # A boolean expression.
    True
              
  • Integers are written as in most other programming languages (i.e., as a sequence of digits).
    • The usual arithmetic operations are available using the operators +, *, -, and /. The infix operator // represents integer division, and the infix operators ** represents exponentiation. Negative integers are prefixed with the negation operator -.
    • The usual relational operators ==, !=, <, >, <=, >= are available.
    • The int() function can convert a string that looks like an integer into an integer.

    >>> 123                                       # An integer constant.
    True
    >>> 1 * (2 + 3) // 4 - 5                      # An integer expression.
    -4
    >>> 4 * 5 >= 19                               # A boolean expression involving integers.
    True
    >>> int("123")                                # A string being converted into an integer
    123
              
  • Strings are delimited by either ' or " characters. Strings can be treated as lists of single-character strings. Another way to look at this is that there is no distinction between a character and a string: all characters are just strings of length 1. Multiline strings can be delimited using """ or ''' (i.e., three quotation mark characters at the beginning and end of the string literal).
    • The empty string is denoted using '' or "".
    • Two strings can be concatenated using +.
    • The function len() returns the length of a string.
    • Individual characters in a string can be accessed using the bracketed index notation (e.g., s[i]). These characters are also strings themselves.

    >>> 'Example.'                                # A string.
    'Example.'
    >>> "Example."                                # Alternate notation for a string.
    'Example.'
    >>> len("ABCD")                               # String length.
    4
    >>> "ABCD" + "EFG"                            # String concatenation.
    'ABCDEFG'
    >>> "ABCD"[2]                                 # Third character in the string.
    'C'
              
  • Lists are similar to arrays: they are ordered sequences of objects and/or values. The entries of a list can be of a mixture of different types, and lists containing one or more objects are delimited using [ and ], with the individual list entries separated by commas. Lists cannot be members of sets.
    • The empty list is denoted using [].
    • Two lists can be concatenated using +.
    • The function len() returns the length of a list.
    • Individual entries in a list can be accessed using the bracketed index notation (e.g., a[i]).
    • To check if a value is in a list, use the in relational operator.

    >>> [1,2,"A","B"]                             # A list.
    [1, 2, 'A''B']
    >>> [1, 2] + ['A','B']                        # Concatenating lists.
    [1, 2, 'A''B']
    >>> len([1,2,"A","B"] )                       # List length.
    4
    >>> [1,2,"A","B"][0]                          # First entry in the list.
    1
    >>> 1 in [1, 2]                               # List containment check.
    True
              
  • Tuples are similar to lists (they are ordered, and can contain objects of different types), except they are delimited by parentheses ( and ), with entries separated by commas. The main distinction between lists and tuples is that tuples are hashable (i.e., they can be members of sets).
    • The empty tuple is denoted using ().
    • A tuple containing a single object x is denoted using (x, ).
    • Two tuples can be concatenated using +.
    • A tuple can be turned into a list using the list() function.
    • A list can be turned into a tuple using the tuple() function.
    • The function len() returns the length of a tuple.
    • Individual entries in a tuple can be accessed using the bracketed index notation (e.g., t[i]).
    • To check if a value is in a tuple, use the in relational operator.

    >>> (1,2,"A","B")                             # A tuple.
    (1, 2, 'A''B')
    >>> (1,)                                      # Another tuple.
    (1,)
    >>> (1, 2) + ('A','B')                        # Concatenating tuples.
    (1, 2, 'A''B')
    >>> list((1, 2, 'A','B'))                     # A tuple being converted into a list.
    [1, 2, 'A''B']
    >>> tuple([1, 2, 'A','B'])                    # A list being converted into a tuple.
    (1, 2, 'A''B')
    >>> len((1,2,"A","B"))                        # Tuple length.
    4
    >>> (1,2,"A","B")[0]                          # First entry in the tuple.
    1
    >>> 1 in (1, 2)                               # Tuple containment check.
    True
              
  • Sets are unordered sequences that cannot contain duplicates. They are a close approximation of mathematical sets. Sets cannot be members of sets.
    • The empty set is denoted using set().
    • The methods .union() and .intersect correspond to the standard set operations.
    • A list or tuple can be turned into a set using the set() function.
    • A set can be turned into a list or tuple using the list() or list() function, respectively.
    • The function len() returns the size of a set.
    • To access individual entries in a set, it is necessary to turn the set into a list or tuple.
    • To check if a value is in a set, use the in relational operator.

    >>> {1,2,"A","B"}                             # A set.
    {1, 2, 'A''B'}
    >>> ({1,2}.union({3,4})).intersection({4,5})  # Set operations.
    {4}
    >>> set([1, 2]).union(set(('A','B')))         # Converting a list and a tuple to sets.
    {'A', 1, 2, 'B'}
    >>> len({1,2,"A","B"})                        # Set size.
    4
    >>> 1 in {1,2,"A","B"}                        # Tuple containment check.
    True
              
  • Frozen sets are like sets, except they can be members of other sets. A set can be turned into a frozen set using the frozenset() function.

    >>> frozenset({1,2,3})                        # A frozen set.
    frozenset({1, 2, 3})
    >>> {frozenset({1,2}), frozenset({3,4})}      # Set of frozen sets.
    {frozenset({3, 4}), frozenset({1, 2})}
              
  • Dictionaries are unordered collections of associations between some set of keys and some set of values. Dictionaries are also known as finite maps.
    • The empty dictionary is denoted using {}.
    • The list of keys that the dictionary associates with values can be obtained using list(d.keys()).
    • The list of values that the dictionary contains can be obtained using list(d.values()).
    • The function len() returns the number of entries in the dictionary.
    • Individual entries in a dictionary can be accessed using the bracketed index notation (e.g., d[key]).

    >>> {"A":1, "B":2}                            # A dictionary.
    {'A': 1, 'B': 2}
    >>> list({"A":1, "B":2}.keys())               # Dictionary keys.
    ['A''B']
    >>> list({"A":1, "B":2}.values())             # Dictionary values.
    [1, 2]
    >>> len({"A":1, "B":2})                       # Dictionary size.
    2
    >>> {"A":1, "B":2}["A"]                       # Obtain a dictionary value using a key.
    1
              

[link]  A.4. Function, procedure, and method invocations

Python provides a variety of ways to supply parameter arguments when invoking functions, procedures, and methods.
  • Function calls and method/procedure invocations consist of the function, procedure, or method name followed by a parenthesized, comma-delimited list of arguments. For example, suppose a function or procedure example() is defined as follows:

    def example(x, y, z):
      print("Invoked.")
      return x + y + z
              
    To invoke the above definition, we can use one of the following techniques.
    • Passing arguments directly involves listing the comma-delimited arguments directly between parentheses.

      >>> example(1,2,3)
      Invoked.
      6
                    
    • The argument unpacking operator (also known as the *-operator, the scatter operator, or the splat operator) involves providing a list to the function, preceded by the * symbol; the arguments will be drawn from the elements in the list.

      >>> args = [1,2,3]
      >>> example(*args)
      Invoked.
      6
                    
    • The keyword argument unpacking operator (also known as the **-operator) involves providing a dictionary to the function, preceded by the ** symbol; each named paramter in the function definition will be looked up in the dictionary, and the value associated with that dictionary key will be used as the argument passed to that parameter.

      >>> args = {'z':3, 'x':1, 'y':2}
      >>> example(**args)
      Invoked.
      6
                    
  • Default parameter values can be specified in any definition. Suppose the following definition is provided.

    def example(x = 1, y = 2, z = 3):
      return x + y + z
              
    The behavior is then as follows: if an argument corresponding to a parameter is not supplied, the default value found in the definition is used. If an argument is supplied, the supplied argument value is used.

    >>> example(0, 0)
    3
    >>> example(0)
    5
    >>> example()
    6
              

[link]  A.5. Comprehensions

Python provides concise notations for defining data structures and performing logical computations. In particular, it support a comprehension notation that can be used to build lists, tuples, sets, and dictionaries.
  • List comprehensions make it possible to construct a list by iterating over one or more other data structure instances (such as a list, tuple, set, or dictionary) and performing some operation on each element or combination of elements. The resulting list will contain the result of evaluating the body for every combination.

    >>> [ x for x in [1,2,3] ]
    [1, 2, 3]
    >>> [ 2 * x for x in {1,2,3} ]
    [2, 4, 6]
    >>> [ x + y for x in {1,2,3} for y in (1,2,3) ]
    [2, 3, 4, 3, 4, 5, 4, 5, 6]
              
    It is also possible to add conditions anywhere after the first for clause. This will filter which combinations are actually used to add a value to the resulting list.

    >>> [ x for x in {1,2,3} if x < 3 ]
    [1, 2]
    >>> [ x + y for x in {1,2,3} for y in (1,2,3) if x > 2 and y > 1 ]
    [5, 6]
              
  • Set comprehensions make it possible to construct a set by iterating over one or more other data structure instances (such as a list, tuple, set, or dictionary) and performing some operation on each element or combination of elements. The resulting list will contain the result of evaluating the body for every combination. Notice that the result will contain no duplicates because the result is a set.

    >>> { x for x in [1,2,3,1,2,3] }
    {1, 2, 3}
              
  • Dictionary comprehensions make it possible to construct a dictionary by iterating over one or more other data structure instances (such as a list, tuple, set, or dictionary) and performing some operation on each element or combination of elements. The resulting dictionary will contain the result of evaluating the body for every combination.

    >>> { key : 2 for key in ["A","B","C"] }
    {'A': 2, 'C': 2, 'B': 2}
              

[link]  A.6. Other useful built-in functions

The built-in function type() can be used to determine the type of a value. Below, we provide examples of how to check whether a given expression has one of the common Python types:

>>> type(True) == @bool
True
>>> type(123) == int
True
>>> type("ABC") == str
True
>>> type([1,2,3]) == list
True
>>> type(("A",1,{1,2})) == tuple
True
>>> type({1,2,3}) == set
True
>>> type({"A":1, "B":2}) == dict
True
      

[link]  A.7. Common Python definition and control constructs (i.e., Python statements)

A Python program is a sequence of Python statements. Each statement is either a function definition, a variable assignment, a conditional statement (i.e., if, else, and/or elif), an iteration construct (i.e., a for or while loop), a return statement, or a break or continue statement.
  • Variable assignments make it possible to assign a value or object to a variable.

    x = 10
              
    It is also possible to assign a tuple (or any computation that produces a tuple) to another tuple:

    (x, y) = (1, 2)
              
  • Function and procedure definitions consist of the def keyword, followed by the name of the function or procedure, and then by one or more arguments (delimited by parentheses and separated by commas).

    def example(a, b, c):
        return a + b + c
              
  • Conditional statements consist of one or more branches, each with its own boolean expression as the condition (with the exception of else). The body of each branch is an indented sequence of statements.

    def fibonacci(n):
        # Computes the nth Fibonacci number.
        if n <= 0:
            return 0
        elif n <= 2:
            return 1
        else:
            return fibonacci(n-1) + fibonacci(n-2)
              
  • Iteration constructs make it possible to repeat a sequence of statements over and over. The body of an iteration construct is an indented sequence of statements.
    • The while construct has a boolean expression as its condition (much like if). The body is executed over and over until the expression in the condition evaluates to False, or a break statement is encountered.

      def example1(n):
          # Takes an integer n and returns the sum of
          # the integers from 1 to n-1.
          i = 0
          sum = 0
          while i < n:
              sum = sum + i
              i = i + 1
          return sum

      def example2(n):
          # Takes an integer n and returns the sum of
          # the integers from 1 to n-1.
          i = 0
          sum = 0
          while True:
              sum = sum + i
              i = i + 1
              if i == n:
                  break
          return sum
                    
    • The for construct makes it possible to repeat a sequence of statements once for every object in a list, tuple, or set, or once for every key in a dictionary.

      def example3(n):
          # Takes an integer n and returns the sum of
          # the integers from 1 to n-1.
          sum = 0
          for i in range(0,n):
              sum = sum + i
          return sum

      def example4(d):
          # Takes a dictionary d that maps keys to
          # integers and returns the sum of the integers.
          sum = 0
          for key in d:
              sum = sum + d[key]
          return sum
                    

[link]  A.8. Class for representing generating functions

The following is a small, stand-alone Python class for representing generating functions:

class G():
    def __init__(self, g = {1: 1}): self.g = g
    def __getitem__(self, i): return self.g.get(i,0)
    def k(self): return list(self.g.keys())
    def __repr__(self): return self.__str__()
    def __str__(g): return " + ".join([str(g[i])+"*x**"+str(i) for i in g.g])
    def __add__(x, y): return G({i:x[i]+y[i] for i in x.k()+y.k()})
    def __radd__(x, y):
        r = {i:x[i] for i in x.g}
        r[0] = r.get(0,0) + y
        return G(r)
    def __mul__(x, y):
        g = {}
        for i in x.g:
            for j in y.g: g[i+j] = g.get(i+j,0) + x[i]*y[j]
        return G(g)
    def __rmul__(x, y): return G({i:y*x[i] for i in x.g})
    def __pow__(x, y, r = 1):
        if x.g == {1:1} and y <= 0: return G({y:1})
        while y >= 1: (r,y) = (r*x, y-1)
        return r

x = G()
      
It is possible to use the above class in the following ways:
  • Building a generating function is straightforward: you can use x and the operators *, +, and **:

    >>> x
    1*x**1
    >>> (1 + x) ** 3
    1*x**0 + 3*x**1 + 3*x**2 + 1*x**3
    >>> (1 + x + x**2) * (x**4 + x**5)
    1*x**4 + 2*x**5 + 2*x**6 + 1*x**7
              
    Accessing coefficients is possible using the bracket notation (as with dictionaries and lists):

    >>> g = (1 + x) ** 3
    >>> g
    1*x**0 + 3*x**1 + 3*x**2 + 1*x**3
    >>> g[0]
    1
    >>> g[1]
    3
    >>> g[2]
    3