$\lambda$-Calculus
Real Question in Python
To solve the Python problem below, you need to understand the theory behind it.
If you cannot solve the problem yet, or cannot follow the solution, don't worry: read the theory first, from 1.1 $\lambda$-Terms to the end, then come back to this part for the real question in Python.
# Question: what is the output value?
def foo(y):
    return lambda x: x(x(y))
def bar(x):
    return lambda y: x(y)
print((bar)(bar)(foo)(2)(lambda x: x + 1))
# Solution 1
# Step 1: Use alpha-conversion to rename the variables in case of naming conflicts.
def foo(i):
    return lambda k: k(k(i))
# Step 2: Use beta-reduction to solve (bar)(bar), i.e. call the function 'bar' with the argument 'bar'.
(lambda y: bar(y))(foo)(2)(lambda x: x + 1)
# Step 3: Use beta-reduction to solve (lambda y: bar(y))(foo), i.e. call the function 'lambda y: bar(y)' with the argument 'foo'.
bar(foo)(2)(lambda x: x + 1)
# Step 4: Use beta-reduction to solve bar(foo), i.e. call the function 'bar' with the argument 'foo'.
(lambda y: foo(y))(2)(lambda x: x + 1)
# Step 5: Use beta-reduction to solve (lambda y: foo(y))(2), i.e. call the function 'lambda y: foo(y)' with the argument '2'.
foo(2)(lambda x: x + 1)
# Step 6: Use beta-reduction to solve foo(2), i.e. call the function 'foo' with the argument '2'.
# (The result is written with the original parameter name 'x'; 'lambda k: k(k(2))' is the same function.)
(lambda x: x(x(2)))(lambda x: x + 1)
# Step 7: Use alpha-conversion to rename the variables in case of naming conflicts.
(lambda x: x(x(2)))(lambda z: z + 1)
# Step 8: Use beta-reduction to solve (lambda x: x(x(2)))(lambda z: z + 1), i.e. call the function 'lambda x: x(x(2))' with the argument 'lambda z: z + 1'.
(lambda z: z + 1)((lambda z: z + 1)(2))
# Step 9: Use the confluence of beta-reduction to solve the inner call (lambda z: z + 1)(2) first, i.e. call the function 'lambda z: z + 1' with the argument '2'.
# Note: function application in Python is left-associative, and the interpreter does not rewrite expressions symbolically like this. We are only borrowing the mathematical technique, which yields the same result.
(lambda z: z + 1)(3)
# Step 10: Use beta-reduction to solve (lambda z: z + 1)(3), i.e. call the function 'lambda z: z + 1' with the argument '3'.
4
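As a sanity check, the intermediate expressions of Solution 1 can be evaluated directly in Python. This is only a sketch: the name `inc` and the list `steps` are introduced here for illustration and are not part of the original question.

```python
# Original definitions from the question.
def foo(y):
    return lambda x: x(x(y))

def bar(x):
    return lambda y: x(y)

inc = lambda x: x + 1  # hypothetical name for the last argument

# Each entry mirrors one intermediate expression of Solution 1.
steps = [
    bar(bar)(foo)(2)(inc),
    (lambda y: bar(y))(foo)(2)(inc),
    bar(foo)(2)(inc),
    (lambda y: foo(y))(2)(inc),
    foo(2)(inc),
    (lambda x: x(x(2)))(inc),
    inc(inc(2)),
    inc(3),
]
print(steps)  # [4, 4, 4, 4, 4, 4, 4, 4]
```

Every rewriting step keeps the value unchanged, which is what the reductions claim.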
# Solution 2
# Step 1: Use alpha-conversion to rename the variables in case of naming conflicts.
def bar(w):
    return lambda z: w(z)
# If 'bar' is always given a one-argument function as its argument,
# the code above is equivalent to the code below:
# bar(f)(g) -> (lambda z: f(z))(g) -> f(g)
# So calling 'bar' with the function 'f' gives a function that behaves like 'f' itself.
# Step 2: Replace the original function above with the one below.
def bar(w):
    return w
Theory under the hood:
$$
\lambda z.((\lambda b.M)z) \space \triangleright_\beta \space \lambda z.(M[b:=z]) \space \triangleright_\alpha \space \lambda b.(M[b:=z][z:=b]) \space = \space \lambda b.M
$$
(This assumes $z$ does not occur free in $M$.)
Therefore,
$$
(\lambda w. \lambda z. wz)(\lambda b.M) =_{\beta} \lambda b.M
$$
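The simplification in Solution 2 can be checked on the original question. In this sketch, `bar_simplified` and `inc` are hypothetical names used only for the comparison, and the equivalence is assumed to hold only when `bar` receives a one-argument function.

```python
def foo(y):
    return lambda x: x(x(y))

def bar(w):
    return lambda z: w(z)

def bar_simplified(w):
    # Assumption: w is always a one-argument function.
    return w

inc = lambda x: x + 1  # hypothetical name for the last argument

original = bar(bar)(foo)(2)(inc)
simplified = bar_simplified(bar_simplified)(foo)(2)(inc)
print(original, simplified)  # 4 4
```

With the simplified version, `bar_simplified(bar_simplified)(foo)` is just `foo`, so the whole expression collapses to `foo(2)(inc)`.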
1.1 $\lambda$-Terms
- var -> variables
- abstraction -> $\lambda$ expression
- application -> execution
$$
e := \textcolor{red}{var}|\textcolor{red}{atom}|\textcolor{green}{abstraction}|\textcolor{blue}{application} \\
\textcolor{green}{abstraction} := \lambda \textcolor{red}{var}.e \\
\textcolor{blue}{application} := (e_1 e_2)
$$
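One possible way to mirror this grammar in Python is a small set of dataclasses, one per production. The class names `Var`, `Atom`, `Abs`, `App` and the helper `show` are assumptions made for this sketch, not something the grammar prescribes.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Var:          # var
    name: str

@dataclass(frozen=True)
class Atom:         # atom, e.g. a number
    value: object

@dataclass(frozen=True)
class Abs:          # abstraction := λ var . e
    param: str
    body: "Term"

@dataclass(frozen=True)
class App:          # application := (e1 e2)
    func: "Term"
    arg: "Term"

Term = Union[Var, Atom, Abs, App]

def show(e):
    """Render a term in the usual λ notation."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Atom):
        return str(e.value)
    if isinstance(e, Abs):
        return f"(λ{e.param}.{show(e.body)})"
    return f"({show(e.func)} {show(e.arg)})"

identity = Abs("x", Var("x"))
term = App(identity, Atom(1))
print(show(term))  # ((λx.x) 1)
```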
1.2 $\lambda$ Expression in Python
Examples
- $\lambda x.x+1 \rightarrow lambda \space x: x+1$
- $\lambda x. \lambda y.xy \rightarrow lambda \space x: \space lambda \space y: \space x(y)$
- $(\lambda x.2x)y \rightarrow (lambda \space x:2*x)(y)$
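The three translations above can be run as ordinary Python. The names `succ`, `apply_fn`, `double` and the value `y = 5` are chosen here only for illustration.

```python
succ = lambda x: x + 1                 # λx.x+1
apply_fn = lambda x: lambda y: x(y)    # λx.λy.xy
double = lambda x: 2 * x               # λx.2x

y = 5  # hypothetical value for the free variable y
print(succ(1))            # 2
print(apply_fn(succ)(1))  # 2
print(double(y))          # 10
```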
1.3 Free Variables
A variable $x$ in a term $e$ is bound if it is in the scope of a $\lambda x$ in $e$. Otherwise, it’s free.
Let $FV(e)$ be the free variables of $e$.
$$
f = \lambda x. \lambda y.xyz \\
FV(f)=\{z\}
$$
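The definition of $FV$ can be sketched as a short recursive function. The tuple encoding of terms used here (`("var", name)`, `("abs", param, body)`, `("app", func, arg)`) is an assumption of this example.

```python
def free_vars(e):
    """Return the set of free variable names of a tuple-encoded term."""
    kind = e[0]
    if kind == "var":
        return {e[1]}
    if kind == "abs":
        # The parameter is bound inside the body, so remove it.
        return free_vars(e[2]) - {e[1]}
    if kind == "app":
        return free_vars(e[1]) | free_vars(e[2])
    raise ValueError(f"unknown term: {e!r}")

# f = λx.λy.xyz, where application associates to the left: ((x y) z)
f = ("abs", "x",
     ("abs", "y",
      ("app", ("app", ("var", "x"), ("var", "y")), ("var", "z"))))
print(free_vars(f))  # {'z'}
```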
1.4 Substitutions
Let $e_1 [x := e_2]$ be the substitution of all free occurrences of $x$ in $e_1$ with $e_2$, changing the names of bound variables to avoid clashes.
(1) $x[x := e] \equiv e$
(2) $(e_1 e_2)[x := e_3] \equiv (e_1 [x := e_3])(e_2 [x := e_3])$
(3) $(\lambda x.e_1) [x := e_2] \equiv \lambda x.e_1$
- e.g. $(\lambda x.x+1)[x := y] \equiv \lambda x.x+1$
- Why? $x$ is not a free variable.
(4) $(\lambda y.e_1) [x := e_2] \equiv \lambda y.e_1 \quad \text{if } x \notin FV(e_1)$
- e.g. $(\lambda y. \textcolor{red}{\lambda x.}x+y)[x := z] \equiv (\lambda y. \lambda x.x+y)$
- Why? $x$ is not a free variable.
(5) $(\lambda y.e_1) [x := e_2] \equiv \lambda y.(e_1 [x := e_2]) \quad \text{if } x \in FV(e_1), \space y \notin FV(e_2)$
- e.g. $(\lambda y.x+y)[x := z] \equiv \lambda y.z+y$
(6) $(\lambda y.e_1)[x := e_2] \equiv \lambda z.((e_1 [y := z])[x := e_2]) \quad \text{if } x \in FV(e_1), \space y \in FV(e_2)$, where $z$ is a fresh variable
- e.g. $(\lambda y.x+y)[x := y] \equiv \lambda z.(x+z)[x := y] \equiv \lambda z.(y+z)$
- Why? Substituting directly would cause a naming conflict. Intuitively, the result $\lambda y.y+y$ is wrong: the substituted $y$ is a free variable (like a global variable), but it would be captured by the parameter $y$ of the lambda function. Hence we first rename the parameter $y$ to another distinct variable, like $z$ here [$\alpha$-conversion]. This renaming doesn't change the meaning of the lambda function.
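The six rules can be sketched as a capture-avoiding substitution function. This is a minimal illustration under stated assumptions: terms use a hypothetical tuple encoding (`("var", name)`, `("abs", param, body)`, `("app", func, arg)`), fresh names are generated as `v0`, `v1`, ..., and the application `x y` stands in for `x + y` because the encoding has no arithmetic.

```python
from itertools import count

def free_vars(e):
    if e[0] == "var":
        return {e[1]}
    if e[0] == "abs":
        return free_vars(e[2]) - {e[1]}
    return free_vars(e[1]) | free_vars(e[2])

def fresh(avoid):
    """Return the first name v0, v1, ... that is not in `avoid`."""
    for i in count():
        name = f"v{i}"
        if name not in avoid:
            return name

def subst(e, x, s):
    """Compute e[x := s] following rules (1) to (6)."""
    if e[0] == "var":
        return s if e[1] == x else e                          # rule (1)
    if e[0] == "app":
        return ("app", subst(e[1], x, s), subst(e[2], x, s))  # rule (2)
    _, y, body = e
    if y == x:
        return e                                              # rule (3)
    if x not in free_vars(body):
        return e                                              # rule (4)
    if y not in free_vars(s):
        return ("abs", y, subst(body, x, s))                  # rule (5)
    # Rule (6): rename the parameter y to a fresh z, then substitute.
    z = fresh(free_vars(body) | free_vars(s) | {x})
    renamed = subst(body, y, ("var", z))
    return ("abs", z, subst(renamed, x, s))

# (λy. x y)[x := y] must not capture the incoming y.
term = ("abs", "y", ("app", ("var", "x"), ("var", "y")))
result = subst(term, "x", ("var", "y"))
print(result)  # ('abs', 'v0', ('app', ('var', 'y'), ('var', 'v0')))
```

The result corresponds to $\lambda z.(y\,z)$ with `v0` playing the role of $z$.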
1.5 $\alpha$-conversion and congruence
$\alpha$-conversion: Change the name of the bound variable without causing name conflicts.
$$
\lambda y.x +y \space \triangleright_{\alpha} \space \lambda z.x+z
$$
Two terms are $\alpha$-congruent if one term can be $\alpha$-converted into the other in a finite number of steps.
$$
\lambda y.x+y \space \equiv_{\alpha} \space \lambda z.x+z
$$
Examples in Python:
def f(x): return x + 1
def f(y): return y + 1
f = lambda z: z + 1
f = lambda i: i + 1
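The condition "without causing name conflicts" matters in Python too. The following sketch uses a hypothetical global `x = 10` to show a safe renaming next to one that captures the free variable.

```python
x = 10  # hypothetical global, free in the lambdas below

original = lambda y: x + y
renamed_ok = lambda z: x + z   # valid alpha-conversion: z does not clash
renamed_bad = lambda x: x + x  # invalid: the new name captures the free x

print(original(1), renamed_ok(1), renamed_bad(1))  # 11 11 2
```

The last function is no longer the same function, which is why a renaming must pick a name that does not already occur free in the body.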
1.6 $\beta$-reduction and equivalence
$\beta$-reduction describes how computation happens in $\lambda$-calculus. The idea behind $\lambda$-calculus is that a computation is a series of function calls: we keep calling functions and passing arguments to them until we get the result. The process of calling a function and passing an argument to it is called $\beta$-reduction.
$$
(\lambda x.e)y \space \triangleright_{\beta} \space e[x := y]
$$
Call the function $\lambda x.e$ and pass it $y$. The result is the return value of the function, with the parameter $x$ replaced by the argument $y$.
In other words, when you call the function, the result is $e$ with all free occurrences of $x$ substituted by $y$.
When one term can be obtained from the other by a finite series of $\beta$-reductions, they are $\beta$-equivalent.
$$
(\lambda x.x)y \space =_{\beta} \space y
$$
Why? Because when conducting $\beta$-reduction, $(\lambda x.x)y=\textcolor{red}{x}[x := y]$. In the body $x$ on its own, $x$ is a free variable, so it can be substituted by $y$.
Examples in Python:
>>> f = lambda x: x
>>> y = 12
>>> f(y)
12
(1) The relationship between $\beta$-reduction and $\alpha$-conversion?
In $\beta$-reduction, $e[x := y]$ is a substitution. A substitution must avoid naming conflicts, and the way to avoid them is $\alpha$-conversion. Naming conflicts arise when substituting free variables, as in $(\lambda y.x+y)[x := y]$.
$\beta$-reduction itself, by contrast, is the act of calling the function and passing it the arguments. The result is the return expression of the function with every parameter replaced by its argument (the parameters were bound variables in the function, but are free in its body once the $\lambda$ is removed).
(2) Use $\beta$-reduction to evaluate pure functions in Python
To be clear, this is not exactly how Python works, but we can apply the idea when analyzing code.
Example 1:
In Python:
def f(x):
    return x + 1
f = lambda x: x + 1
In $\lambda$-calculus:
$$
f= \lambda x.x+1
$$
Through $\beta$-reduction:
$$
f(2) \space \triangleright_{\beta} \space (x+1)[x:=2] \space = \space 2+1 \space \triangleright_{\beta} \space 3
$$
Example 2:
In Python:
g = lambda x: lambda y: x + y
In $\lambda$-calculus:
$$
g = \lambda x. \lambda y.(x+y)
$$
Through $\beta$-reduction:
$$
g(1)(2) \space \triangleright_{\beta} \space (lambda \space y: x+y)[x:=1](2) \space = \space (lambda \space y:1+y)(2) \space \triangleright_{\beta} \space (1+y)[y:=2] \space = \space 1+2 \space \triangleright_{\beta} \space 3
$$
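The same steps can be followed in Python, with one difference worth seeing: Python does not rewrite the body of the inner lambda, it remembers the binding of `x` in a closure. The names `add_one` and `captured` are hypothetical, introduced only for this sketch.

```python
g = lambda x: lambda y: x + y

add_one = g(1)  # plays the role of (lambda y: 1 + y)
# The binding x = 1 is stored in a closure cell; the body is not rewritten.
captured = add_one.__closure__[0].cell_contents

print(captured)    # 1
print(add_one(2))  # 3
print(g(1)(2))     # 3
```

So the substitution $[x:=1]$ is a model of what the closure achieves, not a description of how the interpreter does it.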
(3) Church-Rosser Theorem
It concerns the confluence of $\beta$-reduction ($\triangleright_{\beta}$).
The idea: If $P \space \triangleright_{\beta} \space M$ and $P \space \triangleright_{\beta} \space N$, there exists some $T$ where $M \space \triangleright_{\beta} \space T$ and $N \space \triangleright_{\beta} \space T$.
* $P$, $M$, $N$ and $T$ are $\lambda$-terms, and $\triangleright_{\beta}$ here may stand for zero or more reduction steps.
Example 1:
$$
(\lambda x.(\lambda y.y)x)(1)
$$
Path 1 - Reducing $(\lambda y.y)x$ first:
$$
(\lambda x.(\lambda y.y)x)(1) \space \triangleright_{\beta} \space (\lambda x.x)(1) \space \triangleright_{\beta} \space 1
$$
Path 2 - Reducing $(\lambda x.(\lambda y.y)x)(1)$ first:
$$
(\lambda x.(\lambda y.y)x)(1) \space \triangleright_{\beta} \space (\lambda y.y)(1) \space \triangleright_{\beta} \space 1
$$
It indicates that the result does not depend on which function is called first.
Example 2 - Confluence in Python:
Path 1 - Reducing $(\lambda y.y)x$ first:
* Note that in Python, the statements inside a function are not executed until the function is called. Here, we apply the mathematical technique of confluence to get the same result by evaluating the inner expression first.
$$
(lambda \space x: (lambda \space y: y)(x))(1) \space \triangleright_{\beta} \space (lambda \space x:x)(1) \space \triangleright_{\beta} \space 1
$$
Path 2 - Reducing $(\lambda x.(\lambda y.y)x)(1)$ first:
$$
(lambda \space x: (lambda \space y: y)(x))(1) \space \triangleright_{\beta} \space (lambda \space y:y)(1) \space \triangleright_{\beta} \space 1
$$
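Both paths can be compared with what Python actually does. In this sketch, the named functions `outer` and `inner` and the list `trace` are hypothetical stand-ins for the two lambdas, added only so that the real call order becomes visible.

```python
trace = []  # records which function body runs, in order

def inner(y):          # stands for (lambda y: y)
    trace.append("inner")
    return y

def outer(x):          # stands for (lambda x: (lambda y: y)(x))
    trace.append("outer")
    return inner(x)

actual = outer(1)  # Python enters the outer function first

# Path 1 by hand: simplify the inner call (lambda y: y)(x) to x first.
path1 = (lambda x: x)(1)
# Path 2 by hand: substitute x := 1 in the outer function first.
path2 = (lambda y: y)(1)

print(actual, path1, path2)  # 1 1 1
print(trace)                 # ['outer', 'inner']
```

Python's own order matches Path 2, while Path 1 is a purely mathematical shortcut; all three values agree, as confluence predicts.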
Reference:
YouTube: Lecture 9: Higher-Order Functions - Yong Qi Foo