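The first unit in this series is Aerial Robotics. The posts in the series so far:
1 Robotics: Aerial Robotics, Weeks 1+2: course notes and quiz solutions
1 Robotics: Aerial Robotics, Weeks 3+4: course notes and quiz solutions
2 Robotics: Computational Motion Planning, Week 1 (including a step-by-step walkthrough of the Dijkstra and A* MATLAB code): quiz solutions
2 Robotics: Computational Motion Planning, Weeks 2+3+4: quiz solutions
3 Robotics: Mobility: course notes and quiz solutions
Watching this course on Coursera requires a VPN, so here are both the Bilibili and the Coursera course links:
- Bilibili uploader: the version subtitled by the blogger himself; this unit… is English only (continuously updated)
- Coursera course page
This post is only a record of my lecture notes and my thinking on the quizzes. Corrections and good ideas are welcome in the comments. (A like would be even better, as encouragement~)
- 9.6: I actually worked through this unit back in March together with the others, but I honestly find that this unit is not explained very clearly. If you are interested in the vision side, I recommend Andrew Ng's Deep Learning series (approachable with no background, and worth rewatching several times to think things over). PS: linear algebra is very important, the kind where you understand the essence; 3Blue1Brown's Essence of Linear Algebra on Bilibili is a good way to refresh your understanding.
So I did not follow this unit very carefully either, and I did the quizzes only to pass them, which means many answers are trial combinations. An X after a combination means that combination turned out to be incorrect… If you followed the lectures carefully, please leave the correct answers in the comments for everyone's reference.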
WEEK 1
Introduction
1.In the equation $\dfrac{1}{f} = \dfrac{1}{a} + \dfrac{1}{b}$, what does $f$ stand for:
Focal Length
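Explanation: this is the thin lens equation, where $f$ is the focal length, $a$ is the distance from the object to the lens, and $b$ is the distance from the lens to the image plane.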
2.If an object is originally in focus and then you start moving the image plane, what do you expect to happen:
Image starts blurring
$\dfrac{1}{f} \ne \dfrac{1}{a} + \dfrac{1}{b}$
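The two questions above can be checked numerically. The following is a minimal sketch, assuming the thin lens model with $a$ as the object distance and $b$ as the image distance; the 50 mm focal length and the function names are hypothetical values chosen only for illustration.

```python
def image_distance(f, a):
    """Solve the thin lens equation 1/f = 1/a + 1/b for b."""
    if a == f:
        raise ValueError("object at the focal distance: the image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / a)


def magnification(f, a):
    """Ratio of image size to object size, b / a, for an in-focus object."""
    return image_distance(f, a) / a


f = 0.05  # hypothetical 50 mm lens, in metres
b_near = image_distance(f, 1.0)  # in-focus image plane position for an object at 1 m
b_far = image_distance(f, 2.0)   # in-focus image plane position for an object at 2 m

# Moving the image plane away from b_near breaks the equality, so the image blurs.
moved = b_near + 0.001
residual = 1.0 / f - (1.0 / 1.0 + 1.0 / moved)
print(b_near, b_far, residual)
```

The magnification `b / a` shrinks as the object moves away, which matches the "False" answer to the question about projection size increasing with distance.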
3.The size of the projection of an object increases as the object distance from the lens increases.
False
4.Parallel lines in the world remain always parallel after projection.
False
5.Parallel lines in the world remain parallel in the image plane when
the lines are parallel to the image plane
6.A vanishing point in an image is the intersection of projections of parallel lines in the world. There is at most one vanishing point in an image
False
7.The two parameters that we can directly control using the bi-perspectograph construction are:
Height of the camera
Focal Length
Vanishing Points
1.The School of Athens is a famous fresco by Raphael. Correct perspective projection is visible here. From the three specified points (A, B, or C), which is the vanishing point? (You need to use a ruler)
C
2.From the three options $(l_1, l_2, l_3)$, which is the horizon?
$l_3$
3.In the following image, from the three options $(l_1, l_2, l_3)$, which is the horizon?
$l_1$
4.A vanishing point is always visible inside an image
False
5.The horizon is the set of all directions to infinity for a plane
True
Perspective Projection
1.Assume you are given a line represented in the form $2x+2y-2\sqrt{2}=0$. Which set of parameters $(\rho,\theta)$ gives the same line represented in the form $\rho = x \cos\theta + y \sin \theta$:
$(1,45^\circ)$
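The conversion can be sketched in a few lines: dividing $ax+by+c=0$ by $\sqrt{a^2+b^2}$ gives the normal form directly. This is only an illustrative sketch; the function name `line_to_polar` is hypothetical and the choice $\rho \ge 0$ is an assumed convention.

```python
import math


def line_to_polar(a, b, c):
    """Convert a*x + b*y + c = 0 into (rho, theta in degrees) with rho >= 0,
    so that the same line reads rho = x*cos(theta) + y*sin(theta)."""
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("a and b cannot both be zero")
    rho = -c / norm
    cos_t, sin_t = a / norm, b / norm
    if rho < 0:  # flip the normal so that the distance is non-negative
        rho, cos_t, sin_t = -rho, -cos_t, -sin_t
    return rho, math.degrees(math.atan2(sin_t, cos_t))


rho, theta = line_to_polar(2, 2, -2 * math.sqrt(2))
print(rho, theta)  # approximately 1 and 45 degrees
```

The same function applied to $x+y-3\sqrt{2}=0$ returns approximately $\rho=3$, $\theta=45^\circ$, which is the line of the next question.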
2.The distance of a line to the origin is $\rho=3$ and the normal direction of the line is $\theta = \pi/4$. Which of the following is/are valid equations for the line?
$x+y-3\sqrt{2}=0$
3.What is the equation of the line passing through points with homogeneous coordinates $(1,2,1)$ and $(-1,3,1)$?
$2x+4y-10=0$
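In homogeneous coordinates, the line through two points is their cross product. A minimal sketch (the helper names `cross` and `dot` are my own):

```python
def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


p1 = (1, 2, 1)
p2 = (-1, 3, 1)
line = cross(p1, p2)
print(line)  # (-1, -2, 5), i.e. -x - 2y + 5 = 0

# Both points lie on the line: l^T p = 0.
print(dot(line, p1), dot(line, p2))  # 0 0
```

Multiplying $(-1,-2,5)$ by $-2$ gives $(2,4,-10)$, the same line as the answer $2x+4y-10=0$, since homogeneous coordinates are defined only up to scale.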
4.The lines $l_1=(1,1,0)$ and $l_2=(-1,1,1)$ intersect at the point with homogeneous coordinates:
$(0.5,-0.5,1)$
5.Consider the lines $y=1$ and $y=2$ in 2D projective space (as previous questions). What is the point of intersection in homogeneous coordinates?
They do not intersect
$(1,0,0)$
$(-1,0,0)$
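By duality, the intersection of two lines is also a cross product. The sketch below (helper names are my own) reproduces the two intersection questions above, including the parallel case, where the last coordinate comes out as zero and the point lies at infinity.

```python
def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def dehomogenize(p):
    """Scale so the last coordinate is 1; return None for a point at infinity."""
    if p[2] == 0:
        return None
    return (p[0] / p[2], p[1] / p[2], 1.0)


# l1 = (1, 1, 0) and l2 = (-1, 1, 1)
meet = cross((1, 1, 0), (-1, 1, 1))
print(meet, dehomogenize(meet))  # (1, -1, 2) (0.5, -0.5, 1.0)

# y = 1 is the line (0, 1, -1); y = 2 is the line (0, 1, -2)
at_infinity = cross((0, 1, -1), (0, 1, -2))
print(at_infinity, dehomogenize(at_infinity))  # (-1, 0, 0) None
```

$(-1,0,0)$ and $(1,0,0)$ differ only by a scale factor, so they denote the same point at infinity in the direction of the x-axis.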
Rotations and Translations
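For this section I recommend watching 3blue1brown's linear algebra series on Bilibili (it explains things much better than the course itself… it takes only a few minutes to understand).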
1.What is the determinant of a rotation matrix?
+1
2.What is the rotation ${}^{c}R_{w}$ such that $X_c={}^cR_wX_w+{}^cT_w$ for a point $X_w$ expressed in the world coordinate frame?
${}^{c}R_{w}$ = (-1 0 0; 0 0 -1; 0 -1 0)
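Explanation (this makes more sense after watching 3blue1brown's Essence of Linear Algebra): take w as the reference frame, that is, express the axes of c in terms of w. Comparing $X_w$ and $X_c$, the first column is (-1 0 0) because $X_c$ is $-X_w$. Next, look at how $Y_c$ is expressed: $Y_c$ corresponds to $Z_w$ of w, again with a minus sign, so the second column is (0 0 -1), and so on.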
3.What is the corresponding translation ${}^{c}T_{w}$?
${}^{c}T_{w}=(0,0,-2)$
4.What is ${}^wR_{c}$?
${}^wR_{c}$ = (-1 0 0; 0 0 -1; 0 -1 0)
5.What is ${}^wT_{c}$?
${}^wT_{c}=(0,-2,0)$
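Questions 2 to 5 are linked by the inverse of a rigid transform: if $X_c = R X_w + T$, then $X_w = R^T X_c - R^T T$. A minimal sketch that checks the recorded answers against each other (the helper names are my own, and the numbers are simply the answers listed above):

```python
def transpose(R):
    return [list(col) for col in zip(*R)]


def matvec(R, v):
    return [sum(r * x for r, x in zip(row, v)) for row in R]


def invert_pose(R, T):
    """Given X_c = R X_w + T, return (R', T') such that X_w = R' X_c + T'."""
    Rt = transpose(R)
    return Rt, [-x for x in matvec(Rt, T)]


cRw = [[-1, 0, 0], [0, 0, -1], [0, -1, 0]]
cTw = [0, 0, -2]

wRc, wTc = invert_pose(cRw, cTw)
print(wRc)  # [[-1, 0, 0], [0, 0, -1], [0, -1, 0]]
print(wTc)  # [0, -2, 0]
```

Here the rotation happens to be symmetric, so ${}^wR_c$ equals ${}^cR_w$; in general the inverse rotation is the transpose.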
6.For the quadrotor configuration in the two images below (top view and side view), what is the transformation from the body (imu) coordinate system to the camera?
In particular, what is the rotation ${}^{c}R_{b}$ such that $X_c={}^cR_bX_b+{}^cT_b$ for a point $X_b$ expressed in the body coordinate frame?
[Top View] (Distance between origins on XY plane is 4cm)
With $x=\sqrt{2}/2$:
(x -x 0;-x -x 0;0 0 -1)
7.What is the corresponding translation ${}^{c}T_{b}$?
(-0.04,0,-0.03)
Dolly Zoom
1.Given Image 1, which of the four other images (2-5) would be the final result if we reduce the focal length?
Image 5
2.For the five images above, for which one do you think that the camera is the farthest away from the scene?
Image 3
Feeling of Camera Motion
1.You are given two images of a scene, before and after a change in the camera. Which transformation can produce this result?
Movement of the camera on the horizontal axis
2.You are given two images of a scene, before and after a change in the camera. Which transformation can produce this result?
Movement of the camera on the vertical axis
3.You are given two images of a scene, before and after a change in the camera. Which transformation can produce this result?
Rotation of the camera around the z-axis
How to Compute Intrinsics from Vanishing Points
1.In the image below, we can see the projections of three orthogonal vanishing points $V_1, V_2, V_3$ and the image center $C$. Which of the following statements is always true?
The image center is the orthocenter of the triangle formed by the projections of three orthogonal vanishing points.
2.Assume that the image center has been computed using the result of the previous question. Then, under which conditions can we compute the focal length from the image projections of three orthogonal vanishing points?
At least two of the vanishing points are not at infinity.
Camera Calibration
1.The calibration procedure estimates:
All the above
2.Which two of the four images below suffer mostly from radial distortion effects?
A
D
3.For calibration you need to know the size of the checkerboard squares
True
WEEK 2
Homogeneous Coordinates
1.The homogeneous coordinates of a point $P$ are $(1,2,1)$. Which of the following (homogeneous) coordinates represent the same point?
(2,4,2)
(-0.5,-1,-0.5)
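Two homogeneous coordinate vectors represent the same point exactly when one is a nonzero multiple of the other, which is equivalent to their cross product being zero. A small sketch of that check (the function name `same_point` is hypothetical, and exact comparison with zero is only safe here because the example values are exactly representable):

```python
def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def same_point(p, q):
    """True if p and q are nonzero and proportional."""
    if not any(p) or not any(q):
        return False  # (0, 0, 0) is not a valid homogeneous point
    return all(c == 0 for c in cross(p, q))


P = (1, 2, 1)
print(same_point(P, (2, 4, 2)))         # True
print(same_point(P, (-0.5, -1, -0.5)))  # True
print(same_point(P, (1, 2, 2)))         # False
```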
2.Given a square ABCD, with $A = (0,0,1)$ and $C = (1,1,1)$, the equation of the diagonal BD in $P^2$ has the form $l^Tx=0$ with $l$ equal to
Clarification: For this and following questions, we use P2 to denote the real projective plane.
(-1,-1,1)
3.Determine the equation of the line in $P^2$ through the points $(a,0,1)$ and $(0,b,1)$.
-b -a ab
4.Determine the equation of the line in P2 through the points $(a,b,c)$ and $(d,e,0)$.
-ce cd ae-bd X
5.0 0 ae-bd
Projective Transformations
1.What is the least number of non-collinear points required to estimate a projective transformation $H:\mathbb{P}^2 \rightarrow \mathbb{P}^2$?
4
2.A projective transformation $M$ preserves the points $(1,0,0),(0,1,0)$, and the origin of the coordinate system. However, it maps the point $(1,1,1)$ to the point $(2,1,1)$, meaning $(2,1,1)^{T} = M (1,1,1)^{T}$. Compute $M$.
$M \sim \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
3.Find the projective transformation $A$ which will keep the points $(0,0,1)$ and $(1,1,1)$ fixed and will map point $(1,0,1)$ to $(1,0,0)$ and point $(0,1,1)$ to $(0,1,0)$?
$A \sim \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ -1 & -1 & 1 \end{pmatrix}$
4.Find the projective transformation $A$ that maps the points $(1,0,0)$, $(0,1,0)$, $(0,0,1)$, and $(1,1,1)$ to the points $(-2,0,1)$, $(0,1,-1)$, $(-1,2,-1)$ and $(-1,1,1)$, respectively.
$A \sim \begin{pmatrix} -2/3 & 0 & 1 \\ 0 & 5/3 & -2 \\ 1/3 & -5/3 & 1 \end{pmatrix}$
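When the source points are the canonical basis plus $(1,1,1)$, as in questions 2 and 4, the columns of the transformation are scaled copies of the first three target points, and the scales follow from the fourth correspondence. Below is a minimal sketch using exact fractions and Cramer's rule; the function names are my own, and this is one way to do the computation, not necessarily the method used in the lectures.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def homography_from_basis(q1, q2, q3, q4):
    """Return A (up to scale) with A e1 ~ q1, A e2 ~ q2, A e3 ~ q3, A (1,1,1) ~ q4."""
    # P has q1, q2, q3 as columns; solve P * scales = q4 by Cramer's rule.
    cols = [q1, q2, q3]
    P = [[Fraction(cols[j][i]) for j in range(3)] for i in range(3)]
    d = det3(P)
    if d == 0:
        raise ValueError("q1, q2, q3 must not be collinear")
    scales = []
    for k in range(3):
        Pk = [row[:] for row in P]
        for i in range(3):
            Pk[i][k] = Fraction(q4[i])
        scales.append(det3(Pk) / d)
    # Column j of A is scales[j] * q_j.
    return [[scales[j] * P[i][j] for j in range(3)] for i in range(3)]


A = homography_from_basis((-2, 0, 1), (0, 1, -1), (-1, 2, -1), (-1, 1, 1))
print([[str(v) for v in row] for row in A])

M = homography_from_basis((1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 1, 1))
print([[str(v) for v in row] for row in M])
```

For question 4 this yields rows $(2,0,-3)$, $(0,-5,6)$, $(-1,5,-3)$, which is $-3$ times the matrix in the answer and therefore the same projective transformation.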
Vanishing Points
1.When the camera is zooming, do the vanishing points move?
Yes
2.What camera change would give the following result from Image 1 to Image 2
(Hint: Notice how the vanishing points change)
Camera Translation
3.What camera change would give the following result from Image 1 to Image 2
(Hint: Notice if the vanishing points change)
Zooming
4.z
h-b+bh 3,3 = bh
Cross Ratios and Single View Metrology
1.For the image below, if AB=12, BC=4 and CD=8, what is the cross ratio CR(A,B,C,D)?
2
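The value 2 can be reproduced with a short calculation. The sketch below assumes the convention $CR(A,B,C,D) = \dfrac{AC \cdot BD}{AD \cdot BC}$ for four collinear points in that order; other orderings of the cross ratio exist, and this one is chosen because it matches the recorded answer.

```python
def cross_ratio(ab, bc, cd):
    """Cross ratio (AC * BD) / (AD * BC) from the three consecutive segment
    lengths of four collinear points A, B, C, D."""
    ac = ab + bc
    bd = bc + cd
    ad = ab + bc + cd
    return (ac * bd) / (ad * bc)


print(cross_ratio(12, 4, 8))  # 2.0
```

Because the cross ratio is preserved by perspective projection, computing it on both ABCD and A'B'C'D' is enough to decide whether one can be a perspective image of the other.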
2.For the same image as the previous question, if
1.91
3.Is it possible that the image of A’B’C’D’ is the result of a perspective projection from ABCD? (assume that the lengths are the same as those from the previous two questions)
No
4.If not, what should be the length A’B’, such that A’B’C’D’ is indeed the result of a perspective projection from ABCD
6
WEEK 3
Visual Features
1.Features can be useful for
Panorama stitching
Scene reconstruction
Image retrieval
Image based localization
2.What properties of features are desirable?
X Detection variance
Descriptor invariance
91.3 Detection invariance
Descriptor variance
83.3 Detection variance
Descriptor variance
3.A scale space of an image can be built by
convolving with gaussian filters and subsampling
4.The scale of a feature is chosen by first convolving the corresponding image patch with Difference-of-Gaussian (DoG) filters and then, by taking the maximum response over all scales.
True
5.The SIFT detector is
Scale and rotation invariant
6.To compute the SIFT descriptor
You compute a histogram of gradients in a 16 by 16 grid and rotate them to have the largest magnitude gradient oriented upwards
Singular Value Decomposition
1.If $U\Sigma V^T$ is an SVD for a given matrix $A$ then which of the following statements are true?
U is orthogonal and Σ is diagonal
U and V are orthogonal matrices
2.A symmetric real matrix has real eigenvalues and real singular values. Which of the following is true?
All singular values are nonnegative
Singular values are equal to the eigenvalues X
All eigenvalues are nonnegative
Singular values are equal to the eigenvalues X
All singular values are nonnegative
All eigenvalues are nonnegative X
In other words, all of the combinations above are wrong… I ran out of time.
3.The largest singular value of is
2
4.Which of the following are valid SVD’s of the form $U\Sigma V^T$ for the matrix
U=-1
U=1;-1;-1
5.Find the rank of the matrix
4
6.Which of the following is true?
The rank of a matrix is equal to the number of nonzero singular values.
7.The minimizer of the fitting cost $\|Ax\|_2^2$ with $A\in\mathbb{R}^{m\times n}$, rank(A)>n subject to
The rank of matrix has nothing to do with its singular values.
8.Consider the points (0,-0.8), (1,0), (2.2,0.9), (2.9,2.1). Which of the following lines best fits the given points?
$0.58x-0.59y=0.57$
RANSAC
1.Assume we have a case for RANSAC with 300 samples and 200 inliers. If we pick $n = 10$ samples to build our model, what is the probability that we will build the correct model? (Use 3 decimals of precision)
0.017
Explanation: $(2/3)^{10}$
2.For the same description, what is the probability that we won’t build a correct model after $k = 100$ iterations? (Use 3 decimals of precision)
0.174
Explanation: $(1-(2/3)^{10})^{100}$
3.How many iterations will we need at least, in case the desired RANSAC success rate is p≥0.99?
264
Explanation: $\log(0.01)/\log(1-(2/3)^{10})$, rounded up to the next integer
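The three RANSAC answers follow from the same inlier ratio. A minimal sketch of the arithmetic, assuming (as the explanations above do) that each of the 10 picks is an inlier independently with probability 200/300:

```python
import math

inlier_ratio = 200 / 300
n = 10  # samples per model

# Probability that all n picked samples are inliers.
p_good = inlier_ratio ** n

# Probability that none of k = 100 independent iterations yields a good model.
p_fail_100 = (1 - p_good) ** 100

# Smallest k with success probability at least 0.99.
k_min = math.ceil(math.log(1 - 0.99) / math.log(1 - p_good))

print(round(p_good, 3), round(p_fail_100, 3), k_min)  # 0.017 0.174 264
```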
3D-3D Pose
1.Find the rotation matrix $R$ such that $\|A-RB\|_F^2$ is minimized, where
$A = \begin{bmatrix} 1&1&-1&-1\\ 1&-1&-1&1\\ 1&1&1&1 \end{bmatrix}$,
$B = \begin{bmatrix} -1.2131&-1.4413&0.3470&0.5752\\ 0.0851&-0.7858&-1.6594&-0.7885\\ -1.2334&0.5525&0.3550&-1.4309 \end{bmatrix}$
2-0.8941
Pose Estimation
1.What is the minimum number of point correspondences required for camera pose estimation given the perspective projections of points with known world coordinates?
3
2.What is the maximum number of solutions obtained from solving the P3P?
4
3.Assume that all points in the world lie on the plane
K(r1 r2 T)
4.Assume that all points in the world lie on the plane
K(r1 r3 T)
WEEK 4
Epipolar Geometry
1.Let
X
x2.T T x1=0
x2.T R x1=0
X
x2.T T x1=0
x2.T x1=0
X
x2.T T x1=0
x2.T T R x1=0
X
x2.T x1=0
x2.T T R x1=0
X
x2.T R x1=0
x2.T T R x1=0
X
x2.T x1=0
x2.T R x1=0
2.is a rotation matrix, which of the following properties hold?
X
u.T u=0.T
R.t u R =R.T u
X
u.T u=0.T
uu=0
X
u.T u=0.T
u.T=-u
X
u.T=-u
uu=0
X
u.T=-u
R.t u R =R.T u
X
uu=0
R.t u R =R.T u
3.Let two cameras with poses
X
E=R1.T T21 R12
E=R1.T T12 R2
X
E=R1.T T21 R12
E=R1.T T12 R12
X
E=R1.T T21 R12
E=R1.T T21 R2
X
E=R1.T T12 R2
E=R1.T T21 R2
X
E=R1.T T21 R2
E=R1.T T12 R12
X
E=R1.T T12 R2
E=R1.T T12 R12
4.The relative pose between two views is $(R,T)\in SE(3)$ where $R=I$ and $T$ corresponds to a translation of 1m in the direction of the z-axis, which of the following is a valid essential matrix? Hint: use the fact that $E=\hat{T}R$.
0 -1 0;1 0 0;0 0 0
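The hint can be followed mechanically: build the skew-symmetric matrix $\hat{T}$ of the translation and multiply by $R$. A minimal sketch with my own helper names; the 3D test point is an arbitrary illustrative value.

```python
def skew(t):
    """Skew-symmetric matrix T^ such that (T^) v equals the cross product t x v."""
    x, y, z = t
    return [[0, -z, y],
            [z, 0, -x],
            [-y, x, 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # R = I
T = (0, 0, 1)                          # 1 m along the z-axis

E = matmul(skew(T), R)
print(E)  # [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Epipolar constraint check with an arbitrary point seen from both views.
X1 = [1, 2, 5]                                      # point in the first camera frame
X2 = [a + b for a, b in zip(matvec(R, X1), T)]      # same point in the second frame
print(sum(a * b for a, b in zip(X2, matvec(E, X1))))  # 0
```

The constraint $x_2^T E x_1 = 0$ holds for any scale of $x_1$ and $x_2$, so the unnormalized 3D coordinates can be used directly in the check.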
5.The relative pose between two
$E = \begin{bmatrix} 0&0&0\\ 0&0&-1\\ 0&1&0 \end{bmatrix}$
6.A nonzero matrix
sigma sigma 0 >
7.Given
sigma sigma 0 12/2
8.How many point correspondences are required to obtain an essential matrix using the linear algorithm?
8
9.Which of the following are valid essential matrices?
0 0 0;x 0 -x;0 1 0
1 1 0
10.Suppose we know the camera motion always moves on a plane, say the XY-plane (i.e. translation with only x and y components and rotation only about the z-axis). The essential matrix $E=\hat{T}R$ has the special form
0 0 a
11.Now, assuming the same scenario as in the previous question, which of the following solutions for
$T = \begin{bmatrix} -b\\ a\\ 0 \end{bmatrix},\quad R = \begin{bmatrix} -bd-ac & -ad+bc & 0\\ ad-bc & -bd-ac & 0\\ 0&0&1 \end{bmatrix}$
12.In general, given a normalized essential matrix, we get $m$ distinct poses $(R,T)$ and by enforcing the positive depth constraint, we end up with $n$ valid poses. Which of the following is true?
$(m,n)=(4,1)$
Nonlinear Least Squares
1.Which of the following cost functions can be minimized in the framework of linear least squares? (Note the underscore on the norm refers to the p-norm)
$f(x) = \|Ax-b\|_2^2$
2.Consider the problem of minimizing $f(x) = \|Ax-b\|_2^2$, where the rank of $A$ is larger than the dimension of $x$. Which of the following corresponds to the optimality condition?
$A^TAx=A^Tb$
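The optimality condition says that the residual $b - Ax$ is orthogonal to the columns of $A$. The sketch below solves the normal equations for a two-parameter problem with Cramer's rule and checks that property; the data and the function name are hypothetical, and a numerical library would normally be used for anything larger.

```python
def solve_normal_equations_2(A, b):
    """Solve A^T A x = A^T b for an m x 2 matrix A given as a list of rows."""
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    t1 = sum(r[0] * bi for r, bi in zip(A, b))
    t2 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("A^T A is singular: the columns of A are dependent")
    return [(t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]


# Fit y = x0 + x1 * t to three hypothetical samples (t, y): (0, 1), (1, 3), (2, 4).
A = [[1, 0], [1, 1], [1, 2]]
b = [1, 3, 4]
x = solve_normal_equations_2(A, b)

# The residual is orthogonal to both columns of A at the optimum.
residual = [bi - (r[0] * x[0] + r[1] * x[1]) for r, bi in zip(A, b)]
At_r = [sum(r[j] * ri for r, ri in zip(A, residual)) for j in range(2)]
print(x, At_r)
```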
3.Minimizing $\|f(x)-b\|^2$ is prone, in general, to the existence of local minima.
True
4.Examples of nonlinear least squares problems include
Triangulation
Perspective-n-Point
5.Assume we want to minimize $\|f(x)-b\|_2^2$. Then, the (globally) optimal solution satisfies
f(x)=b
6.If a point satisfies the condition of the previous question then it is globally optimal.
False
3D Velocities from Optical Flow
1.The equation of optical flow given in Lecture is:
Heading Direction
2.What was the constraint
X N
X I-J
B
3.In trying to minimize
Taylor
The second
Iterating
Bundle Adjustment
1.Bundle adjustment corresponds to minimization of
Reprojection Error
2.Bundle adjustment corresponds to optimization of a cost function with respect to
All of above
3.Assume that we want to minimize
b-f(x)
4.Which of the following tools are useful in a visual odometry framework
Bundle adjustment over sliding window
Key frame selection
Visual
5.Select any answer that is an indispensable part of a structure from motion pipeline.
B
P
E
O
OBI
OOI