Related Work
- Cognitive science
- Learning physics from visual observations
- Stability prediction
- End-to-end approach
3.The ShapeStacks Dataset
3.1.Dataset Content
Every recorded image carries a binary stability label. Also, every image is aligned with a segmentation map relating the different parts of the image to their semantics with regard to stability. The segmentation map annotates the object which violates the stability of the tower, the first object to fall during the collapse and the base and top of the tower.
3.2.The Mechanics of Stacking
Goal
study intuitive physics and the emergence of object affordances
Importance
a precise understanding of the physical properties of the scenarios is essential to control data generation as well as to evaluate models.
Constraint
- each object S rests on top of another object S_0 or the ground plane
- no two objects are at the same level.
- exclude structures such as arches, multiple columns, forks, etc.
- all objects are convex
The notion of Centre of Mass(CoM)
\(p = (x,y,z) \in S_i \subset \mathbb{R}^3\): a point contained within the rigid body \(S_i\)
m: the mass
the material is homogeneous with density \(\rho\)
CoM: \(r_i = \frac{\rho}{m} \int_{S_i} p \, dx\,dy\,dz\)
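A short worked step, added here as a sanity check on the definition (the volume symbol \(V_i\) is introduced only for this illustration): for a homogeneous body the mass is
\(m = \rho \int_{S_i} dx\,dy\,dz = \rho V_i\)
so the density cancels and
\(r_i = \frac{1}{V_i} \int_{S_i} p \, dx\,dy\,dz\)
i.e. under the homogeneity assumption the CoM coincides with the geometric centroid of \(S_i\).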
Study the stability of an object on top of another and then generalize the result to a full stack
Lemma 1
Let \(S_1,…,S_n\) be a collection of convex rigid bodies forming a single-stranded tower resting on a flat ground plane \(S_0\). Let \(m_1,…,m_n\) be the masses of the objects and \(r_1,…,r_n\) their centres of mass. Furthermore, let \(A_i\) be the contact surface between objects \(S_{i-1}\) and \(S_i\), and let \(\pi_i \supset A_i\) be the plane containing it. Assume that each \(\pi_i\) is parallel to the xy plane, which in turn is orthogonal to gravity. Then, if the objects are initially at rest, the tower is stable if, and only if,
\(\forall i = 1,\dots,n-1: \; \mathrm{Proj}_{\pi_i}(r_{i+1}^n) \in A_i, \quad r_{i+1}^n = \frac{\sum_{j=i+1}^n m_j r_j}{\sum_{j=i+1}^n m_j}\)
\(r_{i+1}^n\) is the overall CoM of the topmost n-i blocks
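The criterion in Lemma 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: each contact surface is approximated by an axis-aligned rectangle, each block stores the surface on which the blocks above it rest, and the names `Block`, `top_contact`, `is_stable` and `unit_cube_tower` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed shape for a contact surface: (xmin, xmax, ymin, ymax) in the xy plane.
Rect = Tuple[float, float, float, float]


@dataclass
class Block:
    mass: float
    com: Tuple[float, float, float]  # centre of mass r_j
    top_contact: Rect                # surface on which the blocks above rest (assumption)


def combined_com(blocks: Sequence[Block]) -> Tuple[float, float, float]:
    """Mass-weighted mean of the centres of mass (the overall CoM in the lemma)."""
    total = sum(b.mass for b in blocks)
    return tuple(sum(b.mass * b.com[k] for b in blocks) / total for k in range(3))


def is_stable(tower: List[Block]) -> bool:
    """Tower is listed bottom to top; check the CoM criterion at every contact."""
    for i in range(len(tower) - 1):
        x, y, _ = combined_com(tower[i + 1:])  # projecting onto the plane drops z
        xmin, xmax, ymin, ymax = tower[i].top_contact
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            return False
    return True


def unit_cube_tower(x_offsets: Sequence[float]) -> List[Block]:
    """Example helper: unit cubes of mass 1 stacked with the given x offsets."""
    tower = []
    for level, x in enumerate(x_offsets):
        nxt = x_offsets[level + 1] if level + 1 < len(x_offsets) else x
        # Contact rectangle = overlap of this cube's top face with the next cube's base.
        contact = (max(x, nxt) - 0.5, min(x, nxt) + 0.5, -0.5, 0.5)
        tower.append(Block(1.0, (x, 0.0, level + 0.5), contact))
    return tower


print(is_stable(unit_cube_tower([0.0, 0.0, 0.0])))  # aligned cubes
print(is_stable(unit_cube_tower([0.0, 0.4, 0.8])))  # staggered cubes
```

In the staggered tower the top cube alone sits within its own contact area, yet the combined CoM of the two upper cubes falls outside the contact area below them. This is why the check uses the overall CoM of all blocks above each surface rather than testing pairs in isolation.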
Two types of instabilities
- violation of the planar surface criterion (VPSF)
curved surface --> infinitesimally small contact area
- violation of the centre of mass criterion (VCOM)
4.Stability Prediction
from RGB images ----> passive observations of stable and unstable stacks
4.1.Training the Stability Predictor
From ShapeStacks dataset, annotated with binary stability labels.
4.2.Instability Localisation
comparing the network’s attention maps with the corresponding ground truth stability segmentation maps
- an occlusion study whereby images are blurred using a Gaussian filter (the blurred patch does not have rigid boundaries but gradually fades into the image)
Stability classifier input: the patched images ----> the predicted stability scores are aggregated in a map
Focus: defined as the smallest rectangle enclosing the violating object and the first object to fall
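The occlusion study above can be sketched roughly as follows. This is a toy version under several assumptions: images are small grayscale grids stored as nested lists, `predict` stands in for the trained stability classifier, and `stride`, `blur_sigma` and `patch_sigma` are hypothetical parameters rather than values from the paper. The soft patch is obtained by blending a blurred copy into the original with a Gaussian-shaped weight, so it has no hard boundary.

```python
import math
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalised 1D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(img: Image, sigma: float) -> Image:
    """Separable Gaussian blur with edge pixels replicated at the border."""
    kernel = gaussian_kernel(sigma)
    r = len(kernel) // 2
    h, w = len(img), len(img[0])

    def clamp(v: int, size: int) -> int:
        return min(max(v, 0), size - 1)

    rows = [[sum(kernel[k + r] * img[y][clamp(x + k, w)] for k in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + r] * rows[clamp(y + k, h)][x] for k in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def occlude(img: Image, blurred: Image, cy: int, cx: int, patch_sigma: float) -> Image:
    """Blend the blurred image in around (cy, cx); the weight fades out smoothly."""
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, value in enumerate(row):
            a = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * patch_sigma ** 2))
            new_row.append(a * blurred[y][x] + (1 - a) * value)
        out.append(new_row)
    return out


def occlusion_map(img: Image, predict: Callable[[Image], float], stride: int = 4,
                  blur_sigma: float = 2.0, patch_sigma: float = 3.0) -> List[List[float]]:
    """Score the classifier on every patched image and collect the scores in a map."""
    blurred = blur(img, blur_sigma)
    h, w = len(img), len(img[0])
    return [[predict(occlude(img, blurred, cy, cx, patch_sigma))
             for cx in range(0, w, stride)] for cy in range(0, h, stride)]


# Toy usage: a 16x16 image with one bright pixel and a stand-in "classifier"
# whose score depends only on that pixel.
image = [[0.0] * 16 for _ in range(16)]
image[8][8] = 1.0
score_map = occlusion_map(image, predict=lambda im: im[8][8])
```

With this stand-in classifier, the lowest score in `score_map` appears at the patch position covering the bright pixel, which is the kind of localised score change that would then be compared against the ground truth segmentation.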
5.Stacking and Stackability
- estimate the stackability of different objects and prioritise them while stacking
- accurately estimate the optimal placement of blocks on a stack through visual feedback
- counter-balance an unstable structure by placing an additional object on top
Finding: the model does indeed acquire actionable physical knowledge from passive stability prediction
5.1 Stackability
5.2 Stacking Shapes in Simulation
If no stable position is identified for a particular object, it is put aside and disregarded for the rest of the process.
The process is iterated until the placement of an object results in the collapse of the stack or no more objects are available.
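The loop described above might look like the following sketch. The callables `candidate_positions`, `stability_score` and `collapses` are hypothetical stand-ins for the position sampler, the learned stability predictor and the physics simulation. The `threshold` and the choice of the highest-scoring position are assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Obj = TypeVar("Obj")
Pos = TypeVar("Pos")


def stack_objects(
    objects: Iterable[Obj],
    candidate_positions: Callable[[List[Tuple[Obj, Pos]], Obj], Iterable[Pos]],
    stability_score: Callable[[List[Tuple[Obj, Pos]], Obj, Pos], float],
    collapses: Callable[[List[Tuple[Obj, Pos]]], bool],
    threshold: float = 0.5,  # assumed cut-off for calling a placement "stable"
):
    """Greedy stacking: returns (stack, set_aside, collapsed)."""
    stack: List[Tuple[Obj, Pos]] = []
    set_aside: List[Obj] = []
    for obj in objects:
        best_pos, best_score = None, threshold
        for pos in candidate_positions(stack, obj):
            score = stability_score(stack, obj, pos)
            if score > best_score:
                best_pos, best_score = pos, score
        if best_pos is None:
            # No position was predicted stable: put the object aside for good.
            set_aside.append(obj)
            continue
        stack.append((obj, best_pos))
        if collapses(stack):
            # The placement made the stack fall: stop the process.
            return stack, set_aside, True
    return stack, set_aside, False


# Toy usage with made-up scores: spheres are never judged stackable here,
# and centred placements (offset 0) score highest.
def toy_score(stack, obj, pos):
    return 0.0 if obj == "sphere" else 1.0 - 0.4 * abs(pos)


result = stack_objects(
    ["cube", "sphere", "cylinder"],
    candidate_positions=lambda stack, obj: [-1, 0, 1],
    stability_score=toy_score,
    collapses=lambda stack: False,
)
```

In the toy run the sphere is set aside and the other two objects are placed at offset 0. Replacing `collapses` with a real simulation call would end the loop at the first placement that topples the stack.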
The CCS stability predictor clearly outperforms the one trained on cubes only in all three scenarios.
5.3 Balancing Unstable Structures