Distance geometry

Author: [account deactivated]

Last modified 2022-12-18 15:00:58

Category: Reference. Tags: python, programming languages

First published 2022-11-25 09:03:25

Article link: https://blog.csdn.net/qq_66485519/article/details/128031690

Distance geometry is the branch of mathematics concerned with characterizing and studying sets of points based only on given values of the distances between pairs of points.[1][2][3] More abstractly, it is the study of semimetric spaces and the isometric transformations between them. In this view, it can be considered as a subject within general topology.[4]

Historically, the first result in distance geometry is Heron's formula, from the 1st century AD. The modern theory began in the 19th century with work by Arthur Cayley, followed by more extensive developments in the 20th century by Karl Menger and others.

Distance geometry problems arise whenever one needs to infer the shape of a configuration of points (their relative positions) from the distances between them, such as in biology,[4] sensor networks,[5] surveying, navigation, cartography, and physics.

1 Introduction and definitions
2 Cayley–Menger determinants
3 History
4 Menger characterization theorem
5 Characterization via Cayley–Menger determinants
6 Applications
7 See also

1 Introduction and definitions

The concepts of distance geometry will first be explained by describing two particular problems.

1.1 First problem: hyperbolic navigation

Consider three ground radio stations A, B, C, whose locations are known. A radio receiver is at an unknown location. The times it takes for a radio signal to travel from the stations to the receiver, $t_A, t_B, t_C$, are unknown, but the time differences, $t_A - t_B$ and $t_A - t_C$, are known. From them, one knows the distance differences $c(t_A - t_B)$ and $c(t_A - t_C)$, where $c$ is the propagation speed of the signal, from which the position of the receiver can be found.

[Figure: Problem of hyperbolic navigation]
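As a rough illustration of how the receiver position can be recovered from the two distance differences, here is a minimal sketch in the plane using Newton's method. The station coordinates, the starting guess, and the function name `locate` are all assumptions made for this example; the original text does not prescribe a solution method, and a real system would also have to deal with measurement noise and with ambiguous intersections of the two hyperbolas.

```python
from math import dist

def locate(stations, deltas, guess, iterations=20):
    """Newton iteration for a 2-D receiver position (illustrative sketch).

    stations: [A, B, C], known (x, y) positions.
    deltas:   (dA - dB, dA - dC), the known distance differences.
    guess:    starting point; Newton's method needs a reasonable one.
    """
    (ax, ay), (bx, by), (cx, cy) = stations
    x, y = guess
    for _ in range(iterations):
        ra = dist((x, y), (ax, ay))
        rb = dist((x, y), (bx, by))
        rc = dist((x, y), (cx, cy))
        # Residuals of the two hyperbola equations.
        f1 = ra - rb - deltas[0]
        f2 = ra - rc - deltas[1]
        # Jacobian rows: gradients of (ra - rb) and (ra - rc).
        j11 = (x - ax) / ra - (x - bx) / rb
        j12 = (y - ay) / ra - (y - by) / rb
        j21 = (x - ax) / ra - (x - cx) / rc
        j22 = (y - ay) / ra - (y - cy) / rc
        det = j11 * j22 - j12 * j21
        # Solve J * step = -f by Cramer's rule and update.
        x -= (f1 * j22 - f2 * j12) / det
        y -= (j11 * f2 - j21 * f1) / det
    return x, y

# Hypothetical layout: the true position is used only to simulate the data.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
dA, dB, dC = (dist(true_position, s) for s in stations)
estimate = locate(stations, (dA - dB, dA - dC), guess=(5.0, 5.0))
print(estimate)
```

Each known difference confines the receiver to one branch of a hyperbola with two stations as foci, which is where the name "hyperbolic navigation" comes from; the iteration above looks for a point on both branches.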

1.2 Second problem: dimension reduction

In data analysis, one is often given a list of data represented as vectors $\mathbf{v} = (x_1, \ldots, x_n) \in \mathbb{R}^n$, and one needs to find out whether they lie within a low-dimensional affine subspace. A low-dimensional representation of data has many advantages, such as saving storage space and computation time, and giving better insight into the data.
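To make the question concrete, the following sketch computes the dimension of the smallest affine subspace containing a list of points, by taking differences from the first point and row-reducing them with exact rational arithmetic. The name `affine_dimension` and the sample points are illustrative assumptions, not part of the original text. Because the arithmetic is exact, this only suits clean data; noisy measurements would need a tolerance-based method (for example a singular value decomposition) instead.

```python
from fractions import Fraction

def affine_dimension(points):
    """Dimension of the smallest affine subspace containing the points."""
    base = points[0]
    # Difference vectors from the first point, as exact rationals.
    rows = [[Fraction(x) - Fraction(b) for x, b in zip(p, base)]
            for p in points[1:]]
    rank = 0
    for col in range(len(base)):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# Four points of R^3 that all satisfy x + y + z = 1, so they lie in a plane.
planar = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, -1)]
print(affine_dimension(planar))  # 2
```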

1.3 Definitions

Now we formalize some definitions that naturally arise from considering our problems.

1.3.1 Semimetric space

Given a list of points $R = \{P_0, \ldots, P_n\}$, $n \geq 0$, we can arbitrarily specify the distances between pairs of points by a list of $d_{ij} > 0$, $0 \leq i < j \leq n$. This defines a semimetric space: a metric space without the triangle inequality.

Explicitly, we define a semimetric space as a nonempty set $R$ equipped with a semimetric $d : R \times R \to [0, \infty)$ such that, for all $x, y \in R$,

Positivity: $d(x, y) = 0$ if and only if $x = y$.
Symmetry: $d(x, y) = d(y, x)$.

Any metric space is a fortiori a semimetric space. In particular, $\mathbb{R}^k$, the $k$-dimensional Euclidean space, is the canonical metric space in distance geometry.

The triangle inequality is omitted in the definition, because we do not want to enforce more constraints on the distances $d_{ij}$ than the mere requirement that they be positive.

In practice, semimetric spaces naturally arise from inaccurate measurements. For example, given three points $A, B, C$ on a line, with $d_{AB} = 1$, $d_{BC} = 1$, $d_{AC} = 2$, an inaccurate measurement could give $d_{AB} = 0.99$, $d_{BC} = 0.98$, $d_{AC} = 2.00$, violating the triangle inequality, since $0.99 + 0.98 = 1.97 < 2.00$.
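The example can be checked mechanically. The sketch below lists every triple of points whose distances break the triangle inequality; the function name `triangle_violations` and the index order (A, B, C as 0, 1, 2) are assumptions of this illustration.

```python
from itertools import permutations

def triangle_violations(d):
    """Triples (i, j, k) with d[i][k] > d[i][j] + d[j][k], listed once per side."""
    n = len(d)
    return [(i, j, k) for i, j, k in permutations(range(n), 3)
            if i < k and d[i][k] > d[i][j] + d[j][k]]

# Rows and columns are ordered A, B, C.
exact = [[0, 1, 2],
         [1, 0, 1],
         [2, 1, 0]]
measured = [[0.00, 0.99, 2.00],
            [0.99, 0.00, 0.98],
            [2.00, 0.98, 0.00]]

print(triangle_violations(exact))     # []
print(triangle_violations(measured))  # [(0, 1, 2)]
```

The measured distances still satisfy positivity and symmetry, so they define a semimetric space even though no three points on a line (or anywhere else in Euclidean space) realise them.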

1.3.2 Isometric embedding

Given two semimetric spaces, $(R, d)$ and $(R', d')$, an isometric embedding from $R$ to $R'$ is a map $f : R \to R'$ that preserves the semimetric, that is, for all $x, y \in R$, $d(x, y) = d'(f(x), f(y))$.

For example, given the finite semimetric space $(R, d)$ defined above, an isometric embedding into $\mathbb{R}^k$ is defined by points $A_0, A_1, \ldots, A_n \in \mathbb{R}^k$ such that $d(A_i, A_j) = d_{ij}$ for all $0 \leq i < j \leq n$.
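Checking whether a proposed set of points is an isometric embedding is straightforward, even though finding one is the hard part. The sketch below compares every pairwise Euclidean distance with the prescribed $d_{ij}$; the function name, the tolerance parameter `tol`, and the sample data are assumptions made for illustration.

```python
from math import dist, isclose

def is_isometric_embedding(points, d, tol=1e-9):
    """True if |A_i - A_j| matches d[i][j] for every pair i < j."""
    n = len(points)
    return all(isclose(dist(points[i], points[j]), d[i][j], abs_tol=tol)
               for i in range(n) for j in range(i + 1, n))

# A 3-4-5 right triangle placed in the plane.
d = [[0, 3, 4],
     [3, 0, 5],
     [4, 5, 0]]
candidate = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
print(is_isometric_embedding(candidate, d))  # True
```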

1.3.3 Affine independence

Given the points $A_0, A_1, \ldots, A_n \in \mathbb{R}^k$, they are defined to be affinely independent iff they cannot fit inside a single $\ell$-dimensional affine subspace of $\mathbb{R}^k$ for any $\ell < n$, iff the $n$-simplex they span, $v_n$, has positive $n$-volume, that is, $\operatorname{Vol}_n(v_n) > 0$.

In general, when $k \geq n$, they are affinely independent, since a generic $n$-simplex is nondegenerate. For example, 3 points in the plane, in general, are not collinear, because the triangle they span does not degenerate into a line segment. Similarly, 4 points in space, in general, are not coplanar, because the tetrahedron they span does not degenerate into a flat triangle.

When $n > k$, they must be affinely dependent. This can be seen by noting that any $n$-simplex that can fit inside $\mathbb{R}^k$ must be "flat".

2 Cayley–Menger determinants

Cayley–Menger determinants, named after Arthur Cayley and Karl Menger, are determinants of matrices of distances between sets of points.

Let $A_0, A_1, \ldots, A_n$ be $n + 1$ points in a semimetric space. Their Cayley–Menger determinant is defined by

$$\operatorname{CM}(A_0, \cdots, A_n) = \begin{vmatrix} 0 & d_{01}^2 & d_{02}^2 & \cdots & d_{0n}^2 & 1 \\ d_{01}^2 & 0 & d_{12}^2 & \cdots & d_{1n}^2 & 1 \\ d_{02}^2 & d_{12}^2 & 0 & \cdots & d_{2n}^2 & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ d_{0n}^2 & d_{1n}^2 & d_{2n}^2 & \cdots & 0 & 1 \\ 1 & 1 & 1 & \cdots & 1 & 0 \end{vmatrix}$$

If $A_0, A_1, \ldots, A_n \in \mathbb{R}^k$, then they make up the vertices of a possibly degenerate $n$-simplex $v_n$ in $\mathbb{R}^k$. It can be shown that[6] the $n$-dimensional volume of the simplex $v_n$ satisfies

$$\operatorname{Vol}_n(v_n)^2 = \frac{(-1)^{n+1}}{(n!)^2 2^n} \operatorname{CM}(A_0, \ldots, A_n).$$

Note that, for the case of $n = 0$, we have $\operatorname{Vol}_0(v_0) = 1$, meaning the "0-dimensional volume" of a 0-simplex is 1, that is, there is 1 point in a 0-simplex.
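The definition and the volume formula translate directly into a few lines of code. The sketch below takes the matrix of squared distances, borders it with ones as in the definition, and evaluates the determinant exactly. The helper names (`det`, `cayley_menger`, `simplex_volume_squared`) are assumptions of this illustration, and the Laplace expansion used for the determinant is only sensible for the small matrices shown here.

```python
from fractions import Fraction
from math import factorial

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cayley_menger(sq):
    """CM determinant from the (n+1) x (n+1) matrix of squared distances."""
    size = len(sq)
    bordered = [list(row) + [1] for row in sq] + [[1] * size + [0]]
    return det(bordered)

def simplex_volume_squared(sq):
    """Vol_n^2 = (-1)^(n+1) / ((n!)^2 * 2^n) * CM."""
    n = len(sq) - 1
    return Fraction((-1) ** (n + 1), factorial(n) ** 2 * 2 ** n) * cayley_menger(sq)

# 3-4-5 right triangle: area 6, so the squared area is 36.
triangle = [[0, 9, 16],
            [9, 0, 25],
            [16, 25, 0]]
print(cayley_menger(triangle))           # -576
print(simplex_volume_squared(triangle))  # 36
```

For $n = 2$ the formula reduces to $16\,\text{Area}^2 = -\operatorname{CM}$, which is Heron's formula in determinant form; for a single point (`[[0]]`) the function returns 1, matching the remark about the 0-simplex.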

$A_0, A_1, \ldots, A_n$ are affinely independent iff $\operatorname{Vol}_n(v_n) > 0$, that is, $(-1)^{n+1} \operatorname{CM}(A_0, \ldots, A_n) > 0$. Thus Cayley–Menger determinants give a computational way to test affine independence.

If $k < n$, then the points must be affinely dependent, thus $\operatorname{CM}(A_0, \ldots, A_n) = 0$. Cayley's 1841 paper studied the special case of $k = 3, n = 4$, that is, any five points $A_0, \ldots, A_4$ in 3-dimensional space must have $\operatorname{CM}(A_0, \ldots, A_4) = 0$.
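Both statements can be tried out on explicit coordinates. The sketch below builds the squared distances from integer points, so the determinant is exact, and applies the sign test for affine independence. The function names and the chosen points are assumptions of this example; the determinant helper uses Laplace expansion and is meant for small inputs only.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cayley_menger(points):
    """CM determinant of points given by coordinates."""
    sq = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    size = len(points)
    return det([row + [1] for row in sq] + [[1] * size + [0]])

def affinely_independent(points):
    """Sign test: (-1)^(n+1) * CM > 0 for n + 1 points."""
    n = len(points) - 1
    return (-1) ** (n + 1) * cayley_menger(points) > 0

tetrahedron = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
five_points = tetrahedron + [(1, 1, 1)]

print(cayley_menger(tetrahedron), affinely_independent(tetrahedron))  # 8 True
print(cayley_menger(five_points), affinely_independent(five_points))  # 0 False
```

The value 8 agrees with the volume formula: the tetrahedron has volume $1/6$, and $(3!)^2 \cdot 2^3 \cdot (1/6)^2 = 8$. The five points in 3-dimensional space give a vanishing determinant, which is Cayley's special case.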

3 History

The first result in distance geometry is Heron's formula, from the 1st century AD, which gives the area of a triangle from the distances between its 3 vertices. Brahmagupta's formula, from the 7th century AD, generalizes it to cyclic quadrilaterals. Tartaglia, in the 16th century AD, generalized it to give the volume of a tetrahedron from the distances between its 4 vertices.

The modern theory of distance geometry began with Arthur Cayley and Karl Menger.[7] Cayley published the Cayley determinant in 1841,[8] which is a special case of the general Cayley–Menger determinant. Menger proved in 1928 a characterization theorem of all semimetric spaces that are isometrically embeddable in the $n$-dimensional Euclidean space $\mathbb{R}^n$.[9][10] In 1931, Menger used distance relations to give an axiomatic treatment of Euclidean geometry.[11]

Leonard Blumenthal's book[12] gives a general overview of distance geometry at the graduate level, much of which was treated in English for the first time when the book was published.

4 Menger characterization theorem

Menger proved the following characterization theorem of semimetric spaces:[2]

A semimetric space $(R, d)$ is isometrically embeddable in the $n$-dimensional Euclidean space $\mathbb{R}^n$, but not in $\mathbb{R}^m$ for any $0 \leq m < n$, if and only if:

$R$ contains an $(n+1)$-point subset $S$ that is isometric with an affinely independent $(n+1)$-point subset of $\mathbb{R}^n$;
any $(n+3)$-point subset $S'$, obtained by adding any two additional points of $R$ to $S$, is congruent to an $(n+3)$-point subset of $\mathbb{R}^n$.

A proof of this theorem in a slightly weakened form (for metric spaces instead of semimetric spaces) is given in [13].

5 Characterization via Cayley–Menger determinants

The following results are proved in Blumenthal's book.[12]

5.1 Embedding $n+1$ points in $\mathbb {R} ^{n}$

Given a semimetric space $(S, d)$, with $S = \{P_0, \ldots, P_n\}$ and $d(P_i, P_j) = d_{ij} \geq 0$, $0 \leq i < j \leq n$, an isometric embedding of $(S, d)$ into $\mathbb{R}^n$ is defined by $A_0, A_1, \ldots, A_n \in \mathbb{R}^n$ such that $d(A_i, A_j) = d_{ij}$ for all $0 \leq i < j \leq n$.

Again, one asks whether such an isometric embedding exists for $(S, d)$.

A necessary condition is easy to see: for all $k = 1, \ldots, n$, let $v_k$ be the $k$-simplex formed by $A_0, A_1, \ldots, A_k$; then

$$(-1)^{k+1} \operatorname{CM}(P_0, \ldots, P_k) = (-1)^{k+1} \operatorname{CM}(A_0, \ldots, A_k) = 2^k (k!)^2 \operatorname{Vol}_k(v_k)^2 \geq 0.$$

The converse also holds. That is, if for all $k = 1, \ldots, n$,

$$(-1)^{k+1} \operatorname{CM}(P_0, \ldots, P_k) \geq 0,$$

then such an embedding exists.

Further, such an embedding is unique up to isometry in $\mathbb{R}^n$. That is, given any two isometric embeddings defined by $A_0, A_1, \ldots, A_n$ and $A'_0, A'_1, \ldots, A'_n$, there exists a (not necessarily unique) isometry $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T(A_k) = A'_k$ for all $k = 0, \ldots, n$. Such a $T$ is unique if and only if $\operatorname{CM}(P_0, \ldots, P_n) \neq 0$, that is, $A_0, A_1, \ldots, A_n$ are affinely independent.
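A hedged sketch of this test for a finite list of points follows. The text states the sign condition for the prefixes $P_0, \ldots, P_k$; the code applies the same sign test to every subset of the points. That is a deliberate assumption of this sketch: the condition is necessary for every subset anyway, and the prefix test alone can be inconclusive when an intermediate determinant is zero. The names `cm` and `embeddable` are illustrative, and the Laplace-expansion determinant is only meant for small inputs.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cm(sq, idx):
    """CM determinant of the points with indices idx; sq holds squared distances."""
    sub = [[sq[i][j] for j in idx] + [1] for i in idx]
    return det(sub + [[1] * len(idx) + [0]])

def embeddable(sq):
    """True if (-1)^(k+1) * CM >= 0 for every subset of k + 1 >= 2 points."""
    indices = range(len(sq))
    return all((-1) ** len(idx) * cm(sq, idx) >= 0
               for size in range(2, len(sq) + 1)
               for idx in combinations(indices, size))

# Exact distances 1, 1, 2 (three points on a line), squared.
exact = [[0, 1, 4], [1, 0, 1], [4, 1, 0]]

# Measured distances 0.99, 0.98, 2.00, squared with exact rational arithmetic.
a, b, c = (Fraction(s) ** 2 for s in ("0.99", "0.98", "2.00"))
measured = [[0, a, c], [a, 0, b], [c, b, 0]]

print(embeddable(exact))     # True
print(embeddable(measured))  # False
```

The measured distances fail at $k = 2$: the expression $-\operatorname{CM}$ equals 16 times a squared "area" that comes out negative, which is how the determinant detects the triangle inequality violation.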

5.2 Embedding $n + 2$ and $n + 3$ points

If $n + 2$ points $P_0, \ldots, P_{n+1}$ can be embedded in $\mathbb{R}^n$ as $A_0, \ldots, A_{n+1}$, then, in addition to the conditions above, a further necessary condition is that the $(n+1)$-simplex formed by $A_0, A_1, \ldots, A_{n+1}$ must have zero $(n+1)$-dimensional volume. That is, $\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}) = 0$.

The converse also holds. That is, if for all $k = 1, \ldots, n$,

$$(-1)^{k+1} \operatorname{CM}(P_0, \ldots, P_k) \geq 0,$$

and

$$\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}) = 0,$$

then such an embedding exists.

For embedding $n + 3$ points in $\mathbb{R}^n$, the necessary and sufficient conditions are similar:

For all $k = 1, \ldots, n$, $(-1)^{k+1} \operatorname{CM}(P_0, \ldots, P_k) \geq 0$;
$\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}) = 0$;
$\operatorname{CM}(P_0, \ldots, P_n, P_{n+2}) = 0$;
$\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}, P_{n+2}) = 0$.

5.3 Embedding arbitrarily many points

The $n + 3$ case turns out to be sufficient in general.

In general, given a semimetric space $(R, d)$, it can be isometrically embedded in $\mathbb{R}^n$ if and only if there exist $P_0, \ldots, P_n \in R$ such that, for all $k = 1, \ldots, n$, $(-1)^{k+1} \operatorname{CM}(P_0, \ldots, P_k) \geq 0$, and for any $P_{n+1}, P_{n+2} \in R$,

$\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}) = 0$;
$\operatorname{CM}(P_0, \ldots, P_n, P_{n+2}) = 0$;
$\operatorname{CM}(P_0, \ldots, P_n, P_{n+1}, P_{n+2}) = 0$.

And such an embedding is unique up to isometry in $\mathbb{R}^n$.

Further, if $\operatorname{CM}(P_0, \ldots, P_n) \neq 0$, then it cannot be isometrically embedded in any $\mathbb{R}^m$, $m < n$. And such an embedding is unique up to unique isometry in $\mathbb{R}^n$.

Thus, Cayley–Menger determinants give a concrete way to calculate whether a semimetric space can be embedded in $\mathbb{R}^n$ for some finite $n$, and if so, what the minimal $n$ is.
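For a finite semimetric space, the calculation can be sketched as a search over $n$. The code below looks for $n + 1$ base points whose sign conditions hold strictly (the nondegenerate case $\operatorname{CM}(P_0, \ldots, P_n) \neq 0$ from the text), then checks that every determinant obtained by adding one or two further points vanishes. Restricting to strict inequalities, the brute-force search, and the names `cm` and `minimal_embedding_dimension` are assumptions of this sketch; it is meant for a handful of points with exact (integer or rational) squared distances.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cm(sq, idx):
    """CM determinant of the points with indices idx; sq holds squared distances."""
    sub = [[sq[i][j] for j in idx] + [1] for i in idx]
    return det(sub + [[1] * len(idx) + [0]])

def minimal_embedding_dimension(sq):
    """Smallest n such that the space embeds in R^n, or None if there is none."""
    m = len(sq)
    for n in range(m):
        for base in combinations(range(m), n + 1):
            # Base points must span a nondegenerate n-simplex.
            if not all((-1) ** (k + 1) * cm(sq, base[:k + 1]) > 0
                       for k in range(1, n + 1)):
                continue
            rest = [p for p in range(m) if p not in base]
            # Adding any one or two further points must give CM = 0.
            if (all(cm(sq, base + (p,)) == 0 for p in rest)
                    and all(cm(sq, base + pq) == 0 for pq in combinations(rest, 2))):
                return n
    return None

# Squared distances of the unit square's corners, in cyclic order.
square = [[0, 1, 2, 1],
          [1, 0, 1, 2],
          [2, 1, 0, 1],
          [1, 2, 1, 0]]
print(minimal_embedding_dimension(square))  # 2

# Distances 0.99, 0.98, 2.00 admit no Euclidean embedding at all.
a, b, c = (Fraction(s) ** 2 for s in ("0.99", "0.98", "2.00"))
print(minimal_embedding_dimension([[0, a, c], [a, 0, b], [c, b, 0]]))  # None
```

With floating-point distances the exact `== 0` tests would have to be replaced by a tolerance, which is one reason noisy data is usually handled with techniques such as multidimensional scaling rather than exact determinant tests.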

6 Applications

There are many applications of distance geometry.[3]

In telecommunication networks such as GPS, the positions of some sensors are known (which are called anchors) and some of the distances between sensors are also known: the problem is to identify the positions for all sensors.[5] Hyperbolic navigation is one pre-GPS technology that uses distance geometry for locating ships based on the time it takes for signals to reach anchors.

There are many applications in chemistry.[4][12] Techniques such as NMR can measure distances between pairs of atoms of a given molecule, and the problem is to infer the 3-dimensional shape of the molecule from those distances.

Some software packages for applications are:

DGSOL. Solves large distance geometry problems in macromolecular modeling.
Xplor-NIH. Based on X-PLOR, to determine the structure of molecules based on data from NMR experiments. It solves distance geometry problems with heuristic methods (such as simulated annealing) and local search methods (such as conjugate gradient minimization).
TINKER. Molecular modeling and design. It can solve distance geometry problems.
SNLSDPclique. MATLAB code for locating sensors in a sensor network based on the distances between the sensors.

7 See also

Euclidean distance matrix
Multidimensional scaling (a statistical technique used when distances are measured with random errors)
Metric space
Tartaglia’s formula
Triangulation
Trilateration