A kd-tree is a data structure used for indexing multidimensional data:
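The k-d tree is an extension of the binary search tree in which each level splits the space in two. Nodes at the top level partition along one dimension, nodes at the next
level partition along another dimension, and so on, cycling through the dimensions. The partition is chosen so that, at each node, roughly half of the points stored
in its subtree fall on one side and the other half fall on the other side. Partitioning stops once a node holds fewer points than a given maximum.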
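Each level uses a single dimension to decide where a point goes. For example, with 10-dimensional data, the first level orders points by their first coordinate: smaller values go to the left subtree and larger values go to the right subtree. The second level then compares the second coordinate, and so on.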
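The construction described above can be sketched roughly as follows. This is only a minimal illustration, not a reference implementation: the names `Node`, `build` and `leaf_size` are made up for this example, and splitting at the median of a sorted list is one common choice among several.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    # Inner nodes carry a splitting axis and value; leaves carry the points.
    axis: Optional[int] = None
    split: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    points: Optional[List[Point]] = None


def build(points: Sequence[Point], leaf_size: int = 2, depth: int = 0) -> Node:
    """Recursively split `points`, cycling through the dimensions by depth."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    # Stop splitting once the node is small enough (assumed stopping rule).
    if len(points) <= leaf_size:
        return Node(points=list(points))
    axis = depth % len(points[0])              # dimension used at this level
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                    # median index: about half on each side
    return Node(
        axis=axis,
        split=ordered[mid][axis],
        left=build(ordered[:mid], leaf_size, depth + 1),    # values <= split
        right=build(ordered[mid:], leaf_size, depth + 1),   # values >= split
    )


pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
root = build(pts, leaf_size=1)
print(root.axis, root.split)            # the root splits on the first coordinate
print(root.left.axis, root.left.split)  # the next level splits on the second
```

Sorting at every level keeps the sketch short; a linear-time median selection would avoid the repeated sort.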
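So how does this structure retrieve the nearest neighbor? For randomly distributed points in low dimensions, a query takes O(log n) time on average, although the worst case is O(n).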
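For a kd-tree containing n known points, the complexities are as follows:
- Build: O(n log^2 n) when the points are re-sorted at every level (O(n log n) with a linear-time median selection)
- Insert: O(log n) on a balanced tree
- Delete: O(log n) on a balanced tree
- Query: O(n^(1-1/k) + m) in the worst case for an axis-parallel range search, where m is the number of points reported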
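To make the nearest-neighbor query mentioned above more concrete, here is a small sketch. It assumes a simplified tree in which every node stores one point (built by median split), and the helper names `build` and `nearest` are invented for this example. The key idea is the pruning step: the far side of a split is visited only if the splitting plane is closer to the target than the best match found so far.

```python
import math


def build(points, depth=0):
    """Median-split kd-tree; each node is a dict holding one point."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build(ordered[:mid], depth + 1),
        "right": build(ordered[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return (point, distance) of the stored point closest to `target`."""
    if node is None:
        return best
    d = math.dist(node["point"], target)
    if best is None or d < best[1]:
        best = (node["point"], d)
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    # Descend first into the side of the split that contains the target.
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, target, best)
    # The far side can only help if the splitting plane is closer than `best`.
    if abs(diff) < best[1]:
        best = nearest(far, target, best)
    return best


tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))  # the closest stored point is (8, 1)
```

When the pruning test fails often, for instance in high dimensions, the search visits most of the tree and behaves like a linear scan.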
In computer science, a kd-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). kd-trees are a special case of BSP (binary space partitioning) trees.
Informal description
The kd-tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as subspaces. Points to the left of this hyperplane are represented by the left subtree of that node, and points to the right of the hyperplane are represented by the right subtree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is chosen, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with a larger "x" value will be in the right subtree. In such a case, the hyperplane would be set by the x-value of the point, and its normal would be the unit x-axis.
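As a rough illustration of this description, the sketch below stores one point per node and inserts points one at a time, with the splitting axis chosen by depth. The names `KDNode`, `insert` and `contains` are hypothetical, and sending points with an equal coordinate to the right subtree is an assumption made here, since the text only covers strictly smaller and larger values.

```python
class KDNode:
    """One k-dimensional point plus the axis its hyperplane is perpendicular to."""

    def __init__(self, point, axis):
        self.point = point
        self.axis = axis
        self.left = None
        self.right = None


def insert(root, point, depth=0):
    """Insert `point`, cycling the splitting axis with depth; returns the root."""
    if root is None:
        return KDNode(point, depth % len(point))
    if point[root.axis] < root.point[root.axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        # Assumption: ties on the splitting coordinate go to the right.
        root.right = insert(root.right, point, depth + 1)
    return root


def contains(root, point):
    """Follow the same comparisons as insert to look for an exact match."""
    node = root
    while node is not None:
        if node.point == point:
            return True
        if point[node.axis] < node.point[node.axis]:
            node = node.left
        else:
            node = node.right
    return False


root = None
for p in [(7, 2), (5, 4), (9, 6), (2, 3), (4, 7), (8, 1)]:
    root = insert(root, p)

print(root.left.point, root.right.point)  # (5, 4) (9, 6): split on x at the root
print(contains(root, (4, 7)), contains(root, (1, 1)))  # True False
```

Inserting in arbitrary order gives no balance guarantee, so the O(log n) insert cost only holds while the tree stays reasonably balanced.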