Chapter 11: Searching

Xvpi

Published 2024-09-30 15:26:32


Tags: algorithms, java, data structures

Article link: https://blog.csdn.net/m0_72303194/article/details/135366654


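Preface

Average search length (ASL): ASL = Σ(p_i × c_i), where p_i is the probability of searching for record i and c_i is the number of comparisons needed to find it.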

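I. Static Search

(1) Sequential search

1. Finding the minimum and maximum

Naive search: 2(n-1) comparisons, since each of the remaining n-1 elements is compared against both min and max.

Processing the data two elements at a time finds min and max together in about 3n/2 comparisons: compare the two elements of a pair with each other first, then compare the smaller one with min and the larger one with max.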
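The pairwise idea can be sketched as follows. This is only an illustration under stated assumptions, not the author's code: the class name `MinMaxSketch` and the `comparisons` counter are made up for the example, and the input array is assumed to be non-empty.

```java
final class MinMaxSketch {
    static long comparisons = 0; // counts element comparisons, for illustration only

    // Returns {min, max} of a non-empty array using about 3n/2 comparisons.
    static int[] minMax(int[] a) {
        comparisons = 0;
        int n = a.length, min, max, i;
        if (n % 2 == 1) {            // odd length: seed min and max with the first element
            min = max = a[0];
            i = 1;
        } else {                     // even length: seed with the first pair (1 comparison)
            comparisons++;
            if (a[0] < a[1]) { min = a[0]; max = a[1]; }
            else             { min = a[1]; max = a[0]; }
            i = 2;
        }
        for (; i + 1 < n; i += 2) {  // 3 comparisons for every remaining pair
            int lo = a[i], hi = a[i + 1];
            comparisons++;
            if (lo > hi) { int t = lo; lo = hi; hi = t; }
            comparisons++;
            if (lo < min) min = lo;
            comparisons++;
            if (hi > max) max = hi;
        }
        return new int[] {min, max};
    }
}
```

For an array of 8 elements this sketch performs 10 comparisons (3n/2 - 2), compared with 14 for the naive 2(n-1) scan.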

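2. Finding primes

Sieve of Eratosthenes: O(n log log n)

Analysis:

Sieve of Eratosthenes: a number with k distinct prime factors is crossed out k times.

Restricting how composites are generated: to avoid crossing out the same number more than once, the sieve must not simply cross out every multiple. It crosses out only composites formed from the primes found so far, so that each composite is removed exactly once, by its smallest prime factor.

Linear sieve (Euler's sieve): O(n)

The Sieve of Eratosthenes tends to be faster when n is small; the linear sieve tends to be faster when n is large.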

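// Sieve of Eratosthenes (N is an upper bound larger than n, assumed to be defined elsewhere)
int primes[N], cnt;     // primes[] stores all primes found so far
bool st[N];         // st[x] records whether x has been crossed out

void get_primes(int n)
{
    for (int i = 2; i <= n; i ++ )
    {
        if (st[i]) continue; // already crossed out: skip
        primes[cnt ++ ] = i;
        for (int j = i + i; j <= n; j += i)
            st[j] = true; // cross out every multiple of i
    }
}
// Euler's (linear) sieve: an alternative get_primes over the same arrays; use one or the other
void get_primes(int n)
{
    for (int i = 2; i <= n; i ++ )
    {
        if (!st[i]) primes[cnt ++ ] = i; // not crossed out: i is prime
        for (int j = 0; primes[j] <= n / i; j ++ ) // loop while i * primes[j] stays within n
        {
            st[primes[j] * i] = true; // cross out i * primes[j]
            if (i % primes[j] == 0) break;
            // primes[j] divides i, so it is the smallest prime factor of i; stop here
        }
    }
}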

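(2) Binary search (half-interval search)

1. Binary search

(Find the rightmost value less than or equal to val.)

When a[l] <= val, the position right after ans is also the leftmost value greater than val.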

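int find(int l, int r, int val) {
    int mid, ans = l; // start from a valid index; otherwise reading a[ans] after a failed search may cause a runtime error
    while(l <= r) {
        mid = l + r >> 1; // same as (l + r) / 2: + binds tighter than >>, so no parentheses are needed
        if(a[mid] <= val) ans = mid, l = mid + 1;
        else r = mid - 1;
    }
    if(a[ans] == val) return ans; // val may not be present at all
    return -1;
}
// Written this way, the position returned is the rightmost occurrence of val in the range.
// Whether or not val is found, ans + 1 is the first position greater than val
// (provided a[l] <= val; otherwise that position is l itself).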

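To find the leftmost value >= val instead: the position just before ans is also the rightmost value < val.

if(a[mid] >= val) ans = mid, r = mid - 1; // prefer shrinking the range toward the left

else l = mid + 1;

A search compares val with at most ⌊log2 n⌋ + 1 elements.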
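The "leftmost value >= val" variant can be written out as a complete function. A minimal sketch, assuming a sorted int array and an inclusive range [l, r]; the name `lowerBound` and the choice to return r + 1 when no element qualifies are assumptions made for this example.

```java
final class LowerBoundSketch {
    // Returns the index of the leftmost element >= val in sorted a[l..r],
    // or r + 1 if every element in the range is smaller than val.
    static int lowerBound(int[] a, int l, int r, int val) {
        int ans = r + 1;                       // "not found" sentinel, one past the range
        while (l <= r) {
            int mid = l + (r - l) / 2;         // avoids overflow of l + r
            if (a[mid] >= val) { ans = mid; r = mid - 1; } // keep looking further left
            else l = mid + 1;
        }
        return ans;
    }
}
```

With this convention, `lowerBound(...) - 1` is the rightmost position holding a value < val, which matches the note above.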

2. Range query ([a,b))

For the half-open range [a,b): low is the smallest value >= a, and high is the largest value < b.
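One common use of these two boundaries is counting how many elements of a sorted array fall in [a,b). The sketch below is an assumption about how the range query might be used, not something the notes spell out; `countInRange` and `lowerBound` are hypothetical names.

```java
final class RangeCountSketch {
    // Index of the first element >= val in the sorted array s (s.length if none).
    static int lowerBound(int[] s, int val) {
        int l = 0, r = s.length - 1, ans = s.length;
        while (l <= r) {
            int mid = l + (r - l) / 2;
            if (s[mid] >= val) { ans = mid; r = mid - 1; }
            else l = mid + 1;
        }
        return ans;
    }

    // Number of elements x with a <= x < b.
    // low  = lowerBound(s, a) is the position of the smallest value >= a;
    // high = lowerBound(s, b) - 1 is the position of the largest value < b.
    static int countInRange(int[] s, int a, int b) {
        return Math.max(0, lowerBound(s, b) - lowerBound(s, a));
    }
}
```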

3. Fast exponentiation
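The notes give only the heading here. Assuming it refers to the usual exponentiation by squaring, which halves the exponent at every step much as binary search halves the range, a minimal sketch looks like this. The name `powMod` and the modular form are choices made for the example.

```java
final class FastPowSketch {
    // Computes (base^exp) mod m in O(log exp) multiplications.
    // Assumes exp >= 0 and 0 < mod < 2^31, so every product fits in a long.
    static long powMod(long base, long exp, long mod) {
        long result = 1 % mod;
        base %= mod;
        if (base < 0) base += mod;          // normalise a negative base
        while (exp > 0) {
            if ((exp & 1) == 1) result = result * base % mod; // current bit of exp is set
            base = base * base % mod;       // square for the next bit
            exp >>= 1;                      // halve the exponent
        }
        return result;
    }
}
```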

4. Binary search on the answer (find the first position greater than val)

bool check(int x, int val) { // in a real problem the arguments need not be int
    return x > val;
}
int find(int l, int r, int val) { // likewise, the return type need not be int
    int mid, ans = -1; // search the range [l, r] passed in; -1 means no position qualifies
    while(l <= r) {
        mid = l + r >> 1;
        if(check(a[mid], val)) ans = mid, r = mid - 1;
        else l = mid + 1;
    }
    return ans;
}

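Application: see the puzzle problem on p. 65.

5. Quickselect (finding the element of rank k), O(n) on average

Quicksort's partition step combined with the halving idea of binary search: after each partition, continue only in the side that contains rank k.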
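A minimal quickselect sketch of this idea follows. The random pivot and the Lomuto partition are choices made for the example, not something the notes specify; the expected running time is O(n), but the worst case remains O(n^2).

```java
import java.util.concurrent.ThreadLocalRandom;

final class QuickSelectSketch {
    // Returns the k-th smallest element (k is 1-based). Reorders a in place.
    static int select(int[] a, int k) {
        if (k < 1 || k > a.length) throw new IllegalArgumentException("k out of range");
        int lo = 0, hi = a.length - 1, target = k - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == target) return a[p];
            if (p < target) lo = p + 1;   // rank k lies to the right of the pivot
            else hi = p - 1;              // rank k lies to the left of the pivot
        }
        return a[lo];
    }

    // Lomuto partition around a randomly chosen pivot; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int r = ThreadLocalRandom.current().nextInt(lo, hi + 1);
        swap(a, r, hi);
        int pivot = a[hi], store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Unlike quicksort, only one side of each partition is processed further, which is where the expected linear bound comes from.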

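(3) Index search

(4) Exercises

An unsuccessful search takes one extra step, hence n+1.

!! Build the decision tree with equal probabilities. On failure the search goes one step further, so the denominator is 14; the fourth level has 6 nodes, and two nodes on the third level each have a single empty child.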

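II. Binary Search Tree (BST)

An inorder traversal of a BST visits the keys in sorted order, so a BST can be reconstructed from its preorder, postorder, or level-order sequence alone.

1. Search: O(d), where d is the height of the tree

2. Insertion (performed when the search fails)

The shape of a BST depends on the order in which the elements are inserted.

3. Deletion

First search the tree for node. If it is found:

1. If node has no children (it is a leaf), delete it directly and set the corresponding pointer in its parent to null.

2. If node has exactly one child, link that child to node's parent: the parent's corresponding pointer now points to the child.

3. If node has two children, find its successor (the leftmost descendant of node's right child), replace node's key with the successor's key, and then delete the successor (which always falls under case 1 or 2).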

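III. AVL Tree

Every node stores a key.
For every node, its left and right subtrees are themselves AVL trees.
For every node, the keys in its left subtree are less than or equal to the node's key, and the keys in its right subtree are greater than or equal to it.
Every node has a balance factor: the height of its left subtree minus the height of its right subtree.
In an AVL tree every balance factor must be -1, 0, or 1. If some node's balance factor falls outside this range, the tree is no longer an AVL tree and must be rebalanced.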

class AVLNode {
    int key;            // key stored in the node
    int height;         // height of the node (a leaf has height 1)
    AVLNode left;       // left child
    AVLNode right;      // right child

    public AVLNode(int key) {
        this.key = key;
        this.height = 1;
        this.left = null;
        this.right = null;
    }
}
// AVLTree represents an AVL tree
public class AVLTree {
    AVLNode root;       // root of the tree

    // Get the height of a node (0 for an empty subtree)
    private int getHeight(AVLNode node) {
        if (node == null)
            return 0;
        return node.height;
    }

    // Get the balance factor of a node
    private int getBalanceFactor(AVLNode node) {
        if (node == null)
            return 0;
        return getHeight(node.left) - getHeight(node.right);
    }

    // Recompute the height of a node from its children
    private void updateHeight(AVLNode node) {
        int leftHeight = getHeight(node.left);
        int rightHeight = getHeight(node.right);
        node.height = Math.max(leftHeight, rightHeight) + 1;
    }

    // Perform a left rotation
    private AVLNode leftRotate(AVLNode node) {
        AVLNode newRoot = node.right;
        AVLNode subtree = newRoot.left;

        newRoot.left = node;
        node.right = subtree;

        // node is now the child, so update it first
        updateHeight(node);
        updateHeight(newRoot);

        return newRoot;
    }

    // Perform a right rotation
    private AVLNode rightRotate(AVLNode node) {
        AVLNode newRoot = node.left;
        AVLNode subtree = newRoot.right;

        newRoot.right = node;
        node.left = subtree;

        // node is now the child, so update it first
        updateHeight(node);
        updateHeight(newRoot);

        return newRoot;
    }
}

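Right rotation: the left child of the old root node becomes the new root newRoot, and newRoot's right subtree (subtree) becomes node's left child. Then update the heights of node and newRoot.

The depth of A decreases by 1, and the depths of B and C increase by 1.

A's height can only grow, B's height can only shrink, and C's height never changes.

Left rotation: node's right child becomes newRoot, and newRoot's left subtree (subtree) becomes node's right child. Then update the heights.

Inserting a node

Imbalance types:

A new node is compared level by level from the top down and always ends up as a leaf. To keep the tree balanced, the balance factors are then checked one by one from the bottom up. This is done recursively.

RL type: the upper level is right-heavy and the lower level is left-heavy. Fix it bottom-up: first rotate right at the lower level, then rotate left at the upper level.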

 // Insert a node into the AVL tree
    public void insert(int key) {
        root = insertNode(root, key);
    }

    private AVLNode insertNode(AVLNode node, int key) {
        if (node == null)
            return new AVLNode(key); 
        if (key < node.key) {
            node.left = insertNode(node.left, key);
        } else if (key > node.key) {
            node.right = insertNode(node.right, key);
        } else {
            return node;
        } // duplicate keys are ignored

        updateHeight(node);

        int balanceFactor = getBalanceFactor(node);

        // Left-left case: rotate right
        if (balanceFactor > 1 && key < node.left.key)
            return rightRotate(node);

        // Right-right case: rotate left
        if (balanceFactor < -1 && key > node.right.key)
            return leftRotate(node);

        // Left-right case: rotate the left subtree left, then rotate this node right
        if (balanceFactor > 1 && key > node.left.key) {
        // key > node.left.key means the left child's right subtree is taller than its left subtree.
        // It can also be written as getHeight(node.left.left) < getHeight(node.left.right)
            node.left = leftRotate(node.left);
            return rightRotate(node);
        }

        // Right-left case: rotate the right subtree right, then rotate this node left
        if (balanceFactor < -1 && key < node.right.key) {
            node.right = rightRotate(node.right);
            return leftRotate(node);
        }

        return node;
    }

Deleting a node (combines the insertion approach with BST deletion)

    // Delete a node
    public void delete(int key) {
        root = deleteNode(root, key);
    }

    private AVLNode deleteNode(AVLNode node, int key) {
        if (node == null)
            return null;

        if (key < node.key)
            node.left = deleteNode(node.left, key);
        else if (key > node.key)
            node.right = deleteNode(node.right, key);
        else {
            // Found the node to delete

            if (node.left == null && node.right == null) {
                // Leaf: delete it directly
                node = null;
            } else if (node.left == null) {
                // Only a right subtree: replace the node with it
                node = node.right;
            } else if (node.right == null) {
                // Only a left subtree: replace the node with it
                node = node.left;
            } else {
                // Both subtrees exist: take the smallest node of the right subtree (the successor)
                AVLNode minNode = findMinNode(node.right);
                node.key = minNode.key;
                node.right = deleteNode(node.right, minNode.key);
            }
        }

        if (node == null)
            return null;

        updateHeight(node);

        int balanceFactor = getBalanceFactor(node);

        // Left-left case: rotate right
        if (balanceFactor > 1 && getBalanceFactor(node.left) >= 0)
            return rightRotate(node);

        // Left-right case: rotate the left subtree left, then rotate this node right
        if (balanceFactor > 1 && getBalanceFactor(node.left) < 0) {
            node.left = leftRotate(node.left);
            return rightRotate(node);
        }

        // Right-right case: rotate left
        if (balanceFactor < -1 && getBalanceFactor(node.right) <= 0)
            return leftRotate(node);

        // Right-left case: rotate the right subtree right, then rotate this node left
        if (balanceFactor < -1 && getBalanceFactor(node.right) > 0) {
            node.right = rightRotate(node.right);
            return leftRotate(node);
        }

        return node;
    }
    // Leftmost (smallest) node of a non-empty subtree
    private AVLNode findMinNode(AVLNode node) {
        AVLNode current = node;
        while (current.left != null) {
            current = current.left;
        }
        return current;
    }

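Pseudocode (note: the inner condition it uses to decide the rotation type is wrong. For example, for LR the outer condition is getHeight(node.left) > getHeight(node.right)

and the inner condition should be getHeight(node.left.left) < getHeight(node.left.right).)

Upper bound on tree height

To use as few nodes as possible for a given height, the tree must be as deep as possible, so the left and right subtrees of every node differ in height by exactly 1: one has height d-1 and the other has height d-2. This gives a recursive construction.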
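The recursion can be made concrete. Writing N(d) for the minimum number of nodes in an AVL tree of height d, and using the same height convention as the code above (an empty tree has height 0, a leaf has height 1), we get N(0) = 0, N(1) = 1, and N(d) = N(d-1) + N(d-2) + 1. The sketch below evaluates it iteratively; the names `minNodes` and `maxHeight` are made up for this example.

```java
final class AvlMinNodesSketch {
    // Minimum number of nodes in an AVL tree of height d:
    // N(0) = 0, N(1) = 1, N(d) = N(d-1) + N(d-2) + 1.
    static long minNodes(int d) {
        if (d < 0) throw new IllegalArgumentException("height must be >= 0");
        if (d == 0) return 0;
        long prev = 0, cur = 1;                 // N(0), N(1)
        for (int h = 2; h <= d; h++) {
            long next = cur + prev + 1;         // root + sparsest subtrees of heights h-1 and h-2
            prev = cur;
            cur = next;
        }
        return cur;
    }

    // Largest height an AVL tree with n nodes can have:
    // the largest d such that minNodes(d) <= n.
    static int maxHeight(long n) {
        int d = 0;
        while (minNodes(d + 1) <= n) d++;
        return d;
    }
}
```

The first values are N(1..5) = 1, 2, 4, 7, 12, so an AVL tree with fewer than 12 nodes can have height at most 4.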

Applications of balanced trees:

AVL trees, red-black trees, B-trees and B+ trees.

Exercises

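IV. Hash Search

Hash function: in general, we establish a functional relationship between a key and the storage position of its record in the table, and use H(key) as the position of the record whose key is key. This function H(key) is called the hash function.

Hash table: given a hash function H(key) and a collision-handling method, a set of keys is mapped into a contiguous address space, and the image of each key in that address space is used as the storage position of the corresponding record. A lookup table built this way is called a hash table.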

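1. Constructing hash functions

1. Direct addressing

        Use the key itself, or a linear function of the key, as the hash address.

2. Digit analysis

  Suppose there are n keys of d digits each, built from r different symbols, and the frequency of these r symbols is not necessarily the same at every digit position:
        ➢ at some positions the symbols are evenly distributed, i.e. each symbol appears close to n/r times;

        ➢ at other positions the distribution is uneven.

        Then choose s evenly distributed positions as the hash address, i.e. H(key) = the s digits of key at evenly distributed positions.

3. Mid-square method

        Use the middle few digits of the squared key as the hash address.

  Squaring the key serves to "amplify differences" and "balance contributions": every digit of the key contributes to the middle digits of the square.

4. Folding

        When keys are long, split the key into several parts with the same number of digits (the last part may have a different number of digits).

The sum of these parts (discarding any carry out of the highest digit) is the hash address. The number of digits per part is determined by the number of digits in a storage address.

        ➢ Shift folding: align the last digit of every part, then add;

        ➢ Boundary folding: treat the key as a strip of paper and fold it back and forth along the part boundaries from one end to the other, then align and add (an S-shaped arrangement).

5. Division (remainder) method

        Use the remainder of the key divided by some number p no larger than the table length m as the hash address.

         H(key) = key MOD p (p ≤ m)        • p is usually the largest prime not exceeding the table length

6. Random number method

        Choose a random function and use its value at the key as the hash address.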
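Two of these constructions, the division method and shift folding, can be sketched as below. This is an illustration under assumptions: all names are hypothetical, keys are non-negative integers written in decimal, and the key is cut into parts starting from the low-order end, so it is the highest-order part that may be shorter.

```java
final class HashFunctionSketch {
    static boolean isPrime(int x) {
        if (x < 2) return false;
        for (int d = 2; (long) d * d <= x; d++) {
            if (x % d == 0) return false;
        }
        return true;
    }

    // Largest prime p with p <= m, a common choice of modulus for the division method.
    static int largestPrimeAtMost(int m) {
        for (int p = m; p >= 2; p--) {
            if (isPrime(p)) return p;
        }
        throw new IllegalArgumentException("m must be >= 2");
    }

    // Division method: H(key) = key MOD p.
    static int divisionHash(int key, int p) {
        return Math.floorMod(key, p);
    }

    // Shift folding: cut the key into width-digit parts (from the low-order end),
    // add the parts with their last digits aligned, and drop the carry
    // beyond width digits.
    static long shiftFoldHash(long key, int width) {
        long base = 1;
        for (int i = 0; i < width; i++) base *= 10;
        long sum = 0;
        while (key > 0) {
            sum += key % base;   // next width-digit part
            key /= base;
        }
        return sum % base;       // discard the high-order carry
    }
}
```

For example, with table length 15 the modulus becomes 13, and folding the key 442205864 in 4-digit parts adds 5864 + 4220 + 4 = 10088 and keeps 0088.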

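2. Collision handling

Open addressing

Rehashing

Separate chaining

Common overflow area

(1) Open addressing

The i-th probe address is H_i = (H(key) + d_i) MOD m. There are three ways to choose the increment d_i:

1) Linear probing

d_i = c × i

Usually c = 1.

2) Quadratic probing

d_i = 1^2, -(1^2), 2^2, -(2^2), …, or d_i = 1^2, 2^2, 3^2, …

3) Random probing

d_i is a pseudo-random sequence.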
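Assuming the usual open-addressing rule H_i = (H(key) + d_i) MOD m, where m is the table length, the linear and quadratic increments can be sketched as follows. The function names are hypothetical; probe number 0 denotes the home address itself.

```java
final class ProbeSketch {
    // Linear probing: d_i = c * i.
    static int linear(int home, int i, int c, int m) {
        return Math.floorMod(home + c * i, m);
    }

    // Quadratic probing with alternating signs:
    // d_1 = +1^2, d_2 = -(1^2), d_3 = +2^2, d_4 = -(2^2), ...
    static int quadratic(int home, int i, int m) {
        int k = (i + 1) / 2;                        // 0, 1, 1, 2, 2, 3, ...
        int d = (i % 2 == 1) ? k * k : -(k * k);    // odd probes go forward, even probes go back
        return Math.floorMod(home + d, m);          // floorMod keeps the address non-negative
    }
}
```

With home address 5 and m = 11, the quadratic sequence visits 5, 6, 4, 9, 1, and so on.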

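(2) Rehashing

(3) Separate chaining

All records with the same hash address are linked into the same list.

For a successful search, the denominator is the total number of keys inserted; for an unsuccessful search, the denominator is the size of the hash address space. ASL_fail = 16/7

(4) Common overflow area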

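3. Searching a hash table

1) Given a key K, compute the hash address j with the hash function used to build the table.

2) If there is no record at that position, the search fails.

3) If there is a record, compare keys.

4) If the stored key equals the given key, the search succeeds.

5) Otherwise compute the "next address" using the collision-handling method chosen when the table was built, and repeat from 2).

Revisited positions may need to be detected, to avoid probing forever when the table is full or comparing the same key repeatedly!

If the hash table always keeps at least one empty slot, this check can be skipped (with linear probing?).

// The 11 outcomes are equally likely; the numbers of steps needed are 10 down to 2, plus two 1s, so ASL_fail = 56/11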
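The search steps can be sketched for the linear-probing case as follows. This is a minimal illustration, not a full hash table: the class name, the use of `null` for an empty slot, and the cap of m probes (one simple way to avoid looping forever on a full table) are all assumptions of the example, and deletion is not handled.

```java
final class LinearProbeTableSketch {
    private final Integer[] slots;   // null marks an empty slot
    private final int p;             // modulus of H(key) = key MOD p

    LinearProbeTableSketch(int m, int p) {
        this.slots = new Integer[m];
        this.p = p;
    }

    // Inserts key; returns false only if the table is full.
    boolean insert(int key) {
        int m = slots.length, home = Math.floorMod(key, p);
        for (int i = 0; i < m; i++) {
            int j = (home + i) % m;
            if (slots[j] == null) { slots[j] = key; return true; }
            if (slots[j] == key) return true;      // already present
        }
        return false;
    }

    // Returns the slot index holding key, or -1 if the search fails.
    int search(int key) {
        int m = slots.length, home = Math.floorMod(key, p);   // step 1: hash address
        for (int i = 0; i < m; i++) {              // at most m probes, so a full table cannot loop forever
            int j = (home + i) % m;                // step 5: next address, d_i = i
            if (slots[j] == null) return -1;       // step 2: empty slot, search fails
            if (slots[j] == key) return j;         // steps 3 and 4: keys match
        }
        return -1;
    }
}
```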

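Exercises

For hash tables built with linear probing and with separate chaining, how is the average search length of an unsuccessful search computed?

Example: given the keys (19,14,23,1,68,20,84,27,55,11,10,79),

the hash function H(key) = key MOD 13, and hash table length m = 15,

and assuming every record is equally likely to be searched for, what is the average search length of an unsuccessful search under each of the two collision-handling methods?

PS: the denominator of the unsuccessful-search average is the number after MOD, i.e. 13 (the number of possible hash addresses), not the table length.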
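The exercise can be checked mechanically. The sketch below is an illustration with made-up names. It assumes that, for linear probing, the probe that hits the empty slot counts as one comparison, and that, for chaining, only comparisons against stored keys are counted. Textbooks differ on the chaining convention: counting the final null pointer as well adds 1 per address.

```java
final class AslFailureSketch {
    static final int[] KEYS = {19, 14, 23, 1, 68, 20, 84, 27, 55, 11, 10, 79};

    // Sum, over every home address 0..p-1, of the probes an unsuccessful search
    // makes under linear probing (including the probe that finds the empty slot).
    // Assumes the table of length m is never completely full.
    static int linearProbingFailureTotal(int[] keys, int p, int m) {
        boolean[] used = new boolean[m];
        for (int key : keys) {                  // build the table
            int j = key % p;
            while (used[j]) j = (j + 1) % m;
            used[j] = true;
        }
        int total = 0;
        for (int home = 0; home < p; home++) {  // one failing search per hash address
            int j = home, probes = 1;
            while (used[j]) { j = (j + 1) % m; probes++; }
            total += probes;
        }
        return total;
    }

    // Sum, over every address 0..p-1, of the chain length, i.e. the number of key
    // comparisons an unsuccessful search makes under separate chaining.
    static int chainingFailureTotal(int[] keys, int p) {
        int[] chainLength = new int[p];
        for (int key : keys) chainLength[key % p]++;
        int total = 0;
        for (int len : chainLength) total += len;
        return total;
    }
}
```

Under these assumptions the totals are 91 and 12, so the unsuccessful ASL is 91/13 = 7 for linear probing and 12/13 for chaining (or 25/13 if the null-pointer check at the end of each chain is counted too).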

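Summary

Week 13

1. Binary search is always faster than sequential search. ❌

2. Binary search can be applied to a sorted linked list. ❌ A linked list has no cheap way to reach the middle element; a sequential (array-based) list does ✔

3. The main measure of a search algorithm's efficiency is ( ): average search length

4. Given the sequence 20,26,41,57,60,78,81,98,108,121,129, finding 81 by binary search takes ( ) comparisons.

The comparisons are with 78, 108, 81 in turn: three in total.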
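The count can be reproduced with a small sketch. It assumes the midpoint is taken as (l + r) / 2 rounded down, which is the convention that yields the sequence 78, 108, 81; the function name is hypothetical.

```java
final class BinarySearchCountSketch {
    // Returns how many elements are compared with target before the search
    // stops (either on a match or when the range becomes empty).
    static int comparisons(int[] a, int target) {
        int l = 0, r = a.length - 1, count = 0;
        while (l <= r) {
            int mid = l + (r - l) / 2;      // floor of (l + r) / 2
            count++;                        // one element examined
            if (a[mid] == target) return count;
            if (a[mid] < target) l = mid + 1;
            else r = mid - 1;
        }
        return count;
    }
}
```

For the sequence in question 4, searching for 81 examines indices 5, 8, and 6, which hold 78, 108, and 81.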

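Week 14

1. The search efficiency of a binary sort tree depends on its height. ✔

2. For a binary search tree, if its pre-order traversal sequence is { 56, 28, 12, 35, 77, 64, 68, 72 }, then 68 is the parent of 72.

The inorder traversal of a BST is ascending: 12, 28, 35, 56, 64, 68, 72, 77. Use the inorder sequence to split left and right subtrees and the preorder sequence to find each root. Root 56 splits the keys; 68 and 72 are in the right subtree, whose root is 77. In 77's left subtree the root is 64. In 64's right subtree the root is 68, and 72 is its right child, so the statement is true.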
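Another way to check such questions is to rebuild the tree: inserting the keys in preorder order into an empty BST reproduces the original tree, because every node appears before its descendants. The sketch below does that and reports a node's parent. The names are hypothetical and the keys are assumed to be distinct.

```java
import java.util.Arrays;

final class BstFromPreorderSketch {
    // Rebuilds the BST by inserting keys in the given order (parents before
    // children) and returns the parent key of target, or -1 if target is the
    // root or is not present.
    static int parentOf(int[] order, int target) {
        int n = order.length;
        int[] left = new int[n], right = new int[n], parent = new int[n];
        Arrays.fill(left, -1);
        Arrays.fill(right, -1);
        Arrays.fill(parent, -1);
        for (int i = 1; i < n; i++) {
            int cur = 0;                            // start every insertion at the root
            while (true) {
                if (order[i] < order[cur]) {
                    if (left[cur] == -1) { left[cur] = i; parent[i] = cur; break; }
                    cur = left[cur];
                } else {
                    if (right[cur] == -1) { right[cur] = i; parent[i] = cur; break; }
                    cur = right[cur];
                }
            }
        }
        for (int i = 0; i < n; i++) {
            if (order[i] == target) return parent[i] == -1 ? -1 : order[parent[i]];
        }
        return -1;
    }
}
```

The same function works for a postorder sequence if it is passed in reverse, since reversed postorder also lists every parent before its children.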

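3. For a binary search tree, if its post-order traversal sequence is { 30, 35, 45, 28, 64, 77, 68, 56 }, then 30 is the parent of 28.

Inorder: 28, 30, 35, 45, 56, 64, 68, 77

From the postorder sequence the root is 56 and the root of its left subtree is 28. In 28's right subtree the root is 45; in 45's left subtree the root is 35; in 35's left subtree the root is 30. So 28 is an ancestor of 30, not its child, and the statement is false.

4. In any binary search tree, the nodes on the same level are ordered from left to right (from largest to smallest). ❌

They are ordered, but from smallest to largest.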

5. Balanced binary tree

Insert { 9, 8, 7, 2, 3, 5, 6, 4 } one by one into an initially empty AVL tree. How many of the following statements is/are FALSE?

the total number of rotations made is 4 (Note: double rotation counts 2 and single rotation counts 1) ❌ (5 rotations are made)
the expectation (round to 0.01) of access time is 2.75 ✔ (average number of comparisons)
there are 1 nodes with a balance factor of -1 ❌ (there are 2: nodes 3 and 8)
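The three statements can be checked by simulating the insertions. The sketch below is a compact AVL insert written for this purpose, not the author's AVLTree class: the rotation counter and the two helper traversals are additions for the example, and the rotation case is chosen from the child's balance factor rather than by comparing keys.

```java
final class AvlQuizSketch {
    static final class Node {
        int key, height = 1;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;
    int rotations = 0;   // a single rotation adds 1, a double rotation adds 2

    static int h(Node n) { return n == null ? 0 : n.height; }
    static int bf(Node n) { return n == null ? 0 : h(n.left) - h(n.right); }
    static void update(Node n) { n.height = Math.max(h(n.left), h(n.right)) + 1; }

    Node rotateRight(Node n) {
        Node r = n.left;
        n.left = r.right;
        r.right = n;
        update(n); update(r);
        rotations++;
        return r;
    }

    Node rotateLeft(Node n) {
        Node r = n.right;
        n.right = r.left;
        r.left = n;
        update(n); update(r);
        rotations++;
        return r;
    }

    void insert(int key) { root = insert(root, key); }

    private Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n;                                    // ignore duplicates
        update(n);
        int b = bf(n);
        if (b > 1) {                                      // left-heavy
            if (bf(n.left) < 0) n.left = rotateLeft(n.left);      // LR: fix the lower level first
            return rotateRight(n);
        }
        if (b < -1) {                                     // right-heavy
            if (bf(n.right) > 0) n.right = rotateRight(n.right);  // RL: fix the lower level first
            return rotateLeft(n);
        }
        return n;
    }

    // Sum of node depths, with the root at depth 1 (comparisons to reach each key).
    static int depthSum(Node n, int depth) {
        return n == null ? 0 : depth + depthSum(n.left, depth + 1) + depthSum(n.right, depth + 1);
    }

    // Number of nodes whose balance factor equals value.
    static int countBalanceFactor(Node n, int value) {
        if (n == null) return 0;
        return (bf(n) == value ? 1 : 0)
             + countBalanceFactor(n.left, value) + countBalanceFactor(n.right, value);
    }
}
```

Inserting 9, 8, 7, 2, 3, 5, 6, 4 in that order gives 5 rotations, a depth sum of 22 (so 22/8 = 2.75 comparisons on average), and 2 nodes with balance factor -1.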