数据结构与算法分析笔记

小杨加油呀

已于 2023-12-18 20:46:47 修改

阅读量110

点赞数

文章标签：笔记

于 2023-11-15 14:34:58 首次发布

本文链接：https://blog.csdn.net/m0_74958469/article/details/134381034

版权

要求

第三章：Algorithm Analysis

概念：asymptotic algorithm analysis, growth rate, best/worst/average case, upper/lower bound, big-Oh/big-Omega/Theta notation

应用题：时间、空间复杂度分析（给定代码或教材中算法）

第四章：list

概念：list, array-based list, (singly/doubly) linked list, (array-based, linked) stack, (array-based, circular, linked) queue, FIFO（先进先出）, LIFO(后进先出)

应用题：stack 和 queue 中数据的出入顺序

算法：不同存储结构（array-based, linked）下的 list/stack/queue 中各种操作的算法

第五章: binary trees

概念：pre-/in-/post-order traversal, level-order(Breadth-First) traversal, full/complete binary tree, height/depth/level of a binary tree，full/complete binary tree 的性质及存储, BST, Huffman tree, heap, priority queue

应用题：BST 中的插入/删除，Huffman 树的构造，heap 的构造，二叉树各种遍历，基于（前序和中序、或中序和后序）遍历序列构造二叉树

算法：基于二叉树遍历的各种算法，BST 中的查找

第三章

要求&笔记

❑ 掌握算法的时间复杂度和空间复杂度的概念，计算方法，数量级表示。

❑ 掌握算法分析的基本方法，会分析一个程序的最好/最差/平均时间复杂度

算法的时间复杂度是指：算法执行过程中需要的基本运算次数

算法的空间复杂度是指：算法执行过程中所需要的存储空间

◼ Factors affect the algorithm’s cost

❑ time(envirionment, the programming language,the quality of code,etc)

❑ space(main memory and disk space)

◼ Primary consideration(estimating an algorithm’s performance)

❑ The number of basic operations required by the algorithm to process an input of a certain size. ◼ Size is the number of inputs processed.

◼ Basic operations must have the property that is time to complete does not depend on the particular values of its operands

Not all inputs of a given size take the same time to run.

存在Best,Worst,Average Cases

The most important factor affecting running time is normally size of the input

(指数型增长率>二次增长>线性增长

渐进算法分析（asymptotic algorithm analysis）：Ignoring the constants simplifies the analysis and keeps us thinking about the most important aspect: the growth rate

❑ 度量当问题规模变大时一种算法或实现它的程序的效率和开销。

❑ 一种估算方法，并不能判断两个“差不多”的程序的相对优越性。

❑ 但它能作为确定一个算法是否值得实现的一个有效工具

Upper Bound (Big-Oh notation)：

◼ Indicates the upper or highest growth rate that the algorithm can have.

◼ 如果算法的增长率上限（case）是f(n)，那么就说这种算法在 O(f(n))中。

Lower Bounds （大Ω 表示法）

◼ Indicates lowest growth rate that the algorithm can have

Theta Notation

◼ When big-Oh and Big-Omega coincide, we indicate this by using $\Theta$ (big-Theta) notation.

◼ Definition: An algorithm is said to be $\Theta$ (h(n)) if it is in O(h(n)) and it is in $\Theta$ (h(n))

Pay attention:

1、double nested for loops are $\Theta$ （n^2）

2、二分法查找（数组实现）

int binary(int[] A,int k)
{
    int left=-1;
    int right=A.length;
    while(left+1 != right){
        int i=(left+right)/2;
        if(l<A[i])    right=i;
        if(l=A[i])    return i;//the position
        if(l>A[i])    left=i;
    {
    return A.length;not in A
}

3、common mistake

◼ For most algorithms，the upper and lower bound for that cost function are always the same.

◼ The distinction between an upper and a lower bound is only worthwhile when you have incomplete knowledge about the thing being measured.

◼ The upper bound for an algorithm is not the same as the worst case for that algorithm for a given input of size n.

❑ 上限 ≠最差情况，下限≠最佳情况

◼ 规模小≠最佳情况，规模大≠最差情况。

❑ best- and worse case instances exist for each possible size of input

◼ Consider the cost for performing sequential search as the size of the array n gets bigger.

作业解答

Determine Θ for the following code fragments in the average case. Assume

that all variables are of type int.

第四章

要求&笔记

Lists, Stack, and Queues

第四章：list

概念：list, array-based list, (singly/doubly) linked list, (array-based, linked) stack, (array-based, circular, linked) queue, FIFO（先进先出）, LIFO(后进先出)

应用题：stack 和 queue 中数据的出入顺序

算法：不同存储结构（array-based, linked）下的 list/stack/queue 中各种操作的算法

◼ Lists

A list is a finite, ordered sequence of data items

❑ –With the exception of the first element, each element has one and only one direct precursor(前驱）.

❑ With the exception of the last element, each element has one and only one direct successor（后继）

concepts: empty list length head tail

1、Array-based list（顺序表）//操作不掌握

删除、插入都需要一个一个地移动

数组实现

2、Linked Lists(链表)//操作需要掌握

◼ Linked list uses dynamic memory allocation which allocates memory for new list elements as needed.

◼ A linked list is made up of a series of objects, called the nodes of the list.

◼ Type:

❑ Singly-linked list(One-way list) 单链表

❑ Doubly Linked Lists 双链表

❑ Singly-linked list(One-way list) 单链表实现（struct）

struct Node{
    int data;
    Node* next;
}

//插入
void insert(Node* curr){
    curr.next=new Node(data,curr.next);
    //尾结点插入 多一步
    if(curr==tail)    tail=curr.next;
}

//append
void append(Node* curr){
    tail=tail.next=new Node(data,NULL);
    //逻辑从右往左
}

//remove and return current element
void remove(Node* curr){
    int data=curr.next.data;
    Node temp=curr.next;
    curr.next=curr.next.next;
    free(temp);//需要temp    worry：curr.next
    //the same if(curr.next==tail) change tail
}

❑ Doubly Linked Lists 双链表实现（struct）

struct Node{
    int data;
    Node *prev;
    Node *next;
}

链表小结（对应线性表）

❑储存特点逻辑关系上相邻的两个元素在物理存储空间上不相邻

❑优点在插入、删除某一元素时，操作简单O（1）充分利用零散空间

❑缺点空间分配大（指针）

（待续）

◼ Stacks

Restricted form of list: Insert and remove only at front of list.

LIFO: Last In, First Out. （先进后出）

Notation:

◼ The accessible element is called TOP.

◼ Insert: PUSH 入栈

◼ Remove: POP 出栈

stack variations：

◼ array-based stack

◼ linked stack

eg.打开网页，一级一级的

Comparison of Array-Based and Linked Stacks

◼ Time

❑ All operations for the array-based and linked stack implementations take constant time.

◼ Space

❑ The array-based stack must declare a fixed-size array initially, and some of that space is wasted whenever the stack is not full.

❑ The linked stack can shrink and grow but requires the overhead of a link field for every element.

//A example of replacing recursion with a stack
long fact(int n, Stack<int>& S) { 
    // Compute n!
    // To fit n! in a long variable, require n <= 12

    Assert((n >= 0) && (n <= 12), "Input out of range");
    while (n > 1){
         S.push(n--); // Load up the stack
    }
    long result = 1; // Holds final result
    while(S.length() > 0){
         result = result * S.pop(); // Compute
    }
    return result;
}

堆的创建和删除

#include<stack.h>
#include<string.h>
stack<char> opnd; 
// empty() top() pop() push()

//用empty（）清空栈
if(!opnd.empty()){
    opnd.pop();
}

//录入小数 
//输入小数 判断 isnum||‘.’
ch=cin.get();
//判断可否录入//循环
{
    string str;
    str=str+ch;
    //get next ch
    //until ch is not num/'.'
}

opnd.push(atof(str.c_str()));

◼ Queues

◼ The queue is a list-like structure that provides restricted access to its elements.

◼ Queue elements may only be inserted at the back (called an enqueue operation) and removed from the front (called a dequeue operation).

◼ FIFO: First-In, First-Out(先进先出)

◼ Notation:

❑ enqueue(入队)

❑ dequeue(出队)

◼ two queues

❑ array-based queue

❑ linked queue

Array-Based Queue

解决：circular queue

允许队列直接从最后位置延续到最前面的位置。

◼ 实现方式:

取模运算。 ❑ front=(front+1)%size

❑ rear=(rear+1)%size （%size==mod(size)）

判断Full & empty

情况相同：解决——一个位置不存

◼ 存储n个元素, 数组大小size为n+1。

◼ 队首指针进1: front=front+1 ->front = (front +1)%size

◼ 队尾指针进1: rear=rear+1 ->rear = (rear +1)%size

◼ 元素个数: rear-front+1 ->(rear-front +1+size)%size//非改进版本注意是否有一个位置未用

◼ 空: 元素个数为0 ->front = (rear +1)%size

◼ 满：front = rear + 2 ->front = (rear +2)%size

//实现
#include<queue.h>

queue Q;
Q.push();//入队
Q.front();//出队

实现：linked queue

◼ 链式队列(linked queue): 不连续存放元素

❑ 队首在链头, 队尾在链尾。

❑ front 指向表头的指针

❑ rear 指向队尾元素的指针

◼ 链式队列在进队时无队满问题, 但有队空问题。

◼ 队空条件？（front==rear）

Comparison of Array-Based and Linked Queues

◼ 时间效率差不多: 所有操作只需常数时间。

◼ 空间代价

❑ 顺序队列不够满时, 一些空间将浪费掉；

❑ 链式队列长度可变, 但链接域会产生结构性开销

◼ Dictionaries

作业解答

判断字符串是否为回文

回文：“abccba” “1234321”

要求：运用栈和队的知识

关键：栈的特点是先入后出（LIFO）&队的特点是先入先出（FIFO）结合用

因此把字符串分为两部分分别放在两种数据结构中前放队/堆都行

第五章

要求&笔记

第五章: binary trees

概念：pre-/in-/post-order traversal, level-order(Breadth-First) traversal, full/complete binary tree, height/depth/level of a binary tree，full/complete binary tree 的性质及存储, BST, Huffman tree, heap, priority queue

应用题：BST 中的插入/删除，Huffman 树的构造，heap 的构造，二叉树各种遍历，基于（前序和中序、或中序和后序）遍历序列构造二叉树

算法：基于二叉树遍历的各种算法，BST 中的查找

树: 一种简单的非线性结构,数据元素之间具有层次特性。

◼ A binary tree is made up of a finite set of nodes that is either empty or consists of a node called the root together with two binary trees, called the left and right subtrees, which are disjoint from each other and from the root.

二叉树特点：

(1) 非空二叉树只有一个根结点。

(2) 每个结点最多有两棵子树

depth:相对node的高度

height：绝对高度

◼ Full binary tree满二叉树: Each node is either a leaf or internal node with exactly two non-empty children.

◼ Complete binary tree完全二叉树: If the height of the tree is d, then all leaves except possibly level d-1 are completely full. The bottom level has all nodes to the left side

(其实考试是全英文，但中文更好记？

例：

（左为满二叉树右为完全二叉树

（满二叉树就是一组组凑上去完全二叉树最后一行连着

满二叉树的性质：

Theorem1: The number of leaves in a non-empty full binary tree is one more than the number of internal nodes（子叶结点树比内部结点数多一，数学归纳法证明）

Theorem2: The number of empty subtrees in a non-empty binary tree is one more than the number of nodes in the tree（空子叶节点数比树总结点数多一，证明：把空子叶结点代替为叶节点）

Traversals：(实现代码也需要记得运用)

先序遍历（prve）	中左右
中序遍历（in）	左中右	BST正序排列，可表示算式
后序遍历（post）	左右中
考法：已知两种遍历结果，求另一种

例：

后序：

a may way:（不适用似乎

//the first implement (the same class)
template <typename Key, typename E>
class BSTNode : public BinNode<E> {
private:
E it; // The node’s value
BSTNode* lc; // Pointer to left child
BSTNode* rc; // Pointer to right child
}

//the second implement (leafnode and intlnode)

◼ Full binary tree满二叉树

对于满二叉树的实现

the first implement && the second implement

// Return the number of nodes in the tree

template <typename E>
int count(BinNode<E>* root) {
if (root == NULL) return 0; // Nothing to count
return 1 + count(root->left())+ count(root->right());
}

// Return the leaf number of nodes in the tree

template <typename E>
int count(BinNode<E>* root) {
if (root == NULL) return 0; // Nothing to count
if ((root->left()==NULL)&&(root->right()==NULL))    return 1;
return count(root->left())+count(root->right());
}

space overhead

Overhead is the amount of space necessary to maintain the data structure. In other words, it is any space not used to store data records.

◼ Complete binary tree完全二叉树

对于完全二叉树的实现

对完全二叉树，可以由节点的序号计算节点在二叉树中的位置

◼ if r ≠ 0.

◼ Left child(r) = 2r + 1 if 2r + 1 < n.

◼ Right child(r) = 2r + 2 if 2r + 2 < n

◼ Left sibling(r) = r - 1 if r is even.

◼ Right sibling(r) = r + 1 if r is odd and r + 1 < n.

Binary Search Trees

BST: All elements stored in the left subtree of a node with value K have values < K. All elements stored in the right subtree of a node with value K have values >= K

BST中序遍历为升序

？The position of insertion will either be a leaf node or an internal node with no child in the appropriate direction.

BST的操作

BST中的插入

◼The position of insertion will either be a leaf node or an internal node with no child in the appropriate direction.

◼ If duplicate keys are allowed, our convention will be to insert the duplicate in the right subtree.

BST 中的删除

first find R and then remove it from the tree.

◼ If R has no children, then R’s parent has its pointer set to NULL.

◼ If R has one child, then R’s parent has its pointer set to R’s child (similar to deletemin).

◼ If R has two children?——R作为“根”，选择“右子树”的最小值代替

算法：BST查找

结点查找：从根结点开始，在二叉查找树中检索值K。若结点值为K，则检索结束；若K小于结点值，则在左子树中检索；若K大于结点值，则在右子树中检索。以此类推，一直到K被找到或检索到叶结点为止。

代码实现：

template <typename Key, typename E>
E BST<Key, E>::findhelp(BSTNode<Key, E>* root,
const Key& k) const {
if (root == NULL) return NULL; // Empty tree
if (k < root->key())
return findhelp(root->left(), k); // Check left
else if (k > root->key())
return findhelp(root->right(), k); // Check right
else return root->element(); // Found it
}

Heaps and Priority Queues

◼ There are many situations where we wish to choose the next “most important” from a collection of people, tasks, or objects.

◼ Priority queue ：a collection of objects is organized by importance or priority.

❑ Unsorted list

search for the element with highest priority will take Θ(n) time.

❑ Sorted list

require Θ(n) time for either insertion or removal

❑ A BST that organizes records by priority

the total of n insertion an removal Θ(nlogn)

may be unbalanced, leading to bad performance