table / sort / tree

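Article link: https://blog.csdn.net/yawen9790/article/details/89344064

This article introduces the concept of the hash table, including its purpose, its advantages and the collision problem. It then discusses different sorting algorithms, such as bubble sort, insertion sort and selection sort, as well as more efficient methods such as quick sort and merge sort. Finally, it covers the basic concepts of binary trees, such as binary search trees, AVL trees and red-black trees, and briefly mentions graphs and how they are represented.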


Hash Table
Abstract data type
Uses: faster data retrieval, encrypting and decrypting digital signatures
Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects.
A hash table uses the key of each record to determine the location in an array structure. To do this, the key is passed into a hash function which will then return a numeric value based on the key.
A good hash function should be:
uniform (all indices are equally likely for the given set of possible keys)
random (not predictable from patterns in the keys, although the same key must always produce the same index)
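As a rough illustration of passing a key into a hash function to get an array index, here is a minimal sketch. The function name `hashIndex` and the multiplier 31 are assumptions chosen for the example, not something the notes prescribe.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: maps a string key to an index in [0, capacity).
// The multiplier 31 is a common choice, assumed here for illustration.
// Assumes capacity > 0.
std::size_t hashIndex(const std::string& key, std::size_t capacity) {
    std::size_t h = 0;
    for (std::size_t k = 0; k < key.size(); ++k) {
        // unsigned arithmetic wraps around on overflow, which is fine here
        h = h * 31 + static_cast<unsigned char>(key[k]);
    }
    return h % capacity;
}
```

The same key always yields the same index, while keys that differ by one character usually land in different locations.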
Load Factor
λ = number of records in the table / number of locations
Collisions
A hash function translates all possible keys into indexes in an array. This typically means that there are many more keys than there are indexes. Thus, it is always possible that two or more keys will be translated into exactly the same index. When this happens you have a collision. Generally speaking, collisions are unavoidable.
Bucketing makes the hash table a 2D array instead of a single-dimensional array. Every entry in the array is big enough to hold N items (N is not the amount of data, just a constant).
Problems:
• Lots of wasted space.
• If N is exceeded, another strategy will need to be used.
• Not good for memory-based implementations, but doable if buckets are disk-based.
Chaining
At every location (hash index) in your hash table store a linked list of items.
For chaining, the run time depends on the load factor. The average length of each chain is λ.
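A minimal sketch of chaining, assuming string keys, int values and `std::hash` as the hash function. The class name `ChainedTable` and its member names are hypothetical, and resizing and removal are left out.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Sketch of a chained hash table. Assumes capacity > 0.
class ChainedTable {
    std::vector<std::list<std::pair<std::string, int> > > table_;
    std::size_t size_;   // number of records stored

    std::size_t indexOf(const std::string& key) const {
        return std::hash<std::string>()(key) % table_.size();
    }

public:
    explicit ChainedTable(std::size_t capacity) : table_(capacity), size_(0) {}

    // Insert a new record, or overwrite the value of an existing key.
    void update(const std::string& key, int value) {
        std::list<std::pair<std::string, int> >& chain = table_[indexOf(key)];
        for (std::list<std::pair<std::string, int> >::iterator it = chain.begin();
             it != chain.end(); ++it) {
            if (it->first == key) { it->second = value; return; }
        }
        chain.push_back(std::make_pair(key, value));
        ++size_;
    }

    // Returns true and fills value if the key is present.
    bool find(const std::string& key, int& value) const {
        const std::list<std::pair<std::string, int> >& chain = table_[indexOf(key)];
        for (std::list<std::pair<std::string, int> >::const_iterator it = chain.begin();
             it != chain.end(); ++it) {
            if (it->first == key) { value = it->second; return true; }
        }
        return false;
    }

    // Load factor = records / locations = average chain length.
    double loadFactor() const {
        return static_cast<double>(size_) / table_.size();
    }
};
```

Because every colliding record simply joins the list at its index, the load factor may exceed 1; searches then slow down in proportion to the chain length.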
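Linear probing: the table stays a single-dimensional array. When a collision occurs, the following locations are checked one after another until an empty one is found.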
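A minimal sketch of linear probing, assuming string keys, `std::hash` as the hash function and a fixed capacity. The class name `ProbingTable` is hypothetical; deletion and resizing are deliberately omitted.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Sketch of linear probing in a single array. Assumes capacity > 0.
class ProbingTable {
    std::vector<std::string> keys_;
    std::vector<bool> used_;   // used_[i] is true when location i holds a key

    std::size_t home(const std::string& key) const {
        return std::hash<std::string>()(key) % keys_.size();
    }

public:
    explicit ProbingTable(std::size_t capacity)
        : keys_(capacity), used_(capacity, false) {}

    // Returns false only when the table is full and the key is not in it.
    bool insert(const std::string& key) {
        std::size_t start = home(key);
        for (std::size_t step = 0; step < keys_.size(); ++step) {
            std::size_t i = (start + step) % keys_.size();  // wrap around
            if (!used_[i]) { keys_[i] = key; used_[i] = true; return true; }
            if (keys_[i] == key) return true;               // already present
        }
        return false;
    }

    bool contains(const std::string& key) const {
        std::size_t start = home(key);
        for (std::size_t step = 0; step < keys_.size(); ++step) {
            std::size_t i = (start + step) % keys_.size();
            if (!used_[i]) return false;   // an empty location ends the probe
            if (keys_[i] == key) return true;
        }
        return false;
    }
};
```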
Bubble sort: the algorithm repeatedly bubbles the largest item within the list to the back.
void bubbleSort(int arr[], int size) {
    int tmp, i, j;
    for (i = 0; i < size - 1; i++) {
        for (j = 0; j < size - 1 - i; j++) {
            if (arr[j + 1] < arr[j]) {
                // swap the out-of-order neighbours
                tmp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = tmp;
            }
        }
    }
}
Insertion sort: the algorithm repeatedly inserts a value into an array that is already sorted. It essentially chops the array into two pieces. The first piece is sorted, the second is not. We repeatedly take a number from the second piece and insert it into the already sorted first piece of the array.
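void insertionSort(int arr[], int size) {
    int curr;
    int i, j;
    for (i = 1; i < size; i++) {
        curr = arr[i];  // value to insert into the sorted first piece
        for (j = i; j > 0 && arr[j - 1] > curr; j--) {
            arr[j] = arr[j - 1];  // shift larger values one spot to the right
        }
        arr[j] = curr;
    }
}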
Selection sort: the algorithm repeatedly selects the smallest value out of the unsorted part of the array and places it at the back of the sorted part of the array.
void selectionSort(int arr[], int size) {
    int min;
    for (int i = 0; i < size - 1; i++) {
        min = i;  // index of the smallest value found so far
        for (int j = i + 1; j < size; j++) {
            if (arr[j] < arr[min]) {
                min = j;
            }
        }
        if (min != i) {
            int tmp = arr[min];
            arr[min] = arr[i];
            arr[i] = tmp;
        }
    }
}
Merge sort works on the idea of merging two already sorted lists. If there exist two already sorted lists, merging the two together into a single sorted list can be accomplished in O(n) time.
• Have a way to “point” at the first element of each of the two lists
• Compare the values being pointed at and pick the smaller of the two
• Copy the smaller to the merged list, and advance the “pointer” of just that list to the next item
• Continue until one of the lists is completely copied, then copy over the remainder of the other list
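The merge steps above can be sketched as follows, with array indices playing the role of the "pointers". The function names `mergeSorted` and `mergeSort`, and the use of `std::vector<int>`, are assumptions made for the example; the recursive `mergeSort` is one common way of building the full sort on top of the merge.

```cpp
#include <cstddef>
#include <vector>

// Merge two already sorted vectors into one sorted vector in O(n) time.
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;              // "pointers" into a and b
    while (i < a.size() && j < b.size()) {
        if (b[j] < a[i]) merged.push_back(b[j++]);
        else             merged.push_back(a[i++]);
    }
    while (i < a.size()) merged.push_back(a[i++]);  // copy the remainder
    while (j < b.size()) merged.push_back(b[j++]);
    return merged;
}

// Merge sort: split in half, sort each half recursively, merge the results.
std::vector<int> mergeSort(const std::vector<int>& v) {
    if (v.size() < 2) return v;
    std::size_t mid = v.size() / 2;
    std::vector<int> left(v.begin(), v.begin() + mid);
    std::vector<int> right(v.begin() + mid, v.end());
    return mergeSorted(mergeSort(left), mergeSort(right));
}
```

The temporary vectors are the extra memory that merge sort needs.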
Quick sort is fast and does not have the extra memory requirements of merge sort. On average its run time is O(n log n), but it does have a worst case run time of O(n^2).
QuickSort works like this:
1. Pick a value from the array as the pivot.
2. Let i = front, j = back.
3. Advance i until you find a value arr[i] > pivot.
4. Move j towards the front of the array until arr[j] < pivot.
5. Swap arr[i] and arr[j].
6. Repeat steps 3 to 5 until i and j meet.
7. The meeting point is used to break the array up into two pieces.
8. QuickSort these two new arrays.
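A minimal sketch of the steps above. Two details are assumptions of this sketch rather than part of the notes: the pivot is taken from the middle of the range, and the scans stop on values equal to the pivot as well (a common variant that keeps both indices inside the range).

```cpp
#include <utility>
#include <vector>

// Sorts arr[front..back], both ends inclusive.
void quickSort(std::vector<int>& arr, int front, int back) {
    if (front >= back) return;                      // 0 or 1 items: done
    int pivot = arr[front + (back - front) / 2];    // assumed pivot choice
    int i = front;
    int j = back;
    while (i <= j) {
        while (arr[i] < pivot) ++i;   // stop at a value >= pivot
        while (arr[j] > pivot) --j;   // stop at a value <= pivot
        if (i <= j) {
            std::swap(arr[i], arr[j]);
            ++i;
            --j;
        }
    }
    quickSort(arr, front, j);   // the meeting point splits the array in two
    quickSort(arr, i, back);
}

// Convenience overload that sorts the whole vector.
void quickSort(std::vector<int>& arr) {
    quickSort(arr, 0, static_cast<int>(arr.size()) - 1);
}
```

Everything happens inside the original array, which is why no merge buffer is needed.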
Heap Sort is a sort based on the building and destroying of a binary heap. The binary heap is an implementation of a priority queue.
Basic operations on the binary heap include:
insert - adds an item to the binary heap
• Create a new empty node in the leftmost open spot at the bottom level of the tree.
• If the value can be placed into the node without violating the heap order property, put it in.
• Otherwise pull the value from the parent into the empty node (the empty node moves up).
• Repeat the previous two steps until the value can be placed.
delete - removes the item with the highest priority in the binary heap
• Remove the value at the root, which leaves an empty node there, and take the value of the last node on the bottom level as the value that must be placed again.
• If the value can be placed into the empty node (remember, this starts at the root) without violating the heap order property, put it in and we are done.
• Otherwise move the child with the higher priority up (the empty spot moves down).
• Repeat until the value is placed.
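The insert and delete steps can be sketched with the heap stored in an array, where the children of index i sit at 2i+1 and 2i+2. This sketch assumes that the smaller value has the higher priority; the class name `MinHeap` and its member names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Binary heap sketch in which the smaller value has the higher priority.
class MinHeap {
    std::vector<int> data_;
public:
    bool isEmpty() const { return data_.empty(); }

    void insert(int value) {
        data_.push_back(value);                   // new empty node at the bottom
        std::size_t hole = data_.size() - 1;
        while (hole > 0 && value < data_[(hole - 1) / 2]) {
            data_[hole] = data_[(hole - 1) / 2];  // pull the parent's value down
            hole = (hole - 1) / 2;                // the empty node moves up
        }
        data_[hole] = value;
    }

    // Removes and returns the smallest value. Assumes the heap is not empty.
    int deleteMin() {
        int top = data_[0];
        int value = data_.back();                 // last value must be placed again
        data_.pop_back();
        if (data_.empty()) return top;
        std::size_t hole = 0;                     // the empty node starts at the root
        while (2 * hole + 1 < data_.size()) {
            std::size_t child = 2 * hole + 1;
            if (child + 1 < data_.size() && data_[child + 1] < data_[child]) {
                ++child;                          // pick the higher-priority child
            }
            if (!(data_[child] < value)) break;   // value fits here
            data_[hole] = data_[child];           // child moves up, hole moves down
            hole = child;
        }
        data_[hole] = value;
        return top;
    }
};
```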
Binary Heap - A binary heap is a complete binary tree where the heap order property is always maintained.
Binary Tree - A binary tree is either a) empty (no nodes), or b) contains a root node with two children which are both binary trees.
Complete Binary Tree - A binary tree where there are no missing nodes at any level except possibly the bottom level. At the bottom level the missing nodes must be to the right of all other nodes.
Heap Order Property: For each node, the parent of the node must have a higher priority, while its children must have a lower priority. There is no ordering of priority other than this rule. Thus, the highest priority item will be at the root of the tree. For example, if we define the smaller value as having higher priority, the smallest value is at the root.
The insert process effectively creates an empty node starting at the bottom of the tree. The empty node moves up until it is in the correct position and the value can be placed inside the empty node. This process of moving the empty node towards the root is called percolate up. The process of moving the empty spot down the heap, as delete does, is called percolate down.
To sort an array in ascending order (small to big), our heap should be built so that larger values have higher priority.
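A minimal sketch of heap sort for ascending order, assuming the heap is built inside the array itself with larger values having higher priority. The function names `percolateDown` and `heapSort` are chosen for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Percolate the value at index hole down within the heap arr[0..size).
// Larger values have higher priority.
void percolateDown(std::vector<int>& arr, std::size_t hole, std::size_t size) {
    int value = arr[hole];
    while (2 * hole + 1 < size) {
        std::size_t child = 2 * hole + 1;
        if (child + 1 < size && arr[child] < arr[child + 1]) ++child;
        if (!(value < arr[child])) break;   // value fits in the hole
        arr[hole] = arr[child];             // larger child moves up
        hole = child;
    }
    arr[hole] = value;
}

void heapSort(std::vector<int>& arr) {
    std::size_t n = arr.size();
    // Build the heap: percolate down every node that has children.
    for (std::size_t i = n / 2; i-- > 0; ) {
        percolateDown(arr, i, n);
    }
    // Destroy the heap: move the largest value to the back, shrink the heap.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(arr[0], arr[end - 1]);
        percolateDown(arr, 0, end - 1);
    }
}
```

Each removed maximum lands just past the end of the shrinking heap, so the array ends up sorted from small to big.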
The examples below refer to a tree in which A is the root with children B and C; B has children D, E and F; C has children G and H; and H has children I and J.
Node: the thing that stores the data within the tree (each circle in the diagram is a node).
Root Node: the topmost node from which all other nodes come. A is the root node of the tree.
Subtree: Some portion of the entire tree; it includes a node (the root of the subtree) and every node that goes downwards from there. A is the root of the entire tree. B is the root of the subtree containing B, D, E and F.
Empty tree: A tree with no nodes.
Leaf Node: A node with only empty subtrees (no children). Ex. D, E, F, I, J and G are all leaf nodes.
Children: the nodes that are directly 1 link down from a node are that node's child nodes. Ex. B is the child of A. I is the child of H.
Parent: the node that is directly 1 link up from a node. Ex. A is the parent of B. H is the parent of I.
Sibling: All nodes that have the same parent node are siblings. Ex. E and F are siblings of D but H is not.
Ancestor: All nodes that can be reached by moving only in an upward direction in the tree. Ex. C, A and H are all ancestors of I but G and B are not.
Descendants or Successors of a node are nodes that can be reached by only going down in the tree. Ex. Descendants of C are G, H, I and J.
Depth: Distance from the root node of the tree. The root node is at depth 0. B and C are at depth 1. Nodes at depth 2 are D, E, F, G and H. Nodes at depth 3 are I and J.
Height: Total number of nodes from the root to the furthest leaf. Our tree has a height of 4.
Path: Set of branches taken to connect an ancestor of a node to the node. Usually described by the set of nodes encountered along the path.
Binary tree: A binary tree is a tree where every node has 2 subtrees that are also binary trees. The subtrees may be empty. Each node has a left child and a right child. Our tree is NOT a binary tree because B has 3 children.
Binary search trees (BST) are binary trees where values are placed in a way that supports efficient searching.
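Traversals
depth first (follow a branch as far as it will go before backtracking to take another)
preorder (p-l-r: parent, left, right)
inorder (l-p-r: left, parent, right; in a BST this visits the values from smallest to largest)
postorder (l-r-p: left, right, parent)
breadth first (go through all nodes at one level before going to the next)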
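The four traversal orders can be sketched on a bare node structure. `TNode` and the function names are hypothetical, and `std::queue` is assumed here in place of a hand-written queue class; each function appends the visited values to `out` so the order is easy to inspect.

```cpp
#include <queue>
#include <vector>

struct TNode {          // hypothetical minimal node type
    int data;
    TNode* left;
    TNode* right;
};

void preorder(const TNode* n, std::vector<int>& out) {
    if (!n) return;
    out.push_back(n->data);       // parent first
    preorder(n->left, out);
    preorder(n->right, out);
}

void inorder(const TNode* n, std::vector<int>& out) {
    if (!n) return;
    inorder(n->left, out);
    out.push_back(n->data);       // parent between its subtrees
    inorder(n->right, out);
}

void postorder(const TNode* n, std::vector<int>& out) {
    if (!n) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->data);       // parent last
}

void breadthFirst(const TNode* root, std::vector<int>& out) {
    std::queue<const TNode*> waiting;   // nodes still to be visited
    if (root) waiting.push(root);
    while (!waiting.empty()) {
        const TNode* curr = waiting.front();
        waiting.pop();
        out.push_back(curr->data);
        if (curr->left) waiting.push(curr->left);
        if (curr->right) waiting.push(curr->right);
    }
}
```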
template <typename T>
class BST {
    struct Node {
        T data_;
        Node* left_;
        Node* right_;
        Node(const T& data, Node* left = nullptr, Node* right = nullptr) {
            data_ = data;
            left_ = left;
            right_ = right;
        }
    };
    Node* root_;
public:
    BST() { root_ = nullptr; }
    void insert(const T& data) {
        if (root_ == nullptr) {
            root_ = new Node(data);
        }
        else {
            bool isInserted = false;  // set to true once we insert the node
            Node* curr = root_;       // used to iterate through nodes
            while (!isInserted) {
                if (data < curr->data_) {
                    // data belongs in left subtree because it is
                    // smaller than current node
                    if (curr->left_) {
                        // there is a node to the left so go left
                        curr = curr->left_;
                    }
                    else {
                        curr->left_ = new Node(data);
                        isInserted = true;
                    }
                }
                else {
                    // data belongs in right subtree
                    if (curr->right_) {
                        curr = curr->right_;
                    }
                    else {
                        curr->right_ = new Node(data);
                        isInserted = true;
                    }
                }
            }
        }
    }
    bool search(const T& data) const {
        Node* curr = root_;
        bool found = false;
        while (!found && curr) {
            if (data == curr->data_) {
                found = true;
            }
            else if (data < curr->data_) {
                curr = curr->left_;
            }
            else {
                curr = curr->right_;
            }
        }
        return found;
    }
    void breadthFirstPrint() const {…}
    void inOrderPrint() const {…}
    void preOrderPrint() const {…}
    ~BST() {…}
};
Queue
void breadthFirstPrint() const {
    Queue<Node*> theNodes;
    if (root_) {
        theNodes.enqueue(root_);
    }
    while (!theNodes.isEmpty()) {
        Node* curr = theNodes.front();
        theNodes.dequeue();
        if (curr->left_) {
            theNodes.enqueue(curr->left_);
        }
        if (curr->right_) {
            theNodes.enqueue(curr->right_);
        }
        std::cout << curr->data_ << " ";
    }
    std::cout << std::endl;
}
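// recursive search: look in the subtree rooted at subtree
bool search(const T& data, const Node* subtree) const {
    bool rc = false;
    if (subtree != nullptr) {
        if (data == subtree->data_) {
            rc = true;
        }
        else if (data < subtree->data_) {
            rc = search(data, subtree->left_);
        }
        else {
            rc = search(data, subtree->right_);
        }
    }
    return rc;
}
// recursive insert: subtree is a reference so the parent's link is updated
void insert(const T& data, Node*& subtree) {
    if (subtree == nullptr) {
        subtree = new Node(data);
    }
    else if (data < subtree->data_) {
        insert(data, subtree->left_);
    }
    else {
        insert(data, subtree->right_);
    }
}
// recursive inorder print: left subtree, node, right subtree
void inOrderPrint(const Node* subtree) const {
    if (subtree != nullptr) {
        inOrderPrint(subtree->left_);
        std::cout << subtree->data_ << std::endl;
        inOrderPrint(subtree->right_);
    }
}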
AVL Tree
A tree is perfectly balanced if it is empty or the number of nodes in each subtree differs by no more than 1. In a perfectly balanced tree, we know that searching either the left or right subtree from any point will take about the same amount of time, so a search is O(log n).
A height balanced tree is either empty or the heights of the left and right subtrees of every node differ by no more than 1. An AVL tree is a height balanced binary search tree.
Search through a height balanced tree is O(log n). Insert and delete can also be done in O(log n) time.
To make this possible, each node must maintain not only its data and children information, but also a height balance value.
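The height balance condition can be sketched as a check that recomputes heights on demand. This is only an illustration: an actual AVL tree stores the balance value in each node instead of recomputing it. `AvlNode`, `nodeHeight` and `isHeightBalanced` are hypothetical names, and height is counted in nodes as in the definition earlier in these notes.

```cpp
#include <algorithm>
#include <cstdlib>

struct AvlNode {        // hypothetical minimal node type
    int data;
    AvlNode* left;
    AvlNode* right;
};

// Height counted in nodes; an empty tree has height 0.
int nodeHeight(const AvlNode* n) {
    if (!n) return 0;
    return 1 + std::max(nodeHeight(n->left), nodeHeight(n->right));
}

// True if, at every node, the subtree heights differ by no more than 1.
bool isHeightBalanced(const AvlNode* n) {
    if (!n) return true;
    int diff = nodeHeight(n->left) - nodeHeight(n->right);
    return std::abs(diff) <= 1
        && isHeightBalanced(n->left)
        && isHeightBalanced(n->right);
}
```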
Red-Black Trees are binary search trees that are named after the way the nodes are coloured.
Each node in a red-black tree is coloured either red or black. The height of a red-black tree is at most 2 * log2(n+1).
A red-black tree must maintain the following colouring rules:
Every node must have a colour, either red or black.
The root node must be black.
If a node is red, its children must be black. Null nodes are considered to be black.
Every path from the root to a null pointer must have exactly the same number of black nodes.
Fixing nodes:
If a node is red at the root, change it to black.
If the new node is red and its parent is black, you don't need to do anything.
If a new node is red and its parent is red, what you do depends on the colour of the sibling of the parent:
• if the sibling of the parent is black, a rotation needs to be performed
• if the sibling of the parent is red, a colour swap with the grandparent must be performed
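The colouring rules can be sketched as a checker. It validates only the colours, not the search-tree ordering, and it does not perform any fixing. `RbNode`, `blackCount` and `isRedBlack` are hypothetical names.

```cpp
struct RbNode {         // hypothetical minimal node type
    int data;
    bool red;           // true = red, false = black
    RbNode* left;
    RbNode* right;
};

// Returns the number of black nodes on every path from n down to a null
// pointer (the null itself counted as one black), or -1 if a rule is broken.
int blackCount(const RbNode* n) {
    if (!n) return 1;   // null nodes are considered black
    if (n->red) {
        // a red node must not have a red child
        if ((n->left && n->left->red) || (n->right && n->right->red)) return -1;
    }
    int l = blackCount(n->left);
    int r = blackCount(n->right);
    if (l < 0 || r < 0 || l != r) return -1;   // paths disagree
    return l + (n->red ? 0 : 1);
}

bool isRedBlack(const RbNode* root) {
    if (root && root->red) return false;       // the root must be black
    return blackCount(root) >= 0;
}
```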
A 2-3 tree is a specific form of a B-tree. A 2-3 tree is a search tree. However, it is very different from a binary search tree.
Here are the properties of a 2-3 tree:
each node has either one value or two values
a node with one value is either a leaf node or has exactly two children (non-null). Values in left subtree < value in node < values in right subtree
a node with two values is either a leaf node or has exactly three children (non-null). Values in left subtree < first value in node < values in middle subtree < second value in node < values in right subtree.
all leaf nodes are at the same level of the tree
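Searching a tree with these properties can be sketched as follows. The layout of `Node23` is an assumption of this sketch: a one-value node keeps its two subtrees in `children[0]` and `children[1]`, and a leaf has all children null. Insertion, which involves splitting nodes, is not shown.

```cpp
struct Node23 {             // hypothetical 2-3 tree node
    int count;              // number of values held: 1 or 2
    int values[2];          // values[1] is unused when count == 1
    Node23* children[3];    // left, middle (or right), right
};

bool search23(const Node23* n, int key) {
    while (n) {
        if (key == n->values[0]) return true;
        if (n->count == 2 && key == n->values[1]) return true;
        if (key < n->values[0]) {
            n = n->children[0];     // left subtree
        } else if (n->count == 1) {
            n = n->children[1];     // right subtree of a one-value node
        } else if (key < n->values[1]) {
            n = n->children[1];     // middle subtree
        } else {
            n = n->children[2];     // right subtree of a two-value node
        }
    }
    return false;
}
```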
A graph is made up of a set of vertices and edges that form connections between vertices. If the edges are directed, the graph is sometimes called a digraph. Graphs can be used to model data where we are interested in connections and relationships between data.
adjacent - Given two nodes A and B, B is adjacent to A if there is a connection from A to B. In a digraph, if B is adjacent to A, it doesn't mean that A is automatically adjacent to B.
edge weight/edge cost - a value associated with a connection between two nodes
path - an ordered sequence of vertices where a connection must exist between consecutive pairs in the sequence
simple path - every vertex in the path is distinct
path length - number of edges in a path
cycle - a path where the starting and ending node is the same
strongly connected - if there exists some path from every vertex to every other vertex, the graph is strongly connected
weakly connected - if we take away the direction of the edges and there exists a path from every node to every other node, the digraph is weakly connected
An adjacency matrix is in essence a 2 dimensional array. Each index value represents a node. When given 2 nodes, you can find out whether or not they are connected by simply checking if the value in corresponding array element is 0 or not. For graphs without weights, 1 represents a connection. 0 represents a non-connection.
An adjacency list uses an array of linked lists to represent a graph. Each element of the array represents a vertex. For each vertex it is connected to, a node is added to its linked list. For graphs with weights, each node also stores the weight of the connection to that vertex.
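Both representations can be sketched for a digraph whose vertices are numbered from 0. The class names `MatrixGraph` and `ListGraph`, the `Edge` struct and the default weight of 1 are assumptions made for the example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Adjacency matrix: m_[from][to] is 0 when there is no connection.
class MatrixGraph {
    std::vector<std::vector<int> > m_;
public:
    explicit MatrixGraph(std::size_t n) : m_(n, std::vector<int>(n, 0)) {}
    void addEdge(std::size_t from, std::size_t to, int weight = 1) {
        m_[from][to] = weight;
    }
    bool isConnected(std::size_t from, std::size_t to) const {
        return m_[from][to] != 0;
    }
};

struct Edge {               // one node of a vertex's linked list
    std::size_t to;
    int weight;
};

// Adjacency list: one linked list of outgoing edges per vertex.
class ListGraph {
    std::vector<std::list<Edge> > adj_;
public:
    explicit ListGraph(std::size_t n) : adj_(n) {}
    void addEdge(std::size_t from, std::size_t to, int weight = 1) {
        Edge e = {to, weight};
        adj_[from].push_back(e);
    }
    const std::list<Edge>& neighbours(std::size_t v) const { return adj_[v]; }
};
```

The matrix answers "are these two connected?" in constant time but always uses n * n entries; the list uses space in proportion to the number of edges.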
Dijkstra's Algorithm is an algorithm for finding the shortest path from one vertex to every other vertex. This algorithm is an example of a greedy algorithm. Greedy algorithms are algorithms that find a solution by picking the best solution encountered thus far and expanding on it. Dijkstra's Algorithm was first conceived by Edsger W. Dijkstra.
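A minimal sketch of Dijkstra's Algorithm over an adjacency list, assuming non-negative edge weights and a `std::priority_queue` to pick the closest unfinished vertex. The `Graph` typedef and the convention of reporting unreachable vertices with the largest `long long` value are choices made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[v] holds (neighbour, weight) pairs; weights must be non-negative.
typedef std::vector<std::vector<std::pair<std::size_t, long long> > > Graph;

// Returns the shortest distance from source to every vertex.
std::vector<long long> dijkstra(const Graph& adj, std::size_t source) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);
    typedef std::pair<long long, std::size_t> Item;   // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > pq;
    dist[source] = 0;
    pq.push(Item(0, source));
    while (!pq.empty()) {
        Item top = pq.top();        // greedy step: closest vertex so far
        pq.pop();
        long long d = top.first;
        std::size_t v = top.second;
        if (d > dist[v]) continue;  // stale entry, a shorter path was found
        for (std::size_t k = 0; k < adj[v].size(); ++k) {
            std::size_t to = adj[v][k].first;
            long long w = adj[v][k].second;
            if (d + w < dist[to]) {
                dist[to] = d + w;   // expand on the best solution so far
                pq.push(Item(dist[to], to));
            }
        }
    }
    return dist;
}
```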
Decision problems are problems where for any given input, you will end up with a “yes” or “no” answer. These are the simplest answers.
A Turing machine is a mathematical model of computation that defines an abstract machine, which manipulates symbols on a strip of tape according to a table of rules.
Some problems are undecidable. The halting problem is described simply as this: is it possible to write a program that will determine if any program has an infinite loop? It is not; the halting problem is undecidable.
P class problems are decision problems that can be solved in polynomial time. Note that linear is polynomial time, but so is quadratic; polynomial is essentially n^c where c is a constant. For example, matrix multiplication can be done in polynomial time: the straightforward algorithm for two n x n matrices is O(n^3).
NP class stands for non-deterministic polynomial time. A non-deterministic machine is a machine that has a choice of what action to take after each instruction and furthermore, should one of the actions lead to a solution, it will always choose that action.
A problem is in the NP class if we can verify that a given positive solution to our problem is correct in polynomial time. In other words, you don't have to find the solution in polynomial time, just verify that a solution is correct in polynomial time.
Note that all problems of class P are also class NP.
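To illustrate "verify in polynomial time", here is a sketch of a verifier for subset sum: given a list of numbers and a target, is there a subset that adds up to the target? Subset sum is a well-known NP problem picked for this example; it is not discussed in the notes above, and the function name `verifySubsetSum` is hypothetical. Finding a subset may take exponential time, but checking a proposed one is a single pass.

```cpp
#include <cstddef>
#include <vector>

// chosen holds the indices of the numbers that the proposed solution uses.
// Returns true if the indices are valid, distinct, and sum to target.
// Runs in O(n) time.
bool verifySubsetSum(const std::vector<int>& numbers,
                     const std::vector<std::size_t>& chosen,
                     long long target) {
    std::vector<bool> used(numbers.size(), false);
    long long sum = 0;
    for (std::size_t k = 0; k < chosen.size(); ++k) {
        std::size_t i = chosen[k];
        if (i >= numbers.size() || used[i]) return false;  // invalid or repeated
        used[i] = true;
        sum += numbers[i];
    }
    return sum == target;
}
```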