一些重要的算法The Most Important Algorithms

下面是一些比较重要的算法,原文罗列了32个,但我觉得有很多是数论里的,和计算机的不相干,所以没有选取。下面的这些,有的我们经常在用,有的基本不用。有的很常见,有的很偏。不过了解一下也是好事。也欢迎你留下你觉得有意义的算法。(注:本篇文章并非翻译,其中的算法描述大部份摘自Wikipedia,因为维基百科描述的很专业了)

  1. A*搜寻算法
    俗称A星算法。这是一种在图形平面上,有多个节点的路径,求出最低通过成本的算法。常用于游戏中的NPC的移动计算,或线上游戏的BOT的移动计算上。该算法像Dijkstra算法一样,可以找到一条最短路径;也像BFS一样,进行启发式的搜索。
  2. Beam Search
    束搜索(beam search) 方法是解决优化问题的一种启发式方法,它是在分枝定界方法基础上发展起来的,它使用启发式方法估计k 个最好的路径,仅从这k 个路径出发向下搜索,即每一层只有满意的结点会被保留,其它的结点则被永久抛弃,从而比分枝定界法能大大节省运行时间。束搜索于20 世纪70 年代中期首先被应用于人工智能领域,1976 年Lowerre 在其称为HARPY的语音识别系统中第一次使用了束搜索方法,他的目标是并行地搜索几个潜在的最优决策路径以减少回溯,并快速地获得一个解。
  3. 二分取中查找算法
    一种在有序数组中查找某一特定元素的搜索算法。搜素过程从数组的中间元素开始,如果中间元素正好是要查找的元素,则搜素过程结束;如果某一特定元素大于或者小于中间元素,则在数组大于或小于中间元素的那一半中查找,而且跟开始一样从中间元素开始比较。这种搜索算法每一次比较都使搜索范围缩小一半。
  4. Branch and bound
    分支定界 (branch and bound) 算法是一种在问题的解空间树上搜索问题的解的方法。但与回溯算法不同,分支定界算法采用广度优先或最小耗费优先的方法搜索解空间树,并且,在分支定界算法中,每一个活结点只有一次机会成为扩展结点。
  5. 数据压缩
    数据压缩是通过减少计算机中所存储数据或者通信传播中数据的冗余度,达到增大数据密度,最终使数据的存储空间减少的技术。数据压缩在文件存储和分布式系统领域有着十分广泛的应用。数据压缩也代表着尺寸媒介容量的增大和网络带宽的扩展。
  6. Diffie–Hellman密钥协商
    Diffie–Hellman key exchange,简称“D–H”, 是一种安全协议。它可以让双方在完全没有对方任何预先信息的条件下通过不安全信道建立起一个密钥。这个密钥可以在后续的通讯中作为对称密钥来加密通讯内容。
  7. Dijkstra’s 算法
    迪科斯彻算法(Dijkstra)是由荷兰计算机科学家艾兹格·迪科斯彻(Edsger Wybe Dijkstra)发明的。算法解决的是有向图中单个源点到其他顶点的最短路径问题。举例来说,如果图中的顶点表示城市,而边上的权重表示著城市间开车行经的距离,迪科斯彻算法可以用来找到两个城市之间的最短路径。
  8. 动态规划
    动态规划是一种在数学和计算机科学中使用的,用于求解包含重叠子问题的最优化问题的方法。其基本思想是,将原问题分解为相似的子问题,在求解的过程中通过子问题的解求出原问题的解。动态规划的思想是多种算法的基础,被广泛应用于计算机科学和工程领域。比较著名的应用实例有:求解最短路径问题,背包问题项目管理网络流优化等。这里也有一篇文章说得比较详细。
  9. 欧几里得算法
    在数学中,辗转相除法,又称欧几里得算法,是求最大公约数的算法。辗转相除法首次出现于欧几里得的《几何原本》(第VII卷,命题i和ii)中,而在中国则可以追溯至东汉出现的《九章算术》。
  10. 最大期望(EM)算法
    在统计计算中,最大期望(EM)算法是在概率probabilistic)模型中寻找参数最大似然估计的算法,其中概率模型依赖于无法观测的隐藏变量(Latent Variable)。最大期望经常用在机器学习计算机视觉数据聚类Data Clustering)领域。最大期望算法经过两个步骤交替进行计算,第一步是计算期望(E),利用对隐藏变量的现有估计值,计算其最大似然估计值;第二步是最大化(M),最大化在 E 步上求得的最大似然值来计算参数的值。M 步上找到的参数估计值被用于下一个 E 步计算中,这个过程不断交替进行。
  11. 快速傅里叶变换 (FFT)
    快速傅里叶变换(Fast Fourier Transform,FFT),是离散傅里叶变换的快速算法,也可用于计算离散傅里叶变换的逆变换。快速傅里叶变换有广泛的应用,如数字信号处理、计算大整数乘法、求解偏微分方程等等。本条目只描述各种快速算法,对于离散傅里叶变换的性质和应用,请参见离散傅里叶变换
  12. 哈希函数
    Hash Function是一种从任何一种数据中创建小的数字“指纹”的方法。该函数将数据打乱混合,重新创建一个叫做散列值的指纹。散列值通常用来代表一个短的随机字母和数字组成的字符串。好的散列函数在输入域中很少出现散列冲突。在散列表和数据处理中,不抑制冲突来区别数据,会使得数据库记录更难找到。
  13. 堆排序
    Heapsort 是指利用堆积树)这种数据结构所设计的一种排序算法。堆积树是一个近似完全二叉树的结构,并同时满足堆积属性:即子结点的键值或索引总是小于(或者大于)它的父结点。
  14. 归并排序
    Merge sort是建立在归并操作上的一种有效的排序算法。该算法是采用分治法(Divide and Conquer)的一个非常典型的应用。
  15. RANSAC 算法
    RANSAC 是”RANdom SAmple Consensus”的缩写。该算法是用于从一组观测数据中估计数学模型参数的迭代方法,由Fischler and Bolles在1981 提出,它是一种非确定性算法,因为它只能以一定的概率得到合理的结果,随着迭代次数的增加,这种概率是增加的。 该算法的基本假设是观测数据集中存在”inliers”(那些对模型参数估计起到支持作用的点)和”outliers”(不符合模型的点),并且这组观测数据受到噪声影响。RANSAC 假设给定一组”inliers”数据就能够得到最优的符合这组点的模型。
  16. RSA加密演算法
    这是一个公钥加密算法,也是世界上第一个适合用来做签名的算法。今天的RSA已经专利失效,其被广泛地用于电子商务加密,大家都相信,只要密钥足够长,这个算法就会是安全的
  17. 并查集Union-find
    并查集是一种树型的数据结构,用于处理一些不相交集合(Disjoint Sets)的合并及查询问题。常常在使用中以森林来表示。
  18. Viterbi algorithm
    寻找最可能的隐藏状态序列(Finding most probable sequence of hidden states)

附录

from: http://coolshell.cn/articles/2583.html


附:英文版原文:

The Most Important Algorithms

After a long discussion with some of my RISC colleagues about what the 5 most important algorithms on the world are, we couldn't reach a consensus on this question. So I suggested to perform a little survey. The criterion for suggestions was that these algorithms should be widely used. Further we restrict ourselves to the fields of computer science and mathematics. 
As I expected the number of different suggestions is close to

5 * (no. of participants)

In the following you find the results (in alphabetical order) of this survey (which of course is highly non-representative since most of the participants are computer scientists).
  1. A* search algorithm 
    Graph search algorithm that finds a path from a given initial node to a given goal node. It employs a heuristic estimate that ranks each node by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search.
  2. Beam Search 
    Beam search is a search algorithm that is an optimization of best-first search. Like best-first search, it uses a heuristic function to evaluate the promise of each node it examines. Beam search, however, only unfolds the first m most promising nodes at each depth, where m is a fixed number, the beam width.
  3. Binary search 
    Technique for finding a particular value in a linear array, by ruling out half of the data at each step.
  4. Branch and bound 
    A general algorithmic method for finding optimal solutions of various optimization problems, especially in discrete and combinatorial optimization.
  5. Buchberger's algorithm 
    In computational algebraic geometry and computational commutative algebra, Buchberger's algorithm is a method of transforming a given set of generators for a polynomial ideal into a Gröbner basis with respect to some monomial order. One can view it as a generalization of the Euclidean algorithm for univariate gcd computation and of Gaussian elimination for linear systems.
  6. Data compression 
    Data compression or source coding is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use through use of specific encoding schemes.
  7. Diffie-Hellman key exchange 
    Cryptographic protocol which allows two parties that have no prior knowledge of each other to jointly establish a shared secret key over an insecure communications channel. This key can then be used to encrypt subsequent communications using a symmetric key cipher.
  8. Dijkstra's algorithm 
    Algorithm that solves the single-source shortest path problem for a directed graph with nonnegative edge weights.
  9. Discrete differentiation 
    I.e., the formula f'(x) = (f(x+h) - f(x-h)) / 2h.
  10. Dynamic programming 
    Dynamic programming is a method for reducing the runtime of algorithms exhibiting the properties of overlapping subproblems and optimal substructure, described below.
  11. Euclidean algorithm 
    Algorithm to determine the greatest common divisor (gcd) of two integers. It is one of the oldest algorithms known, since it appeared in Euclid's Elements around 300 BC. The algorithm does not require factoring the two integers.
  12. Expectation-maximization algorithm (EM-Training)
    In statistical computing, an expectation-maximization (EM) algorithm is an algorithm for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. EM alternates between performing an expectation step, which computes the expected value of the latent variables, and a maximization step, which computes the maximum likelihood estimates of the parameters given the data and setting the latent variables to their expectation.
  13. Fast Fourier transform (FFT) 
    Efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. FFTs are of great importance to a wide variety of applications, from digital signal processing to solving partial differential equations to algorithms for quickly multiplying large integers.
  14. Gradient descent 
    Gradient descent is an optimization algorithm that approaches a local minimum of a function by taking steps proportional to the negative of the gradient (or the approximate gradient) of the function at the current point. If instead one takes steps proportional to the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent.
  15. Hashing 
    A function for summarizing or probabilistically identifying data. Typically this means one applies a mathematical formula to the data, producing a string which is probably more or less unique to that data. The string is much shorter than the original data, but can be used to uniquely identify it.
  16. Heaps (heap sort) 
    In computer science a heap is a specialized tree-based data structure. Heaps are favourite data structures for many applications: Heap sort, selection algorithms (finding the min, max or both of them, median or even any kth element in sublinear time), graph algorithms.
  17. Karatsuba multiplication 
    For systems that need to multiply numbers in the range of several thousand digits, such as computer algebra systems and bignum libraries, long multiplication is too slow. These systems employ Karatsuba multiplication, which was discovered in 1962.
  18. LLL algorithm 
    The Lenstra-Lenstra-Lovasz lattice reduction (LLL) algorithm is an algorithm which, given a lattice basis as input, outputs a basis with short, nearly orthogonal vectors. The LLL algorithm has found numerous applications in cryptanalysis of public-key encryption schemes: knapsack cryptosystems, RSA with particular settings, and so forth.
  19. Maximum flow 
    The maximum flow problem is finding a legal flow through a flow network that is maximal. Sometimes it is defined as finding the value of such a flow. The maximum flow problem can be seen as special case of more complex network flow problems. The maximal flow is related to the cuts in a network by the Max-flow min-cut theorem. The Ford-Fulkerson algorithm computes the maximum flow in a flow network.
  20. Merge sort 
    A sorting algorithm for rearranging lists (or any other data structure that can only be accessed sequentially, e.g. file streams) into a specified order.
  21. Newton's method 
    Efficient algorithm for finding approximations to the zeros (or roots) of a real-valued function. Newton's method is also a well-known algorithm for finding roots of equations in one or more dimensions. It can also be used to find local maxima and local minima of functions.
  22. Q-learning 
    Q-learning is a reinforcement learning technique that works by learning an action-value function that gives the expected utility of taking a given action in a given state and following a fixed policy thereafter. A strength with Q-learning is that it is able to compare the expected utility of the available actions without requiring a model of the environment.
  23. Quadratic sieve 
    The quadratic sieve algorithm (QS) is a modern integer factorization algorithm and, in practice, the second fastest method known (after the number field sieve, NFS). It is still the fastest for integers under 110 decimal digits or so, and is considerably simpler than the number field sieve.
  24. RANSAC 
    RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an algorithm to estimate parameters of a mathematical model from a set of observed data which contains "outliers". A basic assumption is that the data consists of "inliers", i. e., data points which can be explained by some set of model parameters, and "outliers" which are data points that do not fit the model.
  25. RSA 
    Algorithm for public-key encryption. It was the first algorithm known to be suitable for signing as well as encryption. RSA is still widely used in electronic commerce protocols, and is believed to be secure given sufficiently long keys.
  26. Schönhage-Strassen algorithm 
    In mathematics, the Schönhage-Strassen algorithm is an asymptotically fast method for multiplication of large integer numbers. The run-time is O(N log(N) log(log(N))). The algorithm uses Fast Fourier Transforms in rings.
  27. Simplex algorithm 
    In mathematical optimization theory, the simplex algorithm a popular technique for numerical solution of the linear programming problem. A linear programming problem consists of a collection of linear inequalities on a number of real variables and a fixed linear functional which is to be maximized (or minimized).
  28. Singular value decomposition (SVD) 
    In linear algebra, SVD is an important factorization of a rectangular real or complex matrix, with several applications in signal processing and statistics, e.g., computing the pseudoinverse of a matrix (to solve the least squares problem), solving overdetermined linear systems, matrix approximation, numerical weather prediction.
  29. Solving a system of linear equations 
    Systems of linear equations belong to the oldest problems in mathematics and they have many applications, such as in digital signal processing, estimation, forecasting and generally in linear programming and in the approximation of non-linear problems in numerical analysis. An efficient way to solve systems of linear equations is given by the Gauss-Jordan elimination or by the Cholesky decomposition.
  30. Strukturtensor 
    In pattern recognition: Computes a measure for every pixel which tells you if this pixel is located in a homogenous region, if it belongs to an edge, or if it is a vertex.
  31. Union-find 
    Given a set of elements, it is often useful to partition them into a number of separate, nonoverlapping groups. A disjoint-set data structure is a data structure that keeps track of such a partitioning. A union-find algorithm is an algorithm that performs two useful operations on such a data structure: 
    Find: Determine which group a particular element is in. 
    Union: Combine or merge two groups into a single group.
  32. Viterbi algorithm 
    Dynamic programming algorithm for finding the most likely sequence of hidden states - known as the Viterbi path - that result in a sequence of observed events, especially in the context of hidden Markov models.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
精简下面表达:Existing protein function prediction methods integrate PPI networks and multivariate bioinformatics data to improve the performance of function prediction. By combining multivariate information, the interactions between proteins become diverse. Different interactions’ functions in functional prediction are various. Combining multiple interactions simply between two proteins can effectively reduce the effect of false negatives and increase the number of predicted functions, but it can also increase the number of false positive functions, which contribute to nonobvious enhancement for the overall functional prediction performance. In this article, we have presented a framework for protein function prediction algorithms based on PPI network and semantic similarity with the addition of protein hierarchical functions to them. The framework relies on diverse clustering algorithms and the calculation of protein semantic similarity for protein function prediction. Classification and similarity calculations for protein pairs clustered by the functional feature are more accurate and reliable, allowing for the prediction of protein function at different functional levels from different proteomes, and giving biological applications greater flexibility.The method proposed in this paper performs well on protein data from wine yeast cells, but how well it matches other data remains to be verified. Yet until now, most unknown proteins have only been able to predict protein function by calculating similarities to their homologues. The predictions result of those unknown proteins without homologues are unstable because they are relatively isolated in the protein interaction network. It is difficult to find one protein with high similarity. In the framework proposed in this article, the number of features selected after clustering and the number of protein features selected for each functional layer has a significant impact on the accuracy of subsequent functional predictions. Therefore, when making feature selection, it is necessary to select as many functional features as possible that are important for the whole interaction network. When an incorrect feature was selected, the prediction results will be somewhat different from the actual function. Thus as a whole, the method proposed in this article has improved the accuracy of protein function prediction based on the PPI network method to a certain extent and reduces the probability of false positive prediction results.
02-27

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值