
Duplicate Code (Code Clones): Key Concepts and Types


Reposted from: http://blog.csdn.net/lovelion/article/details/9326307


The content of this article is drawn from the following two references:

       [1] Chanchal K. Roy, James R. Cordy, Rainer Koschke. Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach. Science of Computer Programming, 2009, 74(7): 470-495.

       [2] Hamid Abdul Basit, Stan Jarzabek. A Data Mining Approach for Detecting Higher-level Clones in Software. IEEE Transactions on Software Engineering, 2009, 35(4): 497-513.

 

Code Fragment: A code fragment (CF) is any sequence of code lines, with or without comments. It can be of any granularity, e.g., a function definition, a begin-end block, or a sequence of statements. A CF is identified by its file name and its begin and end line numbers in the original code base, and is denoted as a triple (CF.FileName, CF.BeginLine, CF.EndLine).
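
The triple notation maps naturally onto a small record type. The following is a minimal Python sketch, not something taken from the cited papers; the class name `CodeFragment`, its methods, and the choice of 1-based inclusive line numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CodeFragment:
    """A code fragment identified by (FileName, BeginLine, EndLine)."""
    file_name: str
    begin_line: int  # assumption: 1-based, inclusive
    end_line: int    # assumption: 1-based, inclusive

    def as_triple(self) -> Tuple[str, int, int]:
        """Return the (CF.FileName, CF.BeginLine, CF.EndLine) triple."""
        return (self.file_name, self.begin_line, self.end_line)

    def extract(self, file_lines: List[str]) -> List[str]:
        """Return this fragment's lines from the file's full line list."""
        return file_lines[self.begin_line - 1:self.end_line]


# A hypothetical five-line file, already split into lines.
source = [
    "def add(a, b):",
    "    # sum two values",
    "    return a + b",
    "",
    "print(add(1, 2))",
]
cf = CodeFragment("example.py", 1, 3)
print(cf.as_triple())
print(cf.extract(source))
```

Making the record immutable (`frozen=True`) lets fragments be used as dictionary keys or set members, which is convenient when grouping them later.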

 

Code Clone: A code fragment CF2 is a clone of another code fragment CF1 if they are similar by some given definition of similarity, that is, f(CF1) = f(CF2), where f is the similarity function. Two fragments that are similar to each other form a clone pair (CF1, CF2); when many fragments are similar, they form a clone class or clone group.
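
To make the roles of f, clone pairs, and clone classes concrete, here is a small Python sketch. The particular `f` used (strip each line and drop blank lines) is only an assumed stand-in for "some given definition of similarity"; the function names and the sample fragments are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def f(fragment_text: str) -> str:
    """Assumed similarity function: strip each line and drop blank lines."""
    lines = (line.strip() for line in fragment_text.splitlines())
    return "\n".join(line for line in lines if line)


def clone_classes(fragments: dict) -> list:
    """Group fragment ids whose f-values are equal; keep groups of size >= 2."""
    groups = defaultdict(list)
    for frag_id, text in fragments.items():
        groups[f(text)].append(frag_id)
    return [ids for ids in groups.values() if len(ids) >= 2]


def clone_pairs(classes: list) -> list:
    """Every unordered pair inside a clone class is a clone pair."""
    return [pair for ids in classes for pair in combinations(ids, 2)]


# Hypothetical fragments, keyed by an id for readability.
fragments = {
    "CF1": "x = 1\ny = x + 2\n",
    "CF2": "  x = 1\n\n  y = x + 2\n",
    "CF3": "x = 1\ny = x + 2",
    "CF4": "z = 0\n",
}
classes = clone_classes(fragments)
pairs = clone_pairs(classes)
print(classes)
print(pairs)
```

Because membership is decided by equality of f-values, a clone class here is simply an equivalence class under f, and each class of n fragments yields n(n-1)/2 clone pairs.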

 

Clone Type: There are two main kinds of similarity between code fragments. Fragments can be similar based on the similarity of their program text, or they can be similar based on their functionality (independent of their text). The first kind of clone is often the result of copying a code fragment and pasting it into another location. The following clone types are based on textual similarity (Types 1 to 3) and functional similarity (Type 4):

Type-1: Identical code fragments except for variations in whitespace, layout and comments.

Type-2: Syntactically identical fragments except for variations in identifiers, literals, types, whitespace, layout and comments.

Type-3: Copied fragments with further modifications such as changed, added or removed statements, in addition to variations in identifiers, literals, types, whitespace, layout and comments.

Type-4: Two or more code fragments that perform the same computation but are implemented by different syntactic variants.
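
The four types can be illustrated with one small function and four variants of it. The sketch below is an assumption-laden illustration rather than the method of any tool surveyed in [1]: it compares Python token streams for Type-1, replaces identifiers and literals with placeholders for Type-2, and uses a `difflib` ratio with an arbitrary 0.7 threshold as a rough Type-3 check. All names (`tokens`, `is_type1`, `is_type2`, `similarity`) are hypothetical. Python has no type annotations in these samples, so the "types" variation of Type-2 is not exercised.

```python
import difflib
import io
import keyword
import token
import tokenize

SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
LAYOUT = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def tokens(code: str, abstract: bool = False) -> list:
    """Token list with comments and exact whitespace ignored.

    With abstract=True, identifiers become "ID" and literals "LIT".
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type in SKIP:
            continue
        if tok.type in LAYOUT:
            out.append(token.tok_name[tok.type])  # keep structure, not spacing
        elif abstract and tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        elif abstract and tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        else:
            out.append(tok.string)
    return out


def is_type1(a: str, b: str) -> bool:
    return tokens(a) == tokens(b)


def is_type2(a: str, b: str) -> bool:
    return tokens(a, abstract=True) == tokens(b, abstract=True)


def similarity(a: str, b: str) -> float:
    """Similarity of abstracted token streams, used as a rough Type-3 signal."""
    return difflib.SequenceMatcher(
        None, tokens(a, abstract=True), tokens(b, abstract=True)
    ).ratio()


ORIGINAL = (
    "def total(items):\n"
    "    result = 0\n"
    "    for item in items:\n"
    "        result += item\n"
    "    return result\n"
)
TYPE1 = (  # only comments, blank lines and spacing differ
    "def total(items):\n"
    "    # accumulate\n"
    "    result = 0\n"
    "\n"
    "    for item in items:\n"
    "        result   +=   item   # add\n"
    "    return result\n"
)
TYPE2 = (  # identifiers and a literal renamed
    "def sum_all(values):\n"
    "    acc = 0.0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)
TYPE3 = (  # a statement added on top of the renaming
    "def sum_positive(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        if v > 0:\n"
    "            acc += v\n"
    "    return acc\n"
)
TYPE4 = (  # same computation as ORIGINAL, different syntax
    "def total(items):\n"
    "    return sum(items)\n"
)

print(is_type1(ORIGINAL, TYPE1))
print(is_type2(ORIGINAL, TYPE2))
print(similarity(ORIGINAL, TYPE3) >= 0.7)  # 0.7 is an arbitrary threshold
print(is_type2(ORIGINAL, TYPE4))
```

The last check shows why Type-4 clones are the hardest to find: the two `total` functions compute the same result, yet no purely textual or token-level comparison relates them.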

 

       The four clone types above are collectively called simple clones. Higher-level, coarse-grained clones formed by combining simple clones are called structural clones.
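
One way to picture a structural clone is as a group of simple clone classes that recur together in several files. The sketch below only intersects the clone-class sets of file pairs; it is a deliberately naive stand-in and not the data mining approach of [2]. The function name, the `min_shared` parameter, the file names, and the class ids are all hypothetical.

```python
from itertools import combinations


def structural_clone_candidates(file_classes: dict, min_shared: int = 2) -> list:
    """Return file pairs that share at least min_shared simple clone classes."""
    found = []
    for (file_a, a), (file_b, b) in combinations(sorted(file_classes.items()), 2):
        shared = a & b
        if len(shared) >= min_shared:
            found.append((file_a, file_b, sorted(shared)))
    return found


# Hypothetical input: for each file, the ids of the simple clone classes
# that have at least one fragment in that file.
file_classes = {
    "OrderDao.java": {"C1", "C2", "C3"},
    "UserDao.java": {"C1", "C2", "C3", "C7"},
    "Report.java": {"C3", "C9"},
}
candidates = structural_clone_candidates(file_classes)
print(candidates)
```

Here the two DAO files share three simple clone classes and are reported as a file-level candidate, while `Report.java` shares only one class with each and is not.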

 

[Author: 刘伟 (Liu Wei), http://blog.csdn.net/lovelion]

