My Graduation Thesis, Chapter 4 (Translation)

Latest recommended article published on 2021-09-13 16:27:20

rain_bow_iris

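As I was about to graduate, I fell into a bitter dispute over a paper with the PhD student who supervised me. In the end, because he did not get my research results, he refused to guide me in publishing further papers, and he said that he had no improper intentions regarding this paper. In fact, however, he had already pressured me into translating Chapter 4 of my graduation thesis (the part of my research that has not yet been published as a paper) into English for him. To prevent him from using this English version for publication (the university requires that research results in a graduation thesis be published through the university; otherwise it counts as academic misconduct), I am uploading my completed thesis translation here as evidence. This is my personal research work and may not be reproduced or used without my permission.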

Chapter 4 Packet Classification Algorithm Based on Prefix Length Clustering in OpenFlow Networks

Networks have gradually become part of the critical infrastructure of modern society. This makes the work of network researchers more important, but at the same time, as network technology develops, the chance that the new technologies they create will be adopted grows slimmer. Existing network equipment and protocols raise a very high barrier to new technologies and new ideas, and it is generally believed that the current network infrastructure is already in a "rigid" state. To address this problem, the network community began developing programmable networks, and the OpenFlow protocol came into being. To improve flexibility and scalability, OpenFlow defines 12 tuple fields for matching against the flow table and continues to add fields in subsequent versions. Matching 12 different fields to find the highest-priority flow table entry is a highly complex task. Traditional 5-tuple packet classification algorithms consider only five fields and cannot handle 12-tuple classification. Existing packet classification algorithms for OpenFlow are adaptations of 5-tuple algorithms, which limits their performance to a certain extent.

This chapter targets the characteristics of OpenFlow flow classification and designs a packet classification algorithm based on prefix length clustering: the rule set is clustered according to the prefix length of the source address field in order to eliminate the backtracking problem of the hierarchical Trie. After clustering, the algorithm selectively merges certain clusters to prevent any single tree from being too large or too small and to achieve more balanced lookup performance. To match the 12 tuples in OpenFlow, we design a hierarchical structure that handles the matching fields layer by layer. We compare the proposed scheme with an advanced packet classification algorithm, the multi-bit compressed Trie. Experiments show that the proposed algorithm improves lookup performance by 23.16% to 46.45% over the multi-bit compressed Trie and improves memory performance by 13.20% to 28.33%.

4.1 The OpenFlow Research Status

Packet classification remains an active research topic as network speeds increase, because it is a basic supporting function of the network, and it now faces new challenges. As network technology advances, new technologies and protocol standards keep emerging, so packet classification must be improved continuously if next-generation routers are to support a variety of network functions. In traditional network applications, packet classification usually considers a fixed 5-tuple: 32-bit source and destination IP addresses, 16-bit source and destination port numbers, and an 8-bit transport layer protocol. Modern network technologies are gradually moving multidimensional packet classification from this fixed 5-tuple toward flexible matching on a large number of packet header fields.

Network virtualization has emerged as an important feature of next-generation enterprise data centers and cloud computing networks. Virtualization technology abstracts physical resources as logical resources and thus reduces the constraints imposed by physical resources. A programmable virtualized network lowers the barrier to entry for new ideas and speeds up changes to the network infrastructure. The standardization and flexibility brought by software-defined networking (SDN) and OpenFlow open up many possibilities for the development of network virtualization.

OpenFlow originated from the Clean Slate project at Stanford University. The aim of the project is to let researchers test new ideas and technologies on networks that are in everyday use, so as to improve the existing network infrastructure, change the designs that impede network development, and make the network architecture less rigid and more flexible. OpenFlow is added as a feature to commercial Ethernet switches, routers and wireless access points, and provides standardized hooks that allow researchers to run experiments without requiring vendors to expose the inner workings of their network equipment. OpenFlow provides an open protocol for programming the flow tables in different switches and routers. Network administrators can control flows by choosing the routes their packets follow and the processing they receive. In this way, researchers can try out different routing protocols, security models, addressing schemes, or even alternatives to IP. Within the same network, experimental traffic is separated from production traffic to ensure that the experiments do not affect the operation of the real network.

Nick McKeown et al. published the article "OpenFlow: Enabling Innovation in Campus Networks" in ACM SIGCOMM Computer Communication Review in 2008, which described the concepts behind OpenFlow in detail for the first time. OpenFlow was first proposed mainly to simplify complex network configuration work by centralizing network control in software, and to separate experimental traffic from production traffic so that new ideas and technologies could be tested in a real network environment. Since then, OpenFlow has gone far beyond the scenarios described in that paper.

The core idea of OpenFlow is to turn packet forwarding, which used to be controlled entirely by the switching equipment, into a process carried out separately by OpenFlow switches and controllers, as shown in Figure 4.1. In traditional routers and switches, fast packet forwarding (the data path) and high-level routing decisions (the control path) reside on the same device. An OpenFlow switch separates these two functions: the data path remains on the switch, while routing decisions are moved to a separate controller, which is typically a standard server. OpenFlow switches and controllers communicate through the OpenFlow protocol, which defines messages for receiving and sending packets, modifying the forwarding table, collecting statistics, and so on. The data path consists of a flow table and an action associated with each flow table entry. The set of actions supported by an OpenFlow switch is extensible, but to achieve high performance and low cost the data path can offer only a limited degree of flexibility. OpenFlow therefore gives up the ability to process every packet in arbitrary ways and instead seeks a more limited but still useful range of actions, which is why it defines a specific set of actions.

Figure 4.1 OpenFlow network

Like a traditional forwarding device, an OpenFlow switch maintains a flow table, but the table is handled differently. A traditional device discards a packet by default when no matching rule is found, whereas an OpenFlow switch sends the packet to the controller and hands control over to it to determine the transmission path. The data path of an OpenFlow switch provides the abstraction of a flow table; each entry contains a set of packet header fields to match and an action (such as sending the packet to a port, modifying a field, or discarding the packet). When an OpenFlow switch receives a packet it has not seen before and the packet matches none of the rules in the flow table, it sends the packet to the controller. The controller then decides how to process the packet: it can discard the packet, or it can add a flow entry that records how similar packets should be processed in the future.
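The miss handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `FlowEntry`, `Controller`, `Switch` and `packet_in`, the `ip_dst` field and the routing policy are all hypothetical, and the controller here always installs an entry (a drop entry for unknown destinations) rather than discarding a single packet.

```python
from dataclasses import dataclass, field

@dataclass
class FlowEntry:
    match: dict   # header field -> required value; absent fields act as wildcards
    action: str   # e.g. "output:2" or "drop"

@dataclass
class Controller:
    """Hypothetical policy: forward known destinations, drop everything else."""
    routes: dict  # destination address -> output port

    def packet_in(self, packet: dict) -> FlowEntry:
        port = self.routes.get(packet["ip_dst"])
        action = f"output:{port}" if port is not None else "drop"
        # The returned entry tells the switch how to treat similar packets later.
        return FlowEntry(match={"ip_dst": packet["ip_dst"]}, action=action)

@dataclass
class Switch:
    controller: Controller
    flow_table: list = field(default_factory=list)
    packet_ins: int = 0   # how many packets had to be sent to the controller

    def lookup(self, packet: dict):
        for entry in self.flow_table:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return entry
        return None

    def receive(self, packet: dict) -> str:
        entry = self.lookup(packet)
        if entry is None:
            # Table miss: hand the packet to the controller instead of dropping it.
            self.packet_ins += 1
            entry = self.controller.packet_in(packet)
            self.flow_table.append(entry)
        return entry.action

switch = Switch(Controller(routes={"10.0.0.2": 2}))
first = switch.receive({"ip_dst": "10.0.0.2"})   # miss, controller is consulted
second = switch.receive({"ip_dst": "10.0.0.2"})  # hit, handled by the switch alone
```

After the first packet of a flow, the switch answers from its own table, which is the point of installing an entry rather than consulting the controller for every packet.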

4.2 The OpenFlow Network Characteristics

An OpenFlow network consists of three parts: OpenFlow switches, FlowVisor and the Controller, as shown in Figure 4.2. The OpenFlow switches forward packets in the data layer, FlowVisor virtualizes the network, and the Controller implements the control layer functions and controls the network centrally.

Similar to computer virtualization, FlowVisor is a virtualization layer located between the hardware and the software. Its main role is to allow multiple controllers to control an OpenFlow switch at the same time, with each controller restricted to its own virtual slice of the switch. The test platform created by FlowVisor therefore allows several network experiments to run simultaneously in different virtual networks without affecting the normal operation of production traffic. FlowVisor is compatible with common commercial switches, so it does not require additional hardware such as FPGAs or network processors.

Figure 4.2 Structure of the OpenFlow network

The most distinctive feature of OpenFlow is the separation of the data processing layer from the control layer. The OpenFlow switch forwards packets in the data layer, and the Controller implements the control layer functions, such as adding and deleting flow table entries and controlling user permissions in the network; the Controller can also respond to packet requests actively or passively. The Controller manipulates the flow tables in the OpenFlow switches through the OpenFlow protocol and thereby controls the whole network centrally. The Controller functions are mainly implemented by running NOX, the network operating system of the OpenFlow network. In addition, for reasons of performance and to avoid a single point of failure, the OpenFlow protocol also allows multiple controllers to configure and manage OpenFlow switches simultaneously.

The role of the OpenFlow switch is to process packets according to the flow table, for example by forwarding a packet, discarding it, modifying a header field, or forwarding it to the controller. An OpenFlow switch consists of three parts: the flow table, a secure channel and the OpenFlow protocol, as shown in Figure 4.3. A switch can communicate with a single controller or with multiple controllers; the latter guards against an interrupted connection between the switch and a single controller and increases reliability. When a connection problem occurs, the switch attempts to connect to a standby controller. After repeated failed attempts, the switch enters emergency mode and resets all connections. From this point on, all packets are matched against the emergency-mode entries, and all other normal entries of the switch are deleted. By default, the switch is in emergency mode when it has just booted.

The secure channel connects OpenFlow switches and controllers. The controller controls the switch through the secure channel, accepts event-handling requests from the switch, and sends packets back to the switch after processing. All messages carried over the secure channel between switches and controllers must follow the format specified by the OpenFlow protocol. The secure channel uses TLS-encrypted connections. When the switch starts, it connects to the controller's TCP port (6633 by default), and the two sides exchange certificates for authentication. Each switch requires at least two certificates: one for authenticating the controller and one for authenticating itself to the controller.

Figure 4.3 Structure of the OpenFlow switch

The OpenFlow protocol defines the messages and the interface used for communication between switches and the controller. The core of the protocol is its set of message structures. OpenFlow protocol messages fall into three main types: Controller-to-Switch, Asynchronous and Symmetric, each of which has multiple subtypes. Controller-to-Switch messages are initiated by the Controller and are used to check the status of the OpenFlow switch. Asynchronous messages are initiated by OpenFlow switches and are used to report events and changes in the current state. Symmetric messages are initiated by either the Controller or the OpenFlow switch without a prior request, for example to establish a connection.

After the controller and the switch establish the secure channel connection, the OpenFlow connection is set up. Both sides send each other an ofpt_hello message carrying the highest protocol version that the sender supports, and the lower of the two version numbers is used for communication. If both sides support that version, the connection is established; otherwise, an ofpt_error message describing the reason for the failure is sent and the connection is terminated.
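The negotiation rule above amounts to taking the minimum of the two advertised versions and checking that both sides support it. A minimal sketch, assuming each side is modelled as a set of supported wire version numbers (0x01 for OpenFlow 1.0 up to 0x04 for 1.3); the function name `negotiate_version` is hypothetical.

```python
def negotiate_version(switch_versions, controller_versions):
    """Return the agreed protocol version, or None if the handshake fails."""
    # Each hello message carries only the sender's highest supported version.
    candidate = min(max(switch_versions), max(controller_versions))
    # The connection is established only if both sides support the candidate.
    if candidate in switch_versions and candidate in controller_versions:
        return candidate
    # Otherwise the side that cannot proceed would answer with an error message.
    return None

agreed = negotiate_version({0x01, 0x02, 0x03, 0x04}, {0x01, 0x02})
failed = negotiate_version({0x01, 0x04}, {0x01, 0x03})
```

In the second call the candidate is 0x03, which the switch does not implement, so the handshake fails even though both sides share version 0x01.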

The OpenFlow flow table differs from the rule table of traditional packet classification. First, more than five header fields need to be matched in OpenFlow: the initial version of the protocol supports matching on ten tuples, and the number of supported match fields will continue to increase as OpenFlow develops. Second, OpenFlow supports only a specific set of rule actions, and the flow table contains counters for a variety of statistics.

As shown in Figure 4.3, an OpenFlow flow table entry contains five parts: the header fields used for matching, priority, counters, timeouts and the corresponding actions. Through the OpenFlow protocol, a controller outside the OpenFlow switch can manage the flow table. Because the flow table supports remote access control, its configuration and management can be taken out of the network equipment itself, which makes centralized control and management of the flow tables of the entire network possible and thereby separates the physical network in the traditional sense from the logical definition of the network.

The number of fields in the OpenFlow flow table has grown as OpenFlow has developed. The originally released version 0.8.9 supports only ten tuples; version 0.9 adds a VLAN field; version 1.0 then adds the ToS (Type of Service) field; version 1.1 adds the metadata and MPLS fields; and versions 1.2 and 1.3 provide 13 fields together with other optional fields. As a result, the match portion of the flow table keeps growing as OpenFlow supports more features, which makes rule lookup more complex: the traditional rule table matches only five fields totalling 104 bits, whereas the 12 fields of version 1.0 total 250 bits, and subsequent versions have even more. Here we focus on the 12 fields of version 1.0; details of these fields are given in Table 4.1. Matching a large number of fields consumes a great deal of time and space, so version 1.1 introduced multiple flow tables organized as a multi-stage pipeline; multi-level flow table lookup reduces the flow table space that has to be searched, as shown in Figure 4.4.

Table 4.1 OpenFlow Flow Table Fields

Field	Bits	Applies to	Notes
Ingress port		All packets	Port number, starting at 1
Ethernet source address	48	All packets on enabled ports
Ethernet destination address	48	All packets on enabled ports
Ethernet type	16	All packets on enabled ports
VLAN ID	12	All packets of Ethernet type 0x8100
VLAN priority	3	All packets of Ethernet type 0x8100	VLAN PCP field
IP source address	32	All IP and ARP packets	Can be masked
IP destination address	32	All IP and ARP packets	Can be masked
IP protocol	8	All IP and ARP packets	For ARP, only the lower 8 bits are used
IP ToS bits	6	All IP packets	Specified as an 8-bit value
Transport source port / ICMP type	16	All TCP, UDP and ICMP packets	Only the lower 8 bits are used for ICMP
Transport destination port / ICMP code	16	All TCP, UDP and ICMP packets	Only the lower 8 bits are used for ICMP

Figure 4.4 OpenFlow multi-level flow table query

When an OpenFlow switch receives a packet, it first parses the packet header and extracts the match fields used for the lookup. The extracted fields comprise the match field types and their values. The switch then uses the extracted fields to look up the first flow table; because a pipelined scheme is used, the subsequent flow tables are looked up one after another. Among the matching entries, the switch selects the flow table entry with the highest priority, updates the counters of that entry and executes the corresponding action set. If no match is found in the last flow table, the packet matches the default table-miss entry, which records how to handle packets that matched nothing else; the table-miss entry usually has the lowest priority.
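The selection of the highest-priority matching entry, the counter update and the table-miss fallback can be illustrated for a single table as follows. This is a simplified sketch, not the switch implementation: `Entry`, `process`, the field names and the action strings are hypothetical, a table-miss entry is modelled as an entry with priority 0 and an empty match, and the multi-table pipeline is left out.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    priority: int
    match: dict        # field -> value; fields not listed are wildcards
    actions: list
    packets: int = 0   # per-entry packet counter

def process(table, packet):
    """Apply the highest-priority matching entry of one flow table."""
    hits = [e for e in table
            if all(packet.get(k) == v for k, v in e.match.items())]
    if not hits:
        return []                        # no table-miss entry installed: drop
    best = max(hits, key=lambda e: e.priority)
    best.packets += 1                    # update the counter of the chosen entry
    return best.actions

table = [
    Entry(priority=10, match={"ip_proto": 6}, actions=["output:1"]),
    Entry(priority=20, match={"ip_proto": 6, "tp_dst": 80}, actions=["output:2"]),
    Entry(priority=0, match={}, actions=["controller"]),   # table-miss entry
]
web = process(table, {"ip_proto": 6, "tp_dst": 80})   # both TCP entries match
other = process(table, {"ip_proto": 17})              # only table-miss matches
```

The web packet matches three entries, and the priority decides between them; the UDP packet falls through to the table-miss entry.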

Counters are maintained for each flow table, port, flow entry, group bucket, queue, group, meter and meter band, and are used for traffic statistics. The priority indicates the matching precedence of flow entries. The timeout is the maximum lifetime or idle time after which a flow entry expires.

Each OpenFlow flow table entry contains a set of actions that is executed when a packet matches the entry. The action field mainly covers three types of action: (1) forwarding the packet to a particular port, which is generally a physical port but may also be a logical port or a reserved port of the switch; reserved ports specify generic forwarding actions such as sending to the controller, flooding, or forwarding by non-OpenFlow methods (normal mode), while logical ports can represent link aggregation groups, tunnels or loopback interfaces; (2) encapsulating the packet and forwarding it to the controller; (3) discarding the packet.

An OpenFlow switch supports three kinds of ports: physical ports, logical ports and reserved ports. A physical port corresponds to a hardware interface of the switch. In some deployments the OpenFlow switch is virtualized over the switch hardware, in which case a physical port may represent a virtual slice of the corresponding hardware interface. A logical port can usually be mapped to different physical ports. The only difference from a physical port is that a packet associated with a logical port may carry additional data, and when a packet received on a logical port is forwarded to the Controller, both the logical port and the underlying physical port are reported to the Controller.

[59] proposes a solution for OpenFlow packet classification. The authors call OpenFlow-based packet classification the next-generation packet classification problem. Building on 5-tuple packet classification, they propose a decision-tree-based classification scheme and use an FPGA to accelerate packet classification. For OpenFlow 12-tuple packet classification, they propose a decision-tree-based multi-pipeline architecture. The architecture uses a framework called decision forest, which partitions a given 12-tuple rule set into multiple subsets, each of which builds a depth-bounded tree over a small number of header fields. During construction, two optimization techniques, rule overlap reduction and precise range cutting, minimize rule duplication, so that the memory requirement grows linearly with the number of rules. Each tree is mapped onto a pipeline; a fine-grained node-to-stage mapping mechanism bounds the memory size of each stage, so that memory utilization of the architecture is maximized, and the scheme also allows external SRAM to be used to handle larger rule sets. Finally, the authors use the high-speed dual-port block RAMs provided by the FPGA to process two packets per clock cycle (PPC) and thus achieve high throughput. The linear pipeline structure also makes on-the-fly rule updates without service interruption possible. All routing paths are localized to avoid long routing delays and thereby achieve a higher clock frequency.

[60] modifies and extends the EffiCuts algorithm [61] to meet the requirements of OpenFlow 12-tuple packet classification, improving the memory footprint and the number of memory accesses needed to classify each packet. EffiCuts uses separable trees: the rules are partitioned into categories according to their size in each field. For 5-tuple classification this yields 32 categories, that is, 32 trees, but for a 12-tuple the number of categories reaches 4096, which requires a very large number of memory accesses. The authors therefore use HiCuts-inspired cutting to reduce the negative impact of the excessive number of cuts produced by EffiCuts. However, this leads to a substantial increase in the number of nodes and thus in memory usage. The authors therefore suggest resizing the leaf nodes to limit the number of memory accesses at the cost of a small amount of additional memory. The paper also uses an adaptive compression factor to limit the number of categories created during the initial sorting process, again in order to limit the number of memory accesses required.

4.3 OpenFlow packet classification algorithm based on prefix length clustering

This section describes in detail the packet classification scheme based on prefix length clustering and the heuristic for selective merging, then describes the lookup scheme of the algorithm, and finally proposes a lookup framework for OpenFlow.

4.3.1 Prefix length clustering method

Although OpenFlow flow tables differ in many ways from traditional 5-tuple packet classification rule tables, there are still similarities. The rule tables use three types of matching: prefix matching, range matching and exact matching, as shown in Table 4.2.

As can be seen from Table 4.2, the OpenFlow 12-tuple flow table is more complicated than a conventional 5-tuple classification table: there are more fields, each field has its own characteristics, and different fields use different matching types. For such a complex packet classification system, the clustering approach based on the absence of prefix relationships is no longer suitable. Because the fields all differ, the number of distinct prefixes is enormous, and judging prefix relationships across all these fields would produce an unmanageable number of clusters, which is impractical. The design idea here is to find a common method that treats these fields uniformly, which both breaks the whole problem into parts and facilitates the design of a pipelined architecture.

Table 4.2 Sample OpenFlow flow table

Columns: Rule number, Ingress port, Ethernet source address, Ethernet destination address, Ethernet type, VLAN ID, VLAN priority, IP source address, IP destination address, Source port, Destination port, IP ToS, Protocol

Each row lists the values that a rule specifies, in column order; unspecified fields are omitted.

Rule	Specified values
R0	00:13, 00:06, 100, 000*, TCP
R1	00:07, 00:10, 000101*, UDP
R2	00:FF, 0001111*, [44,56], [78,91], TCP
R3	00:1F, 0x8100, 0010*, 00*, [98,100], UDP
R4	0x0800, 100, 00011*, 00*, [78,98], TCP
R5	0x0800, 0110*, 10*, TCP
R6	0x8100, 01111*, TCP
R7	4095, UDP
R8	00:FF, 00:00, 4095, 1001*, 1011*, [56,78], TCP
R9	00:1F, 00:2A, 0x0800, 11011*, 001*, [90,100], UDP
R10	4095, 110101*, [1,100], UDP
R11	11:A2, 111101*, 110*, TCP
R12	0x8100, 100, 1111111*, 101*, TCP

Most of the 12 fields use prefix matching, and range matching can also be converted to prefix matching, so this chapter analyzes only prefix matching. The IP source/destination addresses and the Ethernet source/destination addresses all use prefix matching, but their lengths differ: an Ethernet address has 48 bits and an IP address has 32 bits. To handle these prefixes in a uniform way, we cluster them according to prefix length, dividing the prefixes into classes of corresponding length and then processing each class separately, as shown in Figure 4.5. The procedure is given in Algorithm 4.1.
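The conversion of a range into prefixes mentioned above is not spelled out in the text. One common way to do it, shown here only as an illustrative sketch, is to repeatedly split off the largest power-of-two block that is aligned at the lower bound and still fits inside the range. The function name `range_to_prefixes` is hypothetical, and the example uses an 8-bit width to keep the output short, whereas the port fields are 16 bits wide.

```python
def range_to_prefixes(lo, hi, width):
    """Cover the integer range [lo, hi] of a width-bit field with binary prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        fixed = width - (size.bit_length() - 1)   # number of specified bits
        head = format(lo >> (width - fixed), f"0{fixed}b") if fixed else ""
        prefixes.append(head + "*")
        lo += size
    return prefixes

# The source port range [44,56] of rule R2, on an 8-bit field for readability
print(range_to_prefixes(44, 56, 8))   # ['001011*', '00110*', '00111000*']
```

The three prefixes cover 44 to 47, 48 to 55 and 56 respectively, so a single range rule becomes three prefix rules that can then be clustered by length like any other prefix.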

As shown in Figure 4.5, the source IP address field of each flow table entry is first extracted, and the prefixes are divided into classes according to their prefix length, where Ci denotes the cluster of prefixes of length i. The source IP address field of R0 is 000*, whose prefix length is 3, so it is placed in cluster C3; the prefix 000101* of R1 has length 6 and is placed in cluster C6; proceeding in this way yields six clusters. Considering the longest of the source/destination IP and Ethernet source/destination addresses, there are at most 48 clusters. The enormous number of possible cases is thus reduced to at most 48, which greatly reduces the complexity of classification and makes the OpenFlow classification scheme easier to realize.

Algorithm 4.1 Prefix length clustering algorithm

Initialization: put all prefixes of the SA field into the set S; set the set of length clusters SL to empty

Loop:
    if S is empty then
        return
    end if
    remove a prefix P from S and let L be the length of P
    if SL contains no cluster of length L then
        create a new cluster of length L and add it to SL
        add P to the new cluster
    else
        add P to the cluster of length L in SL
    end if
End Loop
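Algorithm 4.1 can be written compactly in Python. The sketch below is an illustration rather than the thesis implementation: the function name `cluster_by_prefix_length` is hypothetical, prefixes are represented as bit strings ending in `*`, and the input is the source IP prefix column of Table 4.2 plus the prefix 1*, which the merging example in Section 4.3.2 mentions and which is assumed here to be the single member of C1.

```python
from collections import defaultdict

def cluster_by_prefix_length(prefixes):
    """Group prefixes such as '000*' into clusters keyed by prefix length."""
    clusters = defaultdict(list)
    for prefix in prefixes:
        length = len(prefix.rstrip("*"))   # number of specified bits
        clusters[length].append(prefix)    # creates cluster C_length on first use
    return dict(sorted(clusters.items()))

# Source IP prefixes of the sample rules; "1*" is an assumption (see above).
source_prefixes = [
    "000*", "000101*", "0001111*", "0010*", "00011*", "0110*", "01111*",
    "1*", "1001*", "11011*", "110101*", "111101*", "1111111*",
]
clusters = cluster_by_prefix_length(source_prefixes)
for length, members in clusters.items():
    print(f"C{length}: {members}")
```

With this input the function produces the six clusters C1, C3, C4, C5, C6 and C7 used in the merging example, with 1, 1, 3, 3, 3 and 2 elements respectively.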

4.3.2 Selective merging scheme

Clustering quality can be assessed either by predictive quality or by descriptive quality. This chapter assesses clustering by descriptive quality, using the widely used partition utility criterion, which is defined as follows:

(4.1)

That is, the partition utility is the average of the category utility of the individual clusters, where the category utility is defined as:

(4.2)

Here, [...] is a cluster, and [...] is the set of attribute values that make up the attribute domain.
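Formulas (4.1) and (4.2) are not legible in this copy. For orientation, a commonly used textbook form of partition utility and category utility that fits the description above is given below. It is an assumption, not necessarily the author's exact notation: $C_1, \dots, C_n$ are the clusters, $A_i$ is the $i$-th attribute and $V_{ij}$ is the $j$-th value in the domain of $A_i$.

```latex
PU(\{C_1, \dots, C_n\}) = \frac{1}{n} \sum_{k=1}^{n} CU(C_k)

CU(C_k) = P(C_k) \sum_{i} \sum_{j}
          \left[ P(A_i = V_{ij} \mid C_k)^2 - P(A_i = V_{ij})^2 \right]
```

Under this definition a cluster scores highly when knowing the cluster makes the attribute values much more predictable than they are in the data set as a whole.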

As the analysis in the previous section shows, the scheme above greatly reduces the number of combinations that need to be considered and achieves the goal of breaking the whole into parts. However, 48 clusters are still too many for a practical implementation, and some clusters may contain too many or too few elements. Analysing an overly large cluster consumes a large amount of time and resources, which upsets the balance of resource allocation and, to a certain extent, wastes resources.

Figure 4.5 Prefix length clustering

To address these problems, we can select some of the smaller clusters and merge them so as to balance the clusters. Before introducing selective merging, we need to introduce two parameters:

ClusMaxSize: this parameter limits the maximum number of elements in a single cluster to prevent any cluster from becoming too large; the user can set its value according to the actual situation;

ClusMixSize: this parameter gives the size threshold for merging: a cluster can be merged only when its number of elements is less than this parameter. For a cluster [...], ClusMixSize = [...]. The purpose of this parameter is to record whether a cluster is saturated. When all the elements corresponding to the prefix length of a cluster are already present in it, the cluster is saturated and merging it is meaningless, so the merging process can skip such a cluster by checking this parameter.

This chapter uses these two parameters to constrain the merging process. Merging gives precedence to clusters with adjacent prefix lengths and continues until no clusters satisfy the merging requirements. Two clusters can be merged only if three conditions are met: (1) the size of each of the two clusters is less than ClusMixSize; (2) the size of the merged cluster does not exceed ClusMaxSize; (3) no prefix relationship exists between the two clusters. The merging process is illustrated in Figure 4.6 and given in Algorithm 4.2.

As shown in Figure 4.6, the six clusters obtained above are merged selectively. Because prefix relationships have to be checked, and the closer the prefix lengths of two clusters are, the less likely a prefix relationship is to exist between them, we merge clusters with adjacent prefix lengths first. In this example ClusMaxSize is 5, that is, a merged cluster cannot contain more than 5 elements. First we try to merge C1 and C3, C4 and C5, and C6 and C7, as in process ①. C1 and C3 each contain 1 element, the merged cluster would contain 2 elements, and no prefix relationship exists between the elements of C1 and C3, so C1 and C3 can be merged. C4 and C5 each contain 3 elements, so the merged cluster would contain 6 elements, which exceeds the maximum allowed, and C4 and C5 cannot be merged. C6 and C7 have no prefix relationship and the merged cluster contains 5 elements, so they can be merged. The cluster obtained by merging Ci and Cj is denoted Ci,j. Process ① therefore turns the original six clusters into four: C1,3, C4, C5 and C6,7. We continue merging the resulting four clusters, as in process ②, trying C1,3 with C4 and C5 with C6,7. The prefix 1* in C1,3 has a prefix relationship with the prefix 11011* in C4, so they cannot be merged. C6,7 already contains 5 elements, which equals the maximum allowed, so it cannot be merged either.

Figure 4.6 Selective merging

When adjacent clusters cannot be merged, we consider other combinations, as in process ③, and traverse all clusters in search of pairs that can be merged. After the traversal, because the number of elements of C6,7 has reached the threshold and a prefix relationship exists between C1,3 and C5, no further mergeable combination is found. In the end, the original six clusters are merged into four.

As can be seen, merging not only reduces the number of clusters but also makes the number of elements per cluster more balanced, so that processing time is balanced across clusters and efficiency improves. Moreover, as the next section shows, fewer clusters lead to better memory performance.

4.3.3 Overall framework of the OpenFlow packet classification algorithm based on prefix length clustering

The analysis above yields four clusters, which we now use to build the packet classification architecture based on prefix length clustering.

Build process

The build process is similar to the one used in Chapter 3. Since none of the four clusters contains a prefix relationship internally, we can use them directly to build the first-layer Tries of the hierarchical Trie by direct insertion, as shown in Figure 4.7.

As can be seen from Figure 4.7, C1,3 contains two elements. The first element is inserted as the root node, and the second element is inserted as the left child of the root according to the first bit of its prefix, so C1,3 builds a Trie with two nodes. Likewise, C4 builds a Trie with three nodes. In the end, the four clusters build four Tries. Because direct insertion is used, none of the Tries has intermediate nodes, which saves a great deal of storage space; and because this amounts to path compression, lookup paths are shorter, which improves time performance to a certain extent.

Algorithm 4.2 Selective merge Algorithm

Initialization: get the length set SL; set the variables CluMaxSize and CluMixSize; set the set SLO to NULL; initialize the length variable LO

Loop:

1: if the size of SLO is the same as LO then

2: return;

3: end if

4: if SL is NULL then

5: set SL to SLO;

6: set LO to the size of SL;

7: end if

8: get set S1 with length L1 and set S2 with a similar length L2 out of SL;

9: if the number of elements in S1 is bigger than CluMixSize or CluMaxSize then

10: continue;

11: end if

12: if the number of elements in S2 is bigger than CluMixSize or CluMaxSize then

13: continue;

14: end if

15: if the result of L1 plus L2 is bigger than CluMaxSize then

16: continue;

17: end if

18: combine the sets S1 and S2 and add the resulting set to SLO;

End Loop
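The selective merge can be sketched in Python as below. This is one possible reading of Algorithm 4.2 and of the merge example, not the thesis code. It assumes that CluMixSize is a lower size threshold above which a cluster is left alone (`min_size`), that a merged cluster may not exceed CluMaxSize (`max_size`), and that two clusters may only be merged when no prefix relationship exists between them. All prefixes in the example are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Cluster = Tuple[Tuple[int, ...], List[str]]   # (prefix lengths covered, prefixes)


def prefix_related(a: List[str], b: List[str]) -> bool:
    """True if some prefix in a is a prefix of one in b, or vice versa."""
    return any(p.startswith(q) or q.startswith(p) for p in a for q in b)


def length_gap(a: Cluster, b: Cluster) -> int:
    return min(abs(x - y) for x in a[0] for y in b[0])


def selective_merge(clusters: List[Cluster], min_size: int, max_size: int) -> List[Cluster]:
    clusters = sorted(clusters)
    changed = True
    while changed:                               # stop once a full pass merges nothing
        changed = False
        # Try pairs with the closest prefix lengths first (adjacent clusters).
        pairs = sorted(combinations(range(len(clusters)), 2),
                       key=lambda ij: length_gap(clusters[ij[0]], clusters[ij[1]]))
        for i, j in pairs:
            (len_a, a), (len_b, b) = clusters[i], clusters[j]
            if len(a) > min_size or len(b) > min_size:
                continue                         # already large enough
            if len(a) + len(b) > max_size:
                continue                         # merged cluster would be too large
            if prefix_related(a, b):
                continue                         # would reintroduce backtracking
            merged = (tuple(sorted(len_a + len_b)), a + b)
            rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
            clusters = sorted(rest + [merged])
            changed = True
            break
    return clusters


# Hypothetical rule-set prefixes grouped by length 1, 3, 4, 5, 6 and 7.
initial = [
    ((1,), ["1"]),
    ((3,), ["000"]),
    ((4,), ["0100", "0110", "0111"]),
    ((5,), ["11010"]),
    ((6,), ["000101", "110101"]),
    ((7,), ["0011001"]),
]
result = selective_merge(initial, min_size=2, max_size=3)
```

With these values the six clusters end up as four: lengths {1, 3}, {4}, {5} and {6, 7}. The length-5 cluster stays alone because its prefix is related to a prefix in both neighbouring candidates.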

As the build process above shows, the number of clusters equals the number of Tries that are finally built. Fewer clusters therefore mean fewer Tries, which optimizes memory performance to some extent and improves search efficiency.

After the first layer of Tries is built, we construct the second layer from the destination IP address field of each rule, again by insertion, as shown in Figure 4.8.

Figure 4.7 The first layer Trie tree construction

For rule R0 (000*, 0*), we go through the Tries, find the one that contains the prefix 000*, and insert the rule directly at its root node. Note that each Trie carries an additional variable that records the prefixes stored in that Trie. Before insertion, the prefix to be inserted is compared with this variable; if the variable does not contain it, the Trie is skipped without being entered, which greatly improves access efficiency. Our experiments show that the memory overhead of this extra variable is negligible. All rules are inserted in this way, which completes the architecture shown in Figure 4.8.

Figure 4.8 Complete packet classification architecture based on prefix length clustering

As the structure shows, the data structure contains no unnecessary intermediate nodes: every node holds data, so memory is used efficiently, and the shorter paths improve search efficiency. The additional variable already simplifies construction to some extent; the search step described next confirms that it also optimizes lookup and further improves time performance.

Searching Process

The search process is similar to that of the same structure in Chapter 3: the Tries are searched in turn for matching rules. However, because each Trie stores the additional variable, we first compare the packet against the prefixes recorded in the variable, and enter a Trie only if it contains a prefix that the packet matches, as shown in Figure 4.9.

Figure 4.9 Search process of packet classification based on prefix length clustering

Figure 4.9 illustrates the search process in detail; the thick line marks the search path. Consider a packet with source IP address 11010111 and destination IP address 01011001. The search starts with the first Trie. Its prefix 1* matches the packet's source IP address, so the Trie is entered. The root node corresponds to the prefix 1*, so the search moves to the second layer and finds rule R7. However, R7's destination prefix 1* does not match the packet's destination address 01011001, so R7 does not match and the search continues. None of the prefixes recorded for Trie 2 matches the packet, and the same holds for Trie 3, so both Tries are skipped without being searched. Finally, the last Trie contains the prefix 110101*, which matches the packet, so this Trie is entered. Its root node 000101* does not match the packet, so the search proceeds to the right child, whose prefix 110101* does match. The search then enters the second-layer Trie and finds the matching rule R10. Because no prefix relationship exists inside a Trie, each first-layer Trie of the hierarchical structure contains at most one prefix that matches the packet. The search in this Trie therefore ends, and the final match is rule R10.

As the search above shows, the entire lookup visits only five nodes, which greatly reduces the number of memory accesses and saves search time and space. The additional variable further optimizes the search and improves its efficiency.
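The lookup with the per-Trie prefix variable can be sketched as follows. This is a simplified illustration under stated assumptions: the second layer is a plain list of (destination prefix, rule) pairs instead of a second-layer Trie, the extra variable is modelled as a Python set that is probed once per stored prefix length, and every prefix except 1*, 000101* and 110101* (including the destination prefix of R10) is hypothetical. The names `Node`, `ClusterTrie` and `classify` are illustrative.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    def __init__(self, prefix: str):
        self.prefix = prefix
        self.children: Dict[str, "Node"] = {}     # keyed by branching bit '0' / '1'
        self.rules: List[Tuple[str, str]] = []    # second layer: (dst prefix, rule)


class ClusterTrie:
    def __init__(self) -> None:
        self.root: Optional[Node] = None
        self.prefixes = set()                     # the additional per-Trie variable

    def insert(self, src: str, dst: str, rule: str) -> None:
        # Assumes the source prefixes of one Trie are prefix-free.
        self.prefixes.add(src)
        if self.root is None:
            self.root = Node(src)
        node, depth = self.root, 0
        while node.prefix != src:
            node = node.children.setdefault(src[depth], Node(src))
            depth += 1
        node.rules.append((dst, rule))

    def may_match(self, addr: str) -> bool:
        lengths = {len(p) for p in self.prefixes}
        return any(addr[:n] in self.prefixes for n in lengths)

    def lookup(self, src: str, dst: str) -> List[str]:
        if not self.may_match(src):
            return []                             # skip the Trie without entering it
        node, depth = self.root, 0
        while node is not None:
            if src.startswith(node.prefix):       # at most one prefix per Trie matches
                return [r for d, r in node.rules if dst.startswith(d)]
            node = node.children.get(src[depth])
            depth += 1
        return []


def classify(tries: List[ClusterTrie], src: str, dst: str) -> List[str]:
    return [rule for t in tries for rule in t.lookup(src, dst)]


tries = [ClusterTrie() for _ in range(4)]
tries[0].insert("1", "1", "R7")
tries[1].insert("0100", "0", "R3")
tries[2].insert("01110", "1", "R5")
tries[3].insert("000101", "0", "R9")
tries[3].insert("110101", "01", "R10")
matches = classify(tries, "11010111", "01011001")
```

For the packet of the worked example, the second and third Tries are rejected by `may_match` alone, and only R10 is returned.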

4.3.4 Hierarchical search architecture for OpenFlow packet classification

Before the OpenFlow multi-stage flow table lookup, a header parsing step extracts the header fields from the packet and obtains their values, as shown in Figure 4.10.

Figure 4.10 shows the complete OpenFlow header parsing process. The match fields are initialized first: the input port, Ethernet source/destination addresses and Ethernet type are filled in from the packet, and all other fields are set to 0. Parsing then moves to the next header. It first checks whether this header is a VLAN tag. If so, the VLAN ID and PCP fields are filled in, the Ethernet type that follows the VLAN header is used for all subsequent Ethernet type checks, and the rest of the VLAN tag is skipped before moving to the next step.

Figure 4.10 OpenFlow header parsing process

If the header is not a VLAN tag, or the VLAN header has been processed, the next step checks whether the switch supports Multiprotocol Label Switching (MPLS) processing. If it does, the parser checks whether the next header is an MPLS shim header, that is, whether the Ethernet type is 0x8847 or 0x8848. If so, the MPLS label and TC fields are filled in, the rest of the MPLS shim header is skipped, and all remaining steps are bypassed so that parsing goes directly to packet lookup. Otherwise, parsing moves to the next step.

If the switch does not support MPLS processing or the packet carries no MPLS shim header, the next step checks whether the switch supports ARP processing. If it does, the parser checks whether the next header is an ARP header, that is, whether the Ethernet type is 0x0806. If so, the IP source/destination addresses and the ARP opcode from the ARP header are filled in, the remaining steps are skipped, and parsing goes to packet lookup. Otherwise, parsing continues with the next check.

Finally, the parser checks whether the next header is an IP header, i.e. whether the Ethernet type is 0x0800. If it is, the IP source/destination addresses, protocol and ToS fields are filled in; otherwise parsing goes directly to packet lookup. After these fields are filled in, the parser checks whether the IP packet is a fragment. If it is, lookup starts directly. If it is not, the parser checks whether the IP protocol is 6, 17 or 132; if so, the TCP, UDP or SCTP source and destination ports are used to fill in the L4 fields, and the packet enters lookup. If the protocol is none of these, the parser checks whether it is 1; if so, the ICMP type and code are used to fill in the L4 fields. The packet then enters the lookup stage.

Header parsing leads to the following five observations: (1) the input port, Ethernet source/destination MAC addresses and Ethernet type are always used as match fields; (2) if the packet carries a VLAN tag, the VLAN ID and VLAN priority fields are used; (3) if the packet is an IP packet, the IP source/destination addresses, protocol and ToS fields are used; (4) if the packet is an unfragmented IP packet whose protocol field is 6, 17 or 132, the transport source/destination port fields are used; (5) if the packet is an unfragmented IP packet whose protocol field is 1, the ICMP type and code are used.
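The parsing flow can be sketched as a single function. This is an illustrative sketch only: the packet is modelled as a nested dictionary whose key names are hypothetical, unused fields are simply omitted instead of being set to 0, and it assumes, as in the header-parsing flowchart of the OpenFlow specification, that the L4 fields are filled in only for IP packets that are not fragments.

```python
from typing import Any, Dict

ETH_MPLS = (0x8847, 0x8848)
ETH_ARP = 0x0806
ETH_IP = 0x0800


def parse_match_fields(pkt: Dict[str, Any], mpls: bool = True,
                       arp: bool = True) -> Dict[str, Any]:
    """Return only the match fields that header parsing fills in."""
    # Always used: input port, Ethernet addresses, Ethernet type.
    m = {k: pkt[k] for k in ("in_port", "eth_src", "eth_dst", "eth_type")}
    if "vlan" in pkt:
        m["vlan_id"] = pkt["vlan"]["id"]
        m["vlan_pcp"] = pkt["vlan"]["pcp"]
        m["eth_type"] = pkt["vlan"]["eth_type"]   # type after the VLAN header
    eth_type = m["eth_type"]
    if mpls and eth_type in ETH_MPLS:
        m["mpls_label"] = pkt["mpls"]["label"]
        m["mpls_tc"] = pkt["mpls"]["tc"]
        return m                                  # straight to packet lookup
    if arp and eth_type == ETH_ARP:
        m.update(ip_src=pkt["arp"]["src"], ip_dst=pkt["arp"]["dst"],
                 arp_op=pkt["arp"]["op"])
        return m
    if eth_type != ETH_IP:
        return m
    ip = pkt["ip"]
    m.update(ip_src=ip["src"], ip_dst=ip["dst"],
             ip_proto=ip["proto"], ip_tos=ip["tos"])
    if ip.get("fragment", False):
        return m                                  # fragments carry no usable L4 header
    if ip["proto"] in (6, 17, 132):               # TCP, UDP, SCTP
        m["l4_src"], m["l4_dst"] = pkt["l4"]["src"], pkt["l4"]["dst"]
    elif ip["proto"] == 1:                        # ICMP type/code reuse the L4 fields
        m["l4_src"], m["l4_dst"] = pkt["icmp"]["type"], pkt["icmp"]["code"]
    return m


tcp_pkt = {
    "in_port": 1, "eth_src": "00:00:00:00:00:01", "eth_dst": "00:00:00:00:00:02",
    "eth_type": 0x8100,
    "vlan": {"id": 10, "pcp": 0, "eth_type": 0x0800},
    "ip": {"src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6, "tos": 0},
    "l4": {"src": 12345, "dst": 80},
}
fields = parse_match_fields(tcp_pkt)
fragment = parse_match_fields({**tcp_pkt, "ip": {**tcp_pkt["ip"], "fragment": True}})
```

The returned dictionary shows directly which field groups a given packet exercises, which is the basis for grouping the fields into separate search engines.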

From this analysis we can conclude that not every field of the flow table has to be matched. Each flow entry contains a wildcards field that indicates which fields of the entry must be matched and which can be ignored. Moreover, the 12 fields occur in fixed combinations: matching on the VLAN ID without the VLAN priority, for example, is meaningless, so these two fields must be used or ignored together. From the observations above, there are four field combinations: (1) input port, Ethernet source/destination MAC addresses, Ethernet type; (2) VLAN ID, VLAN priority; (3) IP source/destination addresses, protocol, ToS, transport source port, transport destination port; (4) IP source/destination addresses, protocol, ToS, ICMP type, ICMP code. The 12 fields are thus divided into four groups, and the fields within a group are either all used or all unused. Each group can be matched separately and the match results merged, as shown in Figure 4.11.

Groups 3 and 4 share the same four IP fields. To avoid extracting these fields twice, we split them off into a group of their own, which gives five field combinations. We design one search engine for each combination: an Ethernet search engine (input port, Ethernet source/destination MAC addresses, Ethernet type); a VLAN search engine (VLAN ID, VLAN priority); an IP search engine (IP source/destination addresses, protocol, ToS); a port search engine (transport source and destination ports); and an ICMP search engine (ICMP type and code). Each engine is searched separately, and the results of the five engines are then combined to give the final match.

Experiments show that pre-filtering on the IP fields eliminates a large share of the rules, and searching only the few remaining rules saves considerable resources. We therefore designed the hierarchical search structure shown in Figure 4.11. In this architecture, the IP search engine first filters the rule set, and the other four search engines then search only the surviving rules to find the final matching rule.

The architecture follows a divide-and-conquer approach: one large processing structure is split into five small search engines. This avoids the high cost of processing all 12 fields at once and allows each group of fields to be handled in the way that suits it. The hierarchical structure also shortens search time and saves memory, which further improves search efficiency. Finally, the engine-based design makes the framework easy to extend: when a new OpenFlow version adds fields, a new search engine can be introduced, so the design scales well.
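The two-stage lookup can be sketched as follows. This is a simplified illustration, not the thesis implementation: rules and packets are dictionaries, a field missing from a rule is treated as wildcarded, IP addresses and prefixes are bit strings, and the IP engine is a linear scan standing in for the clustered Tries. The names `exact`, `ip_engine`, `prefilter`, `classify` and all rules are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Rule = Dict[str, Any]
Packet = Dict[str, Any]
Engine = Callable[[Rule, Packet], bool]


def exact(*fields: str) -> Engine:
    """Engine that exact-matches the given fields; absent rule fields are wildcards."""
    def engine(rule: Rule, pkt: Packet) -> bool:
        return all(rule[f] == pkt.get(f) for f in fields if f in rule)
    return engine


def ip_engine(rule: Rule, pkt: Packet) -> bool:
    """First stage: prefix match on the addresses, exact match on protocol and ToS."""
    prefixes_ok = all(pkt.get(f, "").startswith(rule[f])
                      for f in ("ip_src", "ip_dst") if f in rule)
    return prefixes_ok and exact("ip_proto", "ip_tos")(rule, pkt)


SECOND_STAGE: List[Engine] = [
    exact("in_port", "eth_src", "eth_dst", "eth_type"),   # Ethernet engine
    exact("vlan_id", "vlan_pcp"),                         # VLAN engine
    exact("l4_src", "l4_dst"),                            # port engine
    exact("icmp_type", "icmp_code"),                      # ICMP engine
]


def prefilter(rules: List[Rule], pkt: Packet) -> List[Rule]:
    return [r for r in rules if ip_engine(r, pkt)]


def classify(rules: List[Rule], pkt: Packet) -> Optional[Rule]:
    survivors = prefilter(rules, pkt)          # the other engines see only these rules
    matches = [r for r in survivors if all(e(r, pkt) for e in SECOND_STAGE)]
    return max(matches, key=lambda r: r["priority"], default=None)


rules = [
    {"name": "R1", "priority": 10, "ip_src": "1101", "ip_dst": "01", "l4_dst": 80},
    {"name": "R2", "priority": 20, "ip_src": "0"},
    {"name": "R3", "priority": 5, "ip_src": "11", "eth_type": 0x0800},
    {"name": "R4", "priority": 30, "ip_src": "1", "vlan_id": 7},
]
pkt = {"in_port": 1, "eth_type": 0x0800, "ip_src": "11010111", "ip_dst": "01011001",
       "ip_proto": 6, "ip_tos": 0, "l4_src": 12345, "l4_dst": 80}
best = classify(rules, pkt)
```

Supporting a new field group then amounts to appending one more engine to `SECOND_STAGE`, which mirrors the extensibility argument above.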

Figure 4.11 OpenFlow packet classification search architecture

4.4 Experimental results and analysis

This section compares the OpenFlow packet classification algorithm based on prefix length clustering with the multi-bit compressed Trie, a current state-of-the-art approach. The multi-bit compressed Trie is widely used in packet classification and also adapts well to the OpenFlow network environment. We use a set of tools to build an OpenFlow environment and run the performance evaluation experiments in that environment.

The OpenFlow network environment is built with Floodlight and Mininet[14]. The experiments run on an Intel(R) Core(TM) i5-3210M CPU with 1 GB of RAM under Ubuntu 12.04. Floodlight and Mininet make it possible to build a small OpenFlow network without a hardware platform, which is convenient for testing and development. As before, the PALAC statistical tool is used for evaluation, with Wireshark added as a supplementary analysis tool. Wireshark is a network packet analyzer that captures network packets and displays their contents in as much detail as possible.

4.4.1 Experimental scheme

According to the earlier analysis, an OpenFlow network consists of three components: OpenFlow switches, FlowVisor and controllers. Here we only build a simple experimental environment, so an OpenFlow switch and a controller are sufficient for a working OpenFlow network.

Floodlight is an enterprise-class OpenFlow controller released under the Apache license and written in Java. It works with both virtual and physical switches, which is convenient for experiments, so we use Floodlight as the controller. Mininet is a process-based virtualization platform that lets users run a software-defined network on a single computer for research and development; code developed on it can be migrated directly to a real hardware environment. We therefore use Mininet to emulate the OpenFlow switches and host nodes and to provide the network topology.

In the experimental setup, Floodlight runs on the PC as a remote controller. A virtual machine installed on the same PC runs Mininet, which provides the OpenFlow switches, host nodes and network topology, and is configured to use the Floodlight instance on the PC as its remote controller. Mininet uses the custom network topology shown in Figure 4.12.

Figure 4.12 Experimental topology

As shown in Figure 4.12, the experiment uses a tree topology with a depth of 2 and a fan-out of 3, which has 4 switches and 9 hosts in total.
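The switch and host counts follow directly from the depth and fan-out, as the following sketch checks. It is a pure-Python illustration that does not use Mininet; the function name `tree_topology` and the node naming scheme are hypothetical. Mininet itself ships a `TreeTopo` class in `mininet.topolib` that takes `depth` and `fanout` parameters and could be used to create a topology of the same shape.

```python
from typing import List, Tuple


def tree_topology(depth: int, fanout: int) -> Tuple[List[str], List[str], List[Tuple[str, str]]]:
    """Build a tree whose inner nodes are switches and whose leaves are hosts."""
    switches: List[str] = []
    hosts: List[str] = []
    links: List[Tuple[str, str]] = []

    def add(level: int) -> str:
        if level == depth:                       # leaves of the tree are hosts
            name = f"h{len(hosts) + 1}"
            hosts.append(name)
            return name
        name = f"s{len(switches) + 1}"
        switches.append(name)
        for _ in range(fanout):
            links.append((name, add(level + 1)))
        return name

    add(0)
    return switches, hosts, links


switches, hosts, links = tree_topology(depth=2, fanout=3)
```

For depth 2 and fan-out 3 this gives one root switch, three edge switches and nine hosts, which agrees with the figures quoted above.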

4.4.2 Algorithm Searching Time Performance Comparison

We ran two experiments. The first uses a group of large classifiers, each containing 100 to 1000 rules, with adjacent classifiers differing by 50 rules, for 20 classifiers in total. The second uses a group of small classifiers, each containing 10 to 100 rules, with adjacent classifiers differing by 5 rules, again 20 in total. We ran the experiment on both groups; the results are shown in Figure 4.13.

As Figure 4.13(a) shows, for the smaller rule sets (10 to 100 rules), the gap between the classification time of the OpenFlow packet classification algorithm based on prefix length clustering (labeled OFLC in the figure, for OpenFlow Layer Classification) and that of the multi-bit compressed Trie (labeled MT, for Multiple-Bit Tree) grows as the rule set grows. As Figure 4.13(b) shows, for the large rule sets the difference between the two algorithms is pronounced, and the search time of the proposed algorithm gradually stabilizes as the rule set becomes larger. This indicates that the proposed algorithm not only performs considerably better but also remains stable as the rule set grows, which makes it better suited to growing classification demands.

Figure 4.13 OpenFlow classification time performance comparison: (a) small rule sets; (b) large rule sets

4.4.3 Algorithm Memory Performance Comparison

The memory performance comparison uses the same method as the time performance comparison. The experimental results are shown in Figure 4.14.

Figure 4.14 OpenFlow classification memory performance comparison: (a) small rule sets; (b) large rule sets

Figure 4.14(a) shows that for the smallest rule sets (10 to 30 rules), which occupy little memory to begin with, the proposed algorithm has no great advantage. As the number of rules increases, however, its memory grows more slowly than that of the multi-bit compressed Trie, and its compression advantage becomes evident.

For the large rule sets, shown in Figure 4.14(b), the gap between the two widens. As the number of rules increases, the memory required by the proposed algorithm grows much less than that of the multi-bit compressed Trie, so the algorithm is more stable and its memory requirements are considerably lower.

4.5 Conclusion

With the continuous development of network technology, existing networks can no longer meet the growing demands of new network applications, and network virtualization has emerged in response. It accelerates the reform of network infrastructure and opens up many possibilities for the future development of the Internet. The development of OpenFlow, in turn, offers broad prospects for network virtualization. OpenFlow changes the original Internet model: the packet forwarding process, previously controlled entirely by the network switching equipment, is split into separate control and forwarding functions. The controller is responsible only for managing the OpenFlow switches and making routing decisions, while the OpenFlow switches perform the forwarding. This provides flexibility for the future development of network virtualization and lays a solid foundation for next-generation enterprise data centers and cloud computing networks. However, the flexibility and scalability that OpenFlow provides bring considerable complexity to the processing of the OpenFlow flow table. The flow table contains 12 fields (in OpenFlow 1.0), which is clearly more complex than traditional 5-tuple packet classification. How to process the 12 fields quickly and achieve better forwarding performance has become a key issue for the future development of OpenFlow.

This paper first proposed a packet classification algorithm based on prefix length clustering to cope with the diversity of the 12-field OpenFlow flow table. The algorithm clusters rules by prefix length, placing different prefix lengths in different clusters and processing each cluster separately. This addresses the problem that the 12 fields call for different processing approaches, since the different fields can then be handled in the same way. To optimize further, the paper proposed a selective merging scheme that merges smaller clusters, which further reduces processing cost and improves classification performance. Finally, because OpenFlow has many fields to process, we designed a hierarchical search structure in which the IP search engine first pre-filters the large rule set to obtain a smaller one, and the remaining search is carried out on this smaller set, which further saves search time and memory. Experimental results show that the proposed classification algorithm achieves substantial improvements in both time and memory. In summary, the proposed OpenFlow packet classification algorithm addresses the high cost of OpenFlow 12-tuple classification, achieves higher classification performance and efficiency, and is of practical significance.