i.MX6 Gigabit Ethernet

jianzhengzhouzjz

We’ve recently been doing some digging into Gigabit Ethernet performance issues and questions for our i.MX6 boards and it’s time to publish some of our results. We’ve discovered a number of settings and code updates that can dramatically improve network stability and throughput.

For the impatient

There are some architectural limitations on i.MX6 boards but some configuration options and driver issues are more likely to cause performance issues. We’ve identified a number of fixes that make the situation markedly better as shown below.

Before (TCP)

 
        root@linaro-nano:~# cat /proc/cmdline
        video=mxcfb0:dev=hdmi,1280x720M@60,if=RGB24 video=mxcfb1:off video=mxcfb2:off ...
        root@linaro-nano:~# cat /proc/version
        Linux version 3.0.35-2026-geaaf30e (b21710@bluemeany) ...
        root@linaro-nano:~# while iperf -c 192.168.0.162 -r \
                | grep Mbits ; do echo -n ; done
        [  5]  0.0-10.0 sec   474 MBytes   397 Mbits/sec
        [  4]  0.0-10.1 sec  10.1 MBytes  8.47 Mbits/sec
        [  5]  0.0-10.0 sec   474 MBytes   397 Mbits/sec
        [  4]  0.0-10.0 sec  10.4 MBytes  8.72 Mbits/sec
        [  5]  0.0-10.0 sec   472 MBytes   396 Mbits/sec
        [  4]  0.0-10.0 sec  17.2 MBytes  14.4 Mbits/sec

After (TCP)

 
        root@linaro-nano:~# cat /proc/cmdline
        enable_wait_mode=off video=mxcfb0:dev=hdmi,1280x720M@60,if=RGB24 video=mxcfb1:off ...
        root@linaro-nano:~# cat /proc/version
        Linux version 3.0.35-2026-geaaf30e-02076-g68b5fa7 ...
        root@linaro-nano:~# while iperf -c 192.168.0.162 -r \
                | grep Mbits ; do echo -n ; done
        [  5]  0.0-10.0 sec   473 MBytes   397 Mbits/sec
        [  4]  0.0-10.0 sec   509 MBytes   426 Mbits/sec
        [  5]  0.0-10.0 sec   473 MBytes   397 Mbits/sec
        [  4]  0.0-10.0 sec   508 MBytes   426 Mbits/sec
        [  5]  0.0-10.0 sec   471 MBytes   395 Mbits/sec
        [  4]  0.0-10.0 sec   510 MBytes   427 Mbits/sec

In the iperf output above, each pair of lines shows the transmit and receive bandwidth, in that order. Note the very poor receive numbers in the baseline. UDP performance is markedly better, with transmit throughput of ~450 Mbits/s and receive speeds that can exceed 600 Mbits/s.

After (UDP)

        [  4]  0.0- 1.0 sec  55.3 MBytes   462 Mbits/sec   0.084 ms   15/39459 (0.038%)
        [  3]  0.0- 1.0 sec  72.8 MBytes   611 Mbits/sec   0.012 ms   -1/51843 (-0.0019%)

The sections below describe how we tested, and walk through the series of patches that led to this improvement in both stability and speed.

Test environment

Five devices were used during the tests described below:

A Sony Vaio laptop with an internal Gb Ethernet adapter
A Nitrogen6X board with i.MX6Quad TO 1.0
A SABRE SD board with i.MX6Quad TO 1.1
A SABRE Lite board with i.MX6Quad TO 1.2
A Cisco Linksys SE2500 Gigabit Ethernet switch

The tests used a Linaro nano userspace. A tar-ball is available here that contains all of the kernel versions mentioned. Specific baseline kernel versions include:

Blue Meany – This is the binary kernel (uImage) provided in the images_L3.0.35_12.09.01-GArelease. We did not re-compile this kernel.
Boundary Before – This is the first version we compiled, as a test to ensure that it matches Blue Meany. It serves as the baseline for this series of tests.
Boundary Latest – This is the latest release from the boundary-L3.0.35_12.09.01-GA branch of our Github kernel.

All of the testing was done with kernels based on Freescale’s L3.0.35_12.09.01_GA release with various patches as described.

First change: enable_wait_mode=off

We documented the first change in this post a week ago, but we didn’t mention the throughput implications. In the output below, you can see around a 10% improvement in the Blue Meany kernel from just adding enable_wait_mode=off to the kernel command line. This change made a huge difference on Tapeout 1.2, where it increased the receive speed from on the order of 10 Mbits/s to ~200 Mbits/s. On Tapeout 1.0 devices the difference was less dramatic, presumably because a number of the places that use enable_wait_mode in the kernel are also conditional on the silicon revision. In any case, with just this change, both revisions of the board show markedly better receive performance, as shown below.

 
        root@linaro-nano:~# cat /proc/version
        Linux version 3.0.35-2026-geaaf30e (b21710@bluemeany)...
        root@linaro-nano:~# cat /proc/cmdline
        enable_wait_mode=off ...
        root@linaro-nano:~# while iperf -c 192.168.0.162 -r | grep Mbits ; do echo -n ; done
        [  5]  0.0-10.0 sec   443 MBytes   372 Mbits/sec
        [  4]  0.0-10.0 sec   250 MBytes   210 Mbits/sec
        [  5]  0.0-10.0 sec   476 MBytes   399 Mbits/sec
        [  4]  0.0-10.0 sec   252 MBytes   211 Mbits/sec
        ^C

As noted in the previous post, this environment update will also make ping times more consistent.
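Before comparing numbers, it is worth confirming that the option actually reached the kernel. The following is a minimal sketch; the function name wait_mode_disabled is made up for illustration, and it only inspects the command line text, not the kernel's internal state.

```shell
#!/bin/sh
# Succeeds if the kernel command line read from stdin contains the
# exact argument enable_wait_mode=off (hypothetical helper name).
wait_mode_disabled() {
    # Put one argument per line, then look for an exact whole-line match.
    tr ' ' '\n' | grep -qx 'enable_wait_mode=off'
}

if wait_mode_disabled < /proc/cmdline; then
    echo "enable_wait_mode=off is set"
else
    echo "enable_wait_mode=off is NOT set"
fi
```

How the argument gets onto the command line depends on the boot loader setup of each board, so that part is left out here.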

Measuring performance

In the summary above, we showed the output from the simplest invocation of iperf . When used as shown, the program will connect over TCP:

 
        root@linaro-nano:~# iperf -c 192.168.0.162 -r
        ------------------------------------------------------------
        Server listening on TCP port 5001
        TCP window size: 85.3 KByte (default)
        ------------------------------------------------------------
        ------------------------------------------------------------
        Client connecting to 192.168.0.162, TCP port 5001
        TCP window size: 58.4 KByte (default)
        ------------------------------------------------------------
        [  5] local 192.168.0.119 port 52681 connected with 192.168.0.162 port 5001
        [ ID] Interval       Transfer     Bandwidth
        [  5]  0.0-10.0 sec   475 MBytes   398 Mbits/sec
        [  4] local 192.168.0.119 port 5001 connected with 192.168.0.162 port 42421
        [  4]  0.0-10.0 sec   228 MBytes   191 Mbits/sec

Because of the use of TCP, flow control is imposed on the link by the upper layers, and the bandwidth is throttled to the slower of the speeds of the two ends. Using UDP removes this possible bottleneck and also allows a flag to set the target bandwidth ( -b SPEED ) and will show us the amount of packet loss. The -t flag allows us to override the default 10 second test for quicker results.

 
        root@linaro-nano:~# iperf -c 192.168.0.162 -r -u -b 200M -t 2
        ------------------------------------------------------------
        Server listening on UDP port 5001
        Receiving 1470 byte datagrams
        UDP buffer size:  106 KByte (default)
        ------------------------------------------------------------
        ------------------------------------------------------------
        Client connecting to 192.168.0.162, UDP port 5001
        Sending 1470 byte datagrams
        UDP buffer size:  106 KByte (default)
        ------------------------------------------------------------
        [  4] local 192.168.0.119 port 51275 connected with 192.168.0.162 port 5001
        [ ID] Interval       Transfer     Bandwidth
        [  4]  0.0- 2.0 sec  48.2 MBytes   202 Mbits/sec
        [  4] Sent 34359 datagrams
        [  4] Server Report:
        [  4]  0.0- 2.0 sec  48.1 MBytes   202 Mbits/sec   0.063 ms   71/34358 (0.21%)
        [  4]  0.0- 2.0 sec  1 datagrams received out-of-order
        [  3] local 192.168.0.119 port 5001 connected with 192.168.0.162 port 53796
        [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.038 ms    0/34483 (0%)
        [  3]  0.0- 2.0 sec  1 datagrams received out-of-order
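As a sanity check, the figures in the UDP report can be recomputed from the raw counts it prints. The sketch below uses two helper functions whose names are invented for illustration. It assumes the bandwidth column counts payload bits in units of 10^6 bits per second, which is consistent with the numbers shown.

```shell
#!/bin/sh
# udp_mbits DATAGRAMS PAYLOAD_BYTES SECONDS
# Prints the payload rate in Mbit/s (10^6 bits per second).
udp_mbits() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.1f\n", n * size * 8 / secs / 1000000 }'
}

# udp_loss_pct LOST TOTAL
# Prints the percentage of datagrams that never arrived.
udp_loss_pct() {
    awk -v lost="$1" -v total="$2" \
        'BEGIN { printf "%.2f\n", 100 * lost / total }'
}

# 34359 datagrams of 1470 bytes sent in 2.0 seconds
udp_mbits 34359 1470 2.0      # prints 202.0
# 71 of 34358 datagrams lost according to the server report
udp_loss_pct 71 34358         # prints 0.21
```

Both results line up with the 202 Mbits/sec and 0.21% that iperf reported for the transmit direction.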

Doing a quick smoke-test at a few key rates shows some interesting results. The following are slightly edited to make them more readable:

100Mbit/s UDP test

        root@linaro-nano:~# iperf -c 192.168.0.162 -u -r -b 100M ;
        [  4]  0.0- 2.0 sec  23.9 MBytes   100 Mbits/sec   0.048 ms    0/17082 (0%)
        [  3]  0.0- 2.0 sec  24.0 MBytes   101 Mbits/sec   0.001 ms    3/17094 (0.018%)
       

400Mbit/s UDP test

        root@linaro-nano:~# iperf -c 192.168.0.162 -u -r -b 400M -t 2;
        [  4]  0.0- 2.0 sec  34.1 MBytes   143 Mbits/sec   0.091 ms    0/24338 (0%)
        [  3]  0.0- 5.7 sec   205 MBytes   301 Mbits/sec   0.013 ms 198303/344825 (58%)
       

1Gbit/s UDP test

        root@linaro-nano:~# iperf -c 192.168.0.162 -u -r -b 1000M -t 2;
        [  4]  0.0- 2.0 sec   108 MBytes   453 Mbits/sec   0.036 ms   54/77241 (0.07%)
        [  3]  0.0- 2.1 sec  64.9 MBytes   254 Mbits/sec  15.539 ms 95165/141465 (67%)
       

As you can see, there’s no loss at 100M, very little loss at 400M and a huge amount of receiver loss at 1G (the second line reports the receiver numbers). Interestingly, the received bandwidth also decreased when going from 400M to 1Gbit/s. To be a bit more thorough and convince ourselves that the pattern holds, we used a script:

 
        root@linaro-nano:~# cat > bwtest.sh << EOF
        #!/bin/sh
        bw=50;
        while [ \$bw -le 1000 ]; do
             echo "----------bandwidth \$bw" ;
             iperf -c 192.168.0.162 -u -r -t 2 -b \${bw}M | grep % ;
             bw=\`expr \$bw + 50\` ;
        done
        EOF
        root@linaro-nano:~# chmod a+x bwtest.sh
        root@linaro-nano:~# ./bwtest.sh
       
 
        ----------bandwidth 50
        [  4]  0.0- 2.0 sec  11.9 MBytes  50.0 Mbits/sec   0.010 ms    0/ 8510 (0%)
        [  3]  0.0- 2.0 sec  11.9 MBytes  50.0 Mbits/sec   0.002 ms    0/ 8511 (0%)
        ----------bandwidth 100
        [  4]  0.0- 2.0 sec  24.0 MBytes   100 Mbits/sec   0.048 ms    0/17085 (0%)
        [  3]  0.0- 2.0 sec  24.0 MBytes   100 Mbits/sec   0.009 ms    4/17094 (0.023%)
        ----------bandwidth 150
        [  4]  0.0- 2.0 sec  35.9 MBytes   150 Mbits/sec   0.063 ms    8/25601 (0.031%)
        [  3]  0.0- 2.0 sec  35.9 MBytes   151 Mbits/sec   0.010 ms    0/25641 (0%)
        ----------bandwidth 200
        [  4]  0.0- 2.0 sec  48.2 MBytes   202 Mbits/sec   0.066 ms    0/34413 (0%)
        [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.028 ms    0/34483 (0%)
        ----------bandwidth 250
        [  4]  0.0- 2.0 sec  59.2 MBytes   248 Mbits/sec   0.056 ms   52/42246 (0.12%)
        [  3]  0.0- 2.0 sec  59.7 MBytes   250 Mbits/sec   0.028 ms    0/42553 (0%)
        ----------bandwidth 300
        [  4]  0.0- 2.0 sec  71.7 MBytes   301 Mbits/sec   0.030 ms   55/51222 (0.11%)
        [  3]  0.0- 2.0 sec  71.8 MBytes   302 Mbits/sec   0.024 ms   33/51282 (0.064%)
        ----------bandwidth 350
        [  4]  0.0- 2.0 sec  83.8 MBytes   352 Mbits/sec   0.040 ms   87/59888 (0.15%)
        [  3]  0.0- 2.0 sec  83.7 MBytes   355 Mbits/sec   0.018 ms  868/60606 (1.4%)
        ----------bandwidth 400
        [  4]  0.0- 2.0 sec  95.6 MBytes   401 Mbits/sec   0.043 ms    5/68180 (0.0073%)
        [  3]  0.0- 2.0 sec  90.2 MBytes   379 Mbits/sec   0.012 ms 4601/68965 (6.7%)
        ----------bandwidth 450
        [  4]  0.0- 2.0 sec   105 MBytes   440 Mbits/sec   0.036 ms  369/75113 (0.49%)
        [  3]  0.0- 2.0 sec  98.9 MBytes   415 Mbits/sec   0.013 ms 6388/76922 (8.3%)
        ----------bandwidth 500
        [  4]  0.0- 2.0 sec   110 MBytes   460 Mbits/sec   0.031 ms   36/78302 (0.046%)
        [  3]  0.0- 2.0 sec  99.5 MBytes   420 Mbits/sec   0.010 ms 15956/86956 (18%)
        ----------bandwidth 550
        [  4]  0.0- 2.0 sec   110 MBytes   459 Mbits/sec   0.031 ms   22/78186 (0.028%)
        [  3]  0.0- 2.0 sec   105 MBytes   440 Mbits/sec   0.008 ms 20359/95236 (21%)
        ----------bandwidth 600
        [  4]  0.0- 2.0 sec   109 MBytes   456 Mbits/sec   0.034 ms    0/77709 (0%)
        [  3]  0.0- 2.0 sec  90.7 MBytes   381 Mbits/sec   0.009 ms 40526/105254 (39%)
        ----------bandwidth 650
        [  4]  0.0- 2.0 sec   109 MBytes   458 Mbits/sec   0.035 ms    0/77991 (0%)
        [  3]  0.0- 2.2 sec  91.2 MBytes   340 Mbits/sec  15.658 ms 46033/111110 (41%)
        ----------bandwidth 700
        [  4]  0.0- 2.0 sec   109 MBytes   458 Mbits/sec   0.034 ms  111/78120 (0.14%)
        [  3]  0.0- 1.9 sec  82.6 MBytes   358 Mbits/sec   0.009 ms 66049/124997 (53%)
        ----------bandwidth 750
        [  4]  0.0- 2.0 sec   110 MBytes   463 Mbits/sec   0.031 ms   62/78837 (0.079%)
        [  3]  0.0- 2.2 sec  82.0 MBytes   311 Mbits/sec  15.645 ms 74847/133328 (56%)
        ----------bandwidth 800
        [  4]  0.0- 2.0 sec   109 MBytes   458 Mbits/sec   0.029 ms   11/78013 (0.014%)
        [  3]  0.0- 2.0 sec  75.1 MBytes   315 Mbits/sec   0.006 ms 88480/142033 (62%)
        ----------bandwidth 850
        [  4]  0.0- 2.0 sec   109 MBytes   456 Mbits/sec   0.056 ms   10/77684 (0.013%)
        [  3]  0.0- 2.2 sec  70.2 MBytes   262 Mbits/sec  15.214 ms 99717/149777 (67%)
        ----------bandwidth 900
        [  4]  0.0- 2.0 sec   109 MBytes   457 Mbits/sec   0.032 ms   85/77943 (0.11%)
        [  3]  0.0- 2.0 sec  69.4 MBytes   290 Mbits/sec   0.009 ms 100431/149932 (67%)
        ----------bandwidth 950
        [  4]  0.0- 2.0 sec   108 MBytes   451 Mbits/sec   0.075 ms    0/76778 (0%)
        [  3]  0.0- 2.2 sec  71.4 MBytes   266 Mbits/sec  15.250 ms 91053/142012 (64%)
        ----------bandwidth 1000
        [  4]  0.0- 2.0 sec   108 MBytes   453 Mbits/sec   0.076 ms    0/77143 (0%)
        [  3]  0.0- 1.9 sec  71.2 MBytes   311 Mbits/sec   0.029 ms 90616/141376 (64%)
       

From this, it’s pretty clear that the transmit throughput rises roughly linearly to ~450 Mbits/s and stays there. The receiver bandwidth scales linearly to ~400 Mbits/s and then starts losing ground as the rate increases. Also note that we lose almost no packets on the transmit side (under 0.5% in every run); the heavy loss is all on the receiver side.
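A long run like this is easier to compare when each result is boiled down to three numbers. The following is a minimal sketch, not part of the original test setup. The function name summarize_bwtest is invented, and it assumes each iperf result arrives on a single line, as iperf prints it.

```shell
#!/bin/sh
# Reads bwtest.sh output on stdin and prints, for every result line:
#   <target Mbit/s> <measured Mbit/s> <loss %>
summarize_bwtest() {
    awk '
    /^-+bandwidth/ { target = $NF; next }
    / Mbits\/sec/ && /%\)/ {
        # The measured rate is the field just before "Mbits/sec".
        for (i = 2; i <= NF; i++)
            if ($i == "Mbits/sec") rate = $(i - 1)
        # The loss percentage is the last field, e.g. "(6.7%)".
        loss = $NF
        gsub(/[()%]/, "", loss)
        print target, rate, loss
    }'
}

# Two result lines taken from the 400M step above.
printf '%s\n' \
    '----------bandwidth 400' \
    '[  4]  0.0- 2.0 sec  95.6 MBytes   401 Mbits/sec   0.043 ms    5/68180 (0.0073%)' \
    '[  3]  0.0- 2.0 sec  90.2 MBytes   379 Mbits/sec   0.012 ms 4601/68965 (6.7%)' \
    | summarize_bwtest
# prints:
# 400 401 0.0073
# 400 379 6.7
```

On the board this would be used as ./bwtest.sh | summarize_bwtest, with the first line of each pair being the transmit direction and the second the receive direction.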

Cratering performance

Using UDP also exposed some issues on the boundary kernel. Using the boundary-before kernel, we see that the receive performance degrades as the bandwidth is increased past 400M. Note that this test has an updated bwtest.sh script that allows the test time to be set through the tsecs environment variable and the bandwidth increment to be set through incr .

 
        root@linaro-nano:~# tsecs=2 incr=200 ./bwtest.sh
        ----------bandwidth 200
        [  4]  0.0- 2.0 sec  48.1 MBytes   203 Mbits/sec   0.061 ms  164/34479 (0.48%)
        [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.034 ms    0/34483 (0%)
        ----------bandwidth 400
        [  4]  0.0- 2.0 sec  96.5 MBytes   405 Mbits/sec   0.040 ms   67/68911 (0.097%)
        [  3]  0.0- 1.9 sec  93.9 MBytes   406 Mbits/sec   0.035 ms 1990/68965 (2.9%)
        ----------bandwidth 600
        [  4]  0.0- 2.0 sec   110 MBytes   460 Mbits/sec   0.030 ms  234/78615 (0.3%)
        [  3]  0.0- 2.3 sec   110 MBytes   410 Mbits/sec  15.672 ms 26703/105262 (25%)
        ----------bandwidth 800
        [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.033 ms    0/78511 (0%)
        [  3]  0.0- 2.2 sec  2.91 MBytes  11.1 Mbits/sec  101.865 ms 140266/142342 (99%)
        ----------bandwidth 1000
        [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.033 ms    0/78383 (0%)
        [  3]  0.0- 0.2 sec  90.4 KBytes  3.18 Mbits/sec  110.420 ms 141295/141358 (1e+02%)
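The updated bwtest.sh is not reproduced in this post. A minimal sketch of what such a variant could look like follows. The tsecs and incr variables are the ones named above; the default values, the SERVER and IPERF variables, and starting the sweep at the increment (which matches the 200, 400, ... steps in the output) are assumptions.

```shell
#!/bin/sh
# Sketch of a bwtest.sh variant driven by environment variables.
tsecs=${tsecs:-2}                  # seconds per iperf run (assumed default)
incr=${incr:-50}                   # bandwidth step in Mbit/s (assumed default)
SERVER=${SERVER:-192.168.0.162}    # hypothetical: iperf server address
IPERF=${IPERF:-iperf}              # hypothetical: allows a stub in place of iperf

bwtest() {
    bw=$incr
    while [ "$bw" -le 1000 ]; do
        echo "----------bandwidth $bw"
        # Keep only the result lines, which are the ones containing a loss percentage.
        "$IPERF" -c "$SERVER" -u -r -t "$tsecs" -b "${bw}M" | grep %
        bw=$(( $bw + $incr ))
    done
}
```

Saved as a script with a final line that calls bwtest, it would be invoked the same way as in the transcript: tsecs=2 incr=200 ./bwtest.sh.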
       

This one took a while to find because it turned out not to be a code change between Blue Meany and our source tree, but a configuration change to enable a newer driver API (NAPI). This API is described on this web page. It is an architecture for decreasing the interrupt overhead on high-performance networks. When it is enabled, the interrupt handler in the FEC driver schedules incoming packets for processing but does not process them itself. Instead, they are handled outside of interrupt context. The change to our config file is trivial, but performance is much better at higher speeds, as shown below:

 
        ----------bandwidth 200
        [  5]  0.0- 2.0 sec  48.1 MBytes   203 Mbits/sec   0.063 ms  153/34482 (0.44%)
        [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.029 ms    0/34483 (0%)
        ----------bandwidth 400
        [  4]  0.0- 2.0 sec  96.4 MBytes   404 Mbits/sec   0.052 ms  151/68888 (0.22%)
        [  3]  0.0- 1.9 sec  85.5 MBytes   381 Mbits/sec   0.018 ms 7949/68965 (12%)
        ----------bandwidth 600
        [  4]  0.0- 2.0 sec   109 MBytes   458 Mbits/sec   0.075 ms  269/78262 (0.34%)
        [  3]  0.0- 1.9 sec   102 MBytes   447 Mbits/sec   0.007 ms 32747/105262 (31%)
        ----------bandwidth 800
        [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.090 ms    0/78464 (0%)
        [  3]  0.0- 2.0 sec  82.2 MBytes   347 Mbits/sec   0.006 ms 84223/142847 (59%)
        ----------bandwidth 1000
        [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.072 ms  123/78698 (0.16%)
        [  3]  0.0- 2.1 sec  70.6 MBytes   278 Mbits/sec  15.230 ms 91863/142251 (65%)
       

Note that there’s still a lot of loss at rates of 400M and above.
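Since the NAPI difference lives in the kernel configuration rather than the source, it helps to be able to check which way a given kernel was built. The sketch below is only an illustration: the text does not name the exact option, so the hypothetical napi_options helper simply lists every configuration line that mentions NAPI.

```shell
#!/bin/sh
# Prints every line of a kernel configuration (read from stdin) that
# mentions NAPI, including "# CONFIG_... is not set" lines.
napi_options() {
    grep -i 'napi' || true    # no match is not treated as an error
}

# Possible uses (assumptions about where the configuration is available):
#   napi_options < .config                 # in the kernel build tree
#   zcat /proc/config.gz | napi_options    # only if the kernel exports its config
```

The /proc/config.gz file exists only on kernels built with the in-kernel configuration export enabled, so checking the .config in the build tree is the more dependable route.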

How to improve this

Where is that loss coming from? If we look at ifconfig we can see that the network driver is aware of the dropped packets:

 
        root@linaro-nano:~# ifconfig eth0
        eth0      Link encap:Ethernet  HWaddr 00:19:b8:00:fa:9a
                  inet addr:192.168.0.119  Bcast:192.168.0.255  Mask:255.255.255.0
                  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
                  RX packets:3901387 errors:782502 dropped:0 overruns:782502 frame:782502
                  TX packets:3775053 errors:0 dropped:0 overruns:0 carrier:0
                  collisions:0 txqueuelen:1000
                  RX bytes:4178248807 (4.1 GB)  TX bytes:853327178 (853.3 MB)

The condition that increments the overrun count is here . Table 23-85 in the i.MX6DQ Reference Manual says that this means “A receive FIFO overrun occurred during frame reception”. The i.MX6DQ Errata document states in Errata ERR004512 that the maximum performance “is limited to 470 Mbps (total for Tx and Rx)”. The errata doesn’t say what the precise symptom of exceeding that limit might be, but this sure looks like it. The numbers above are pretty close to the 400Mbit/s reported in the errata. It turns out that we can do something about this. The Ethernet spec calls for a form of flow control using something called “pause frames”, which allows a receiver to tell a sender to back off for a quantum of time. That’s what this patch does. The very first part of the commit shows the addition of SUPPORTED_Pause to the phy device for i.MX6Quad and i.MX6DualLite processors. That part is key.
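The receive counters that ifconfig prints can also be read directly from sysfs, which makes it easy to take a snapshot before and after a test run. This is a minimal sketch rather than anything used in the original testing. The rx_counters name and the SYSFS_NET override are made up for illustration, and it assumes the "overruns" column corresponds to the rx_fifo_errors statistic.

```shell
#!/bin/sh
# Base directory for per-interface statistics; overridable for experiments.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# rx_counters IFACE
# Prints selected receive counters for IFACE as name=value lines.
rx_counters() {
    dir="$SYSFS_NET/$1/statistics"
    for c in rx_packets rx_errors rx_fifo_errors rx_frame_errors; do
        printf '%s=%s\n' "$c" "$(cat "$dir/$c")"
    done
}

# Example on the board: rx_counters eth0
```

Running it before and after a bwtest.sh pass and comparing the rx_fifo_errors values shows how many overruns that pass caused.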

Sidebar: check out some other tools

Before we go too much further, we need to introduce a couple of tools that are key to understanding this. The first is a tool called ethtool. It is designed to let you control the low-level functions of a network adapter. We’ll use it to see the state of the link negotiation. The second is a tool we developed named devregs. It is designed to allow access to device registers through /dev/mem. You can find details in this post. The post describes the use of the program on i.MX5x, but it’s perfectly happy to run on i.MX6 and we have a lot of registers defined in devregs_imx6x.dat. Let’s look at the output before and after the patch:

Before

 
        root@linaro-nano:~# cat /proc/version
        Linux version 3.0.35-2026-geaaf30e (b21710@bluemeany)
        root@linaro-nano:~# ethtool eth0
        Settings for eth0:
                Supported ports: [ TP MII ]
                Supported link modes:   10baseT/Half 10baseT/Full
                                        100baseT/Half 100baseT/Full
                                        1000baseT/Half 1000baseT/Full
                Supported pause frame use: No
                Supports auto-negotiation: Yes
                Advertised link modes:  10baseT/Half 10baseT/Full
                                        100baseT/Half 100baseT/Full
                                        1000baseT/Half 1000baseT/Full
                Advertised pause frame use: No
                Advertised auto-negotiation: Yes
                Speed: 1000Mb/s
                Duplex: Full
                Port: MII
                PHYAD: 6
                Transceiver: external
                Auto-negotiation: on
                Link detected: yes
        root@linaro-nano:~# devregs ENET_RCR
        ENET_RCR:0x02188084     =0x05ee0244

After

 
        root@linaro-nano:~# cat /proc/version
        Linux version 3.0.35-2026-geaaf30e-02074-g92a9e1e ...
        root@linaro-nano:~# ethtool eth0
        Settings for eth0:
                Supported ports: [ TP MII ]
                Supported link modes:   10baseT/Half 10baseT/Full
                                        100baseT/Half 100baseT/Full
                                        1000baseT/Half 1000baseT/Full
                Supported pause frame use: Symmetric
                Supports auto-negotiation: Yes
                Advertised link modes:  10baseT/Half 10baseT/Full
                                        100baseT/Half 100baseT/Full
                                        1000baseT/Half 1000baseT/Full
                Advertised pause frame use: Symmetric
                Advertised auto-negotiation: Yes
                Speed: 1000Mb/s
                Duplex: Full
                Port: MII
                PHYAD: 6
                Transceiver: external
                Auto-negotiation: on
                Link detected: yes
        root@linaro-nano:~# devregs ENET_RCR
        ENET_RCR:0x02188084     =0x05ee0264

The key things to look at are the line that says “Supported pause frame use” and the line that shows the ENET_RCR register. Bit 5 of ENET_RCR enables flow control (generation of pause frames) when set, and you can see that after the patch, flow control is enabled. Unfortunately, the situation is still much the same: the receive error numbers from bwtest.sh start rising and the bandwidth starts falling as we exceed 450 Mbits/s.
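The two ENET_RCR values differ in exactly one bit, which can be confirmed with shell arithmetic. This small sketch uses an invented helper name, rcr_bit5, and the two register values from the devregs output above.

```shell
#!/bin/sh
# rcr_bit5 VALUE
# Prints 1 if bit 5 of VALUE is set, otherwise 0.
rcr_bit5() {
    echo $(( ($1 >> 5) & 1 ))
}

before=0x05ee0244    # ENET_RCR before the patch
after=0x05ee0264     # ENET_RCR after the patch

rcr_bit5 "$before"                          # prints 0
rcr_bit5 "$after"                           # prints 1
printf '0x%x\n' $(( $before ^ $after ))     # prints 0x20, i.e. only bit 5 changed
```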

Pause frames on TO1.0

The next patch in the series fixes this. It turns out that tweaking the default almost empty threshold on Tapeout 1.0 helps the situation, as does increasing the receive FIFO section full register. After applying this patch, we can see stable results when we overload the ethernet receiver. Note that we’ve also added a couple of lines to bwtest.sh to read the values of the ENET_IEEE_T_FDXFC and ENET_IEEE_R_MACERR statistics registers. These tell us how many pause frames were transmitted and how many receive FIFO overruns are seen.

 
        root@linaro-nano:~# tsecs=2 incr=200 ./bwtest.sh
        ----------bandwidth 200
        [  4]  0.0- 1.7 sec  40.1 MBytes   203 Mbits/sec  50.557 ms  176/28804 (0.61%)
        [  3]  0.0- 2.0 sec  48.3 MBytes   203 Mbits/sec   0.034 ms    0/34483 (0%)
        ENET_IEEE_T_FDXFC:0x02188270    =0x00000d6c
        ENET_IEEE_R_MACERR:0x021882d8    =0x00000000
        ----------bandwidth 400
        [  4]  0.0- 2.0 sec  96.5 MBytes   405 Mbits/sec   0.043 ms  103/68952 (0.15%)
        [  3]  0.0- 1.9 sec  90.0 MBytes   406 Mbits/sec   0.021 ms 4751/68965 (6.9%)
        ENET_IEEE_T_FDXFC:0x02188270    =0x00001ad0
        ENET_IEEE_R_MACERR:0x021882d8    =0x00000000
        ----------bandwidth 600
        [  4]  0.0- 2.0 sec   110 MBytes   462 Mbits/sec   0.056 ms    0/78679 (0%)
        [  3]  0.0- 1.9 sec   129 MBytes   583 Mbits/sec   0.061 ms 4750/96927 (4.9%)
        ENET_IEEE_T_FDXFC:0x02188270    =0x0000f544
        ENET_IEEE_R_MACERR:0x021882d8    =0x00000000
        ----------bandwidth 800
        [  4]  0.0- 2.0 sec   110 MBytes   461 Mbits/sec   0.030 ms   92/78732 (0.12%)
        [  3]  0.0- 2.0 sec   138 MBytes   580 Mbits/sec   0.062 ms   20/98693 (0.02%)
        ENET_IEEE_T_FDXFC:0x02188270    =0x00000310
        ENET_IEEE_R_MACERR:0x021882d8    =0x00000000
        ----------bandwidth 1000
        [  4]  0.0- 2.0 sec   107 MBytes   449 Mbits/sec   0.060 ms  465/76969 (0.6%)
        [  3]  0.0- 1.9 sec   129 MBytes   583 Mbits/sec   0.021 ms 4687/96830 (4.8%)
        ENET_IEEE_T_FDXFC:0x02188270    =0x0000f482
        ENET_IEEE_R_MACERR:0x021882d8    =0x00000000

Now that’s better! We’re seeing no FIFO overruns even up to 1G and a substantial increase in receive performance. Tapeout 1.2 shows even better performance, peaking at over 630 Mbit/s.
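The statistics registers are printed in hexadecimal, which makes the pause-frame counts hard to compare at a glance. The sketch below converts devregs-style lines to decimal. The devregs_decimal name is invented for illustration, and the only assumption about the format is the NAME:address =0xVALUE layout seen in the output above.

```shell
#!/bin/sh
# Reads devregs output lines on stdin and prints "<register name> <decimal value>".
devregs_decimal() {
    while IFS= read -r line; do
        # Skip anything that does not look like "NAME:0xADDR =0xVALUE".
        case $line in
            *:*=0x*) ;;
            *) continue ;;
        esac
        name=${line%%:*}     # text before the first colon
        value=${line##*=}    # text after the last equals sign
        echo "$name $(( $value ))"
    done
}

printf '%s\n' 'ENET_IEEE_T_FDXFC:0x02188270    =0x0000f544' | devregs_decimal
# prints: ENET_IEEE_T_FDXFC 62788
```

Piping the bwtest.sh output through this filter would show, for example, that the 600M step sent 62788 pause frames while the overrun counter stayed at zero.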

Final changes

The final two patches are really belt-and-suspenders updates. The first sets the frame truncation receive length register so that a FIFO error will not result in an extra-long frame and spew error messages to the kernel log. The second treats frames with FIFO errors in the same way as framing errors and doesn’t forward them to the network stack for processing. We found that this increased performance in the presence of FIFO overruns.

Recap

We’ve uploaded the SD card image used in this testing so that you can repeat our results:

imx6-iperf-test-20121214.tar.gz

If you format a single-partition SD card as ext3 , you can extract it like so:

        ~/$ sudo mkfs.ext3 -L iperf /dev/sdc1
        ~/$ udisks --mount /dev/sdc1
        ... Assuming auto-mount as /media/iperf
        ~/$ sudo tar -C /media/iperf/ -zxvf imx6-iperf-test-20121214.tar.gz
        ~/$ sync && sudo umount /media/iperf
       

As mentioned earlier, this started off as a Linaro nano filesystem. We updated it to include a boot script, the devregs program and each of the kernels used in the tests above. The SD card image has each in the /boot directory. We encourage you to download the image, test it out on your boards and report back. Note that we haven’t yet updated the Android kernel tree, but will do that shortly. We’ll also be testing i.MX6 Solo, Dual-Lite, and the new SABRE SDB boards in upcoming days. Stay tuned to the blog for updates. If you’re using Gigabit ethernet, you’re likely to see improvements by adopting these updates.