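Data analysis
The crawler produced text files whose names start with one of the following strings; these are the files we need to process.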
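Name | Meaning |
---|---|
correct | The id has both a gender and a last-activity time |
errTime | The id has a gender but no last-activity time (noTime would probably be a better name) |
unkownsex | The gender of this id cannot be determined |
notexist | No user exists for this id |
httperror | The request for this id failed because of a server error, so it needs rollback processing |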
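Rollback
The httperror records have to be processed again.
Because of how the code is written (see part two of this series for details), the same id can appear in several consecutive httperror records within one file: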
```text
//httperror265001_266001.txt
265002 httperror
265002 httperror
265002 httperror
265002 httperror
265003 httperror
265003 httperror
265003 httperror
265003 httperror
```
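The code therefore has to handle this case: instead of processing the id on every line, it first checks whether the id is a repeat of the previous one.
Java has buffered reading to avoid hitting the file on disk too often; Python has an equivalent, see this article.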
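As a minimal sketch of the Python side (written for Python 3, whereas the code in this post targets Python 2.7): a file object returned by the built-in `open()` is buffered by default, so iterating over it line by line does not go to the disk once per line. The helper name `count_lines` and the 1 MiB buffer size are illustrative assumptions, not part of the original crawler.

```python
import os
import tempfile


def count_lines(path, buffer_size=1024 * 1024):
    """Count the lines of a text file, reading through an explicit buffer.

    buffer_size is a hypothetical tuning knob; leaving out the `buffering`
    argument lets Python pick a default buffer size.
    """
    count = 0
    # Iterating over the file object yields one line at a time, while the
    # underlying reads happen in buffer-sized chunks.
    with open(path, "r", encoding="utf-8", buffering=buffer_size) as handle:
        for _ in handle:
            count += 1
    return count


if __name__ == "__main__":
    # Build a small throwaway file that mimics the httperror record format.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                     encoding="utf-8") as tmp:
        tmp.write("265002 httperror\n265003 httperror\n")
    print(count_lines(tmp.name))
    os.remove(tmp.name)
```

For files as small as the ones in this post, the default buffer is almost certainly enough; the explicit size is shown only to make the mechanism visible.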
```python
import os
import re
import sys

# searchWeb() is defined elsewhere in the crawler script (not shown here).

def main():
    # Python 2 only: make utf-8 the default encoding for implicit str/unicode conversion
    reload(sys)
    sys.setdefaultencoding('utf-8')
    global sexRe,timeRe,notexistRe,url1,url2,file1,file2,file3,file4,startNum,endNum,file5
    sexRe = re.compile(u'em>\u6027\u522b</em>(.*?)</li')
    timeRe = re.compile(u'em>\u4e0a\u6b21\u6d3b\u52a8\u65f6\u95f4</em>(.*?)</li')
    notexistRe = re.compile(u'(p>)\u62b1\u6b49\uff0c\u60a8\u6307\u5b9a\u7684\u7528\u6237\u7a7a\u95f4\u4e0d\u5b58\u5728<')
    url1 = 'http://rs.xidian.edu.cn/home.php?mod=space&uid=%s'
    url2 = 'http://rs.xidian.edu.cn/home.php?mod=space&uid=%s&do=profile'
    file1 = 'ruisi\\correct_re.txt'
    file2 = 'ruisi\\errTime_re.txt'
    file3 = 'ruisi\\notexist_re.txt'
    file4 = 'ruisi\\unkownsex_re.txt'
    file5 = 'ruisi\\httperror_re.txt'
    # Walk through the files in the folder whose names start with 'httperror'
    for filename in os.listdir(r'E:\pythonProject\ruisi'):
        if filename.startswith('httperror'):
            count = 0
            newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
            readFile = open(newName, 'r')
            oldLine = '0'
            for line in readFile:
                # newLine is compared with the previous line to detect a repeated id
                newLine = line
                if (newLine != oldLine):
                    nu = newLine.split()[0]
                    oldLine = newLine
                    count += 1
                    searchWeb((int(nu),))
            readFile.close()
            print "%s deal %s lines" % (filename, count)
```
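To keep things simple, this code does not sort the re-fetched httperror ids back into the original category files; the results are stored directly in the following five files: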
```python
file1 = 'ruisi\\correct_re.txt'
file2 = 'ruisi\\errTime_re.txt'
file3 = 'ruisi\\notexist_re.txt'
file4 = 'ruisi\\unkownsex_re.txt'
file5 = 'ruisi\\httperror_re.txt'
```
The output log shows how many httperror records were processed in total:
```text
"D:\Program Files\Python27\python.exe" E:/pythonProject/webCrawler/reload.py
httperror132001-133001.txt deal 21 lines
httperror2001-3001.txt deal 4 lines
httperror251001-252001.txt deal 5 lines
httperror254001-255001.txt deal 1 lines
```
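The consecutive-duplicate check in the rollback step can also be expressed with `itertools.groupby`, which groups adjacent equal items. The following is a minimal Python 3 sketch, assuming every record has the form `<id> httperror`; the function name `unique_consecutive_ids` is hypothetical. Unlike the rollback code, which compares whole lines, this variant compares only the id field.

```python
from itertools import groupby


def unique_consecutive_ids(lines):
    """Yield each id once per run of consecutive records with that id.

    Each non-empty line is expected to look like "265002 httperror".
    """
    # Skip blank lines, then keep the first whitespace-separated field.
    ids = (line.split()[0] for line in lines if line.strip())
    # groupby() only merges *adjacent* equal items, which matches the
    # "same id repeated on consecutive lines" pattern in the files.
    for user_id, _run in groupby(ids):
        yield int(user_id)


if __name__ == "__main__":
    sample = [
        "265002 httperror\n",
        "265002 httperror\n",
        "265002 httperror\n",
        "265002 httperror\n",
        "265003 httperror\n",
        "265003 httperror\n",
        "265003 httperror\n",
        "265003 httperror\n",
    ]
    print(list(unique_consecutive_ids(sample)))  # [265002, 265003]
```

Because only adjacent repeats are merged, an id that shows up again later in the file would be yielded a second time; collecting the ids in a `set` would be the alternative if that matters.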
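Counting the unkownsex data in a single thread
The code is simple: a single thread counts the unkownsex users (those whose gender could not be fetched because of permission restrictions, or who never filled it in). We also checked that users without a gender have no activity time either.
The data looks like this: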
```text
253042 unkownsex
253087 unkownsex
253102 unkownsex
253118 unkownsex
253125 unkownsex
253136 unkownsex
253161 unkownsex
```
```python
import os,time

sumCount = 0                 # total number of lines over all unkownsex files
startTime = time.clock()     # Python 2 timer
for filename in os.listdir(r'E:\pythonProject\ruisi'):
    if filename.startswith('unkownsex'):
        count = 0            # number of lines in the current file
        newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
        # open the file once and iterate over that handle
        readFile = open(newName,'r')
        for line in readFile:
            count += 1
            sumCount +=1
        readFile.close()
        print "%s deal %s lines" %(filename, count)
print '%s unkowns sex' %(sumCount)
endTime = time.clock()
print "cost time " + str(endTime - startTime) + " s"
```
Processing is fast. The output is:
```text
unkownsex1-1001.txt deal 204 lines
unkownsex100001-101001.txt deal 50 lines
unkownsex10001-11001.txt deal 206 lines
#... intermediate output omitted
unkownsex99001-100001.txt deal 56 lines
unkownsex_re.txt deal 1085 lines
14223 unkowns sex
cost time 0.0813142301261 s
```
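For readers on Python 3, where `time.clock()` is no longer available and `print` is a function, the same per-file line count could look roughly like the sketch below. The function name `count_lines_by_prefix`, the relative `ruisi` directory, and the UTF-8 encoding are assumptions made for illustration.

```python
import os
import time


def count_lines_by_prefix(directory, prefix):
    """Return ({filename: line_count}, total) for files starting with prefix."""
    per_file = {}
    for filename in sorted(os.listdir(directory)):
        if not filename.startswith(prefix):
            continue
        path = os.path.join(directory, filename)
        # The with-block closes the file even if reading fails.
        with open(path, "r", encoding="utf-8") as handle:
            per_file[filename] = sum(1 for _ in handle)
    return per_file, sum(per_file.values())


if __name__ == "__main__":
    start = time.perf_counter()
    data_dir = "ruisi"  # assumed location of the data files; adjust as needed
    if os.path.isdir(data_dir):
        per_file, total = count_lines_by_prefix(data_dir, "unkownsex")
        for name, count in per_file.items():
            print("%s deal %s lines" % (name, count))
        print("%s unkowns sex" % total)
    print("cost time %s s" % (time.perf_counter() - start))
```

Passing a different prefix, for example `"httperror"`, reuses the same function for the other categories.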
Counting the correct data in a single thread
The data looks like this:
```text
31024 男 2014-11-11 13:20
31283 男 2013-3-25 19:41
31340 保密 2015-2-2 15:17
31427 保密 2014-8-10 09:17
31475 保密 2013-7-2 08:59
31554 保密 2014-10-17 17:02
31621 男 2015-5-16 19:27
31872 保密 2015-1-11 16:49
31915 保密 2014-5-4 11:01
31997 保密 2015-5-16 20:14
```
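The code follows. The idea is to read the files line by line and use line.split() to extract the gender field. sumCount counts the total number of users, while boycount, girlcount and secretcount count the users whose gender is 男 (male), 女 (female) and 保密 (undisclosed). As before, the gender strings are matched as unicode literals (here with a plain equality comparison rather than a regular expression).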
```python
import os,sys,time

# Python 2 only: with utf-8 as the default encoding, the byte strings read
# from the files can be compared with the unicode literals below
reload(sys)
sys.setdefaultencoding('utf-8')
startTime = time.clock()
sumCount = 0       # all users
boycount = 0       # gender is u'\u7537' (male)
girlcount = 0      # gender is u'\u5973' (female)
secretcount = 0    # gender is u'\u4fdd\u5bc6' (undisclosed)
for filename in os.listdir(r'E:\pythonProject\ruisi'):
    if filename.startswith('correct'):
        newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
        readFile = open(newName,'r')
        for line in readFile:
            # second column holds the gender
            sexInfo = line.split()[1]
            sumCount +=1
            if sexInfo == u'\u7537' :
                boycount += 1
            elif sexInfo == u'\u5973':
                girlcount +=1
            elif sexInfo == u'\u4fdd\u5bc6':
                secretcount +=1
        readFile.close()
        # running totals up to and including this file
        print "until %s, sum is %s boys; %s girls; %s secret;" %(filename, boycount,girlcount,secretcount)
print "total is %s; %s boys; %s girls; %s secret;" %(sumCount, boycount,girlcount,secretcount)
endTime = time.clock()
print "cost time " + str(endTime - startTime) + " s"
```
Note that each line reports the running totals up to and including that file, not the statistics of that single file. The output is:
<code class="hljs applescript has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">until correct1-1001.txt, sum is 110 boys; 7 girls; 414 secret;
until correct100001-101001.txt, sum is 125 boys; 13 girls; 542 secret;
#... (omitted)
until correct99001-100001.txt, sum is 11070 boys; 3113 girls; 26636 secret;
until correct_re.txt, sum is 13937 boys; 4007 girls; 28941 secret;
total is 46885; 13937 boys; 4007 girls; 28941 secret;
cost time 3.60047888495 s</code>
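The running-total behaviour described above can be sketched as follows. This is only a minimal illustration written for Python 3 (the listings in this article are Python 2). The function name `running_totals`, the sample file names, and the sample records are assumptions made for the example, not the author's code. It assumes the gender is the second whitespace-separated field of each record, as in the `correct` files.

```python
from collections import Counter

def running_totals(files):
    """Yield (name, cumulative_counts) after each file is processed.

    `files` is an iterable of (name, lines) pairs; each line is expected
    to look like "<id> <gender> ..." with whitespace-separated fields.
    """
    totals = Counter()
    for name, lines in files:
        for line in lines:
            fields = line.split()
            if len(fields) >= 2:
                totals[fields[1]] += 1
        # Copy so that later updates do not change what was already yielded.
        yield name, Counter(totals)

# Hypothetical sample data standing in for two result files.
sample = [
    ("correct_a.txt", ["1 男 2015-01-01", "2 保密 2015-01-02"]),
    ("correct_b.txt", ["3 女 2015-01-03", "4 男 2015-01-04"]),
]

for name, totals in running_totals(sample):
    print("until %s, sum is %d boys; %d girls; %d secret;"
          % (name, totals["男"], totals["女"], totals["保密"]))
```

The counter is never reset between files, which is why the last "until" line and the final total agree.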
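Counting with multiple threads
To speed up the counting, we can use multiple threads.
For comparison, we also look at how long a single thread takes.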
<code class="hljs python has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"># encoding: UTF-8
# Python 2 code (print statement, reload(sys))
import threading
import time, os, sys

# Global counters shared by all threads
SUM = 0
BOY = 0
GIRL = 0
SECRET = 0
NUM = 0

# Subclass threading.Thread, override run(), and launch the thread with start()
# This is much like Java
class StaFileList(threading.Thread):
    # List of file names
    fileList = []

    def __init__(self, fileList):
        threading.Thread.__init__(self)
        self.fileList = fileList

    def run(self):
        global SUM, BOY, GIRL, SECRET
        # A delay can be added here to make the threading more visible,
        # instead of the threads finishing in order thread-1, 2, 3
        # time.sleep(1)
        # acquire() takes the lock
        if mutex.acquire(1):
            self.staFiles(self.fileList)
            # release() gives the lock back
            mutex.release()

    # Process the given list of files and count boys and girls
    # Mind the data synchronization here; global refers to the module-level counters
    def staFiles(self, files):
        global SUM, BOY, GIRL, SECRET
        for name in files:
            newName = 'E:\\pythonProject\\ruisi\\%s' % (name)
            with open(newName, 'r') as readFile:
                for line in readFile:
                    sexInfo = line.split()[1]
                    SUM += 1
                    if sexInfo == u'\u7537':
                        BOY += 1
                    elif sexInfo == u'\u5973':
                        GIRL += 1
                    elif sexInfo == u'\u4fdd\u5bc6':
                        SECRET += 1
            # print "thread %s, until %s, total is %s; %s boys; %s girls;" \
            #       " %s secret;" %(self.name, name, SUM, BOY,GIRL,SECRET)

def test():
    # files holds a batch of file names; the batch size sets how many files one thread handles
    files = []
    # Keep every thread so the main thread can wait for all of them at the end
    staThreads = []
    i = 0
    for filename in os.listdir(r'E:\pythonProject\ruisi'):
        # Create one thread for every 20 files collected
        if filename.startswith('correct'):
            files.append(filename)
            i += 1
            # One thread handles 20 files
            if i == 20:
                staThreads.append(StaFileList(files))
                files = []
                i = 0
    # The remaining files, most likely fewer than 20
    if files:
        staThreads.append(StaFileList(files))

    for t in staThreads:
        t.start()
    # The main thread waits for all child threads to exit; would it be faster without this?
    for t in staThreads:
        t.join()

if __name__ == '__main__':
    reload(sys)
    sys.setdefaultencoding('utf-8')
    startTime = time.clock()
    mutex = threading.Lock()
    test()
    print "Multi Thread, total is %s; %s boys; %s girls; %s secret;" % (SUM, BOY, GIRL, SECRET)
    endTime = time.clock()
    print "cost time " + str(endTime - startTime) + " s"</code>
Output:
<code class="hljs vhdl has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">Multi Thread, total is 46885; 13937 boys; 4007 girls; 28941 secret;
cost time 0.132137192794 s</code>
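We find that the time is about the same as with a single thread. The reason is thread synchronization: run() holds the lock for the entire staFiles call, so only one thread is counting at any moment. Acquiring and releasing the lock costs time, and so does saving and restoring state when switching between threads.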
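One way to reduce the locking cost described above is to let each thread count into its own local variables and take the lock only once, to merge its result into the shared totals. The sketch below illustrates that idea in Python 3. `count_lines`, `worker`, `TOTALS`, and the sample batches are hypothetical names and data, not the author's code. In CPython the global interpreter lock still keeps Python bytecode from running in parallel, so this mainly removes lock contention and does not guarantee a speedup.

```python
import threading
from collections import Counter

TOTALS = Counter()              # shared result, guarded by TOTALS_LOCK
TOTALS_LOCK = threading.Lock()

def count_lines(lines):
    """Count the second whitespace-separated field of each line."""
    local = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            local[fields[1]] += 1
    return local

def worker(lines):
    local = count_lines(lines)  # no lock is held while counting
    with TOTALS_LOCK:           # the lock is held only for the short merge
        TOTALS.update(local)

# Hypothetical batches standing in for the per-thread file lists.
batches = [
    ["1 男 x", "2 女 x"],
    ["3 保密 x", "4 男 x"],
    ["5 男 x"],
]

threads = [threading.Thread(target=worker, args=(b,)) for b in batches]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # wait, so the totals are complete before printing

print("total is %d; %d boys; %d girls; %d secret;"
      % (sum(TOTALS.values()), TOTALS["男"], TOTALS["女"], TOTALS["保密"]))
```

Using `with TOTALS_LOCK:` also releases the lock if the merge raises an exception, which a bare acquire/release pair does not.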
Single-threaded vs. multithreaded on more data
We can process the correct, errTime, and unkownsex files together.
Single-threaded code
<code class="hljs python has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"># coding=utf-8
# Python 2 code (print statement, reload(sys))
import os, sys, time
reload(sys)
sys.setdefaultencoding('utf-8')
startTime = time.clock()
sumCount = 0
boycount = 0
girlcount = 0
secretcount = 0
unkowncount = 0
for filename in os.listdir(r'E:\pythonProject\ruisi'):
    # Gender and activity time are both present
    if filename.startswith('correct'):
        newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
        with open(newName, 'r') as readFile:
            for line in readFile:
                sexInfo = line.split()[1]
                sumCount += 1
                if sexInfo == u'\u7537':
                    boycount += 1
                elif sexInfo == u'\u5973':
                    girlcount += 1
                elif sexInfo == u'\u4fdd\u5bc6':
                    secretcount += 1
        # print "until %s, sum is %s boys; %s girls; %s secret;" %(filename, boycount,girlcount,secretcount)
    # No activity time, but the gender is present
    elif filename.startswith("errTime"):
        newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
        with open(newName, 'r') as readFile:
            for line in readFile:
                sexInfo = line.split()[1]
                sumCount += 1
                if sexInfo == u'\u7537':
                    boycount += 1
                elif sexInfo == u'\u5973':
                    girlcount += 1
                elif sexInfo == u'\u4fdd\u5bc6':
                    secretcount += 1
        # print "until %s, sum is %s boys; %s girls; %s secret;" %(filename, boycount,girlcount,secretcount)
    # Neither gender nor time: just count the lines
    elif filename.startswith("unkownsex"):
        newName = 'E:\\pythonProject\\ruisi\\%s' % (filename)
        # count = len(open(newName,'rU').readlines())
        # For large files use a loop; count starts at -1 so that an empty file ends up as 0 after the +1
        count = -1
        for count, line in enumerate(open(newName, 'rU')):
            pass
        count += 1
        unkowncount += count
        sumCount += count
        # print "until %s, sum is %s unkownsex" %(filename, unkowncount)
print "Single Thread, total is %s; %s boys; %s girls; %s secret; %s unkownsex;" % (sumCount, boycount, girlcount, secretcount, unkowncount)
endTime = time.clock()
print "cost time " + str(endTime - startTime) + " s"</code>
The output is:
<code class="hljs vbnet has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">Single Thread, total is 61111; 13937 boys; 4009 girls; 28942 secret; 14223 unkownsex;
cost time 1.37444645628 s</code>
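The correct and errTime branches of the single-threaded listing do the same work, and the unkownsex branch only counts lines. A more compact way to express that dispatch is sketched below in Python 3. `count_file`, the two prefix constants, and the sample file names and records are hypothetical, introduced only for illustration. Unlike the original listing, this sketch skips lines that have no second field instead of raising an error.

```python
from collections import Counter

GENDER_PREFIXES = ("correct", "errTime")   # files that carry a gender column
UNKNOWN_PREFIX = "unkownsex"               # spelling follows the file names

def count_file(filename, lines, totals):
    """Update `totals` in place from the lines of one file."""
    if filename.startswith(GENDER_PREFIXES):   # startswith accepts a tuple
        for line in lines:
            fields = line.split()
            if len(fields) >= 2:
                totals[fields[1]] += 1
                totals["sum"] += 1
    elif filename.startswith(UNKNOWN_PREFIX):
        n = sum(1 for _ in lines)              # 0 for an empty file
        totals["unknown"] += n
        totals["sum"] += n
    # Any other prefix (for example notexist or httperror) is ignored.

# Hypothetical sample input: (file name, lines) pairs.
totals = Counter()
count_file("correct_sample.txt", ["1 男 t", "2 女 t"], totals)
count_file("errTime_sample.txt", ["3 保密"], totals)
count_file("unkownsex_sample.txt", ["4", "5"], totals)
count_file("notexist_sample.txt", ["6"], totals)

print("total is %d; %d boys; %d girls; %d secret; %d unkownsex;"
      % (totals["sum"], totals["男"], totals["女"], totals["保密"], totals["unknown"]))
```

Counting with `sum(1 for _ in lines)` reads the file lazily like the enumerate loop, but needs no special starting value for the empty-file case.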
Multithreaded code
<code class="hljs python has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">__author__ = 'admin'
# encoding: UTF-8
# Multithreaded statistics program (Python 2)
import threading
import time, os, sys

# Global counters shared by all threads
SUM = 0
BOY = 0
GIRL = 0
SECRET = 0
UNKOWN = 0

class StaFileList(threading.Thread):
    # List of text file names handled by this thread
    fileList = []

    def __init__(self, fileList):
        threading.Thread.__init__(self)
        self.fileList = fileList

    def run(self):
        global SUM, BOY, GIRL, SECRET
        # Hold the lock while this thread updates the global counters
        if mutex.acquire(1):
            self.staManyFiles(self.fileList)
            mutex.release()

    # Process the given list of files and count boys and girls
    # Note the data synchronization issue here
    def staCorrectFiles(self, files):
        global SUM, BOY, GIRL, SECRET
        for name in files:
            newName = 'E:\\pythonProject\\ruisi\\%s' % (name)
            with open(newName, 'r') as readFile:
                for line in readFile:
                    sexInfo = line.split()[1]
                    SUM += 1
                    if sexInfo == u'\u7537':
                        BOY += 1
                    elif sexInfo == u'\u5973':
                        GIRL += 1
                    elif sexInfo == u'\u4fdd\u5bc6':
                        SECRET += 1
            # print "thread %s, until %s, total is %s; %s boys; %s girls;" \
            #       " %s secret;" %(self.name, name, SUM, BOY,GIRL,SECRET)

    def staManyFiles(self, files):
        global SUM, BOY, GIRL, SECRET, UNKOWN
        for name in files:
            if name.startswith('correct'):
                newName = 'E:\\pythonProject\\ruisi\\%s' % (name)
                with open(newName, 'r') as readFile:
                    for line in readFile:
                        sexInfo = line.split()[1]
                        SUM += 1
                        if sexInfo == u'\u7537':
                            BOY += 1
                        elif sexInfo == u'\u5973':
                            GIRL += 1
                        elif sexInfo == u'\u4fdd\u5bc6':
                            SECRET += 1
                # print "thread %s, until %s, total is %s; %s boys; %s girls;" \
                #       " %s secret;" %(self.name, name, SUM, BOY,GIRL,SECRET)
            # No activity time, but the sex is known
            elif name.startswith("errTime"):
                newName = 'E:\\pythonProject\\ruisi\\%s' % (name)
                with open(newName, 'r') as readFile:
                    for line in readFile:
                        sexInfo = line.split()[1]
                        SUM += 1
                        if sexInfo == u'\u7537':
                            BOY += 1
                        elif sexInfo == u'\u5973':
                            GIRL += 1
                        elif sexInfo == u'\u4fdd\u5bc6':
                            SECRET += 1
                # print "thread %s, until %s, total is %s; %s boys; %s girls;" \
                #       " %s secret;" %(self.name, name, SUM, BOY,GIRL,SECRET)
            # Neither sex nor time is known, so just count the lines
            elif name.startswith("unkownsex"):
                newName = 'E:\\pythonProject\\ruisi\\%s' % (name)
                # count = len(open(newName,'rU').readlines())
                # For large files use a loop. count starts at -1 to handle an
                # empty file, so that the final +1 yields 0 lines.
                count = -1
                with open(newName, 'rU') as readFile:
                    for count, line in enumerate(readFile):
                        pass
                count += 1
                UNKOWN += count
                SUM += count
                # print "thread %s, until %s, total is %s; %s unkownsex" %(self.name, name, SUM, UNKOWN)

def test():
    files = []
    # Keep all threads so the main thread can wait for every child thread to finish
    staThreads = []
    i = 0
    for filename in os.listdir(r'E:\pythonProject\ruisi'):
        # Create one thread for every 20 files collected
        if filename.startswith("correct") or filename.startswith("errTime") or filename.startswith("unkownsex"):
            files.append(filename)
            i += 1
            if i == 20:
                staThreads.append(StaFileList(files))
                files = []
                i = 0
    # The remaining files, most likely fewer than 20
    if files:
        staThreads.append(StaFileList(files))

    for t in staThreads:
        t.start()
    # Wait in the main thread until all child threads have exited
    for t in staThreads:
        t.join()

if __name__ == '__main__':
    reload(sys)
    sys.setdefaultencoding('utf-8')
    startTime = time.clock()
    mutex = threading.Lock()
    test()
    print "Multi Thread, total is %s; %s boys; %s girls; %s secret; %s unkownsex" %(SUM, BOY,GIRL,SECRET,UNKOWN)
    endTime = time.clock()
    print "cost time " + str(endTime - startTime) + " s"</code>
The output is:
<code class="hljs vhdl has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">Multi Thread, total is 61111; 13937 boys; 4009 girls; 28942 secret;
cost time 1.23049112201 s</code>
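As the timings show, the multithreaded version was somewhat faster than the single-threaded one in this run (about 1.23 s versus 1.37 s), and because the threads are synchronized with a lock, the statistics are identical.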
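The synchronization point above can be illustrated with a minimal Python 3 sketch. It is not the article's code: `BATCHES`, `tally_batch`, and `run_all` are hypothetical names, and the sample lines are invented stand-ins that only assume the sex is the second whitespace-separated field, as in the script above. Each thread counts its own batch locally and takes the lock only to merge its result into the shared totals.

```python
import threading
from collections import Counter

# Hypothetical stand-ins for the lines of the result files
# (assumption: the sex is the second whitespace-separated field).
BATCHES = [
    ["1 男 2015-01-01", "2 女 2015-01-02"],
    ["3 保密 2015-01-03", "4 男 2015-01-04"],
    ["5 男 2015-01-05"],
]

def run_all(batches):
    """Tally the sex field of every line, using one thread per batch."""
    totals = Counter()            # shared between threads
    lock = threading.Lock()

    def tally_batch(lines):
        local = Counter()         # private to this thread, no lock needed
        for line in lines:
            fields = line.split()
            if len(fields) > 1:
                local[fields[1]] += 1
        with lock:                # lock held only for the short merge step
            totals.update(local)

    threads = [threading.Thread(target=tally_batch, args=(b,)) for b in batches]
    for t in threads:
        t.start()
    for t in threads:             # wait for every worker before reading totals
        t.join()
    return dict(totals)

if __name__ == "__main__":
    print(run_all(BATCHES))
```

Holding the lock only for the merge lets the threads overlap their per-batch work, whereas a thread that holds the lock for its whole `run` forces the batches to execute one after another.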
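Summary and reflections
Note that in Python, code inside a class usually has to go through self explicitly to reach the instance's attributes and methods, which is quite different from Java.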
<code class="hljs python has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">    def __init__(self, fileList):
        threading.Thread.__init__(self)
        self.fileList = fileList

    def run(self):
        global SUM, BOY, GIRL, SECRET
        if mutex.acquire(1):
            # Calling a method of the same class requires self
            self.staManyFiles(self.fileList)
            mutex.release()</code>
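The role of self can also be seen in isolation with a small Python 3 sketch. The class `FileBatch` and its methods are hypothetical names made up for this illustration and are not part of the script above.

```python
class FileBatch:
    """Toy class showing that attributes and methods are reached via self."""

    def __init__(self, file_list):
        # Without "self." this would only create a local variable.
        self.file_list = file_list

    def count_files(self):
        return len(self.file_list)

    def describe(self):
        # Calling another method of the same object also goes through self;
        # a bare count_files() would raise NameError.
        return "%d files" % self.count_files()


if __name__ == "__main__":
    print(FileBatch(["correct1_1001.txt", "errTime1_1001.txt"]).describe())
```

In Java the equivalent `this.` prefix is optional in most cases, which is why forgetting self is a common slip when moving between the two languages.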
References
<code class="hljs vhdl has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">total is 61111; 13937 boys; 4009 girls; 28942 secret; 14223 unkownsex; cost time 1.25413238673 s</code>