Common Python Examples (1)

ronon77

Category: python&nodejs. Tags: python, examples

Article link: https://blog.csdn.net/ronon77/article/details/84774576


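Python code for converting Chinese to pinyin (supports full pinyin and first-letter abbreviations)

2015/07/05 by Crazyant

The code in this article is an upgraded version of https://github.com/cleverdeng/pinyin.py, with the following improvements over the original: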

        
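1. A new parameter, firstcode: if True, only the first pinyin letter of each Chinese character is returned; if False, the full pinyin is output.
2. Fix: English letters are output unchanged.
3. Fix: output still works when the separator is an empty string.
4. Upgrade: the path of the dictionary file can be specified.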

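The code is simple: it loads a dictionary that maps each character's code point to its pinyin, then replaces the Chinese characters in the input with their pinyin one by one.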

          
    #!/usr/bin/env python
    # -*- coding:utf-8 -*-
    """
    Original code: https://github.com/cleverdeng/pinyin.py
    Added features:
        1. New parameter firstcode: if True, only the first pinyin letter of each
           Chinese character is returned; if False, the full pinyin is output.
        2. Fix: English letters are output unchanged.
        3. Fix: output still works when the separator is an empty string.
        4. Upgrade: the path of the dictionary file can be specified.
    """

    __version__ = '0.9'
    __all__ = ["PinYin"]

    import io
    import os.path


    class PinYin(object):
        def __init__(self):
            self.word_dict = {}

        def load_word(self, dict_file):
            self.dict_file = dict_file
            if not os.path.exists(self.dict_file):
                raise IOError("NotFoundFile")

            with io.open(self.dict_file, encoding="utf-8") as f_obj:
                for f_line in f_obj:
                    # Each line holds a hex code point and its readings, separated
                    # by whitespace; split on the first run of tabs or spaces.
                    line = f_line.split(None, 1)
                    if len(line) == 2:
                        self.word_dict[line[0]] = line[1]

        def hanzi2pinyin(self, string="", firstcode=False):
            result = []
            if isinstance(string, bytes):
                string = string.decode("utf-8")

            for char in string:
                key = '%X' % ord(char)
                value = self.word_dict.get(key)
                if value:
                    # Take the first reading and drop its last character (the tone digit).
                    outpinyin = value.split()[0][:-1].lower()
                else:
                    outpinyin = ""
                if not outpinyin:
                    # Not in the dictionary (e.g. an English letter): output it unchanged.
                    outpinyin = char
                if firstcode:
                    result.append(outpinyin[0])
                else:
                    result.append(outpinyin)

            return result

        def hanzi2pinyin_split(self, string="", split="", firstcode=False):
            """Extract the pinyin of a Chinese string.

            @param string: the Chinese text to convert
            @param split: the separator
            @param firstcode: full pinyin or first letters? True extracts first letters;
                              the default, False, extracts the full pinyin
            """
            result = self.hanzi2pinyin(string=string, firstcode=firstcode)
            return split.join(result)


    if __name__ == "__main__":
        test = PinYin()
        test.load_word('word.data')
        string = "Java程序性能优化-让你的Java程序更快更稳定"
        print("in: %s" % string)
        print("out: %s" % str(test.hanzi2pinyin(string=string)))
        print("out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=True))
        print("out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=False))

Output of the main function in the example

How to use the code:

If you need a different kind of extraction, you can modify the code accordingly.
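
As a rough illustration of the per-character lookup, the sketch below applies the same replacement logic to a tiny in-memory dictionary. The three sample entries and the helper name to_pinyin are assumptions made for this example; the key/value layout (hex code point mapped to a reading ending in a tone digit) mirrors how the PinYin class uses its dictionary, and the real dictionary file is much larger.

```python
# Hypothetical three-entry dictionary: hex code point -> reading with tone digit.
SAMPLE_DICT = {
    "%X" % ord(ch): reading
    for ch, reading in [("程", "CHENG2"), ("序", "XU4"), ("快", "KUAI4")]
}


def to_pinyin(text, sep="", firstcode=False, word_dict=SAMPLE_DICT):
    """Replace each character found in word_dict with its toneless pinyin."""
    parts = []
    for char in text:
        value = word_dict.get("%X" % ord(char))
        # First reading minus the trailing tone digit; unknown characters pass through.
        pinyin = value.split()[0][:-1].lower() if value else char
        parts.append(pinyin[0] if firstcode else pinyin)
    return sep.join(parts)


print(to_pinyin("Java程序快"))              # Javachengxukuai
print(to_pinyin("程序快", sep="-"))         # cheng-xu-kuai
print(to_pinyin("程序快", firstcode=True))  # cxk
```

Passing the dictionary as a parameter keeps the sketch easy to try with other sample entries.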

Download the code (dictionary included) as a package:

Posted in: python

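Reading files in Python with a list field schema or a dict field schema

2014/12/05 by Crazyant

Python is an excellent tool for processing text data. Its very simple support for reading, splitting, filtering and converting means developers do not have to deal with tedious stream-based file handling (compared with Java, that is). In my own work, complex text-processing computations, including Streaming programs written for Hadoop, are all done in Python.

In text processing, loading the file into memory is the first step, which raises the question of how to map each column of the file to a specific variable. The clumsiest approach is to refer to fields by index, like this:

The clumsiest way to map a file line to fields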
          
    # fields is the list obtained by reading one line and splitting it on the delimiter
    user_id = fields[0]
    user_name = fields[1]
    user_type = fields[2]

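With this approach, any change to the column order or any added or removed column makes the code a maintenance nightmare. Code like this should be avoided.

This article recommends two more elegant ways to read the data. Both configure a field schema first and then read according to it; the schema can take one of two forms, a dict schema or a list schema.

Reading the file and splitting each line into a list of fields

First, read the file and split each line on the delimiter, returning a list of fields for later processing.

The code is as follows:

Function that reads a file and splits each line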
         
    def read_file_data(filepath):
        '''Read the file at the given path line by line.

        @param filepath: path of the file to read
        @return: for each line, the list of fields obtained by splitting on \t
        '''
        with open(filepath, 'r') as fin:
            for line in fin:
                # Strip only the line terminator, so a last line without one is not truncated
                line = line.rstrip("\n")
                if not line:
                    continue
                # Yield the list of fields for the current line
                yield line.split("\t")

Because it uses the yield keyword, the function yields the split data of one line at a time, so the calling code can read each line with for fields in read_file_data(fpath).
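
To make the generator pattern concrete, here is a small self-contained sketch. The function name iter_tsv_rows and the sample file contents are assumptions for illustration; it follows the same idea as read_file_data but writes its own temporary input file so that it can be run directly.

```python
import os
import tempfile


def iter_tsv_rows(filepath):
    """Yield the tab-separated fields of each non-empty line, one line at a time."""
    with open(filepath, "r") as fin:
        for line in fin:
            line = line.rstrip("\n")
            if line:
                yield line.split("\t")


# Write a small sample file (hypothetical data) to iterate over.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\tname1\t0\n\n2\tname2\t1\n")
    sample_path = tmp.name

rows = []
for fields in iter_tsv_rows(sample_path):
    rows.append(fields)  # the generator hands over one split line per iteration
os.remove(sample_path)

print(rows)  # [['1', 'name1', '0'], ['2', 'name2', '1']]
```

The empty line in the sample is skipped, and the file is read lazily rather than loaded as a whole.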

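Mapping to a model, method 1: assembling the list of fields with a preconfigured dict schema

This method configures a dict of the form {"field name": field position} as the data schema, assembles the list read from each line according to it, and finally lets the data be accessed like a dict.

The function used:

Assembling the list of fields with a dict schema so values can be read by key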
          
    @staticmethod
    def map_fields_dict_schema(fields, dict_schema):
        """Map field values to field names according to a dict schema. For example, if fields is ['a','b','c'] and the schema is {'name':0, 'age':1}, the result is {'name':'a','age':'b'}.

        @param fields: list of values, usually obtained by splitting one line on \t
        @param dict_schema: dict whose keys are field names and whose values are field positions
        @return: dict whose keys are field names and whose values are field values
        """
        pdict = {}
        for fstr, findex in dict_schema.items():
            pdict[fstr] = str(fields[int(findex)])
        return pdict

With this method and the previous one, the data can be read as follows:

Example of reading data with a dict schema
          
    # coding:utf8
    """
    @author: www.crazyant.net
    Test loading a data list with a dict schema
    Pros: for a file with many columns, only the fields to be read need to be configured
    Cons: with many fields, configuring the position of each one is tedious
    """
    import file_util
    import pprint

    # The dict schema to read; only the positions of the columns of interest need to be configured
    dict_schema = {"userid": 0, "username": 1, "usertype": 2}
    for fields in file_util.FileUtil.read_file_data("userfile.txt"):
        # Map the list of fields according to the dict schema
        dict_fields = file_util.FileUtil.map_fields_dict_schema(fields, dict_schema)
        pprint.pprint(dict_fields)

Output:

Dict data after loading with the dict schema

    {'userid': '1', 'username': 'name1', 'usertype': '0'}
    {'userid': '2', 'username': 'name2', 'usertype': '1'}
    {'userid': '3', 'username': 'name3', 'usertype': '2'}
    {'userid': '4', 'username': 'name4', 'usertype': '3'}
    {'userid': '5', 'username': 'name5', 'usertype': '4'}
    {'userid': '6', 'username': 'name6', 'usertype': '5'}
    {'userid': '7', 'username': 'name7', 'usertype': '6'}
    {'userid': '8', 'username': 'name8', 'usertype': '7'}
    {'userid': '9', 'username': 'name9', 'usertype': '8'}
    {'userid': '10', 'username': 'name10', 'usertype': '9'}
    {'userid': '11', 'username': 'name11', 'usertype': '10'}
    {'userid': '12', 'username': 'name12', 'usertype': '11'}

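Mapping to a model, method 2: assembling the list of fields with a preconfigured list schema

If you need to read every column of the file, or the first several columns, configuring a dict schema is somewhat cumbersome: every field needs an index position, and those positions are simply counted up from 0. That is menial work that should be eliminated.

This is where the list schema comes in: the configured list schema is first converted into a dict schema, and the data is then loaded via the dict schema.

The code that converts the schema and reads with the list schema:

Methods for reading data with a list schema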
          
    @staticmethod
    def transform_list_to_dict(para_list):
        """Convert ['a', 'b'] into {'a':0, 'b':1}.

        @param para_list: list of the field names, one per column
        @return: dict mapping each field name to its position
        """
        res_dict = {}
        for idx, name in enumerate(para_list):
            res_dict[str(name).strip()] = idx
        return res_dict

    @staticmethod
    def map_fields_list_schema(fields, list_schema):
        """Map field values to field names according to a list schema. For example, if fields is ['a','b','c'] and the schema is ['name', 'age'], the result is {'name':'a','age':'b'}.

        @param fields: list of values, usually obtained by splitting one line on \t
        @param list_schema: list of column names
        @return: dict whose keys are field names and whose values are field values
        """
        dict_schema = FileUtil.transform_list_to_dict(list_schema)
        return FileUtil.map_fields_dict_schema(fields, dict_schema)

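In use, the schema can be configured as a list, which is more concise because no indexes need to be configured:

Calling code that reads data with a list schema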
          
    # coding:utf8
    """
    @author: www.crazyant.net
    Test loading a data list with a list schema
    Pros: to read all columns, the list schema only needs the field names written in column order
    Cons: it cannot read just the fields of interest; the leading columns must all be listed
    """
    import file_util
    import pprint

    # The list schema to read; it can only cover the first several columns, or all columns
    list_schema = ["userid", "username", "usertype"]
    for fields in file_util.FileUtil.read_file_data("userfile.txt"):
        # Map the list of fields according to the list schema
        dict_fields = file_util.FileUtil.map_fields_list_schema(fields, list_schema)
        pprint.pprint(dict_fields)

The output is exactly the same as with the dict schema.
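
The equivalence of the two schemas can be checked with a short sketch. It re-implements the two mapping steps as plain functions (the names map_by_dict_schema and map_by_list_schema are assumptions for this example, not part of file_util.py) and runs them on one hypothetical row.

```python
def map_by_dict_schema(fields, dict_schema):
    """Build {field name: value} from a {field name: column index} schema."""
    return {name: fields[index] for name, index in dict_schema.items()}


def map_by_list_schema(fields, list_schema):
    """Derive the column indexes from the order of the names, then map as above."""
    dict_schema = {name: index for index, name in enumerate(list_schema)}
    return map_by_dict_schema(fields, dict_schema)


fields = ["1", "name1", "0"]  # one hypothetical row, already split on tabs
by_dict = map_by_dict_schema(fields, {"userid": 0, "username": 1, "usertype": 2})
by_list = map_by_list_schema(fields, ["userid", "username", "usertype"])
only_name = map_by_dict_schema(fields, {"username": 1})

print(by_dict == by_list)  # True
print(only_name)           # {'username': 'name1'}
```

The last call shows the one thing the dict schema can do that the list schema cannot: picking out a single column without naming the ones before it.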

Full code of file_util.py

Below is the full code of file_util.py, which you can put in your own common library.

file_util.py
          
    # -*- encoding:utf8 -*-
    '''
    @author: www.crazyant.net
    @version: 2014-12-5
    '''


    class FileUtil(object):
        '''Common operations on files and paths
        '''

        @staticmethod
        def read_file_data(filepath):
            '''Read the file at the given path line by line.

            @param filepath: path of the file to read
            @return: for each line, the list of fields obtained by splitting on \t
            '''
            with open(filepath, 'r') as fin:
                for line in fin:
                    # Strip only the line terminator, so a last line without one is not truncated
                    line = line.rstrip("\n")
                    if not line:
                        continue
                    # Yield the list of fields for the current line
                    yield line.split("\t")

        @staticmethod
        def transform_list_to_dict(para_list):
            """Convert ['a', 'b'] into {'a':0, 'b':1}.

            @param para_list: list of the field names, one per column
            @return: dict mapping each field name to its position
            """
            res_dict = {}
            for idx, name in enumerate(para_list):
                res_dict[str(name).strip()] = idx
            return res_dict

        @staticmethod
        def map_fields_list_schema(fields, list_schema):
            """Map field values to field names according to a list schema. For example, if fields is ['a','b','c'] and the schema is ['name', 'age'], the result is {'name':'a','age':'b'}.

            @param fields: list of values, usually obtained by splitting one line on \t
            @param list_schema: list of column names
            @return: dict whose keys are field names and whose values are field values
            """
            dict_schema = FileUtil.transform_list_to_dict(list_schema)
            return FileUtil.map_fields_dict_schema(fields, dict_schema)

        @staticmethod
        def map_fields_dict_schema(fields, dict_schema):
            """Map field values to field names according to a dict schema. For example, if fields is ['a','b','c'] and the schema is {'name':0, 'age':1}, the result is {'name':'a','age':'b'}.

            @param fields: list of values, usually obtained by splitting one line on \t
            @param dict_schema: dict whose keys are field names and whose values are field positions
            @return: dict whose keys are field names and whose values are field values
            """
            pdict = {}
            for fstr, findex in dict_schema.items():
                pdict[fstr] = str(fields[int(findex)])
            return pdict

Article URL: http://www.crazyant.net/1707.html

Posted in: python Tagged: python

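Video tutorial: working with MySQL in Python

2014/11/04 by Crazyant

Here is a video tutorial I made on working with MySQL in Python. It has three parts: setting up a Python development environment and installing the plugin for MySQL development; an explanation of the standard API specification for accessing MySQL databases from Python; and a hands-on coding demonstration of developing MySQL programs in Python. After the course you should have a basic grasp of developing MySQL programs with Python.

HD version of the videos on Baidu: http://pan.baidu.com/s/1DB0qM password: ri1n

Python and MySQL video tutorial, part 1 – Setting up the development environment

The following development environment is recommended: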

Eclipse + JDK7
- Plugin: PyDev 3.8.0

python-2.7.8
- Plugin: MySQL-python-1.2.4b4.win32-py2.7

MySQL server: the MySQL bundled with the wampserver2.5 package
- Requires: vcredist_x64
- Mysql-5.6.17

This video on Youku: http://v.youku.com/v_show/id_XODE3Nzk4MTEy.html

Python and MySQL video tutorial, part 2 – The standard interface specification

Part 2 of the video tutorial mainly covers:

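Python's official standard specification for database access
- Document: http://legacy.python.org/dev/peps/pep-0249/
The connection object that Python establishes with the database
- Constructor of the connection object, with parameters such as host, port, user name, password and character encoding
- Methods of the connection object, mainly closing the connection, getting a cursor, committing a transaction and rolling back a transaction
The cursor object that Python uses to execute SQL statements
- The difference between a normal cursor and a dictionary cursor, and why the dictionary cursor is preferable
- Cursor methods for executing SQL statements
- Cursor methods for fetching the result set of an executed SQL statement
A framework for writing database access programs in Python, consisting mainly of these steps:
1. Import the MySQLdb module
2. Get a connection object
3. Get a normal cursor or a dictionary cursor
4. Execute SQL statements
5. Fetch the data from the cursor object and process it further
6. Close the connection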
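
The six steps can be sketched as follows. Because MySQLdb needs a running MySQL server, this sketch swaps in Python's built-in sqlite3 module, which follows the same PEP 249 interface; with MySQLdb the connect() call would instead take connection parameters such as host, port, user name and password. The table name and rows are made up for the example.

```python
import sqlite3  # 1. import the DB-API module (a stand-in for MySQLdb here)

# 2. get a connection object (an in-memory database, for illustration only)
conn = sqlite3.connect(":memory:")
# Let rows be read by column name, similar in spirit to a dictionary cursor.
conn.row_factory = sqlite3.Row

# 3. get a cursor
cursor = conn.cursor()

# 4. execute SQL statements (sqlite3 uses ? placeholders; MySQLdb uses %s)
cursor.execute("CREATE TABLE users (userid INTEGER, username TEXT)")
cursor.executemany("INSERT INTO users VALUES (?, ?)", [(1, "name1"), (2, "name2")])
conn.commit()
cursor.execute("SELECT userid, username FROM users ORDER BY userid")

# 5. fetch the result set and process it
names = [row["username"] for row in cursor.fetchall()]
print(names)  # ['name1', 'name2']

# 6. close the connection
conn.close()
```

Reading columns by name rather than by position avoids the same index-maintenance problem that motivates preferring a dictionary cursor.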

Video on Youku: http://v.youku.com/v_show/id_XODIxNzQ1MjQ0.html

Python and MySQL video tutorial, part 3 – Example code demonstration

Part 3 of the video tutorial mainly covers:

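A framework for writing MySQL programs in Python
- Import the module: import MySQLdb
- Get a connection: conn = MySQLdb.connect()
- Get a cursor: cursor = conn.cursor()
- Execute SQL: cursor.execute()
- Fetch data: cursor.fetchall()
- Close the connection: conn.close()
Differences between MySQL's InnoDB and MyISAM engines
- InnoDB supports transactions; MyISAM does not
- When working with an InnoDB database, after executing insert, delete or update statements the Python code must call conn.commit() for the changes to take effect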
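
The effect of commit() can be seen in a small sketch. Python's built-in sqlite3 module is used as a stand-in for MySQLdb so the code runs without a MySQL server; since it is transactional, it plays the role of an InnoDB table here. The table name and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE users (userid INTEGER, username TEXT)")
conn.commit()

# An INSERT that is rolled back instead of committed leaves no trace.
cursor.execute("INSERT INTO users VALUES (1, 'name1')")
conn.rollback()
cursor.execute("SELECT COUNT(*) FROM users")
count_after_rollback = cursor.fetchone()[0]

# The same INSERT followed by commit() is made permanent.
cursor.execute("INSERT INTO users VALUES (1, 'name1')")
conn.commit()
cursor.execute("SELECT COUNT(*) FROM users")
count_after_commit = cursor.fetchone()[0]

print(count_after_rollback, count_after_commit)  # 0 1
conn.close()
```

Closing a connection without committing has the same effect as the rollback shown here: the pending changes are discarded.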

This video on Youku: http://v.youku.com/v_show/id_XODI4MjE4Njgw.html

The code and slides for this article are on git: http://git.oschina.net/peishuaishuai/python-mysql-tutorial

The HD video will be published on Baidu Pan later.

Article URL: http://www.crazyant.net/1664.html. Please credit the source when reposting.

Posted in: mysql, python Tagged: mysql, python, video tutorial

Batch-renaming files in Python

2013/12/18 by Crazyant

This uses two functions from the os module:

1. List all files in a folder (directories included)

os.listdir(path)
Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.

Availability: Unix, Windows.

Changed in version 2.3: On Windows NT/2k/XP and Unix, if path is a Unicode object, the result will be a list of Unicode objects. Undecodable filenames will still be returned as string objects

2. Rename a file

os.rename(src, dst)
Rename the file or directory src to dst. If dst is a directory, OSError will be raised. On Unix, if dst exists and is a file, it will be replaced silently if the user has permission. The operation may fail on some Unix flavors if src and dst are on different filesystems. If successful, the renaming will be an atomic operation (this is a POSIX requirement). On Windows, if dst already exists, OSError will be raised even if it is a file; there may be no way to implement an atomic rename when dst names an existing file.

Availability: Unix, Windows

        
import os

dirpath = "D:/workbench/crazyant.net/myfiles"
for fname in os.listdir(dirpath):
    # Drop the first three characters of the original name
    newfname = fname[3:]
    newfpath = "%s/%s" % (dirpath, newfname)
    oldfpath = "%s/%s" % (dirpath, fname)
    os.rename(oldfpath, newfpath)

In short, os.listdir reads every entry in the folder and os.rename then renames each one.
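A slightly more defensive variant of the same idea is sketched below for Python 3. The function name `strip_prefix`, its `prefix_len` parameter and the skip rules are assumptions added for illustration; they are not part of the original script. The sketch builds paths with os.path.join, skips sub-directories, and refuses to overwrite an existing file, because os.rename replaces an existing file silently on Unix.

```python
import os
import tempfile


def strip_prefix(dirpath, prefix_len=3):
    """Rename every regular file in dirpath by dropping its first prefix_len characters."""
    renamed = []
    for fname in sorted(os.listdir(dirpath)):
        oldfpath = os.path.join(dirpath, fname)
        newfname = fname[prefix_len:]
        # Skip sub-directories and names that would become empty.
        if not os.path.isfile(oldfpath) or not newfname:
            continue
        newfpath = os.path.join(dirpath, newfname)
        # Do not overwrite a file that already has the target name.
        if os.path.exists(newfpath):
            continue
        os.rename(oldfpath, newfpath)
        renamed.append((fname, newfname))
    return renamed


if __name__ == "__main__":
    # Try it on throwaway files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("01_a.txt", "02_b.txt"):
            open(os.path.join(tmp, name), "w").close()
        print(strip_prefix(tmp))
        print(sorted(os.listdir(tmp)))
```

Running it against a temporary directory first, as in the main block, is a cheap way to check the slicing rule before pointing it at real files.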

Official documentation for Python's os module: http://docs.python.org/2/library/os.html

Please credit the source when reposting: http://www.crazyant.net/1397.html

Posted in: python Tagged: python

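Using Python's built-in map, reduce and filter for text processing

2013/12/15 by Crazyant

A file is made up of lines, and those lines form a list. Python provides three functions that are very handy for processing lists: map, reduce and filter. Using them in text processing makes the code shorter and clearer.

The map and reduce here are Python built-ins and have nothing to do with Hadoop's map and reduce, although the purpose is similar: map typically does the preprocessing and reduce does the aggregation.

Using map, reduce and filter for text processing

Below is the content of a text file. Column 1 is the ID and column 4 is the weight. The goal is to take all rows whose ID is even, double the weight of those rows, and return the sum of the weights.

ID	key	value	weight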
1	name1	value1	11
2	name2	value2	12
3	name3	value3	13
4	name4	value4	14
5	name5	value5	15
6	name6	value6	16
7	name7	value7	17
8	name8	value8	18
9	name9	value9	19
10	name10	value10	20

The code that uses filter, map and reduce is as follows:

        
# coding=utf8
'''
Created on 2013-12-15
@author: www.crazyant.net
'''
import pprint

def read_file(file_path):
    '''
    Read the file line by line and yield the tab-separated fields of each line.
    '''
    with open(file_path, "r") as fp:
        for line in fp:
            fields = line.rstrip("\n").split("\t")
            yield fields

def is_even_lines(fields):
    '''
    Return True if the number in the first column is even.
    '''
    return int(fields[0]) % 2 == 0

def double_weights(fields):
    '''
    Double the weight, which is the last field of the row.
    '''
    fields[-1] = int(fields[-1]) * 2
    return fields

def sum_weights(sum_value, fields):
    '''
    Add the weight of this row to sum_value and return the new sum_value.
    '''
    sum_value += int(fields[-1])
    return sum_value

if __name__ == "__main__":
    # Read all lines of the file
    file_lines = [x for x in read_file("test_data")]
    print 'Original rows in the file:'
    pprint.pprint(file_lines)
    print '----'

    # Keep only the rows whose ID is even
    even_lines = filter(is_even_lines, file_lines)
    print 'Rows with an even ID:'
    pprint.pprint(even_lines)
    print '----'

    # Double the weight of each row
    double_weights_lines = map(double_weights, even_lines)
    print 'Rows with doubled weights:'
    pprint.pprint(double_weights_lines)
    print '----'

    # Sum all the weights.
    # Each element passed to sum_weights is a list, so reduce needs an explicit initial value, 0 here.
    sum_val = reduce(sum_weights, double_weights_lines, 0)
    print 'Sum of the weights:'
    print sum_val

Output:

Original rows in the file:
[['1', 'name1', 'value1', '11'],
 ['2', 'name2', 'value2', '12'],
 ['3', 'name3', 'value3', '13'],
 ['4', 'name4', 'value4', '14'],
 ['5', 'name5', 'value5', '15'],
 ['6', 'name6', 'value6', '16'],
 ['7', 'name7', 'value7', '17'],
 ['8', 'name8', 'value8', '18'],
 ['9', 'name9', 'value9', '19'],
 ['10', 'name10', 'value10', '20']]
----
Rows with an even ID:
[['2', 'name2', 'value2', '12'],
 ['4', 'name4', 'value4', '14'],
 ['6', 'name6', 'value6', '16'],
 ['8', 'name8', 'value8', '18'],
 ['10', 'name10', 'value10', '20']]
----
Rows with doubled weights:
[['2', 'name2', 'value2', 24],
 ['4', 'name4', 'value4', 28],
 ['6', 'name6', 'value6', 32],
 ['8', 'name8', 'value8', 36],
 ['10', 'name10', 'value10', 40]]
----
Sum of the weights:
160

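Characteristics of map, reduce and filter

filter: takes a list and returns a list of the elements that satisfy a condition; similar to where a=1 in SQL
map: takes a list, processes each element, and returns a list of the processed elements; similar to select a*2 in SQL
reduce: takes a list and aggregates it (running total, summary, average and so on); similar to select sum(a), avg(b) in SQL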
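For readers on Python 3, the same pipeline can be sketched as below. Two differences from the Python 2 behaviour described in this article matter here: filter and map return lazy iterators instead of lists, and reduce has moved to functools. The `ROWS` constant and the function names are hypothetical; the rows mirror the sample data, and the sketch keeps the rows with an even ID, as the script and its output above do.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

# Sample rows mirroring the test data: [ID, key, value, weight]
ROWS = [[str(i), "name%d" % i, "value%d" % i, str(10 + i)] for i in range(1, 11)]


def is_even_id(fields):
    """Return True if the ID in the first column is even."""
    return int(fields[0]) % 2 == 0


def double_weight(fields):
    """Return a new row with the weight doubled, leaving the input row untouched."""
    return fields[:-1] + [int(fields[-1]) * 2]


def add_weight(total, fields):
    """Add the weight of one row to the running total."""
    return total + int(fields[-1])


def total_doubled_even_weights(rows):
    even_rows = filter(is_even_id, rows)      # lazy iterator in Python 3
    doubled = map(double_weight, even_rows)   # lazy iterator in Python 3
    return reduce(add_weight, doubled, 0)     # 0 is the initial value


if __name__ == "__main__":
    print(total_doubled_even_weights(ROWS))
```

Returning a new row from `double_weight` is a design choice of this sketch: the original function mutates its argument, which also changes the rows held in the unfiltered list.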

Official descriptions of these functions

map(function, iterable, …)

Apply function to every item of iterable and return a list of the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. If one iterable is shorter than another it is assumed to be extended with None items. If function is None, the identity function is assumed; if there are multiple arguments, map() returns a list consisting of tuples containing the corresponding items from all iterables (a kind of transpose operation). The iterable arguments may be a sequence or any iterable object; the result is always a list.

reduce(function, iterable[, initializer])

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned. Roughly equivalent to:

def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        try:
            initializer = next(it)
        except StopIteration:
            raise TypeError('reduce() of empty sequence with no initial value')
    accum_value = initializer
    for x in it:
        accum_value = function(accum_value, x)
    return accum_value

filter(function, iterable)

Construct a list from those elements of iterable for which function returns true. iterable may be either a sequence, a container which supports iteration, or an iterator. If iterable is a string or a tuple, the result also has that type; otherwise it is always a list. If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.

Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.

See itertools.ifilter() and itertools.ifilterfalse() for iterator versions of this function, including a variation that filters for elements where the function returns false.

References:

http://docs.python.org/2/library/functions.html

http://www.oschina.net/code/snippet_111708_16145

Please credit the source when reposting: http://www.crazyant.net/1390.html

Posted in: python

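Updating one MySQL table from another

2013/11/28 by Crazyant

I recently had this requirement: tables A and B in MySQL both have (id, age) columns, and I wanted to read the age column from table B and write it into the age column of the row with the same ID in table A. The first approach that came to mind was to read table B with Python into {id: age} form and then update table A once for each ID and age.

The definitions and data of the two tables are as follows:

Table A definition:

Field	Type	Comment
id	int(11)
name	varchar(20)
age	int(11)

Data:

1,name1,0
2,name2,0
3,name3,0
4,name4,0
5,name5,0

Table B definition:

Field	Type	Comment
id	int(11)
age	int(11)

Data:

1,11
2,21
3,31
4,41
5,51

Implementation in Python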

# -*- encoding:utf8 -*-
'''
@author: crazyant.net
Read the (id, age) data from table B, then update table A row by row.
'''
from common.DBUtil import DB

dbUtil = DB('127.0.0.1',3306,'root','','test')

rs = dbUtil.query("SELECT id,age FROM table_b")

for row in rs:
    (idv,age)=row
    print (idv,age)
    update_sql="update table_a set age='%s' where id='%s';"%(age,idv)
    print update_sql
    dbUtil.update(update_sql)

print 'over'
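The row-by-row approach can also be written with bound parameters instead of string formatting, which avoids quoting problems when values come from outside the program. The following is a sketch for Python 3 that uses the standard-library sqlite3 module as a stand-in for MySQL; `copy_ages` and `make_demo_db` are hypothetical helpers, and the demo data mirrors the two tables above.

```python
import sqlite3


def copy_ages(conn):
    """Copy age from table_b into table_a row by row, using bound parameters."""
    rows = conn.execute("SELECT id, age FROM table_b").fetchall()
    # sqlite3 uses "?" placeholders; MySQLdb uses "%s" with the same tuple argument.
    conn.executemany(
        "UPDATE table_a SET age = ? WHERE id = ?",
        [(age, idv) for (idv, age) in rows],
    )
    conn.commit()


def make_demo_db():
    """Build an in-memory database holding the sample data of tables A and B."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT, age INTEGER)")
    conn.execute("CREATE TABLE table_b (id INTEGER, age INTEGER)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?, 0)",
                     [(i, "name%d" % i) for i in range(1, 6)])
    conn.executemany("INSERT INTO table_b VALUES (?, ?)",
                     [(i, 10 * i + 1) for i in range(1, 6)])
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    copy_ages(conn)
    print(conn.execute("SELECT id, age FROM table_a ORDER BY id").fetchall())
    conn.close()
```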

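A single SQL statement is enough

The code looked too simple, so I searched for whether MySQL can update one table from another, and found that UPDATE itself supports updating across multiple tables.

UPDATE table_a,table_b SET table_a.age=table_b.age WHERE table_a.id=table_b.id;

Using Python code for this is overkill, like shooting a mosquito with a cannon.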
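The multi-table UPDATE syntax is specific to MySQL. For an experiment that runs without a MySQL server, the same result can be obtained with a correlated subquery, which is a different technique from the statement in the article. The sketch below uses Python 3 with the standard-library sqlite3 module; `UPDATE_SQL` and `update_ages_in_one_statement` are hypothetical names, and the demo deliberately leaves id 5 out of table_b to show that unmatched rows keep their old value.

```python
import sqlite3

# Correlated-subquery form of the update (not the MySQL multi-table syntax).
UPDATE_SQL = """
UPDATE table_a
SET age = (SELECT b.age FROM table_b AS b WHERE b.id = table_a.id)
WHERE id IN (SELECT id FROM table_b)
"""


def update_ages_in_one_statement():
    """Fill table_a.age from table_b with one statement and return the ages."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT, age INTEGER)")
    conn.execute("CREATE TABLE table_b (id INTEGER, age INTEGER)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?, 0)",
                     [(i, "name%d" % i) for i in range(1, 6)])
    conn.executemany("INSERT INTO table_b VALUES (?, ?)",
                     [(i, 10 * i + 1) for i in range(1, 5)])  # no row for id 5
    conn.execute(UPDATE_SQL)
    conn.commit()
    ages = [row[0] for row in conn.execute("SELECT age FROM table_a ORDER BY id")]
    conn.close()
    return ages


if __name__ == "__main__":
    print(update_ages_in_one_statement())
```

The WHERE clause is the important design choice here: without it, rows of table_a that have no match in table_b would have their age set to NULL.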

Please credit the source when reposting: link

Posted in: mysql, python Tagged: mysql, python

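[DedeCMS migration] Reading the DedeCMS MySQL database to generate all article links

2013/11/27 by Crazyant

Ad: I take on DedeCMS-to-WordPress migration work.

This article explains how to read the tables of a DedeCMS MySQL database and generate the links of all articles.

It uses DBUtil, a module that wraps common MySQL functions; see the link for the code.

1. Confirm the structure of the links

This is recorded in the namerule column of dede_arctype, the DedeCMS category table.

Run the SQL statement: SELECT namerule FROM dede_arctype;

The rows returned all have the same value (it is usually left unchanged): {typedir}/{Y}/{M}{D}/{aid}.html

This means a link is made up of the following fields:

{typedir}: the category directory, from the typedir column of dede_arctype;
{Y}{M}{D}: the publication time of the article, from the pubdate column of dede_archives;
{aid}: the article ID, from the ID column of dede_archives;

2. Read MySQL and assemble the URLs

Outline:

Read the dede_arctype and dede_archives tables to get the information for every link (article ID, category name, category directory, title, publication date, custom file name).
For each link, assemble the URL as described in step 1.
At this point we have the ID, title and URL of every link.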
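The assembly rule from step 1 can be sketched in isolation, without a database. This is a Python 3 sketch; the function name `build_dede_link` and its parameters are hypothetical, and interpreting the pubdate timestamp in UTC is an assumption made so that the result does not depend on the local time zone (the script in this article uses local time).

```python
import datetime


def build_dede_link(typedir, pubdate, filename, aid, site_home_url):
    """Assemble {typedir}/{Y}/{M}{D}/{aid}.html for one article."""
    # Assumption: read the Unix timestamp as UTC to get a reproducible date.
    day = datetime.datetime.fromtimestamp(pubdate, tz=datetime.timezone.utc)
    # Use the custom file name when one is set, otherwise the article ID.
    name = filename.strip() or str(aid)
    link = "%s/%s/%s/%s.html" % (
        typedir, day.strftime("%Y"), day.strftime("%m%d"), name)
    # Replace the {cmspath} placeholder, if present, with the site root.
    return link.replace("{cmspath}", site_home_url)


if __name__ == "__main__":
    noon = datetime.datetime(2013, 11, 27, 12, tzinfo=datetime.timezone.utc)
    print(build_dede_link("{cmspath}/python", noon.timestamp(), "", 1390,
                          "http://www.crazyant.net"))
```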

        
# -*- encoding:utf8 -*-
from common import DBUtil
import pprint
import datetime

dbUtil = DBUtil.DB('127.0.0.1', 3306, 'root', '', 'oiayafnq_lwqn')
site_home_url = "http://www.crazyant.net"

class Link():
    def __init__(self, p_linkid, p_title, p_linkurl):
        self.linkid = p_linkid
        self.title = p_title
        self.linkurl = p_linkurl

    def __str__(self):
        strv = "%s\n%s\n%s\n" % (self.linkid, self.title, self.linkurl)
        return strv

class DedeLinks():
    def __init__(self):
        self.allLinks = []

    def getDbArticlesInfo(self):
        '''
        Fetch the link information and the matching category from the database.
        '''
        rs = dbUtil.query('''
            SELECT
                dede_archives.id,dede_arctype.typename,dede_arctype.typedir,typeid,title,pubdate,filename
            FROM
                dede_archives,dede_arctype
            WHERE dede_archives.typeid=dede_arctype.id;
        ''')
        return rs

    def equipLink(self, typedir, urldate, filename, linkid):
        '''
        Build a URL from the category directory, the publication date,
        the custom file name (may be empty) and the link ID.
        '''
        article_date = str(datetime.date.fromtimestamp(urldate)).replace("-", "")
        link_dir = "%s/%s/%s" % (typedir, article_date[:4], article_date[4:])
        if filename.strip() != "":
            link = "%s/%s.html" % (link_dir, filename)
        else:
            link = "%s/%s.html" % (link_dir, linkid)
        link = link.replace("{cmspath}", site_home_url)
        return link

    def getAllDedeLinks(self):
        rs = self.getDbArticlesInfo()
        for row in rs:
            (linkid, typename, typedir, typeid, title, pubdate, filename) = row
            linkurl = self.equipLink(typedir, pubdate, filename, linkid)
            linkNode = Link(linkid, title, linkurl)
            self.allLinks.append(linkNode)

    def process(self):
        self.getAllDedeLinks()

if __name__ == "__main__":
    dlinks = DedeLinks()
    dlinks.process()
    for linkNode in dlinks.allLinks:
        print linkNode

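Other modules can import this module and read dlinks.allLinks to access all the links; each element of the list holds the link ID, the link title and the link URL.

Please credit the source when reposting: [DedeCMS migration] Reading the DedeCMS MySQL database to generate all article links

Posted in: PHP, python Tagged: DedeCMS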

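A small wrapper class for accessing MySQL from Python

2013/11/27 by Crazyant

Accessing MySQL from Python is fairly simple; for the details, see my other article: link

In practice I only use two MySQL operations, query and update. Below is the wrapper I normally use; you can copy it and use it directly.

File name: DBUtil.py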

        
# -*- encoding:utf8 -*-
'''
@author: crazyant.net
@version: 2013-10-22
Wrappers around commonly used MySQL functions
'''
import MySQLdb

class DB():
    def __init__(self, DB_HOST, DB_PORT, DB_USER, DB_PWD, DB_NAME):
        self.DB_HOST = DB_HOST
        self.DB_PORT = DB_PORT
        self.DB_USER = DB_USER
        self.DB_PWD = DB_PWD
        self.DB_NAME = DB_NAME

        self.conn = self.getConnection()

    def getConnection(self):
        return MySQLdb.Connect(
            host=self.DB_HOST,    # MySQL server address
            port=self.DB_PORT,    # port number
            user=self.DB_USER,    # user name
            passwd=self.DB_PWD,   # password
            db=self.DB_NAME,      # database name
            charset='utf8'        # connection character set
        )

    def query(self, sqlString):
        cursor = self.conn.cursor()
        cursor.execute(sqlString)
        returnData = cursor.fetchall()
        cursor.close()
        # The connection stays open so the same DB object can be reused.
        return returnData

    def update(self, sqlString):
        cursor = self.conn.cursor()
        cursor.execute(sqlString)
        self.conn.commit()
        cursor.close()

    def close(self):
        # Close the shared connection once all queries and updates are done.
        self.conn.close()

if __name__ == "__main__":
    db = DB('127.0.0.1', 3306, 'root', '', 'wordpress')
    print db.query("show tables;")
    db.close()

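For usage, see the main block at the bottom of the file: call query to run a SELECT statement and fetch its results, or call update for INSERT, DELETE and similar statements.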
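The query/update split can be tried without a MySQL server. The following is only a sketch under stated assumptions: it is written for Python 3, the class name SketchDB and the posts table are made up for illustration, and the standard-library sqlite3 module stands in for MySQLdb (both follow the Python DB-API). Unlike the class above, it keeps the connection open between calls and passes values as parameters instead of building the SQL string by hand.

```python
import sqlite3


class SketchDB(object):
    """Hypothetical helper mirroring the query/update split.

    sqlite3 stands in for MySQLdb so the sketch runs without a server.
    """

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def query(self, sql, params=()):
        # Read statements: execute, fetch every row, release the cursor.
        cursor = self.conn.cursor()
        try:
            cursor.execute(sql, params)
            return cursor.fetchall()
        finally:
            cursor.close()

    def update(self, sql, params=()):
        # Write statements: execute, then commit so the change is saved.
        cursor = self.conn.cursor()
        try:
            cursor.execute(sql, params)
            self.conn.commit()
            return cursor.rowcount
        finally:
            cursor.close()

    def close(self):
        self.conn.close()


db = SketchDB()
db.update("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
db.update("INSERT INTO posts (title) VALUES (?)", ("hello",))
rows = db.query("SELECT id, title FROM posts WHERE title = ?", ("hello",))
print(rows)
db.close()
```

Note that sqlite3 uses ? as its parameter placeholder, while MySQLdb uses %s, so the SQL strings would need that one change when moving back to MySQL.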

Posted in: mysql, python Tagged: mysql, python

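Two ways to run shell commands from Python

2013/11/22 by Crazyant

There are two ways to run shell commands from Python: the first uses the commands package, the second uses the subprocess package. Both are built-in standard-library modules (commands exists only in Python 2).

Running shell commands with the built-in commands module

commands wraps Python's os.popen(): it takes a shell command string as its argument and returns the command's output together with its exit status.

This module is deprecated since Python 2.6 and was removed in Python 3; subprocess replaces it.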

          
# coding=utf-8
'''
Created on 2013-11-22

@author: crazyant.net
'''
import commands  # Python 2 only; the module was removed in Python 3
import pprint


def cmd_exe(cmd_String):
    print("will exe cmd,cmd:" + cmd_String)
    # Returns a (status, output) tuple
    return commands.getstatusoutput(cmd_String)


if __name__ == "__main__":
    pprint.pprint(cmd_exe("ls -la"))
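
Since commands no longer exists in Python 3, here is a minimal sketch of the same helper for Python 3, where getstatusoutput lives in the subprocess module. The echo command is just a stand-in chosen so the example has predictable output.

```python
import subprocess


def cmd_exe(cmd_string):
    """Run a shell command string and return (exit_status, output)."""
    print("will exe cmd, cmd: " + cmd_string)
    # The command is run through the shell; the trailing newline of the
    # output is stripped.
    return subprocess.getstatusoutput(cmd_string)


status, output = cmd_exe("echo hello")
print(status, output)
```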

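Running shell commands with the newer subprocess module

The subprocess module is meant to replace os.system, os.spawn*, os.popen*, popen2.* and commands.* for running external commands, and it is the recommended approach.

subprocess lets you create child processes. When creating one you can specify its stdin, stdout and stderr pipes, and after it has run you can collect its output and exit status.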

        
# coding=utf-8
'''
Created on 2013-11-22

@author: crazyant.net
'''
import shlex
import datetime
import subprocess
import time


def execute_command(cmdstring, cwd=None, timeout=None, shell=False):
    """Run a shell command.

    Wraps subprocess.Popen and adds a timeout check. The child's stdout
    and stderr are not redirected, so they appear on the screen.

    Args:
        cmdstring: the command line to run
        cwd: working directory; if set, the child changes to it before running
        timeout: timeout in seconds; fractions are allowed, resolution is 0.1 s
        shell: whether to run the command through the shell

    Returns: return_code (as a string)
    Raises: Exception: the command timed out
    """
    if shell:
        cmdstring_list = cmdstring
    else:
        cmdstring_list = shlex.split(cmdstring)
    if timeout:
        end_time = datetime.datetime.now() + datetime.timedelta(seconds=timeout)

    # No pipes are given for stdout and stderr, so output goes to the screen
    sub = subprocess.Popen(cmdstring_list, cwd=cwd, stdin=subprocess.PIPE,
                           shell=shell, bufsize=4096)

    # Popen.poll() checks whether the child has finished; if it has, the
    # return code is set and stored in Popen.returncode
    while sub.poll() is None:
        time.sleep(0.1)
        if timeout:
            if end_time <= datetime.datetime.now():
                sub.kill()  # stop the child so it is not left running
                raise Exception("Timeout: %s" % cmdstring)

    return str(sub.returncode)


if __name__ == "__main__":
    print(execute_command("ls"))
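
On Python 3 the polling loop above is not strictly needed, because subprocess.run accepts a timeout argument itself. The following is a sketch under that assumption, not a drop-in replacement: the function name run_with_timeout is made up, it takes an argument list rather than a command string, and it uses the current interpreter as the child process so the example does not depend on ls being available.

```python
import subprocess
import sys


def run_with_timeout(args, timeout=None, cwd=None):
    """Run a command given as an argument list and return its exit code.

    Raises subprocess.TimeoutExpired if the command runs longer than
    `timeout` seconds; subprocess.run kills the child before re-raising.
    """
    completed = subprocess.run(args, cwd=cwd, timeout=timeout)
    return completed.returncode


# A child that exits immediately finishes well inside the limit.
quick = run_with_timeout([sys.executable, "-c", "pass"], timeout=5)

# A child that sleeps for 5 seconds exceeds a 0.5 second limit.
try:
    run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
    )
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True

print(quick, timed_out)
```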

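You can also point Popen's stdout (and stderr) at a pipe such as subprocess.PIPE, so that the command's output can be read straight into a variable.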
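
As a rough illustration of capturing output through pipes, here is a short Python 3 sketch. The helper name run_and_capture is hypothetical, and the child command is the current interpreter printing one word, chosen only to keep the example self-contained.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return (return_code, stdout_text, stderr_text)."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode the pipes to str instead of bytes
    )
    # communicate() reads both pipes to the end and waits for the child,
    # which avoids the deadlock a full pipe buffer can cause when only
    # poll() is used.
    out, err = proc.communicate()
    return proc.returncode, out, err


code, out, err = run_and_capture([sys.executable, "-c", "print('hello')"])
print(code, repr(out), repr(err))
```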

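Summary

Running shell commands from Python is sometimes unavoidable, for example when using Python threads to launch several shell processes. subprocess is the approach recommended by the official Python documentation and offers the most features, so it is the one to use.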
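
The thread use case mentioned above could look roughly like this in Python 3. This is a sketch, not the author's code: the job list is invented, each job is the current interpreter exiting with a known code, and concurrent.futures.ThreadPoolExecutor is used as one convenient way to manage the threads.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(args):
    """Run one command and return (args, return_code)."""
    return args, subprocess.call(args)


# Hypothetical job list: each entry is the argument list of one child process.
jobs = [
    [sys.executable, "-c", "import sys; sys.exit(%d)" % n] for n in range(3)
]

with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as `jobs`.
    results = list(pool.map(run_one, jobs))

codes = [code for _args, code in results]
print(codes)
```

Each thread spends its time waiting for its child process, so the commands run side by side even though the threads themselves are ordinary Python threads.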

When reposting, please credit the source: http://www.crazyant.net/1319.html

Posted in: python, shell Tagged: python, shell

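Commonly used date functions wrapped in Python

2013/10/12 by Crazyant

When processing log data you often need to do date arithmetic, such as adding days to a date, finding the number of days between two dates, or finding the weekday of a date. This article collects a few commonly used Python date helper functions and is updated over time.

The code follows (file name DateUtil.py); each function's purpose is described in its docstring: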

        
# -*- encoding:utf8 -*-
'''
@author: crazyant
@version: 2013-10-12
'''
import datetime
import time

# Date formats; adjust as needed, for example to "%Y年%m月%d日"
format_date = "%Y-%m-%d"
format_datetime = "%Y-%m-%d %H:%M:%S"


def getCurrentDate():
    '''
    Return the current date as a string such as 2013-09-10
    '''
    return time.strftime(format_date, time.localtime(time.time()))


def getCurrentDateTime():
    '''
    Return the current date and time as a string such as 2013-09-10 11:22:11
    '''
    return time.strftime(format_datetime, time.localtime(time.time()))


def getCurrentHour():
    '''
    Return the current hour as a two-character string, e.g. "16" at 4 p.m.
    '''
    currentDateTime = getCurrentDateTime()
    return currentDateTime[-8:-6]


def getDateElements(sdate):
    '''
    Parse a date string into a struct holding the date's components.
    Input: 2013-09-10 or 2013-09-10 22:11:22
    Example: "2013-04-01 21:22:33" returns time.struct_time(tm_year=2013,
    tm_mon=4, tm_mday=1, tm_hour=21, tm_min=22, tm_sec=33, tm_wday=0,
    tm_yday=91, tm_isdst=-1)
    Returns None if the string matches neither format.
    '''
    dateFormat = judgeDateFormat(sdate)
    if dateFormat == 0:
        return None
    elif dateFormat == 1:
        dformat = format_date
    else:
        dformat = format_datetime
    return time.strptime(sdate, dformat)


def getDateToNumber(date1):
    '''
    Strip the hyphens, colons and spaces from a date string:
    input 2013-04-05 returns 20130405
    input 2013-04-05 22:11:23 returns 20130405221123
    '''
    return date1.replace("-", "").replace(":", "").replace(" ", "")


def judgeDateFormat(datestr):
    '''
    Return 1 if datestr is in "%Y-%m-%d" format, 2 if it is in
    "%Y-%m-%d %H:%M:%S" format, and 0 otherwise.
    Parameter datestr: the date string
    '''
    try:
        datetime.datetime.strptime(datestr, format_date)
        return 1
    except (ValueError, TypeError):
        pass

    try:
        datetime.datetime.strptime(datestr, format_datetime)
        return 2
    except (ValueError, TypeError):
        pass

    return 0


def minusTwoDate(date1, date2):
    '''
    Subtract date2 from date1 and return the datetime.timedelta object.
    Its days, seconds and microseconds attributes can be read directly.
    Returns None if either string is not a valid date.
    '''
    d1Elements = getDateElements(date1)
    d2Elements = getDateElements(date2)
    if not d1Elements or not d2Elements:
        return None
    # The first six fields are year, month, day, hour, minute, second
    d1 = datetime.datetime(*d1Elements[:6])
    d2 = datetime.datetime(*d2Elements[:6])
    return d1 - d2


def dateAddInDays(date1, addcount):
    '''
    Add a number of days to a date (or subtract them) and return the new date.
    Parameter date1: the date to start from
    Parameter addcount: the number of days, e.g. 1, 2, 3, -1, -2, -3;
    negative values subtract
    '''
    try:
        addtime = datetime.timedelta(days=int(addcount))
        d1Elements = getDateElements(date1)
        d1 = datetime.datetime(d1Elements.tm_year, d1Elements.tm_mon, d1Elements.tm_mday)
        datenew = d1 + addtime
        return datenew.strftime(format_date)
    except Exception as e:
        print(e)
        return None


def is_leap_year(pyear):
    '''
    Return True if the given year is a leap year
    '''
    try:
        # February 29 only exists in leap years
        datetime.datetime(pyear, 2, 29)
        return True
    except ValueError:
        return False


def dateDiffInDays(date1, date2):
    '''
    Return the number of days between two dates: positive if date1 is
    later than date2, negative otherwise. Returns None on invalid input.
    '''
    minusObj = minusTwoDate(date1, date2)
    if minusObj is None:
        return None
    return minusObj.days


def dateDiffInSeconds(date1, date2):
    '''
    Return the number of seconds between two dates, or None on invalid input
    '''
    minusObj = minusTwoDate(date1, date2)
    if minusObj is None:
        return None
    return minusObj.days * 24 * 3600 + minusObj.seconds


def getWeekOfDate(pdate):
    '''
    Return the day of the week for a date as a number from 0 to 6,
    where 0 means Sunday
    '''
    pdateElements = getDateElements(pdate)
    # tm_wday counts from Monday = 0, so shift by one and wrap Sunday to 0
    weekday = int(pdateElements.tm_wday) + 1
    if weekday == 7:
        weekday = 0
    return weekday


if __name__ == "__main__":
    # Some test code
    print(judgeDateFormat("2013-04-01"))
    print(judgeDateFormat("2013-04-01 21:22:33"))
    print(judgeDateFormat("2013-04-31 21:22:33"))
    print(judgeDateFormat("2013-xx"))
    print("--")
    print(datetime.datetime.strptime("2013-04-01", "%Y-%m-%d"))
    print('elements')
    print(getDateElements("2013-04-01 21:22:33"))
    print('minus')
    print(minusTwoDate("2013-03-05", "2012-03-07").days)
    print(dateDiffInSeconds("2013-03-07 12:22:00", "2013-03-07 10:22:00"))
    print(type(getCurrentDate()))
    print(getCurrentDateTime())
    print(dateDiffInSeconds(getCurrentDateTime(), "2013-06-17 14:00:00"))
    print(getCurrentHour())
    print(dateAddInDays("2013-04-05", -5))
    print(getDateToNumber("2013-04-05"))
    print(getDateToNumber("2013-04-05 22:11:33"))
    print(getWeekOfDate("2013-10-01"))
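
For comparison, several of these helpers can be written more compactly on top of datetime.date objects. The following is a sketch for Python 3 under the assumption that only the "%Y-%m-%d" format is needed; the function names are made up and are not part of DateUtil.py.

```python
import calendar
import datetime

FORMAT_DATE = "%Y-%m-%d"


def parse_date(text):
    """Parse a 'YYYY-MM-DD' string into a datetime.date."""
    return datetime.datetime.strptime(text, FORMAT_DATE).date()


def add_days(text, count):
    """Return the date `count` days after `text`; negative counts go back."""
    new_date = parse_date(text) + datetime.timedelta(days=count)
    return new_date.strftime(FORMAT_DATE)


def diff_in_days(text1, text2):
    """Days from text2 to text1; positive when text1 is the later date."""
    return (parse_date(text1) - parse_date(text2)).days


def week_of_date(text):
    """Weekday number from 0 to 6, where 0 means Sunday."""
    # isoweekday() gives Monday=1 .. Sunday=7, so modulo 7 maps Sunday to 0.
    return parse_date(text).isoweekday() % 7


print(add_days("2013-04-05", -5))
print(diff_in_days("2013-03-05", "2012-03-07"))
print(week_of_date("2013-10-01"))
print(calendar.isleap(2012))
```

Unlike the helpers above, these versions raise ValueError on a malformed date string instead of returning None, and the leap-year check uses the standard library's calendar.isleap.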

When reposting, please credit the source: http://www.crazyant.net/1309.html
