1. Redirecting standard output to a file
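import sys
output = sys.stdout  # temporary variable that keeps the original stdout
outputfile = open('wj_file', 'w')  # open the target file
sys.stdout = outputfile  # assign the file to standard output
print("hello")  # "hello" is written to the file, not the console
sys.stdout = output  # restore standard output before closing the file
outputfile.close()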
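The save/replace/restore pattern above leaves stdout pointing at the file if an exception is raised halfway through. A minimal sketch of one way around that, assuming Python 3.4 or later where `contextlib.redirect_stdout` is available; the function name `write_hello` is only illustrative and is not part of the original script.

```python
import contextlib
import os
import sys
import tempfile


def write_hello(path):
    """Write "hello" to `path` by temporarily redirecting sys.stdout."""
    with open(path, "w") as outputfile:
        # redirect_stdout swaps sys.stdout for the duration of the block and
        # restores it afterwards, even if the block raises an exception.
        with contextlib.redirect_stdout(outputfile):
            print("hello")


if __name__ == "__main__":
    target = os.path.join(tempfile.gettempdir(), "wj_file")
    write_hello(target)
    print("stdout is back on the console")
```

Both `with` blocks clean up after themselves, so there is no separate close or restore step to forget.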
2. Running work in multiple threads
import sys
import threading

thread_list = [None] * 10  # 10 slots, so at most 10 threads run at once

def crawl_one(body, num):
    # the work to run concurrently goes here
    return

num = 0
for line in sys.stdin:
    num += 1
    body = line
    i = num % 10
    if thread_list[i] is not None:
        thread_list[i].join()  # wait for the previous thread in this slot
    new_thread = threading.Thread(target=crawl_one, args=(body, num))
    new_thread.daemon = True
    thread_list[i] = new_thread
    new_thread.start()

for t in thread_list:
    if t is not None:  # slots stay empty when there are fewer than 10 input lines
        t.join()  # make sure every thread has finished
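The slot-based bookkeeping above exists only to cap the number of live threads at 10. A thread pool gives the same cap with less code. This is a minimal sketch, assuming Python 3 and the standard `concurrent.futures` module; `crawl_one` here is a stand-in that just returns its cleaned-up input, and `run_all` is a hypothetical helper, not part of the original script.

```python
import concurrent.futures


def crawl_one(body, num):
    """Placeholder for the real work; returns its inputs so results can be checked."""
    return num, body.strip()


def run_all(lines, max_workers=10):
    """Run crawl_one on every line with at most `max_workers` threads at once."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(crawl_one, line, num)
            for num, line in enumerate(lines, start=1)
        ]
        # result() blocks until that task is done and re-raises any
        # exception that was raised in the worker thread.
        return [f.result() for f in futures]


if __name__ == "__main__":
    for num, body in run_all(["a\n", "b\n", "c\n"]):
        print(num, body)
```

One difference from the loop above: this version queues every line before collecting results, so for very large inputs it holds all pending tasks in memory.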
3. Using the database module
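import hashlib
import MySQLdb

# host may be 'localhost' or an IP address; replace 'db_name' with your database.
# unix_socket='...' is an optional extra argument.
conn = MySQLdb.connect(host='localhost', user='root', passwd='root', db='db_name', charset='utf8')
curs = conn.cursor()
sql = "SELECT * FROM all_url WHERE url_hash = %s"
# name is the string to look up, defined elsewhere; it must be a byte string
# (in Python 3, pass name.encode('utf-8') instead)
line_hash = hashlib.md5(name).hexdigest()
# looking up by the hash of name is faster than comparing the full string
# (assuming url_hash is indexed)
vals = (line_hash,)  # parameters must be a tuple
curs.execute(sql, vals)
for line in curs.fetchall():
    print(line)
curs.close()  # close the cursor
conn.commit()  # commit pending changes (only needed after writes)
conn.close()  # close the database connection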
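For more detailed database operations I followed another blog post, which covers them thoroughly: "python下的MySQLdb使用" (Using MySQLdb in Python).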
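The lookup-by-hash pattern can be tried without a MySQL server. This is a minimal sketch, assuming Python 3 and using the standard-library `sqlite3` module as a stand-in for MySQLdb. Note that sqlite3 uses `?` as its placeholder where MySQLdb uses `%s`. The two-column layout of `all_url` and the helper names `url_hash`, `find_by_name`, and `make_demo_db` are assumptions made for illustration.

```python
import hashlib
import sqlite3


def url_hash(name):
    """Return the hex MD5 digest of `name`; hashlib needs bytes in Python 3."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def find_by_name(conn, name):
    """Look up rows whose url_hash matches the hash of `name`."""
    curs = conn.cursor()
    try:
        # The value is passed separately from the SQL text, so the driver
        # handles quoting; the query is never built by string formatting.
        curs.execute("SELECT * FROM all_url WHERE url_hash = ?", (url_hash(name),))
        return curs.fetchall()
    finally:
        curs.close()


def make_demo_db():
    """Build an in-memory table; the two-column schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE all_url (url_hash TEXT PRIMARY KEY, url TEXT)")
    for url in ("http://example.com/a", "http://example.com/b"):
        conn.execute("INSERT INTO all_url VALUES (?, ?)", (url_hash(url), url))
    conn.commit()  # a commit is needed after writes, not after a SELECT
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_by_name(conn, "http://example.com/a"))
    conn.close()
```

Making `url_hash` the primary key means the lookup goes through an index on a short, fixed-length value, which is the reason for hashing the name in the first place.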
4. Pythonic: Python's programming style
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>