Problems Encountered While Learning Python

lming_08

Article link: https://blog.csdn.net/lming_08/article/details/37901643

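1.A tuple is immutable: its items cannot be reassigned after creation.

2.The print statement appends a newline automatically (in Python 2, a trailing comma suppresses it).

3.The difference between write and writelines: write takes a single string, while writelines takes a sequence of strings (sequence_of_strings) and writes them back to back without adding line separators.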
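A minimal sketch of the write/writelines difference, written for Python 3 (the snippets in this post are Python 2). The helper name demo_write_vs_writelines and the temporary file are assumptions made only for illustration.

```python
import os
import tempfile


def demo_write_vs_writelines():
    """Write the same lines with write() and writelines(), then read them back."""
    lines = ["a\n", "b\n", "c\n"]
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("".join(lines))   # write() takes one string
            f.writelines(lines)       # writelines() takes an iterable of strings
            f.writelines(["x", "y"])  # no separator is inserted between items
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)


content = demo_write_vs_writelines()
```

Because writelines adds nothing between items, each string has to carry its own newline if one is wanted.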

4.How do you create a two-dimensional list?

>>> lst1 = [[0]*5]*5
>>> lst1
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

>>> lst2 = [[0 for i in range(5)] for j in range(5)]
>>> lst2
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

Do the two approaches produce the same result? At first glance they look identical, but they are not.

>>> lst1[0][0]=1
>>> lst1
[[1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0]]

>>> lst2[0][0]=1
>>> lst2
[[1, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

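As this shows, lst1 = [[0]*5]*5 allocates only one row; the other four rows are references to that same list object, so an assignment through any row shows up in every row.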
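The aliasing can be checked directly with the is operator. The following is a small sketch in Python 3; the names shared and independent and the 3x4 size are chosen here for illustration only.

```python
rows, cols = 3, 4

# One inner list, referenced `rows` times.
shared = [[0] * cols] * rows
# A fresh inner list is built for every row.
independent = [[0] * cols for _ in range(rows)]

# Every row of `shared` is the very same object.
shared_same = all(row is shared[0] for row in shared)

# No two rows of `independent` are the same object.
independent_same = any(
    independent[i] is independent[j]
    for i in range(rows)
    for j in range(i + 1, rows)
)

# Writing through one row only touches that row.
independent[0][0] = 1
```

Multiplying the inner list ([0] * cols) is safe because the integers are immutable; only the outer multiplication copies references to a mutable object.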

5.KeyError: 0

This usually means the code accessed dict_[key] directly while key was not in the dictionary dict_.
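Two common ways to avoid the exception are dict.get with a default and a membership test. A short sketch, with a made-up scores dictionary:

```python
scores = {"alice": 90}  # hypothetical data

# Direct indexing raises KeyError for a missing key.
try:
    scores["bob"]
except KeyError as exc:
    missing = exc.args[0]

# get() returns the supplied default instead of raising.
default_score = scores.get("bob", 0)

# A membership test lets the caller branch explicitly.
has_bob = "bob" in scores
```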

6.IOError: [Errno 29] Illegal seek

import sys

fd = open("somefile")
fd.seek(0)         # OK: a regular file is seekable
sys.stdin.seek(0)  # fails when stdin is not seekable (e.g. a pipe): IOError: [Errno 29] Illegal seek
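In Python 3, file objects expose seekable(), so a stream can be checked before seeking. The sketch below assumes Python 3; try_rewind is a hypothetical helper, and an OS pipe stands in for a piped stdin.

```python
import io
import os


def try_rewind(stream):
    """Seek to the start if the stream supports it; report whether it did."""
    if stream.seekable():
        stream.seek(0)
        return True
    return False


# An in-memory buffer is seekable.
buf = io.StringIO("hello")
buf.read()
rewound_buffer = try_rewind(buf)

# The read end of a pipe is not seekable, much like a piped stdin.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as pipe_reader:
    rewound_pipe = try_rewind(pipe_reader)
os.close(write_fd)
```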

7.UnboundLocalError: local variable 'x' referenced before assignment

#!/usr/bin/python

x = 1

def main():
#   global x
    x += 1
    print x

if __name__ == "__main__":
    main()

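The code above raises the error in the heading. Because main() assigns to x (x += 1), Python treats x as a local variable throughout the function body; x += 1 must then read that local x before it has been given a value, which raises UnboundLocalError. A function can read a global variable without any declaration, but to assign to one it must first declare it with global x. This holds for any assignment inside a function, whatever the type of the value. This Stack Overflow thread discusses global variables in Python in some detail: http://stackoverflow.com/questions/21456739/unboundlocalerror-local-variable-l-referenced-before-assignment-python.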
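The failing case and two ways around it can be put side by side. This is a sketch in Python 3 with illustrative names (counter, broken_increment, global_increment, pure_increment), not the code from the post.

```python
counter = 1


def broken_increment():
    # The assignment makes `counter` local to this function,
    # so it is read here before it has a value.
    counter += 1
    return counter


def global_increment():
    global counter  # assignments now target the module-level name
    counter += 1
    return counter


def pure_increment(value):
    # Alternative without global state: take the value, return the new one.
    return value + 1


try:
    broken_increment()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

after_global = global_increment()
after_pure = pure_increment(after_global)
```

Passing the value in and returning the result is often easier to test than relying on global.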

8.IOError: [Errno 32] Broken pipe

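This error usually comes from using pipes on Linux. It occurs when the process at the reading end of the pipe exits early while the writing process still has output to send; in practice it shows up when the output is large and exceeds the default buffer size.

For example: cat test.txt | python test.py | head

Here head is the cause, because it exits after reading its first few lines. One fix is: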

import signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)

I hit this error while testing a Hadoop Streaming script locally; despite the error, the same script ran correctly on Hadoop.

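9.A quick way to count with a dictionary: dic[key] = dic.get(key, default) + 1, where default is the starting count (normally 0). The advantage is that you do not need to check by hand whether key is already in dic.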
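A brief sketch of the counting idiom, next to collections.Counter from the standard library, which does the same job. The words list is made up for the example.

```python
from collections import Counter

words = ["a", "b", "a", "c", "a"]  # hypothetical input

# The get() idiom: a missing key counts as 0.
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Counter performs the same tally in one call.
counter = Counter(words)
```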

10.ValueError: (75, 'Value too large for defined data type')

The value is outside the representable range, for example:

>>> time.gmtime(2**55)
time.struct_time(tm_year=1141709097, tm_mon=6, tm_mday=13, tm_hour=6, 
tm_min=26, tm_sec=8, tm_wday=6, tm_yday=164, tm_isdst=0)

>>> time.gmtime(2**56)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: (75, 'Value too large for defined data type')

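Some expressions may need a little mathematical rearrangement to stay in range. For example, the average avg = (x1+x2+...+xn)/n can be computed as

avg = x1/n + x2/n +...+xn/n

so that the full sum never has to be formed (use floating-point division here, since integer division would truncate every term).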
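The rearrangement matters mostly for floats, since Python integers do not overflow. A small sketch with two deliberately huge values, chosen here for illustration, shows the difference:

```python
import math

values = [1e308, 1e308]  # each value fits in a float, their sum does not
n = len(values)

# Summing first overflows to infinity.
naive = sum(values) / n

# Dividing each term first keeps every intermediate result in range.
scaled = sum(v / n for v in values)
```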

10.AttributeError: 'module' object has no attribute 'urlparse'

File "urlanalysis_nb.py", line 46, in <module>
domain = urlparse.urlparse(url)[1]
AttributeError: 'module' object has no attribute 'urlparse'

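Running the same statement by itself in the interactive interpreter gave no error, and I spent a long time without finding the cause. I finally found the answer at http://lovesoo.org/python-script-error-attributeerror-module-object-has-no-attribute-solve-method.html : the directory containing my urlanalysis_nb.py script also held a urlparse.py file I had written myself, along with its compiled urlparse.pyc, so import urlparse picked up my file instead of the standard-library module. Deleting both files fixed the problem.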
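When a module seems to be missing attributes it should have, printing where it was loaded from usually exposes this kind of shadowing. A sketch in Python 3; module_origin is a hypothetical helper, and json is used only as a stand-in for the module under suspicion.

```python
import importlib


def module_origin(name):
    """Return the file a module was loaded from, or None if it has no file."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


# If this path points into your own project directory rather than the
# standard library, a local file is shadowing the real module.
json_origin = module_origin("json")
```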

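11.When splitting a string on a character, I found it safer to write the separator in single quotes: with a Unicode string, a double-quoted separator did not work for me. (Single-quoted and double-quoted literals create identical strings in Python, so the quoting itself should not matter; the failure presumably had another cause, such as mixing byte strings and Unicode strings.)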

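12.Today, while writing a Spark program, I split a string on the '^A' character and the split failed, although the same split worked in a standalone Python program. I spent the whole afternoon without finding the cause. That evening I finally found a fix: split on '\x01' instead. '^A' is how editors and terminals display the control character '\x01', so the two should be equivalent, and splitting on '\x01' worked, but I still do not understand why splitting on '^A' failed. One possible explanation is that the '^A' in the Spark script was the two literal characters ^ and A rather than the single control character. (2015/01/20 21:43)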
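The difference between the two-character text "^A" and the single control character is easy to check. In this sketch the record layout is invented for illustration, and whether this was the actual cause in the Spark job is only a guess.

```python
caret_form = "^A"      # two ordinary characters: '^' and 'A'
control_char = "\x01"  # one character, code point 1 (Ctrl-A)

# A hypothetical record whose fields are separated by Ctrl-A.
record = "id" + control_char + "name" + control_char + "price"

fields_ok = record.split(control_char)  # splits into three fields
fields_bad = record.split(caret_form)   # separator never occurs, no split
```

Writing the separator as the escape '\x01' avoids depending on how an editor stores or displays the control character.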

13.UnicodeEncodeError: 'ascii' codec can't encode characters in position 157-163: ordinal not in range...

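This problem appeared in spark-Python; the failing code called str(val) on a Unicode value. A web search suggested the following (Python 2 only):

import sys
reload(sys)
sys.setdefaultencoding('utf8')

I tried it, but it had no effect and the error remained. I then replaced str(val) with "%s" % val, and the program ran correctly; in Python 2, formatting a unicode value with %s yields a unicode string instead of forcing an ASCII encode.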

14.TypeError: 'dict' object is not callable

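In Python this error usually means that a dictionary lookup was written as d(key) when it should be d[key].

Separately, in spark-Python, testing whether a key belongs to a dictionary with if key in d: ... also raised an error for me, whereas if key in d.keys(): ... did not. (For a plain dict the two tests give the same answer, and key in d is the usual form.)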
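For reference, a tiny sketch that reproduces the message and shows the correct lookup (Python 3; the dictionary d is made up):

```python
d = {"a": 1}  # hypothetical dictionary

# Parentheses try to call the dict like a function.
try:
    d("a")
except TypeError as exc:
    message = str(exc)

# Square brackets perform the lookup.
value = d["a"]
```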

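15.Avoid initializing several variables with a chained assignment, especially for mutable data structures such as lists.

After arr_score = arr_label = [], arr_score and arr_label refer to the same list object, much like item 4 above.

To initialize several variables on one line, use unpacking instead: var1, var2 = [0] * 2 sets both to 0, and arr_score, arr_label = [], [] creates two independent lists.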
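A short sketch contrasting the chained form with tuple unpacking; the single-letter names are placeholders.

```python
# Chained assignment: both names point at one list.
a = b = []
a.append(1)
aliased = a is b
b_snapshot = list(b)  # b sees the element appended through a

# Unpacking two separate literals: two independent lists.
c, d = [], []
c.append(1)
```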

16.Trying to modify a string in place (raises "TypeError: 'str' object does not support item assignment")

A string is an immutable data type:

>>> s = "abcdef"
>>> s[3]
'd'
>>> s[3] = 'q'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
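Since a string cannot be changed in place, the usual approach is to build a new one. Two common ways, sketched with the same value as the session above:

```python
s = "abcdef"

# Slice around the position and concatenate a new string.
replaced = s[:3] + "q" + s[4:]

# Or go through a list, which is mutable, and join it back.
chars = list(s)
chars[3] = "q"
via_list = "".join(chars)
```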

17.A problem when combining reduce with a lambda expression

a = [1, 6, 2]

reduce(lambda x,y: x+y, a)

b = [[1,3], [2,6], [2, 5]]

reduce(lambda x,y:x[1]*x[0]+y[1]*y[0], b)

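Here the call on a succeeds, but the call on b fails:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
TypeError: 'int' object has no attribute '__getitem__'

The error occurs because reduce passes the result of each call back in as x for the next call. The first call returns an int (1*3 + 2*6 = 15), so the second call evaluates x[1] on an int. The function given to reduce must return a value of the same kind as the accumulator it receives.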
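One way to make the b case work is to give reduce an initial value, so the accumulator is a number from the first call onward. A sketch in Python 3, where reduce lives in functools; the names acc and pair are illustrative.

```python
from functools import reduce

b = [[1, 3], [2, 6], [2, 5]]

# Start from 0 so the accumulator is always an int;
# each step adds the product of one pair.
total = reduce(lambda acc, pair: acc + pair[0] * pair[1], b, 0)

# The same result with a generator expression, often easier to read.
same_total = sum(x * y for x, y in b)
```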

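18.Error when printing Chinese text held in a unicode object

UnicodeEncodeError: 'ascii' codec can't encode characters in position 17-22: ordinal not in range(128)

The fix is:

print ch #fails
print ch.encode('ascii', 'ignore') # no error, but silently drops every non-ASCII character
In practice, print ch.encode('utf8', 'ignore')
also solves the problem and keeps the Chinese characters. Note that 'utf8' and 'utf-8' are aliases for the same codec in Python, so either spelling works.

http://stackoverflow.com/questions/5141559/unicodeencodeerror-ascii-codec-cant-encode-character-u-xef-in-position-0

Printing directly to stdout works, but redirecting the output to a file fails: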

Traceback (most recent call last):
File "bin/select_ad_item_info.py", line 36, in <module>
print itemid, sellerid, itemname, price, sold
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

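Adding this (Python 2 only)

import sys
reload(sys)
sys.setdefaultencoding('utf8')

makes the redirect to a file work correctly. The error appears only on redirection because Python 2 falls back to the ASCII codec when stdout is not a terminal.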
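The effect of the two encode calls discussed in this item can be compared directly. This sketch runs on Python 3 and uses a made-up string (two CJK characters written as escapes, followed by ASCII text).

```python
import codecs

ch = u"\u4ef7\u683c: 10"  # hypothetical text: two CJK characters, then ASCII

# 'ignore' with the ASCII codec drops everything that is not ASCII.
ascii_bytes = ch.encode("ascii", "ignore")

# UTF-8 can represent every character, so nothing is lost.
utf8_bytes = ch.encode("utf8")
roundtrip = utf8_bytes.decode("utf-8")

# Both spellings resolve to the same codec.
same_codec = codecs.lookup("utf8").name == codecs.lookup("utf-8").name
```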

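19.Python passes arguments as references to objects. An immutable argument (int, str, tuple) cannot be changed in place, and rebinding the parameter name inside the function does not affect the caller, so such arguments behave as if passed by value. To get the effect of pass-by-reference, return the new value and assign it back in the caller.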
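A minimal sketch of both behaviours, with hypothetical function names rebind and mutate:

```python
def rebind(n):
    # Rebinding the parameter only changes the local name.
    n = n + 1
    return n


def mutate(items):
    # An in-place change to a mutable object is visible to the caller.
    items.append(99)


count = 1
rebind(count)               # return value ignored: count is unchanged
count_after_ignore = count
count = rebind(count)       # assign the returned value back

data = [1]
mutate(data)
```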