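I spent two full days on an introductory course about using Python for statistics, and this post is my summary. The plan is to list code examples that show the basic features of Python, NumPy, and pandas that I learned. Where one function is enough to illustrate a point, I use one function rather than several, because that is easier to follow. I have only just finished the course and I am mentally tired, so this summary will come together slowly. I will share it once it is done.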
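Functions
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('flag is true')
    if z is None:
        return x + y
    else:
        return x + y + z
print(add_numbers(1, 2, flag=True))
Key points:
1) Function parameters in Python can have default values.
2) When calling a function that has default parameter values, you can name the parameter whose default you want to override.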
def add_numbers(x, y):
    return x + y
f = add_numbers
f(1, 2)
Key points:
1) A function can be assigned to a variable.
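Because a function is just a value, it can also be passed to another function or stored in a dictionary. The sketch below is only an illustration: multiply_numbers, apply_to_pair, and operations are hypothetical names that do not come from the course.

```python
def add_numbers(x, y):
    return x + y

def multiply_numbers(x, y):
    return x * y

def apply_to_pair(func, x, y):
    """Call whichever function was passed in (hypothetical helper)."""
    return func(x, y)

# Functions can be stored in a dict and looked up by name.
operations = {'add': add_numbers, 'multiply': multiply_numbers}

sum_result = apply_to_pair(add_numbers, 1, 2)
product_result = operations['multiply'](3, 4)
print(sum_result, product_result)
```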
Tuples, lists, dictionaries
x = (1, 'a', 2, 'b')
type(x)
# tuple
x = [1, 'a', 2, 'b']
x.append(3.3)
[1]*3
# [1, 1, 1]
[1,2] + [3,4]
# [1, 2, 3, 4]
print(x[1])
print(x[1:2])
print(x[-3:-1])
x = {'a':'b', 'c':'d'}
print(x['a'])
for e in x:
    print(x[e])
for v in x.values():
    print(v)
for e,v in x.items():
    print(e)
    print(v)
Key points:
1) Python tuples, lists, and dictionaries can hold values of different types.
2) Some basic list operations, such as append(), *, and +.
3) How to iterate over a dictionary's keys, values, and items.
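To make the difference between indexing and slicing concrete, here is a small sketch that reuses the list from above. The variable names (single_item, contacts, pairs, and so on) are my own and only serve the illustration.

```python
x = [1, 'a', 2, 'b']
x.append(3.3)                # x is now [1, 'a', 2, 'b', 3.3]

single_item = x[1]           # indexing returns the element itself
one_item_slice = x[1:2]      # slicing returns a list, even for one element
negative_slice = x[-3:-1]    # from third-from-last up to, not including, the last

contacts = {'a': 'b', 'c': 'd'}
# items() yields (key, value) tuples in insertion order (Python 3.7+).
pairs = [(key, value) for key, value in contacts.items()]
print(single_item, one_item_slice, negative_slice, pairs)
```

Note that a slice is written with a colon; x[-3, -1] with a comma raises a TypeError on a list.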
String formatting
sales_statement = '{} bought {} item(s) at a price of {} each for a total of {}'
print(sales_statement.format('Chris', 4, 3.24, 4 * 3.24))
Key points:
1) How to print variable values inside a string, as you would in Java or C++.
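The same sentence can also be built with an f-string (Python 3.6+), which puts the expressions directly in the text. This is an alternative I am adding for comparison, not something taken from the course; the ".2f" format spec rounds the displayed value to two decimal places.

```python
name, quantity, price = 'Chris', 4, 3.24

# Expressions inside {} are evaluated; :.2f formats a float with 2 decimals.
statement = (
    f'{name} bought {quantity} item(s) at a price of {price:.2f} each '
    f'for a total of {quantity * price:.2f}'
)
print(statement)
```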
Reading CSV files
import csv
%precision 2  # IPython magic: display floats with 2 decimal places
with open('mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile))
mpg[0].keys()
# dict_keys(['mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model_year', 'origin', 'name'])
sum(float(d['acceleration']) for d in mpg)
# 6196.10
Key points:
1) How to read each row of a CSV file as a dictionary.
2) How to access the contents of the dictionaries that were read.
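Once every row is a dictionary, grouping and averaging take only a few lines. The sketch below uses three made-up rows held in an io.StringIO object so that it runs without mpg.csv; the sample values are hypothetical and are not taken from the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real mpg.csv has more columns and rows.
sample = io.StringIO(
    "mpg,cylinders,name\n"
    "18.0,8,car a\n"
    "24.0,4,car b\n"
    "26.0,4,car c\n"
)
rows = list(csv.DictReader(sample))

# DictReader returns every value as a string, so convert before doing math.
mpg_by_cylinders = defaultdict(list)
for row in rows:
    mpg_by_cylinders[row['cylinders']].append(float(row['mpg']))

average_mpg = {cyl: sum(values) / len(values)
               for cyl, values in mpg_by_cylinders.items()}
print(average_mpg)
```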
Dates and times
import datetime as dt
import time as tm
tm.time()
# 1605049825.93
dtnow = dt.datetime.fromtimestamp(tm.time())
dtnow
# datetime.datetime(2020, 11, 10, 15, 11, 18, 593011)
dtnow.year, dtnow.month
# (2020, 11)
today = dt.date.today()
today
# datetime.date(2020, 11, 10)
Key points:
1) How to work with timestamps and dates.
2) How to convert between timestamps and dates.
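The conversion also works in the other direction, and dates support arithmetic through timedelta. This is a minimal sketch with a fixed datetime so the result is reproducible; the variable names are my own, and the naive datetime is interpreted in the local time zone.

```python
import datetime as dt

moment = dt.datetime(2020, 11, 10, 15, 11, 18)

# datetime -> timestamp (seconds since the epoch) -> datetime again.
stamp = moment.timestamp()
round_trip = dt.datetime.fromtimestamp(stamp)

# Subtracting a timedelta from a date gives another date.
delta = dt.timedelta(days=100)
earlier = moment.date() - delta
print(round_trip, earlier)
```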
Objects
class Person:
    department = 'school of information'
    def set_name(self, new_name):
        self.name = new_name
Key points:
1) How to define a class with its attributes and methods.
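Using the class looks like the sketch below. It shows that department is shared by every instance, while name exists only on an instance after set_name() has been called. The instance names and the example name string are my own choices.

```python
class Person:
    department = 'school of information'   # class attribute, shared by all instances

    def set_name(self, new_name):
        self.name = new_name                # instance attribute, created on first call

person = Person()
person.set_name('Christopher Brooks')
other = Person()                            # set_name() never called on this one

print(person.name, '|', person.department)
print(hasattr(other, 'name'))
```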
The map function
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
cheapest = map(min, store1, store2) # lazy execution
list(cheapest)
# [9.00, 11.00, 12.34, 2.01]
Key points:
1) map is lazily evaluated: it returns an iterator and computes values only when they are requested.
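The laziness can be observed by pulling values out one at a time. In this sketch next() computes only the first pair, and once the iterator has been consumed it stays empty. The variable names first, remaining, and leftover are my own.

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2)   # nothing is computed yet

first = next(cheapest)                # computes min(10.00, 9.00) only
remaining = list(cheapest)            # computes the other three pairs
leftover = list(cheapest)             # the iterator is now exhausted

print(first, remaining, leftover)
```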
Lambda functions
my_function = lambda a, b, c : a + b
my_function(1, 2, 3)
# 3
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']
def split_title_and_name(person):
    return person.split()[0] + ' ' + person.split()[-1]
# option 1
for person in people:
    print(split_title_and_name(person) == (lambda person:person.split()[0] + ' ' + person.split()[-1])(person))
Key points:
1) How to write a lambda function.
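A lambda is most useful as a short argument to another function. The sketch below uses the same list of people with sorted() and map(); sorting by last name is my own example, not part of the course material.

```python
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
          'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

# Sort by the last word of each string (the family name).
by_last_name = sorted(people, key=lambda person: person.split()[-1])

# Keep only the title and the family name.
short_names = list(map(lambda p: p.split()[0] + ' ' + p.split()[-1], people))

print(by_last_name)
print(short_names)
```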
NumPy
import numpy as np
import math
# array creation
a = np.array([1, 2, 3])
print(a) # [1 2 3]
print(a.ndim) # 1
b = np.array([[1, 2, 3], [4, 5, 6]])
b
# array([[1, 2, 3],
#        [4, 5, 6]])
b.shape
# (2, 3)
b.dtype
# dtype('int32')  (platform-dependent; often int64)
d = np.zeros((2, 3))
print(d)
# [[0. 0. 0.]
#  [0. 0. 0.]]
np.random.rand(2,3)
# array([[0.74956529, 0.33373396, 0.19966149],
#        [0.35464706, 0.24963549, 0.31511589]])
f = np.arange(10, 50, 2)
f
# array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
#        44, 46, 48])
np.linspace(0, 2, 15)
# array([0.        , 0.14285714, 0.28571429, 0.42857143, 0.57142857,
#        0.71428571, 0.85714286, 1.        , 1.14285714, 1.28571429,
#        1.42857143, 1.57142857, 1.71428571, 1.85714286, 2.        ])
Key points:
1) How to create a NumPy array from a list.
2) How to show a NumPy array's number of dimensions, shape, and data type.
3) How to create all-zero arrays, random arrays, stepped arrays (arange), and evenly spaced arrays (linspace).
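The difference between arange and linspace is easy to mix up: arange takes a step and excludes the stop value, while linspace takes a number of points and includes both endpoints. A minimal sketch, assuming NumPy is installed; stepped, spaced, and gap are names I chose for the illustration.

```python
import numpy as np

stepped = np.arange(10, 50, 2)    # step of 2; the stop value 50 is excluded
spaced = np.linspace(0, 2, 15)    # 15 points; both 0 and 2 are included

# 15 points span 14 equal gaps, so each gap is 2 / 14.
gap = spaced[1] - spaced[0]

print(len(stepped), stepped[-1])
print(len(spaced), spaced[-1], gap)
```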
a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])
d = a * b
print(d)
# [ 10  40  90 160]
# boolean array
d > 50
# array([False, False,  True,  True])
A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])
print(A*B)
# [[2 0]
#  [0 4]]
print(A.sum())
print(A.max())
print(A.min())
print(A.mean())
# 3
# 1
# 0
# 0.75
Key points:
1) Array multiplication with * is elementwise.
2) Comparing an array with a value produces a boolean array.
3) The array methods sum, max, min, and mean.
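Two follow-ups to the points above, as a sketch that assumes NumPy is installed: the @ operator gives the matrix product (in contrast to the elementwise *), and a boolean array can be used as a mask to select elements. The names elementwise, matrix_product, and large_values are my own.

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

elementwise = A * B       # multiplies matching positions
matrix_product = A @ B    # rows of A times columns of B

d = np.array([10, 40, 90, 160])
large_values = d[d > 50]  # keeps only the entries where the mask is True

print(elementwise)
print(matrix_product)
print(large_values)
```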
b = np.arange(1, 16, 1).reshape(3, 5)
print(b)
# [[ 1  2  3  4  5]
#  [ 6  7  8  9 10]
#  [11 12 13 14 15]]
from PIL import Image
from IPython.display import display
im = Image.open('momo.JPG')
display