Big Data Hands-On Course: Python (1)

QuartusII7

Last modified 2023-08-17 23:26:06

Category: Big Data Python. Tags: big data, python, programming languages

First published 2023-08-13 19:09:05

Article link: https://blog.csdn.net/QuartusII7/article/details/132240505

Contents

1. Python environment setup and data types
- 1.1 Python environment setup
- 1.2 Python

1. Python environment setup and data types

1.1 Python environment setup

See the following articles:
https://zhuanlan.zhihu.com/p/510293011
https://www.cnblogs.com/yangjian319/p/8666527.html

1.2 python

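PyCharm integrated development environment:
1. An IDE developed by JetBrains
2. Download and install PyCharm Community Edition
3. Create a PyCharm project

1. Jupyter:

1. Enabling code completion: https://www.jianshu.com/p/c3c2bfbc3fa0
2. Using import: https://blog.csdn.net/weixin_40532940/article/details/99290668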

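2. Python statement conventions:

1) Indent with the Tab key; one Tab equals 4 spaces, and the convention is to indent with 4 spaces
2) Comments: # starts a single-line comment; a triple-quoted string, """like this""", is commonly used as a multi-line comment
Shortcut to run a cell: Shift+Enter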

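3. Common built-in functions:

Function	Description
type()	Returns the type of an object
dir()	Without an argument, returns the names (variables, functions, classes) defined in the current scope; with an argument, returns that object's attributes and methods
input(), print()	Read input; print output
id()	Returns the object's identity (its memory address in CPython)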

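4. Variables

Naming rules:
1) A name starts with a letter or _ and consists of letters, digits, and _
2) Names are case-sensitive
3) Python reserved keywords cannot be used as names; the standard keyword module lists them (keyword.kwlist)
Characteristics of variables:
1) No declaration is needed before use
2) A variable's type is not fixed
3) A variable is a reference to an actual value; id() tells whether two variables refer to the same object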
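
The points above can be checked in a few lines. This is a minimal sketch; the variable names are illustrative and not taken from the course material.

```python
import keyword

# Reserved words cannot be used as variable names.
print(keyword.kwlist[:5])
is_reserved = keyword.iskeyword("for")
is_free = keyword.iskeyword("total")

# No declaration is needed, and a name can be rebound to a value of another type.
value = 10
first_type = type(value)
value = "ten"
second_type = type(value)

# Two names bound to the same object share one id().
a = [1, 2, 3]
b = a              # b is another name for the same list
c = [1, 2, 3]      # c is an equal but separate list
same_object = id(a) == id(b)
separate_object = id(a) != id(c)
```

Because b and a refer to one list, a change made through either name is visible through the other, while c is unaffected.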

5. Numeric types

Type	Base	Examples
int	Decimal	123, 1_234_678
int	Octal	0O123, 0o_12_345
int	Hexadecimal	0x123, 0x_1_234_567, 0x_BAD_FACE
int	Binary	0b10, 0b_0011_1100
float	-	1.23, 1_2_3., .123, .1_2_3
float	Exponent notation	1.23e4, 1.2_3e-4, 0e0

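6. Numeric operators

Operator	Description
x + y	Addition
x - y	Subtraction
x * y	Multiplication
x / y	True division (a float result even for two ints)
x // y	Floor division (quotient rounded down)
x % y	Remainder
-x	Negation
abs(x)	Absolute value
int(x)	Convert to an integer
float(x)	Convert to a float
divmod(x, y)	Returns a tuple of quotient and remainder; for example divmod(5, 3) returns (1, 2), meaning 5 divided by 3 is 1 remainder 2
pow(x, y)	Returns x to the power y
x ** y	Returns x to the power y
round(x[, n])	Returns x rounded to n decimal places (to an integer if n is omitted); exact halves round to the nearest even value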
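
A short sketch of the operators in the table, including two cases that often surprise newcomers: floor division with negative numbers and how round() treats exact halves. The values are chosen only for illustration.

```python
x, y = 7, 2

true_div = x / y                  # 3.5
floor_div = x // y                # 3
remainder = x % y                 # 1
quotient_and_rem = divmod(x, y)   # (3, 1)
power = x ** y                    # 49, the same as pow(x, y)

# Floor division rounds toward negative infinity, not toward zero.
neg_floor = -7 // 2               # -4
neg_mod = -7 % 2                  # 1

# round() sends exact halves to the nearest even integer.
half_to_even_low = round(2.5)     # 2
half_to_even_high = round(3.5)    # 4
two_places = round(3.14159, 2)    # 3.14
```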

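7. list

1. Characteristics of lists:
2. Using lists:
Create a list: num = ['one', 2, [3, 4], (5, 6)]; 'one' is at num[0], 2 is at num[1], and num[-1] is (5, 6)
Access items by index: x[0], x[2], x[-1], x[-3]
Test whether a value is in a list: in and not in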

8. tuple
https://blog.csdn.net/MicalChen/article/details/120240825

1. Characteristics of tuples:
2. Using tuples:
Create:
t1 = (1, 2, 3, 4)
t2 = 'one', 2, [3, 4], (5, 6)
t3 = tuple([1, 2, 3])
Access items by index: x[0], x[-3]
Test whether a value is in a tuple: in and not in

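9. Operations on lists and tuples

1. Slicing returns a new list or tuple
[start : end : step]: start is the starting index (indices begin at 0, and -1 refers to the last element), end is the ending index, and step is the stride. A positive step takes values left to right; a negative step takes them in reverse. In both directions start is inclusive and end is exclusive.

For the list x = [1,2,3,4,5,6,7,8,9,0]

Slice	Result
x[1:3]	[2, 3]
x[-3:-1]	[8, 9]
x[:4]	[1, 2, 3, 4]
x[6:]	[7, 8, 9, 0]
x[1:6:2]	[2, 4, 6]
x[-8:-1:3]	[3, 6, 9]
x[6:1:-2]	[7, 5, 3]
x[-1:-8:-3]	[0, 7, 4]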
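
The slicing rules above can be tried directly. The sketch below also shows two things the table does not: a slice is a new object, and tuples slice the same way as lists. The names head, t, and tail are illustrative.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

forward = x[1:6:2]        # [2, 4, 6]
backward = x[6:1:-2]      # [7, 5, 3]
reversed_copy = x[::-1]   # the whole list in reverse order

# A slice is a new list; changing it leaves the original untouched.
head = x[:4]
head[0] = 100

# Tuples slice the same way and produce tuples.
t = (1, 2, 3, 4, 5)
tail = t[-2:]             # (4, 5)
```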

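10. Iterating over the elements of a list or tuple

for v in x:
    print(v)

The zip() function
Takes iterables as arguments, packs the corresponding elements into tuples, and returns an iterator over those tuples. It stops at the shortest input.

matrix = [[1,2,3,4],[5,6,7,8],[9,10,11]]
'''
zip(*matrix)
Output: <zip at 0x29a4e2b2ac8>
'''
list(zip(*matrix))
# result: [(1, 5, 9), (2, 6, 10), (3, 7, 11)]
This is equivalent to lining up the rows and reading down the columns; the fourth column is dropped because the last row has only three items:
[1,2,3,4]
[5,6,7,8]
[9,10,11]

t = (1,2,3,4)
list(zip(t))
# result: [(1,), (2,), (3,), (4,)]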
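
Beyond transposing a matrix, zip() is most often used to walk two sequences in parallel. A minimal sketch with made-up data (names and scores are hypothetical):

```python
names = ["a", "b", "c"]
scores = (90, 85, 70)

# Pair up items that share a position.
pairs = list(zip(names, scores))

# Transpose a rectangular matrix; each zipped row is a tuple, so convert back to lists.
matrix = [[1, 2, 3], [4, 5, 6]]
transposed = [list(row) for row in zip(*matrix)]

# zip(*pairs) reverses the pairing.
unzipped_names, unzipped_scores = zip(*pairs)
```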

10. range operations
1) range objects support slicing
Let x = range(10); the object represents range(0, 10), which is also inclusive at the start and exclusive at the end.

x = range(10)
list(x)
Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Slice	Result	As a list
x[1:3]	range(1, 3)	[1, 2]
x[-3:-1]	range(7, 9)	[7, 8]
x[:4]	range(0, 4)	[0, 1, 2, 3]
x[6:]	range(6, 10)	[6, 7, 8, 9]
x[1:6:2]	range(1, 6, 2)	[1, 3, 5]
x[-8:-1:3]	range(2, 9, 3)	[2, 5, 8]
x[6:1:-2]	range(6, 1, -2)	[6, 4, 2]
x[-1:-8:-3]	range(9, 2, -3)	[9, 6, 3]

2) Iterating over range() with a for loop

for v in range(10):
    print(v)
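
A range behaves like a read-only sequence: it has a length, supports membership tests, and slices into another range without building a list. The following sketch uses a stepped range chosen for illustration.

```python
r = range(0, 20, 5)

as_list = list(r)        # [0, 5, 10, 15]
length = len(r)          # 4
contains_ten = 10 in r   # True
contains_twelve = 12 in r  # False, 12 is not a multiple of the step
sub = r[1:3]             # range(5, 15, 5)

# Summing with a for loop.
total = 0
for v in r:
    total += v
```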

11. Converting between list, tuple, and range

list to tuple: t = tuple(l) # l is a list
tuple to list: l = list(t) # t is a tuple
range to list: l = list(r) # r is a range
range to tuple: t = tuple(r) # r is a range

12. pack and unpack

pack: values are collected into a sequence
t = 1,2,3 # t is (1, 2, 3)

unpack: a sequence is spread into variables
a,b,c = t # a=1, b=2, c=3

Using * in unpack: the starred name takes whatever the others leave over
a,b,*c = 1,2,3,4,5 # a=1, b=2, c=[3,4,5]
a,*b,c = 1,2,3,4,5 # a=1, b=[2,3,4], c=5
*a,b,c,d,e,f = 1,2,3,4,5 # a=[], b=1, c=2, d=3, e=4, f=5

Swap the values of two variables with a,b = b,a

a=1
b=2
a,b=b,a
print(a,b)     # 2 1

Unpacking tuples in a for loop:

l = [(1,2),(3,4),(5,6)]
result = 0
for x,y in l:
    result += x*y
print(result)               # 44
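
Starred unpacking also works in a for loop header, which helps when records have a fixed first field and a variable number of remaining fields. The records below are hypothetical sample data.

```python
# Each record is a name followed by any number of scores.
records = [("alice", 90, 85, 70), ("bob", 60, 75)]

averages = {}
for name, *scores in records:
    # scores is always a list, even if only one value is left over
    averages[name] = sum(scores) / len(scores)

# Functions that return a tuple can be unpacked the same way.
quotient, remainder = divmod(17, 5)
```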

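13. Common sequence operations
s = [1,2,3], t = [4,5,6], n = 2

Operation	Description
t + s	Concatenation: [4, 5, 6, 1, 2, 3]
s * n	Repetition: [1, 2, 3, 1, 2, 3]
len(s)	Length of the sequence: 3
min(s)	Smallest item: 1
max(s)	Largest item: 3
s.index(x, start, end)	Index of the first occurrence of x (searching between start and end)
s.count(x)	Number of times x occurs in the sequence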
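
A brief sketch of index() and count(), which the table describes without an example. It also shows that the same operations apply to strings, since strings are sequences too. The sample values are illustrative.

```python
s = [1, 2, 3, 2, 1]

first_two = s.index(2)       # 1
later_two = s.index(2, 2)    # 3, the search starts at index 2
twos = s.count(2)            # 2

# index() raises ValueError when the item is absent.
missing = False
try:
    s.index(9)
except ValueError:
    missing = True

# Strings support the same sequence operations.
repeated = "ab" * 3          # 'ababab'
first_l = "hello".index("l") # 2
smallest = min("hello")      # 'e'
```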

13.1 Supplement on shallow copies: Bilibili video
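
Since the video link is not reproduced here, the following sketch illustrates the general idea of a shallow copy using the standard copy module. It is an illustration under that assumption, not a summary of the video.

```python
import copy

original = [1, [2, 3]]
shallow = original.copy()        # same effect as original[:] or list(original)
deep = copy.deepcopy(original)

original[0] = 100        # rebinding a top-level slot is not seen by either copy
original[1].append(4)    # the nested list is shared with the shallow copy

shares_inner = shallow[1] is original[1]
```

A shallow copy duplicates only the outer container; nested objects are still shared. copy.deepcopy() duplicates the nested objects as well.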

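14. Operations supported by mutable sequences
Related reading:
 Which common operations do Python sequence types support
 Python sequences (1): mutable sequences
 Python sequences (2): methods of mutable sequences
 Common mutable and immutable types in Python: numbers, strings, lists, dicts, tuples, sets

Operation	Description
s[i] = x	Update the item at index i
s[i:j] = t	Replace the items from i to j with the contents of sequence t
del s[i:j]	Same as s[i:j] = []
s[i:j:k] = t	Replace the items of s[i:j:k] with those of t (t must have the same length as the slice)
del s[i:j:k]	Delete the items of s[i:j:k]
s.append(x)	Append x to the end of the sequence
s.clear()	Remove all items; same as del s[:]
s.copy()	Create a shallow copy of s
s.extend(t)	Extend s with the contents of t
s.insert(i, x)	Insert x into s at index i
s.pop(i)	Return the item at index i and remove it from s
s.remove(x)	Remove the first item equal to x
s.reverse()	Reverse the items of s in place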

Code demo:

s = [1,2,3,4,5,6]
t = [0,0,0]
s[1:3] = t
s
Output: [1, 0, 0, 0, 4, 5, 6]

s.append(t)
s
Output: [1, 0, 0, 0, 4, 5, 6, [0, 0, 0]]

s.extend(t)
s
Output: [1, 0, 0, 0, 4, 5, 6, [0, 0, 0], 0, 0, 0]

s.pop(6)
s
Output: [1, 0, 0, 0, 4, 5, [0, 0, 0], 0, 0, 0]

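15. Sets (set)
Written much like a list, but:

Items stored in a set are unique; there are no duplicates
Items in a set are unordered
Adding a duplicate item leaves only one copy in the set
Sets are commonly used for deduplication and filtering

Creating a set:
Empty set: variable = set(); writing {} creates an empty dict, not a set
By comparison, a Bloom filter is the recommended alternative: https://blog.csdn.net/m0_59485658/article/details/128793834
Non-empty set: variable = {item1, item2, ...}
Membership test: in and not in

Set operations:

Union: s1|s2|s3, or s1.union(s2,s3)
Intersection: s1&s2&s3, or s1.intersection(s2,s3)
Difference: s1-s2-s3, or s1.difference(s2,s3)
Superset: s1.issuperset(s2) is True when s1 is a superset of s2, that is, every element of s2 is found in s1
Subset: s2.issubset(s1) is True when s2 is a subset of s1, that is, every element of s2 is found in s1
. . . . . A set is both a subset and a superset of itself
Disjoint: s1.isdisjoint(s2) is True when s1 and s2 have no elements in common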

Code example:

s1 = {1,2,3}
s2 = {3,4,5}
s3 = {3,1,6}
s1|s2|s3 	# {1, 2, 3, 4, 5, 6}
s1&s2&s3 	# {3}
s1-s2-s3 	# {2}
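
The relationship tests (superset, subset, disjoint) are not covered by the example above, so here is a small sketch of them together with deduplication. The sets big and far are made up for illustration.

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}
big = {1, 2, 3, 4}
far = {7, 8}

big_contains_s1 = big.issuperset(s1)    # True: every element of s1 is in big
s1_inside_big = s1.issubset(big)        # True
s1_self_subset = s1.issubset(s1)        # True: a set is a subset of itself

no_overlap = s1.isdisjoint(far)         # True: nothing in common
has_overlap = not s1.isdisjoint(s2)     # True: both contain 3

# Deduplicate a list; sorted() is used because a set has no order.
unique_sorted = sorted(set([3, 1, 3, 2, 1]))

# {} is an empty dict; set() is the empty set.
empty_types = (type({}), type(set()))
```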

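16. Dictionaries (dict)

Data is stored as key-value pairs
Items are looked up by key, not by position (since Python 3.7 a dict preserves insertion order)
Keys must be unique; values need not be
Keys must be hashable, such as strings, numbers, or tuples of hashable items; values can be anything

Dictionary operations:

empty_dict = {}
dict_1 = {1:'one',2:'two',3:'three'}	# {1: 'one', 2: 'two', 3: 'three'}
dict_2 = dict(one = 1,two=2,three=3)	# {'one': 1, 'two': 2, 'three': 3}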

Getting values from a dictionary

x = d[1] # 1 is a key, not an index
x = d['three']
x = d.get(3, 'this value when key is not found')

x = dict_1.get(3,'this value when key is not found')
x	# 'three'
x = dict_1.get(4,'this value when key is not found')
x	# 'this value when key is not found'

Testing whether a value is a key of the dictionary:

in and not in test for keys, not values
3 in dict_1	# True
'three' in dict_1	# False

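Iteration:

Iterating over keys:
for k in dict_1:
    print(k)
# Output:
# 1
# 2
# 3

Iterating over values:
for v in dict_1.values():
    print(v)
# Output:
# one
# two
# three

Iterating over key-value pairs:
for k,v in dict_1.items():
    print(k,v)
# Output:
# 1 one
# 2 two
# 3 three

for item in dict_1.items():
    print(item[0])    # item is a (key, value) tuple
# Output:
# 1
# 2
# 3

for item in dict_1.items():
    print(item[1])
# Output:
# one
# two
# three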
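
A common use of get() with a default is counting, combined with items() to read the result back. This is a small sketch with a made-up sentence.

```python
words = "to be or not to be".split()

# get(w, 0) returns 0 the first time a word is seen, so no KeyError occurs.
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Keys whose value is 1, in insertion order.
seen_once = [k for k, v in counts.items() if v == 1]

# The key with the largest value; ties resolve to the first key inserted.
most_common = max(counts, key=counts.get)
```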

17. Strings
Python has three ways to write a string literal

# single quotes
str1 = 'allows embedded "double" quotes'
# double quotes
str2 = "allows embedded 'single' quotes"
# triple quotes, which let the string span lines
str3 = '''allows embedded
 'single' quotes'''
str4 = """allows embedded
 "double" quotes"""

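String operations:

Operation	Description	Example
string[n:m]	String slicing	string = 'Hello World\n'; string[0], string[:-1], string[3:5]
int(), float()	Convert a string to a number	int("123"), float("123")
str()	Convert a number to a string	str(123), str(123.34)
ord()	Convert a character to its Unicode code point	ord('鸡')
chr()	Convert a Unicode code point to a character	chr(40481)
lower()	Convert to lowercase	"WELCOME".lower()
upper()	Convert to uppercase	"welcome".upper()
in	Test for a substring	'or' in 'toronto or orlando' # True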

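split(): splits a string and returns a list
words = "welcome to python".split()
type(words) # list
print(words) # ['welcome', 'to', 'python']
print("welcome to python".split('to')) # ['welcome ', ' python']

join(): joins the items of a sequence with the given separator into a new string; every item must be a string
s = "++"
items = ['1', '2', '3']
s.join(items) # '1++2++3'

strip(), lstrip(), rstrip(): remove the given characters from both ends, the start, or the end of a string
s = '123000321'
s.strip('123') # '000'; the argument is treated as a set of characters, each stripped individually, not as a prefix or suffix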

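find(): returns the starting index of a substring, or -1 if it is not found
s = 'toronto or orlando'
s.find("or") # returns index 1
s.find('or',2,8) # returns -1

index(): returns the starting index of a substring and raises an exception if it is not found
s  = 'toronto or orlando'
s.index('or') # 1
s.index('or',2,8) # ValueError: substring not found

count(): counts the non-overlapping occurrences of a substring
s  = 'toronto or orlando'
s.count('or') # 3
s.count('or',2) # 2, searching from index 2
s.count('or',2,9) # 0
s.count('or',2,10) # 1; the space counts as a character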

replace(): returns a copy with occurrences of the old substring replaced by the new one
s  = 'toronto or orlando'
s.replace('or','/x\\')  # 't/x\\onto /x\\ /x\\lando'
s.replace('or','/x\\',2) # 't/x\\onto /x\\ orlando'

startswith(): checks whether the string starts with the given prefix
s  = 'toronto or orlando'
s.startswith('or')	# False
s.startswith('or',1) # True
s.startswith(('or','tor')) # True; a tuple means any of these prefixes

endswith(): checks whether the string ends with the given suffix
s  = 'toronto or orlando'
s.endswith('and')	# False
s.endswith('and',1,-1) # True
s.endswith('do',1,-1) # False
s.endswith(('and','do')) # True; a tuple means any of these suffixes

maketrans(): builds a translation table; translate(): applies it
s  = 'toronto or orlando'
table = s.maketrans('on','.N') # o becomes . and n becomes N
s.translate(table)  # 't.r.Nt. .r .rlaNd.'
s	# 'toronto or orlando' (the original string is unchanged)
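
The string methods above are usually chained. The sketch below cleans a messy comma-separated string and also shows how to join numbers, which must be converted to strings first. The input text is invented for illustration.

```python
raw = "  Welcome, to ,Python  "

# Split on commas, then strip surrounding whitespace from each piece.
parts = [p.strip() for p in raw.split(",")]

# Join with a separator and normalise the case.
slug = "-".join(parts).lower()

# join() accepts only strings, so convert numbers before joining.
numbers = [1, 2, 3]
joined_numbers = "++".join(str(n) for n in numbers)

# strip() treats its argument as a set of characters to remove from both ends.
stripped = "123000321".strip("123")
left_only = "xxhixx".lstrip("x")
```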

For more string operations, see the official documentation or https://www.python100.com/

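18. None and Boolean types
None and Boolean values:

None is a special constant that represents the absence of a value
Treated as True: non-zero numbers and non-empty strings, lists, dicts, and sets
Treated as False: 0, 0.0, 0+0j, empty strings, lists, dicts, and sets, and None
Boolean operators: or, and, not

Comparison operators:

Less than <, less than or equal <=, greater than or equal >=, greater than >, equal ==, not equal !=; is tests whether two names refer to the same object, and is not tests whether they refer to different objects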
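
The truthiness rules and the difference between == and is can be verified with a short sketch. The sample values are illustrative.

```python
falsy_values = [0, 0.0, 0 + 0j, "", [], {}, set(), None]
truthy_values = [1, -0.5, "0", [0], {"k": None}, {0}]

all_falsy = all(not v for v in falsy_values)
all_truthy = all(bool(v) for v in truthy_values)

# == compares values; is compares identity.
a = [1, 2]
b = [1, 2]
c = a
equal = a == b         # True: same contents
identical = a is b     # False: two separate lists
alias = a is c         # True: two names for one list

# or / and return one of their operands and stop as soon as the result is known.
name = "" or "anonymous"
guarded = 0 and 1 / 0  # the division is never evaluated

# None is normally tested with is, not ==.
value = None
is_missing = value is None
```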

19. The input() function
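
The original gives no example here. As a minimal sketch: input() prints a prompt and always returns the typed line as a str, so numeric input has to be converted. The helper ask_int and its read parameter are hypothetical names introduced only so the logic can be exercised without a keyboard.

```python
def ask_int(prompt, read=input):
    """Keep asking until the reply parses as an integer."""
    while True:
        reply = read(prompt)          # input() always returns a str
        try:
            return int(reply.strip())
        except ValueError:
            print("Please enter a whole number.")
```

At the console this would be called as age = ask_int("Age: "); passing a different read function makes the same code usable in scripts and tests.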

20. Ternary conditional expression

Equivalent to an if...else statement: result = value1 if x<y else value2
x = 5
y = 6
result = 1 if x<y else 2
result	# 1

x = 7
y = 6
result2 = 1 if x<y else 2
result2 	# 2

Ternary expressions:
'even' if x%2 ==0 else 'odd'
'A' if x%2 ==0 else 'B' if x%5==0 else 'C'

21. The enumerate() function
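
The original gives no example here. As a minimal sketch: enumerate() wraps an iterable and yields (index, item) pairs, with an optional start value for the counter. The list of fruits is made up.

```python
fruits = ["apple", "banana", "cherry"]

# Pairs of (index, item), counting from 0 by default.
pairs = list(enumerate(fruits))

# start=1 gives human-friendly numbering.
numbered = [f"{n}. {name}" for n, name in enumerate(fruits, start=1)]

# Position of the first item that starts with "b", or -1 if there is none.
position = next((i for i, name in enumerate(fruits) if name.startswith("b")), -1)
```

This avoids the for i in range(len(fruits)) pattern when both the index and the item are needed.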

22. List comprehensions
Creating a list with a list comprehension:

[e**2 for e in a_list if type(e) is int]
# Explanation
e**2: the square of e
for e in a_list: iterate over the list
if type(e) is int: filter condition

# Example:
[e**2 for e in range(8) if e%2==0] # [0, 4, 16, 36]; squares of the even numbers in [0, 8)

Even numbers from 2 to 99:
[x for x in range(2,100) if x%2==0]

π at increasing precision (1 to 9 decimal places):
from math import pi
[str(round(pi,i)) for i in range(1,10)]

[[1 if r==c or r+c==3 else 0 for c in range(4)] for r in range(4)]
# builds the matrix [[1,0,0,1],[0,1,1,0],[0,1,1,0],[1,0,0,1]]

Letters that occur in the string 'what is this' (set comprehension):
sentence = 'what is this '
{c for c in sentence if c != ' '}
# {'a', 'h', 'i', 's', 't', 'w'}

Consonants of each word in 'what is this' (dict comprehension):
sentence = 'what is this '
{w:{c for c in w if c not in 'aeiou'} for w in sentence.split()}
# {'is': {'s'}, 'this': {'h', 's', 't'}, 'what': {'h', 't', 'w'}}

____________________________________________
d = {"a":1,'b':2}
{k.upper():v for k,v in d.items()}	# {'A': 1, 'B': 2}

Words containing i in 'what is this' (generator expression):
sentence = 'what is this '
(w for w in sentence.split() if 'i' in w) # <generator object <genexpr> at 0x0000020A87A22CA8>; nothing is produced until it is iterated
# Iterating over the generator
sentence = 'what is this '
for i in (w for w in sentence.split() if 'i' in w):
    print(i)
# Output:
# is
# this

____________________________________________
[w for w in sentence.split()]	# ['what', 'is', 'this']

Dict comprehensions:
{key: value for key, value in iterable if condition}
# Convert every key to uppercase
d = dict(a=1,b=2)
print({k.upper():v for k,v in d.items()})	# {'A': 1, 'B': 2}
____________________________________________
# Merge keys that differ only in case and output them under the lowercase key:
d = dict(a=2,b=1,c=2,B=9,A=5)
print({k.lower():d.get(k.lower(),0)+d.get(k.upper(),0) for k in d })	# {'a': 7, 'b': 10, 'c': 2}

Set comprehensions:
# Filter the letters of a string
{x for x in 'abracadabra' if x not in 'abc'}	# {'d', 'r'}

23. Flow control statements

Conditional statement:
guess = 1
secret = 2
if guess > secret:
    print("too large")
elif guess < secret:
    print("too small")
else:
    print("equals")

Loop statements:
guessed = None
while guessed != secret:
    guessed = int(input("Guess a number:"))
else:
    print("congratulation")

____________________________________________
for i in range(0,8,2):
    print(i)
else:
    print("done!")
____________________________________________
for i in range(10):
    print(i)
    if i == 5:
        break
else:
    print("no 5")    # skipped here, because the loop ended with break

The pass statement is a placeholder that does nothing:
x = 2
if x == 1:
    pass
elif x == 2:
    print(x)
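
The else clause of a loop runs only when the loop finishes without hitting break. A typical use is a search that needs a "not found" branch. The function first_divisor below is a hypothetical helper written for this illustration.

```python
def first_divisor(n):
    """Return the smallest divisor of n greater than 1 and below n, or None if n has none."""
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            break            # found a divisor, skip the else clause
    else:
        return None          # the loop ran to completion: no divisor exists
    return d


composite = first_divisor(15)   # 3
prime = first_divisor(13)       # None
```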
