Kaggle Translation, Day 6: Python 6/7

Latest recommended article published on 2024-03-09 07:52:38

King Stars

Original link: https://www.kaggle.com/code/colinmorris/strings-and-dictionaries

Strings and Dictionaries

Learn about strings and dictionaries, two fundamental Python data types.

Strings

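One place where Python really shines is string manipulation. This section covers some of Python's built-in string methods and formatting operations.
These string manipulation patterns come up often in data science work.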

String syntax

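You have already seen plenty of strings in the examples from previous lessons, but as a recap, strings in Python can be defined with either single or double quotes. The two forms are functionally equivalent.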

x = 'Pluto is a planet'
y = "Pluto is a planet"
x == y

True

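Double quotes are convenient when the string itself contains a single quote (for example, an apostrophe).
Similarly, wrapping a string in single quotes makes it easy to include double quotes inside it.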

print("Pluto's a planet!")
print('My dog is named "Pluto"')

Pluto's a planet!
My dog is named "Pluto"

If we try to put a single quote inside a single-quoted string, Python gets confused:

'Pluto's a planet!'

  File "/tmp/ipykernel_20/1561186517.py", line 1
    'Pluto's a planet!'
           ^
SyntaxError: invalid syntax

We can fix this by escaping the single quote with a backslash.

'Pluto\'s a planet!'

"Pluto's a planet!"

The table below summarizes some important uses of the backslash character.

What you type	What you get	Example	print(example)
\'	'	'What\'s up?'	What's up?
\"	"	"That's \"cool\""	That's "cool"
\\	\	"Look, a mountain: /\\"	Look, a mountain: /\
\n	(newline)	"1\n2 3"	1 (line break) 2 3
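
As a quick check of the table, the sketch below prints each example string so the effect of every escape sequence shows up in the output. The variable names are my own and are only for illustration.

```python
# Each literal uses one escape sequence from the table.
single = 'What\'s up?'               # \' gives a literal single quote
double = "That's \"cool\""           # \" gives a literal double quote
backslash = "Look, a mountain: /\\"  # \\ gives one literal backslash
newline = "1\n2 3"                   # \n gives a newline character

for example in (single, double, backslash, newline):
    print(example)

# An escape sequence counts as a single character in the string.
print(len(newline))  # 5
```

Note that the escaped string `'What\'s up?'` and the double-quoted `"What's up?"` are the same value; escaping only changes how the literal is typed.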

The last sequence, \n, represents the newline character. It causes Python to start a new line.

hello = "hello\nworld"
print(hello)

hello
world

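In addition, Python's triple-quote syntax for strings lets us include newlines literally (that is, by just hitting 'Enter' on the keyboard rather than using the special '\n' sequence). We have already seen this in the docstrings we use to document functions, but we can use triple quotes anywhere we want to define a string.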

triplequoted_hello = """hello
world"""
print(triplequoted_hello)
triplequoted_hello == hello

hello
world

True

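The print() function automatically adds a newline character at the end of its output unless we pass a different value for the keyword argument end, whose default is '\n':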

print("hello")
print("world")
print("hello", end='')
print("pluto", end='')

hello
world
hellopluto
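
To make the effect of end visible character by character, here is a small sketch that captures what print() writes instead of showing it on screen. It relies on print()'s file keyword argument and io.StringIO from the standard library; the names buffer and captured are illustrative only.

```python
import io

buffer = io.StringIO()  # in-memory text stream standing in for the screen

print("hello", file=buffer)               # default ending: '\n'
print("hello", end='', file=buffer)       # nothing is appended
print("pluto", end='!\n', file=buffer)    # custom ending

captured = buffer.getvalue()
print(repr(captured))  # 'hello\nhellopluto!\n'
```

Using repr() here shows the newline characters explicitly rather than rendering them as line breaks.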

Strings are sequences

Strings can be thought of as sequences of characters. Almost everything we have seen that we can do to a list, we can also do to a string.

# Indexing
planet = 'Pluto'
planet[0]

'P'

# Slicing
planet[-3:]

'uto'

# How long is this string?
len(planet)

5

# Yes, we can even loop over them
[char+'! ' for char in planet]

['P! ', 'l! ', 'u! ', 't! ', 'o! ']

But one major way in which they differ from lists is that strings are immutable. We cannot modify them.

planet[0] = 'B'
# planet.append doesn't work either

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_20/2683731249.py in <module>
----> 1 planet[0] = 'B'
      2 # planet.append doesn't work either

TypeError: 'str' object does not support item assignment
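
Since a string cannot be changed in place, the usual workaround is to build a new string from pieces of the old one. The sketch below shows this with slicing and concatenation; the name renamed is my own.

```python
planet = 'Pluto'

# Item assignment fails because str objects are immutable.
try:
    planet[0] = 'B'
except TypeError as err:
    print("Cannot modify in place:", err)

# Slicing plus concatenation creates a brand-new string instead.
renamed = 'B' + planet[1:]
print(renamed)  # Bluto
print(planet)   # Pluto (the original is unchanged)
```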

String methods

Like list, the type str has lots of very useful methods. Here are just a few examples:

# ALL CAPS
claim = "Pluto is a planet!"
claim.upper()

'PLUTO IS A PLANET!'

# all lowercase
claim.lower()

'pluto is a planet!'

# Searching for the first index of a substring
claim.index('plan')

11

claim.startswith(planet)

True

claim.endswith('dwarf planet')

False
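
These methods can be combined. The sketch below shows two patterns that might be useful in practice: a case-insensitive prefix check, and a guard around index(), which raises ValueError when the substring is absent. The names needle and position are illustrative.

```python
claim = "Pluto is a planet!"

# Lowercasing first makes the prefix check ignore case.
print(claim.lower().startswith('pluto'))  # True

# index() raises ValueError for a missing substring,
# so test membership with `in` before calling it.
needle = 'dwarf'
if needle in claim:
    position = claim.index(needle)
else:
    position = None
print(position)  # None
```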

Going between strings and lists: `.split()` and `.join()`

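str.split() turns a string into a list of smaller strings, breaking on whitespace by default. This is very useful for going from one big string to a list of words.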

words = claim.split()
words

['Pluto', 'is', 'a', 'planet!']

Occasionally you will want to split on something other than whitespace:

datestr = '1956-01-31'
year, month, day = datestr.split('-')

str.join() takes us in the other direction, sewing a list of strings into one long string. The string the method is called on serves as the separator.

'/'.join([month, day, year])

'01/31/1956'

# Yes, we can put unicode characters right in our string literals :)
' 👏 '.join([word.upper() for word in words])

'PLUTO 👏 IS 👏 A 👏 PLANET!'
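
Putting split() and join() together, the date example above can be wrapped in a small function. This is only a sketch: reformat_date is a hypothetical helper name, and it assumes the input always has the form YYYY-MM-DD.

```python
def reformat_date(datestr):
    """Convert 'YYYY-MM-DD' to 'MM/DD/YYYY' (hypothetical helper)."""
    year, month, day = datestr.split('-')
    return '/'.join([month, day, year])

print(reformat_date('1956-01-31'))  # 01/31/1956

# split() with no argument collapses runs of whitespace, so
# splitting and re-joining normalizes the spacing in a string.
messy = "Pluto   is  a planet!"
tidy = ' '.join(messy.split())
print(tidy)  # Pluto is a planet!
```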

Building strings with `.format()`

Python lets us concatenate strings with the + operator.

planet + ', we miss you.'

'Pluto, we miss you.'

If we want to include any non-string objects, we have to be careful to call str() on them first.

position = 9
planet + ", you'll always be the " + position + "th planet to me."

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_20/1802981568.py in <module>
      1 position = 9
----> 2 planet + ", you'll always be the " + position + "th planet to me."

TypeError: can only concatenate str (not "int") to str

planet + ", you'll always be the " + str(position) + "th planet to me."

"Pluto, you'll always be the 9th planet to me."

This is getting hard to read and annoying to type. str.format() to the rescue.

"{}, you'll always be the {}th planet to me.".format(planet, position)

"Pluto, you'll always be the 9th planet to me."

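Much cleaner! We call .format() on a "format string", in which each value to be inserted is represented by a {} placeholder. Notice that we did not even have to call str() on position.
That is already handy, but .format() can do a lot more.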

pluto_mass = 1.303 * 10**22
earth_mass = 5.9722 * 10**24
population = 52910390
#         2 significant digits   3 decimal places, format as percent     separate with commas
"{} weighs about {:.2} kilograms ({:.3%} of Earth's mass). It is home to {:,} Plutonians.".format(
    planet, pluto_mass, pluto_mass / earth_mass, population,
)

"Pluto weighs about 1.3e+22 kilograms (0.218% of Earth's mass). It is home to 52,910,390 Plutonians."

# Referring to format() arguments by index, starting from 0
s = """Pluto's a {0}.
No, it's a {1}.
{0}!
{1}!""".format('planet', 'dwarf planet')
print(s)

Pluto's a planet.
No, it's a dwarf planet.
planet!
dwarf planet!
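
The format specifiers used in the Pluto example earlier ({:.2}, {:.3%} and {:,}) can be tried one at a time to see what each does. The sketch below does that, and adds {:.2f} for contrast; that last specifier and all the *_text variable names are my additions, not part of the original example.

```python
pluto_mass = 1.303 * 10**22
earth_mass = 5.9722 * 10**24
population = 52910390

# With no type letter, .2 on a float means 2 significant digits.
mass_text = "{:.2}".format(pluto_mass)
# .3% multiplies by 100, keeps 3 decimal places, and appends '%'.
ratio_text = "{:.3%}".format(pluto_mass / earth_mass)
# A bare comma inserts thousands separators.
population_text = "{:,}".format(population)
# Adding the f type letter gives a fixed number of decimal places.
fixed_text = "{:.2f}".format(3.14159)

print(mass_text)        # 1.3e+22
print(ratio_text)       # 0.218%
print(population_text)  # 52,910,390
print(fixed_text)       # 3.14
```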

You could probably write a short book just on str.format(), so we will stop here. For further reading, see pyformat.info and the official documentation.

Dictionaries

Dictionaries are a built-in Python data structure for mapping keys to values.

numbers = {'one':1, 'two':2, 'three':3}

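In this case 'one', 'two', and 'three' are the keys, and 1, 2 and 3 are their corresponding values.
Values are accessed with the same square-bracket syntax used for indexing into lists and strings.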

numbers['one']

1

We can use the same syntax to add another key, value pair

numbers['eleven'] = 11
numbers

{'one': 1, 'two': 2, 'three': 3, 'eleven': 11}

Or to change the value associated with an existing key

numbers['one'] = 'Pluto'
numbers

{'one': 'Pluto', 'two': 2, 'three': 3, 'eleven': 11}
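
One thing the examples above do not show is what happens when the key is missing. Square-bracket access raises KeyError in that case, while dict.get() returns a default instead. A minimal sketch, using a fresh numbers dictionary for illustration:

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

# Square brackets raise KeyError for a key that is not present.
try:
    numbers['four']
except KeyError as err:
    print("Missing key:", err)

# get() returns None, or a default we supply, instead of raising.
print(numbers.get('four'))     # None
print(numbers.get('four', 0))  # 0
print(numbers.get('one', 0))   # 1
```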

Python has dictionary comprehensions, with a syntax similar to that of list comprehensions.

planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
planet_to_initial = {planet: planet[0] for planet in planets}
planet_to_initial

{'Mercury': 'M',
 'Venus': 'V',
 'Earth': 'E',
 'Mars': 'M',
 'Jupiter': 'J',
 'Saturn': 'S',
 'Uranus': 'U',
 'Neptune': 'N'}
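
For comparison, the dictionary comprehension above does the same work as an explicit loop that assigns one key at a time. The sketch below builds the dictionary both ways and checks that the results match; the name built_with_loop is illustrative.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# Comprehension form: key expression, colon, value expression.
planet_to_initial = {planet: planet[0] for planet in planets}

# Equivalent explicit loop.
built_with_loop = {}
for planet in planets:
    built_with_loop[planet] = planet[0]

print(planet_to_initial == built_with_loop)  # True
```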

The in operator tells us whether something is a key in the dictionary

'Saturn' in planet_to_initial

True

'Betelgeuse' in planet_to_initial

False

A for loop over a dictionary loops over its keys

for k in numbers:
    print("{} = {}".format(k, numbers[k]))

one = Pluto
two = 2
three = 3
eleven = 11

We can access a collection of all the keys or all the values with dict.keys() and dict.values(), respectively.

# Get all the initials, sort them alphabetically, and put them in a space-separated string.
' '.join(sorted(planet_to_initial.values()))

'E J M M N S U V'
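
As the help(dict) text puts it, keys() and values() return views on the dictionary rather than copies, so they reflect later changes. A small sketch of that behavior, using a fresh numbers dictionary for illustration:

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

keys = numbers.keys()      # a view, not a snapshot
values = numbers.values()
print(len(keys))           # 3

# Adding an entry afterwards is visible through the existing views.
numbers['four'] = 4
print('four' in keys)      # True
print(list(keys))          # ['one', 'two', 'three', 'four']
print(sorted(values))      # [1, 2, 3, 4]
```

Wrapping a view in list() or sorted() produces an independent list when a fixed snapshot is needed.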

The very useful dict.items() method lets us iterate over the keys and values of a dictionary simultaneously. (In Python jargon, an item refers to a key, value pair.)

for planet, initial in planet_to_initial.items():
    print("{} begins with \"{}\"".format(planet.rjust(10), initial))

   Mercury begins with "M"
     Venus begins with "V"
     Earth begins with "E"
      Mars begins with "M"
   Jupiter begins with "J"
    Saturn begins with "S"
    Uranus begins with "U"
   Neptune begins with "N"
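
Iterating with items() also makes it straightforward to turn a mapping around. The sketch below groups planets by their initial; the name initial_to_planets is my own, and setdefault() is the dict method listed in the help(dict) output.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']
planet_to_initial = {planet: planet[0] for planet in planets}

# Map each initial to the list of planets that start with it.
initial_to_planets = {}
for planet, initial in planet_to_initial.items():
    # setdefault returns the existing list, or inserts a new empty one.
    initial_to_planets.setdefault(initial, []).append(planet)

print(initial_to_planets['M'])  # ['Mercury', 'Mars']
print(initial_to_planets['E'])  # ['Earth']
```

A list is used as the value because several planets can share an initial, which a plain one-to-one inversion would silently overwrite.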

For a full inventory of dictionary methods, you can always call help() or consult the online documentation:

help(dict)

Help on class dict in module builtins:

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)
 |  
 |  Methods defined here:
 |  
 |  __contains__(self, key, /)
 |      True if the dictionary has the specified key, else False.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __sizeof__(...)
 |      D.__sizeof__() -> size of D in memory, in bytes
 |  
 |  clear(...)
 |      D.clear() -> None.  Remove all items from D.
 |  
 |  copy(...)
 |      D.copy() -> a shallow copy of D
 |  
 |  get(self, key, default=None, /)
 |      Return the value for key if key is in the dictionary, else default.
 |  
 |  items(...)
 |      D.items() -> a set-like object providing a view on D's items
 |  
 |  keys(...)
 |      D.keys() -> a set-like object providing a view on D's keys
 |  
 |  pop(...)
 |      D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
 |      If key is not found, d is returned if given, otherwise KeyError is raised
 |  
 |  popitem(...)
 |      D.popitem() -> (k, v), remove and return some (key, value) pair as a
 |      2-tuple; but raise KeyError if D is empty.
 |  
 |  setdefault(self, key, default=None, /)
 |      Insert key with a value of default if key is not in the dictionary.
 |      
 |      Return the value for key if key is in the dictionary, else default.
 |  
 |  update(...)
 |      D.update([E, ]**F) -> None.  Update D from dict/iterable E and F.
 |      If E is present and has a .keys() method, then does:  for k in E: D[k] = E[k]
 |      If E is present and lacks a .keys() method, then does:  for k, v in E: D[k] = v
 |      In either case, this is followed by: for k in F:  D[k] = F[k]
 |  
 |  values(...)
 |      D.values() -> an object providing a view on D's values
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  fromkeys(iterable, value=None, /) from builtins.type
 |      Create a new dictionary with keys from iterable and values set to value.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None