Strings and Character Data in Python

In the tutorial on Basic Data Types in Python, you learned how to define strings: objects that contain sequences of character data. Processing character data is integral to programming. It is a rare application that doesn’t need to manipulate strings at least to some extent.

Here’s what you’ll learn in this tutorial: Python provides a rich set of operators, functions, and methods for working with strings. When you are finished with this tutorial, you will know how to access and extract portions of strings, and also be familiar with the methods that are available to manipulate and modify string data.

You will also be introduced to two other Python objects used to represent raw byte data, the bytes and bytearray types.

String Manipulation

The sections below highlight the operators, methods, and functions that are available for working with strings.

String Operators

You have already seen the operators + and * applied to numeric operands in the tutorial on Operators and Expressions in Python. These two operators can be applied to strings as well.

The + Operator

The + operator concatenates strings. It returns a string consisting of the operands joined together, as shown here:

>>> s = 'foo'
>>> t = 'bar'
>>> u = 'baz'

>>> s + t
'foobar'
>>> s + t + u
'foobarbaz'

>>> print('Go team' + '!!!')
Go team!!!

The * Operator

The * operator creates multiple copies of a string. If s is a string and n is an integer, either of the following expressions returns a string consisting of n concatenated copies of s:

s * n
n * s

Here are examples of both forms:
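
As a minimal sketch (the string 'foo.' and the multiplier 4 are arbitrary values chosen only for illustration), the two forms might be used like this:

```python
# Illustrative values only; any string and integer would do.
s = 'foo.'

left = s * 4    # string on the left, integer on the right
right = 4 * s   # integer on the left, string on the right

print(left)     # foo.foo.foo.foo.
print(right)    # foo.foo.foo.foo.
```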

The multiplier operand n must be an integer. You’d think it would be required to be a positive integer, but amusingly, it can be zero or negative, in which case the result is an empty string:

>>> 'foo' * -8
''

If you were to create a string variable and initialize it to the empty string by assigning it the value 'foo' * -8, anyone would rightly think you were a bit daft. But it would work.

The in Operator

Python also provides a membership operator that can be used with strings. The in operator returns True if the first operand is contained within the second, and False otherwise:
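
A short sketch of the membership test, using made-up sentences purely for illustration:

```python
# The sentences below are hypothetical sample data.
s = 'foo'

found = s in "That's food for thought."   # 'foo' appears inside 'food'
missing = s in "That's good for now."     # 'foo' does not appear

print(found)    # True
print(missing)  # False
```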

There is also a not in operator, which does the opposite:

>>> 'z' not in 'abc'
True
>>> 'z' not in 'xyz'
False

Built-in String Functions

As you saw in the tutorial on Basic Data Types in Python, Python provides many functions that are built-in to the interpreter and always available. Here are a few that work with strings:

Function    Description
chr()       Converts an integer to a character
ord()       Converts a character to an integer
len()       Returns the length of a string
str()       Returns a string representation of an object

These are explored more fully below.

ord(c)

Returns an integer value for the given character.

At the most basic level, computers store all information as numbers. To represent character data, a translation scheme is used which maps each character to its representative number.

The simplest scheme in common use is called ASCII. It covers the common Latin characters you are probably most accustomed to working with. For these characters, ord(c) returns the ASCII value for character c:
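
For instance, a quick sketch with two arbitrarily chosen ASCII characters:

```python
# 'a' and '#' are arbitrary examples of common ASCII characters.
code_a = ord('a')
code_hash = ord('#')

print(code_a)     # 97
print(code_hash)  # 35
```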

ASCII is fine as far as it goes. But there are many different languages in use in the world and countless symbols and glyphs that appear in digital media. The full set of characters that potentially may need to be represented in computer code far surpasses the ordinary Latin letters, numbers, and symbols you usually see.

Unicode is an ambitious standard that attempts to provide a numeric code for every possible character, in every possible language, on every possible platform. Python 3 supports Unicode extensively, including allowing Unicode characters within strings.

For More Information: See Python’s Unicode Support in the Python documentation.

As long as you stay in the domain of the common characters, there is little practical difference between ASCII and Unicode. But the ord() function will return numeric values for Unicode characters as well:

>>> ord('€')
8364
>>> ord('∑')
8721

chr(n)

Returns a character value for the given integer.

chr() does the reverse of ord(). Given a numeric value n, chr(n) returns a string representing the character that corresponds to n:
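
A small sketch, using the code points 97 and 35 as arbitrary sample inputs:

```python
# 97 and 35 are the ASCII values of 'a' and '#'.
char_a = chr(97)
char_hash = chr(35)

print(char_a)     # a
print(char_hash)  # #
```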

chr() handles Unicode characters as well:

>>> chr(8364)
'€'
>>> chr(8721)
'∑'

len(s)

Returns the length of a string.

len(s) returns the number of characters in s:
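
A minimal illustration with an arbitrary sample string:

```python
# Spaces and punctuation count as characters too.
s = 'a string.'
length = len(s)

print(length)  # 9
```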

str(obj)

Returns a string representation of an object.

Virtually any object in Python can be rendered as a string. str(obj) returns the string representation of object obj:

>>> str(49.2)
'49.2'
>>> str(3+4j)
'(3+4j)'
>>> str(3 + 29)
'32'
>>> str('foo')
'foo'

String Indexing

Often in programming languages, individual items in an ordered set of data can be accessed directly using a numeric index or key value. This process is referred to as indexing.

In Python, strings are ordered sequences of character data, and thus can be indexed in this way. Individual characters in a string can be accessed by specifying the string name followed by a number in square brackets ([]).

String indexing in Python is zero-based: the first character in the string has index 0, the next has index 1, and so on. The index of the last character will be the length of the string minus one.

For example, a schematic diagram of the indices of the string 'foobar' would look like this:

[Figure: String Indices]

The individual characters can be accessed by index as follows:
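
A brief sketch using the string 'foobar' from the diagram description (the variable names are illustrative):

```python
s = 'foobar'

first = s[0]            # index 0 is the first character
fourth = s[3]           # index 3 is the fourth character
last = s[len(s) - 1]    # the last index is the length minus one

print(first)   # f
print(fourth)  # b
print(last)    # r
```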

Attempting to index beyond the end of the string results in an error:

>>> s[6]
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    s[6]
IndexError: string index out of range

String indices can also be specified with negative numbers, in which case indexing occurs from the end of the string backward: -1 refers to the last character, -2 the second-to-last character, and so on. Here is the same diagram showing both the positive and negative indices into the string 'foobar':

[Figure: Positive and Negative String Indices]

Here are some examples of negative indexing:
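
A minimal sketch, again assuming s holds 'foobar':

```python
s = 'foobar'

last = s[-1]            # last character
second_last = s[-2]     # second-to-last character
first = s[-len(s)]      # the most negative valid index is -len(s)

print(last)         # r
print(second_last)  # a
print(first)        # f
```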

Attempting to index with negative numbers beyond the start of the string results in an error:

>>> s[-7]
Traceback (most recent call last):
  File "<pyshell#26>", line 1, in <module>
    s[-7]
IndexError: string index out of range

For any non-empty string s, s[len(s)-1] and s[-1] both return the last character. There isn’t any index that makes sense for an empty string.

String Slicing

Python also allows a form of indexing syntax that extracts substrings from a string, known as string slicing. If s is a string, an expression of the form s[m:n] returns the portion of s starting with position m, and up to but not including position n:
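
As a sketch, assuming s holds 'foobar' and taking m = 2 and n = 5 as sample positions:

```python
s = 'foobar'

# Characters at positions 2, 3, and 4; position 5 ('r') is excluded.
piece = s[2:5]

print(piece)       # oba
print(len(piece))  # 3
```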

Remember: String indices are zero-based. The first character in a string has index 0. This applies to both standard indexing and slicing.

Again, the second index specifies the first character that is not included in the result—the character 'r' (s[5]) in the example above. That may seem slightly unintuitive, but it produces this result which makes sense: the expression s[m:n] will return a substring that is n - m characters in length, in this case, 5 - 2 = 3.

If you omit the first index, the slice starts at the beginning of the string. Thus, s[:m] and s[0:m] are equivalent:

>>> s = 'foobar'

>>> s[:4]
'foob'
>>> s[0:4]
'foob'

Similarly, if you omit the second index as in s[n:], the slice extends from the first index through the end of the string. This is a nice, concise alternative to the more cumbersome s[n:len(s)]:
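
A quick sketch comparing the two spellings, with n = 2 as an arbitrary starting position:

```python
s = 'foobar'

short_form = s[2:]          # second index omitted
long_form = s[2:len(s)]     # second index spelled out

print(short_form)  # obar
print(long_form)   # obar
```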

For any string s and any integer n (0 ≤ n ≤ len(s)), s[:n] + s[n:] will be equal to s:

>>> s = 'foobar'

>>> s[:4] + s[4:]
'foobar'
>>> s[:4] + s[4:] == s
True

Omitting both indices returns the original string, in its entirety. Literally. It’s not a copy, it’s a reference to the original string:
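
One way to check this is with the is operator. Note the assumption: returning the very same object is how the standard CPython interpreter behaves for str, and other implementations may differ, so treat this as a sketch rather than a guarantee.

```python
s = 'foobar'
t = s[:]

print(t)       # foobar
# Identity check: on CPython, a full slice of a str gives back the same object.
print(t is s)
```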

If the first index in a slice is greater than or equal to the second index, Python returns an empty string. This is yet another obfuscated way to generate an empty string, in case you were looking for one:

>>> s[2:2]
''
>>> s[4:2]
''

Negative indices can be used with slicing as well. -1 refers to the last character, -2 the second-to-last, and so on, just as with simple indexing. The diagram below shows how to slice the substring 'oob' from the string 'foobar' using both positive and negative indices:

[Figure: String Slicing with Positive and Negative Indices]

Here is the corresponding Python code:
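
A sketch of that code, assuming s holds 'foobar':

```python
s = 'foobar'

with_negative = s[-5:-2]   # from the fifth-from-last up to the second-from-last
with_positive = s[1:4]     # the same span written with positive indices

print(with_negative)                   # oob
print(with_positive)                   # oob
print(with_negative == with_positive)  # True
```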

Specifying a Stride in a String Slice

There is one more variant of the slicing syntax to discuss. Adding an additional : and a third index designates a stride (also called a step), which indicates how many characters to jump after retrieving each character in the slice.

For example, for the string 'foobar', the slice 0:6:2 starts with the first character and ends with the last character (the whole string), and every second character is skipped. This is shown in the following diagram:

[Figure: String Indexing with Stride]

Similarly, 1:6:2 specifies a slice starting with the second character (index 1) and ending with the last character, and again the stride value 2 causes every other character to be skipped:

[Figure: Another String Indexing with Stride]

The illustrative REPL code is shown here:

>>> s = 'foobar'

>>> s[0:6:2]
'foa'

>>> s[1:6:2]
'obr'

As with any slicing, the first and second indices can be omitted, and default to the first and last characters respectively:
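
A small sketch of those defaults; the repeated-digit string is an arbitrary choice that makes the stride easy to see:

```python
s = '12345' * 5   # '1234512345123451234512345'

every_fifth = s[::5]            # both indices omitted: indices 0, 5, 10, 15, 20
every_fifth_from_4 = s[4::5]    # second index omitted: indices 4, 9, 14, 19, 24

print(every_fifth)         # 11111
print(every_fifth_from_4)  # 55555
```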

You can specify a negative stride value as well, in which case Python steps backward through the string. In that case, the starting/first index should be greater than the ending/second index:

>>> s = 'foobar'
>>> s[6:0:-2]
'rbo'

In the above example, 6:0:-2 means “start at the last character and step backward by 2, up to but not including the first character.”

When you are stepping backward, if the first and second indices are omitted, the defaults are reversed in an intuitive way: the first index defaults to the end of the string, and the second index defaults to the beginning. Here is an example:
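
A sketch with an arbitrary repeated-digit string, chosen so the backward stride is easy to follow:

```python
s = '12345' * 5   # 25 characters, indices 0 through 24

# Both indices omitted with a negative stride: starts at the last
# character (index 24) and visits indices 24, 19, 14, 9, 4.
backward = s[::-5]

print(backward)  # 55555
```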

This is a common paradigm for reversing a string:

>>> s = 'If Comrade Napoleon says it, it must be right.'
>>> s[::-1]
'.thgir eb tsum ti ,ti syas noelopaN edarmoC fI'

Interpolating Variables Into a String

In Python version 3.6, a new string formatting mechanism was introduced. This feature is formally named the Formatted String Literal, but is more usually referred to by its nickname f-string.

The formatting capability provided by f-strings is extensive and won’t be covered in full detail here. If you want to learn more, you can check out the Real Python article Python 3’s f-Strings: An Improved String Formatting Syntax (Guide). There is also a tutorial on Formatted Output coming up later in this series that digs deeper into f-strings.

One simple feature of f-strings you can start using right away is variable interpolation. You can specify a variable name directly within an f-string literal, and Python will replace the name with the corresponding value.

For example, suppose you want to display the result of an arithmetic calculation. You can do this with a straightforward print() statement, separating numeric values and string literals by commas:
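
A sketch of that comma-separated approach, with n = 20 and m = 25 as sample values:

```python
n = 20
m = 25
prod = n * m

# print() inserts a single space between each comma-separated argument.
print('The product of', n, 'and', m, 'is', prod)
# The product of 20 and 25 is 500
```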

But this is cumbersome. To accomplish the same thing using an f-string:

  • Specify either a lowercase f or uppercase F directly before the opening quote of the string literal. This tells Python it is an f-string instead of a standard string.
  • Specify any variables to be interpolated in curly braces ({}).

Recast using an f-string, the above example looks much cleaner:

>>> n = 20
>>> m = 25
>>> prod = n * m
>>> print(f'The product of {n} and {m} is {prod}')
The product of 20 and 25 is 500

Any of Python’s three quoting mechanisms can be used to define an f-string:
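
A brief sketch showing single, double, and triple quotes; the variable var and its value are made up for illustration:

```python
var = 'Bark'

single = f'A dog says {var}!'
double = f"A dog says {var}!"
triple = f'''A dog says {var}!'''

print(single)  # A dog says Bark!
print(double)  # A dog says Bark!
print(triple)  # A dog says Bark!
```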

Modifying Strings

In a nutshell, you can’t. Strings are one of the data types Python considers immutable, meaning not able to be changed. In fact, all the data types you have seen so far are immutable. (Python does provide data types that are mutable, as you will soon see.)

A statement like this will cause an error:

>>> s = 'foobar'
>>> s[3] = 'x'
Traceback (most recent call last):
  File "<pyshell#40>", line 1, in <module>
    s[3] = 'x'
TypeError: 'str' object does not support item assignment

In truth, there really isn’t much need to modify strings. You can usually easily accomplish what you want by generating a copy of the original string that has the desired change in place. There are very many ways to do this in Python. Here is one possibility:
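
For example, slicing and concatenation can build the changed copy. This is only one sketch of the idea, replacing the character at index 3:

```python
s = 'foobar'

# Everything before index 3, the new character, then everything after index 3.
s = s[:3] + 'x' + s[4:]

print(s)  # fooxar
```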

There is also a built-in string method to accomplish this:

>>> s = 'foobar'
>>> s = s.replace('b', 'x')
>>> s
'fooxar'

Read on for more information about built-in string methods!

Built-in String Methods

You learned in the tutorial on Variables in Python that Python is a highly object-oriented language. Every item of data in a Python program is an object.

You are also familiar with functions: callable procedures that you can invoke to perform specific tasks.

Methods are similar to functions. A method is a specialized type of callable procedure that is tightly associated with an object. Like a function, a method is called to perform a distinct task, but it is invoked on a specific object and has knowledge of its target object during execution.

The syntax for invoking a method on an object is as follows:

obj.foo(<args>)

This invokes method .foo() on object obj. <args> specifies the arguments passed to the method (if any).

You will explore much more about defining and calling methods later in the discussion of object-oriented programming. For now, the goal is to present some of the more commonly used built-in methods Python supports for operating on string objects.

In the following method definitions, arguments specified in square brackets ([]) are optional.

Case Conversion

Methods in this group perform case conversion on the target string.

s.capitalize()

Capitalizes the target string.

s.capitalize() returns a copy of s with the first character converted to uppercase and all other characters converted to lowercase:

>>> s = 'foO BaR BAZ quX'
>>> s.capitalize()
'Foo bar baz qux'

Non-alphabetic characters are unchanged:
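
A minimal sketch with an arbitrary string that mixes letters, digits, and punctuation:

```python
s = 'foo123#BAR#.'

# Digits and punctuation pass through untouched; only letters change case.
result = s.capitalize()

print(result)  # Foo123#bar#.
```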

s.lower()

Converts alphabetic characters to lowercase.

s.lower() returns a copy of s with all alphabetic characters converted to lowercase:

>>> 'FOO Bar 123 baz qUX'.lower()
'foo bar 123 baz qux'

s.swapcase()

Swaps case of alphabetic characters.

s.swapcase() returns a copy of s with uppercase alphabetic characters converted to lowercase and vice versa:
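
A short sketch, reusing the sample string from the .lower() example:

```python
swapped = 'FOO Bar 123 baz qUX'.swapcase()

print(swapped)  # foo bAR 123 BAZ Qux
```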

s.title()

Converts the target string to “title case.”

s.title() returns a copy of s in which the first letter of each word is converted to uppercase and remaining letters are lowercase:

>>> 'the sun also rises'.title()
'The Sun Also Rises'

This method uses a fairly simple algorithm. It does not attempt to distinguish between important and unimportant words, and it does not handle apostrophes, possessives, or acronyms gracefully:
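
A sketch with a made-up sentence that contains possessives and an acronym:

```python
# Hypothetical sample text chosen to expose the rough edges.
titled = "what's happened to ted's IBM stock?".title()

# The letter after each apostrophe is capitalized, and 'IBM' loses its casing.
print(titled)  # What'S Happened To Ted'S Ibm Stock?
```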

s.upper()

Converts alphabetic characters to uppercase.

s.upper() returns a copy of s with all alphabetic characters converted to uppercase:

>>> 'FOO Bar 123 baz qUX'.upper()
'FOO BAR 123 BAZ QUX'

Find and Replace

These methods provide various means of searching the target string for a specified substring.

Each method in this group supports optional <start> and <end> arguments. These are interpreted as for string slicing: the action of the method is restricted to the portion of the target string starting at character position <start> and proceeding up to but not including character position <end>. If <start> is specified but <end> is not, the method applies to the portion of the target string from <start> through the end of the string.

s.count(<sub>[, <start>[, <end>]])

Counts occurrences of a substring in the target string.

s.count(<sub>) returns the number of non-overlapping occurrences of substring <sub> in s:
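
A minimal sketch with an arbitrary sample string:

```python
# 'oo' occurs once in each of the three words.
total = 'foo goo moo'.count('oo')

print(total)  # 3
```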

The count is restricted to the number of occurrences within the substring indicated by <start> and <end>, if they are specified:

>>> 'foo goo moo'.count('oo', 0, 8)
2

s.endswith(<suffix>[, <start>[, <end>]])

Determines whether the target string ends with a given substring.

s.endswith(<suffix>) returns True if s ends with the specified <suffix> and False otherwise:
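
A quick sketch using 'foobar' as sample data:

```python
ends_with_bar = 'foobar'.endswith('bar')
ends_with_baz = 'foobar'.endswith('baz')

print(ends_with_bar)  # True
print(ends_with_baz)  # False
```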

The comparison is restricted to the substring indicated by <start> and <end>, if they are specified:

>>> 'foobar'.endswith('oob', 0, 4)
True
>>> 'foobar'.endswith('oob', 2, 4)
False

s.find(<sub>[, <start>[, <end>]])

Searches the target string for a given substring.

s.find(<sub>) returns the lowest index in s where substring <sub> is found:
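
A sketch with a sample string in which 'foo' occurs three times:

```python
# 'foo' appears at indices 0, 8, and 16; .find() reports the lowest.
position = 'foo bar foo baz foo qux'.find('foo')

print(position)  # 0
```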

This method returns -1 if the specified substring is not found:

>>> 'foo bar foo baz foo qux'.find('grault')
-1

The search is restricted to the substring indicated by <start> and <end>, if they are specified:
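
A sketch of a restricted forward search; the start and end values are arbitrary:

```python
s = 'foo bar foo baz foo qux'

from_4 = s.find('foo', 4)        # skips the match at index 0
in_4_to_7 = s.find('foo', 4, 7)  # s[4:7] is 'bar', so nothing is found

print(from_4)     # 8
print(in_4_to_7)  # -1
```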

s.index(<sub>[, <start>[, <end>]])

Searches the target string for a given substring.

This method is identical to .find(), except that it raises an exception if <sub> is not found rather than returning -1:

>>> 'foo bar foo baz foo qux'.index('grault')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    'foo bar foo baz foo qux'.index('grault')
ValueError: substring not found

s.rfind(<sub>[, <start>[, <end>]])

Searches the target string for a given substring starting at the end.

s.rfind(<sub>) returns the highest index in s where substring <sub> is found:
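
A sketch with the same kind of sample string, where 'foo' occurs three times:

```python
# 'foo' appears at indices 0, 8, and 16; .rfind() reports the highest.
position = 'foo bar foo baz foo qux'.rfind('foo')

print(position)  # 16
```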

As with .find(), if the substring is not found, -1 is returned:

>>> 'foo bar foo baz foo qux'.rfind('grault')
-1

The search is restricted to the substring indicated by <start> and <end>, if they are specified:
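
A sketch of a restricted backward search; the bounds are arbitrary:

```python
s = 'foo bar foo baz foo qux'

in_0_to_14 = s.rfind('foo', 0, 14)    # s[0:14] is 'foo bar foo ba'
in_10_to_14 = s.rfind('foo', 10, 14)  # s[10:14] is 'o ba', so no match

print(in_0_to_14)   # 8
print(in_10_to_14)  # -1
```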

s.rindex(<sub>[, <start>[, <end>]])

Searches the target string for a given substring starting at the end.

This method is identical to .rfind(), except that it raises an exception if <sub> is not found rather than returning -1:

>>> 'foo bar foo baz foo qux'.rindex('grault')
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    'foo bar foo baz foo qux'.rindex('grault')
ValueError: substring not found

s.startswith(<prefix>[, <start>[, <end>]])

Determines whether the target string starts with a given substring.

s.startswith(<prefix>) returns True if s starts with the specified <prefix> and False otherwise:
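
A quick sketch using 'foobar' as sample data:

```python
starts_with_foo = 'foobar'.startswith('foo')
starts_with_bar = 'foobar'.startswith('bar')

print(starts_with_foo)  # True
print(starts_with_bar)  # False
```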

The comparison is restricted to the substring indicated by <start> and <end>, if they are specified:

>>> 'foobar'.startswith('bar', 3)
True
>>> 'foobar'.startswith('bar', 3, 2)
False

Character Classification

Methods in this group classify a string based on the characters it contains.

s.isalnum()

Determines whether the target string consists of alphanumeric characters.

s.isalnum() returns True if s is nonempty and all its characters are alphanumeric (either a letter or a number), and False otherwise:
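
A short sketch with arbitrary sample strings:

```python
letters_and_digits = 'abc123'.isalnum()
with_symbol = 'abc$123'.isalnum()    # '$' is not alphanumeric
empty = ''.isalnum()                 # the empty string never qualifies

print(letters_and_digits)  # True
print(with_symbol)         # False
print(empty)               # False
```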

s.isalpha()

Determines whether the target string consists of alphabetic characters.

s.isalpha() returns True if s is nonempty and all its characters are alphabetic, and False otherwise:

>>> 'ABCabc'.isalpha()
True
>>> 'abc123'.isalpha()
False

s.isdigit()

Determines whether the target string consists of digit characters.

s.isdigit() returns True if s is nonempty and all its characters are numeric digits, and False otherwise:
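
A minimal sketch with arbitrary sample strings:

```python
only_digits = '123'.isdigit()
digits_and_letters = '123abc'.isdigit()

print(only_digits)         # True
print(digits_and_letters)  # False
```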

s.isidentifier()

Determines whether the target string is a valid Python identifier.

s.isidentifier() returns True if s is a valid Python identifier according to the language definition, and False otherwise:

>>> 'foo32'.isidentifier()
True
>>> '32foo'.isidentifier()
False
>>> 'foo$32'.isidentifier()
False

Note: .isidentifier() will return True for a string that matches a Python keyword even though that would not actually be a valid identifier:
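
A one-line sketch, using the keyword and as the sample:

```python
# 'and' is a reserved word, yet it has the lexical form of an identifier.
is_ident = 'and'.isidentifier()

print(is_ident)  # True
```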

You can test whether a string matches a Python keyword using a function called iskeyword(), which is contained in a module called keyword. One possible way to do this is shown below:

>>> from keyword import iskeyword
>>> iskeyword('and')
True

If you really want to ensure that a string would serve as a valid Python identifier, you should check that .isidentifier() is True and that iskeyword() is False.

See Python Modules and Packages—An Introduction to read more about Python modules.

s.islower()

Determines whether the target string’s alphabetic characters are lowercase.

s.islower() returns True if s is nonempty and all the alphabetic characters it contains are lowercase, and False otherwise. Non-alphabetic characters are ignored:
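
A sketch with arbitrary sample strings:

```python
all_lower = 'abc'.islower()
lower_with_symbols = 'abc1$d'.islower()   # '1' and '$' are ignored
mixed_case = 'Abc1$D'.islower()

print(all_lower)           # True
print(lower_with_symbols)  # True
print(mixed_case)          # False
```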

s.isprintable()

Determines whether the target string consists entirely of printable characters.

s.isprintable() returns True if s is empty or all the characters it contains are printable. It returns False if s contains at least one non-printable character:

>>> 'a\tb'.isprintable()
False
>>> 'a b'.isprintable()
True
>>> ''.isprintable()
True
>>> 'a\nb'.isprintable()
False

Note: Of the .isxxxx() methods covered here, this is the only one that returns True if s is an empty string. All the others return False for an empty string.

s.isspace()

Determines whether the target string consists of whitespace characters.

s.isspace() returns True if s is nonempty and all characters are whitespace characters, and False otherwise.

The most commonly encountered whitespace characters are space ' ', tab '\t', and newline '\n':
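
A minimal sketch with arbitrary sample strings:

```python
all_space = ' \t \n '.isspace()      # spaces, a tab, and a newline
not_all_space = '   a   '.isspace()  # the letter 'a' is not whitespace

print(all_space)      # True
print(not_all_space)  # False
```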

However, there are a few other ASCII characters that qualify as whitespace, and if you account for Unicode characters, there are quite a few beyond that:

>>> '\f\u2005\r'.isspace()
True

('\f' and '\r' are the escape sequences for the ASCII Form Feed and Carriage Return characters; '\u2005' is the escape sequence for the Unicode Four-Per-Em Space.)

s.istitle()

Determines whether the target string is title cased.

s.istitle() returns True if s is nonempty, the first alphabetic character of each word is uppercase, and all other alphabetic characters in each word are lowercase. It returns False otherwise:
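
A sketch with made-up sample strings:

```python
proper_title = 'This Is A Title'.istitle()
sentence_case = 'This is a title'.istitle()
with_symbols = 'Give Me The #$#@ Ball!'.istitle()   # symbols are uncased

print(proper_title)   # True
print(sentence_case)  # False
print(with_symbols)   # True
```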

Note: Here is how the Python documentation describes .istitle(), in case you find this more intuitive: “Uppercase characters may only follow uncased characters and lowercase characters only cased ones.”

s.isupper()

Determines whether the target string’s alphabetic characters are uppercase.

s.isupper() returns True if s is nonempty and all the alphabetic characters it contains are uppercase, and False otherwise. Non-alphabetic characters are ignored:

>>> 'ABC'.isupper()
True
>>> 'ABC1$D'.isupper()
True
>>> 'Abc1$D'.isupper()
False

String Formatting

Methods in this group modify or enhance the format of a string.

s.center(<width>[, <fill>])

Centers a string in a field.

s.center(<width>) returns a string consisting of s centered in a field of width <width>. By default, padding consists of the ASCII space character:
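
A minimal sketch of the default behavior, using an illustrative sample string. When the padding cannot be split evenly, CPython puts the extra space on the right in this case:

```python
centered = "foo".center(10)

# repr() makes the padding spaces visible.
print(repr(centered))  # '   foo    '
print(len(centered))   # 10
```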

If the optional <fill> argument is specified, it is used as the padding character:

>>> 'bar'.center(10, '-')
'---bar----'

If s is already at least as long as <width>, it is returned unchanged:
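
The no-op case might look like this in practice (the width of 2 is an arbitrary illustrative value):

```python
original = "foo"
result = original.center(2)  # the field is narrower than the string

print(repr(result))  # 'foo'
```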

s.expandtabs(tabsize=8)

Expands tabs in a string.

s.expandtabs() replaces each tab character ('\t') with spaces. By default, spaces are filled in assuming a tab stop at every eighth column:

>>> 'a\tb\tc'.expandtabs()
'a       b       c'
>>> 'aaa\tbbb\tc'.expandtabs()
'aaa     bbb     c'

tabsize is an optional keyword parameter specifying alternate tab stop columns:
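
As a sketch of alternate tab stops, the calls below use a tab size of 4, an arbitrary illustrative value. The argument can be passed positionally or by keyword:

```python
positional = "a\tb\tc".expandtabs(4)
keyword = "aaa\tbbb\tc".expandtabs(tabsize=4)

# Each tab advances to the next multiple of 4 columns.
print(repr(positional))  # 'a   b   c'
print(repr(keyword))     # 'aaa bbb c'
```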

s.ljust(<width>[, <fill>])

Left-justifies a string in a field.

s.ljust(<width>) returns a string consisting of s left-justified in a field of width <width>. By default, padding consists of the ASCII space character:

>>> 'foo'.ljust(10)
'foo       '

If the optional <fill> argument is specified, it is used as the padding character:
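
A short sketch with an explicit fill character; the hyphen is an illustrative choice:

```python
padded = "foo".ljust(10, "-")

print(padded)  # foo-------
```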

If s is already at least as long as <width>, it is returned unchanged:

>>> 'foo'.ljust(2)
'foo'

s.lstrip([<chars>])

Trims leading characters from a string.

s.lstrip() returns a copy of s with any whitespace characters removed from the left end:
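
The default whitespace trimming can be sketched as follows. The sample strings are assumptions made for illustration:

```python
spaces = "   foo bar baz   ".lstrip()
mixed = "\t\nfoo\t\nbar\t\nbaz".lstrip()

# Only the left end is trimmed; interior and trailing whitespace survive.
print(repr(spaces))  # 'foo bar baz   '
print(repr(mixed))   # 'foo\t\nbar\t\nbaz'
```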

If the optional <chars> argument is specified, it is a string that specifies the set of characters to be removed:

>>> 'http://www.realpython.com'.lstrip('/:pth')
'www.realpython.com'

s.replace(<old>, <new>[, <count>])

Replaces occurrences of a substring within a string.

s.replace(<old>, <new>) returns a copy of s with all occurrences of substring <old> replaced by <new>:
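
Here is a small sketch of replacing every occurrence. The variable names are illustrative. Note that the original string is left untouched, since strings are immutable:

```python
text = "foo bar foo baz foo qux"
replaced = text.replace("foo", "grault")

print(replaced)  # grault bar grault baz grault qux
print(text)      # foo bar foo baz foo qux
```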

If the optional <count> argument is specified, a maximum of <count> replacements are performed, starting at the left end of s:

>>> 'foo bar foo baz foo qux'.replace('foo', 'grault', 2)
'grault bar grault baz foo qux'

s.rjust(<width>[, <fill>])

Right-justifies a string in a field.

s.rjust(<width>) returns a string consisting of s right-justified in a field of width <width>. By default, padding consists of the ASCII space character:
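
A minimal sketch of the default space padding, with an illustrative sample string:

```python
right = "foo".rjust(10)

print(repr(right))  # '       foo'
```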

If the optional <fill> argument is specified, it is used as the padding character:

>>> 'foo'.rjust(10, '-')
'-------foo'

If s is already at least as long as <width>, it is returned unchanged:
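
For instance, with an illustrative width smaller than the string length:

```python
unchanged = "foo".rjust(2)

print(repr(unchanged))  # 'foo'
```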

s.rstrip([<chars>])

Trims trailing characters from a string.

s.rstrip() returns a copy of s with any whitespace characters removed from the right end:

>>> '   foo bar baz   '.rstrip()
'   foo bar baz'
>>> 'foo\t\nbar\t\nbaz\t\n'.rstrip()
'foo\t\nbar\t\nbaz'

If the optional <chars> argument is specified, it is a string that specifies the set of characters to be removed:
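
A sketch of trimming a custom character set from the right end. The sample string and character set are illustrative:

```python
trimmed = "foo.$$$;".rstrip(";$.")

# Every trailing character found in the set is removed, in any order.
print(trimmed)  # foo
```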

s.strip([<chars>])

Strips characters from the left and right ends of a string.

s.strip() is essentially equivalent to invoking s.lstrip() and s.rstrip() in succession. Without the <chars> argument, it removes leading and trailing whitespace:

>>> s = '   foo bar baz\t\t\t'
>>> s = s.lstrip()
>>> s = s.rstrip()
>>> s
'foo bar baz'

As with .lstrip() and .rstrip(), the optional <chars> argument specifies the set of characters to be removed:
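
The sketch below, built on illustrative strings, shows that <chars> is treated as a set of individual characters rather than as a prefix or suffix to match:

```python
framed = "--==foo bar==--".strip("-=")
scrambled = "abcabcfoocba".strip("abc")

print(framed)     # foo bar
print(scrambled)  # foo
```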

Note: When the return value of a string method is another string, as is often the case, methods can be invoked in succession by chaining the calls:

>>> '   foo bar baz\t\t\t'.lstrip().rstrip()
'foo bar baz'
>>> '   foo bar baz\t\t\t'.strip()
'foo bar baz'

>>> 'www.realpython.com'.lstrip('w.moc').rstrip('w.moc')
'realpython'
>>> 'www.realpython.com'.strip('w.moc')
'realpython'

s.zfill(<width>)

Pads a string on the left with zeros.

s.zfill(<width>) returns a copy of s left-padded with '0' characters to the specified <width>:
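
A minimal sketch with an illustrative numeric string:

```python
padded_number = "42".zfill(5)

print(padded_number)  # 00042
```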

If s contains a leading sign, it remains at the left edge of the result string after zeros are inserted:

>>> '+42'.zfill(8)
'+0000042'
>>> '-42'.zfill(8)
'-0000042'

If s is already at least as long as <width>, it is returned unchanged:
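
For example, with an illustrative width equal to the string length:

```python
same = "-42".zfill(3)

print(same)  # -42
```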

.zfill() is most useful for string representations of numbers, but Python will still happily zero-pad a string that isn’t:

>>> 'foo'.zfill(6)
'000foo'

Converting Between Strings and Lists

Methods in this group convert between a string and some composite data type by either pasting objects together to make a string, or by breaking a string up into pieces.

These methods operate on or return iterables, the general Python term for a sequential collection of objects. You will explore the inner workings of iterables in much more detail in the upcoming tutorial on definite iteration.

Many of these methods return either a list or a tuple. These are two similar composite data types that are prototypical examples of iterables in Python. They are covered in the next tutorial, so you’ll learn about them soon! Until then, simply think of them as sequences of values. A list is enclosed in square brackets ([]), and a tuple is enclosed in parentheses (()).

With that introduction, let’s take a look at this last group of string methods.

s.join(<iterable>)

Concatenates strings from an iterable.

s.join(<iterable>) returns the string that results from concatenating the objects in <iterable> separated by s.

Note that .join() is invoked on s, the separator string. The objects in <iterable> must be strings as well.

Some sample code should help clarify. In the following example, the separator s is the string ', ', and <iterable> is a list of string values:
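
The case just described, with ', ' as the separator and a list of strings as the iterable, can be sketched like this. The list contents and variable names are illustrative:

```python
words = ["foo", "bar", "baz", "qux"]
joined = ", ".join(words)  # the separator string is the object .join() is called on

print(joined)  # foo, bar, baz, qux
```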

The result is a single string consisting of the list objects separated by commas.

In the next example, <iterable> is specified as a single string value. When a string value is used as an iterable, it is interpreted as a list of the string’s individual characters:

>>> list('corge')
['c', 'o', 'r', 'g', 'e']

>>> ':'.join('corge')
'c:o:r:g:e'

Thus, the result of ':'.join('corge') is a string consisting of each character in 'corge' separated by ':'.

This example fails because one of the objects in <iterable> is not a string:
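
A sketch of such a failure, written as a script that catches the exception rather than as an interactive session. The list contents are illustrative, with 23 standing in for the non-string object:

```python
items = ["foo", 23, "bar"]  # 23 is an int, not a str
error = None

try:
    "---".join(items)
except TypeError as exc:
    # .join() refuses any item that is not a string.
    error = exc
    print("join() raised TypeError:", exc)
```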

That can be remedied, though:

>>> '---'.join(['foo', str(23), 'bar'])
'foo---23---bar'

As you will soon see, many composite objects in Python can be construed as iterables, and .join() is especially useful for creating strings from them.

s.partition(<sep>)

Divides a string based on a separator.

s.partition(<sep>) splits s at the first occurrence of string <sep>. The return value is a three-part tuple consisting of:

  • The portion of s preceding <sep>
  • <sep> itself
  • The portion of s following <sep>

Here are a couple of examples of .partition() in action:
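
A minimal sketch of .partition() with two illustrative inputs. In the second, the separator occurs twice and only the first occurrence is used:

```python
simple = "foo.bar".partition(".")
repeated = "foo@@bar@@baz".partition("@@")

print(simple)    # ('foo', '.', 'bar')
print(repeated)  # ('foo', '@@', 'bar@@baz')
```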

If <sep> is not found in s, the returned tuple contains s followed by two empty strings:

>>> 'foo.bar'.partition('@@')
('foo.bar', '', '')

Remember: Lists and tuples are covered in the next tutorial.

s.rpartition(<sep>)

Divides a string based on a separator.

s.rpartition(<sep>) functions exactly like s.partition(<sep>), except that s is split at the last occurrence of <sep> instead of the first occurrence:
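
The sketch below contrasts the two methods on an illustrative string. One further difference worth knowing: when the separator is not found, .rpartition() places s in the last position of the tuple rather than the first:

```python
text = "foo@@bar@@baz"

from_left = text.partition("@@")
from_right = text.rpartition("@@")
not_found = "foo.bar".rpartition("@@")

print(from_left)   # ('foo', '@@', 'bar@@baz')
print(from_right)  # ('foo@@bar', '@@', 'baz')
print(not_found)   # ('', '', 'foo.bar')
```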

s.rsplit(sep=None, maxsplit=-1)

Splits a string into a list of substrings.

Without arguments, s.rsplit() splits s into substrings delimited by any sequence of whitespace and returns the substrings as a list:

>>> 'foo bar baz qux'.rsplit()
['foo', 'bar', 'baz', 'qux']
>>> 'foo\n\tbar   baz\r\fqux'.rsplit()
['foo', 'bar', 'baz', 'qux']

If <sep> is specified, it is used as the delimiter for splitting:
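
A sketch of splitting on an explicit delimiter. The sample string is illustrative:

```python
parts = "foo.bar.baz.qux".rsplit(sep=".")

print(parts)  # ['foo', 'bar', 'baz', 'qux']
```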

(If <sep> is specified with a value of None, the string is split delimited by whitespace, just as though <sep> had not been specified at all.)

When <sep> is explicitly given as a delimiter, consecutive delimiters in s are assumed to delimit empty strings, which will be returned:

>>> 'foo...bar'.rsplit(sep='.')
['foo', '', '', 'bar']

This is not the case when <sep> is omitted, however. In that case, consecutive whitespace characters are combined into a single delimiter, and the resulting list will never contain empty strings:
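
To make the contrast concrete, the sketch below splits the same illustrative string both ways:

```python
text = "foo\t\t\tbar"

default_split = text.rsplit()           # runs of whitespace collapse into one delimiter
explicit_split = text.rsplit(sep="\t")  # each tab delimits a (possibly empty) field

print(default_split)   # ['foo', 'bar']
print(explicit_split)  # ['foo', '', '', 'bar']
```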

If the optional keyword parameter <maxsplit> is specified, a maximum of that many splits are performed, starting from the right end of s:

>>> 'www.realpython.com'.rsplit(sep='.', maxsplit=1)
['www.realpython', 'com']

The default value for <maxsplit> is -1, which means all possible splits should be performed—the same as if <maxsplit> is omitted entirely:
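
A short sketch comparing an explicit maxsplit of -1 with omitting the argument:

```python
url = "www.realpython.com"

explicit = url.rsplit(sep=".", maxsplit=-1)
omitted = url.rsplit(sep=".")

print(explicit)             # ['www', 'realpython', 'com']
print(explicit == omitted)  # True
```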

s.split(sep=None, maxsplit=-1)

Splits a string into a list of substrings.

s.split() behaves exactly like s.rsplit(), except that if <maxsplit> is specified, splits are counted from the left end of s rather than the right end:

>>> 'www.realpython.com'.split('.', maxsplit=1)
['www', 'realpython.com']
>>> 'www.realpython.com'.rsplit('.', maxsplit=1)
['www.realpython', 'com']

If <maxsplit> is not specified, .split() and .rsplit() are indistinguishable.

s.splitlines([<keepends>])

Breaks a string at line boundaries.

s.splitlines() splits s up into lines and returns them in a list. Any of the following characters or character sequences is considered to constitute a line boundary:

Escape Sequence    Character
\n                 Newline
\r                 Carriage Return
\r\n               Carriage Return + Line Feed
\v or \x0b         Line Tabulation
\f or \x0c         Form Feed
\x1c               File Separator
\x1d               Group Separator
\x1e               Record Separator
\x85               Next Line (C1 Control Code)
\u2028             Unicode Line Separator
\u2029             Unicode Paragraph Separator

Here is an example using several different line separators:
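
A sketch that mixes several of the separators listed in the table. The sample string is an illustrative assumption:

```python
# Separators used: \n, \r\n, \f, and \u2028.
text = "foo\nbar\r\nbaz\fqux\u2028quux"
lines = text.splitlines()

print(lines)  # ['foo', 'bar', 'baz', 'qux', 'quux']
```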

If consecutive line boundary characters are present in the string, they are assumed to delimit blank lines, which will appear in the result list:

>>> 'foo\f\f\fbar'.splitlines()
['foo', '', '', 'bar']

If the optional <keepends> argument is specified and is truthy, then the line boundaries are retained in the result strings:
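
A minimal sketch of keeping the line endings, using an illustrative string. Any truthy value works, and the argument can also be passed by keyword:

```python
text = "foo\nbar\nbaz\nqux"

kept = text.splitlines(True)
kept_keyword = text.splitlines(keepends=True)

print(kept)  # ['foo\n', 'bar\n', 'baz\n', 'qux']
```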

bytes Objects

The bytes object is one of the core built-in types for manipulating binary data. A bytes object is an immutable sequence of single byte values. Each element in a bytes object is a small integer in the range 0 to 255.

Defining a Literal bytes Object

A bytes literal is defined in the same way as a string literal with the addition of a 'b' prefix:

>>> b = b'foo bar baz'
>>> b
b'foo bar baz'
>>> type(b)
<class 'bytes'>

As with strings, you can use any of the single, double, or triple quoting mechanisms:
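
The quoting options can be sketched as follows. The literal contents and variable names are illustrative:

```python
single = b'Contains embedded "double" quotes'
double = b"Contains embedded 'single' quotes"
triple = b'''Contains embedded "double" and 'single' quotes'''

# Triple-quoted bytes literals may span lines, just like strings.
multi = b"""foo
bar"""

print(multi)  # b'foo\nbar'
```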

Only ASCII characters are allowed in a bytes literal. Any character value greater than 127 must be specified using an appropriate escape sequence:

>>> b = b'foo\xddbar'
>>> b
b'foo\xddbar'
>>> b[3]
221
>>> int(0xdd)
221

The 'r' prefix may be used on a bytes literal to disable processing of escape sequences, as with strings:
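
The effect of the raw prefix can be seen by comparing lengths in a short sketch. The variable names are illustrative:

```python
cooked = b"foo\xddbar"  # \xdd is processed into a single byte
raw = rb"foo\xddbar"    # backslash, 'x', 'd', 'd' stay as four separate bytes

print(len(cooked), len(raw))  # 7 10
print(raw[3])                 # 92, the byte value of a backslash
```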

Defining a bytes Object With the Built-in bytes() Function

The bytes() function also creates a bytes object. What sort of bytes object gets returned depends on the argument(s) passed to the function. The possible forms are shown below.

bytes(<s>, <encoding>)

Creates a bytes object from a string.

bytes(<s>, <encoding>) converts string <s> to a bytes object, using str.encode() according to the specified <encoding>:

>>> b = bytes('foo.bar', 'utf8')
>>> b
b'foo.bar'
>>> type(b)
<class 'bytes'>

Technical Note: In this form of the bytes() function, the <encoding> argument is required. “Encoding” refers to the manner in which characters are translated to integer values. A value of "utf8" indicates Unicode Transformation Format UTF-8, which is an encoding that can handle every possible Unicode character. UTF-8 can also be indicated by specifying "UTF8", "utf-8", or "UTF-8" for <encoding>.

See the Unicode documentation for more information. As long as you are dealing with common Latin-based characters, UTF-8 will serve you fine.

bytes(<size>)

Creates a bytes object consisting of null (0x00) bytes.

bytes(<size>) defines a bytes object of the specified <size>, which must be a non-negative integer. The resulting bytes object is initialized to null (0x00) bytes:
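
A sketch of this form, using an illustrative size of 8. A size of zero yields an empty bytes object, while a negative size raises ValueError:

```python
zeros = bytes(8)
print(zeros)       # b'\x00\x00\x00\x00\x00\x00\x00\x00'
print(len(zeros))  # 8

empty = bytes(0)
print(empty)       # b''

negative_rejected = False
try:
    bytes(-1)
except ValueError:
    negative_rejected = True
```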

bytes(<iterable>)

Creates a bytes object from an iterable.

bytes(<iterable>) defines a bytes object from the sequence of integers generated by <iterable>. <iterable> must be an iterable that generates a sequence of integers n in the range 0 ≤ n ≤ 255:

>>> b = bytes([100, 102, 104, 106, 108])
>>> b
b'dfhjl'
>>> type(b)
<class 'bytes'>
>>> b[2]
104

Operations on bytes Objects

Like strings, bytes objects support the common sequence operations:

  • The in and not in operators:

  • The concatenation (+) and replication (*) operators:

    >>> b = b'abcde'
    
    >>> b + b'fghi'
    b'abcdefghi'
    >>> b * 3
    b'abcdeabcdeabcde'
    
    
  • Indexing and slicing:

  • Built-in functions:

    >>> len(b)
    5
    >>> min(b)
    97
    >>> max(b)
    101
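
Two of the items in the list above, the membership operators and indexing and slicing, can be sketched with the same b'abcde' value used in the other items:

```python
b = b"abcde"

# Membership tests take a bytes operand.
print(b"cd" in b)       # True
print(b"foo" not in b)  # True

# Indexing yields an integer; slicing yields a bytes object.
print(b[2])    # 99
print(b[1:3])  # b'bc'
```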
    
    

Many of the methods defined for string objects are valid for bytes objects as well:
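
A few of the shared methods are sketched below on an illustrative bytes value. Every argument is itself a bytes object:

```python
b = b"foo,bar,foo,baz,foo,qux"

occurrences = b.count(b"foo")
ends_with_qux = b.endswith(b"qux")
baz_index = b.find(b"baz")
fields = b.split(sep=b",")

print(occurrences)    # 3
print(ends_with_qux)  # True
print(baz_index)      # 12
print(fields)         # [b'foo', b'bar', b'foo', b'baz', b'foo', b'qux']
```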

Notice, however, that when these operators and methods are invoked on a bytes object, the operand and arguments must be bytes objects as well:

>>> b = b'foo.bar'

>>> b + '.baz'
Traceback (most recent call last):
  File "<pyshell#72>", line 1, in <module>
    b + '.baz'
TypeError: can't concat bytes to str
>>> b + b'.baz'
b'foo.bar.baz'

>>> b.split(sep='.')
Traceback (most recent call last):
  File "<pyshell#74>", line 1, in <module>
    b.split(sep='.')
TypeError: a bytes-like object is required, not 'str'
>>> b.split(sep=b'.')
[b'foo', b'bar']

Although a bytes object definition and representation is based on ASCII text, it actually behaves like an immutable sequence of small integers in the range 0 to 255, inclusive. That is why a single element from a bytes object is displayed as an integer:
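
A minimal sketch of that behavior, using an illustrative b'abcde' value:

```python
b = b"abcde"

first = b[0]
third = b[2]

print(first, third)  # 97 99
print(type(third))   # <class 'int'>
```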

A slice is displayed as a bytes object though, even if it is only one byte long:

>>> b = b'abcde'
>>> b[2:3]
b'c'

You can convert a bytes object into a list of integers with the built-in list() function:
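
For example, with an illustrative bytes value:

```python
b = b"abcde"
values = list(b)

print(values)  # [97, 98, 99, 100, 101]
```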

Hexadecimal numbers are often used to specify binary data because two hexadecimal digits correspond directly to a single byte. The bytes class supports two additional methods that facilitate conversion to and from a string of hexadecimal digits.

bytes.fromhex(<s>)

Returns a bytes object constructed from a string of hexadecimal values.

bytes.fromhex(<s>) returns the bytes object that results from converting each pair of hexadecimal digits in <s> to the corresponding byte value. The hexadecimal digit pairs in <s> may optionally be separated by whitespace, which is ignored:

>>> b = bytes.fromhex(' aa 68 4682cc ')
>>> b
b'\xaahF\x82\xcc'
>>> list(b)
[170, 104, 70, 130, 204]

Note: This method is a class method, not an object method. It is bound to the bytes class, not a bytes object. You will delve much more into the distinction between classes, objects, and their respective methods in the upcoming tutorials on object-oriented programming. For now, just observe that this method is invoked on the bytes class, not on object b.

b.hex()

Returns a string of hexadecimal values from a bytes object.

b.hex() returns the result of converting bytes object b into a string of hexadecimal digit pairs. That is, it does the reverse of .fromhex():
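
A round-trip sketch that reuses the hexadecimal string from the .fromhex() example. The variable names are illustrative:

```python
b = bytes.fromhex(" aa 68 4682cc ")
hex_digits = b.hex()

print(hex_digits)        # aa684682cc
print(type(hex_digits))  # <class 'str'>

# Converting back yields the original bytes object.
print(bytes.fromhex(hex_digits) == b)  # True
```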

Note: As opposed to .fromhex(), .hex() is an object method, not a class method. Thus, it is invoked on an object of the bytes class, not on the class itself.

bytearray Objects

Python supports another binary sequence type called the bytearray. bytearray objects are very much like bytes objects, with some differences:

  • There is no dedicated syntax built into Python for defining a bytearray literal, like the 'b' prefix that may be used to define a bytes object. A bytearray object is always created using the bytearray() built-in function:

    >>> ba = bytearray('foo.bar.baz', 'UTF-8')
    >>> ba
    bytearray(b'foo.bar.baz')

    >>> bytearray(6)
    bytearray(b'\x00\x00\x00\x00\x00\x00')

    >>> bytearray([100, 102, 104, 106, 108])
    bytearray(b'dfhjl')

  • bytearray objects are mutable. You can modify the contents of a bytearray object using indexing and slicing:
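
The mutability described in the last item can be sketched as follows. The index, slice, and replacement values are illustrative choices:

```python
ba = bytearray("foo.bar.baz", "UTF-8")

ba[5] = 0xEE       # assign a single byte by index
print(ba)          # bytearray(b'foo.b\xeer.baz')

ba[8:11] = b"qux"  # replace a slice with the contents of a bytes object
print(ba)          # bytearray(b'foo.b\xeer.qux')
```

The same assignments on a bytes object would raise TypeError, because bytes objects are immutable.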

A bytearray object may be constructed directly from a bytes object as well:

>>> ba = bytearray(b'foo')
>>> ba
bytearray(b'foo')

Conclusion

This tutorial provided an in-depth look at the many different mechanisms Python provides for string handling, including string operators, built-in functions, indexing, slicing, and built-in methods. You also were introduced to the bytes and bytearray types.

These types are the first types you have examined that are composite—built from a collection of smaller parts. Python provides several composite built-in types. In the next tutorial, you will explore two of the most frequently used: lists and tuples.

Translated from: https://www.pybloggers.com/2018/07/strings-and-character-data-in-python/
