- Python 3 supports formatting values into strings. Although this can include very complicated expressions, the most basic usage is to insert a value into a string with a single placeholder.
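A minimal sketch of that basic usage, here with two placeholders. The variable names and sample values are illustrative assumptions, not something defined in these notes:

```python
# Hypothetical sample values, used only for illustration.
username = 'mark'
password = 'PapayaWhip'

# {0} and {1} are filled in by the first and second arguments to format().
message = "{0}'s password is {1}".format(username, password)
print(message)  # mark's password is PapayaWhip
```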
- There’s a lot going on here. First, that’s a method call on a string literal. Strings are objects, and objects have methods. Second, the whole expression evaluates to a string. Third, {0} and {1} are replacement fields, which are replaced by the arguments passed to the format() method.
- '1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(sys)
- The sys module holds information about the currently running Python instance. Since you just imported it, you can pass the sys module itself as an argument to the format() method. So the replacement field {0} refers to the sys module.
- sys.modules is a dictionary of all the modules that have been imported in this Python instance. The keys are the module names as strings; the values are the module objects themselves. So the replacement field {0.modules} refers to the dictionary of imported modules.
- sys.modules['humansize'] is the humansize module which you just imported. The replacement field {0.modules[humansize]} refers to the humansize module. Note the slight difference in syntax here. In real Python code, the keys of the sys.modules dictionary are strings; to refer to them, you need to put quotes around the module name (e.g. 'humansize'). But within a replacement field, you skip the quotes around the dictionary key name (e.g. humansize). To quote PEP 3101: Advanced String Formatting, “The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.”
- sys.modules['humansize'].SUFFIXES is the dictionary defined at the top of the humansize module. The replacement field {0.modules[humansize].SUFFIXES} refers to that dictionary.
- sys.modules['humansize'].SUFFIXES[1000] is a list of SI suffixes: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']. So the replacement field {0.modules[humansize].SUFFIXES[1000]} refers to that list.
- sys.modules['humansize'].SUFFIXES[1000][0] is the first item of the list of SI suffixes: 'KB'. Therefore, the complete replacement field {0.modules[humansize].SUFFIXES[1000][0]} is replaced by the two-character string KB.
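The walk-through above can be checked with a small sketch. The humansize module is not available here, so the code registers a hypothetical stand-in module under that name. Its SUFFIXES[1000] list is the one quoted above, while the 1024 entry is an assumption added for illustration:

```python
import sys
import types

# Hypothetical stand-in for the humansize module discussed in the notes.
# Only the SUFFIXES[1000] list is taken from the text; the 1024 entry is
# an illustrative assumption.
humansize = types.ModuleType('humansize')
humansize.SUFFIXES = {
    1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
    1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'],
}
sys.modules['humansize'] = humansize

# Inside the replacement field, 'humansize' is written without quotes and
# is used as a string key; 1000 and 0 start with a digit, so they are
# treated as numbers.
result = '1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(sys)
print(result)  # 1MB = 1000KB
```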
- Format specifiers allow you to munge the replacement text in a variety of useful ways, like the printf() function in C. You can add zero- or space-padding, align strings, control decimal precision, and even convert numbers to hexadecimal.
- The splitlines() method takes one multiline string and returns a list of strings, one for each line of the original. Note that the line-ending characters at the end of each line are not included.
- The split() string method is normally called with one argument, a delimiter, and splits the string into a list of strings based on that delimiter. (If you omit the delimiter, the string is split on runs of whitespace.) Here, the delimiter is an ampersand character, but it could be anything.
- The optional second argument to the split() method is the number of times you want to split. 1 means “only split once,” so the split() method will return a two-item list. (In theory, a value could contain an equals sign too. If you just used 'key=value=foo'.split('='), you would end up with a three-item list ['key', 'value', 'foo'].)
- Once you’ve defined a string, you can get any part of it as a new string.
- This is called slicing the string. Slicing strings works exactly the same as slicing lists, which makes sense, because strings are just sequences of characters.
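The string operations described in the last few bullets (format specifiers, splitlines(), split() with a maximum split count, and slicing) can be sketched together as follows. All sample strings and variable names are illustrative assumptions:

```python
# Format specifiers: precision, zero-padding, alignment, hexadecimal.
size_text = '{0:.1f} {1}'.format(698.24, 'GB')  # one decimal place
padded = '{0:05d}'.format(42)                    # zero-padded to width 5
aligned = '{0:>8}'.format('abc')                 # right-aligned in width 8
hex_text = '{0:x}'.format(255)                   # lowercase hexadecimal
print(size_text)  # 698.2 GB
print(padded)     # 00042
print(hex_text)   # ff

# splitlines(): one list item per line, without the line endings.
text = 'first line\nsecond line\nthird line'
lines = text.splitlines()
print(lines)  # ['first line', 'second line', 'third line']

# split(): break a query string on '&', then split each pair only once
# on '=' so that a value containing '=' stays in one piece.
query = 'user=pilgrim&database=master&password=PapayaWhip'
pairs = [pair.split('=', 1) for pair in query.split('&')]
a_dict = dict(pairs)
print(a_dict['database'])              # master
print('key=value=foo'.split('='))      # ['key', 'value', 'foo']
print('key=value=foo'.split('=', 1))   # ['key', 'value=foo']

# Slicing: works on strings the same way it works on lists.
a_string = 'My alphabet starts where your alphabet ends.'
print(a_string[3:11])  # alphabet
print(a_string[:18])   # My alphabet starts
print(a_string[18:])   # ' where your alphabet ends.' (leading space)
```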
- To define a bytes object, use the b'' “byte literal” syntax. Each byte within the byte literal can be an ASCII character or an encoded hexadecimal number from \x00 to \xff (0–255).
- The type of a bytes object is bytes.
- Just like lists and strings, you can use index notation to get individual bytes in a bytes object. The items of a string are strings; the items of a bytes object are integers. Specifically, integers between 0 and 255.
- A bytes object is immutable; you cannot assign individual bytes. If you need to change individual bytes, you can either use slicing and concatenation operators (which work the same as they do on strings), or you can convert the bytes object into a bytearray object.
- A bytearray object supports the same indexing and slicing operations as a bytes object. The key difference is that, with a bytearray object, you can assign individual bytes using index notation. The assigned value must be an integer between 0 and 255.
- The one thing you can never do is mix bytes and strings.
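A short sketch of the bytes and bytearray behaviour described above. The variable names and byte values are illustrative assumptions:

```python
# A byte literal may mix ASCII characters and \xNN escapes.
by = b'abcd\x65'
print(by)        # b'abcde'  (\x65 is the ASCII code for 'e')
print(type(by))  # <class 'bytes'>

# Concatenation builds a new bytes object.
by += b'\xff'
print(len(by))   # 6

# Indexing a bytes object yields an integer, not a one-byte bytes object.
first_byte = by[0]
print(first_byte)  # 97

# bytes objects are immutable: item assignment raises TypeError.
assignment_failed = False
try:
    by[0] = 102
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

# A bytearray is the mutable counterpart; assigned values must be 0-255.
barr = bytearray(by)
barr[0] = 102
print(barr)  # bytearray(b'fbcde\xff')
```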
- You can’t count the occurrences of bytes in a string, because there are no bytes in a string. A string is a sequence of characters. Perhaps you meant “count the occurrences of the string that you would get after decoding this sequence of bytes in a particular character encoding”? Well then, you’ll need to say that explicitly. Python 3 won’t implicitly convert bytes to strings or strings to bytes.
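To illustrate the point, the following sketch shows both kinds of mixing failing with a TypeError, and then the explicit decode that makes the count work. The sample values are assumptions chosen for illustration:

```python
by = b'd'
s = 'abcde'

# Concatenating bytes and str is refused outright.
concat_failed = False
try:
    by + s
except TypeError:
    concat_failed = True

# Counting bytes inside a str is refused too: a str contains characters.
count_failed = False
try:
    s.count(by)
except TypeError:
    count_failed = True

print(concat_failed, count_failed)  # True True

# Saying explicitly which encoding the bytes are in makes it work.
occurrences = s.count(by.decode('ascii'))
print(occurrences)  # 1
```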
- And here is the link between strings and bytes: bytes objects have a decode() method that takes a character encoding and returns a string, and strings have an encode() method that takes a character encoding and returns a bytes object. In the previous example, the decoding was relatively straightforward: converting a sequence of bytes in the ASCII encoding into a string of characters. But the same process works with any encoding that supports the characters of the string, even legacy (non-Unicode) encodings.
- In Python 2, the default encoding for .py files was ASCII. In Python 3, the default encoding is UTF-8.
- If you would like to use a different encoding within your Python code, you can put an encoding declaration on the first line of each file.
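As a rough sketch of the encode()/decode() round trip and of what an encoding declaration looks like: the sample string is an assumption, written with \u escapes so the source stays pure ASCII. The declaration on the first line merely restates the Python 3 default (UTF-8) to show the syntax:

```python
# -*- coding: utf-8 -*-
# The line above is an encoding declaration; it only takes effect when it
# is the first (or second) line of the file.

# Two Chinese characters followed by ' Python' (illustrative sample).
a_string = '\u6df1\u5165 Python'
print(len(a_string))  # 9 characters

# Encoding turns the string into bytes; the byte count depends on the
# encoding chosen.
utf8_bytes = a_string.encode('utf-8')
print(len(utf8_bytes))  # 13 bytes

gb_bytes = a_string.encode('gb18030')
print(len(gb_bytes))  # 11 bytes

# Decoding with the same encoding recovers the original string.
roundtrip = gb_bytes.decode('gb18030')
print(roundtrip == a_string)  # True
```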
Dive into Python 3: Key Excerpts (Chapter 4)