Python String Formatting Tutorial

Remember the Zen of Python and how there should be “one obvious way to do something in Python”? You might scratch your head when you find out that there are four major ways to do string formatting in Python.

In this tutorial, you’ll learn the four main approaches to string formatting in Python, as well as their strengths and weaknesses. You’ll also get a simple rule of thumb for how to pick the best general purpose string formatting approach in your own programs.

Let’s jump right in, as we’ve got a lot to cover. In order to have a simple toy example for experimentation, let’s assume you’ve got the following variables (or constants, really) to work with:

>>> errno = 50159747054
>>> name = 'Bob'

Based on these variables, you’d like to generate an output string containing a simple error message:

'Hey Bob, there is a 0xbadc0ffee error!'

That error could really spoil a dev’s Monday morning… But we’re here to discuss string formatting. So let’s get to work.

#1 “Old Style” String Formatting (% Operator)

Strings in Python have a unique built-in operation that can be accessed with the % operator. This lets you do simple positional formatting very easily. If you’ve ever worked with a printf-style function in C, you’ll recognize how this works instantly. Here’s a simple example:

>>> 'Hello, %s' % name
'Hello, Bob'

I’m using the %s format specifier here to tell Python where to substitute the value of name, represented as a string.

There are other format specifiers available that let you control the output format. For example, it’s possible to convert numbers to hexadecimal notation or add whitespace padding to generate nicely formatted tables and reports. (See Python Docs: “printf-style String Formatting”.)

Here, you can use the %x format specifier to convert an int value to a string and to represent it as a hexadecimal number:
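
As a minimal sketch, reusing the errno value from the start of the tutorial (the names hex_digits and prefixed are only illustrative):

```python
errno = 50159747054  # toy error number defined at the start of the tutorial

# %x converts the int to lowercase hexadecimal digits, without a "0x" prefix.
hex_digits = '%x' % errno
print(hex_digits)  # badc0ffee

# The "#" flag asks for the alternate form, which adds the "0x" prefix.
prefixed = '%#x' % errno
print(prefixed)  # 0xbadc0ffee
```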

The “old style” string formatting syntax changes slightly if you want to make multiple substitutions in a single string. Because the % operator takes only one argument, you need to wrap the right-hand side in a tuple, like so:

>>> 'Hey %s, there is a 0x%x error!' % (name, errno)
'Hey Bob, there is a 0xbadc0ffee error!'

It’s also possible to refer to variable substitutions by name in your format string, if you pass a mapping to the % operator:
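
A short sketch of what that could look like with the same toy variables (the dictionary keys here are my own choice; any names work as long as they match the format string):

```python
name = 'Bob'
errno = 50159747054

# %(key)s and %(key)x look their values up by key in the mapping on the right.
message = 'Hey %(name)s, there is a 0x%(errno)x error!' % {
    'name': name,
    'errno': errno,
}
print(message)  # Hey Bob, there is a 0xbadc0ffee error!
```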

This makes your format strings easier to maintain and easier to modify in the future. You don’t have to worry about making sure the order you’re passing in the values matches up with the order in which the values are referenced in the format string. Of course, the downside is that this technique requires a little more typing.

I’m sure you’ve been wondering why this printf-style formatting is called “old style” string formatting. It was technically superseded by “new style” formatting in Python 3, which we’re going to talk about next.

#2 “New Style” String Formatting (str.format)

Python 3 introduced a new way to do string formatting that was also back-ported to Python 2 (str.format() is available from Python 2.6, and auto-numbered {} fields from 2.7). This “new style” string formatting gets rid of the %-operator special syntax and makes the syntax for string formatting more regular. Formatting is now handled by calling .format() on a string object.

You can use format() to do simple positional formatting, just like you could with “old style” formatting:

>>> 'Hello, {}'.format(name)
'Hello, Bob'

Or, you can refer to your variable substitutions by name and use them in any order you want. This is quite a powerful feature as it allows for re-arranging the order of display without changing the arguments passed to format():
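
Here is a small sketch of named substitutions with the toy variables from earlier. The second format string (reordered) is a made-up variation, only there to show that the arguments stay the same while the display order changes:

```python
name = 'Bob'
errno = 50159747054

# Fields are referenced by keyword; ":x" formats the int as hexadecimal.
message = 'Hey {name}, there is a 0x{errno:x} error!'.format(
    name=name, errno=errno)
print(message)  # Hey Bob, there is a 0xbadc0ffee error!

# Same arguments, different display order: only the format string changes.
reordered = 'Error 0x{errno:x} was reported by {name}.'.format(
    name=name, errno=errno)
print(reordered)  # Error 0xbadc0ffee was reported by Bob.
```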

This also shows that the syntax to format an int variable as a hexadecimal string has changed. Now you need to pass a format spec by adding a :x suffix. The format string syntax has become more powerful without complicating the simpler use cases. It pays off to read up on this string formatting mini-language in the Python documentation.
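
To give a feel for that mini-language, here are a few specs I find handy. The values and variable names are arbitrary examples, not part of the error-message scenario:

```python
# Each spec follows the colon inside the braces: {value:spec}
right_aligned = '{:>8.2f}'.format(3.14159)  # width 8, 2 decimals
centered = '{:^9}'.format('mid')            # centered in 9 columns
thousands = '{:,}'.format(1234567)          # comma as thousands separator
zero_padded = '{:05d}'.format(42)           # zero-padded to width 5

print(repr(right_aligned))  # '    3.14'
print(repr(centered))       # '   mid   '
print(thousands)            # 1,234,567
print(zero_padded)          # 00042
```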

In Python 3, this “new style” string formatting is to be preferred over %-style formatting. While “old style” formatting has been de-emphasized, it has not been deprecated. It is still supported in the latest versions of Python. According to this discussion on the Python dev email list and this issue on the Python dev bug tracker, %-formatting is going to stick around for a long time to come.

Still, the official Python 3 documentation doesn’t exactly recommend “old style” formatting or speak too fondly of it:

“The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.” (Source)
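
To make the tuple quirk from that quote concrete, here is a minimal sketch. The point variable is a hypothetical example of mine, not something from the documentation:

```python
point = (1, 2)

# Quirk: a tuple on the right-hand side is unpacked as the argument list,
# so the single %s sees two arguments and Python raises TypeError.
error_message = None
try:
    'Point: %s' % point
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Old-style workaround: wrap the tuple in a one-element tuple.
wrapped = 'Point: %s' % (point,)
# str.format() treats the tuple as a single value, with no special casing.
new_style = 'Point: {}'.format(point)
print(wrapped)    # Point: (1, 2)
print(new_style)  # Point: (1, 2)
```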

This is why I’d personally try to stick with str.format for new code moving forward. Starting with Python 3.6, there’s yet another way to format your strings. I’ll tell you all about it in the next section.

#3 String Interpolation / f-Strings (Python 3.6+)

Python 3.6 added a new string formatting approach called formatted string literals or “f-strings”. This new way of formatting strings lets you use embedded Python expressions inside string constants. Here’s a simple example to give you a feel for the feature:

>>> f'Hello, {name}!'
'Hello, Bob!'

As you can see, this prefixes the string constant with the letter “f”, hence the name “f-strings.” This new formatting syntax is powerful. Because you can embed arbitrary Python expressions, you can even do inline arithmetic with it. Check out this example:
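
A minimal sketch of inline arithmetic in an f-string (a and b are throwaway variables chosen for illustration):

```python
a = 5
b = 10

# Each {...} holds a full Python expression that is evaluated at runtime.
sentence = f'Five plus ten is {a + b} and not {2 * (a + b)}.'
print(sentence)  # Five plus ten is 15 and not 30.
```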

Formatted string literals are a Python parser feature that converts f-strings into a series of string constants and expressions. They then get joined up to build the final string.

Imagine you had the following greet() function that contains an f-string:

>>> def greet(name, question):
...     return f"Hello, {name}! How's it {question}?"
...
>>> greet('Bob', 'going')
"Hello, Bob! How's it going?"

When you disassemble the function and inspect what’s going on behind the scenes, you’ll see that the f-string in the function gets transformed into something similar to the following:
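
Roughly speaking, and assuming plain string concatenation as a stand-in, you can picture the f-string as this hand-written version. The name greet_concat is hypothetical, and the equivalence only holds for str arguments, since an f-string would also format non-string values:

```python
def greet(name, question):
    # The f-string version from above.
    return f"Hello, {name}! How's it {question}?"


def greet_concat(name, question):
    # Literal pieces and the evaluated expressions, joined in order.
    return "Hello, " + name + "! How's it " + question + "?"


print(greet_concat('Bob', 'going'))  # Hello, Bob! How's it going?
```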

The real implementation is slightly faster than that because it uses the BUILD_STRING opcode as an optimization. But functionally they’re the same:

>>> import dis
>>> dis.dis(greet)
  2           0 LOAD_CONST               1 ('Hello, ')
              2 LOAD_FAST                0 (name)
              4 FORMAT_VALUE             0
              6 LOAD_CONST               2 ("! How's it ")
              8 LOAD_FAST                1 (question)
             10 FORMAT_VALUE             0
             12 LOAD_CONST               3 ('?')
             14 BUILD_STRING             5
             16 RETURN_VALUE

Formatted string literals also support the existing format string syntax of the str.format() method. That allows you to solve the same formatting problems we’ve discussed in the previous two sections:
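
For instance, the error message from the earlier sections might be written like this (a sketch; the "#x" spec adds the "0x" prefix for us):

```python
name = 'Bob'
errno = 50159747054

# The part after the colon is the same format spec str.format() accepts.
message = f"Hey {name}, there's a {errno:#x} error!"
print(message)  # Hey Bob, there's a 0xbadc0ffee error!
```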

Python’s new formatted string literals are similar to JavaScript’s Template Literals added in ES2015. I think they’re quite a nice addition to Python, and I’ve already started using them in my day-to-day (Python 3) work. You can learn more about formatted string literals in our in-depth Python f-strings tutorial.

#4 Template Strings (Standard Library)

Here’s one more tool for string formatting in Python: template strings. It’s a simpler and less powerful mechanism, but in some cases this might be exactly what you’re looking for.

Let’s take a look at a simple greeting example:

>>> from string import Template
>>> t = Template('Hey, $name!')
>>> t.substitute(name=name)
'Hey, Bob!'

You see here that we need to import the Template class from Python’s built-in string module. Template strings are not a core language feature but they’re supplied by the string module in the standard library.

Another difference is that template strings don’t allow format specifiers. So in order to get the previous error string example to work, you’ll need to manually transform the int error number into a hex-string:
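
One way to do that, sketched with the built-in hex() function (templ_string and the $error placeholder are names I picked for the example):

```python
from string import Template

name = 'Bob'
errno = 50159747054

# Template has no format specs, so convert the int to hex text up front.
templ_string = 'Hey $name, there is a $error error!'
message = Template(templ_string).substitute(name=name, error=hex(errno))
print(message)  # Hey Bob, there is a 0xbadc0ffee error!
```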

That worked great.

So when should you use template strings in your Python programs? In my opinion, the best time to use template strings is when you’re handling formatted strings generated by users of your program. Due to their reduced complexity, template strings are a safer choice.

The more complex formatting mini-languages of the other string formatting techniques might introduce security vulnerabilities to your programs. For example, it’s possible for format strings to access arbitrary variables in your program.

That means, if a malicious user can supply a format string, they can potentially leak secret keys and other sensitive information! Here’s a simple proof of concept of how this attack might be used against your code:

>>> # This is our super secret key:
>>> SECRET = 'this-is-a-secret'

>>> class Error:
...     def __init__(self):
...         pass

>>> # A malicious user can craft a format string that
>>> # can read data from the global namespace:
>>> user_input = '{error.__init__.__globals__[SECRET]}'

>>> # This allows them to exfiltrate sensitive information,
>>> # like the secret key:
>>> err = Error()
>>> user_input.format(error=err)
'this-is-a-secret'

See how a hypothetical attacker was able to extract our secret string by accessing the __globals__ dictionary from a malicious format string? Scary, huh? Template strings close this attack vector. This makes them a safer choice if you’re handling format strings generated from user input:
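
Here is a sketch of the same attack attempted against a template string. The setup mirrors the proof of concept above; the outcome variable is only there to capture the error for inspection:

```python
from string import Template

SECRET = 'this-is-a-secret'


class Error:
    def __init__(self):
        pass


err = Error()

# The same payload, rewritten with Template's ${...} placeholder syntax.
user_input = '${error.__init__.__globals__[SECRET]}'

# Template placeholders must be plain identifiers, so attribute and item
# access is rejected with ValueError instead of being evaluated.
outcome = None
try:
    Template(user_input).substitute(error=err)
except ValueError as exc:
    outcome = str(exc)
print(outcome)
```

The secret never makes it into the output, because the placeholder is refused before any lookup on the err object happens.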

Which String Formatting Method Should You Use?

I totally get that having so much choice for how to format your strings in Python can feel very confusing. This is an excellent cue to bust out this handy flowchart infographic I’ve put together for you:

[Figure: Python String Formatting Flowchart]

This flowchart is based on the rule of thumb that I apply when I’m writing Python:

Python String Formatting Rule of Thumb: If your format strings are user-supplied, use Template Strings (#4) to avoid security issues. Otherwise, use Literal String Interpolation/f-Strings (#3) if you’re on Python 3.6+, and “New Style” str.format (#2) if you’re not.

Key Takeaways

  • Perhaps surprisingly, there’s more than one way to handle string formatting in Python.
  • Each method has its individual pros and cons. Your use case will influence which method you should use.
  • If you’re having trouble deciding which string formatting method to use, try our Python String Formatting Rule of Thumb.

Translated from: https://www.pybloggers.com/2018/07/python-string-formatting-tutorial/
