Strings in Python

This isn't the first time we have encountered strings since we started learning Python. Many of the previous tutorials used strings in examples or discussed them, so they shouldn't come as a surprise. Nonetheless, this chapter gives you more insight into how strings are used, manipulated and implemented in the Python world. We will also check out some handy functions for manipulating strings. So, without wasting time, let's jump right in.

What is a String?

A string can be defined as a sequence of characters, and that's the most basic explanation of a string that you can give. In this definition, we can see two important terms: sequence and characters. If you are here after finishing the last tutorial, we have already explained what the Sequence data type is and how strings are a type of sequence. Just for revision: in Python, a sequence is an ordered collection of elements, such as integers, floats, characters or strings, each of which can be accessed by its position.

Note: Every character has a unique code, defined by the Unicode standard. Unicode covers the characters of almost every language, and emoji too (yes, emoji are characters as well).

Hence, a string can be considered a special type of sequence in which all the elements are characters. For example, the string "Hello, World" is basically the sequence ['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'o', 'r', 'l', 'd'], and its length can be calculated by counting the number of characters in the sequence, which is 12.

Note: Yes, the space and the comma count too. Everything inside the quotes is a character, and each character is a string of length 1.

Many programming languages have a separate data type dedicated to single characters. Python has no character data type. Instead, a character is simply treated as a string of length 1.
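
To make the "string of length 1" idea concrete, here is a minimal sketch. The variable names (greeting, first, code_point) are only illustrative; len(), type(), ord() and chr() are Python built-ins.

```python
# A single character in Python is just a string of length 1.
greeting = "Hello, World"
first = greeting[0]

print(len(greeting))   # number of characters, counting the comma and the space
print(type(first))     # <class 'str'>: there is no separate character type
print(len(first))      # 1

# ord() gives the Unicode code point of a character; chr() reverses it.
code_point = ord("H")
print(code_point, chr(code_point))
```

Taking one element out of a string gives back another string, which is why no extra character type is needed.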

Declaration of Strings

A string is declared by putting text inside quotes and assigning it to a variable:

>>> mystring = "This is not my first String"
>>> print(mystring)
This is not my first String

You can access each individual character of a string too. Just like accessing each element of a sequence, we can use index numbers for this purpose. To access the first character of mystring, we can do the following:

>>> print(mystring[0])
T

Since T is the first character of our string This is not my first String, it has index number 0 (zero). Similarly, for further characters we can use index 1, 2, 3 and so on, i.e., in order to access the ith character we have to use index (i-1).

There is another trick for accessing the elements of a sequence from its end. For example, if you want to access the last element of the sequence, just do the following:

>>> print(mystring[-1])
g

Writing -1 as the index means that you are asking for the 1st element from the end. Similarly, to access the 2nd last element use -2 as the index, for the 3rd last use -3, and so on, i.e., for the ith element from the end use -i as the index. That settles the general rule for accessing each character of a string from both the front and the back. Note that a positive index (or 0) means you are accessing a character from the front, while a negative index means you are accessing it from the rear end.

We can summarize what we have covered so far in a simple table. Consider the string PYTHON. Each of its characters can be accessed in two ways: from the front, or from the rear end.

Characters        P    Y    T    H    O    N
Forward Index     0    1    2    3    4    5
Backward Index   -6   -5   -4   -3   -2   -1
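
The table can be checked with a short sketch. The names word, forward and backward are only illustrative; the loop relies on the fact that a backward index is the forward index minus the length of the string.

```python
word = "PYTHON"

# Walk the string once, pairing each character with both of its indexes.
for forward in range(len(word)):
    backward = forward - len(word)   # e.g. 0 -> -6, 5 -> -1
    # Both indexes point at the same character.
    assert word[forward] == word[backward]
    print(word[forward], forward, backward)

first_char = word[0]    # 'P'
last_char = word[-1]    # 'N'
```

Each printed row matches one column of the table above.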

Escape Sequence

Suppose you want a string to store a quote by Mahatma Gandhi.

"You must be the change you wish to see in the world" - Gandhi

This is the exact line you want to display in the console, including the quotes surrounding the sentence. As you go ahead and print the statement, you will see that it isn't that simple.

Python will instantly return a syntax error. This is because of the extra double quotes that we added. In IDLE you can notice that Gandhi's quoted text is shown in black, while "- Gandhi" is shown in green. If you have used IDLE enough, you may know that all the characters inside a string are highlighted in green (the colour may differ depending on the text editor, Python version, OS, etc.). This clearly means that Python isn't treating the You must be the change you wish to see in the world part of the sentence as a string. In other words, once we open a quote and close it to declare a string, whatever we write after the closing quote is read as ordinary Python code, not as part of the string.

In the quotation above, we started the string with two double quotes and wrote You must be the change you wish to see in the world right next to them. Since the second double quote had already closed the string before this phrase, Python tried to read the entire sentence as code and could not make sense of it. After the phrase another double quote opens, then comes - Gandhi, and finally the closing double quote. Since the - Gandhi part sits within a pair of double quotes, it is a perfectly legitimate string.
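
The failure can be reproduced without crashing a script. The sketch below assumes the statement had the shape described above (two double quotes, the phrase, a double quote, - Gandhi, a closing double quote). The helper is_valid_python is a hypothetical name; compile() is a Python built-in that parses code without running it.

```python
def is_valid_python(source):
    """Return True if `source` compiles as Python code, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# The statement described above: an empty string "", then bare words, then " - Gandhi".
broken = 'print(""You must be the change you wish to see in the world" - Gandhi")'
# A version with balanced quotes, for comparison.
fine = 'print("You must be the change you wish to see in the world - Gandhi")'

print(is_valid_python(broken))  # False
print(is_valid_python(fine))    # True
```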

Now you understand the problem we face when the double quotes are not properly paired. Let's see how we can actually have a quote inside a string. There are two ways to do so:

  1. The first one is a bit of a compromise. You can use single quotes inside double quotes, like:

    >>> print("'You must be the change you wish to see in the world' - Gandhi")
    'You must be the change you wish to see in the world' - Gandhi

    Hence, it's legitimate to use single quotes inside double quotes. The reverse works too: double quotes can be used inside a string delimited by single quotes, i.e.,

    >>> print('"You must be the change you wish to see in the world" - Gandhi')
    "You must be the change you wish to see in the world" - Gandhi

    prints the quote without any error. The only rule is that the quote character that opens the string must not appear unescaped inside it.

  2. The second one is for those who hate to compromise, or just want to use double quotes throughout. For you, there is something called an escape sequence, which starts with a backslash (\). You can use it like:

    >>> print("\"You must be the change you wish to see in the world\" - Gandhi")
    "You must be the change you wish to see in the world" - Gandhi

    Can you guess what happened? We used a backslash at two places, just before the quotes that we want to print. A backslash before a quote tells the interpreter to treat that quote as an ordinary character of the string instead of as the end of the string. Also remember that each quote needs its own backslash. For example, in order to print 5 double quotes, we have to use 5 backslashes, one before each quote, like this:

    >>> print("\"\"\"\"\"")
    """""
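
Both approaches can be compared side by side in a small sketch. The variable names are only illustrative. Python accepts either kind of quote inside the other, and an escaped quote produces exactly the same character as an unescaped one.

```python
# Option 1: mix the two kinds of quotes.
single_inside_double = "'You must be the change you wish to see in the world' - Gandhi"
double_inside_single = '"You must be the change you wish to see in the world" - Gandhi'

# Option 2: escape each double quote with a backslash.
escaped = "\"You must be the change you wish to see in the world\" - Gandhi"

print(escaped)
print(escaped == double_inside_single)  # True: both spell the same string

# One backslash per quote: five escaped quotes make a string of length 5.
five_quotes = "\"\"\"\"\""
print(five_quotes, len(five_quotes))
```

The backslashes exist only in the source code. They are not stored in the string, which is why the length of five_quotes is 5 and not 10.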

Input and Output for String

Input and output methods have already been discussed in detail in the Input and Output tutorial. We recommend going through that tutorial if you haven't already.

Operations on String

String handling in Python probably requires the least effort of all, since the common string operations are built into the language and take very little code compared to many other languages. Let's see how we can play around with strings.

  1. Concatenation: No, wait! What? This word may sound a bit complex to absolute beginners, but all it means is to join two strings. For example, joining "Hello" with "World" makes "HelloWorld". Yes, that's it.

    >>> print("Hello" + "World")
    HelloWorld

    Yes. A plus sign (+) is enough to do the trick. When used with strings, the + sign joins the two strings. Let's look at one more example:

    >>> s1 = "Name Python "
    >>> s2 = "had been adapted "
    >>> s3 = "from Monty Python"
    >>> print(s1 + s2 + s3)
    Name Python had been adapted from Monty Python

  2. Repetition: Suppose we want to write the same text multiple times on the console, like repeating "Hi!" 100 times. One option is to write it all out manually, like "Hi!Hi!Hi!..." a hundred times, or we can just do the following:

    >>> print("Hi!" * 100)

    Suppose you want the user to input some number n and, based on that, print a text on the console n times. How can you do it? It's simple. Use the input() function to get a number from the user, convert it to an integer with int() (input() always returns a string), store it in a variable n, and then multiply the text by n.

    >>> n = int(input("Number of times you want the text to repeat: "))
    Number of times you want the text to repeat: 5
    >>> print("Text" * n)
    TextTextTextTextText


  3. Check existence of a character or a sub-string in a string: The keyword in is used for this. For example, if there is a text India won the match and you want to check whether won exists in it, go to IDLE and try the following:

    >>> "won" in "India won the match"
    True

    Amongst the other datatypes in Python, there is the Boolean datatype, which can have one of two possible values: True or False. Since we are checking whether something exists in a string, the possible outcomes are either Yes, it exists or No, it doesn't, and therefore either True or False is returned. This should also give you an idea of where to use the Boolean datatype while writing programs.



  4. not in keyword: This is just the opposite of the in keyword. You're pretty smart if you guessed that right. It is used in much the same way as the in keyword:

    >>> "won" not in "India won the match"
    False
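
The four operations can be collected into one short script. This is a minimal sketch with illustrative variable names. The value typed stands in for what input() would return, so the example runs without waiting for a user.

```python
s1 = "Hello"
s2 = "World"

joined = s1 + s2            # concatenation
repeated = "Hi!" * 3        # repetition

# input() always returns a string, so a count typed by the user
# has to be converted with int() before it can be used with *.
typed = "5"                 # stands in for a value read with input()
repeated_text = "Text" * int(typed)

has_won = "won" in "India won the match"
lacks_won = "won" not in "India won the match"

print(joined, repeated, repeated_text, has_won, lacks_won)
```

Multiplying a string by another string, as in "Text" * "5", raises a TypeError, which is why the int() conversion matters.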

Converting String to Int or Float datatype and vice versa

This is a very common source of confusion amongst beginners: a number enclosed in quotes becomes a string in Python, and if you then try to perform mathematical operations on it, you will get an error.

numStr = '123'

In the statement above, 123 is not a number but a string.

Hence, in such a situation, to convert a numeric string into the float or int datatype, we can use the float() and int() functions.

numStr = '123'
numFloat = float(numStr)
numInt = int(numFloat)

After that, you can easily perform mathematical operations on the numeric value.

Similarly, to convert an int or float variable to a string, we can use the str() function.

num = 123
# so simple
numStr = str(num)
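
The round trip between strings and numbers can be sketched as follows. The variable names are only illustrative; int(), float() and str() are the built-ins discussed above. The last part shows, as an aside, that int() raises a ValueError for text that is not a whole number.

```python
num_str = "123"

num_int = int(num_str)        # 123
num_float = float(num_str)    # 123.0
back_to_str = str(num_int)    # "123"

# Arithmetic now works on the numeric values.
total = num_int + 1           # 124

# "+" on two strings concatenates instead of adding.
glued = num_str + "1"         # "1231"

# int() rejects text that is not a whole number.
try:
    int("12.5")
    parsed = True
except ValueError:
    parsed = False

print(num_int, num_float, back_to_str, total, glued, parsed)
```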

Slicing

Slicing is yet another string operation. Slicing lets you extract a part of any string based on a start index and an end index. For example, if we have the string This is Python tutorial and we want to extract a part of it, or just a single character, we can use slicing. First, let's get familiar with its syntax:

string_name[starting_index : finishing_index : character_iterate]
  • string_name is the name of the variable holding the string.

  • starting_index is the index of the first character that you want in your sub-string.

  • finishing_index is one more than the index of the last character that you want in your sub-string.

  • character_iterate is the step, i.e., how many positions to move forward after each character that is picked. To understand this, let us consider that we have the string Hello Brother! and we want to use the slicing operation on it to extract a sub-string. This is our code:

    >>> text = "Hello Brother!"
    >>> print(text[0:10:2])
    HloBo

    Here text[0:10:2] means that we want to extract a sub-string starting at index 0 (the beginning of the string) and stopping just before index 10, and the last parameter means that we want every second character, counting from the starting index. Hence the output is HloBo.

    H is at index 0. Skipping e, the character two positions after H is printed, which is l. Then, skipping the second l, the character two positions after the first l is printed, which is o, and so on.
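
To see exactly which positions a slice with a step picks, the same indexes can be spelled out with range(), which takes the same start, stop and step values. This is only a sketch; the names text, every_second and picked are illustrative.

```python
text = "Hello Brother!"

every_second = text[0:10:2]                  # indexes 0, 2, 4, 6, 8
picked = [text[i] for i in range(0, 10, 2)]  # the same indexes, spelled out

print(every_second)
print(picked)
```

Index 10 itself is never visited, which matches the rule that the finishing index is one more than the last index you want.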

A few more examples will make this clearer:

Let's take a string with 10 characters, ABCDEFGHIJ. The index numbers begin at 0 and end at 9.

A  B  C  D  E  F  G  H  I  J
0  1  2  3  4  5  6  7  8  9

Now try the following command:

>>> s = "ABCDEFGHIJ"
>>> print(s[0:5:1])

Here slicing is done from the 0th character to the 4th character (5-1), moving 1 character in each jump, so ABCDE is printed.

Now, remove the last number and the colon before it, and just write this:

>>> print(s[0:5])

You'll see that both outputs are the same, because the step defaults to 1 when it is left out.

You can practice by changing the values. Also try changing the value of character_iterate to some value n. It will then print every nth character from the starting index up to (but not including) the finishing index.
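
As a starting point for that practice, here is a small sketch of a few slice variations on the same 10-character string. The variable names are only illustrative. The shorthand forms rely on Python's defaults: a missing start means the beginning of the string, a missing end means the end of the string, and a missing step means 1.

```python
s = "ABCDEFGHIJ"

first_five = s[0:5:1]     # 'ABCDE'
same_thing = s[0:5]       # step defaults to 1
shorthand = s[:5]         # start defaults to 0
every_third = s[0:10:3]   # indexes 0, 3, 6, 9
from_end = s[-3:]         # negative start: the last three characters

print(first_five, same_thing, shorthand, every_third, from_end)
```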

Translated from: https://www.studytonight.com/python/string-in-python
