Modifying Text Files in Perl: How to Parse Text Files with Perl

Its ability to parse text files is one of the reasons Perl makes a great data mining and scripting tool.

As you'll see below, Perl can be used to reformat a block of text. Compare the sample input near the top of the page with the output near the bottom: the code in between is what transforms the first into the second.

How to Parse Text Files

As an example, let's build a small program that opens a tab-separated data file and parses its columns into something we can use.

Say your boss hands you a file containing a list of names, emails, and phone numbers, and wants you to read the file and do something with the information, such as putting it into a database or printing it out as a nicely formatted report.

The file's columns are separated by the TAB character, and the file looks something like this:

 Larry larry@example.com 111-1111
 Curly curly@example.com 222-2222
 Moe moe@example.com 333-3333 

Here's the full listing we'll be working with:

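 #!/usr/bin/perl
 use strict;
 use warnings;
 # Open the tab-separated data file for reading; stop if it cannot be opened
 open(my $fh, '<', 'data.txt') or die "Cannot open data.txt: $!";
 while (<$fh>) {
     chomp;                                      # remove the trailing newline from $_
     my ($name, $email, $phone) = split /\t/;    # split $_ on the tab character
     print "Name: $name\n";
     print "Email: $email\n";
     print "Phone: $phone\n";
     print "---------\n";
 }
 close($fh);
 exit;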

Note: This borrows some code from the tutorial on how to read and write files in Perl.

The first thing the script does is open a file called data.txt (which should reside in the same directory as the Perl script). Then it reads the file line by line into Perl's default variable, $_. In this case, $_ is implied: the loop, chomp, and split all operate on it without naming it in the code.
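To make the implied $_ visible, here is a minimal sketch of the same read loop with a named variable instead. The sample data and the in-memory filehandle are assumptions made so the snippet runs without a data.txt on disk; they are not part of the original listing.

```perl
use strict;
use warnings;

# Hypothetical sample data; an in-memory filehandle stands in for data.txt.
my $data = "Larry\tlarry\@example.com\t111-1111\n"
         . "Curly\tcurly\@example.com\t222-2222\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @lines;
while (my $line = <$fh>) {    # like while (<$fh>), but each line goes into $line
    chomp $line;              # with an explicit variable, chomp must be told what to trim
    push @lines, $line;
}
close($fh);

print scalar(@lines), " lines read\n";    # prints "2 lines read"
```

Naming the variable costs a few extra characters but makes it harder to clobber the current line by accident when the loop body grows.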

After a line is read in, chomp removes the trailing newline from the end of it (chomp does not strip other whitespace). Then the split function breaks the line on the tab character, which is written as \t. To the left of the equals sign, you'll see that I'm assigning to a list of three variables, one for each column of the line.
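The chomp and split steps can be tried on a single line in isolation. This is a small sketch using one hard-coded sample line (an assumption for illustration) rather than a file:

```perl
use strict;
use warnings;

# One hypothetical record, as it would arrive from the file (note the trailing newline).
my $line = "Larry\tlarry\@example.com\t111-1111\n";

chomp $line;                                       # drops only the trailing "\n"
my ($name, $email, $phone) = split /\t/, $line;    # one variable per column

print "$name | $email | $phone\n";    # prints "Larry | larry@example.com | 111-1111"
```

One detail worth knowing: by default split discards trailing empty fields, so a line whose last column is blank yields fewer values. Passing a negative limit, as in split /\t/, $line, -1, keeps them.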

Finally, each variable split out of the line is printed separately, so you can see how to access each column's data individually.

The output of the script should look something like this:

 Name: Larry
 Email: larry@example.com
 Phone: 111-1111
 ---------
 Name: Curly
 Email: curly@example.com
 Phone: 222-2222
 ---------
 Name: Moe
 Email: moe@example.com
 Phone: 333-3333
 --------- 

Although this example only prints the data, the same information parsed from a TSV or CSV file could just as easily be stored in a full-fledged database.
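As a step toward either a database load or a formatted report, the parsed columns can first be collected into an array of hashes. The sketch below is one possible approach, not the original author's; the hard-coded rows, the hash keys, and the column widths are all assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for lines already read from data.txt and chomped.
my @rows = (
    "Larry\tlarry\@example.com\t111-1111",
    "Curly\tcurly\@example.com\t222-2222",
    "Moe\tmoe\@example.com\t333-3333",
);

# Build one hash per row so each column can be looked up by name.
my @records;
for my $row (@rows) {
    my ($name, $email, $phone) = split /\t/, $row;
    push @records, { name => $name, email => $email, phone => $phone };
}

# Print a simple aligned report; %-8s and %-20s left-justify within fixed widths.
for my $rec (@records) {
    printf "%-8s %-20s %s\n", $rec->{name}, $rec->{email}, $rec->{phone};
}
```

With the data in this shape, each record could later be handed to a database insert (for example through the DBI module, if it is installed) instead of printf.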

Translated from: https://www.thoughtco.com/parsing-text-files-2641088