How to Use a Bash Script to Manage Downloading and Viewing Files From an AWS S3 Bucket

As you can read in this article, I recently had some trouble with my email server and decided to outsource email administration to Amazon's Simple Email Service (SES).

The problem with that solution was that I had SES save new messages to an S3 bucket, and using the AWS Management Console to read files within S3 buckets gets stale really fast.

So I decided to write a Bash script to automate the process of downloading, properly storing, and viewing new messages.

While I wrote this script for use on my Ubuntu Linux desktop, it wouldn't require too much fiddling to make it work on a macOS or Windows 10 system through the Windows Subsystem for Linux.

Here's the complete script all in one piece. After you take a few moments to look it over, I'll walk you through it one step at a time.

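(Every line here appears again, step by step, in the walkthrough below.)

#!/bin/bash
# Fetch new messages from S3, file them by date, and open them for reading.
aws s3 cp \
   --recursive \
   s3://bucket-name/ \
   /home/david/s3-emails/tmpemails/ \
   --profile myaccount

tmp_file_location=/home/david/s3-emails/tmpemails/*
base_location=/home/david/s3-emails/emails/

today=$(date +"%m_%d_%Y")
[[ -d ${base_location}/"$today" ]] || mkdir ${base_location}/"$today"

for FILE in $tmp_file_location
do
   mv $FILE ${base_location}/${today}/email$(rand)
done

for NEWFILE in ${base_location}/${today}/*
do
   gedit $NEWFILE
done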

We'll begin with the single command to download any messages currently residing in my S3 bucket (by the way, I've changed the names of the bucket and other filesystem and authentication details to protect my privacy).

aws s3 cp \
   --recursive \
   s3://bucket-name/ \
   /home/david/s3-emails/tmpemails/  \
   --profile myaccount

Of course, this will only work if you've already installed and configured the AWS CLI for your local system. Now's the time to do that if you haven't already.

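If you need to create a profile, the aws configure command will walk you through it interactively (the key values and region below are placeholders - substitute your own):

$ aws configure --profile myaccount
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Default region name [None]: us-east-1
Default output format [None]: json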

The cp command stands for "copy," --recursive tells the CLI to apply the operation even to multiple objects, s3://bucket-name points to my bucket (your bucket name will obviously be different), the /home/david... line is the absolute filesystem address to which I'd like the messages copied, and the --profile argument tells the CLI which of my multiple AWS accounts I'm referring to.

The next section sets two variables that will make it much easier for me to specify filesystem locations through the rest of the script.

tmp_file_location=/home/david/s3-emails/tmpemails/*
base_location=/home/david/s3-emails/emails/

Note how the value of the tmp_file_location variable ends with an asterisk. That's because I want to refer to the files within that directory, rather than the directory itself.

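You can watch the shell make that distinction itself with echo (the message names below are just hypothetical):

$ echo /home/david/s3-emails/tmpemails/
/home/david/s3-emails/tmpemails/
$ echo /home/david/s3-emails/tmpemails/*
/home/david/s3-emails/tmpemails/msg-one /home/david/s3-emails/tmpemails/msg-two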

I'll create a new permanent directory within the .../emails/ hierarchy to make it easier for me to find messages later. The name of this new directory will be the current date.

today=$(date +"%m_%d_%Y")
[[ -d ${base_location}/"$today" ]] || mkdir ${base_location}/"$today"

I first create a new shell variable named today that will be populated by the output of the date +"%m_%d_%Y" command. date itself outputs the full date/timestamp, but what follows ("%m_%d_%Y") edits that output to a simpler and more readable format.

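Here's roughly how the two compare at the command line (the default format - and, obviously, the timestamp - will vary with your system and locale):

$ date
Fri Feb 28 09:15:03 EST 2020
$ date +"%m_%d_%Y"
02_28_2020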

I then test for the existence of a directory using that name - which would indicate that I've already received emails on that day and, therefore, there's no need to recreate the directory. If such a directory does not exist, then the || ("or") operator tells mkdir to create it for me. If you don't run this test, your command could return annoying error messages.

Since Amazon SES gives ugly and unreadable names to each of the messages it drops into my S3 bucket, I'll now dynamically rename them while, at the same time, moving them over to their new home (in the dated directory I just created).

for FILE in $tmp_file_location
do
   mv $FILE ${base_location}/${today}/email$(rand)
done

The for...do...done loop will read each of the files in the directory represented by the $tmp_file_location variable and then move it to the directory I just created (represented by the $base_location variable in addition to the current value of $today).

As part of the same operation, I'll give it its new name, the string "email" followed by a random number generated by the rand command. You may need to install a random number generator: that'll be apt install rand on Ubuntu.

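(If you'd rather not install anything extra, Bash's built-in $RANDOM variable - which expands to an integer between 0 and 32767 - would do much the same job: email$RANDOM in place of email$(rand).)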

An earlier version of the script created names differentiated by shorter, sequential numbers that were incremented using a count=1...count=$((count+1)) logic within the for loop. That worked fine as long as I didn't happen to receive more than one batch of messages on the same day. If I did, then the new messages would overwrite older files in that day's directory.

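For the record, that older logic looked something like this (a sketch of the approach rather than the exact original):

count=1
for FILE in $tmp_file_location
do
   mv $FILE ${base_location}/${today}/email$count
   count=$((count+1))
done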

I guess it's mathematically possible that my rand command could assign overlapping numbers to two files but, given that the default range rand uses is between 1 and 32,576, that's a risk I'm willing to take.

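(To put a rough number on that risk: by the standard birthday-problem approximation, the chance of at least one collision among n files is about n(n-1)/2 divided by 32,576. Even on a day with six incoming messages, that works out to 15/32,576, or roughly 0.05%.)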

At this point, there should be files in the new directory with names like email3039, email25343, etc. for each of the new messages I was sent.

Running the tree command on my own system shows me that five messages were saved to my 02_27_2020 directory, and one more to 02_28_2020 (these files were generated using the older version of my script, so they're numbered sequentially).

There are currently no files in tmpemails - that's because the mv command moves files to their new location, leaving nothing behind.

$ tree
.
├── emails
│   ├── 02_27_2020
│   │   ├── email1
│   │   ├── email2
│   │   ├── email3
│   │   ├── email4
│   │   └── email5
│   └── 02_28_2020
│       └── email1
└── tmpemails

The final section of the script opens each new message in my favorite desktop text editor (Gedit). It uses a similar for...do...done loop, this time reading the names of each file in the new directory (referenced using the $today variable) and then opening the file in Gedit. Note the asterisk I added to the end of the directory location.

for NEWFILE in ${base_location}/${today}/*
do
   gedit $NEWFILE
done

There's still one more thing to do. If I don't clean out my S3 bucket, it'll download all the accumulated messages each time I run the script. That'll make it progressively harder to manage.

So, after successfully downloading my new messages, I run this short script to delete all the files in the bucket:

#!/bin/bash
# Delete all existing emails 

aws s3 rm --recursive s3://bucket-name/ --profile myaccount
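
If you want a sanity check first, aws s3 ls s3://bucket-name/ --profile myaccount will list everything currently in the bucket, so you can confirm exactly what's about to be deleted.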

Translated from: https://www.freecodecamp.org/news/bash-script-download-view-from-s3-bucket/
