Introduction to Globus
Globus is research cyberinfrastructure, developed and operated as a not-for-profit service by the University of Chicago.
With Globus, you can easily, reliably and securely move, share, & discover data no matter where it lives – from a supercomputer, lab cluster, tape archive, public cloud or laptop. Access and manage all your data, even protected data, from anywhere, using your existing identities, with just a web browser.
Website: https://www.globus.org/
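How to use it
The simplest way is to register and log in, after which you can download data directly from the web page.
For large datasets (such as DAS data), however, downloading files one by one through the web page is not practical. This post describes how to download large datasets to your own computer or server.
The whole process takes three steps:
- Register and log in;
- Turn your own computer into a Globus endpoint;
- Download data through the endpoint.
Registering and logging in is simple enough to do on your own!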
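Globus endpoint
The official documentation on turning your own computer or server into a Globus endpoint is comprehensive but rather long-winded. If you want to understand the details properly, do read it carefully!
Link: https://docs.globus.org/how-to/globus-connect-personal-linux/#globus-connect-personal-cli
Below are the steps I used myself:
- Download Globus Connect Personal to your server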
$ wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
- Extract and authenticate
$ tar xzf globusconnectpersonal-latest.tgz
$ cd globusconnectpersonal-x.y.z
$ ./globusconnectpersonal
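During this step a URL is printed. Open it in a browser and log in to your account, and you will be given an authorization code. Enter the code in the terminal, then set a name for your endpoint.
- Start and stop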
# run
$ ./globusconnectpersonal -start &
# stop
$ ./globusconnectpersonal -stop
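You can check the status at any time, whether the program is running or stopped.
Before downloading data, make sure globusconnectpersonal reports that it is connected.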
# status
$ ./globusconnectpersonal -status
Globus Online: connected
Transfer Status: idle
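If you script your downloads, a small check like the one below can confirm the connection before a transfer is submitted. This is only a sketch: the function name gcp_is_connected is my own, and it assumes the status output looks like the two lines shown above.

```shell
# Hypothetical helper: succeeds (exit code 0) when the status text on stdin
# contains a line such as "Globus Online:   connected".
gcp_is_connected() {
    grep -q '^Globus Online:[[:space:]]*connected'
}

# Example use, run from the globusconnectpersonal-x.y.z directory:
#   ./globusconnectpersonal -status | gcp_is_connected && echo "ready to transfer"
```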
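- Add a path
The default accessible path is your own home directory (/home/). Disk space there is usually limited, so we add our storage path. Suppose the storage path is /data/xiayiran/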
# open this file
vi ~/.globusonline/lta/config-paths
Add your own storage path:
~/,0,1
/data/xiayiran,0,1
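The same edit can be made without opening an editor. As far as I understand the Globus Connect Personal documentation, each line of config-paths has three comma-separated fields: the path, a sharing flag, and a read/write flag (1 means writable). The sketch below appends a writable entry only if it is not already present. The function name add_gcp_path is my own.

```shell
# Hypothetical helper: append "<path>,0,1" to a config-paths file unless that
# exact line is already there.
#   $1 = config file (normally ~/.globusonline/lta/config-paths)
#   $2 = directory to make accessible
add_gcp_path() {
    config_file=$1
    new_line="$2,0,1"
    if ! grep -qxF -- "$new_line" "$config_file" 2>/dev/null; then
        printf '%s\n' "$new_line" >> "$config_file"
    fi
}

# Example use:
#   add_gcp_path ~/.globusonline/lta/config-paths /data/xiayiran
```

After changing this file you will probably need to stop and start globusconnectpersonal again before the new path becomes visible.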
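Congratulations! You have completed step two!
Downloading data
The following guide on downloading data with Globus is comprehensive but rather long-winded. If you want to understand the details properly, do read it carefully!
Link: https://help.jasmin.ac.uk/article/4480-data-transfer-tools-globus-command-line-interface
- Install globus-cli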
$ pip install globus-cli
- Log in
$ globus login --no-local-server
Please authenticate with Globus here:
------------------------------------
https://auth.globus.org/.../    <-- this is the URL to open
------------------------------------
Enter the resulting Authorization Code here:
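During this step a URL is printed. Open it in a browser and log in to your account, and you will be given an authorization code. Enter the code in the terminal.
You will then see the following message, which means the login worked!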
You have successfully logged in to the Globus CLI!
You can check your primary identity with
globus whoami
For information on which of your identities are in session use
globus session show
Logout of the Globus CLI with
globus logout
- Find the name and owner of the data you want to download
$ globus endpoint search "PubDAS" --filter-owner-id 4c984b40-a0b2-4d9e-b132-b32735905e23@clients.auth.globus.org
ID | Owner | Display Name
------------------------------------ | ------------------------------------------------------------ | -------------
706e304c-5def-11ec-9b5c-f9dfb1abb183 | 4c984b40-a0b2-4d9e-b132-b32735905e23@clients.auth.globus.org | PubDAS
1013e4a6-5df1-11ec-bded-55fe55c2cfea | 4c984b40-a0b2-4d9e-b132-b32735905e23@clients.auth.globus.org | PubDAS-upload
I want to download the PubDAS data, so I record its ID.
export ep1=706e304c-5def-11ec-9b5c-f9dfb1abb183
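Instead of copying the ID by hand, the search table can be parsed in a script. The following is only a sketch: the function name endpoint_id_by_name is my own, and it assumes the table layout shown above (a header line, a separator line, then rows whose columns are separated by "|").

```shell
# Hypothetical helper: read a "globus endpoint search" table on stdin and
# print the ID of the row whose Display Name equals $1 exactly.
endpoint_id_by_name() {
    awk -F'|' -v name="$1" '
        NR > 2 {                              # skip header and separator lines
            id = $1; disp = $3
            gsub(/^[ \t]+|[ \t]+$/, "", id)   # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", disp)
            if (disp == name) { print id; exit }
        }'
}

# Example use:
#   export ep1=$(globus endpoint search "PubDAS" \
#       --filter-owner-id 4c984b40-a0b2-4d9e-b132-b32735905e23@clients.auth.globus.org \
#       | endpoint_id_by_name "PubDAS")
```

Matching the display name exactly matters here, because the search also returns PubDAS-upload.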
List the data under that endpoint.
$ globus ls $ep1:
DAS-Month-02.2023/
FORESEE/
FOSSA/
Fairbanks/
LaFargeConcoMine/
Stanford-1-Campus/
Stanford-2-Sandhill-Road/
Stanford-3-ODH4/
Valencia/
License.txt
- Get the ID of your own Globus endpoint.
$ globus endpoint search "YourName(STEP2)" --filter-owner-id yourname(step1)@globusid.org
ID | Owner | Display Name
------------------------------------ | ---------------------------- | --------------------------
ddb59aef-6d04-11e5-ba46-22000b92c6ec | yourname(step1)@globusid.org | YourName(STEP2)
Store the ID in an environment variable so it is easy to use below.
export ep2=ddb59aef-6d04-11e5-ba46-22000b92c6ec
- Download data
# here is the default path (your home path)
$ globus transfer $ep1:License.txt $ep2:/~/License.txt
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: dfb36cd8-7d39-11ec-891f-939ceb6dfaf1
If you added your own storage path in step two:
# here is the storage path added in step two
$ globus transfer $ep1:License.txt $ep2:/data/xiayiran/License.txt
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: dfb36cd8-7d39-11ec-891f-939ceb6dfaf1
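The examples above copy a single file, while real datasets are whole directories. A minimal sketch of a directory download follows, using the --recursive option of globus transfer. The function name download_dir and its argument order are my own, and the directory in the usage comment is just one of the entries from the listing above.

```shell
# Hypothetical helper: submit a recursive transfer of one directory.
#   $1 = source endpoint ID        $2 = directory on the source endpoint
#   $3 = destination endpoint ID   $4 = directory on the destination endpoint
download_dir() {
    src_ep=$1
    src_dir=$2
    dst_ep=$3
    dst_dir=$4
    globus transfer --recursive "${src_ep}:${src_dir}" "${dst_ep}:${dst_dir}"
}

# Example use (assumes ep1 and ep2 are exported as shown earlier):
#   download_dir "$ep1" FORESEE/ "$ep2" /data/xiayiran/FORESEE/
```

The transfer runs on the Globus side after it is accepted, so the command returns immediately with a Task ID just like the single-file case.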
You can use the Task ID above to check the status of the file transfer!
$ globus task show dfb36cd8-7d39-11ec-891f-939ceb6dfaf1
Label: None
Task ID: dfb36cd8-7d39-11ec-891f-939ceb6dfaf1
Is Paused: False
Type: TRANSFER
Directories: 0
Files: 1
Status: SUCCEEDED
Request Time: 2022-01-24T17:20:07+00:00
Faults: 0
Total Subtasks: 2
Subtasks Succeeded: 2
Subtasks Pending: 0
Subtasks Retrying: 0
Subtasks Failed: 0
Subtasks Canceled: 0
Subtasks Expired: 0
Subtasks with Skipped Errors: 0
Completion Time: 2022-01-24T17:20:08+00:00
Source Endpoint: ESnet Read-Only Test DTN at Starlight
Source Endpoint ID: 57218f41-3200-11e8-b907-0ac6873fc732
Destination Endpoint: Globus Tutorial Endpoint 1
Destination Endpoint ID: ddb59aef-6d04-11e5-ba46-22000b92c6ec
Bytes Transferred: 1000000
Bytes Per Second: 587058
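For scripts, it is usually enough to pull the Status field out of this report. The sketch below assumes the "Field: value" layout shown above. The function name task_status is my own.

```shell
# Hypothetical helper: read "globus task show" output on stdin and print the
# value of the "Status" field (for example SUCCEEDED).
task_status() {
    awk -F':[ \t]*' '$1 == "Status" { print $2; exit }'
}

# Example use:
#   globus task show dfb36cd8-7d39-11ec-891f-939ceb6dfaf1 | task_status
```

The CLI also provides globus task wait, which blocks until a task finishes. That may be simpler than polling if you only need to know when the download is done.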
Congratulations! Finished!
Closing remarks
Globus hosts a lot of useful data, and everyone is welcome to share their own data there too!