Recursively uploading/downloading a folder

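The training in [1] needs to switch to the full ScanNetV2 (all frames, but only two file types are downloaded: .sens and _2d-instance.zip; see the arguments in [2]). Part of the data was downloaded on each of two servers, and the two servers now need to exchange files. To make hacking easier (file filtering, multiprocessing) I wrote a few scripts myself (the illusion: writing them myself would be faster).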

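Three scripts:

  • export-dir.sh: exports the structure of a folder (written to a .sh that rebuilds the directory tree on the other machine, in preparation for uploading/downloading) and the list of files inside it (written to a .txt, in preparation for downloading). FILE_TYPE can be set inside the script to restrict which file types are listed; the default is all.
  • send-folder.sh: for uploading. After the tree has been built on the target machine with the rebuild script exported by export-dir.sh, it recursively sends the files in the folder. FILE_TYPE can be set here too.
  • fetch-folder.sh: for downloading. Rebuild the tree locally with the script exported by export-dir.sh, then download. Because FILE_TYPE is awkward to specify on the downloading side, it is set in export-dir.sh and the download follows the exported .txt. The file list can also be omitted, in which case everything is downloaded, but files that were already downloaded are fetched again, so specifying the list is better.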

Note that some places in the scripts must use single quotes ('); see [3].
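
The quoting rule matters most where send-folder.sh and fetch-folder.sh write a helper script with echo: text inside single quotes is written out literally and expanded only when the helper runs, while unquoted text is expanded immediately by the generating script. A minimal sketch of the difference, where the function name `write_helper` and the variable `PORT_DEMO` are made up for this example:

```shell
#!/bin/bash
# Sketch: delayed vs. immediate expansion when one script writes another.
# `write_helper` and `PORT_DEMO` are hypothetical names for this example.

PORT_DEMO=2222

# $1: path of the helper script to generate
write_helper()
{
	# $PORT_DEMO is outside the single quotes -> expanded NOW (value baked in).
	# '$1' is inside single quotes -> written literally, expanded LATER
	# as the helper's own first argument.
	echo 'echo port:' $PORT_DEMO 'file: $1' > "$1"
}

helper=$(mktemp)
write_helper "$helper"
cat "$helper"              # the generated line still contains a literal $1
bash "$helper" a.sens      # the helper expands $1 only now
rm -f "$helper"
```

With double quotes around the whole line, `$1` would instead be replaced by the generating script's own first argument at generation time.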

Scripts

export-dir.sh

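  • $1 specifies the folder to export, $2 the (path and) name of the shell script that rebuilds the directory tree, and $3 the file list;
  • FILE_TYPE must be set inside the file; it is an array.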
#!/bin/bash

# Export the tree structure & file list of a folder to
#   a shell script & text file respectively for building
#   a same folder tree elsewhere & download the files.
## Arguments
# $1: path of folder to export
# $2: (optional) output script name
# $3: (optional) output file list text file name
## Inner Parameters
# FILE_TYPE: indicate what types of file to list

# FILE_TYPE=(scene*_*.sens scene*_*_2d-instance.zip)
FILE_TYPE=(' ')  # all

SRC=${1-"$HOME/data/ScanNet"}  # `~` is not expanded inside quotes
if [ ! -d "$SRC" ]; then
	echo "* NO SUCH FOLDER: $SRC"  # quoted, or `*` expands to file names
	exit 1
fi
cd $SRC
SRC=`pwd`
cd - # back to last path

OUT_SHELL=${2-"copy-dir_`basename $SRC`.sh"}
cd `dirname $OUT_SHELL`
OUT_SHELL=`pwd`/`basename $OUT_SHELL`
cd -

OUT_TEXT=${3-"file-list_`basename $SRC`.txt"}
cd `dirname $OUT_TEXT`
OUT_TEXT=`pwd`/`basename $OUT_TEXT`
cd -

touch $OUT_SHELL  # permission test
if [ $? -ne 0 ]; then exit; fi
touch $OUT_TEXT   # permission test
if [ $? -ne 0 ]; then exit; fi
printf "\n\t%s\n\n" "$SRC  -->	$OUT_SHELL, $OUT_TEXT"

cd $SRC/..  # `/..` = `/`, no problem
src=`basename $SRC`

dfs()
{
	# print folder structure
	if [ $2 -gt 1 ]; then
		printf "|  %.0s" $(seq 2 $2)
	fi
	if [ $2 -gt 0 ]; then
		printf "|- "
	fi
	echo $1/

	echo "if [ ! -d $1 ]; then mkdir $1; fi" >> $OUT_SHELL
	cd $1
	echo "cd $1" >> $OUT_SHELL
	# src=$src/$1
	for ft in "${FILE_TYPE[@]}"; do  # enclosed by quotes
		# echo file type: $ft
		for f in `ls $ft 2>/dev/null`; do
			if [ -f $f ]; then	# is file
				echo $src/$f >> $OUT_TEXT
			fi
		done
	done
	for d in `ls -d */ 2>/dev/null`; do
		d=`basename $d`
		src=$src/$d
		dfs $d `expr 1 + $2`
		src=`dirname $src`
	done
	cd ..
	echo "cd .." >> $OUT_SHELL
	# src=`dirname $src`
}

if [ -f $OUT_TEXT ]; then
	# echo backing-up old file list: $OUT_TEXT
	# if [ -f $OUT_TEXT.bak ]; then
	# 	echo removing old backup: $OUT_TEXT.bak
	# 	rm $OUT_TEXT.bak
	# fi
	# mv $OUT_TEXT $OUT_TEXT.bak
	rm $OUT_TEXT
fi
echo '#!/bin/bash' > $OUT_SHELL  # single '>'
dfs $src 0
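
For comparison, the same two outputs can be produced without a hand-written recursion. The following is only a sketch of an alternative, not the script above: it assumes `find` and `sort` are available, and the function name `export_tree` and its argument order are made up for this example. It emits one `mkdir -p` line per directory and a file list filtered by a single name pattern.

```shell
#!/bin/bash
# Sketch of an alternative to the dfs() above using `find`.
# `export_tree` is a hypothetical helper, not part of export-dir.sh.
# $1: folder to export, $2: output rebuild script, $3: output file list,
# $4: (optional) file name pattern, default '*' (all files)
export_tree()
{
	local src parent name pattern
	src=$(cd "$1" && pwd) || return 1
	parent=$(dirname "$src")
	name=$(basename "$src")
	pattern=${4-*}

	echo '#!/bin/bash' > "$2"
	# one `mkdir -p` per directory, paths relative to the parent of $src
	(cd "$parent" && find "$name" -type d | sort) |
		while IFS= read -r d; do
			printf 'mkdir -p %q\n' "$d"
		done >> "$2"
	# regular files matching the pattern, one per line
	(cd "$parent" && find "$name" -type f -name "$pattern" | sort) > "$3"
}

# demo on a throw-away tree
demo=$(mktemp -d)
mkdir -p "$demo/ScanNet/scans/scene0000_00"
touch "$demo/ScanNet/scans/scene0000_00/scene0000_00.sens" \
      "$demo/ScanNet/scans/scene0000_00/scene0000_00.txt"
export_tree "$demo/ScanNet" "$demo/rebuild.sh" "$demo/list.txt" '*.sens'
cat "$demo/rebuild.sh" "$demo/list.txt"
rm -rf "$demo"
```

The file list uses the same layout as the .txt written by export-dir.sh: paths relative to the parent of the exported folder, one per line.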

send-folder.sh

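  • $1 specifies the folder to upload. IP, PORT, USER, DEST_ROOT (where the folder is placed on the destination; it is the parent path and does not include the name of the folder being sent. Build the tree on the target machine beforehand with the rebuild script exported by export-dir.sh, because plain scp without -r does not create directories) and N_PROCESSES are set inside the script;
  • A directory is created under ~/.cache/, mainly to record which files were sent successfully so that they are skipped when resuming. If the same folder is later sent to a different machine, delete this temporary directory first, otherwise everything is skipped blindly;
  • With multiple processes the transfer stalls partway through, and I do not know why…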
#!/bin/bash

# Send (specified types of files within) a folder recursively to another machine.
## Prerequisites
# SSH password-free login in destination machine should be configured in advance.
# `scp` is used for sending.
## Arguments
# $1: path of folder to send

IP=1.2.3.4
PORT=22
USER=itom
# $DEST_ROOT: path / parent folder where the folder to send
#   will be placed in the destination machine,
#   i.e. <folder-to-send> -> $DEST_ROOT/<folder-to-send>
DEST_ROOT=/home/itom
N_PROCESSES=1
# FILE_TYPE=(scene*_*.sens scene*_*_2d-instance.zip)
FILE_TYPE=(' ')  # all

SRC=${1-"$HOME/data/ScanNet"}  # `~` is not expanded inside quotes
if [ ! -d "$SRC" ]; then
	echo "* NO SUCH FOLDER: $SRC"  # quoted, or `*` expands to file names
	exit 1
fi
cd $SRC
SRC=`pwd` # full path -> begins with '/'

# temporary files
TMP_P=~/.cache/itom-send$SRC # NO '/' cuz $SRC begins with one
SENT_LOG=${TMP_P}/sent.txt
if [ ! -d $TMP_P ]; then mkdir -p $TMP_P; fi
touch $SENT_LOG
gather_logs()
{
	# gather (potential) ungathered sent logs
	for log_f in `ls ${TMP_P}/sent-*.txt 2>/dev/null`; do
		echo gather: $log_f
		cat ${log_f} >> $SENT_LOG
		rm ${log_f}
	done
}
# create auxiliary sending script
#	$1: full path of source file
#	$2: full path of destination file
#	$3: process ID
echo 'scp -P' $PORT '$1' $USER@$IP':$2 2>/dev/null' > ${TMP_P}/_send.sh # single '>'
echo 'if [ $? -eq 0 ]; then' >> ${TMP_P}/_send.sh
echo '	pid=${3-"0"}' >> ${TMP_P}/_send.sh
echo '	echo $1 >' ${TMP_P}'/sent-$pid.txt' >> ${TMP_P}/_send.sh
echo 'fi' >> ${TMP_P}/_send.sh

dest=${DEST_ROOT}
process_id=0

send()
{
	# -F: fixed string, -x: whole line, so similar paths do not match
	if grep -Fxq -- "$1" "${SENT_LOG}"; then  # already sent
		echo skip: $1
	else
		dest_f=$dest/`basename $1`
		echo $1 -\> $dest_f

		if [ $N_PROCESSES -lt 2 ]; then
			# log only if scp succeeded, so failed files are retried
			scp -P $PORT $1 $USER@$IP:$dest_f && echo $1 >> $SENT_LOG
		else
			bash ${TMP_P}/_send.sh $1 $dest_f $process_id &
			process_id=`expr 1 + $process_id`
			if [ $process_id -ge $N_PROCESSES ]; then
				wait
				process_id=0 # reset
				gather_logs
			fi
		fi
	fi
}

dfs()
{
	cd $1
	dest=$dest/$1 # remote enter
	for ft in "${FILE_TYPE[@]}"; do  # enclosed by quotes
		# echo file type: $ft
		for src_f in `ls $ft 2>/dev/null`; do
			if [ -f $src_f ]; then	# is file
				# echo $src_f
				send `pwd`/`basename $src_f`  # use full path
			fi
		done
	done
	for d in `ls -d */ 2>/dev/null`; do
		dfs `basename $d`
	done
	cd ..
	dest=`dirname $dest` # remote exit
}

gather_logs
cd $SRC/..  # `/..` = `/`, no problem
src=`basename $SRC`
dfs $src
if [ $N_PROCESSES -gt 1 ]; then
	wait  # the last batch may still be running
	gather_logs
fi

clear
echo Finish sending $SRC
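
The resume logic and the batching can be tried out locally. The sketch below is not the script above: `cp` stands in for `scp`, and the names `already_sent`, `gather_parts` and `send_all` are made up for this example. It shows the exact-match check against the sent log (a plain `grep "$1"` would also match a logged path that merely contains the new one), per-job logs merged after each batch, and a final `wait` for the last, partial batch.

```shell
#!/bin/bash
# Sketch: skip-if-logged plus batched background jobs, with `cp` standing in
# for `scp` so it runs locally. All function names here are hypothetical.

# $1: sent log, $2: file path -> succeeds if the exact path is already logged
already_sent()
{
	# -F: fixed string (no regex), -x: whole line, -q: quiet
	grep -Fxq -- "$2" "$1"
}

# $1: sent log -> merge per-job logs "<log>.<slot>" into the main log
gather_parts()
{
	local part
	for part in "$1".*; do
		if [ -f "$part" ]; then
			cat "$part" >> "$1"
			rm "$part"
		fi
	done
}

# $1: destination dir, $2: sent log, $3: max parallel jobs, $4...: files
send_all()
{
	local dest=$1 log=$2 n=$3 slot=0 f
	shift 3
	touch "$log"
	gather_parts "$log"       # leftovers from an interrupted run
	for f in "$@"; do
		if already_sent "$log" "$f"; then
			echo "skip: $f"
			continue
		fi
		# each job writes its own log file, and only on success
		( cp "$f" "$dest/" && echo "$f" > "$log.$slot" ) &
		slot=$((slot + 1))
		if [ "$slot" -ge "$n" ]; then
			wait              # finish the current batch
			slot=0
			gather_parts "$log"
		fi
	done
	wait                      # the last batch may be partial
	gather_parts "$log"
}

work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
touch "$work/src/a.sens" "$work/src/a.sens.bak" "$work/src/b.sens"
send_all "$work/dst" "$work/sent.txt" 2 "$work"/src/*
send_all "$work/dst" "$work/sent.txt" 2 "$work"/src/*   # second run only skips
rm -rf "$work"
```

As for the stall with several processes: one possible cause, which is a guess and not verified here, is a background scp waiting for interactive input. Passing `-o BatchMode=yes` to scp makes ssh fail instead of prompting, which would at least turn a hang into an error.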

fetch-folder.sh

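  • $1 specifies the folder to download (may include a path; as before, the tree must already be built on the target machine, here the local one, with the rebuild script exported by export-dir.sh). IP and the other settings are again set inside the script; SRC_ROOT is the path on the source machine where that folder lies, i.e. its parent path, without the name of the folder being downloaded;
  • $2 specifies the file list, as produced by export-dir.sh;
  • A file being downloaded carries a .tmp suffix and is renamed without the suffix only once the download completes, so the existence of a file means its download finished (idea borrowed from [2]);
  • Existence is checked both before downloading and again before renaming; the second check covers the case where another download program is running at the same time.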
#!/bin/bash

# Fetch (specified types of files within) a remote folder recursively to local.
## Prerequisite
# Run `export-dir.sh` on source machine to get the tree structure (.sh) & file
#   list (.txt) of that source folder, and clone the folder tree at local in advance.
# Passwordless SSH login to the source machine should be configured in advance.
# `scp` is used for fetching.
## Arguments
# $1: path of (cloned) local destination folder to fetch
# $2: (optional) path to file list file (1 file per line)
#     If not provided, fetch all types of file with NO existence check,
#     multiprocessing & temporary file support.

IP=1.2.3.4
PORT=22
USER=itom
# $SRC_ROOT: path / parent folder where the remote folder to fetch lies
#   i.e. $SRC_ROOT/<folder-to-fetch> -> <folder-to-fetch>
SRC_ROOT=/home/itom/codes
N_PROCESSES=1

DEST=${1-"$HOME/codes/ScanNet"}  # `~` is not expanded inside quotes
if [ ! -d "$DEST" ]; then
	echo "* NO SUCH FOLDER: $DEST"  # quoted, or `*` expands to file names
	echo Hint: use \`export-dir.sh\` to clone folder structure first
	exit 1
fi
cd $DEST
DEST=`pwd`
cd -

if [ -n "$2" ]; then
	cd `dirname $2`
	FILE_LIST=`pwd`/`basename $2`  # use full path
	cd -
else
	unset FILE_LIST
fi

# temporary files
TMP_P=~/.cache/itom-fetch
if [ ! -d $TMP_P ]; then mkdir -p $TMP_P; fi
# create auxiliary fetching script
#	$1: full path of source file
#	$2: full path of destination file
echo 'if [ -f $2 ]; then' > ${TMP_P}/_fetch.sh # single '>'
echo '  echo skip: $2' >> ${TMP_P}/_fetch.sh
echo 'else' >> ${TMP_P}/_fetch.sh
echo '  echo fetch:' $USER@$IP:'$1 -\> $2' >> ${TMP_P}/_fetch.sh
echo '  scp -P' $PORT $USER@$IP:'$1 $2.tmp 2>/dev/null' >> ${TMP_P}/_fetch.sh
echo '  if [ $? -eq 0 -a ! -f $2 ]; then' >> ${TMP_P}/_fetch.sh
echo '	  mv $2.tmp $2' >> ${TMP_P}/_fetch.sh
echo '  elif [ -f $2.tmp ]; then' >> ${TMP_P}/_fetch.sh
echo '	  rm $2.tmp' >> ${TMP_P}/_fetch.sh
echo '  fi' >> ${TMP_P}/_fetch.sh
echo 'fi' >> ${TMP_P}/_fetch.sh

src=${SRC_ROOT}
process_id=0

dfs()
{
	cd $1
	src=$src/$1 # remote enter
	scp -P $PORT $USER@$IP:$src/* . 2> /dev/null
	for d in `ls -d */ 2>/dev/null`; do
		dfs `basename $d`
	done
	cd ..
	src=`dirname $src` # remote exit
}

loop_file()
{
	echo $1, $2
	for f in `cat $2`; do
		# f=$1/$f
		if [ $N_PROCESSES -gt 1 ]; then
			bash ${TMP_P}/_fetch.sh ${SRC_ROOT}/$f $1/$f &
			process_id=`expr 1 + $process_id`
			if [ $process_id -ge $N_PROCESSES ]; then
				wait
				process_id=0
			fi
		elif [ -f $1/$f ]; then
			echo skip: $1/$f
		else
			echo fetch: $USER@$IP:${SRC_ROOT}/$f -\> $1/$f
			scp -P $PORT $USER@$IP:${SRC_ROOT}/$f $1/$f.tmp
			# double-check of existence in case there is
			#   any other parallel download program
			if [ $? -eq 0 -a ! -f $1/$f ]; then
				mv $1/$f.tmp $1/$f
			elif [ -f $1/$f.tmp ]; then
				rm $1/$f.tmp
			fi
		fi
	done
	wait  # let the last batch of background fetches finish
}

cd $DEST/..
if [ -z "$2" ]; then
	dest=`basename $DEST`
	dfs $dest
else
	dest_root=`pwd`
	loop_file ${dest_root} ${FILE_LIST}
fi

clear
echo Finish fetching $DEST
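
The ".tmp, then rename" idea can be tried in isolation. A minimal local sketch, assuming `cp` as a stand-in for `scp`; the function name `fetch_one` is made up for this example and mirrors the generated _fetch.sh only in spirit:

```shell
#!/bin/bash
# Sketch of the "download to .tmp, rename when complete" pattern.
# `fetch_one` is a hypothetical helper; `cp` stands in for `scp`.
# $1: source file, $2: destination file
fetch_one()
{
	if [ -f "$2" ]; then
		echo "skip: $2"
		return 0
	fi
	# a partial download only ever exists under the .tmp name
	if cp "$1" "$2.tmp" 2>/dev/null && [ ! -f "$2" ]; then
		mv "$2.tmp" "$2"
	else
		rm -f "$2.tmp"    # failed, or another process finished first
		[ -f "$2" ]       # success only if the final file exists
	fi
}

work=$(mktemp -d)
echo data > "$work/remote.sens"
fetch_one "$work/remote.sens" "$work/local.sens"     # fetches
fetch_one "$work/remote.sens" "$work/local.sens"     # prints skip
fetch_one "$work/missing.sens" "$work/other.sens" || echo "failed, no leftovers"
ls "$work"
rm -rf "$work"
```

Because the rename happens on the local file system after the copy has finished, an interrupted run leaves at most a .tmp file behind and never a truncated file under the final name.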

dos batch

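On Windows, a bat file can be used to upload recursively (again export with export-dir.sh from above and create the directory tree on the server first). For passwordless pscp see [4]; for how to write recursion in a bat file see [5]. It can be written as a standalone dfs.bat that takes command line arguments: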

@echo off
setlocal enabledelayedexpansion

@REM PuTTY install path (better: set an environment variable)
set PUTTY=%LOCALAPPDATA%\Programs\PuTTY
@REM PuTTY private key for passwordless login
set KEY=%USERPROFILE%\.ssh\putty-pri.ppk
@REM 4 command line arguments
set IP=%1
set USER=%2
set REMOTE=%3
set LOCAL=%4

echo %IP%, %USER%, %REMOTE%, %LOCAL%
call :dfs %LOCAL% %REMOTE%
goto :eof

:dfs
setlocal
	cd /d %1
	@REM upload files
	for %%f in (*) do (
		%PUTTY%\pscp -i %KEY% "%%f" %USER%@%IP%:%2
	)
	@REM recurse into subdirectories
	for /d %%d in (*) do (
		call :dfs %%d %2/%%d
	)
	cd ..
endlocal
goto :eof

For calling a bat file, see [7]. On Windows, invoke it from cmd:

call dfs.bat 1.2.3.4 itom /home/itom/data %USERPROFILE%\Downloads\data

References

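  1. Training MMDetection on ScanNet
  2. Downloading files of the ScanNet dataset
  3. Delayed parsing/substitution of shell parameters
  4. Passwordless data transfer with pscp
  5. Recursively traversing directories in DOS to delete logs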
  6. Iterate all files in a directory using a ‘for’ loop
  7. How to run multiple .BAT files within a .BAT file