Is there a way to delete files older than 10 days on HDFS?
In Linux I would use:
find /path/to/directory/ -type f -mtime +10 -name '*.txt' -execdir rm -- {} \;
Is there a way to do this on HDFS? The deletion should be based on the file creation date.
Solution
Solution 1: Using multiple commands as answered by daemon12
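# MIN is the age threshold in minutes (14400 min = 10 days); LAST is the same threshold in seconds.
# close(cmd) releases each date pipe so that long listings do not exhaust open file descriptors.
hdfs dfs -ls /file/Path | tr -s " " | cut -d' ' -f6-8 | grep "^[0-9]" | awk 'BEGIN{ MIN=14400; LAST=60*MIN; "date +%s" | getline NOW } { cmd="date -d'\''"$1" "$2"'\'' +%s"; cmd | getline WHEN; close(cmd); DIFF=NOW-WHEN; if(DIFF > LAST){ print "Deleting: "$3; system("hdfs dfs -rm -r "$3) }}'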
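The one-liner above is hard to read and deletes as it goes, so here is a rough sketch of the same idea split into a filter step and a delete step. It is not part of the original answer. It assumes GNU date (for `date -d`) and the usual eight-column `hdfs dfs -ls` layout; the function name `list_older_than` is made up for illustration. Note that the listing shows the modification time, not a creation time, and that this sketch skips directories to mirror `-type f` in the question.

```shell
# Sketch only: print paths from `hdfs dfs -ls` output that are older than N days.
# Assumed input columns: perms repl owner group size date time path

# list_older_than DAYS
# Reads `hdfs dfs -ls` lines on stdin and prints the path of each regular file
# whose listed (modification) time is more than DAYS days in the past.
# The body runs in a subshell so its variables do not leak into the caller.
list_older_than() (
  days="$1"
  now=$(date +%s)
  cutoff=$(( now - days * 86400 ))
  while read -r perms repl owner group size day time path; do
    # Keep regular files only; this skips the "Found N items" header and directories.
    case "$perms" in
      -*) ;;
      *) continue ;;
    esac
    # Convert "YYYY-MM-DD HH:MM" to epoch seconds (GNU date); skip unparsable lines.
    when=$(date -d "$day $time" +%s 2>/dev/null) || continue
    if [ "$when" -lt "$cutoff" ]; then
      printf '%s\n' "$path"
    fi
  done
)

# Example use (run the first two stages alone to review the list before deleting):
#   hdfs dfs -ls /file/Path | list_older_than 10 | while IFS= read -r p; do
#     hdfs dfs -rm "$p"
#   done
```

Keeping the filter separate from the delete makes it possible to inspect the candidate list first, which seems prudent for anything that removes data.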
Solution 2: