DolphinScheduler 2.0.6 resource center rework, option 1: download files via SFTP


Background

Our scheduling workloads never touch the Hadoop ecosystem, yet using the resource center requires standing up Hadoop or an AWS-compatible environment, which costs time, effort, and infrastructure. We therefore rework it: files are uploaded to a single server, and any other server (worker) that needs a resource downloads it to local disk via SFTP.

Current state


  • Releases from 3.0 onward document this in detail. Judging by the latest release, 3.1.4, the resource center configuration has been extended: besides HDFS and AWS S3, it now also supports Alibaba Cloud OSS and others.


Code changes in detail

We piggyback on the existing standalone (local) mode: when the file already exists on the local machine it is read directly; when it does not, it is downloaded from the storage server via SFTP/FTP. The principle is the same as the existing HDFS flow: download first, then execute.
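As a minimal sketch of that local-first policy (fetchResource is a hypothetical helper name used only for illustration; the real change lives in TaskExecuteThread#downloadResource, shown later):

// Sketch only: illustrates the decision, not the actual patch.
// Assumes the SFTPUtil helper and the transfer.* constants introduced below.
private void fetchResource(String execLocalPath, String fullName, String remotePath) throws Exception {
    File resFile = new File(execLocalPath, fullName);
    if (resFile.exists()) {
        // the file is already on this worker: read it directly
        return;
    }
    // otherwise pull it from the storage server over SFTP
    SFTPUtil.connectServer(PropertyUtils.getString(Constants.TRANSFER_IP),
            PropertyUtils.getInt(Constants.TRANSFER_PORT),
            PropertyUtils.getString(Constants.TRANSFER_USERNAME),
            PropertyUtils.getString(Constants.TRANSFER_PASSWORD));
    try {
        SFTPUtil.downloadFile(remotePath, resFile.getAbsolutePath());
    } finally {
        SFTPUtil.close();
    }
}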

common.properties

  • Modify the relevant configuration: keep resource.storage.type=HDFS but set fs.defaultFS=file:/// so that HadoopUtils reads and writes the local filesystem (the existing standalone mode)


  • Add the server connection settings (used by the SFTP download)
transfer.enable=true
transfer.ip=192.168.38.5
# if using FTP instead, the default port is 21
transfer.port=22
transfer.username=dolphinscheduler
transfer.password=dob7@ZvT
  • Source
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# user data local directory path, please make sure the directory exists and have read write permissions
data.basedir.path=/tmp/dolphinscheduler

# resource storage type: HDFS, S3, NONE
resource.storage.type=HDFS

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
resource.upload.path=/tmp/dslocalfiletest

# whether to startup kerberos
hadoop.security.authentication.startup.state=false

# java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf

# login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM

# login user from keytab path
login.user.keytab.path=/opt/hdfs.headless.keytab

# kerberos expire time, the unit is hour
kerberos.expire.time=2

# resource view suffixs
#resource.view.suffixs=txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js

# if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
hdfs.root.user=hdfs

# if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
#fs.defaultFS=hdfs://mycluster:8020
fs.defaultFS=file:///

# if resource.storage.type=S3, s3 endpoint
fs.s3a.endpoint=http://192.168.xx.xx:9010

# if resource.storage.type=S3, s3 access key
fs.s3a.access.key=A3DXS30FO22544RE

# if resource.storage.type=S3, s3 secret key
fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK

# resourcemanager port, the default value is 8088 if not specified
resource.manager.httpaddress.port=8088

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx

# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://ds1:%s/ws/v1/cluster/apps/%s

# job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
yarn.job.history.status.address=http://ds1:19888/ws/v1/history/mapreduce/jobs/%s

# datasource encryption enable
datasource.encryption.enable=false

# datasource encryption salt
datasource.encryption.salt=!@#$%^&*

# Whether hive SQL is executed in the same session
support.hive.oneSession=false

# use sudo or not, if set true, executing user is tenant user and deploy user needs sudo permissions; if set false, executing user is the deploy user and doesn't need sudo permissions
sudo.enable=true

# network interface preferred like eth0, default: empty
#dolphin.scheduler.network.interface.preferred=

# network IP gets priority, default: inner outer
#dolphin.scheduler.network.priority.strategy=default

# system env path
#dolphinscheduler.env.path=env/dolphinscheduler_env.sh

# development state
development.state=false

#datasource.plugin.dir config
datasource.plugin.dir=lib/plugin/datasource

transfer.enable=true
transfer.ip=221.221.221.5
transfer.port=22
transfer.username=dolphinscheduler
transfer.password=dob7@ZvT

Constants.java

  • Add constants for the server settings (for consistency, callers reference these constants rather than raw strings; a short sketch of reading them follows the list)
    public static final String TRANSFER_ENABLE = "transfer.enable";
    public static final String TRANSFER_IP = "transfer.ip";
    public static final String TRANSFER_PORT = "transfer.port";
    public static final String TRANSFER_USERNAME = "transfer.username";
    public static final String TRANSFER_PASSWORD = "transfer.password";
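These constants are consumed through PropertyUtils, the same property reader the patched downloadResource uses below; a short sketch of a guard a caller might place before opening a connection:

// Sketch: reading the transfer.* settings added to common.properties.
boolean transferEnable = PropertyUtils.getBoolean(Constants.TRANSFER_ENABLE);
if (transferEnable) {
    String ip = PropertyUtils.getString(Constants.TRANSFER_IP);
    int port = PropertyUtils.getInt(Constants.TRANSFER_PORT);
    String user = PropertyUtils.getString(Constants.TRANSFER_USERNAME);
    String password = PropertyUtils.getString(Constants.TRANSFER_PASSWORD);
    SFTPUtil.connectServer(ip, port, user, password);
}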
  • Source
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.dolphinscheduler.common;

import org.apache.dolphinscheduler.common.enums.ExecutionStatus;

import org.apache.commons.lang.StringUtils;
import org.apache.commons.lang.SystemUtils;

import java.util.regex.Pattern;

/**
 * Constants
 */
public final class Constants {

    private Constants() {
        throw new UnsupportedOperationException("Construct Constants");
    }

    /**
     * quartz config
     */
    public static final String ORG_QUARTZ_JOBSTORE_DRIVERDELEGATECLASS = "org.quartz.jobStore.driverDelegateClass";
    public static final String ORG_QUARTZ_SCHEDULER_INSTANCENAME = "org.quartz.scheduler.instanceName";
    public static final String ORG_QUARTZ_SCHEDULER_INSTANCEID = "org.quartz.scheduler.instanceId";
    public static final String ORG_QUARTZ_SCHEDULER_MAKESCHEDULERTHREADDAEMON = "org.quartz.scheduler.makeSchedulerThreadDaemon";
    public static final String ORG_QUARTZ_JOBSTORE_USEPROPERTIES = "org.quartz.jobStore.useProperties";
    public static final String ORG_QUARTZ_THREADPOOL_CLASS = "org.quartz.threadPool.class";
    public static final String ORG_QUARTZ_THREADPOOL_THREADCOUNT = "org.quartz.threadPool.threadCount";
    public static final String ORG_QUARTZ_THREADPOOL_MAKETHREADSDAEMONS = "org.quartz.threadPool.makeThreadsDaemons";
    public static final String ORG_QUARTZ_THREADPOOL_THREADPRIORITY = "org.quartz.threadPool.threadPriority";
    public static final String ORG_QUARTZ_JOBSTORE_CLASS = "org.quartz.jobStore.class";
    public static final String ORG_QUARTZ_JOBSTORE_TABLEPREFIX = "org.quartz.jobStore.tablePrefix";
    public static final String ORG_QUARTZ_JOBSTORE_ISCLUSTERED = "org.quartz.jobStore.isClustered";
    public static final String ORG_QUARTZ_JOBSTORE_MISFIRETHRESHOLD = "org.quartz.jobStore.misfireThreshold";
    public static final String ORG_QUARTZ_JOBSTORE_CLUSTERCHECKININTERVAL = "org.quartz.jobStore.clusterCheckinInterval";
    public static final String ORG_QUARTZ_JOBSTORE_ACQUIRETRIGGERSWITHINLOCK = "org.quartz.jobStore.acquireTriggersWithinLock";
    public static final String ORG_QUARTZ_JOBSTORE_DATASOURCE = "org.quartz.jobStore.dataSource";
    public static final String ORG_QUARTZ_DATASOURCE_MYDS_CONNECTIONPROVIDER_CLASS = "org.quartz.dataSource.myDs.connectionProvider.class";
    public static final String ORG_QUARTZ_SCHEDULER_BATCHTRIGGERACQUISTITIONMAXCOUNT = "org.quartz.scheduler.batchTriggerAcquisitionMaxCount";
    /**
     * quartz config default value
     */
    public static final String QUARTZ_TABLE_PREFIX = "QRTZ_";
    public static final String QUARTZ_MISFIRETHRESHOLD = "60000";
    public static final String QUARTZ_CLUSTERCHECKININTERVAL = "5000";
    public static final String QUARTZ_DATASOURCE = "myDs";
    public static final String QUARTZ_THREADCOUNT = "25";
    public static final String QUARTZ_THREADPRIORITY = "5";
    public static final String QUARTZ_INSTANCENAME = "DolphinScheduler";
    public static final String QUARTZ_INSTANCEID = "AUTO";
    public static final String QUARTZ_ACQUIRETRIGGERSWITHINLOCK = "true";
    public static final String QUARTZ_BATCHTRIGGERACQUISTITIONMAXCOUNT = "100";

    /**
     * common properties path
     */
    public static final String COMMON_PROPERTIES_PATH = "/common.properties";

    /**
     * alert properties
     */
    public static final String ALERT_PLUGIN_BINDING = "alert.plugin.binding";
    public static final String ALERT_PLUGIN_DIR = "alert.plugin.dir";
    public static final int ALERT_RPC_PORT = 50052;

    /**
     * registry properties
     */
    public static final String REGISTRY_DOLPHINSCHEDULER_MASTERS = "/nodes/master";
    public static final String REGISTRY_DOLPHINSCHEDULER_WORKERS = "/nodes/worker";
    public static final String REGISTRY_DOLPHINSCHEDULER_DEAD_SERVERS = "/dead-servers";
    public static final String REGISTRY_DOLPHINSCHEDULER_NODE = "/nodes";
    public static final String REGISTRY_DOLPHINSCHEDULER_LOCK_MASTERS = "/lock/masters";
    public static final String REGISTRY_DOLPHINSCHEDULER_LOCK_FAILOVER_MASTERS = "/lock/failover/masters";
    public static final String REGISTRY_DOLPHINSCHEDULER_LOCK_FAILOVER_WORKERS = "/lock/failover/workers";
    public static final String REGISTRY_DOLPHINSCHEDULER_LOCK_FAILOVER_STARTUP_MASTERS = "/lock/failover/startup-masters";
    public static final String REGISTRY_SERVERS = "registry.servers";
    public static final String FOLDER_SEPARATOR = "/";

    /**
     * fs.defaultFS
     */
    public static final String FS_DEFAULTFS = "fs.defaultFS";


    /**
     * fs s3a endpoint
     */
    public static final String FS_S3A_ENDPOINT = "fs.s3a.endpoint";

    /**
     * fs s3a access key
     */
    public static final String FS_S3A_ACCESS_KEY = "fs.s3a.access.key";

    /**
     * fs s3a secret key
     */
    public static final String FS_S3A_SECRET_KEY = "fs.s3a.secret.key";


    /**
     * hadoop configuration
     */
    public static final String HADOOP_RM_STATE_ACTIVE = "ACTIVE";

    public static final String HADOOP_RM_STATE_STANDBY = "STANDBY";

    public static final String HADOOP_RESOURCE_MANAGER_HTTPADDRESS_PORT = "resource.manager.httpaddress.port";

    /**
     * yarn.resourcemanager.ha.rm.ids
     */
    public static final String YARN_RESOURCEMANAGER_HA_RM_IDS = "yarn.resourcemanager.ha.rm.ids";


    /**
     * yarn.application.status.address
     */
    public static final String YARN_APPLICATION_STATUS_ADDRESS = "yarn.application.status.address";

    /**
     * yarn.job.history.status.address
     */
    public static final String YARN_JOB_HISTORY_STATUS_ADDRESS = "yarn.job.history.status.address";

    /**
     * hdfs configuration
     * hdfs.root.user
     */
    public static final String HDFS_ROOT_USER = "hdfs.root.user";

    /**
     * hdfs/s3 configuration
     * resource.upload.path
     */
    public static final String RESOURCE_UPLOAD_PATH = "resource.upload.path";

    /**
     * data basedir path
     */
    public static final String DATA_BASEDIR_PATH = "data.basedir.path";

    /**
     * dolphinscheduler.env.path
     */
    public static final String DOLPHINSCHEDULER_ENV_PATH = "dolphinscheduler.env.path";

    /**
     * environment properties default path
     */
    public static final String ENV_PATH = "env/dolphinscheduler_env.sh";

    /**
     * python home
     */
    public static final String PYTHON_HOME = "PYTHON_HOME";

    /**
     * resource.view.suffixs
     */
    public static final String RESOURCE_VIEW_SUFFIXS = "resource.view.suffixs";

    public static final String RESOURCE_VIEW_SUFFIXS_DEFAULT_VALUE = "txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js";

    /**
     * development.state
     */
    public static final String DEVELOPMENT_STATE = "development.state";

    /**
     * sudo enable
     */
    public static final String SUDO_ENABLE = "sudo.enable";
    
    public static final String TRANSFER_ENABLE = "transfer.enable";
    public static final String TRANSFER_IP = "transfer.ip";
    public static final String TRANSFER_PORT = "transfer.port";
    public static final String TRANSFER_USERNAME = "transfer.username";
    public static final String TRANSFER_PASSWORD = "transfer.password";

    /**
     * string true
     */
    public static final String STRING_TRUE = "true";

    /**
     * string false
     */
    public static final String STRING_FALSE = "false";

    /**
     * resource storage type
     */
    public static final String RESOURCE_STORAGE_TYPE = "resource.storage.type";

    /**
     * comma ,
     */
    public static final String COMMA = ",";

    /**
     * COLON :
     */
    public static final String COLON = ":";

    /**
     * SPACE " "
     */
    public static final String SPACE = " ";

    /**
     * SINGLE_SLASH /
     */
    public static final String SINGLE_SLASH = "/";

    /**
     * DOUBLE_SLASH //
     */
    public static final String DOUBLE_SLASH = "//";

    /**
     * SINGLE_QUOTES "'"
     */
    public static final String SINGLE_QUOTES = "'";
    /**
     * DOUBLE_QUOTES "\""
     */
    public static final String DOUBLE_QUOTES = "\"";

    /**
     * SEMICOLON ;
     */
    public static final String SEMICOLON = ";";

    /**
     * EQUAL SIGN
     */
    public static final String EQUAL_SIGN = "=";
    /**
     * AT SIGN
     */
    public static final String AT_SIGN = "@";


    /**
     * date format of yyyy-MM-dd HH:mm:ss
     */
    public static final String YYYY_MM_DD_HH_MM_SS = "yyyy-MM-dd HH:mm:ss";


    /**
     * date format of yyyyMMddHHmmss
     */
    public static final String YYYYMMDDHHMMSS = "yyyyMMddHHmmss";

    /**
     * date format of yyyyMMddHHmmssSSS
     */
    public static final String YYYYMMDDHHMMSSSSS = "yyyyMMddHHmmssSSS";
    /**
     * http connect time out
     */
    public static final int HTTP_CONNECT_TIMEOUT = 60 * 1000;


    /**
     * http connect request time out
     */
    public static final int HTTP_CONNECTION_REQUEST_TIMEOUT = 60 * 1000;

    /**
     * httpclient socket timeout
     */
    public static final int SOCKET_TIMEOUT = 60 * 1000;

    /**
     * http header
     */
    public static final String HTTP_HEADER_UNKNOWN = "unKnown";

    /**
     * http X-Forwarded-For
     */
    public static final String HTTP_X_FORWARDED_FOR = "X-Forwarded-For";

    /**
     * http X-Real-IP
     */
    public static final String HTTP_X_REAL_IP = "X-Real-IP";

    /**
     * UTF-8
     */
    public static final String UTF_8 = "UTF-8";

    /**
     * user name regex
     */
    public static final Pattern REGEX_USER_NAME = Pattern.compile("^[a-zA-Z0-9._-]{3,39}$");
    
    /**
     * default display rows
     */
    public static final int DEFAULT_DISPLAY_ROWS = 10;

    /**
     * read permission
     */
    public static final int READ_PERMISSION = 2 * 1;


    /**
     * write permission
     */
    public static final int WRITE_PERMISSION = 2 * 2;


    /**
     * execute permission
     */
    public static final int EXECUTE_PERMISSION = 1;

    /**
     * default admin permission
     */
    public static final int DEFAULT_ADMIN_PERMISSION = 7;

    /**
     * default hash map size
     */
    public static final int DEFAULT_HASH_MAP_SIZE = 16;


    /**
     * all permissions
     */
    public static final int ALL_PERMISSIONS = READ_PERMISSION | WRITE_PERMISSION | EXECUTE_PERMISSION;

    /**
     * max task timeout
     */
    public static final int MAX_TASK_TIMEOUT = 24 * 3600;


    /**
     * master cpu load
     */
    public static final int DEFAULT_MASTER_CPU_LOAD = Runtime.getRuntime().availableProcessors() * 2;

    /**
     * worker cpu load
     */
    public static final int DEFAULT_WORKER_CPU_LOAD = Runtime.getRuntime().availableProcessors() * 2;

    /**
     * worker host weight
     */
    public static final int DEFAULT_WORKER_HOST_WEIGHT = 100;

    /**
     * default log cache rows num, output when the number is reached
     */
    public static final int DEFAULT_LOG_ROWS_NUM = 4 * 16;

    /**
     * log flush interval, output when the interval is reached
     */
    public static final int DEFAULT_LOG_FLUSH_INTERVAL = 1000;


    /**
     * time unit: seconds to minutes
     */
    public static final int SEC_2_MINUTES_TIME_UNIT = 60;

    /**
     * rpc port
     */
    public static final int RPC_PORT = 50051;

    /**
     * forbid running task
     */
    public static final String FLOWNODE_RUN_FLAG_FORBIDDEN = "FORBIDDEN";

    /**
     * normal running task
     */
    public static final String FLOWNODE_RUN_FLAG_NORMAL = "NORMAL";

    /**
     * datasource configuration path
     */

    public static final String COMMON_TASK_TYPE = "common";

    public static final String DEFAULT = "default";
    public static final String USER = "user";
    public static final String PASSWORD = "password";
    public static final String XXXXXX = "******";
    public static final String NULL = "NULL";
    public static final String THREAD_NAME_MASTER_SERVER = "Master-Server";
    public static final String THREAD_NAME_WORKER_SERVER = "Worker-Server";

    /**
     * command parameter keys
     */
    public static final String CMD_PARAM_RECOVER_PROCESS_ID_STRING = "ProcessInstanceId";

    public static final String CMD_PARAM_RECOVERY_START_NODE_STRING = "StartNodeIdList";

    public static final String CMD_PARAM_RECOVERY_WAITING_THREAD = "WaitingThreadInstanceId";

    public static final String CMD_PARAM_SUB_PROCESS = "processInstanceId";

    public static final String CMD_PARAM_EMPTY_SUB_PROCESS = "0";

    public static final String CMD_PARAM_SUB_PROCESS_PARENT_INSTANCE_ID = "parentProcessInstanceId";

    public static final String CMD_PARAM_SUB_PROCESS_DEFINE_CODE = "processDefinitionCode";

    public static final String CMD_PARAM_START_NODE_NAMES = "StartNodeNameList";

    public static final String CMD_PARAM_START_NODES = "StartNodeList";

    public static final String CMD_PARAM_START_PARAMS = "StartParams";

    public static final String CMD_PARAM_FATHER_PARAMS = "fatherParams";

    /**
     * complement data start date
     */
    public static final String CMDPARAM_COMPLEMENT_DATA_START_DATE = "complementStartDate";

    /**
     * complement data end date
     */
    public static final String CMDPARAM_COMPLEMENT_DATA_END_DATE = "complementEndDate";

    /**
     * complement date default cron string
     */
    public static final String DEFAULT_CRON_STRING = "0 0 0 * * ? *";

    public static final String SPRING_DATASOURCE_DRIVER_CLASS_NAME = "spring.datasource.driver-class-name";

    public static final String SPRING_DATASOURCE_URL = "spring.datasource.url";

    public static final String SPRING_DATASOURCE_USERNAME = "spring.datasource.username";

    public static final String SPRING_DATASOURCE_PASSWORD = "spring.datasource.password";

    public static final String SPRING_DATASOURCE_CONNECTION_TIMEOUT = "spring.datasource.connectionTimeout";

    public static final String SPRING_DATASOURCE_MIN_IDLE = "spring.datasource.minIdle";

    public static final String SPRING_DATASOURCE_MAX_ACTIVE = "spring.datasource.maxActive";

    public static final String SPRING_DATASOURCE_IDLE_TIMEOUT = "spring.datasource.idleTimeout";

    public static final String SPRING_DATASOURCE_MAX_LIFE_TIME = "spring.datasource.maxLifetime";

    public static final String SPRING_DATASOURCE_VALIDATION_TIMEOUT = "spring.datasource.validationTimeout";

    public static final String SPRING_DATASOURCE_VALIDATION_QUERY = "spring.datasource.validationQuery";

    public static final String SPRING_DATASOURCE_LEAK_DETECTION_THRESHOLD = "spring.datasource.leakDetectionThreshold";

    public static final String SPRING_DATASOURCE_INITIALIZATION_FAIL_TIMEOUT = "spring.datasource.initializationFailTimeout";

    public static final String SPRING_DATASOURCE_IS_AUTOCOMMIT = "spring.datasource.isAutoCommit";

    public static final String SPRING_DATASOURCE_CACHE_PREP_STMTS = "spring.datasource.cachePrepStmts";

    public static final String SPRING_DATASOURCE_PREP_STMT_CACHE_SIZE = "spring.datasource.prepStmtCacheSize";

    public static final String SPRING_DATASOURCE_PREP_STMT_CACHE_SQL_LIMIT = "spring.datasource.prepStmtCacheSqlLimit";

    public static final String CACHE_PREP_STMTS = "cachePrepStmts";

    public static final String PREP_STMT_CACHE_SIZE = "prepStmtCacheSize";

    public static final String PREP_STMT_CACHE_SQL_LIMIT = "prepStmtCacheSqlLimit";

    public static final String QUARTZ_PROPERTIES_PATH = "quartz.properties";

    /**
     * sleep time
     */
    public static final int SLEEP_TIME_MILLIS = 1000;

    /**
     * one second mils
     */
    public static final int SECOND_TIME_MILLIS = 1000;

    /**
     * master task instance cache-database refresh interval
     */
    public static final int CACHE_REFRESH_TIME_MILLIS = 20 * 1000;

    /**
     * heartbeat for zk info length
     */
    public static final int HEARTBEAT_FOR_ZOOKEEPER_INFO_LENGTH = 14;

    /**
     * jar
     */
    public static final String JAR = "jar";

    /**
     * hadoop
     */
    public static final String HADOOP = "hadoop";

    /**
     * -D <property>=<value>
     */
    public static final String D = "-D";

    /**
     * -D mapreduce.job.name=name
     */
    public static final String MR_NAME = "mapreduce.job.name";

    /**
     * -D mapreduce.job.queuename=queuename
     */
    public static final String MR_QUEUE = "mapreduce.job.queuename";


    /**
     * spark params constant
     */
    public static final String MASTER = "--master";

    public static final String DEPLOY_MODE = "--deploy-mode";

    /**
     * --class CLASS_NAME
     */
    public static final String MAIN_CLASS = "--class";

    /**
     * --driver-cores NUM
     */
    public static final String DRIVER_CORES = "--driver-cores";

    /**
     * --driver-memory MEM
     */
    public static final String DRIVER_MEMORY = "--driver-memory";

    /**
     * --num-executors NUM
     */
    public static final String NUM_EXECUTORS = "--num-executors";

    /**
     * --executor-cores NUM
     */
    public static final String EXECUTOR_CORES = "--executor-cores";

    /**
     * --executor-memory MEM
     */
    public static final String EXECUTOR_MEMORY = "--executor-memory";

    /**
     * --name NAME
     */
    public static final String SPARK_NAME = "--name";

    /**
     * --queue QUEUE
     */
    public static final String SPARK_QUEUE = "--queue";


    /**
     * exit code success
     */
    public static final int EXIT_CODE_SUCCESS = 0;

    /**
     * exit code kill
     */
    public static final int EXIT_CODE_KILL = 137;

    /**
     * exit code failure
     */
    public static final int EXIT_CODE_FAILURE = -1;

    /**
     * process or task definition failure
     */
    public static final int DEFINITION_FAILURE = -1;

    /**
     * process or task definition first version
     */
    public static final int VERSION_FIRST  = 1;

    /**
     * date format of yyyyMMdd
     */
    public static final String PARAMETER_FORMAT_DATE = "yyyyMMdd";

    /**
     * date format of yyyyMMddHHmmss
     */
    public static final String PARAMETER_FORMAT_TIME = "yyyyMMddHHmmss";

    /**
     * system date(yyyyMMddHHmmss)
     */
    public static final String PARAMETER_DATETIME = "system.datetime";

    /**
     * system date(yyyymmdd) today
     */
    public static final String PARAMETER_CURRENT_DATE = "system.biz.curdate";

    /**
     * system date(yyyymmdd) yesterday
     */
    public static final String PARAMETER_BUSINESS_DATE = "system.biz.date";

    /**
     * the absolute path of current executing task
     */
    public static final String PARAMETER_TASK_EXECUTE_PATH = "system.task.execute.path";

    /**
     * the instance id of current task
     */
    public static final String PARAMETER_TASK_INSTANCE_ID = "system.task.instance.id";

    /**
     * ACCEPTED
     */
    public static final String ACCEPTED = "ACCEPTED";

    /**
     * SUCCEEDED
     */
    public static final String SUCCEEDED = "SUCCEEDED";
    /**
     * ENDED
     */
    public static final String ENDED = "ENDED";
    /**
     * NEW
     */
    public static final String NEW = "NEW";
    /**
     * NEW_SAVING
     */
    public static final String NEW_SAVING = "NEW_SAVING";
    /**
     * SUBMITTED
     */
    public static final String SUBMITTED = "SUBMITTED";
    /**
     * FAILED
     */
    public static final String FAILED = "FAILED";
    /**
     * KILLED
     */
    public static final String KILLED = "KILLED";
    /**
     * RUNNING
     */
    public static final String RUNNING = "RUNNING";
    /**
     * underline  "_"
     */
    public static final String UNDERLINE = "_";
    /**
     * quartz job prefix
     */
    public static final String QUARTZ_JOB_PRIFIX = "job";
    /**
     * quartz job group prefix
     */
    public static final String QUARTZ_JOB_GROUP_PRIFIX = "jobgroup";
    /**
     * projectId
     */
    public static final String PROJECT_ID = "projectId";
    /**
     * processId
     */
    public static final String SCHEDULE_ID = "scheduleId";
    /**
     * schedule
     */
    public static final String SCHEDULE = "schedule";
    /**
     * application regex
     */
    public static final String APPLICATION_REGEX = "application_\\d+_\\d+";
    public static final String PID = SystemUtils.IS_OS_WINDOWS ? "handle" : "pid";
    /**
     * month_begin
     */
    public static final String MONTH_BEGIN = "month_begin";
    /**
     * add_months
     */
    public static final String ADD_MONTHS = "add_months";
    /**
     * month_end
     */
    public static final String MONTH_END = "month_end";
    /**
     * week_begin
     */
    public static final String WEEK_BEGIN = "week_begin";
    /**
     * week_end
     */
    public static final String WEEK_END = "week_end";
    /**
     * timestamp
     */
    public static final String TIMESTAMP = "timestamp";
    public static final char SUBTRACT_CHAR = '-';
    public static final char ADD_CHAR = '+';
    public static final char MULTIPLY_CHAR = '*';
    public static final char DIVISION_CHAR = '/';
    public static final char LEFT_BRACE_CHAR = '(';
    public static final char RIGHT_BRACE_CHAR = ')';
    public static final String ADD_STRING = "+";
    public static final String STAR = "*";
    public static final String DIVISION_STRING = "/";
    public static final String LEFT_BRACE_STRING = "(";
    public static final char P = 'P';
    public static final char N = 'N';
    public static final String SUBTRACT_STRING = "-";
    public static final String GLOBAL_PARAMS = "globalParams";
    public static final String LOCAL_PARAMS = "localParams";
    public static final String LOCAL_PARAMS_LIST = "localParamsList";
    public static final String SUBPROCESS_INSTANCE_ID = "subProcessInstanceId";
    public static final String PROCESS_INSTANCE_STATE = "processInstanceState";
    public static final String PARENT_WORKFLOW_INSTANCE = "parentWorkflowInstance";
    public static final String CONDITION_RESULT = "conditionResult";
    public static final String SWITCH_RESULT = "switchResult";
    public static final String WAIT_START_TIMEOUT = "waitStartTimeout";
    public static final String DEPENDENCE = "dependence";
    public static final String TASK_TYPE = "taskType";
    public static final String TASK_LIST = "taskList";
    public static final String WARNING_GROUP_NAME="warningGroupName";
    public static final String RWXR_XR_X = "rwxr-xr-x";
    public static final String QUEUE = "queue";
    public static final String QUEUE_NAME = "queueName";
    public static final int LOG_QUERY_SKIP_LINE_NUMBER = 0;
    public static final int LOG_QUERY_LIMIT = 4096;

    /**
     * master/worker server use for zk
     */
    public static final String MASTER_TYPE = "master";
    public static final String WORKER_TYPE = "worker";
    public static final String DELETE_OP = "delete";
    public static final String ADD_OP = "add";
    public static final String ALIAS = "alias";
    public static final String CONTENT = "content";
    public static final String DEPENDENT_SPLIT = ":||";
    public static final String DEPENDENT_ALL = "ALL";
    public static final long DEPENDENT_ALL_TASK_CODE = 0;



    /**
     * preview schedule execute count
     */
    public static final int PREVIEW_SCHEDULE_EXECUTE_COUNT = 5;

    /**
     * kerberos
     */
    public static final String KERBEROS = "kerberos";

    /**
     * kerberos expire time
     */
    public static final String KERBEROS_EXPIRE_TIME = "kerberos.expire.time";

    /**
     * java.security.krb5.conf
     */
    public static final String JAVA_SECURITY_KRB5_CONF = "java.security.krb5.conf";

    /**
     * java.security.krb5.conf.path
     */
    public static final String JAVA_SECURITY_KRB5_CONF_PATH = "java.security.krb5.conf.path";

    /**
     * hadoop.security.authentication
     */
    public static final String HADOOP_SECURITY_AUTHENTICATION = "hadoop.security.authentication";

    /**
     * hadoop.security.authentication
     */
    public static final String HADOOP_SECURITY_AUTHENTICATION_STARTUP_STATE = "hadoop.security.authentication.startup.state";

    /**
     * com.amazonaws.services.s3.enableV4
     */
    public static final String AWS_S3_V4 = "com.amazonaws.services.s3.enableV4";

    /**
     * loginUserFromKeytab user
     */
    public static final String LOGIN_USER_KEY_TAB_USERNAME = "login.user.keytab.username";

    /**
     * loginUserFromKeytab path
     */
    public static final String LOGIN_USER_KEY_TAB_PATH = "login.user.keytab.path";

    /**
     * task log info format
     */
    public static final String TASK_LOG_INFO_FORMAT = "TaskLogInfo-%s";

    /**
     * hive conf
     */
    public static final String HIVE_CONF = "hiveconf:";

    /**
     * flink
     */
    public static final String FLINK_YARN_CLUSTER = "yarn-cluster";
    public static final String FLINK_RUN_MODE = "-m";
    public static final String FLINK_YARN_SLOT = "-ys";
    public static final String FLINK_APP_NAME = "-ynm";
    public static final String FLINK_QUEUE = "-yqu";
    public static final String FLINK_TASK_MANAGE = "-yn";

    public static final String FLINK_JOB_MANAGE_MEM = "-yjm";
    public static final String FLINK_TASK_MANAGE_MEM = "-ytm";
    public static final String FLINK_MAIN_CLASS = "-c";
    public static final String FLINK_PARALLELISM = "-p";
    public static final String FLINK_SHUTDOWN_ON_ATTACHED_EXIT = "-sae";


    public static final int[] NOT_TERMINATED_STATES = new int[] {
        ExecutionStatus.SUBMITTED_SUCCESS.ordinal(),
        ExecutionStatus.RUNNING_EXECUTION.ordinal(),
        ExecutionStatus.DELAY_EXECUTION.ordinal(),
        ExecutionStatus.READY_PAUSE.ordinal(),
        ExecutionStatus.READY_STOP.ordinal(),
        ExecutionStatus.NEED_FAULT_TOLERANCE.ordinal(),
        ExecutionStatus.WAITING_THREAD.ordinal(),
        ExecutionStatus.WAITING_DEPEND.ordinal()
    };

    /**
     * status
     */
    public static final String STATUS = "status";

    /**
     * message
     */
    public static final String MSG = "msg";

    /**
     * data total
     */
    public static final String COUNT = "count";

    /**
     * page size
     */
    public static final String PAGE_SIZE = "pageSize";

    /**
     * current page no
     */
    public static final String PAGE_NUMBER = "pageNo";


    /**
     *
     */
    public static final String DATA_LIST = "data";

    public static final String TOTAL_LIST = "totalList";

    public static final String CURRENT_PAGE = "currentPage";

    public static final String TOTAL_PAGE = "totalPage";

    public static final String TOTAL = "total";

    /**
     * workflow
     */
    public static final String WORKFLOW_LIST = "workFlowList";
    public static final String WORKFLOW_RELATION_LIST = "workFlowRelationList";

    /**
     * session user
     */
    public static final String SESSION_USER = "session.user";

    public static final String SESSION_ID = "sessionId";

    public static final String PASSWORD_DEFAULT = "******";

    /**
     * locale
     */
    public static final String LOCALE_LANGUAGE = "language";

    /**
     * driver
     */
    public static final String ORG_POSTGRESQL_DRIVER = "org.postgresql.Driver";
    public static final String COM_MYSQL_JDBC_DRIVER = "com.mysql.jdbc.Driver";
    public static final String ORG_APACHE_HIVE_JDBC_HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";
    public static final String COM_CLICKHOUSE_JDBC_DRIVER = "ru.yandex.clickhouse.ClickHouseDriver";
    public static final String COM_ORACLE_JDBC_DRIVER = "oracle.jdbc.driver.OracleDriver";
    public static final String COM_SQLSERVER_JDBC_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
    public static final String COM_DB2_JDBC_DRIVER = "com.ibm.db2.jcc.DB2Driver";
    public static final String COM_PRESTO_JDBC_DRIVER = "com.facebook.presto.jdbc.PrestoDriver";


    /**
     * validation Query
     */
    public static final String POSTGRESQL_VALIDATION_QUERY = "select version()";
    public static final String MYSQL_VALIDATION_QUERY = "select 1";
    public static final String HIVE_VALIDATION_QUERY = "select 1";
    public static final String CLICKHOUSE_VALIDATION_QUERY = "select 1";
    public static final String ORACLE_VALIDATION_QUERY = "select 1 from dual";
    public static final String SQLSERVER_VALIDATION_QUERY = "select 1";
    public static final String DB2_VALIDATION_QUERY = "select 1 from sysibm.sysdummy1";
    public static final String PRESTO_VALIDATION_QUERY = "select 1";

    /**
     * database type
     */
    public static final String MYSQL = "MYSQL";
    public static final String POSTGRESQL = "POSTGRESQL";
    public static final String HIVE = "HIVE";
    public static final String SPARK = "SPARK";
    public static final String CLICKHOUSE = "CLICKHOUSE";
    public static final String ORACLE = "ORACLE";
    public static final String SQLSERVER = "SQLSERVER";
    public static final String DB2 = "DB2";
    public static final String PRESTO = "PRESTO";

    /**
     * jdbc url
     */
    public static final String JDBC_MYSQL = "jdbc:mysql://";
    public static final String JDBC_POSTGRESQL = "jdbc:postgresql://";
    public static final String JDBC_HIVE_2 = "jdbc:hive2://";
    public static final String JDBC_CLICKHOUSE = "jdbc:clickhouse://";
    public static final String JDBC_ORACLE_SID = "jdbc:oracle:thin:@";
    public static final String JDBC_ORACLE_SERVICE_NAME = "jdbc:oracle:thin:@//";
    public static final String JDBC_SQLSERVER = "jdbc:sqlserver://";
    public static final String JDBC_DB2 = "jdbc:db2://";
    public static final String JDBC_PRESTO = "jdbc:presto://";


    public static final String ADDRESS = "address";
    public static final String DATABASE = "database";
    public static final String JDBC_URL = "jdbcUrl";
    public static final String PRINCIPAL = "principal";
    public static final String OTHER = "other";
    public static final String ORACLE_DB_CONNECT_TYPE = "connectType";
    public static final String KERBEROS_KRB5_CONF_PATH = "javaSecurityKrb5Conf";
    public static final String KERBEROS_KEY_TAB_USERNAME = "loginUserKeytabUsername";
    public static final String KERBEROS_KEY_TAB_PATH = "loginUserKeytabPath";

    /**
     * session timeout
     */
    public static final int SESSION_TIME_OUT = 7200;
    public static final int MAX_FILE_SIZE = 1024 * 1024 * 1024;
    public static final String UDF = "UDF";
    public static final String CLASS = "class";
    public static final String RECEIVERS = "receivers";
    public static final String RECEIVERS_CC = "receiversCc";


    /**
     * dataSource sensitive param
     */
    public static final String DATASOURCE_PASSWORD_REGEX = "(?<=((?i)password((\\\\\":\\\\\")|(=')))).*?(?=((\\\\\")|(')))";

    /**
     * default worker group
     */
    public static final String DEFAULT_WORKER_GROUP = "default";

    public static final Integer TASK_INFO_LENGTH = 5;

    /**
     * new
     * schedule time
     */
    public static final String PARAMETER_SHECDULE_TIME = "schedule.time";
    /**
     * authorize writable perm
     */
    public static final int AUTHORIZE_WRITABLE_PERM = 7;
    /**
     * authorize readable perm
     */
    public static final int AUTHORIZE_READABLE_PERM = 4;

    public static final int NORMAL_NODE_STATUS = 0;
    public static final int ABNORMAL_NODE_STATUS = 1;
    public static final int BUSY_NODE_STATUE = 2;

    public static final String START_TIME = "start time";
    public static final String END_TIME = "end time";
    public static final String START_END_DATE = "startDate,endDate";

    /**
     * system line separator
     */
    public static final String SYSTEM_LINE_SEPARATOR = System.getProperty("line.separator");

    /**
     * datasource encryption salt
     */
    public static final String DATASOURCE_ENCRYPTION_SALT_DEFAULT = "!@#$%^&*";
    public static final String DATASOURCE_ENCRYPTION_ENABLE = "datasource.encryption.enable";
    public static final String DATASOURCE_ENCRYPTION_SALT = "datasource.encryption.salt";

    /**
     * network interface preferred
     */
    public static final String DOLPHIN_SCHEDULER_NETWORK_INTERFACE_PREFERRED = "dolphin.scheduler.network.interface.preferred";

    /**
     * network IP gets priority, default inner outer
     */
    public static final String DOLPHIN_SCHEDULER_NETWORK_PRIORITY_STRATEGY = "dolphin.scheduler.network.priority.strategy";

    /**
     * exec shell scripts
     */
    public static final String SH = "sh";

    /**
     * pstree, get pid and sub pids
     */
    public static final String PSTREE = "pstree";

    /**
     * snow flake, data center id, this id must be greater than 0 and less than 32
     */
    public static final String SNOW_FLAKE_DATA_CENTER_ID = "data.center.id";

    /**
     * docker & kubernetes
     */
    public static final boolean DOCKER_MODE = !StringUtils.isEmpty(System.getenv("DOCKER"));
    public static final Boolean KUBERNETES_MODE = !StringUtils.isEmpty(System.getenv("KUBERNETES_SERVICE_HOST")) && !StringUtils.isEmpty(System.getenv("KUBERNETES_SERVICE_PORT"));

    /**
     * dry run flag
     */
    public static final int DRY_RUN_FLAG_NO = 0;
    public static final int DRY_RUN_FLAG_YES = 1;

    public static final String CACHE_KEY_VALUE_ALL = "'all'";
}

TaskExecuteThread

  • Add the code that fetches resource files from the transfer server


  • Source
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.dolphinscheduler.server.worker.runner;

import org.apache.dolphinscheduler.common.Constants;
import org.apache.dolphinscheduler.common.enums.Event;
import org.apache.dolphinscheduler.common.enums.ExecutionStatus;
import org.apache.dolphinscheduler.common.enums.TaskType;
import org.apache.dolphinscheduler.common.process.Property;
import org.apache.dolphinscheduler.common.utils.*;
import org.apache.dolphinscheduler.remote.command.Command;
import org.apache.dolphinscheduler.remote.command.TaskExecuteAckCommand;
import org.apache.dolphinscheduler.remote.command.TaskExecuteResponseCommand;
import org.apache.dolphinscheduler.server.utils.LogUtils;
import org.apache.dolphinscheduler.server.utils.ProcessUtils;
import org.apache.dolphinscheduler.server.worker.cache.ResponceCache;
import org.apache.dolphinscheduler.server.worker.plugin.TaskPluginManager;
import org.apache.dolphinscheduler.server.worker.processor.TaskCallbackService;
import org.apache.dolphinscheduler.service.alert.AlertClientService;
import org.apache.dolphinscheduler.service.queue.entity.TaskExecutionContext;
import org.apache.dolphinscheduler.spi.task.AbstractTask;
import org.apache.dolphinscheduler.spi.task.TaskAlertInfo;
import org.apache.dolphinscheduler.spi.task.TaskChannel;
import org.apache.dolphinscheduler.spi.task.TaskConstants;
import org.apache.dolphinscheduler.spi.task.TaskExecutionContextCacheManager;
import org.apache.dolphinscheduler.spi.task.request.TaskRequest;

import org.apache.commons.collections.MapUtils;
import org.apache.commons.lang.StringUtils;

import java.io.File;
import java.io.IOException;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Delayed;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.github.rholder.retry.RetryException;

/**
 * task scheduler thread
 */
public class TaskExecuteThread implements Runnable, Delayed {
    /**
     * logger
     */
    private final Logger logger = LoggerFactory.getLogger(TaskExecuteThread.class);

    /**
     * task instance
     */
    private TaskExecutionContext taskExecutionContext;

    /**
     * abstract task
     */
    private AbstractTask task;

    /**
     * task callback service
     */
    private TaskCallbackService taskCallbackService;

    /**
     * alert client service
     */
    private AlertClientService alertClientService;

    private TaskPluginManager taskPluginManager;

    /**
     * constructor
     *
     * @param taskExecutionContext taskExecutionContext
     * @param taskCallbackService taskCallbackService
     */
    public TaskExecuteThread(TaskExecutionContext taskExecutionContext,
                             TaskCallbackService taskCallbackService,
                             AlertClientService alertClientService) {
        this.taskExecutionContext = taskExecutionContext;
        this.taskCallbackService = taskCallbackService;
        this.alertClientService = alertClientService;
    }

    public TaskExecuteThread(TaskExecutionContext taskExecutionContext,
                             TaskCallbackService taskCallbackService,
                             AlertClientService alertClientService,
                             TaskPluginManager taskPluginManager) {
        this.taskExecutionContext = taskExecutionContext;
        this.taskCallbackService = taskCallbackService;
        this.alertClientService = alertClientService;
        this.taskPluginManager = taskPluginManager;
    }
    @Override
    public void run() {
        TaskExecuteResponseCommand responseCommand = new TaskExecuteResponseCommand(taskExecutionContext.getTaskInstanceId(), taskExecutionContext.getProcessInstanceId());
        try {
            taskExecutionContext.setLogPath(LogUtils.getTaskLogPath(taskExecutionContext));

            // local execute path
            String execLocalPath = getExecLocalPath(taskExecutionContext);
            FileUtils.createWorkDirIfAbsent(execLocalPath);
            logger.info("task instance local execute path : {}", execLocalPath);
            taskExecutionContext.setExecutePath(execLocalPath);

            // check if the OS user exists
            if (!OSUtils.getUserList().contains(taskExecutionContext.getTenantCode())) {
                String errorLog = String.format("tenantCode: %s does not exist", taskExecutionContext.getTenantCode());
                logger.error(errorLog);
                responseCommand.setStatus(ExecutionStatus.FAILURE.getCode());
                responseCommand.setEndTime(new Date());
                return;
            }

            if (taskExecutionContext.getStartTime() == null) {
                taskExecutionContext.setStartTime(new Date());
            }
            if (taskExecutionContext.getCurrentExecutionStatus() != ExecutionStatus.RUNNING_EXECUTION) {
                changeTaskExecutionStatusToRunning();
            }
            logger.info("the task begins to execute. task instance id: {}", taskExecutionContext.getTaskInstanceId());
            taskExecutionContext.setCurrentExecutionStatus(ExecutionStatus.RUNNING_EXECUTION);
            sendTaskExecuteRunningCommand(taskExecutionContext);
            int dryRun = taskExecutionContext.getDryRun();
            // copy hdfs/minio file to local
            if (dryRun == Constants.DRY_RUN_FLAG_NO) {
                downloadResource(taskExecutionContext.getExecutePath(),
                        taskExecutionContext.getResources(),
                        taskExecutionContext.getHost(),
                        logger);
            }

            taskExecutionContext.setEnvFile(CommonUtils.getSystemEnvPath());
            taskExecutionContext.setDefinedParams(getGlobalParamsMap());

            taskExecutionContext.setTaskAppId(String.format("%s_%s",
                    taskExecutionContext.getProcessInstanceId(),
                    taskExecutionContext.getTaskInstanceId()));

            preBuildBusinessParams();

            TaskChannel taskChannel = taskPluginManager.getTaskChannelMap().get(taskExecutionContext.getTaskType());
            if (null == taskChannel) {
                throw new RuntimeException(String.format("%s Task Plugin Not Found,Please Check Config File.", taskExecutionContext.getTaskType()));
            }
            TaskRequest taskRequest = JSONUtils.parseObject(JSONUtils.toJsonString(taskExecutionContext), TaskRequest.class);
            String taskLogName = LoggerUtils.buildTaskId(LoggerUtils.TASK_LOGGER_INFO_PREFIX,
                    taskExecutionContext.getProcessDefineCode(),
                    taskExecutionContext.getProcessDefineVersion(),
                    taskExecutionContext.getProcessInstanceId(),
                    taskExecutionContext.getTaskInstanceId());
            taskRequest.setTaskLogName(taskLogName);

            // set the name of the current thread
            Thread.currentThread().setName(String.format(TaskConstants.TASK_LOGGER_THREAD_NAME_FORMAT, taskLogName));

            task = taskChannel.createTask(taskRequest);

            // task init
            this.task.init();

            // init varPool
            this.task.getParameters().setVarPool(taskExecutionContext.getVarPool());

            if (dryRun == Constants.DRY_RUN_FLAG_NO) {
                // task handle
                this.task.handle();

                // task result process
                if (this.task.getNeedAlert()) {
                    sendAlert(this.task.getTaskAlertInfo());
                }
                responseCommand.setStatus(this.task.getExitStatus().getCode());
            } else {
                responseCommand.setStatus(ExecutionStatus.SUCCESS.getCode());
                task.setExitStatusCode(Constants.EXIT_CODE_SUCCESS);
            }
            responseCommand.setEndTime(new Date());
            responseCommand.setProcessId(this.task.getProcessId());
            responseCommand.setAppIds(this.task.getAppIds());
            responseCommand.setVarPool(JSONUtils.toJsonString(this.task.getParameters().getVarPool()));
            logger.info("task instance id : {},task final status : {}", taskExecutionContext.getTaskInstanceId(), this.task.getExitStatus());
        } catch (Throwable e) {
            logger.error("task scheduler failure", e);
            kill();
            responseCommand.setStatus(ExecutionStatus.FAILURE.getCode());
            responseCommand.setEndTime(new Date());
            responseCommand.setProcessId(task.getProcessId());
            responseCommand.setAppIds(task.getAppIds());
        } finally {
            TaskExecutionContextCacheManager.removeByTaskInstanceId(taskExecutionContext.getTaskInstanceId());
            ResponceCache.get().cache(taskExecutionContext.getTaskInstanceId(), responseCommand.convert2Command(), Event.RESULT);
            taskCallbackService.sendResult(taskExecutionContext.getTaskInstanceId(), responseCommand.convert2Command());
            clearTaskExecPath();
        }
    }
    /**
     * get execute local path
     *
     * @param taskExecutionContext taskExecutionContext
     * @return execute local path
     */
    private String getExecLocalPath(TaskExecutionContext taskExecutionContext) {
        return FileUtils.getProcessExecDir(taskExecutionContext.getProjectCode(),
                taskExecutionContext.getProcessDefineCode(),
                taskExecutionContext.getProcessDefineVersion(),
                taskExecutionContext.getProcessInstanceId(),
                taskExecutionContext.getTaskInstanceId());
    }
    private void sendTaskExecuteRunningCommand(TaskExecutionContext taskExecutionContext) {
        TaskExecuteAckCommand command = buildTaskExecuteRunningCommand(taskExecutionContext);
        // add response cache
        ResponceCache.get().cache(taskExecutionContext.getTaskInstanceId(), command.convert2Command(), Event.ACK);
        taskCallbackService.sendAck(taskExecutionContext.getTaskInstanceId(), command.convert2Command());
    }

    private TaskExecuteAckCommand buildTaskExecuteRunningCommand(TaskExecutionContext taskExecutionContext) {
        TaskExecuteAckCommand command = new TaskExecuteAckCommand();
        command.setTaskInstanceId(taskExecutionContext.getTaskInstanceId());
        command.setProcessInstanceId(taskExecutionContext.getProcessInstanceId());
        command.setStatus(taskExecutionContext.getCurrentExecutionStatus().getCode());
        command.setLogPath(taskExecutionContext.getLogPath());
        command.setHost(taskExecutionContext.getHost());
        command.setStartTime(taskExecutionContext.getStartTime());
        command.setExecutePath(taskExecutionContext.getExecutePath());
        return command;
    }

    private void sendAlert(TaskAlertInfo taskAlertInfo) {
        alertClientService.sendAlert(taskAlertInfo.getAlertGroupId(), taskAlertInfo.getTitle(), taskAlertInfo.getContent());
    }
    /**
     * when task finish, clear execute path.
     */
    private void clearTaskExecPath() {
        logger.info("develop mode is: {}", CommonUtils.isDevelopMode());
        if (!CommonUtils.isDevelopMode()) {
            // get exec dir
            String execLocalPath = taskExecutionContext.getExecutePath();
            if (StringUtils.isEmpty(execLocalPath)) {
                logger.warn("task: {} exec local path is empty.", taskExecutionContext.getTaskName());
                return;
            }

            if ("/".equals(execLocalPath)) {
                logger.warn("task: {} exec local path is '/', direct deletion is not allowed", taskExecutionContext.getTaskName());
                return;
            }

            try {
                org.apache.commons.io.FileUtils.deleteDirectory(new File(execLocalPath));
                logger.info("exec local path: {} cleared.", execLocalPath);
            } catch (IOException e) {
                logger.error("delete exec dir failed : {}", e.getMessage(), e);
            }
        }
    }
    /**
     * get global params map
     *
     * @return map
     */
    private Map<String, String> getGlobalParamsMap() {
        Map<String, String> globalParamsMap = new HashMap<>(16);
        // global params string
        String globalParamsStr = taskExecutionContext.getGlobalParams();
        if (globalParamsStr != null) {
            List<Property> globalParamsList = JSONUtils.toList(globalParamsStr, Property.class);
            globalParamsMap.putAll(globalParamsList.stream().collect(Collectors.toMap(Property::getProp, Property::getValue)));
        }
        return globalParamsMap;
    }
    /**
     * kill task
     */
    public void kill() {
        if (task != null) {
            try {
                task.cancelApplication(true);
                ProcessUtils.killYarnJob(taskExecutionContext);
            } catch (Exception e) {
                logger.error(e.getMessage(), e);
            }
        }
    }
    /**
     * download resource file
     *
     * @param execLocalPath execLocalPath
     * @param projectRes projectRes
     * @param host worker host
     * @param logger logger
     */
    private void downloadResource(String execLocalPath, Map<String, String> projectRes, String host, Logger logger) {
        if (MapUtils.isEmpty(projectRes)) {
            return;
        }
        Set<Map.Entry<String, String>> resEntries = projectRes.entrySet();
        for (Map.Entry<String, String> resource : resEntries) {
            String fullName = resource.getKey();
            String tenantCode = resource.getValue();
            File resFile = new File(execLocalPath, fullName);
            if (!resFile.exists()) {
                try {
                    // query the tenant code of the resource according to the name of the resource
                    String resHdfsPath = HadoopUtils.getHdfsResourceFileName(tenantCode, fullName);
                    if (PropertyUtils.getBoolean(Constants.TRANSFER_ENABLE) && host.contains(PropertyUtils.getString(Constants.TRANSFER_IP))) {
                        logger.info("get resource file from local ip: {} path : {}", PropertyUtils.getString(Constants.TRANSFER_IP), resHdfsPath);
                        SFTPUtil.connectServer(PropertyUtils.getString(Constants.TRANSFER_IP), PropertyUtils.getInt(Constants.TRANSFER_PORT),
                                PropertyUtils.getString(Constants.TRANSFER_USERNAME), PropertyUtils.getString(Constants.TRANSFER_PASSWORD));
                        SFTPUtil.downloadFile(resHdfsPath, execLocalPath + File.separator + fullName.replaceAll("[/]", ""));
                        SFTPUtil.close();
                    } else {
                        logger.info("get resource file from hdfs or local : {}", resHdfsPath);
                        HadoopUtils.getInstance().copyHdfsToLocal(resHdfsPath, execLocalPath + File.separator + fullName, false, true);
                    }
                } catch (Exception e) {
                    logger.error(e.getMessage(), e);
                    if (PropertyUtils.getBoolean(Constants.TRANSFER_ENABLE)) {
                        SFTPUtil.close();
                    }
                    throw new RuntimeException(e.getMessage());
                }
            } else {
                logger.info("file : {} exists ", resFile.getName());
            }
        }
    }
    /**
     * send an ack to change the status of the task.
     */
    private void changeTaskExecutionStatusToRunning() {
        taskExecutionContext.setCurrentExecutionStatus(ExecutionStatus.RUNNING_EXECUTION);
        Command ackCommand = buildAckCommand().convert2Command();
        try {
            RetryerUtils.retryCall(() -> {
                taskCallbackService.sendAck(taskExecutionContext.getTaskInstanceId(), ackCommand);
                return Boolean.TRUE;
            });
        } catch (ExecutionException | RetryException e) {
            logger.error(e.getMessage(), e);
        }
    }

    /**
     * build ack command.
     *
     * @return TaskExecuteAckCommand
     */
    private TaskExecuteAckCommand buildAckCommand() {
        TaskExecuteAckCommand ackCommand = new TaskExecuteAckCommand();
        ackCommand.setTaskInstanceId(taskExecutionContext.getTaskInstanceId());
        ackCommand.setStatus(taskExecutionContext.getCurrentExecutionStatus().getCode());
        ackCommand.setStartTime(taskExecutionContext.getStartTime());
        ackCommand.setLogPath(taskExecutionContext.getLogPath());
        ackCommand.setHost(taskExecutionContext.getHost());
        if (TaskType.SQL.getDesc().equalsIgnoreCase(taskExecutionContext.getTaskType()) || TaskType.PROCEDURE.getDesc().equalsIgnoreCase(taskExecutionContext.getTaskType())) {
            ackCommand.setExecutePath(null);
        } else {
            ackCommand.setExecutePath(taskExecutionContext.getExecutePath());
        }
        return ackCommand;
    }

    /**
     * get current TaskExecutionContext
     *
     * @return TaskExecutionContext
     */
    public TaskExecutionContext getTaskExecutionContext() {
        return this.taskExecutionContext;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(DateUtils.getRemainTime(taskExecutionContext.getFirstSubmitTime(),
                taskExecutionContext.getDelayTime() * 60L), TimeUnit.SECONDS);
    }

    @Override
    public int compareTo(Delayed o) {
        if (o == null) {
            return 1;
        }
        return Long.compare(this.getDelay(TimeUnit.MILLISECONDS), o.getDelay(TimeUnit.MILLISECONDS));
    }

    private void preBuildBusinessParams() {
        Map<String, Property> paramsMap = new HashMap<>();
        // replace variable TIME with $[YYYYmmddd...] in shell file when history run job and batch complement job
        if (taskExecutionContext.getScheduleTime() != null) {
            Date date = taskExecutionContext.getScheduleTime();
            String dateTime = DateUtils.format(date, Constants.PARAMETER_FORMAT_TIME);
            Property p = new Property();
            p.setValue(dateTime);
            p.setProp(Constants.PARAMETER_DATETIME);
            paramsMap.put(Constants.PARAMETER_DATETIME, p);
        }
        taskExecutionContext.setParamsMap(paramsMap);
    }

    public AbstractTask getTask() {
        return task;
    }
}

File-download utility classes

SFTPUtil source

package org.apache.dolphinscheduler.common.utils;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.Vector;

import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.JSchException;
import com.jcraft.jsch.Session;
import com.jcraft.jsch.SftpException;

public class SFTPUtil {
private static Session session = null;
private static ChannelSftp channel = null;

/**
 * Connect to the SFTP server
 *
 * @param serverIP server IP
 * @param port     port
 * @param userName user name
 * @param password password
 * @throws SocketException SocketException
 * @throws IOException     IOException
 * @throws JSchException   JSchException
 */
public static void connectServer(String serverIP, int port, String userName, String password) throws SocketException, IOException, JSchException {
    JSch jsch = new JSch();
    // get a Session object for the given user name, host IP, and port
    session = jsch.getSession(userName, serverIP, port);
    // set the password
    session.setPassword(password);
    // set session properties
    Properties config = new Properties();
    config.put("StrictHostKeyChecking", "no");
    session.setConfig(config);
    // establish the connection
    session.connect();
    // open the SFTP channel
    channel = (ChannelSftp) session.openChannel("sftp");
    // connect the SFTP channel
    channel.connect();
}

/**
 * Close resources.
 */
public static void close() {
    if (channel != null) {
        channel.disconnect();
    }
    if (session != null) {
        session.disconnect();
    }
}

public static List<ChannelSftp.LsEntry> getDirList(String path) throws SftpException {
    List<ChannelSftp.LsEntry> list = new ArrayList<>();
    if (channel != null) {
        Vector vv = channel.ls(path);
        if (vv == null || vv.isEmpty()) {
            return list;
        }
        for (Object entry : vv.toArray()) {
            list.add((ChannelSftp.LsEntry) entry);
        }
    }
    return list;
}

/**
 * Download a file.
 *
 * @param remotePathFile remote file
 * @param localPathFile  local file (absolute path)
 * @throws SftpException SftpException
 * @throws IOException   IOException
 */
public static void downloadFile(String remotePathFile, String localPathFile) throws SftpException, IOException {
    if (channel == null) {
        throw new IOException("sftp server not login");
    }
    try (FileOutputStream os = new FileOutputStream(new File(localPathFile))) {
        channel.get(remotePathFile, os);
    }
}

/**
 * Upload a file.
 *
 * @param remoteFile remote file
 * @param localFile  local file
 * @throws SftpException SftpException
 * @throws IOException   IOException
 */
public static void uploadFile(String remoteFile, String localFile) throws SftpException, IOException {
    if (channel == null) {
        throw new IOException("sftp server not login");
    }
    try (FileInputStream in = new FileInputStream(new File(localFile))) {
        channel.put(in, remoteFile);
    }
}
public static void main(String[] args) throws SocketException, IOException, JSchException, SftpException {
    SFTPUtil.connectServer("192.168.38.5",22,"dolphinscheduler","dob7@ZvT");
    SFTPUtil.downloadFile("/tmp/dslocalfiletest/dolphin/resources/685ab8fe07f38e80651378a160c0fdaa.jpeg", "D:\\tmp\\685ab8fe07f38e80651378a160c0fdaa.jpeg");
    SFTPUtil.close();
}

}
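
To show how the worker side ties into this utility, here is a minimal sketch (not the actual patch) of the download-on-miss logic: read the file directly when it already exists locally, otherwise pull it from the storage server over SFTP. The ResourceDownloader class name and the PropertyUtils accessors for the transfer.* keys are assumptions for illustration.

import java.io.File;

public class ResourceDownloader {

    /**
     * Make sure the resource exists locally; if not, download it via SFTP.
     */
    public static void ensureLocalFile(String remotePath, String localPath) throws Exception {
        File localFile = new File(localPath);
        if (localFile.exists()) {
            // the file is already on this worker, use it directly
            return;
        }
        // make sure the parent directory exists before writing
        File parent = localFile.getParentFile();
        if (parent != null && !parent.exists()) {
            parent.mkdirs();
        }
        try {
            // the transfer.* keys come from common.properties; the PropertyUtils
            // accessors here are assumptions for illustration
            SFTPUtil.connectServer(
                    PropertyUtils.getString("transfer.ip"),
                    PropertyUtils.getInt("transfer.port"),
                    PropertyUtils.getString("transfer.username"),
                    PropertyUtils.getString("transfer.password"));
            SFTPUtil.downloadFile(remotePath, localPath);
        } finally {
            SFTPUtil.close();
        }
    }
}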

FtpUtils source

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;
import org.apache.commons.net.ftp.FTPReply;

public class FtpUtils {

private static FTPClient ftp;

/**
 * Get an FTP connection.
 *
 * @param url      server address
 * @param port     server port
 * @param username user name
 * @param password password
 * @param path     working directory on the FTP server
 * @return the connected FTPClient
 * @throws IOException IOException
 */
public static FTPClient initFtpClient(String url, int port, String username, String password, String path) throws IOException {
    ftp = new FTPClient();
    ftp.connect(url, port); // connect to the FTP server
    ftp.login(username, password); // log in
    ftp.setFileType(FTPClient.BINARY_FILE_TYPE);
    ftp.enterLocalPassiveMode();
    int reply = ftp.getReplyCode();
    if (!FTPReply.isPositiveCompletion(reply)) {
        ftp.disconnect();
        System.out.println("failed to connect to the ftp server: " + url + ":" + port);
    }
    ftp.changeWorkingDirectory(path);
    System.out.println("connect successful... ftp server: " + url + ":" + port);
    return ftp;
}

/**
 * Upload a directory (recursively) or a single file.
 *
 * @param file the file or directory to upload
 * @throws IOException IOException
 */
public static void uploadFolder(File file) throws IOException {
    if (file.isDirectory()) {
        ftp.makeDirectory(file.getName()); // create a folder with the same name on the FTP server
        ftp.changeWorkingDirectory(file.getName()); // change into the newly created folder
        File[] files = file.listFiles();
        for (File f : files) {
            if (f.isDirectory()) { // sub-directory
                uploadFolder(f);
                ftp.changeToParentDirectory(); // switch back to the parent directory after uploading
            } else {
                try (InputStream input = new FileInputStream(f)) {
                    ftp.storeFile(f.getName(), input);
                }
            }
        }
    } else {
        try (InputStream input = new FileInputStream(file)) {
            ftp.storeFile(file.getName(), input);
        }
    }
}

/**
 * Upload a single file and report whether the upload succeeded.
 *
 * @param file the file to upload
 * @return true if the upload succeeded
 * @throws IOException IOException
 */
public static boolean uploadFile(File file) throws IOException {
    try (InputStream input = new FileInputStream(file)) {
        return ftp.storeFile(file.getName(), input);
    }
}

/**
 * Download from the FTP server; supports a single named file or a whole directory.
 *
 * @param remotePath FTP directory
 * @param localPath  local directory to save into
 * @param fileName   name of the file to download (empty to download everything)
 * @throws IOException IOException
 */
public static void downFiles(String remotePath, String localPath, String fileName) throws IOException {
    ftp.changeWorkingDirectory(remotePath); // move to the directory on the FTP server
    FTPFile[] fs = ftp.listFiles();
    for (FTPFile ff : fs) {
        if (fileName != null && !fileName.isEmpty() && !ff.getName().equals(fileName)) {
            continue;
        }
        if (ff.isDirectory()) {
            File file = new File(localPath + File.separator + ff.getName());
            if (!file.exists()) {
                file.mkdirs();
            }
            downFiles(remotePath + File.separator + ff.getName(), localPath + File.separator + ff.getName(), "");
            ftp.changeToParentDirectory(); // switch back to the parent directory after downloading
        } else {
            File localFile = new File(localPath + File.separator + ff.getName());
            try (OutputStream os = new FileOutputStream(localFile)) {
                ftp.retrieveFile(ff.getName(), os);
            }
        }
    }
}

/**
 * Disconnect from the FTP server.
 * @throws IOException IOException
 */
public static void shutDownFtp() throws IOException {
    ftp.logout();
    ftp.disconnect();
}

public static void main(String[] args) {
    // connect to the FTP server
    try {
        FtpUtils.initFtpClient("192.168.38.5", 21, "dolphinscheduler", "dob7@ZvT", "");
    } catch (IOException e) {
        System.out.println("ftp connection failed!");
    }
    // download a file
    try {
        FtpUtils.downFiles("/tmp/danji/dolphin/resources", "D:\\", "process_1678851988656.json");
    } catch (IOException e) {
        System.out.println("ftp file download failed!");
    }
}

}

pom.xml

<dependency>
    <groupId>com.jcraft</groupId>
    <artifactId>jsch</artifactId>
    <version>0.1.42</version>
</dependency>
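
FtpUtils additionally needs Apache Commons Net for FTPClient; if the module does not already pull it in transitively, a dependency along these lines is required (the version here is an assumption, match it to your build):

<dependency>
    <groupId>commons-net</groupId>
    <artifactId>commons-net</artifactId>
    <version>3.6</version>
</dependency>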

Other notes

Hadoop's FileSystem

  • Why must the configuration file still be set up for HDFS when no Hadoop cluster is actually used?

Because the code relies on Hadoop's FileSystem class (the Java API for operating on HDFS): its get() method can return either an HDFS file system or the local file system (standalone mode). So although no Hadoop cluster is involved, Hadoop's own methods (from hadoop-common-2.7.3.jar) are still in use.
image.png
image.png
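
A minimal sketch of that behavior: with fs.defaultFS left at file:///, FileSystem.get() hands back a LocalFileSystem, so all reads and writes go to the local disk and no Hadoop cluster is required. The LocalFsDemo class name is just for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalFsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // same setting as in common.properties: no HDFS cluster, use the local disk
        conf.set("fs.defaultFS", "file:///");
        FileSystem fs = FileSystem.get(conf);
        // prints org.apache.hadoop.fs.LocalFileSystem
        System.out.println(fs.getClass().getName());
        // checks a path on the local file system
        System.out.println(fs.exists(new Path("/tmp/dslocalfiletest")));
    }
}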

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.fs;

import java.io.Closeable;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.lang.ref.WeakReference;
import java.lang.ref.ReferenceQueue;
import java.net.URI;
import java.net.URISyntaxException;
import java.security.PrivilegedExceptionAction;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;
import java.util.Set;
import java.util.Stack;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Options.ChecksumOpt;
import org.apache.hadoop.fs.Options.Rename;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclStatus;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.io.MultipleIOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.util.ClassUtil;
import org.apache.hadoop.util.DataChecksum;
import org.apache.hadoop.util.Progressable;
import org.apache.hadoop.util.ReflectionUtils;
import org.apache.hadoop.util.ShutdownHookManager;
import org.apache.hadoop.util.StringUtils;

import com.google.common.annotations.VisibleForTesting;

/**
 * An abstract base class for a fairly generic filesystem.  It
 * may be implemented as a distributed filesystem, or as a "local"
 * one that reflects the locally-connected disk.  The local version
 * exists for small Hadoop instances and for testing.
 *
 * <p>
 * All user code that may potentially use the Hadoop Distributed
 * File System should be written to use a FileSystem object.  The
 * Hadoop DFS is a multi-machine system that appears as a single
 * disk.  It's useful because of its fault tolerance and potentially
 * very large capacity.
 *
 * <p>
 * The local implementation is {@link LocalFileSystem} and distributed
 * implementation is DistributedFileSystem.
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class FileSystem extends Configured implements Closeable {
public static final String FS_DEFAULT_NAME_KEY =
CommonConfigurationKeys.FS_DEFAULT_NAME_KEY;
public static final String DEFAULT_FS =
CommonConfigurationKeys.FS_DEFAULT_NAME_DEFAULT;

public static final Log LOG = LogFactory.getLog(FileSystem.class);

/**
 * Priority of the FileSystem shutdown hook.
 */
public static final int SHUTDOWN_HOOK_PRIORITY = 10;

/** FileSystem cache */
static final Cache CACHE = new Cache();

/** The key this instance is stored under in the cache. */
private Cache.Key key;

/** Recording statistics per a FileSystem class */
private static final Map<Class<? extends FileSystem>, Statistics>
    statisticsTable =
    new IdentityHashMap<Class<? extends FileSystem>, Statistics>();

/**
 * The statistics for this file system.
 */
protected Statistics statistics;

/**
 * A cache of files that should be deleted when the filesystem is closed
 * or the JVM is exited.
 */
private Set<Path> deleteOnExit = new TreeSet<Path>();

boolean resolveSymlinks;

/**
 * This method adds a file system for testing so that we can find it later. It
 * is only for testing.
 * @param uri the uri to store it under
 * @param conf the configuration to store it under
 * @param fs the file system to store
 * @throws IOException
 */
static void addFileSystemForTesting(URI uri, Configuration conf,
    FileSystem fs) throws IOException {
  CACHE.map.put(new Cache.Key(uri, conf), fs);
}

/**
 * Get a filesystem instance based on the uri, the passed
 * configuration and the user
 * @param uri of the filesystem
 * @param conf the configuration to use
 * @param user to perform the get as
 * @return the filesystem instance
 * @throws IOException
 * @throws InterruptedException
 */
public static FileSystem get(final URI uri, final Configuration conf,
    final String user) throws IOException, InterruptedException {
  String ticketCachePath =
      conf.get(CommonConfigurationKeys.KERBEROS_TICKET_CACHE_PATH);
  UserGroupInformation ugi =
      UserGroupInformation.getBestUGI(ticketCachePath, user);
  return ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
    @Override
    public FileSystem run() throws IOException {
      return get(uri, conf);
    }
  });
}

/**
 * Returns the configured filesystem implementation.
 * @param conf the configuration to use
 */
public static FileSystem get(Configuration conf) throws IOException {
  return get(getDefaultUri(conf), conf);
}

/**
 * Get the default filesystem URI from a configuration.
 * @param conf the configuration to use
 * @return the uri of the default filesystem
 */
public static URI getDefaultUri(Configuration conf) {
  return URI.create(fixName(conf.get(FS_DEFAULT_NAME_KEY, DEFAULT_FS)));
}

/**
 * Set the default filesystem URI in a configuration.
 * @param conf the configuration to alter
 * @param uri the new default filesystem uri
 */
public static void setDefaultUri(Configuration conf, URI uri) {
  conf.set(FS_DEFAULT_NAME_KEY, uri.toString());
}

/**
 * Set the default filesystem URI in a configuration.
 * @param conf the configuration to alter
 * @param uri the new default filesystem uri
 */
public static void setDefaultUri(Configuration conf, String uri) {
  setDefaultUri(conf, URI.create(fixName(uri)));
}

/**
 * Called after a new FileSystem instance is constructed.
 * @param name a uri whose authority section names the host, port, etc.
 *   for this FileSystem
 * @param conf the configuration
 */
public void initialize(URI name, Configuration conf) throws IOException {
  statistics = getStatistics(name.getScheme(), getClass());
  resolveSymlinks = conf.getBoolean(
      CommonConfigurationKeys.FS_CLIENT_RESOLVE_REMOTE_SYMLINKS_KEY,
      CommonConfigurationKeys.FS_CLIENT_RESOLVE_REMOTE_SYMLINKS_DEFAULT);
}

/**
 * Return the protocol scheme for the FileSystem.
 * <p>
 * This implementation throws an UnsupportedOperationException.
 *
 * @return the protocol scheme for the FileSystem.
 */
public String getScheme() {
  throw new UnsupportedOperationException("Not implemented by the " + getClass().getSimpleName() + " FileSystem implementation");
}

/** Returns a URI whose scheme and authority identify this FileSystem.*/
public abstract URI getUri();

/**
 * Return a canonicalized form of this FileSystem's URI.
 *
 * The default implementation simply calls {@link #canonicalizeUri(URI)}
 * on the filesystem's own URI, so subclasses typically only need to
 * implement that method.
 *
 * @see #canonicalizeUri(URI)
 */
protected URI getCanonicalUri() {
  return canonicalizeUri(getUri());
}

/**
 * Canonicalize the given URI.
 *
 * This is filesystem-dependent, but may for example consist of
 * canonicalizing the hostname using DNS and adding the default
 * port if not specified.
 *
 * The default implementation simply fills in the default port if
 * not specified and if the filesystem has a default port.
 *
 * @return URI
 * @see NetUtils#getCanonicalUri(URI, int)
 */
protected URI canonicalizeUri(URI uri) {
  if (uri.getPort() == -1 && getDefaultPort() > 0) {
    // reconstruct the uri with the default port set
    try {
      uri = new URI(uri.getScheme(), uri.getUserInfo(),
          uri.getHost(), getDefaultPort(),
          uri.getPath(), uri.getQuery(), uri.getFragment());
    } catch (URISyntaxException e) {
      // Should never happen!
      throw new AssertionError("Valid URI became unparseable: " + uri);
    }
  }
  return uri;
}

/**
 * Get the default port for this file system.
 * @return the default port or 0 if there isn't one
 */
protected int getDefaultPort() {
  return 0;
}

protected static FileSystem getFSofPath(final Path absOrFqPath,
final Configuration conf)
throws UnsupportedFileSystemException, IOException {
absOrFqPath.checkNotSchemeWithRelative();
absOrFqPath.checkNotRelative();

// Uses the default file system if not fully qualified
return get(absOrFqPath.toUri(), conf);

}

/**
 * Get a canonical service name for this file system.  The token cache is
 * the only user of the canonical service name, and uses it to lookup this
 * filesystem's service tokens.
 * If file system provides a token of its own then it must have a canonical
 * name, otherwise canonical name can be null.
 *
 * Default Impl: If the file system has child file systems
 * (such as an embedded file system) then it is assumed that the fs has no
 * tokens of its own and hence returns a null name; otherwise a service
 * name is built using Uri and port.
 *
 * @return a service string that uniquely identifies this file system, null
 *         if the filesystem does not implement tokens
 * @see SecurityUtil#buildDTServiceName(URI, int)
 */
@InterfaceAudience.LimitedPrivate({ "HDFS", "MapReduce" })
public String getCanonicalServiceName() {
  return (getChildFileSystems() == null)
      ? SecurityUtil.buildDTServiceName(getUri(), getDefaultPort())
      : null;
}

/** @deprecated  call #getUri() instead.*/ 
@Deprecated  
public String getName() { return getUri().toString(); }

/** @deprecated  call #get(URI,Configuration) instead. */ 
@Deprecated  
public static FileSystem getNamed(String name, Configuration conf)
throws IOException {
return get(URI.create(fixName(name)), conf);
}

/**
 * Update old-format filesystem names, for back-compatibility.  This should
 * eventually be replaced with a checkName() method that throws an exception
 * for old-format names.
 */
private static String fixName(String name) {
  // convert old-format name to new-format name
  if (name.equals("local")) {         // "local" is now "file:///".
    LOG.warn("\"local\" is a deprecated filesystem name."
        + " Use \"file:///\" instead.");
    name = "file:///";
  } else if (name.indexOf('/') == -1) {   // unqualified is "hdfs://"
    LOG.warn("\"" + name + "\" is a deprecated filesystem name."
        + " Use \"hdfs://" + name + "/\" instead.");
    name = "hdfs://" + name;
  }
  return name;
}

/**
 * Get the local file system.
 * @param conf the configuration to configure the file system with
 * @return a LocalFileSystem
 */
public static LocalFileSystem getLocal(Configuration conf)
    throws IOException {
  return (LocalFileSystem) get(LocalFileSystem.NAME, conf);
}

/**
 * Returns the FileSystem for this URI's scheme and authority.  The scheme
 * of the URI determines a configuration property name,
 * fs.scheme.class whose value names the FileSystem class.
 * The entire URI is passed to the FileSystem instance's initialize method.
 */
public static FileSystem get(URI uri, Configuration conf) throws IOException {
String scheme = uri.getScheme();
String authority = uri.getAuthority();

if (scheme == null && authority == null) {     // use default FS
  return get(conf);
}

if (scheme != null && authority == null) {     // no authority
  URI defaultUri = getDefaultUri(conf);
  if (scheme.equals(defaultUri.getScheme())    // if scheme matches default
      && defaultUri.getAuthority() != null) {  // & default has authority
    return get(defaultUri, conf);              // return default
  }
}

String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
if (conf.getBoolean(disableCacheName, false)) {
  return createFileSystem(uri, conf);
}

return CACHE.get(uri, conf);

}

/**
 * Returns the FileSystem for this URI's scheme and authority and the
 * passed user. Internally invokes {@link #newInstance(URI, Configuration)}
 * @param uri of the filesystem
 * @param conf the configuration to use
 * @param user to perform the get as
 * @return filesystem instance
 * @throws IOException
 * @throws InterruptedException
 */
public static FileSystem newInstance(final URI uri, final Configuration conf,
    final String user) throws IOException, InterruptedException {
  String ticketCachePath =
      conf.get(CommonConfigurationKeys.KERBEROS_TICKET_CACHE_PATH);
  UserGroupInformation ugi =
      UserGroupInformation.getBestUGI(ticketCachePath, user);
  return ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
    @Override
    public FileSystem run() throws IOException {
      return newInstance(uri, conf);
    }
  });
}

/**
 * Returns the FileSystem for this URI's scheme and authority.  The scheme
 * of the URI determines a configuration property name,
 * fs.scheme.class whose value names the FileSystem class.
 * The entire URI is passed to the FileSystem instance's initialize method.
 * This always returns a new FileSystem object.
 */
public static FileSystem newInstance(URI uri, Configuration conf) throws IOException {
String scheme = uri.getScheme();
String authority = uri.getAuthority();

if (scheme == null) {                       // no scheme: use default FS
  return newInstance(conf);
}

if (authority == null) {                       // no authority
  URI defaultUri = getDefaultUri(conf);
  if (scheme.equals(defaultUri.getScheme())    // if scheme matches default
      && defaultUri.getAuthority() != null) {  // & default has authority
    return newInstance(defaultUri, conf);              // return default
  }
}
return CACHE.getUnique(uri, conf);

}

/**
 * Returns a unique configured filesystem implementation.
 * This always returns a new FileSystem object.
 * @param conf the configuration to use
 */
public static FileSystem newInstance(Configuration conf) throws IOException {
  return newInstance(getDefaultUri(conf), conf);
}

/**
 * Get a unique local file system object
 * @param conf the configuration to configure the file system with
 * @return a LocalFileSystem
 * This always returns a new FileSystem object.
 */
public static LocalFileSystem newInstanceLocal(Configuration conf)
    throws IOException {
  return (LocalFileSystem) newInstance(LocalFileSystem.NAME, conf);
}

/**
 * Close all cached filesystems. Be sure those filesystems are not
 * used anymore.
 *
 * @throws IOException
 */
public static void closeAll() throws IOException {
  CACHE.closeAll();
}

/**
 * Close all cached filesystems for a given UGI. Be sure those filesystems
 * are not used anymore.
 * @param ugi user group info to close
 * @throws IOException
 */
public static void closeAllForUGI(UserGroupInformation ugi)
    throws IOException {
  CACHE.closeAll(ugi);
}

/**
 * Make sure that a path specifies a FileSystem.
 * @param path to use
 */
public Path makeQualified(Path path) {
  checkPath(path);
  return path.makeQualified(this.getUri(), this.getWorkingDirectory());
}

/**
 * Get a new delegation token for this file system.
 * This is an internal method that should have been declared protected
 * but wasn't historically.
 * Callers should use {@link #addDelegationTokens(String, Credentials)}
 *
 * @param renewer the account name that is allowed to renew the token.
 * @return a new delegation token
 * @throws IOException
 */
@InterfaceAudience.Private()
public Token<?> getDelegationToken(String renewer) throws IOException {
  return null;
}

/**
 * Obtain all delegation tokens used by this FileSystem that are not
 * already present in the given Credentials.  Existing tokens will neither
 * be verified as valid nor having the given renewer.  Missing tokens will
 * be acquired and added to the given Credentials.
 *
 * Default Impl: works for simple fs with its own token
 * and also for an embedded fs whose tokens are those of its
 * children file system (i.e. the embedded fs has no tokens of its
 * own).
 *
 * @param renewer the user allowed to renew the delegation tokens
 * @param credentials cache in which to add new delegation tokens
 * @return list of new delegation tokens
 * @throws IOException
 */
@InterfaceAudience.LimitedPrivate({ "HDFS", "MapReduce" })
public Token<?>[] addDelegationTokens(
    final String renewer, Credentials credentials) throws IOException {
  if (credentials == null) {
    credentials = new Credentials();
  }
  final List<Token<?>> tokens = new ArrayList<Token<?>>();
  collectDelegationTokens(renewer, credentials, tokens);
  return tokens.toArray(new Token<?>[tokens.size()]);
}

/**
 * Recursively obtain the tokens for this FileSystem and all descended
 * FileSystems as determined by getChildFileSystems().
 * @param renewer the user allowed to renew the delegation tokens
 * @param credentials cache in which to add the new delegation tokens
 * @param tokens list in which to add acquired tokens
 * @throws IOException
 */
private void collectDelegationTokens(final String renewer,
final Credentials credentials,
final List<Token<?>> tokens)
throws IOException {
final String serviceName = getCanonicalServiceName();
// Collect token of the this filesystem and then of its embedded children
if (serviceName != null) { // fs has token, grab it
final Text service = new Text(serviceName);
Token<?> token = credentials.getToken(service);
if (token == null) {
token = getDelegationToken(renewer);
if (token != null) {
tokens.add(token);
credentials.addToken(service, token);
}
}
}
// Now collect the tokens from the children
final FileSystem[] children = getChildFileSystems();
if (children != null) {
for (final FileSystem fs : children) {
fs.collectDelegationTokens(renewer, credentials, tokens);
}
}
}

/**
 * Get all the immediate child FileSystems embedded in this FileSystem.
 * It does not recurse and get grand children.  If a FileSystem
 * has multiple child FileSystems, then it should return a unique list
 * of those FileSystems.  Default is to return null to signify no children.
 *
 * @return FileSystems used by this FileSystem
 */
@InterfaceAudience.LimitedPrivate({ "HDFS" })
@VisibleForTesting
public FileSystem[] getChildFileSystems() {
  return null;
}

/**
 * create a file with the provided permission
 * The permission of the file is set to be the provided permission as in
 * setPermission, not permission&~umask
 *
 * It is implemented using two RPCs. It is understood that it is inefficient,
 * but the implementation is thread-safe. The other option is to change the
 * value of umask in configuration to be 0, but it is not thread-safe.
 *
 * @param fs file system handle
 * @param file the name of the file to be created
 * @param permission the permission of the file
 * @return an output stream
 * @throws IOException
 */
public static FSDataOutputStream create(FileSystem fs,
    Path file, FsPermission permission) throws IOException {
  // create the file with default permission
  FSDataOutputStream out = fs.create(file);
  // set its permission to the supplied one
  fs.setPermission(file, permission);
  return out;
}

/**
 * create a directory with the provided permission
 * The permission of the directory is set to be the provided permission as in
 * setPermission, not permission&~umask
 *
 * @see #create(FileSystem, Path, FsPermission)
 *
 * @param fs file system handle
 * @param dir the name of the directory to be created
 * @param permission the permission of the directory
 * @return true if the directory creation succeeds; false otherwise
 * @throws IOException
 */
public static boolean mkdirs(FileSystem fs, Path dir, FsPermission permission)
    throws IOException {
  // create the directory using the default permission
  boolean result = fs.mkdirs(dir);
  // set its permission to be the supplied one
  fs.setPermission(dir, permission);
  return result;
}

///
// FileSystem
///

protected FileSystem() {
super(null);
}

/**
 * Check that a Path belongs to this FileSystem.
 * @param path to check
 */
protected void checkPath(Path path) {
URI uri = path.toUri();
String thatScheme = uri.getScheme();
if (thatScheme == null)                // fs is relative
return;
URI thisUri = getCanonicalUri();
String thisScheme = thisUri.getScheme();
//authority and scheme are not case sensitive
if (thisScheme.equalsIgnoreCase(thatScheme)) {// schemes match
String thisAuthority = thisUri.getAuthority();
String thatAuthority = uri.getAuthority();
if (thatAuthority == null &&                // path's authority is null
thisAuthority != null) {                // fs has an authority
URI defaultUri = getDefaultUri(getConf());
if (thisScheme.equalsIgnoreCase(defaultUri.getScheme())) {
uri = defaultUri; // schemes match, so use this uri instead
} else {
uri = null; // can't determine auth of the path
}
}
if (uri != null) {
// canonicalize uri before comparing with this fs
uri = canonicalizeUri(uri);
thatAuthority = uri.getAuthority();
if (thisAuthority == thatAuthority ||       // authorities match
(thisAuthority != null &&
thisAuthority.equalsIgnoreCase(thatAuthority)))
return;
}
}
throw new IllegalArgumentException("Wrong FS: "+path+
", expected: "+this.getUri());
}

/**
 * Return an array containing hostnames, offset and size of
 * portions of the given file.  For a nonexistent
 * file or regions, null will be returned.
 *
 * This call is most helpful with DFS, where it returns
 * hostnames of machines that contain the given file.
 *
 * The FileSystem will simply return an elt containing 'localhost'.
 *
 * @param file FilesStatus to get data from
 * @param start offset into the given file
 * @param len length for which to get locations for
 */
public BlockLocation[] getFileBlockLocations(FileStatus file,
long start, long len) throws IOException {
if (file == null) {
return null;
}

if (start < 0 || len < 0) {
  throw new IllegalArgumentException("Invalid start or len parameter");
}

if (file.getLen() <= start) {
  return new BlockLocation[0];

}
String[] name = { "localhost:50010" };
String[] host = { "localhost" };
return new BlockLocation[] {
  new BlockLocation(name, host, 0, file.getLen()) };

}

/**
 * Return an array containing hostnames, offset and size of
 * portions of the given file.  For a nonexistent
 * file or regions, null will be returned.
 *
 * This call is most helpful with DFS, where it returns
 * hostnames of machines that contain the given file.
 *
 * The FileSystem will simply return an elt containing 'localhost'.
 *
 * @param p path is used to identify an FS since an FS could have
 *          another FS that it could be delegating the call to
 * @param start offset into the given file
 * @param len length for which to get locations for
 */
public BlockLocation[] getFileBlockLocations(Path p,
long start, long len) throws IOException {
if (p == null) {
throw new NullPointerException();
}
FileStatus file = getFileStatus(p);
return getFileBlockLocations(file, start, len);
}

/**
 * Return a set of server default configuration values
 * @return server default configuration values
 * @throws IOException
 * @deprecated use {@link #getServerDefaults(Path)} instead
 */
@Deprecated
public FsServerDefaults getServerDefaults() throws IOException {
Configuration conf = getConf();
// CRC32 is chosen as default as it is available in all
// releases that support checksum.
// The client trash configuration is ignored.
return new FsServerDefaults(getDefaultBlockSize(),
conf.getInt("io.bytes.per.checksum", 512),
64 * 1024,
getDefaultReplication(),
conf.getInt("io.file.buffer.size", 4096),
false,
CommonConfigurationKeysPublic.FS_TRASH_INTERVAL_DEFAULT,
DataChecksum.Type.CRC32);
}

/**
 * Return a set of server default configuration values
 * @param p path is used to identify an FS since an FS could have
 *          another FS that it could be delegating the call to
 * @return server default configuration values
 * @throws IOException
 */
public FsServerDefaults getServerDefaults(Path p) throws IOException {
return getServerDefaults();
}

/**
 * Return the fully-qualified path of path f resolving the path
 * through any symlinks or mount point
 * @param p path to be resolved
 * @return fully qualified path
 * @throws FileNotFoundException
 */
public Path resolvePath(final Path p) throws IOException {
checkPath(p);
return getFileStatus(p).getPath();
}

/**
 * Opens an FSDataInputStream at the indicated Path.
 * @param f the file name to open
 * @param bufferSize the size of the buffer to be used.
 */
public abstract FSDataInputStream open(Path f, int bufferSize)
    throws IOException;

/**
 * Opens an FSDataInputStream at the indicated Path.
 * @param f the file to open
 */
public FSDataInputStream open(Path f) throws IOException {
  return open(f, getConf().getInt("io.file.buffer.size", 4096));
}

/**
 * Create an FSDataOutputStream at the indicated Path.
 * Files are overwritten by default.
 * @param f the file to create
 */
public FSDataOutputStream create(Path f) throws IOException {
  return create(f, true);
}

/**
 * Create an FSDataOutputStream at the indicated Path.
 * @param f the file to create
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an exception will be thrown.
 */
public FSDataOutputStream create(Path f, boolean overwrite)
    throws IOException {
  return create(f, overwrite,
      getConf().getInt("io.file.buffer.size", 4096),
      getDefaultReplication(f),
      getDefaultBlockSize(f));
}

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * Files are overwritten by default.
 * @param f the file to create
 * @param progress to report progress
 */
public FSDataOutputStream create(Path f, Progressable progress)
    throws IOException {
  return create(f, true,
      getConf().getInt("io.file.buffer.size", 4096),
      getDefaultReplication(f),
      getDefaultBlockSize(f), progress);
}

/**
 * Create an FSDataOutputStream at the indicated Path.
 * Files are overwritten by default.
 * @param f the file to create
 * @param replication the replication factor
 */
public FSDataOutputStream create(Path f, short replication)
    throws IOException {
  return create(f, true,
      getConf().getInt("io.file.buffer.size", 4096),
      replication,
      getDefaultBlockSize(f));
}

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * Files are overwritten by default.
 * @param f the file to create
 * @param replication the replication factor
 * @param progress to report progress
 */
public FSDataOutputStream create(Path f, short replication,
    Progressable progress) throws IOException {
  return create(f, true,
      getConf().getInt(
          CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
          CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT),
      replication,
      getDefaultBlockSize(f), progress);
}

/**
 * Create an FSDataOutputStream at the indicated Path.
 * @param f the file name to create
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 */
public FSDataOutputStream create(Path f,
    boolean overwrite,
    int bufferSize
    ) throws IOException {
  return create(f, overwrite, bufferSize,
      getDefaultReplication(f),
      getDefaultBlockSize(f));
}

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * @param f the path of the file to open
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 */
public FSDataOutputStream create(Path f,
    boolean overwrite,
    int bufferSize,
    Progressable progress
    ) throws IOException {
  return create(f, overwrite, bufferSize,
      getDefaultReplication(f),
      getDefaultBlockSize(f), progress);
}

/**
 * Create an FSDataOutputStream at the indicated Path.
 * @param f the file name to open
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 */
public FSDataOutputStream create(Path f,
    boolean overwrite,
    int bufferSize,
    short replication,
    long blockSize
    ) throws IOException {
  return create(f, overwrite, bufferSize, replication, blockSize, null);
}

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * @param f the file name to open
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 */
public FSDataOutputStream create(Path f,
    boolean overwrite,
    int bufferSize,
    short replication,
    long blockSize,
    Progressable progress
    ) throws IOException {
  return this.create(f, FsPermission.getFileDefault().applyUMask(
      FsPermission.getUMask(getConf())), overwrite, bufferSize,
      replication, blockSize, progress);
}

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * @param f the file name to open
 * @param permission
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 * @param blockSize
 * @param progress
 * @throws IOException
 * @see #setPermission(Path, FsPermission)
 */
public abstract FSDataOutputStream create(Path f,
    FsPermission permission,
    boolean overwrite,
    int bufferSize,
    short replication,
    long blockSize,
    Progressable progress) throws IOException;

/**
 * Create an FSDataOutputStream at the indicated Path with write-progress
 * reporting.
 * @param f the file name to open
 * @param permission
 * @param flags {@link CreateFlag}s to use for this stream.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 * @param blockSize
 * @param progress
 * @throws IOException
 * @see #setPermission(Path, FsPermission)
 */
public FSDataOutputStream create(Path f,
    FsPermission permission,
    EnumSet<CreateFlag> flags,
    int bufferSize,
    short replication,
    long blockSize,
    Progressable progress) throws IOException {
  return create(f, permission, flags, bufferSize, replication,
      blockSize, progress, null);
}

/**
 * Create an FSDataOutputStream at the indicated Path with a custom
 * checksum option
 * @param f the file name to open
 * @param permission
 * @param flags {@link CreateFlag}s to use for this stream.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 * @param blockSize
 * @param progress
 * @param checksumOpt checksum parameter. If null, the values
 *   found in conf will be used.
 * @throws IOException
 * @see #setPermission(Path, FsPermission)
 */
public FSDataOutputStream create(Path f,
    FsPermission permission,
    EnumSet<CreateFlag> flags,
    int bufferSize,
    short replication,
    long blockSize,
    Progressable progress,
    ChecksumOpt checksumOpt) throws IOException {
  // Checksum options are ignored by default. The file systems that
  // implement checksum need to override this method. The full
  // support is currently only available in DFS.
  return create(f, permission, flags.contains(CreateFlag.OVERWRITE),
      bufferSize, replication, blockSize, progress);
}

/*
 * This create has been added to support the FileContext that processes
 * the permission with umask before calling this method.
 * This a temporary method added to support the transition from FileSystem
 * to FileContext for user applications.
 */
@Deprecated
protected FSDataOutputStream primitiveCreate(Path f,
    FsPermission absolutePermission, EnumSet<CreateFlag> flag, int bufferSize,
    short replication, long blockSize, Progressable progress,
    ChecksumOpt checksumOpt) throws IOException {

  boolean pathExists = exists(f);
  CreateFlag.validate(f, pathExists, flag);

  // Default impl assumes that permissions do not matter and
  // nor does the bytesPerChecksum hence
  // calling the regular create is good enough.
  // FSs that implement permissions should override this.

  if (pathExists && flag.contains(CreateFlag.APPEND)) {
    return append(f, bufferSize, progress);
  }

  return this.create(f, absolutePermission,
      flag.contains(CreateFlag.OVERWRITE), bufferSize, replication,
      blockSize, progress);
}

/**
 * This version of the mkdirs method assumes that the permission is absolute.
 * It has been added to support the FileContext that processes the permission
 * with umask before calling this method.
 * This a temporary method added to support the transition from FileSystem
 * to FileContext for user applications.
 */
@Deprecated
protected boolean primitiveMkdir(Path f, FsPermission absolutePermission)
    throws IOException {
  // Default impl is to assume that permissions do not matter and hence
  // calling the regular mkdirs is good enough.
  // FSs that implement permissions should override this.
  return this.mkdirs(f, absolutePermission);
}

/**
 * This version of the mkdirs method assumes that the permission is absolute.
 * It has been added to support the FileContext that processes the permission
 * with umask before calling this method.
 * This a temporary method added to support the transition from FileSystem
 * to FileContext for user applications.
 */
@Deprecated
protected void primitiveMkdir(Path f, FsPermission absolutePermission,
    boolean createParent)
    throws IOException {

  if (!createParent) { // parent must exist.
    // since the this.mkdirs makes parent dirs automatically
    // we must throw exception if parent does not exist.
    final FileStatus stat = getFileStatus(f.getParent());
    if (stat == null) {
      throw new FileNotFoundException("Missing parent:" + f);
    }
    if (!stat.isDirectory()) {
      throw new ParentNotDirectoryException("parent is not a dir");
    }
    // parent does exist - go ahead with mkdir of leaf
  }
  // Default impl is to assume that permissions do not matter and hence
  // calling the regular mkdirs is good enough.
  // FSs that implement permissions should override this.
  if (!this.mkdirs(f, absolutePermission)) {
    throw new IOException("mkdir of " + f + " failed");
  }
}

/**
 * Opens an FSDataOutputStream at the indicated Path with write-progress
 * reporting. Same as create(), except fails if parent directory doesn't
 * already exist.
 * @param f the file name to open
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 * @param blockSize
 * @param progress
 * @throws IOException
 * @see #setPermission(Path, FsPermission)
 * @deprecated API only for 0.20-append
 */
@Deprecated
public FSDataOutputStream createNonRecursive(Path f,
    boolean overwrite,
    int bufferSize, short replication, long blockSize,
    Progressable progress) throws IOException {
  return this.createNonRecursive(f, FsPermission.getFileDefault(),
      overwrite, bufferSize, replication, blockSize, progress);
}

/**
 * Opens an FSDataOutputStream at the indicated Path with write-progress
 * reporting. Same as create(), except fails if parent directory doesn't
 * already exist.
 * @param f the file name to open
 * @param permission
 * @param overwrite if a file with this name already exists, then if true,
 *   the file will be overwritten, and if false an error will be thrown.
 * @param bufferSize the size of the buffer to be used.
 * @param replication required block replication for the file.
 * @param blockSize
 * @param progress
 * @throws IOException
 * @see #setPermission(Path, FsPermission)
 * @deprecated API only for 0.20-append
 */
@Deprecated
public FSDataOutputStream createNonRecursive(Path f, FsPermission permission,
    boolean overwrite, int bufferSize, short replication, long blockSize,
    Progressable progress) throws IOException {
  return createNonRecursive(f, permission,
      overwrite ? EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE)
          : EnumSet.of(CreateFlag.CREATE), bufferSize,
      replication, blockSize, progress);
}

/**
* Opens an FSDataOutputStream at the indicated Path with write-progress
* reporting. Same as create(), except fails if parent directory doesn't
* already exist.
* @param  f the file name to open 
* @param  permission 
* @param  flags {@link  CreateFlag}s to use for this stream. 
* @param  bufferSize the size of the buffer to be used. 
* @param  replication required block replication for the file. 
* @param  blockSize 
* @param  progress 
* @throws  IOException 
* @see  #setPermission(Path, FsPermission) 
* @deprecated  API only for 0.20-append 
*/
@Deprecated  
public FSDataOutputStream createNonRecursive(Path f, FsPermission permission,
    EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize,
Progressable progress) throws IOException {
throw new IOException("createNonRecursive unsupported for this filesystem "
+ this.getClass());
}

/**
 * Creates the given Path as a brand-new zero-length file.  If
 * create fails, or if it already existed, return false.
 *
 * @param f path to use for create
 */
public boolean createNewFile(Path f) throws IOException {
if (exists(f)) {
return false;
} else {
create(f, false, getConf().getInt("io.file.buffer.size", 4096)).close();
return true;
}
}

/**
 * Append to an existing file (optional operation).
 * Same as append(f, getConf().getInt("io.file.buffer.size", 4096), null)
 * @param f the existing file to be appended.
 * @throws IOException
 */
public FSDataOutputStream append(Path f) throws IOException {
  return append(f, getConf().getInt("io.file.buffer.size", 4096), null);
}

/**
 * Append to an existing file (optional operation).
 * Same as append(f, bufferSize, null).
 * @param f the existing file to be appended.
 * @param bufferSize the size of the buffer to be used.
 * @throws IOException
 */
public FSDataOutputStream append(Path f, int bufferSize) throws IOException {
  return append(f, bufferSize, null);
}

/**
 * Append to an existing file (optional operation).
 * @param f the existing file to be appended.
 * @param bufferSize the size of the buffer to be used.
 * @param progress for reporting progress if it is not null.
 * @throws IOException
 */
public abstract FSDataOutputStream append(Path f, int bufferSize,
    Progressable progress) throws IOException;

/**
 * Concat existing files together.
 * @param trg the path to the target destination.
 * @param psrcs the paths to the sources to use for the concatenation.
 * @throws IOException
 */
public void concat(final Path trg, final Path [] psrcs) throws IOException {
throw new UnsupportedOperationException("Not implemented by the " +
getClass().getSimpleName() + " FileSystem implementation");
}

/**
 * Get replication.
 *
 * @deprecated Use getFileStatus() instead
 * @param src file name
 * @return file replication
 * @throws IOException
 */
@Deprecated
public short getReplication(Path src) throws IOException {
  return getFileStatus(src).getReplication();
}

/**
 * Set replication for an existing file.
 *
 * @param src file name
 * @param replication new replication
 * @throws IOException
 * @return true if successful;
 *         false if file does not exist or is a directory
 */
public boolean setReplication(Path src, short replication)
    throws IOException {
  return true;
}

/**
 * Renames Path src to Path dst.  Can take place on local fs
 * or remote DFS.
 * @param src path to be renamed
 * @param dst new path after rename
 * @throws IOException on failure
 * @return true if rename is successful
 */
public abstract boolean rename(Path src, Path dst) throws IOException;

/**
 * Renames Path src to Path dst
 * <ul>
 *   <li>Fails if src is a file and dst is a directory.</li>
 *   <li>Fails if src is a directory and dst is a file.</li>
 *   <li>Fails if the parent of dst does not exist or is a file.</li>
 * </ul>
 * <p>
 * If OVERWRITE option is not passed as an argument, rename fails
 * if the dst already exists.
 * <p>
 * If OVERWRITE option is passed as an argument, rename overwrites
 * the dst if it is a file or an empty directory. Rename fails if dst is
 * a non-empty directory.
 * <p>
 * Note that atomicity of rename is dependent on the file system
 * implementation. Please refer to the file system documentation for
 * details. This default implementation is non atomic.
 * <p>
 * This method is deprecated since it is a temporary method added to
 * support the transition from FileSystem to FileContext for user
 * applications.
 *
 * @param src path to be renamed
 * @param dst new path after rename
 * @throws IOException on failure
 */
@Deprecated
protected void rename(final Path src, final Path dst,
    final Rename... options) throws IOException {
  // Default implementation
  final FileStatus srcStatus = getFileLinkStatus(src);
  if (srcStatus == null) {
    throw new FileNotFoundException("rename source " + src + " not found.");
  }
boolean overwrite = false;
if (null != options) {
  for (Rename option : options) {
    if (option == Rename.OVERWRITE) {
      overwrite = true;
    }
  }
}

FileStatus dstStatus;
try {
  dstStatus = getFileLinkStatus(dst);
} catch (IOException e) {
  dstStatus = null;
}
if (dstStatus != null) {
  if (srcStatus.isDirectory() != dstStatus.isDirectory()) {
    throw new IOException("Source " + src + " Destination " + dst
        + " both should be either file or directory");
  }
  if (!overwrite) {
    throw new FileAlreadyExistsException("rename destination " + dst
        + " already exists.");
  }
  // Delete the destination that is a file or an empty directory
  if (dstStatus.isDirectory()) {
    FileStatus[] list = listStatus(dst);
    if (list != null && list.length != 0) {
      throw new IOException(
          "rename cannot overwrite non empty destination directory " + dst);
    }
  }
  delete(dst, false);
} else {
  final Path parent = dst.getParent();
  final FileStatus parentStatus = getFileStatus(parent);
  if (parentStatus == null) {
    throw new FileNotFoundException("rename destination parent " + parent
        + " not found.");
  }
  if (!parentStatus.isDirectory()) {
    throw new ParentNotDirectoryException("rename destination parent " + parent
        + " is a file.");
  }
}
if (!rename(src, dst)) {
  throw new IOException("rename from " + src + " to " + dst + " failed.");
}

}

/**
 * Truncate the file in the indicated path to the indicated size.
 * <ul>
 *   <li>Fails if path is a directory.</li>
 *   <li>Fails if path does not exist.</li>
 *   <li>Fails if path is not closed.</li>
 *   <li>Fails if new size is greater than current size.</li>
 * </ul>
 *
 * @param f The path to the file to be truncated
 * @param newLength The size the file is to be truncated to
 *
 * @return true if the file has been truncated to the desired
 *         newLength and is immediately available to be reused for
 *         write operations such as append, or
 *         false if a background process of adjusting the length of
 *         the last block has been started, and clients should wait for it to
 *         complete before proceeding with further file updates.
 */
public boolean truncate(Path f, long newLength) throws IOException {
  throw new UnsupportedOperationException("Not implemented by the " +
      getClass().getSimpleName() + " FileSystem implementation");
}

/**
 * Delete a file
 * @deprecated Use {@link #delete(Path, boolean)} instead.
 */
@Deprecated
public boolean delete(Path f) throws IOException {
  return delete(f, true);
}

/**
 * Delete a file.
 *
 * @param f the path to delete.
 * @param recursive if path is a directory and set to
 *   true, the directory is deleted else throws an exception. In
 *   case of a file the recursive can be set to either true or false.
 * @return true if delete is successful else false.
 * @throws IOException
 */
public abstract boolean delete(Path f, boolean recursive) throws IOException;

/**
 * Mark a path to be deleted when FileSystem is closed.
 * When the JVM shuts down,
 * all FileSystem objects will be closed automatically.
 * Then,
 * the marked path will be deleted as a result of closing the FileSystem.
 *
 * The path has to exist in the file system.
 *
 * @param f the path to delete.
 * @return true if deleteOnExit is successful, otherwise false.
 * @throws IOException
 */
public boolean deleteOnExit(Path f) throws IOException {
  if (!exists(f)) {
    return false;
  }
  synchronized (deleteOnExit) {
    deleteOnExit.add(f);
  }
  return true;
}

/**
 * Cancel the deletion of the path when the FileSystem is closed
 * @param f the path to cancel deletion
 */
public boolean cancelDeleteOnExit(Path f) {
  synchronized (deleteOnExit) {
    return deleteOnExit.remove(f);
  }
}

/**
 * Delete all files that were marked as delete-on-exit. This recursively
 * deletes all files in the specified paths.
 */
protected void processDeleteOnExit() {
  synchronized (deleteOnExit) {
    for (Iterator<Path> iter = deleteOnExit.iterator(); iter.hasNext();) {
      Path path = iter.next();
      try {
        if (exists(path)) {
          delete(path, true);
        }
      } catch (IOException e) {
        LOG.info("Ignoring failure to deleteOnExit for path " + path);
      }
      iter.remove();
    }
  }
}

/**
 * Check if exists.
 * @param f source file
 */
public boolean exists(Path f) throws IOException {
  try {
    return getFileStatus(f) != null;
  } catch (FileNotFoundException e) {
    return false;
  }
}

/**
 * True iff the named path is a directory.
 * Note: Avoid using this method. Instead reuse the FileStatus
 * returned by getFileStatus() or listStatus() methods.
 * @param f path to check
 */
public boolean isDirectory(Path f) throws IOException {
  try {
    return getFileStatus(f).isDirectory();
  } catch (FileNotFoundException e) {
    return false;               // f does not exist
  }
}

/**
 * True iff the named path is a regular file.
 * Note: Avoid using this method. Instead reuse the FileStatus
 * returned by getFileStatus() or listStatus() methods.
 * @param f path to check
 */
public boolean isFile(Path f) throws IOException {
  try {
    return getFileStatus(f).isFile();
  } catch (FileNotFoundException e) {
    return false;               // f does not exist
  }
}

/** The number of bytes in a file. */
/** @deprecated Use getFileStatus() instead */
@Deprecated
public long getLength(Path f) throws IOException {
  return getFileStatus(f).getLen();
}

/**
 * Return the {@link ContentSummary} of a given {@link Path}.
 * @param f path to use
 */
public ContentSummary getContentSummary(Path f) throws IOException {
  FileStatus status = getFileStatus(f);
  if (status.isFile()) {
    // f is a file
    long length = status.getLen();
    return new ContentSummary.Builder().length(length).
        fileCount(1).directoryCount(0).spaceConsumed(length).build();
  }
  // f is a directory
  long[] summary = {0, 0, 1};
  for (FileStatus s : listStatus(f)) {
    long length = s.getLen();
    ContentSummary c = s.isDirectory() ? getContentSummary(s.getPath()) :
        new ContentSummary.Builder().length(length).
        fileCount(1).directoryCount(0).spaceConsumed(length).build();
    summary[0] += c.getLength();
    summary[1] += c.getFileCount();
    summary[2] += c.getDirectoryCount();
  }
  return new ContentSummary.Builder().length(summary[0]).
      fileCount(summary[1]).directoryCount(summary[2]).
      spaceConsumed(summary[0]).build();
}

final private static PathFilter DEFAULT_FILTER = new PathFilter() {
@Override  
public boolean accept(Path file) {
return true;
}
};

/**
 * List the statuses of the files/directories in the given path if the path is
 * a directory.
 *
 * @param f given path
 * @return the statuses of the files/directories in the given patch
 * @throws FileNotFoundException when the path does not exist;
 *         IOException see specific implementation
 */
public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException,
    IOException;

/*
 * Filter files/directories in the given path using the user-supplied path
 * filter. Results are added to the given array results.
 */
private void listStatus(ArrayList<FileStatus> results, Path f,
    PathFilter filter) throws FileNotFoundException, IOException {
  FileStatus[] listing = listStatus(f);
  if (listing == null) {
    throw new IOException("Error accessing " + f);
  }

  for (int i = 0; i < listing.length; i++) {
    if (filter.accept(listing[i].getPath())) {
      results.add(listing[i]);
    }
  }
}

/**
 * @return an iterator over the corrupt files under the given path
 *   (may contain duplicates if a file has more than one corrupt block)
 * @throws IOException
 */
public RemoteIterator<Path> listCorruptFileBlocks(Path path)
    throws IOException {
  throw new UnsupportedOperationException(getClass().getCanonicalName() +
      " does not support" +
      " listCorruptFileBlocks");
}

/**
 * Filter files/directories in the given path using the user-supplied path
 * filter.
 *
 * @param f a path name
 * @param filter the user-supplied path filter
 * @return an array of FileStatus objects for the files under the given path
 *         after applying the filter
 * @throws FileNotFoundException when the path does not exist;
 *         IOException see specific implementation
 */
public FileStatus[] listStatus(Path f, PathFilter filter)
    throws FileNotFoundException, IOException {
  ArrayList<FileStatus> results = new ArrayList<FileStatus>();
  listStatus(results, f, filter);
  return results.toArray(new FileStatus[results.size()]);
}

/**
 * Filter files/directories in the given list of paths using default
 * path filter.
 *
 * @param files a list of paths
 * @return a list of statuses for the files under the given paths after
 *         applying the filter default Path filter
 * @throws FileNotFoundException when the path does not exist;
 *         IOException see specific implementation
 */
public FileStatus[] listStatus(Path[] files)
    throws FileNotFoundException, IOException {
  return listStatus(files, DEFAULT_FILTER);
}

/**
 * Filter files/directories in the given list of paths using user-supplied
 * path filter.
 *
 * @param files a list of paths
 * @param filter the user-supplied path filter
 * @return a list of statuses for the files under the given paths after
 *         applying the filter
 * @throws FileNotFoundException when the path does not exist;
 *         IOException see specific implementation
 */
public FileStatus[] listStatus(Path[] files, PathFilter filter)
    throws FileNotFoundException, IOException {
  ArrayList<FileStatus> results = new ArrayList<FileStatus>();
  for (int i = 0; i < files.length; i++) {
    listStatus(results, files[i], filter);
  }
  return results.toArray(new FileStatus[results.size()]);
}

/**
 * Return all the files that match filePattern and are not checksum
 * files. Results are sorted by their names.
 *
 * <p>A filename pattern is composed of regular characters and
 * special pattern-matching characters, which are:
 * <dl>
 *   <dt>?</dt><dd>Matches any single character.</dd>
 *   <dt>*</dt><dd>Matches zero or more characters.</dd>
 *   <dt>[abc]</dt><dd>Matches a single character from the character set
 *       {a,b,c}.</dd>
 *   <dt>[a-b]</dt><dd>Matches a single character from the character range
 *       {a...b}. Note that character a must be lexicographically less than
 *       or equal to character b.</dd>
 *   <dt>[^a]</dt><dd>Matches a single character that is not from the
 *       character set or range {a}. Note that the ^ character must occur
 *       immediately to the right of the opening bracket.</dd>
 *   <dt>\c</dt><dd>Removes (escapes) any special meaning of character c.</dd>
 *   <dt>{ab,cd}</dt><dd>Matches a string from the string set {ab, cd}.</dd>
 *   <dt>{ab,c{de,fh}}</dt><dd>Matches a string from the string set
 *       {ab, cde, cfh}.</dd>
 * </dl>
 *
 * @param pathPattern a regular expression specifying a path pattern
 * @return an array of paths that match the path pattern
 * @throws IOException
 */
public FileStatus[] globStatus(Path pathPattern) throws IOException {
  return new Globber(this, pathPattern, DEFAULT_FILTER).glob();
}
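// Usage sketch (illustrative): globStatus expands the pattern syntax
// documented above. The `fs` instance and the example path are assumptions.
//
//   // matches e.g. /logs/2023-01/a.log and /logs/2023-02/b.log
//   FileStatus[] logs = fs.globStatus(new Path("/logs/2023-*/*.log"));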

/**
 * Return an array of FileStatus objects whose path names match pathPattern
 * and are accepted by the user-supplied path filter. Results are sorted by
 * their path names.
 *
 * Returns null if pathPattern has no glob and the path does not exist.
 * Returns an empty array if pathPattern has a glob and no path matches it.
 *
 * @param pathPattern a regular expression specifying the path pattern
 * @param filter a user-supplied path filter
 * @return an array of FileStatus objects
 * @throws IOException if any I/O error occurs when fetching file status
 */
public FileStatus[] globStatus(Path pathPattern, PathFilter filter)
    throws IOException {
  return new Globber(this, pathPattern, filter).glob();
}

/**
 * List the statuses of the files/directories in the given path if the path
 * is a directory. If the path is a file, return the file's status and block
 * locations.
 *
 * @param f is the path
 * @return an iterator that traverses statuses of the files/directories
 *         in the given path
 * @throws FileNotFoundException If f does not exist
 * @throws IOException If an I/O error occurred
 */
public RemoteIterator<LocatedFileStatus> listLocatedStatus(final Path f)
    throws FileNotFoundException, IOException {
  return listLocatedStatus(f, DEFAULT_FILTER);
}

/**
 * List a directory.
 * The returned results include a file's block locations if it is a file.
 * The results are filtered by the given path filter.
 *
 * @param f a path
 * @param filter a path filter
 * @return an iterator that traverses statuses of the files/directories
 *         in the given path
 * @throws FileNotFoundException if f does not exist
 * @throws IOException if any I/O error occurred
 */
protected RemoteIterator<LocatedFileStatus> listLocatedStatus(final Path f,
    final PathFilter filter)
    throws FileNotFoundException, IOException {
  return new RemoteIterator<LocatedFileStatus>() {
    private final FileStatus[] stats = listStatus(f, filter);
    private int i = 0;

    @Override
    public boolean hasNext() {
      return i < stats.length;
    }

    @Override
    public LocatedFileStatus next() throws IOException {
      if (!hasNext()) {
        throw new NoSuchElementException("No more entry in " + f);
      }
      FileStatus result = stats[i++];
      // for files, use getBlockLocations(FileStatus, int, int) to avoid
      // calling getFileStatus(Path) to load the FileStatus again
      BlockLocation[] locs = result.isFile() ?
          getFileBlockLocations(result, 0, result.getLen()) :
          null;
      return new LocatedFileStatus(result, locs);
    }
  };
}

/**
 * Returns a remote iterator so that follow-up calls are made on demand
 * while consuming the entries. Each file system implementation should
 * override this method and provide a more efficient implementation, if
 * possible.
 *
 * @param p target path
 * @return remote iterator
 */
public RemoteIterator<FileStatus> listStatusIterator(final Path p)
    throws FileNotFoundException, IOException {
  return new RemoteIterator<FileStatus>() {
    private final FileStatus[] stats = listStatus(p);
    private int i = 0;

    @Override
    public boolean hasNext() {
      return i < stats.length;
    }

    @Override
    public FileStatus next() throws IOException {
      if (!hasNext()) {
        throw new NoSuchElementException("No more entry in " + p);
      }
      return stats[i++];
    }
  };
}
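// Usage sketch (illustrative): a RemoteIterator is consumed with
// hasNext()/next(), both of which may throw IOException since each step can
// touch the (possibly remote) file system. `fs` and the path are assumptions.
//
//   RemoteIterator<FileStatus> it = fs.listStatusIterator(new Path("/tmp"));
//   while (it.hasNext()) {
//     System.out.println(it.next().getPath());
//   }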

/**
 * List the statuses and block locations of the files in the given path.
 *
 * If the path is a directory:
 *   if recursive is false, returns files in the directory;
 *   if recursive is true, returns files in the subtree rooted at the path.
 * If the path is a file, returns the file's status and block locations.
 *
 * @param f is the path
 * @param recursive if the subdirectories need to be traversed recursively
 * @return an iterator that traverses statuses of the files
 * @throws FileNotFoundException when the path does not exist;
 *         IOException see specific implementation
 */
public RemoteIterator<LocatedFileStatus> listFiles(
    final Path f, final boolean recursive)
    throws FileNotFoundException, IOException {
  return new RemoteIterator<LocatedFileStatus>() {
    private Stack<RemoteIterator<LocatedFileStatus>> itors =
        new Stack<RemoteIterator<LocatedFileStatus>>();
    private RemoteIterator<LocatedFileStatus> curItor =
        listLocatedStatus(f);
    private LocatedFileStatus curFile;

  @Override
  public boolean hasNext() throws IOException {
    while (curFile == null) {
      if (curItor.hasNext()) {
        handleFileStat(curItor.next());
      } else if (!itors.empty()) {
        curItor = itors.pop();
      } else {
        return false;
      }
    }
    return true;
  }

  /**
   * Process the input stat.
   * If it is a file, return the file stat.
   * If it is a directory, traverse the directory if recursive is true;
   * ignore it if recursive is false.
   * @param stat input status
   * @throws IOException if any IO error occurs
   */
  private void handleFileStat(LocatedFileStatus stat) throws IOException {
    if (stat.isFile()) { // file
      curFile = stat;
    } else if (recursive) { // directory
      itors.push(curItor);
      curItor = listLocatedStatus(stat.getPath());
    }
  }

  @Override
  public LocatedFileStatus next() throws IOException {
    if (hasNext()) {
      LocatedFileStatus result = curFile;
      curFile = null;
      return result;
    } 
    throw new java.util.NoSuchElementException("No more entry in " + f);
  }
};

}
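// Usage sketch (illustrative): with recursive=true the iterator above walks
// the subtree depth-first using its internal stack of directory iterators;
// with recursive=false only the files directly under the path are returned.
// `fs` and the path are assumptions.
//
//   RemoteIterator<LocatedFileStatus> files =
//       fs.listFiles(new Path("/data"), true);
//   while (files.hasNext()) {
//     LocatedFileStatus f = files.next();
//     System.out.println(f.getPath() + " " + f.getLen());
//   }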

/** Return the current user's home directory in this filesystem.

The default implementation returns "/user/$USER/".
*/
public Path getHomeDirectory() {
return this.makeQualified(
new Path("/user/"+System.getProperty("user.name")));
}

/**

Set the current working directory for the given file system. All relative
paths will be resolved relative to it.

@param  new_dir 
*/
public abstract void setWorkingDirectory(Path new_dir);

/**

Get the current working directory for the given file system
@return  the directory pathname 
*/
public abstract Path getWorkingDirectory();

/**

Note: with the new FilesContext class, getWorkingDirectory()
will be removed.
The working directory is implemented in FilesContext.

Some file systems like LocalFileSystem have an initial workingDir
that we use as the starting workingDir. For other file systems
like HDFS there is no built in notion of an initial workingDir.

@return  if there is built in notion of workingDir then it 
is returned; else a null is returned.
*/
protected Path getInitialWorkingDirectory() {
return null;
}

/**

Call {@link  #mkdirs(Path, FsPermission)} with default permission. 
*/
public boolean mkdirs(Path f) throws IOException {
return mkdirs(f, FsPermission.getDirDefault());
}

/**

Make the given file and all non-existent parents into
directories. Has the semantics of Unix 'mkdir -p'.
Existence of the directory hierarchy is not an error.
@param  f path to create 
@param  permission to apply to f 
*/
public abstract boolean mkdirs(Path f, FsPermission permission
) throws IOException;

/**

The src file is on the local disk.  Add it to FS at
the given dst name and the source is kept intact afterwards
@param  src path 
@param  dst path 
*/
public void copyFromLocalFile(Path src, Path dst)
throws IOException {
copyFromLocalFile(false, src, dst);
}

/**

The src files is on the local disk.  Add it to FS at
the given dst name, removing the source afterwards.
@param  srcs path 
@param  dst path 
*/
public void moveFromLocalFile(Path[] srcs, Path dst)
throws IOException {
copyFromLocalFile(true, true, srcs, dst);
}

/**

The src file is on the local disk.  Add it to FS at
the given dst name, removing the source afterwards.
@param  src path 
@param  dst path 
*/
public void moveFromLocalFile(Path src, Path dst)
throws IOException {
copyFromLocalFile(true, src, dst);
}

/**

The src file is on the local disk.  Add it to FS at
the given dst name.
delSrc indicates if the source should be removed
@param  delSrc whether to delete the src 
@param  src path 
@param  dst path 
*/
public void copyFromLocalFile(boolean delSrc, Path src, Path dst)
throws IOException {
copyFromLocalFile(delSrc, true, src, dst);
}

/**

The src files are on the local disk.  Add it to FS at
the given dst name.
delSrc indicates if the source should be removed
@param  delSrc whether to delete the src 
@param  overwrite whether to overwrite an existing file 
@param  srcs array of paths which are source 
@param  dst path 
*/
public void copyFromLocalFile(boolean delSrc, boolean overwrite,
Path[] srcs, Path dst)
throws IOException {
Configuration conf = getConf();
FileUtil.copy(getLocal(conf), srcs, this, dst, delSrc, overwrite, conf);
}

/**

The src file is on the local disk.  Add it to FS at
the given dst name.
delSrc indicates if the source should be removed
@param  delSrc whether to delete the src 
@param  overwrite whether to overwrite an existing file 
@param  src path 
@param  dst path 
*/
public void copyFromLocalFile(boolean delSrc, boolean overwrite,
Path src, Path dst)
throws IOException {
Configuration conf = getConf();
FileUtil.copy(getLocal(conf), src, this, dst, delSrc, overwrite, conf);
}
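// Usage sketch (illustrative): uploading a local file into the file system,
// overwriting any existing copy and keeping the local source intact. The
// paths are assumptions made up for this example.
//
//   fs.copyFromLocalFile(false /*delSrc*/, true /*overwrite*/,
//       new Path("/tmp/job.sql"),
//       new Path("/dolphinscheduler/resources/job.sql"));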

/**

The src file is under FS, and the dst is on the local disk.
Copy it from FS control to the local dst name.
@param  src path 
@param  dst path 
*/
public void copyToLocalFile(Path src, Path dst) throws IOException {
copyToLocalFile(false, src, dst);
}

/**

The src file is under FS, and the dst is on the local disk.
Copy it from FS control to the local dst name.
Remove the source afterwards
@param  src path 
@param  dst path 
*/
public void moveToLocalFile(Path src, Path dst) throws IOException {
copyToLocalFile(true, src, dst);
}

/**

The src file is under FS, and the dst is on the local disk.
Copy it from FS control to the local dst name.
delSrc indicates if the src will be removed or not.
@param  delSrc whether to delete the src 
@param  src path 
@param  dst path 
*/
public void copyToLocalFile(boolean delSrc, Path src, Path dst)
throws IOException {
copyToLocalFile(delSrc, src, dst, false);
}

/**
 * The src file is under FS, and the dst is on the local disk. Copy it from FS
 * control to the local dst name. delSrc indicates if the src will be removed
 * or not. useRawLocalFileSystem indicates whether to use RawLocalFileSystem
 * as the local file system or not. RawLocalFileSystem is a non-CRC file
 * system, so it will not create any CRC files locally.
 *
 * @param delSrc whether to delete the src
 * @param src path
 * @param dst path
 * @param useRawLocalFileSystem whether to use RawLocalFileSystem as the
 *        local file system or not
 * @throws IOException if any IO error
 */
public void copyToLocalFile(boolean delSrc, Path src, Path dst,
    boolean useRawLocalFileSystem) throws IOException {
  Configuration conf = getConf();
  FileSystem local = null;
  if (useRawLocalFileSystem) {
    local = getLocal(conf).getRawFileSystem();
  } else {
    local = getLocal(conf);
  }
  FileUtil.copy(this, src, local, dst, delSrc, conf);
}
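// Usage sketch (illustrative): this is the "download before execute" side of
// the flow described in this article -- a worker pulls a resource from the
// store to its local disk, here with useRawLocalFileSystem=true so that no
// .crc checksum file is created locally. The paths are assumptions.
//
//   fs.copyToLocalFile(false /*delSrc*/,
//       new Path("/dolphinscheduler/resources/job.sql"),
//       new Path("/tmp/dolphinscheduler/exec/job.sql"),
//       true /*useRawLocalFileSystem*/);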

/**

Returns a local File that the user can write output to.  The caller
provides both the eventual FS target name and the local working
file.  If the FS is local, we write directly into the target.  If
the FS is remote, we write into the tmp local area.
@param  fsOutputFile path of output file 
@param  tmpLocalFile path of local tmp file 
*/
public Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile)
throws IOException {
return tmpLocalFile;
}

/**

Called when we're all done writing to the target.  A local FS will
do nothing, because we've written to exactly the right place.  A remote
FS will copy the contents of tmpLocalFile to the correct target at
fsOutputFile.
@param  fsOutputFile path of output file 
@param  tmpLocalFile path to local tmp file 
*/
public void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile)
throws IOException {
moveFromLocalFile(tmpLocalFile, fsOutputFile);
}

/**

No more filesystem operations are needed.  Will
release any held locks.
*/
@Override  
public void close() throws IOException {
// delete all files that were marked as delete-on-exit.
processDeleteOnExit();
CACHE.remove(this.key, this);
}

/** Return the total size of all files in the filesystem.*/
public long getUsed() throws IOException{
long used = 0;
FileStatus[] files = listStatus(new Path("/"));
for(FileStatus file:files){
used += file.getLen();
}
return used;
}

/**

Get the block size for a particular file.
@param  f the filename 
@return  the number of bytes in a block 
*/
/** @deprecated  Use getFileStatus() instead */ 
@Deprecated  
public long getBlockSize(Path f) throws IOException {
return getFileStatus(f).getBlockSize();
}

/**
 * Return the number of bytes that large input files should optimally
 * be split into to minimize I/O time.
 * @deprecated use {@link #getDefaultBlockSize(Path)} instead
 */
@Deprecated
public long getDefaultBlockSize() {
  // default to 32MB: large enough to minimize the impact of seeks
  return getConf().getLong("fs.local.block.size", 32 * 1024 * 1024);
}

/**
 * Return the number of bytes that large input files should optimally
 * be split into to minimize I/O time.  The given path will be used to
 * locate the actual filesystem.  The full path does not have to exist.
 * @param f path of file
 * @return the default block size for the path's filesystem
 */
public long getDefaultBlockSize(Path f) {
  return getDefaultBlockSize();
}

/**

Get the default replication.
@deprecated  use {@link  #getDefaultReplication(Path)} instead 
*/
@Deprecated  
public short getDefaultReplication() { return 1; }

/**

Get the default replication for a path.   The given path will be used to
locate the actual filesystem.  The full path does not have to exist.
@param  path of the file 
@return  default replication for the path's filesystem 
*/
public short getDefaultReplication(Path path) {
return getDefaultReplication();
}

/**

Return a file status object that represents the path.
@param  f The path we want information from 
@return  a FileStatus object 
@throws  FileNotFoundException when the path does not exist; 
 
    IOException see specific implementation
 

*/
public abstract FileStatus getFileStatus(Path f) throws IOException;

/**

Checks if the user can access a path.  The mode specifies which access
checks to perform.  If the requested permissions are granted, then the
method returns normally.  If access is denied, then the method throws an
{@link  AccessControlException}. 
 
The default implementation of this method calls {@link  #getFileStatus(Path)} 
and checks the returned permissions against the requested permissions.
Note that the getFileStatus call will be subject to authorization checks.
Typically, this requires search (execute) permissions on each directory in
the path's prefix, but this is implementation-defined.  Any file system
that provides a richer authorization model (such as ACLs) may override the
default implementation so that it checks against that model instead.
 
In general, applications should avoid using this method, due to the risk of
time-of-check/time-of-use race conditions.  The permissions on a file may
change immediately after the access call returns.  Most applications should
prefer running specific file system actions as the desired user represented
by a {@link  UserGroupInformation}. 

@param  path Path to check 
@param  mode type of access to check 
@throws  AccessControlException if access is denied 
@throws  FileNotFoundException if the path does not exist 
@throws  IOException see specific implementation 
*/
@InterfaceAudience.LimitedPrivate({"HDFS", "Hive"})
public void access(Path path, FsAction mode) throws AccessControlException,
FileNotFoundException, IOException {
checkAccessPermissions(this.getFileStatus(path), mode);
}

/**

This method provides the default implementation of
{@link  #access(Path, FsAction)}. 

@param  stat FileStatus to check 
@param  mode type of access to check 
@throws  IOException for any error 
*/
@InterfaceAudience.Private
static void checkAccessPermissions(FileStatus stat, FsAction mode)
    throws IOException {
  FsPermission perm = stat.getPermission();
  UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
  String user = ugi.getShortUserName();
  List<String> groups = Arrays.asList(ugi.getGroupNames());
  if (user.equals(stat.getOwner())) {
    if (perm.getUserAction().implies(mode)) {
      return;
    }
  } else if (groups.contains(stat.getGroup())) {
    if (perm.getGroupAction().implies(mode)) {
      return;
    }
  } else {
    if (perm.getOtherAction().implies(mode)) {
      return;
    }
  }
  throw new AccessControlException(String.format(
      "Permission denied: user=%s, path=\"%s\":%s:%s:%s%s", user, stat.getPath(),
      stat.getOwner(), stat.getGroup(), stat.isDirectory() ? "d" : "-", perm));
}
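// Note (added commentary): the default check above mirrors POSIX semantics --
// exactly one of the owner / group / other permission sets is consulted,
// chosen in that order, with no fallthrough between them.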

/**

See {@link  FileContext#fixRelativePart} 
*/
protected Path fixRelativePart(Path p) {
if (p.isUriPathAbsolute()) {
return p;
} else {
return new Path(getWorkingDirectory(), p);
}
}

/**

See {@link  FileContext#createSymlink(Path, Path, boolean)} 
*/
public void createSymlink(final Path target, final Path link,
final boolean createParent) throws AccessControlException,
FileAlreadyExistsException, FileNotFoundException,
ParentNotDirectoryException, UnsupportedFileSystemException,
IOException {
// Supporting filesystems should override this method
throw new UnsupportedOperationException(
"Filesystem does not support symlinks!");
}

/**

See {@link  FileContext#getFileLinkStatus(Path)} 
*/
public FileStatus getFileLinkStatus(final Path f)
throws AccessControlException, FileNotFoundException,
UnsupportedFileSystemException, IOException {
// Supporting filesystems should override this method
return getFileStatus(f);
}

/**

See {@link  AbstractFileSystem#supportsSymlinks()} 
*/
public boolean supportsSymlinks() {
return false;
}

/**

See {@link  FileContext#getLinkTarget(Path)} 
*/
public Path getLinkTarget(Path f) throws IOException {
// Supporting filesystems should override this method
throw new UnsupportedOperationException(
"Filesystem does not support symlinks!");
}

/**

See {@link  AbstractFileSystem#getLinkTarget(Path)} 
*/
protected Path resolveLink(Path f) throws IOException {
// Supporting filesystems should override this method
throw new UnsupportedOperationException(
"Filesystem does not support symlinks!");
}

/**

Get the checksum of a file.

@param  f The file path 
@return  The file checksum.  The default return value is null, 
which indicates that no checksum algorithm is implemented
in the corresponding FileSystem.
*/
public FileChecksum getFileChecksum(Path f) throws IOException {
return getFileChecksum(f, Long.MAX_VALUE);
}

/**

Get the checksum of a file, from the beginning of the file till the
specific length.
@param  f The file path 
@param  length The length of the file range for checksum calculation 
@return  The file checksum. 
*/
public FileChecksum getFileChecksum(Path f, final long length)
throws IOException {
return null;
}

/**

Set the verify checksum flag. This is only applicable if the
corresponding FileSystem supports checksum. By default doesn't do anything.
@param  verifyChecksum 
*/
public void setVerifyChecksum(boolean verifyChecksum) {
//doesn't do anything
}

/**

Set the write checksum flag. This is only applicable if the
corresponding FileSystem supports checksum. By default doesn't do anything.
@param  writeChecksum 
*/
public void setWriteChecksum(boolean writeChecksum) {
//doesn't do anything
}

/**
 * Returns a status object describing the use and capacity of the
 * file system. If the file system has multiple partitions, the
 * use and capacity of the root partition is reflected.
 *
 * @return a FsStatus object
 * @throws IOException see specific implementation
 */
public FsStatus getStatus() throws IOException {
  return getStatus(null);
}

/**
 * Returns a status object describing the use and capacity of the
 * file system. If the file system has multiple partitions, the
 * use and capacity of the partition pointed to by the specified
 * path is reflected.
 *
 * @param p Path for which status should be obtained. null means
 *        the default partition.
 * @return a FsStatus object
 * @throws IOException see specific implementation
 */
public FsStatus getStatus(Path p) throws IOException {
  return new FsStatus(Long.MAX_VALUE, 0, Long.MAX_VALUE);
}

/**

Set permission of a path.
@param  p 
@param  permission 
*/
public void setPermission(Path p, FsPermission permission
) throws IOException {
}

/**

Set owner of a path (i.e. a file or a directory).
The parameters username and groupname cannot both be null.
@param  p The path 
@param  username If it is null, the original username remains unchanged. 
@param  groupname If it is null, the original groupname remains unchanged. 
*/
public void setOwner(Path p, String username, String groupname
) throws IOException {
}

/**
 * Set the access time of a file.
 * @param p The path
 * @param mtime Set the modification time of this file.
 *        The number of milliseconds since Jan 1, 1970.
 *        A value of -1 means that this call should not set modification time.
 * @param atime Set the access time of this file.
 *        The number of milliseconds since Jan 1, 1970.
 *        A value of -1 means that this call should not set access time.
 */
public void setTimes(Path p, long mtime, long atime
    ) throws IOException {
}

/**

Create a snapshot with a default name.
@param  path The directory where snapshots will be taken. 
@return  the snapshot path. 
*/
public final Path createSnapshot(Path path) throws IOException {
return createSnapshot(path, null);
}

/**

Create a snapshot
@param  path The directory where snapshots will be taken. 
@param  snapshotName The name of the snapshot 
@return  the snapshot path. 
*/
public Path createSnapshot(Path path, String snapshotName)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support createSnapshot");
}

/**

Rename a snapshot
@param  path The directory path where the snapshot was taken 
@param  snapshotOldName Old name of the snapshot 
@param  snapshotNewName New name of the snapshot 
@throws  IOException 
*/
public void renameSnapshot(Path path, String snapshotOldName,
String snapshotNewName) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support renameSnapshot");
}

/**

Delete a snapshot of a directory
@param  path  The directory that the to-be-deleted snapshot belongs to 
@param  snapshotName The name of the snapshot 
*/
public void deleteSnapshot(Path path, String snapshotName)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support deleteSnapshot");
}

/**

Modifies ACL entries of files and directories.  This method can add new ACL
entries or modify the permissions on existing ACL entries.  All existing
ACL entries that are not specified in this call are retained without
changes.  (Modifications are merged into the current ACL.)

@param  path Path to modify 
@param  aclSpec List<AclEntry> describing modifications
@throws  IOException if an ACL could not be modified 
*/
public void modifyAclEntries(Path path, List<AclEntry> aclSpec)
    throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support modifyAclEntries");
}

/**

Removes ACL entries from files and directories.  Other ACL entries are
retained.

@param  path Path to modify 
@param  aclSpec List<AclEntry> describing entries to remove
@throws  IOException if an ACL could not be modified 
*/
public void removeAclEntries(Path path, List<AclEntry> aclSpec)
    throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support removeAclEntries");
}

/**

Removes all default ACL entries from files and directories.

@param  path Path to modify 
@throws  IOException if an ACL could not be modified 
*/
public void removeDefaultAcl(Path path)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support removeDefaultAcl");
}

/**

Removes all but the base ACL entries of files and directories.  The entries
for user, group, and others are retained for compatibility with permission
bits.

@param  path Path to modify 
@throws  IOException if an ACL could not be removed 
*/
public void removeAcl(Path path)
throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support removeAcl");
}

/**

Fully replaces ACL of files and directories, discarding all existing
entries.

@param  path Path to modify 
@param  aclSpec List<AclEntry> describing modifications; must include entries
for user, group, and others for compatibility with permission bits.
@throws  IOException if an ACL could not be modified 
*/
public void setAcl(Path path, List<AclEntry> aclSpec) throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support setAcl");
}

/**

Gets the ACL of a file or directory.

@param  path Path to get 
@return  AclStatus describing the ACL of the file or directory 
@throws  IOException if an ACL could not be read 
*/
public AclStatus getAclStatus(Path path) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support getAclStatus");
}

/**

Set an xattr of a file or directory.
The name must be prefixed with the namespace followed by ".". For example,
"user.attr".
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to modify 
@param  name xattr name. 
@param  value xattr value. 
@throws  IOException 
*/
public void setXAttr(Path path, String name, byte[] value)
throws IOException {
setXAttr(path, name, value, EnumSet.of(XAttrSetFlag.CREATE,
XAttrSetFlag.REPLACE));
}

/**

Set an xattr of a file or directory.
The name must be prefixed with the namespace followed by ".". For example,
"user.attr".
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to modify 
@param  name xattr name. 
@param  value xattr value. 
@param  flag xattr set flag 
@throws  IOException 
*/
public void setXAttr(Path path, String name, byte[] value,
    EnumSet<XAttrSetFlag> flag) throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support setXAttr");
}

/**

Get an xattr name and value for a file or directory.
The name must be prefixed with the namespace followed by ".". For example,
"user.attr".
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to get extended attribute 
@param  name xattr name. 
@return  byte[] xattr value. 
@throws  IOException 
*/
public byte[] getXAttr(Path path, String name) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support getXAttr");
}

/**

Get all of the xattr name/value pairs for a file or directory.
Only those xattrs which the logged-in user has permissions to view
are returned.
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to get extended attributes 
@return  Map<String, byte[]> describing the XAttrs of the file or directory 
@throws  IOException 
*/
public Map<String, byte[]> getXAttrs(Path path) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support getXAttrs");
}

/**

Get all of the xattrs name/value pairs for a file or directory.
Only those xattrs which the logged-in user has permissions to view
are returned.
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to get extended attributes 
@param  names XAttr names. 
@return  Map<String, byte[]> describing the XAttrs of the file or directory 
@throws  IOException 
*/
public Map<String, byte[]> getXAttrs(Path path, List<String> names)
    throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support getXAttrs");
}

/**

Get all of the xattr names for a file or directory.
Only those xattr names which the logged-in user has permissions to view
are returned.
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to get extended attributes 
@return  List<String> of the XAttr names of the file or directory
@throws  IOException 
*/
public List<String> listXAttrs(Path path) throws IOException {
  throw new UnsupportedOperationException(getClass().getSimpleName()
      + " doesn't support listXAttrs");
}

/**

Remove an xattr of a file or directory.
The name must be prefixed with the namespace followed by ".". For example,
"user.attr".
 
Refer to the HDFS extended attributes user documentation for details.

@param  path Path to remove extended attribute 
@param  name xattr name 
@throws  IOException 
*/
public void removeXAttr(Path path, String name) throws IOException {
throw new UnsupportedOperationException(getClass().getSimpleName()
    + " doesn't support removeXAttr");
}

// making it volatile to be able to do a double checked locking
private volatile static boolean FILE_SYSTEMS_LOADED = false;

private static final Map<String, Class<? extends FileSystem>>
SERVICE_FILE_SYSTEMS = new HashMap<String, Class<? extends FileSystem>>();

private static void loadFileSystems() {
  synchronized (FileSystem.class) {
    if (!FILE_SYSTEMS_LOADED) {
      ServiceLoader<FileSystem> serviceLoader = ServiceLoader.load(FileSystem.class);
      Iterator<FileSystem> it = serviceLoader.iterator();
      while (it.hasNext()) {
        FileSystem fs = null;
        try {
          fs = it.next();
          try {
            SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass());
          } catch (Exception e) {
            LOG.warn("Cannot load: " + fs + " from " +
                ClassUtil.findContainingJar(fs.getClass()), e);
          }
        } catch (ServiceConfigurationError ee) {
          LOG.warn("Cannot load filesystem", ee);
        }
      }
      FILE_SYSTEMS_LOADED = true;
    }
  }
}

public static Class<? extends FileSystem> getFileSystemClass(String scheme,
Configuration conf) throws IOException {
if (!FILE_SYSTEMS_LOADED) {
loadFileSystems();
}
Class<? extends FileSystem> clazz = null;
if (conf != null) {
clazz = (Class<? extends FileSystem>) conf.getClass("fs." + scheme + ".impl", null);
}
if (clazz == null) {
clazz = SERVICE_FILE_SYSTEMS.get(scheme);
}
if (clazz == null) {
throw new IOException("No FileSystem for scheme: " + scheme);
}
return clazz;
}

private static FileSystem createFileSystem(URI uri, Configuration conf
) throws IOException {
Class<?> clazz = getFileSystemClass(uri.getScheme(), conf);
FileSystem fs = (FileSystem)ReflectionUtils.newInstance(clazz, conf);
fs.initialize(uri, conf);
return fs;
}
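// Note (added commentary): per getFileSystemClass above, the implementation
// class for a scheme is resolved in order: (1) an explicit "fs.<scheme>.impl"
// entry in the Configuration, then (2) implementations discovered via
// ServiceLoader from the classpath; an unknown scheme fails with
// "No FileSystem for scheme".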

/** Caching FileSystem objects */
static class Cache {
private final ClientFinalizer clientFinalizer = new ClientFinalizer();

private final Map<Key, FileSystem> map = new HashMap<Key, FileSystem>();
private final Set<Key> toAutoClose = new HashSet<Key>();

/** A variable that makes all objects in the cache unique */
private static AtomicLong unique = new AtomicLong(1);

FileSystem get(URI uri, Configuration conf) throws IOException{
  Key key = new Key(uri, conf);
  return getInternal(uri, conf, key);
}

/** The objects inserted into the cache using this method are all unique */
FileSystem getUnique(URI uri, Configuration conf) throws IOException{
  Key key = new Key(uri, conf, unique.getAndIncrement());
  return getInternal(uri, conf, key);
}

private FileSystem getInternal(URI uri, Configuration conf, Key key) throws IOException{
  FileSystem fs;
  synchronized (this) {
    fs = map.get(key);
  }
  if (fs != null) {
    return fs;
  }

  fs = createFileSystem(uri, conf);
  synchronized (this) { // refetch the lock again
    FileSystem oldfs = map.get(key);
    if (oldfs != null) { // a file system is created while lock is releasing
      fs.close(); // close the new file system
      return oldfs;  // return the old file system
    }
    
    // now insert the new file system into the map
    if (map.isEmpty()
            && !ShutdownHookManager.get().isShutdownInProgress()) {
      ShutdownHookManager.get().addShutdownHook(clientFinalizer, SHUTDOWN_HOOK_PRIORITY);
    }
    fs.key = key;
    map.put(key, fs);
    if (conf.getBoolean("fs.automatic.close", true)) {
      toAutoClose.add(key);
    }
    return fs;
  }
}

synchronized void remove(Key key, FileSystem fs) {
  if (map.containsKey(key) && fs == map.get(key)) {
    map.remove(key);
    toAutoClose.remove(key);
    }
}

synchronized void closeAll() throws IOException {
  closeAll(false);
}

/**
 * Close all FileSystem instances in the Cache.
 * @param onlyAutomatic only close those that are marked for automatic closing
 */
synchronized void closeAll(boolean onlyAutomatic) throws IOException {
  List<IOException> exceptions = new ArrayList<IOException>();

  // Make a copy of the keys in the map since we'll be modifying
  // the map while iterating over it, which isn't safe.
  List<Key> keys = new ArrayList<Key>();
  keys.addAll(map.keySet());

  for (Key key : keys) {
    final FileSystem fs = map.get(key);

    if (onlyAutomatic && !toAutoClose.contains(key)) {
      continue;
    }

    //remove from cache
    remove(key, fs);

    if (fs != null) {
      try {
        fs.close();
      }
      catch(IOException ioe) {
        exceptions.add(ioe);
      }
    }
  }

  if (!exceptions.isEmpty()) {
    throw MultipleIOException.createIOException(exceptions);
  }
}

private class ClientFinalizer implements Runnable {
  @Override
  public synchronized void run() {
    try {
      closeAll(true);
    } catch (IOException e) {
      LOG.info("FileSystem.Cache.closeAll() threw an exception:\n" + e);
    }
  }
}

synchronized void closeAll(UserGroupInformation ugi) throws IOException {
  List<FileSystem> targetFSList = new ArrayList<FileSystem>();
  //Make a pass over the list and collect the filesystems to close
  //we cannot close inline since close() removes the entry from the Map
  for (Map.Entry<Key, FileSystem> entry : map.entrySet()) {
    final Key key = entry.getKey();
    final FileSystem fs = entry.getValue();
    if (ugi.equals(key.ugi) && fs != null) {
      targetFSList.add(fs);   
    }
  }
  List<IOException> exceptions = new ArrayList<IOException>();
  //now make a pass over the target list and close each
  for (FileSystem fs : targetFSList) {
    try {
      fs.close();
    }
    catch(IOException ioe) {
      exceptions.add(ioe);
    }
  }
  if (!exceptions.isEmpty()) {
    throw MultipleIOException.createIOException(exceptions);
  }
}

/** FileSystem.Cache.Key */
static class Key {
  final String scheme;
  final String authority;
  final UserGroupInformation ugi;
  final long unique;   // an artificial way to make a key unique

  Key(URI uri, Configuration conf) throws IOException {
    this(uri, conf, 0);
  }

  Key(URI uri, Configuration conf, long unique) throws IOException {
    scheme = uri.getScheme()==null ?
        "" : StringUtils.toLowerCase(uri.getScheme());
    authority = uri.getAuthority()==null ?
        "" : StringUtils.toLowerCase(uri.getAuthority());
    this.unique = unique;
    
    this.ugi = UserGroupInformation.getCurrentUser();
  }

  @Override
  public int hashCode() {
    return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
  }

  static boolean isEqual(Object a, Object b) {
    return a == b || (a != null && a.equals(b));        
  }

  @Override
  public boolean equals(Object obj) {
    if (obj == this) {
      return true;
    }
    if (obj != null && obj instanceof Key) {
      Key that = (Key)obj;
      return isEqual(this.scheme, that.scheme)
             && isEqual(this.authority, that.authority)
             && isEqual(this.ugi, that.ugi)
             && (this.unique == that.unique);
    }
    return false;        
  }

  @Override
  public String toString() {
    return "("+ugi.toString() + ")@" + scheme + "://" + authority;        
  }
}

}

/**

 Tracks statistics about how many reads, writes, and so forth have been 
 done in a FileSystem. 

 Since there is only one of these objects per FileSystem, there will 
 typically be many threads writing to this object.  Almost every operation 
 on an open file will involve a write to this object.  In contrast, reading 
 statistics is done infrequently by most programs, and not at all by others. 
 Hence, this is optimized for writes. 

 Each thread writes to its own thread-local area of memory.  This removes 
 contention and allows us to scale up to many, many threads.  To read 
 statistics, the reader thread totals up the contents of all of the 
 thread-local data areas.
*/
public static final class Statistics {
/** 
  Statistics data.
  
  There is only a single writer to thread-local StatisticsData objects.
  Hence, volatile is adequate here-- we do not need AtomicLong or similar
  to prevent lost updates.
  The Java specification guarantees that updates to volatile longs will
  be perceived as atomic with respect to other threads, which is all we
  need.
*/
public static class StatisticsData {
volatile long bytesRead;
volatile long bytesWritten;
volatile int readOps;
volatile int largeReadOps;
volatile int writeOps;
/** 
  Add another StatisticsData object to this one.
*/
void add(StatisticsData other) {
this.bytesRead += other.bytesRead;
this.bytesWritten += other.bytesWritten;
this.readOps += other.readOps;
this.largeReadOps += other.largeReadOps;
this.writeOps += other.writeOps;
}
/** 
  Negate the values of all statistics.
*/
void negate() {
this.bytesRead = -this.bytesRead;
this.bytesWritten = -this.bytesWritten;
this.readOps = -this.readOps;
this.largeReadOps = -this.largeReadOps;
this.writeOps = -this.writeOps;
}
@Override public String toString() {
return bytesRead + " bytes read, " + bytesWritten + " bytes written, "
+ readOps + " read ops, " + largeReadOps + " large read ops, "
+ writeOps + " write ops";
}
public long getBytesRead() {
return bytesRead;
}
public long getBytesWritten() {
return bytesWritten;
}
public int getReadOps() {
return readOps;
}
public int getLargeReadOps() {
return largeReadOps;
}
public int getWriteOps() {
return writeOps;
}
} 

private interface StatisticsAggregator<T> {
  void accept(StatisticsData data);
  T aggregate();
}

private final String scheme;

/**
 * rootData is data that doesn't belong to any thread, but will be added
 * to the totals.  This is useful for making copies of Statistics objects,
 * and for storing data that pertains to threads that have been garbage
 * collected.  Protected by the Statistics lock.
 */
private final StatisticsData rootData;

/**
 * Thread-local data.
 */
private final ThreadLocal<StatisticsData> threadData;

/**
 * Set of all thread-local data areas.  Protected by the Statistics lock.
 * The references to the statistics data are kept using weak references
 * to the associated threads. Proper clean-up is performed by the cleaner
 * thread when the threads are garbage collected.
 */
private final Set<StatisticsDataReference> allData;

/**
 * Global reference queue and a cleaner thread that manage statistics data
 * references from all filesystem instances.
 */
private static final ReferenceQueue<Thread> STATS_DATA_REF_QUEUE;
private static final Thread STATS_DATA_CLEANER;

static {
  STATS_DATA_REF_QUEUE = new ReferenceQueue<Thread>();
  // start a single daemon cleaner thread
  STATS_DATA_CLEANER = new Thread(new StatisticsDataReferenceCleaner());
  STATS_DATA_CLEANER.
      setName(StatisticsDataReferenceCleaner.class.getName());
  STATS_DATA_CLEANER.setDaemon(true);
  STATS_DATA_CLEANER.start();
}

public Statistics(String scheme) {
  this.scheme = scheme;
  this.rootData = new StatisticsData();
  this.threadData = new ThreadLocal<StatisticsData>();
  this.allData = new HashSet<StatisticsDataReference>();
}

/**
 * Copy constructor.
 * 
 * @param other    The input Statistics object which is cloned.
 */
public Statistics(Statistics other) {
  this.scheme = other.scheme;
  this.rootData = new StatisticsData();
  other.visitAll(new StatisticsAggregator<Void>() {
    @Override
    public void accept(StatisticsData data) {
      rootData.add(data);
    }

    public Void aggregate() {
      return null;
    }
  });
  this.threadData = new ThreadLocal<StatisticsData>();
  this.allData = new HashSet<StatisticsDataReference>();
}

/**
 * A weak reference to a thread that also includes the data associated
 * with that thread. On the thread being garbage collected, it is enqueued
 * to the reference queue for clean-up.
 */
private class StatisticsDataReference extends WeakReference<Thread> {
  private final StatisticsData data;

  public StatisticsDataReference(StatisticsData data, Thread thread) {
    super(thread, STATS_DATA_REF_QUEUE);
    this.data = data;
  }

  public StatisticsData getData() {
    return data;
  }

  /**
   * Performs clean-up action when the associated thread is garbage
   * collected.
   */
  public void cleanUp() {
    // use the statistics lock for safety
    synchronized (Statistics.this) {
      /*
       * If the thread that created this thread-local data no longer exists,
       * remove the StatisticsData from our list and fold the values into
       * rootData.
       */
      rootData.add(data);
      allData.remove(this);
    }
  }
}

/**
 * Background action to act on references being removed.
 */
private static class StatisticsDataReferenceCleaner implements Runnable {
  @Override
  public void run() {
    while (true) {
      try {
        StatisticsDataReference ref =
            (StatisticsDataReference)STATS_DATA_REF_QUEUE.remove();
        ref.cleanUp();
      } catch (Throwable th) {
        // the cleaner thread should continue to run even if there are
        // exceptions, including InterruptedException
        LOG.warn("exception in the cleaner thread but it will continue to "
            + "run", th);
      }
    }
  }
}

/**
 * Get or create the thread-local data associated with the current thread.
 */
public StatisticsData getThreadStatistics() {
  StatisticsData data = threadData.get();
  if (data == null) {
    data = new StatisticsData();
    threadData.set(data);
    StatisticsDataReference ref =
        new StatisticsDataReference(data, Thread.currentThread());
    synchronized(this) {
      allData.add(ref);
    }
  }
  return data;
}

/**
 * Increment the bytes read in the statistics
 * @param newBytes the additional bytes read
 */
public void incrementBytesRead(long newBytes) {
  getThreadStatistics().bytesRead += newBytes;
}

/**
 * Increment the bytes written in the statistics
 * @param newBytes the additional bytes written
 */
public void incrementBytesWritten(long newBytes) {
  getThreadStatistics().bytesWritten += newBytes;
}

/**
 * Increment the number of read operations
 * @param count number of read operations
 */
public void incrementReadOps(int count) {
  getThreadStatistics().readOps += count;
}

/**
 * Increment the number of large read operations
 * @param count number of large read operations
 */
public void incrementLargeReadOps(int count) {
  getThreadStatistics().largeReadOps += count;
}

/**
 * Increment the number of write operations
 * @param count number of write operations
 */
public void incrementWriteOps(int count) {
  getThreadStatistics().writeOps += count;
}

/**
 * Apply the given aggregator to all StatisticsData objects associated with
 * this Statistics object.
 *
 * For each StatisticsData object, we will call accept on the visitor.
 * Finally, at the end, we will call aggregate to get the final total. 
 *
 * @param visitor  The visitor to use.
 * @return        The total.
 */
private synchronized <T> T visitAll(StatisticsAggregator<T> visitor) {
  visitor.accept(rootData);
  for (StatisticsDataReference ref: allData) {
    StatisticsData data = ref.getData();
    visitor.accept(data);
  }
  return visitor.aggregate();
}

/**
 * Get the total number of bytes read
 * @return the number of bytes
 */
public long getBytesRead() {
  return visitAll(new StatisticsAggregator<Long>() {
    private long bytesRead = 0;

    @Override
    public void accept(StatisticsData data) {
      bytesRead += data.bytesRead;
    }

    public Long aggregate() {
      return bytesRead;
    }
  });
}

/**
 * Get the total number of bytes written
 * @return the number of bytes
 */
public long getBytesWritten() {
  return visitAll(new StatisticsAggregator<Long>() {
    private long bytesWritten = 0;

    @Override
    public void accept(StatisticsData data) {
      bytesWritten += data.bytesWritten;
    }

    public Long aggregate() {
      return bytesWritten;
    }
  });
}

/**
 * Get the number of file system read operations such as list files
 * @return number of read operations
 */
public int getReadOps() {
  return visitAll(new StatisticsAggregator<Integer>() {
    private int readOps = 0;

    @Override
    public void accept(StatisticsData data) {
      readOps += data.readOps;
      readOps += data.largeReadOps;
    }

    public Integer aggregate() {
      return readOps;
    }
  });
}

/**
 * Get the number of large file system read operations such as list files
 * under a large directory
 * @return number of large read operations
 */
public int getLargeReadOps() {
  return visitAll(new StatisticsAggregator<Integer>() {
    private int largeReadOps = 0;

    @Override
    public void accept(StatisticsData data) {
      largeReadOps += data.largeReadOps;
    }

    public Integer aggregate() {
      return largeReadOps;
    }
  });
}

/**
 * Get the number of file system write operations such as create, append 
 * rename etc.
 * @return number of write operations
 */
public int getWriteOps() {
  return visitAll(new StatisticsAggregator<Integer>() {
    private int writeOps = 0;

    @Override
    public void accept(StatisticsData data) {
      writeOps += data.writeOps;
    }

    public Integer aggregate() {
      return writeOps;
    }
  });
}


@Override
public String toString() {
  return visitAll(new StatisticsAggregator<String>() {
    private StatisticsData total = new StatisticsData();

    @Override
    public void accept(StatisticsData data) {
      total.add(data);
    }

    public String aggregate() {
      return total.toString();
    }
  });
}

/**
 * Resets all statistics to 0.
 *
 * In order to reset, we add up all the thread-local statistics data, and
 * set rootData to the negative of that.
 *
 * This may seem like a counterintuitive way to reset the statistics.  Why
 * can't we just zero out all the thread-local data?  Well, thread-local
 * data can only be modified by the thread that owns it.  If we tried to
 * modify the thread-local data from this thread, our modification might get
 * interleaved with a read-modify-write operation done by the thread that
 * owns the data.  That would result in our update getting lost.
 *
 * The approach used here avoids this problem because it only ever reads
 * (not writes) the thread-local data.  Both reads and writes to rootData
 * are done under the lock, so we're free to modify rootData from any thread
 * that holds the lock.
 */
public void reset() {
  visitAll(new StatisticsAggregator<Void>() {
    private StatisticsData total = new StatisticsData();

    @Override
    public void accept(StatisticsData data) {
      total.add(data);
    }

    public Void aggregate() {
      total.negate();
      rootData.add(total);
      return null;
    }
  });
}

/**
 * Get the uri scheme associated with this statistics object.
 * @return the schema associated with this set of statistics
 */
public String getScheme() {
  return scheme;
}

@VisibleForTesting
synchronized int getAllThreadLocalDataSize() {
  return allData.size();
}

}

/**

Get the Map of Statistics object indexed by URI Scheme.
@return  a Map having a key as URI scheme and value as Statistics object 
@deprecated  use {@link  #getAllStatistics} instead 
*/
@Deprecated  
public static synchronized Map<String, Statistics> getStatistics() {
Map<String, Statistics> result = new HashMap<String, Statistics>();
for(Statistics stat: statisticsTable.values()) {
result.put(stat.getScheme(), stat);
}
return result;
}

/**
 * Return the FileSystem classes that have Statistics.
 */
public static synchronized List<Statistics> getAllStatistics() {
  return new ArrayList<Statistics>(statisticsTable.values());
}

/**

Get the statistics for a particular file system
@param  cls the class to lookup 
@return  a statistics object 
*/
public static synchronized
Statistics getStatistics(String scheme, Class<? extends FileSystem> cls) {
Statistics result = statisticsTable.get(cls);
if (result == null) {
result = new Statistics(scheme);
statisticsTable.put(cls, result);
}
return result;
}

/**

Reset all statistics for all file systems
*/
public static synchronized void clearStatistics() {
for(Statistics stat: statisticsTable.values()) {
stat.reset();
}
}

/**

Print all statistics for all file systems
*/
public static synchronized
void printStatistics() throws IOException {
for (Map.Entry<Class<? extends FileSystem>, Statistics> pair:
statisticsTable.entrySet()) {
System.out.println("  FileSystem " + pair.getKey().getName() +
": " + pair.getValue());
}
}

// Symlinks are temporarily disabled - see HADOOP-10020 and HADOOP-10052
private static boolean symlinksEnabled = false;

private static Configuration conf = null;

@VisibleForTesting  
public static boolean areSymlinksEnabled() {
return symlinksEnabled;
}

@VisibleForTesting  
public static void enableSymlinks() {
symlinksEnabled = true;
}
}

Other storage solutions
