DiskBlockManager

1.1 Creating the DiskBlockManager

The DiskBlockManager is created inside BlockManager:

    val diskBlockManager = {
        // Only perform cleanup if an external service is not serving our shuffle files.
        val deleteFilesOnStop =
            !externalShuffleServiceEnabled || executorId == SparkContext.DRIVER_IDENTIFIER
        new DiskBlockManager(conf, deleteFilesOnStop)
    }
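
The condition behind deleteFilesOnStop can be checked in isolation. The following is a minimal sketch rather than Spark code: shouldDeleteFilesOnStop is a hypothetical helper, and the literal "driver" is assumed to be the value of SparkContext.DRIVER_IDENTIFIER.

```scala
object DeleteFilesOnStopSketch {
  // Assumed value of SparkContext.DRIVER_IDENTIFIER.
  val DriverIdentifier = "driver"

  // Hypothetical helper mirroring the condition BlockManager evaluates
  // before constructing its DiskBlockManager.
  def shouldDeleteFilesOnStop(externalShuffleServiceEnabled: Boolean,
                              executorId: String): Boolean =
    !externalShuffleServiceEnabled || executorId == DriverIdentifier

  def main(args: Array[String]): Unit = {
    // No external shuffle service: the process cleans up its own files.
    println(shouldDeleteFilesOnStop(externalShuffleServiceEnabled = false, executorId = "1"))
    // External shuffle service on an executor: files are left for the service.
    println(shouldDeleteFilesOnStop(externalShuffleServiceEnabled = true, executorId = "1"))
    // The driver always cleans up, whether or not the service is enabled.
    println(shouldDeleteFilesOnStop(externalShuffleServiceEnabled = true, executorId = "driver"))
  }
}
```

Only an executor running alongside an external shuffle service keeps its files on stop, because that service may still need to serve them after the executor exits.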

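1.2 Initializing the DiskBlockManager

1.2.1 Creating the local directories

During initialization, one block directory is created under each configured local directory. A root directory that cannot be used is logged and skipped. These directories are later removed at process shutdown, when the shutdown-hook thread calls DiskBlockManager's doStop method.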

    /**
     * Create local directories for storing block data. These directories are
     * located inside configured local directories and won't
     * be deleted on JVM exit when using the external shuffle service.
     */
    private def createLocalDirs(conf: SparkConf): Array[File] = {
        Utils.getConfiguredLocalDirs(conf).flatMap { rootDir =>
            try {
                val localDir = Utils.createDirectory(rootDir, "blockmgr")
                logInfo(s"Created local directory at $localDir")
                Some(localDir)
            } catch {
                case e: IOException =>
                    logError(s"Failed to create local dir in $rootDir. Ignoring this directory.", e)
                    None
            }
        }
    }
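
1.2.1.1 First-level directory: blockmgr-UUID

The default prefix of createDirectory is "spark", but createLocalDirs passes "blockmgr", so the first-level directories are named blockmgr-UUID.
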
    /**
     * Create a directory inside the given parent directory. The directory is guaranteed to be
     * newly created, and is not marked for automatic deletion.
     */
    def createDirectory(root: String, namePrefix: String = "spark"): File = {
        var attempts = 0
        val maxAttempts = MAX_DIR_CREATION_ATTEMPTS
        var dir: File = null
        while (dir == null) {
            attempts += 1
            if (attempts > maxAttempts) {
                throw new IOException("Failed to create a temp directory (under " + root + ") after " +
                        maxAttempts + " attempts!")
            }
            try {
                // Create the directory <namePrefix>-<UUID>, e.g. blockmgr-<UUID> when called from createLocalDirs
                dir = new File(root, namePrefix + "-" + UUID.randomUUID.toString)
                if (dir.exists() || !dir.mkdirs()) {
                    dir = null
                }
            } catch {
                case e: SecurityException => dir = null;
            }
        }

        dir.getCanonicalFile
    }
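
1.2.1.2 Second-level directories

Second-level directories are not created at initialization. getFile (section 1.3) creates them on demand inside each first-level directory and names them with the two-digit hexadecimal form of subDirId.

1.2.2 Adding a shutdown hook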

The hook runs when the process shuts down and removes the temporary directories by calling DiskBlockManager's doStop method.

    private val shutdownHook = addShutdownHook()

    private def addShutdownHook(): AnyRef = {
        logDebug("Adding shutdown hook") // force eager creation of logger
        ShutdownHookManager.addShutdownHook(ShutdownHookManager.TEMP_DIR_SHUTDOWN_PRIORITY + 1) { () =>
            logInfo("Shutdown hook called")
            // On shutdown, doStop() is invoked
            DiskBlockManager.this.doStop()
        }
    }
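
The same cleanup pattern can be tried outside Spark. This is a minimal sketch under stated assumptions: it uses a plain JVM shutdown hook (sys.addShutdownHook) in place of Spark's ShutdownHookManager, so hook priorities are not modeled, and deleteRecursively is a hypothetical helper standing in for whatever doStop does internally.

```scala
import java.io.File
import java.nio.file.Files

object ShutdownCleanupSketch {
  // Hypothetical helper: recursively delete a file or directory.
  // Returns true if the path no longer exists afterwards.
  def deleteRecursively(f: File): Boolean = {
    if (f.isDirectory) {
      // listFiles() may return null on an I/O error, so guard with Option.
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete() || !f.exists()
  }

  def main(args: Array[String]): Unit = {
    // A throwaway directory that plays the role of a blockmgr-UUID directory.
    val localDir = Files.createTempDirectory("blockmgr-").toFile
    new File(localDir, "0d").mkdir()

    // Plain JVM shutdown hook, standing in for ShutdownHookManager.addShutdownHook.
    sys.addShutdownHook {
      deleteRecursively(localDir)
    }
    println(s"Created $localDir; it is removed when the JVM exits")
  }
}
```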
    

1.3 Looking up a file on disk

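  • 1. Compute a hash of the file name.
  • 2. Take the hash modulo the number of first-level local directories; call the result dirId.
  • 3. Divide the hash by the number of first-level local directories, then take that quotient modulo the number of second-level directories; call the result subDirId.
  • 4. If the directory dirId/subDirId already exists, use it; otherwise create it. The file is then resolved inside that directory.
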
    /** Looks up a file by hashing it into one of our local subdirectories. */
    // This method should be kept in sync with
    // org.apache.spark.network.shuffle.ExternalShuffleBlockResolver#getFile().
    def getFile(filename: String): File = {
        // Figure out which local directory it hashes to, and which subdirectory in that
        val hash = Utils.nonNegativeHash(filename)
        val dirId = hash % localDirs.length
        val subDirId = (hash / localDirs.length) % subDirsPerLocalDir

        // Create the subdirectory if it doesn't already exist
        val subDir = subDirs(dirId).synchronized {
            val old = subDirs(dirId)(subDirId)
            if (old != null) {
                old
            } else {
                val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
                if (!newDir.exists() && !newDir.mkdir()) {
                    throw new IOException(s"Failed to create local dir in $newDir.")
                }
                subDirs(dirId)(subDirId) = newDir
                newDir
            }
        }

        new File(subDir, filename)
    }
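
The directory arithmetic in the steps above can be reproduced on its own. This is a sketch under assumptions: nonNegativeHash below is a stand-in for Utils.nonNegativeHash (assumed to return the absolute value of hashCode, with Int.MinValue mapped to 0), locate is a hypothetical helper, and the values 3 and 64 for the directory counts, as well as the file name, are illustrative.

```scala
object GetFileSketch {
  // Stand-in for Utils.nonNegativeHash (assumed behavior, see lead-in).
  def nonNegativeHash(s: String): Int = {
    val h = s.hashCode
    if (h != Int.MinValue) math.abs(h) else 0
  }

  // Hypothetical helper: map a hash to (dirId, subDirId) with the same
  // arithmetic that getFile uses.
  def locate(hash: Int, numLocalDirs: Int, subDirsPerLocalDir: Int): (Int, Int) = {
    val dirId = hash % numLocalDirs
    val subDirId = (hash / numLocalDirs) % subDirsPerLocalDir
    (dirId, subDirId)
  }

  def main(args: Array[String]): Unit = {
    // Worked example: hash 1000, 3 local directories, 64 subdirectories each.
    // 1000 % 3 = 1; (1000 / 3) % 64 = 333 % 64 = 13; 13 in two-digit hex is "0d".
    val (dirId, subDirId) = locate(1000, numLocalDirs = 3, subDirsPerLocalDir = 64)
    val subDirName = "%02x".format(subDirId)
    println(s"dirId=$dirId subDir=$subDirName")

    // The same mapping starting from a file name.
    val (d, s) = locate(nonNegativeHash("shuffle_0_0_0.data"), 3, 64)
    println(s"dirId=$d subDir=${s.formatted("%02x")}")
  }
}
```

Using the remainder for dirId and the quotient for subDirId keeps the two indices independent, so files spread across both the local directories and the subdirectories inside each of them.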

1.4 Creating temporary block files

DiskBlockManager creates temporary files both for local intermediate data and for the intermediate results that a ShuffleMapTask produces.


    /** Produces a unique block id and File suitable for storing local intermediate results. */
    def createTempLocalBlock(): (TempLocalBlockId, File) = {
        var blockId = new TempLocalBlockId(UUID.randomUUID())
        while (getFile(blockId).exists()) {
            blockId = new TempLocalBlockId(UUID.randomUUID())
        }
        (blockId, getFile(blockId))
    }

    /** Produces a unique block id and File suitable for storing shuffled intermediate results. */
    def createTempShuffleBlock(): (TempShuffleBlockId, File) = {
        var blockId = new TempShuffleBlockId(UUID.randomUUID())
        while (getFile(blockId).exists()) {
            blockId = new TempShuffleBlockId(UUID.randomUUID())
        }
        (blockId, getFile(blockId))
    }
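
Both methods follow the same retry loop: draw a random UUID, and draw again while a file with that name already exists. The sketch below isolates that loop under assumptions: createTempName is a hypothetical helper, a single directory replaces the hashed lookup done by getFile, and the prefixes "temp_local_" and "temp_shuffle_" are assumed to match the names of TempLocalBlockId and TempShuffleBlockId.

```scala
import java.io.File
import java.nio.file.Files
import java.util.UUID

object TempBlockSketch {
  // Hypothetical helper: generate names of the form <prefix><UUID> until one
  // does not yet exist in dir. The file itself is not created here.
  def createTempName(dir: File, prefix: String): (String, File) = {
    var name = prefix + UUID.randomUUID()
    while (new File(dir, name).exists()) {
      name = prefix + UUID.randomUUID()
    }
    (name, new File(dir, name))
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("blockmgr-").toFile
    val (localName, localFile) = createTempName(dir, "temp_local_")
    val (shuffleName, shuffleFile) = createTempName(dir, "temp_shuffle_")
    println(s"$localName -> $localFile")
    println(s"$shuffleName -> $shuffleFile")
  }
}
```

As in the original methods, the returned File is only a path: nothing is written until the caller opens it, so the existence check guards against files already on disk rather than against a concurrent caller.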