Strategic Archive Extraction with Distill

Perhaps you are building an application that depends on archives; for example, you constantly have to download archives and extract files from them. Many libraries can extract files from an archive, and a new player in town capable of doing this job is Distill.

With Distill, you can easily extract an archive into a specified directory. You can also give multiple archives to Distill and let it pick the most optimal one, as per a strategy you define yourself. Let’s dive into the code to see what we can achieve with Distill.

If you want to follow along, you can check out the code in this GitHub repository.

Setup

Before we start using Distill, note that at the time of writing it only supports Unix-based systems. The reason is that Distill relies on command line tools which are currently only available on Unix-based systems.

In the supported formats section, you can clearly see which commands need to be available on the command line.
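
Before installing, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch and not part of Distill: the `commandExists` helper and the tool list (`tar`, `unzip`, `gzip`) are assumptions for illustration, so substitute the commands named in the supported formats section.

```php
<?php

/**
 * Hypothetical helper (not part of Distill): report whether a
 * command-line tool can be found on the PATH of a Unix-based system.
 */
function commandExists(string $command): bool
{
    // `command -v` is the POSIX shell built-in for looking up a command.
    $output = shell_exec('command -v ' . escapeshellarg($command) . ' 2>/dev/null');

    // shell_exec() yields null or false when nothing was printed or the call failed.
    return is_string($output) && trim($output) !== '';
}

// Assumed tool list, for illustration only.
foreach (['tar', 'unzip', 'gzip'] as $tool) {
    printf("%-6s %s\n", $tool, commandExists($tool) ? 'available' : 'missing');
}
```

Any tool reported as missing would need to be installed through your system's package manager before Distill can handle the matching format.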

To add Distill to your project, we assume you already have a project set up with Composer. You can install Distill by running:

composer require raulfraile/distill:~0.6.*

Usage

First and foremost, we need files archived in several different formats. If you have downloaded the above-mentioned repository, you will find three archives in the files directory.

We start off by creating the extractor class. We create a file named Extractor.php in src/SitePoint/Extractor with the following content.

<?php

namespace SitePoint\Extractor;

use Distill\Distill;

/**
 * Class to extract archived files
 */
class Extractor
{    
    /**
     * @var Distill
     */
    private $distiller;

    /**
     * Constructor
     */
    public function __construct()
    {
        $this->distiller = new Distill();
    }
}

Next, we add a method to extract all files from an archive. It needs an actual file and a directory to extract to. The method itself does nothing special for now. You could expand it later on, for example with checks that the file is valid.

The method should look something like this.

/**
 * Extract files into directory
 *
 * @param string $fromFile
 * @param string $toDirectory
 */
public function extract($fromFile, $toDirectory)
{
    $this->distiller->extract($fromFile, $toDirectory);
}

The $fromFile variable can be a path (absolute or relative) or a URL where the file is located. The $toDirectory variable can be any directory to extract to, absolute or relative. Distill will do the rest for you.
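
The validity checks mentioned earlier could be sketched as follows. This is a hypothetical helper (`assertExtractable` is not a Distill function), and it only covers local paths; a URL would need a different kind of check.

```php
<?php

/**
 * Hypothetical pre-check for local archives (not part of Distill).
 * Throws if the archive cannot be read or the target directory
 * cannot be created.
 */
function assertExtractable(string $fromFile, string $toDirectory): void
{
    if (!is_file($fromFile) || !is_readable($fromFile)) {
        throw new InvalidArgumentException("Archive not found or not readable: $fromFile");
    }

    // Create the target directory (and any missing parents) when it is absent.
    if (!is_dir($toDirectory) && !mkdir($toDirectory, 0775, true) && !is_dir($toDirectory)) {
        throw new RuntimeException("Could not create target directory: $toDirectory");
    }
}
```

Calling a check like this at the top of the extract method would turn a missing archive into a clear exception before the library is invoked.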

Extracting an archive is something many libraries can do. What sets Distill apart is that you can pass in an array of files, from which Distill picks the best one. To build a method for this, we first add some constants to the class.

/**
 * Minimum size strategy
 */
const MINIMUM_SIZE = "\Distill\Strategy\MinimumSize";

/**
 * Uncompression speed strategy
 */
const UNCOMPRESSION_SPEED = "\Distill\Strategy\UncompressionSpeed";

/**
 * Random strategy
 */
const RANDOM = "\Distill\Strategy\Random";

When you supply Distill with multiple archived files, it selects the archive that suits you best based on the chosen strategy. With the minimum size strategy, Distill checks which file is the smallest and uses that one. You would use this strategy when you want to save bandwidth, for example.

When speed is important for you, you should use the uncompression speed strategy. Distill will check which file it can extract the quickest and will use that file.

If you don’t care about which file it uses, you can use the random strategy to have a file randomly selected for you.
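
To make the idea behind the strategies concrete, here is a plain-PHP sketch of how a minimum size or a random choice could work for local files. It is illustrative only and is not Distill's implementation; `pickSmallest` and `pickRandom` are hypothetical names.

```php
<?php

/**
 * Illustrative only: return the path of the smallest local file,
 * mimicking the idea behind a minimum size strategy.
 */
function pickSmallest(array $files): string
{
    if ($files === []) {
        throw new InvalidArgumentException('No files given.');
    }

    // Order the candidates by size on disk, smallest first.
    usort($files, function (string $a, string $b): int {
        return filesize($a) <=> filesize($b);
    });

    return $files[0];
}

/**
 * Illustrative only: return one of the given paths at random.
 */
function pickRandom(array $files): string
{
    if ($files === []) {
        throw new InvalidArgumentException('No files given.');
    }

    return $files[array_rand($files)];
}
```

With the three archives from the repository, `pickSmallest` would return whichever of them occupies the fewest bytes on disk.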

Since we also want to extract the chosen file immediately, we can reuse the extract method we already created. This is what your method could look like.

/**
 * Choose one of the files within the array and extract it to the given directory
 *
 * @param array  $fromFiles
 * @param string $toDirectory
 * @param string $preferredStrategy
 */
public function chooseAndExtract(array $fromFiles, $toDirectory, $preferredStrategy = self::MINIMUM_SIZE)
{
    $preferredFile = $this->distiller
        ->getChooser()
        ->setStrategy(new $preferredStrategy())
        ->setFiles($fromFiles)
        ->getPreferredFile();

    $this->extract($preferredFile, $toDirectory);
}

Given the array of files, Distill automatically chooses the preferred one, which is then extracted to your chosen directory. If you followed along, you should now have a class that looks like this.

<?php

namespace SitePoint\Extractor;

use Distill\Distill;

/**
 * Class to extract archived files
 */
class Extractor
{
    /**
     * Minimum size strategy
     */
    const MINIMUM_SIZE = "\Distill\Strategy\MinimumSize";

    /**
     * Uncompression speed strategy
     */
    const UNCOMPRESSION_SPEED = "\Distill\Strategy\UncompressionSpeed";

    /**
     * Random strategy
     */
    const RANDOM = "\Distill\Strategy\Random";

    /**
     * @var Distill
     */
    private $distiller;

    /**
     * Constructor
     */
    public function __construct()
    {
        $this->distiller = new Distill();
    }

    /**
     * Extract files into directory
     *
     * @param string $fromFile
     * @param string $toDirectory
     */
    public function extract($fromFile, $toDirectory)
    {
        $this->distiller->extract($fromFile, $toDirectory);
    }

    /**
     * Choose one of the files within the array and extract it to the given directory
     *
     * @param array  $fromFiles
     * @param string $toDirectory
     * @param string $preferredStrategy
     */
    public function chooseAndExtract(array $fromFiles, $toDirectory, $preferredStrategy = self::MINIMUM_SIZE)
    {
        $preferredFile = $this->distiller
            ->getChooser()
            ->setStrategy(new $preferredStrategy())
            ->setFiles($fromFiles)
            ->getPreferredFile();

        $this->extract($preferredFile, $toDirectory);
    }
}
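
Because chooseAndExtract instantiates whatever class name it receives, a typo in the strategy name only shows up when the `new` call fails. One possible guard is sketched below; `createStrategy` is a hypothetical helper, and `stdClass` merely stands in for a strategy class so the snippet runs without Distill installed.

```php
<?php

/**
 * Hypothetical guard (not part of Distill): instantiate a class by
 * name, failing with a descriptive exception when it does not exist.
 */
function createStrategy(string $className)
{
    // Class names stored as strings may carry a leading backslash.
    $className = ltrim($className, '\\');

    if (!class_exists($className)) {
        throw new InvalidArgumentException("Unknown strategy class: $className");
    }

    return new $className();
}

// stdClass stands in for a real strategy class in this demo.
$strategy = createStrategy('\stdClass');
echo get_class($strategy), PHP_EOL;
```

Inside the Extractor class, the result of such a helper could be passed to setStrategy in place of the bare `new $preferredStrategy()` expression.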

Let’s check whether the class works correctly. We create an index.php file in the root of our project with the following content.

<?php

require_once __DIR__ . '/vendor/autoload.php';

$files = array(
    'files/sitepoint.zip',
    'files/sitepoint.tar.gz',
    'files/sitepoint.tar'
);

$extractor = new \SitePoint\Extractor\Extractor();
$extractor->extract(current($files), 'files/extracted/simple');
$extractor->chooseAndExtract($files, 'files/extracted/advanced', \SitePoint\Extractor\Extractor::RANDOM);

If we run php index.php in our terminal, the SitePoint logo is extracted from an archive into each of the two target directories.
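
To confirm the result from the terminal, you could list what ended up in the target directories. This is a minimal sketch, assuming the files/extracted path used in index.php; `listFiles` is a hypothetical helper and not part of Distill.

```php
<?php

/**
 * Hypothetical helper: return the sorted paths of all files found
 * below the given directory, including nested ones.
 */
function listFiles(string $directory): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if ($file->isFile()) {
            $found[] = $file->getPathname();
        }
    }

    sort($found);

    return $found;
}

// Assumed location, taken from the index.php example.
$target = 'files/extracted';
if (is_dir($target)) {
    foreach (listFiles($target) as $path) {
        echo $path, PHP_EOL;
    }
}
```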

Conclusion

Distill is a very specific library and might appear lacking in features when compared to other archive manipulation tools. But in the niche it focuses on, it excels. If you are looking for a lightweight extractor that can help you save bandwidth, time, or both, Distill might be the library you are looking for. Maybe you can even combine it with a compressor to build a hybrid package for your app’s archive manipulation features. Give it a go and let us know how it worked out for you.

Translated from: https://www.sitepoint.com/strategic-archive-extraction-distill/
