【TensorFlow2.0】tf.keras.preprocessing.image.ImageDataGenerator#flow_from_directory

flow_from_directory is a method of the ImageDataGenerator class. As its name suggests, it reads images from a directory.

Definition

  def flow_from_directory(self,
                          directory,
                          target_size=(256, 256),
                          color_mode='rgb',
                          classes=None,
                          class_mode='categorical',
                          batch_size=32,
                          shuffle=True,
                          seed=None,
                          save_to_dir=None,
                          save_prefix='',
                          save_format='png',
                          follow_links=False,
                          subset=None,
                          interpolation='nearest'):
    """Takes the path to a directory & generates batches of augmented data.

    Args:
        directory: string, path to the target directory. It should contain one
          subdirectory per class. Any PNG, JPG, BMP, PPM or TIF images inside
          each of the subdirectories directory tree will be included in the
          generator. See [this script](
            https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d)
              for more details.
        target_size: Tuple of integers `(height, width)`, defaults to `(256,
          256)`. The dimensions to which all images found will be resized.
        color_mode: One of "grayscale", "rgb", "rgba". Default: "rgb". Whether
          the images will be converted to have 1, 3, or 4 channels.
        classes: Optional list of class subdirectories
            (e.g. `['dogs', 'cats']`). Default: None. If not provided, the list
              of classes will be automatically inferred from the subdirectory
              names/structure under `directory`, where each subdirectory will be
              treated as a different class (and the order of the classes, which
              will map to the label indices, will be alphanumeric). The
              dictionary containing the mapping from class names to class
              indices can be obtained via the attribute `class_indices`.
        class_mode: One of "categorical", "binary", "sparse",
            "input", or None. Default: "categorical".
            Determines the type of label arrays that are returned:
            - "categorical" will be 2D one-hot encoded labels,
            - "binary" will be 1D binary labels,
            - "sparse" will be 1D integer labels,
            - "input"  will be images identical to input images (mainly used to
              work with autoencoders).
            - If None, no labels are returned (the generator will only yield
              batches of image data, which is useful to use with
              `model.predict()`).
            Please note that in case of class_mode None, the data still needs to
            reside in a subdirectory of `directory` for it to work correctly.
        batch_size: Size of the batches of data (default: 32).
        shuffle: Whether to shuffle the data (default: True) If set to False,
          sorts the data in alphanumeric order.
        seed: Optional random seed for shuffling and transformations.
        save_to_dir: None or str (default: None). This allows you to optionally
          specify a directory to which to save the augmented pictures being
          generated (useful for visualizing what you are doing).
        save_prefix: Str. Prefix to use for filenames of saved pictures (only
          relevant if `save_to_dir` is set).
        save_format: one of "png", "jpeg", "bmp", "pdf", "ppm", "gif",
            "tif", "jpg"
            (only relevant if `save_to_dir` is set). Default: "png".
        follow_links: Whether to follow symlinks inside
            class subdirectories (default: False).
        subset: Subset of data (`"training"` or `"validation"`) if
          `validation_split` is set in `ImageDataGenerator`.
        interpolation: Interpolation method used to resample the image if the
          target size is different from that of the loaded image. Supported
          methods are `"nearest"`, `"bilinear"`, and `"bicubic"`. If PIL version
          1.1.3 or newer is installed, `"lanczos"` is also supported. If PIL
          version 3.4.0 or newer is installed, `"box"` and `"hamming"` are also
          supported. By default, `"nearest"` is used.

    Returns:
        A `DirectoryIterator` yielding tuples of `(x, y)`
            where `x` is a numpy array containing a batch
            of images with shape `(batch_size, *target_size, channels)`
            and `y` is a numpy array of corresponding labels.
    """

Meaning of the flow_from_directory parameters:


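directory: path to the target directory. It must contain one subdirectory per class.

target_size: tuple of integers (height, width), default (256, 256). Every image found is resized to this size.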

color_mode: color mode, one of "grayscale", "rgb", "rgba"; default "rgb". It determines whether the images are converted to 1, 3, or 4 channels.

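classes: optional list of class subdirectories, e.g. ['cat', 'dog']; default None. If not provided, the list of classes is inferred automatically from the subdirectory names/structure under directory, and each subdirectory is treated as a separate class. The classes are mapped to label indices in alphanumeric order, and the resulting mapping from class names to indices is available through the class_indices attribute.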

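class_mode: one of "categorical", "binary", "sparse", "input", or None; default "categorical". This parameter determines the form of the returned label arrays: "categorical" returns 2D one-hot encoded labels, "binary" returns 1D binary labels, "sparse" returns 1D integer labels, and "input" returns images identical to the input images (mainly used for autoencoders). If None, no labels are returned and the generator yields only batches of image data.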

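batch_size: size of each batch of data, default 32.

shuffle: whether to shuffle the data, default True. If set to False, the data is sorted in alphanumeric order.

seed: optional random seed for shuffling and transformations.

save_to_dir: None or a string, default None. It lets you save the augmented images to a directory, which is useful for visualizing what the generator produces.

save_prefix: string, the prefix used for the filenames of saved augmented images. Only relevant if save_to_dir is set.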

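save_format: one of "png", "jpeg", "bmp", "pdf", "ppm", "gif", "tif", "jpg". It specifies the file format of the saved images and is only relevant if save_to_dir is set. Default "png".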
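To make the class_mode options more concrete, here is a minimal pure-Python sketch of the label layout each mode produces for one batch. The helper encode_labels is hypothetical and only imitates the shapes described in the docstring; it is not the library's implementation, which returns NumPy arrays. The "input" mode is left out because it returns the images themselves rather than labels.

```python
def encode_labels(class_ids, num_classes, class_mode):
    """Imitate the label layout for one batch (hypothetical helper, not Keras code)."""
    if class_mode is None:
        # No labels: the generator yields image batches only
        return None
    if class_mode in ("binary", "sparse"):
        # 1D labels: one class index per image
        return list(class_ids)
    if class_mode == "categorical":
        # 2D one-hot labels: one row per image, one column per class
        return [[1 if k == c else 0 for k in range(num_classes)] for c in class_ids]
    raise ValueError(f"unsupported class_mode in this sketch: {class_mode!r}")


# Three images whose class indices are cat=0, dog=1, dog=1
batch = [0, 1, 1]
print(encode_labels(batch, 2, "categorical"))  # [[1, 0], [0, 1], [0, 1]]
print(encode_labels(batch, 2, "sparse"))       # [0, 1, 1]
```

With two classes, "binary" and "sparse" carry the same information; "categorical" is the layout that pairs with a softmax output and a categorical cross-entropy loss.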

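Of these parameters, directory is the one you must get right: it is the parent folder of the class folders. In this dataset the class folders are cat and dog, and their parent folder is train, so directory is r"D://Learning//tensorflow_2.0//animal//data//train"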
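A call might look like the sketch below. It assumes a TensorFlow 2.x release that still ships ImageDataGenerator. The names TRAIN_DIR, FLOW_KWARGS and make_train_iterator, the relative path data/train, the seed value, and the rescale factor are illustrative assumptions, not part of the original dataset setup.

```python
from pathlib import Path

# Hypothetical dataset root: the folder that CONTAINS the class folders
# (train/cat, train/dog), not one of the class folders themselves.
TRAIN_DIR = Path("data") / "train"

# Keyword arguments taken from the signature shown in the definition
FLOW_KWARGS = dict(
    target_size=(256, 256),    # (height, width) every image is resized to
    color_mode="rgb",          # 3 channels
    class_mode="categorical",  # 2D one-hot labels
    batch_size=32,
    shuffle=True,
    seed=42,                   # arbitrary seed, chosen for reproducibility
)


def make_train_iterator(directory=TRAIN_DIR):
    """Return a DirectoryIterator over `directory` (sketch under the assumptions above)."""
    # Imported here so the rest of the file can be read without TensorFlow installed
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255)  # scale pixel values to [0, 1]
    return datagen.flow_from_directory(str(directory), **FLOW_KWARGS)
```

Calling make_train_iterator() and then next() on the result should yield a tuple (x, y), where x has shape (batch_size, 256, 256, 3) and y holds the one-hot labels.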

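The classes parameter also deserves attention. Usually we keep the default None, in which case the subfolders of directory become the labels. In this classification task the labels are cat and dog.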
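The way labels follow from subfolder names can be sketched with the standard library alone. The function infer_class_indices below is a hypothetical stand-in that mirrors the documented behavior (subdirectories sorted in alphanumeric order, then numbered from 0); it is not the library's own code, and the temporary folder only imitates the train/cat and train/dog layout.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Map each subdirectory name to an index, in sorted order (sketch only)."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway train/ folder with two class subfolders to try it out
with tempfile.TemporaryDirectory() as root:
    train_dir = os.path.join(root, "train")
    for name in ("dog", "cat"):
        os.makedirs(os.path.join(train_dir, name))
    class_indices = infer_class_indices(train_dir)

print(class_indices)  # {'cat': 0, 'dog': 1}
```

On a real iterator the same mapping is available through its class_indices attribute, which is worth printing once to confirm which index belongs to which class.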
