[Tensorflow] Basic image operations in Tensorflow

Latest recommended article published 2024-08-13 14:39:04

Day-yong

Category: Tensorflow. Tags: Tensorflow, image processing

Article link: https://blog.csdn.net/Daycym/article/details/90169245

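Preface

Python has several libraries for image processing, including OpenCV, PIL, matplotlib, and tensorflow. This post uses tensorflow to decode images, resize them, crop or pad them, flip and rotate them, convert between color spaces, adjust them, and add noise.

tensorflow version: 1.9

The code for this post is available on Github.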

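I. Reading an image

API:

tf.read_file(filename, name=None)

filename: path of the file to read
name: name of the operation (optional)

tf.read_file only returns the raw, still-encoded file contents as a 0-D string tensor. The tf.image.decode_* functions then turn those bytes into pixel data with shape [height, width, num_channels], or [num_frames, height, width, num_channels] for a gif:

height: image height in pixels
width: image width in pixels
num_channels: number of channels, i.e. the value of the channels argument of the decode functions
num_frames: a gif is an animation, and each frame can be treated as a still image; num_frames is the number of frames in the gif

The channels argument of the decode functions accepts 0, 1, 3 or 4 and defaults to 0. The values 0, 1 and 3 are the usual choices; 4 is not recommended.

0: use the number of channels stored in the image itself
1: return a grayscale image (a single channel)
3: return an RGB image
4: return an RGBA image (R: red, G: green, B: blue, A: alpha, i.e. transparency)

# Read the raw file contents (image_path is the path of an image file)
file_contents = tf.read_file(image_path)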

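II. Converting image formats

API:

# Decode a PNG-encoded image into a uint8 or uint16 tensor
tf.image.decode_png(contents, channels=0, dtype=tf.uint8, name=None)

contents: a 0-D Tensor of type string holding the PNG-encoded image.
channels: optional int, defaults to 0. Number of color channels of the decoded image.
dtype: optional tf.DType, either tf.uint8 or tf.uint16. Defaults to tf.uint8.
name: name of the operation (optional).

# Convenience function for decode_bmp, decode_gif, decode_jpeg and decode_png.
# Detects whether the image is BMP, GIF, JPEG or PNG and runs the matching op to turn the input byte string into a uint8 Tensor
tf.image.decode_image(contents, channels=None, name=None)
# Decode a JPEG-encoded image into a uint8 tensor
tf.image.decode_jpeg(contents, channels=0, ratio=1, fancy_upscaling=True,
    try_recover_truncated=False, acceptable_fraction=1, dct_method='', name=None)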
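
To make the format detection described for tf.image.decode_image more concrete, here is a minimal plain-Python sketch that guesses the format from the leading "magic" bytes of the encoded data. The function name sniff_image_format is hypothetical, and this illustrates the idea only; it is not TensorFlow's actual implementation.

```python
def sniff_image_format(data):
    """Guess an image format from the first bytes of the encoded file."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"GIF87a") or data.startswith(b"GIF89a"):
        return "gif"
    if data.startswith(b"BM"):
        return "bmp"
    return "unknown"


# The header alone is enough to pick a decoder.
print(sniff_image_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```

A dispatcher built on such a check would then hand the bytes to the matching decoder, which is the convenience that decode_image provides.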

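III. Resizing images

API:

# Resize images to the given size
tf.image.resize_images(images, size, method=ResizeMethod.BILINEAR, align_corners=False)

images: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].
size: new_height, new_width. The new size of the images.
method: a ResizeMethod. Defaults to ResizeMethod.BILINEAR.
align_corners: bool. If True, the centers of the 4 corner pixels of the input and output tensors are aligned, preserving the values at the corner pixels. Defaults to False.

"""
BILINEAR = 0          bilinear interpolation (the default)
NEAREST_NEIGHBOR = 1  nearest-neighbor interpolation; copies existing pixel values without blending them, so no new values appear, but edges can look blocky
BICUBIC = 2           bicubic interpolation
AREA = 3              area interpolation
"""
# images: tensor of the image(s) to resize, with shape [height, width, num_channels]
#         or [batch, height, width, num_channels]
# The return value has the same layout as images; only height and width change to the given values
resize_image_tensor = tf.image.resize_images(images=image_tensor, size=(200, 200),
                                             method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)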
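
As a rough illustration of what nearest-neighbor resizing does, the sketch below resizes a small 2-D list of pixel values in plain Python. The helper resize_nearest is hypothetical, and the index mapping (floor of the output index times the input/output ratio) is an assumption that corresponds to align_corners=False; TensorFlow's exact rounding may differ.

```python
def resize_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of a 2-D list of pixel values."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(new_height):
        # Map the output row back to a source row.
        src_y = min(y * height // new_height, height - 1)
        row = []
        for x in range(new_width):
            src_x = min(x * width // new_width, width - 1)
            row.append(image[src_y][src_x])
        out.append(row)
    return out


small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
for row in big:
    print(row)
```

Every output value is copied from the input, which is why this method never invents new pixel values but produces blocky enlargements.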

IV. Cropping and padding images

1. Crop or pad

API:

# Crop and/or pad an image to the target height and width
tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].
target_height: target height.
target_width: target width.

# Resize by cropping or zero-padding; the new size is measured from the center of the image
crop_or_pad_image_tensor = tf.image.resize_image_with_crop_or_pad(image_tensor, 200, 200)  # crop
# crop_or_pad_image_tensor = tf.image.resize_image_with_crop_or_pad(image_tensor, 1000, 1000) # pad
show_image_tensor(crop_or_pad_image_tensor)
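
The centered crop-or-pad behaviour can be sketched in plain Python on a 2-D list. The helper crop_or_pad is hypothetical; the offsets (half of the size difference, rounded down) are an assumption about how the centering is computed, not a copy of TensorFlow's code.

```python
def crop_or_pad(image, target_height, target_width, fill=0):
    """Center-crop or zero-pad a 2-D list to the target size."""
    def fit(seq, target, make_fill):
        size = len(seq)
        if size >= target:
            start = (size - target) // 2          # crop around the center
            return list(seq[start:start + target])
        before = (target - size) // 2             # pad evenly on both sides
        after = target - size - before
        return ([make_fill() for _ in range(before)] + list(seq)
                + [make_fill() for _ in range(after)])

    rows = fit(image, target_height, lambda: [fill] * len(image[0]))
    return [fit(row, target_width, lambda: fill) for row in rows]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(crop_or_pad(image, 2, 2))   # central 2x2 block
print(crop_or_pad([[1, 2], [3, 4]], 4, 4))  # zero border around the image
```

Height and width are handled independently, so one dimension can be cropped while the other is padded.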

2. Central crop by fraction

API:

# Crop the central region of the image
tf.image.central_crop(image, central_fraction)

image: 3-D tensor of shape [height, width, depth] or 4-D tensor of shape [batch_size, height, width, depth].
central_fraction: float in (0, 1], the fraction of each dimension to keep.

# Keep the central 20% of the height and width
central_crop_image_tensor = tf.image.central_crop(image_tensor, central_fraction=0.2)
# show_image_tensor(central_crop_image_tensor)
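
To see which region a given central_fraction selects, the following plain-Python sketch computes the crop box. The helper central_crop_box is hypothetical, and the rounding (truncating the offset, then removing it from both sides) is an assumption about the op's behaviour.

```python
def central_crop_box(height, width, central_fraction):
    """Return (offset_y, offset_x, crop_height, crop_width) of the central region."""
    if not 0.0 < central_fraction <= 1.0:
        raise ValueError("central_fraction must be in (0, 1]")
    offset_y = int((height - height * central_fraction) / 2)
    offset_x = int((width - width * central_fraction) / 2)
    # The same margin is removed from both sides of each dimension.
    return offset_y, offset_x, height - 2 * offset_y, width - 2 * offset_x


print(central_crop_box(600, 510, 0.5))
```

Because the offset is truncated, the kept region can be one pixel larger than height * central_fraction when the margin is not a whole number.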

3. Padding from a given position

API:

tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].
offset_height: number of rows of zeros to add on top.
offset_width: number of columns of zeros to add on the left.
target_height: height of the output image.
target_width: width of the output image.

# Pad with zeros, placing the image at the given offset
pad_to_bounding_box_image_tensor = tf.image.pad_to_bounding_box(
    image_tensor,
    offset_height=400,
    offset_width=490,
    target_height=1000,
    target_width=1000)
show_image_tensor(pad_to_bounding_box_image_tensor)
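
The effect of the offsets can be sketched in plain Python by placing a small 2-D list on a zero-filled canvas. The helper pad_to_box is hypothetical and only mirrors the parameter meanings listed above.

```python
def pad_to_box(image, offset_height, offset_width, target_height, target_width):
    """Place a 2-D list inside a zero-filled canvas of the target size."""
    height, width = len(image), len(image[0])
    if offset_height + height > target_height or offset_width + width > target_width:
        raise ValueError("image does not fit inside the target canvas")
    canvas = [[0] * target_width for _ in range(target_height)]
    for y, row in enumerate(image):
        # Copy each source row to its shifted position.
        canvas[offset_height + y][offset_width:offset_width + width] = row
    return canvas


for row in pad_to_box([[1, 2], [3, 4]], 1, 2, 4, 5):
    print(row)
```

The offsets plus the image size must not exceed the target size, otherwise the image would not fit on the canvas.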

4. Cropping from a given position

API:

tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].
offset_height: vertical coordinate of the top-left corner of the result in the input.
offset_width: horizontal coordinate of the top-left corner of the result in the input.
target_height: height of the result.
target_width: width of the result.

# Crop, starting at the given position
crop_to_bounding_box_image_tensor = tf.image.crop_to_bounding_box(
    image_tensor,
    offset_height=10,
    offset_width=40,
    target_height=200,
    target_width=300)
show_image_tensor(crop_to_bounding_box_image_tensor)
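
Cropping to a bounding box amounts to slicing rows and columns. A minimal plain-Python sketch, with the hypothetical helper crop_to_box operating on a 2-D list:

```python
def crop_to_box(image, offset_height, offset_width, target_height, target_width):
    """Cut a target_height x target_width window out of a 2-D list."""
    height, width = len(image), len(image[0])
    if offset_height + target_height > height or offset_width + target_width > width:
        raise ValueError("crop box extends past the image")
    rows = image[offset_height:offset_height + target_height]
    return [row[offset_width:offset_width + target_width] for row in rows]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
print(crop_to_box(image, 1, 1, 2, 2))
```

The window must lie entirely inside the image, which is why the sketch rejects boxes that run past the border.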

V. Flipping and rotating images

1. Flip up-down

API:

# Flip the image vertically (top to bottom)
tf.image.flip_up_down(image)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].

# Flip up-down
flip_up_down_image_tensor = tf.image.flip_up_down(image_tensor)
show_image_tensor(flip_up_down_image_tensor)

2. Flip left-right

API:

# Flip the image horizontally (left to right)
tf.image.flip_left_right(image)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].

# Flip left-right
flip_left_right_image_tensor = tf.image.flip_left_right(image_tensor)
show_image_tensor(flip_left_right_image_tensor)

3. Transpose

API:

# Transpose the image by swapping the height and width dimensions
tf.image.transpose_image(image)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].

# Transpose
transpose_image_tensor = tf.image.transpose_image(image_tensor)
show_image_tensor(transpose_image_tensor)
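
The two flips and the transpose are pure index rearrangements. The sketch below reproduces them on a 2-D list in plain Python; the helper names are hypothetical and chosen to match the operations described in this section.

```python
def flip_up_down(image):
    """Reverse the order of the rows."""
    return [list(row) for row in image[::-1]]


def flip_left_right(image):
    """Reverse the order of the values inside each row."""
    return [list(row[::-1]) for row in image]


def transpose(image):
    """Swap the height and width axes."""
    return [list(column) for column in zip(*image)]


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_up_down(image))
print(flip_left_right(image))
print(transpose(image))
```

Note that a transpose changes the shape from [height, width] to [width, height], while the flips keep it.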

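4. Rotation (90, 180, 270, 360 degrees, ...)

API:

# Rotate counter-clockwise by k * 90 degrees
tf.image.rot90(image, k=1, name=None)

image: 4-D tensor of shape [batch, height, width, channels] or 3-D tensor of shape [height, width, channels].
k: scalar integer. The number of times the image is rotated by 90 degrees.
name: name of the operation (optional).

# Rotation (90, 180, 270 degrees, ...)
# Rotates counter-clockwise by k * 90 degrees
k_rot90_image_tensor = tf.image.rot90(image_tensor, k=1)
show_image_tensor(k_rot90_image_tensor)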
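
A counter-clockwise quarter turn can be built from a left-right flip followed by a transpose. The plain-Python sketch below applies that step k times to a 2-D list; the helper rot90_ccw is hypothetical and only illustrates the geometry.

```python
def rot90_ccw(image, k=1):
    """Rotate a 2-D list counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Reverse every row (left-right flip), then transpose.
        image = [list(column) for column in zip(*[row[::-1] for row in image])]
    return [list(row) for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
print(rot90_ccw(image, k=1))
print(rot90_ccw(image, k=2))
```

Since four quarter turns give back the original image, only k modulo 4 matters.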

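VI. Color space conversion

tf.image.rgb_to_hsv and tf.image.hsv_to_rgb require floating-point input with values in [0, 1] and do not accept uint8, so convert the image to float32 first:

float32_image_tensor = tf.image.convert_image_dtype(image_tensor, dtype=tf.float32)

1. rgb -> hsv

In hsv, h is the hue, s is the saturation, and v is the value (brightness).

API:

tf.image.rgb_to_hsv(images, name=None)

images: must be one of the types half, bfloat16, float32, float64; 1-D or higher. The RGB data to convert; the last dimension must have size 3.
name: name of the operation (optional).

# rgb -> hsv (h: hue, s: saturation, v: value/brightness)
hsv_image_tensor = tf.image.rgb_to_hsv(float32_image_tensor)
show_image_tensor(hsv_image_tensor)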
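
To get a feel for the numbers an RGB to HSV conversion produces, Python's standard colorsys module can convert single pixels. This is a sketch using colorsys, not TensorFlow; it relies on the assumption that both use the same convention of h, s and v in [0, 1], with hue 0 for red and 1/3 for green. The helper rgb_image_to_hsv is hypothetical.

```python
import colorsys


def rgb_image_to_hsv(image):
    """Convert a 2-D list of (r, g, b) float pixels in [0, 1] to (h, s, v)."""
    return [[colorsys.rgb_to_hsv(r, g, b) for (r, g, b) in row] for row in image]


image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
         [(0.5, 0.5, 0.5), (0.0, 0.0, 1.0)]]
for row in rgb_image_to_hsv(image):
    print(row)
```

Gray pixels have saturation 0, which is why the hue of a gray pixel carries no information.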

2. hsv -> rgb

# hsv -> rgb
rgb_image_tensor = tf.image.hsv_to_rgb(hsv_image_tensor)
show_image_tensor(rgb_image_tensor)

3. rgb -> gray

# rgb -> gray
gray_image_tensor = tf.image.rgb_to_grayscale(rgb_image_tensor)
show_image_tensor(gray_image_tensor)
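
Grayscale conversion is a weighted sum of the three channels. The plain-Python sketch below uses the weights 0.2989, 0.5870 and 0.1140; that these are the weights TensorFlow applies is an assumption on my part, and the helper rgb_to_gray is hypothetical.

```python
# Assumed luma weights for the red, green and blue channels.
RGB_WEIGHTS = (0.2989, 0.5870, 0.1140)


def rgb_to_gray(pixel):
    """Collapse one (r, g, b) pixel to a single gray value."""
    r, g, b = pixel
    return RGB_WEIGHTS[0] * r + RGB_WEIGHTS[1] * g + RGB_WEIGHTS[2] * b


print(rgb_to_gray((1.0, 0.0, 0.0)))
print(rgb_to_gray((0.0, 1.0, 0.0)))
```

Green contributes most to the result because the eye is most sensitive to it.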

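4. Binarization (extracting outline information from the color space)

# Binarize the image to extract its outline information
a = gray_image_tensor
b = tf.less_equal(a, 0.9)
# 0 is black, 1 is white
# tf.where behaves like condition ? x : y
# condition, x and y must have identical shapes; where condition is True the value from x is returned, otherwise the value from y
# Set every pixel of a that is greater than 0.9 to 0
c = tf.where(condition=b, x=a, y=a - a)
# Set every pixel of a that is less than or equal to 0.9 to 1
d = tf.where(condition=b, x=c - c + 1, y=c)
show_image_tensor(d)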
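
The two tf.where calls above together implement a simple threshold. In plain Python the same rule can be sketched in one expression; the helper binarize is hypothetical and works on a 2-D list of gray values in [0, 1].

```python
def binarize(gray, threshold=0.9):
    """Map values <= threshold to 1.0 (white) and all others to 0.0 (black)."""
    return [[1.0 if value <= threshold else 0.0 for value in row] for row in gray]


gray = [[0.2, 0.95],
        [0.9, 1.0]]
print(binarize(gray))
```

With this rule the bright background (above 0.9) turns black and everything darker turns white, so the result is an inverted mask of the original.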

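VII. Image adjustments

1. Brightness

API:

# Adjust the brightness of RGB or grayscale images
tf.image.adjust_brightness(image, delta)

image: a tensor.
delta: a scalar. Amount to add to the pixel values.

# Brightness adjustment
# image: RGB image data; float and uint8 inputs can give different results, and float is generally recommended
# delta: a float in (-1, 1) that is added to every pixel value on the [0, 1) float scale; positive values brighten, negative values darken
# Under the hood: convert to float -> add delta to every channel -> convert back to the original dtype
adjust_brightness_image_tensor = tf.image.adjust_brightness(image=image_tensor, delta=0.8)
show_image_tensor(adjust_brightness_image_tensor)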
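
Since the brightness adjustment is an addition, it is easy to sketch in plain Python. The helper brighten is hypothetical. The clamping to [0, 1] models what happens when the result is converted back to an integer image; whether float inputs are clamped as well is an assumption you should check against the TensorFlow version you use.

```python
def brighten(pixels, delta):
    """Add delta to float pixel values in [0, 1] and clamp the result."""
    return [min(1.0, max(0.0, p + delta)) for p in pixels]


print(brighten([0.25, 0.5], 0.5))
print(brighten([0.25, 0.5], -0.5))
```

A large delta such as 0.8 pushes most pixels to the upper limit, which is why the adjusted image looks washed out.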

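2. Hue

API:

# Adjust the hue of RGB images
tf.image.adjust_hue(image, delta, name=None)

image: an RGB image or images. The last dimension must have size 3.
delta: float. How much to add to the hue channel.
name: name of the operation (optional).

# Hue adjustment
# image: RGB image data; float and uint8 inputs can give different results, and float is generally recommended
# delta: a float in [-1, 1]; the offset added to the hue channel
# Under the hood: rgb -> hsv -> (h + delta, s, v), with h wrapped around -> rgb
adjust_hue_image_tensor = tf.image.adjust_hue(image_tensor, delta=-0.8)
show_image_tensor(adjust_hue_image_tensor)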
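
The hue shift can be sketched for a single pixel with the standard colorsys module: convert to HSV, add delta to h with wrap-around, convert back. The helper shift_hue is hypothetical, and colorsys stands in for TensorFlow's conversion under the assumption that both use hue values in [0, 1].

```python
import colorsys


def shift_hue(pixel, delta):
    """Rotate the hue of one (r, g, b) float pixel by delta (a fraction of a full turn)."""
    h, s, v = colorsys.rgb_to_hsv(*pixel)
    return colorsys.hsv_to_rgb((h + delta) % 1.0, s, v)


red = (1.0, 0.0, 0.0)
print(shift_hue(red, 0.5))    # half a turn around the color wheel
print(shift_hue(red, -0.5))   # wraps around to the same color
```

Because hue is circular, shifting by 0.5 and by -0.5 lands on the same color.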

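3. Saturation

API:

tf.image.adjust_saturation(image, saturation_factor, name=None)

image: an RGB image or images. The last dimension must have size 3.
saturation_factor: float. Factor to multiply the saturation by.
name: name of the operation (optional).

# Saturation adjustment
# image: RGB image data; float and uint8 inputs can give different results, and float is generally recommended
# saturation_factor: a float that scales the saturation; values below 1 weaken it, values above 1 strengthen it
# Under the hood: rgb -> hsv -> (h, s * saturation_factor, v) -> rgb
adjust_saturation_image_tensor = tf.image.adjust_saturation(image_tensor, saturation_factor=20)
show_image_tensor(adjust_saturation_image_tensor)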
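
The saturation scaling can be sketched for one pixel with colorsys in the same way. The helper scale_saturation is hypothetical; clamping the scaled saturation to [0, 1] is an assumption about how out-of-range values are handled.

```python
import colorsys


def scale_saturation(pixel, factor):
    """Multiply the saturation of one (r, g, b) float pixel by factor."""
    h, s, v = colorsys.rgb_to_hsv(*pixel)
    s = min(1.0, max(0.0, s * factor))  # keep saturation in its valid range
    return colorsys.hsv_to_rgb(h, s, v)


print(scale_saturation((0.5, 0.25, 0.25), 20))  # fully saturated
print(scale_saturation((1.0, 0.0, 0.0), 0))     # saturation removed
```

With a factor as large as 20 almost every colored pixel ends up fully saturated, while a factor of 0 removes the color entirely.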

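4. Contrast

API:

tf.image.adjust_contrast(images, contrast_factor)

images: the images to adjust, at least 3-D.
contrast_factor: a float multiplier for adjusting the contrast.

# Contrast adjustment, computed per channel: (x - mean) * contrast_factor + mean, where mean is the mean of that channel over the image
adjust_contrast_image_tensor = tf.image.adjust_contrast(image_tensor, contrast_factor=10)
show_image_tensor(adjust_contrast_image_tensor)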
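
The contrast formula (x - mean) * contrast_factor + mean can be checked on a few numbers. The plain-Python helper stretch_contrast below is hypothetical and processes the values of a single channel given as a flat list.

```python
def stretch_contrast(channel, contrast_factor):
    """Apply (x - mean) * contrast_factor + mean to one channel."""
    mean = sum(channel) / len(channel)
    return [(x - mean) * contrast_factor + mean for x in channel]


print(stretch_contrast([0.25, 0.75], 2.0))
print(stretch_contrast([0.25, 0.75], 0.0))
```

Values move away from the channel mean for factors above 1 and collapse onto the mean for a factor of 0; the mean itself stays unchanged.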

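5. Gamma correction

API:

tf.image.adjust_gamma(image, gamma=1, gain=1)

image: a tensor.
gamma: a scalar or tensor. Non-negative real number.
gain: a scalar or tensor. Constant multiplier.

# Gamma correction
# image: must be float data
# gamma: any non-negative value; Out = gain * In ** gamma, so gamma > 1 darkens and gamma < 1 brightens values in [0, 1]
adjust_gamma_image_tensor = tf.image.adjust_gamma(float32_image_tensor, gamma=100)
show_image_tensor(adjust_gamma_image_tensor)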
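
Gamma correction applies a power law to each pixel. A minimal plain-Python sketch of Out = gain * In ** gamma, with the hypothetical helper gamma_correct and float pixel values in [0, 1]:

```python
def gamma_correct(pixels, gamma=1.0, gain=1.0):
    """Apply Out = gain * In ** gamma to float pixel values."""
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    return [gain * p ** gamma for p in pixels]


print(gamma_correct([0.0, 0.5, 1.0], gamma=2.0))
print(gamma_correct([0.5], gamma=100))
```

With gamma=100 every value below 1 is driven almost to 0, so only pixels that are already at full intensity stay bright.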

6. Per-image standardization

API:

tf.image.per_image_standardization(image)

image: 3-D tensor of shape [height, width, channels]

# Standardization: (x - mean) / adjusted_stddev, where adjusted_stddev = max(stddev, 1.0 / sqrt(image.NumElements()))
per_image_standardization_image_tensor = tf.image.per_image_standardization(image_tensor)
show_image_tensor(per_image_standardization_image_tensor)
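
The standardization formula, including the lower bound on the standard deviation, can be reproduced in plain Python on a flat list of all values of one image. The helper standardize is hypothetical.

```python
import math


def standardize(values):
    """Apply (x - mean) / max(stddev, 1 / sqrt(N)) to all values of one image."""
    n = len(values)
    mean = sum(values) / n
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(n))  # avoids division by zero
    return [(v - mean) / adjusted_stddev for v in values]


print(standardize([1.0, 3.0]))
print(standardize([5.0, 5.0, 5.0, 5.0]))  # uniform image: stddev is 0
```

The lower bound matters for uniform images, where the plain standard deviation would be 0 and the division undefined.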

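VIII. Adding noise

Add the noise in float space. Casting small or negative float noise to uint8 truncates it before the addition, so the image would stay almost unchanged. The noise shape is taken from the image instead of being hard-coded, and the result is clipped back to [0, 1]:

float_image_tensor = tf.image.convert_image_dtype(image_tensor, dtype=tf.float32)
noise = tf.random_normal(shape=tf.shape(float_image_tensor), mean=0.0, stddev=0.1)
noisy_image_tensor = tf.clip_by_value(float_image_tensor + noise, 0.0, 1.0)
show_image_tensor(noisy_image_tensor)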
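
The idea of adding Gaussian noise and clamping the result can be sketched in plain Python with the standard random module. The helper add_gaussian_noise and its seed parameter are hypothetical; pixel values are assumed to be floats in [0, 1].

```python
import random


def add_gaussian_noise(pixels, stddev=0.1, seed=None):
    """Add zero-mean Gaussian noise to float pixels and clamp to [0, 1]."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, stddev))) for p in pixels]


noisy = add_gaussian_noise([0.0, 0.5, 1.0], stddev=0.1, seed=0)
print(noisy)
```

Clamping keeps the values displayable; without it, pixels near 0 or 1 could leave the valid range after the noise is added.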