Zhihu blogger: https://zhuanlan.zhihu.com/p/500659643
Original blog post: https://pyimagesearch.com/2018/07/19/opencv-tutorial-a-guide-to-learn-opencv/
Displaying an image
# import the necessary packages
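import imutils  # needed for the imutils.resize / imutils.rotate calls further down
import cv2
import os  # import the os module
os.chdir('/xxx/py-practice')  # change the current working directory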
# load the input image and show its dimensions, keeping in mind that
# images are represented as a multi-dimensional NumPy array with
# shape no. rows (height) x no. columns (width) x no. channels (depth)
image = cv2.imread("2.jpg") #a NumPy array
(h, w, d) = image.shape # to extract the height, width, and depth
# the height comes before the width because:
# We describe matrices by # of rows x # of columns
# The number of rows is our height
# And the number of columns is our width
# Depth is the number of channels
# in our case this is three since we’re working with 3 color channels: Blue, Green, and Red.
# "width={}, height={}, depth={}" is a format string: each {} is a
# placeholder for a value, and .format(w, h, d) fills them in order
print("width={}, height={}, depth={}".format(w, h, d))
# display the image to our screen -- we will need to click the window
# opened by OpenCV and press a key on our keyboard to continue execution
cv2.imshow("Image", image)
cv2.waitKey(0) # waits for a keypress
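# cv2.waitKey(0) waits for keyboard input.
# 0 means wait indefinitely for a key press.
# The program stays on this line until any key is pressed while the
# image window has focus, then continues with the next line.
# It is commonly used to keep the image window open.
Why height comes before width: a matrix is described as rows x columns; height is the number of rows and width is the number of columns.
Depth is the number of channels; in this example that is the three color channels.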
Accessing individual pixels
What is a pixel?
- All images consist of pixels, which are the raw building blocks of images.
- Images are made of pixels in a grid.
- Each pixel in a grayscale image has a value representing its shade of gray.
- Pixels in a color image carry additional information.
- In OpenCV the channel order is BGR, NOT RGB.
How to retrieve the value of an individual pixel in the image
# access the pixel located at x=50, y=100, keeping in mind that
# OpenCV stores images in BGR order rather than RGB
(B, G, R) = image[100, 50]
print("R={}, G={}, B={}".format(R, G, B))
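The index order is easy to get backwards, so here is a minimal sketch on a synthetic NumPy array (a stand-in for a loaded image, not the tutorial's photo). It shows that the array is indexed as image[y, x] and that the three values come back in B, G, R order.

```python
import numpy as np

# Synthetic 200x300 BGR "image" (height=200, width=300), all black.
# This stands in for the array that cv2.imread would return.
image = np.zeros((200, 300, 3), dtype=np.uint8)

# Paint the pixel at x=50, y=100 pure red. In BGR order that is (0, 0, 255).
image[100, 50] = (0, 0, 255)

# Rows (y) come first, columns (x) second.
(B, G, R) = image[100, 50]
print("R={}, G={}, B={}".format(R, G, B))

# Swapping the indices reads a different pixel, which is still black here.
print(image[50, 100].tolist())
```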
Array slicing and cropping
- Extracting “regions of interest” (ROIs) is an important skill for image processing.
Array slicing (manually extracting an ROI)
# extract a 100x100 pixel square ROI (Region of Interest) from the
# input image starting at x=320,y=60 and ending at x=420,y=160
roi = image[60:160, 320:420]
cv2.imshow("ROI", roi)
cv2.waitKey(0)
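One detail worth knowing when cropping this way: a NumPy slice is a view of the original array, not a copy. The sketch below uses a synthetic array in place of the loaded photo to show the slice order (rows first, then columns) and when a `.copy()` is needed.

```python
import numpy as np

# Synthetic 480x640 BGR image standing in for the loaded photo.
image = np.zeros((480, 640, 3), dtype=np.uint8)

# image[startY:endY, startX:endX] -> rows 60..159, columns 320..419.
roi = image[60:160, 320:420]
print(roi.shape)  # (100, 100, 3)

# A slice is a view: writing to it also changes the original image.
roi[:] = 255
print(image[60, 320].tolist())  # [255, 255, 255]

# Take a copy when the crop must be independent of the source.
independent = image[60:160, 320:420].copy()
independent[:] = 0
print(image[60, 320].tolist())  # still [255, 255, 255]
```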
Resizing images
- Resize a large image to fit on your screen.
- Image processing is faster on smaller images.
- In deep learning, images are often resized so that the input volume matches the fixed size a network expects.
Resize our original image to 200 x 200 pixels
# resize the image to 200x200px, ignoring aspect ratio
resized = cv2.resize(image, (200, 200))
cv2.imshow("Fixed Resizing", resized)
cv2.waitKey(0)
Let’s calculate the aspect ratio of the original image and use it to resize the image so that it doesn’t appear squished and distorted.
Here 300 is the target width in pixels.
# fixed resizing distorts the aspect ratio, so let's resize the width
# to be 300px but compute the new height based on the aspect ratio
r = 300.0 / w  # ratio of the new width to the old width
dim = (300, int(h * r))  # dimensions of the new image as (width, height)
resized = cv2.resize(image, dim)
cv2.imshow("Aspect Ratio Resize", resized)
cv2.waitKey(0)
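The same arithmetic can be wrapped in a small function. This is a minimal sketch: `scaled_dim` is a hypothetical helper name used only for illustration and is not part of OpenCV or imutils.

```python
def scaled_dim(w, h, new_width):
    """Return (width, height) for a resize that keeps the aspect ratio.

    Hypothetical helper, not part of OpenCV or imutils.
    """
    r = new_width / float(w)        # ratio of the new width to the old width
    return (new_width, int(h * r))  # cv2.resize expects (width, height)

# A 600x400 image scaled to a width of 300 keeps its 3:2 shape.
print(scaled_dim(600, 400, 300))  # (300, 200)
```

The returned tuple can be passed as the second argument of cv2.resize, in the same way as `dim` above.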
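A simpler way: use the built-in imutils function
# manually computing the aspect ratio can be a pain so let's use the
# imutils library instead
resized = imutils.resize(image, width=300)  # pass the target width or height as a keyword argument
cv2.imshow("Imutils Resize", resized)
cv2.waitKey(0)
In short: cv2.resize() needs both the width and the height, whereas imutils.resize() accepts just one of them and keeps the aspect ratio.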
Rotating an image
# let's rotate an image 45 degrees clockwise using OpenCV by first
# computing the image center, then constructing the rotation matrix,
# and then finally applying the affine warp
center = (w // 2, h // 2) #calculate the center
# use // to perform integer math
M = cv2.getRotationMatrix2D(center, -45, 1.0)  # calculate a rotation matrix
# cv2.getRotationMatrix2D() builds the affine transformation matrix for a rotation
# center is the rotation center as an (x, y) tuple, here the center of the image
# -45: rotate the image 45 degrees clockwise (a negative angle means clockwise)
# 1.0 is the scale factor; 1.0 keeps the image at its original size
# positive angles are counterclockwise and negative angles are clockwise
rotated = cv2.warpAffine(image, M, (w, h))  # warp the image using the rotation matrix
cv2.imshow("OpenCV Rotation", rotated)
cv2.waitKey(0)
Simplified with imutils
# rotation can also be easily accomplished via imutils with less code
rotated = imutils.rotate(image, -45)
cv2.imshow("Imutils Rotation", rotated)
cv2.waitKey(0)
However, why in the world is the image clipped?
Showing the entire image after rotation
# OpenCV doesn't "care" if our rotated image is clipped after rotation
# so we can instead use another imutils convenience function to help
# us out
rotated = imutils.rotate_bound(image, 45)
cv2.imshow("Imutils Bound Rotation", rotated)
cv2.waitKey(0)
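The clipping happens because the output canvas passed to cv2.warpAffine stays at (w, h), while a rotated rectangle needs a larger bounding box. The sketch below only illustrates that geometry; `bound_size` is a hypothetical helper, and the assumption that a bounds-preserving rotation enlarges the canvas along these lines is mine, not something stated in the notes.

```python
import math

def bound_size(w, h, angle_degrees):
    """Width and height of the box that fully contains a rotated w x h image.

    Hypothetical helper for illustration only.
    """
    theta = math.radians(angle_degrees)
    cos = abs(math.cos(theta))
    sin = abs(math.sin(theta))
    new_w = int(h * sin + w * cos)
    new_h = int(h * cos + w * sin)
    return (new_w, new_h)

# A 100x100 image rotated by 45 degrees needs a canvas of about 141x141,
# so the corners are cut off when the output stays at 100x100.
print(bound_size(100, 100, 45))  # (141, 141)
# With no rotation the size is unchanged.
print(bound_size(400, 300, 0))   # (400, 300)
```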
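Smoothing an image
- Blur an image to reduce high-frequency noise.
Blurring replaces each pixel with a weighted average of the region around it, which reduces noise and detail in the image and produces the blur effect.
# apply a Gaussian blur with an 11x11 kernel to the image to smooth it,
# useful when reducing high frequency noise
blurred = cv2.GaussianBlur(image, (11, 11), 0)  # Gaussian blur with an 11x11 kernel
# (11, 11) is the size of the Gaussian kernel, i.e. the dimensions of the
# convolution kernel used to compute the blurred value of each pixel
# from its surrounding region.
# 0 is the standard deviation of the Gaussian, which controls the amount
# of blur. Passing 0 makes the function derive a suitable standard
# deviation from the kernel size.
# Larger kernels would yield a more blurry image.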
cv2.imshow("Blurred", blurred)
cv2.waitKey(0)
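To make the "weighted average" and the "sigma of 0" remarks concrete, here is a NumPy-only sketch of a 1-D Gaussian kernel. `gaussian_kernel_1d` is a hypothetical helper; the sigma formula is the one the OpenCV documentation gives for a non-positive sigma, and the rest is a plain textbook Gaussian rather than OpenCV's actual implementation.

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma=0.0):
    """1-D Gaussian weights that sum to 1 (hypothetical helper)."""
    if sigma <= 0:
        # Formula given in the OpenCV documentation for a non-positive sigma.
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    x = np.arange(ksize) - (ksize - 1) / 2.0  # offsets from the center
    weights = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum(), sigma

kernel, sigma = gaussian_kernel_1d(11)
print(sigma)            # 2.0 for an 11-pixel kernel
print(kernel.argmax())  # 5: the center pixel has the largest weight
```

A larger kernel size gives a larger derived sigma, which matches the note that larger kernels blur more.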
Drawing on an image
- draw a rectangle, circle, and line on an input image
- overlay text on an image
- drawing operations on images are performed in-place.
- in this case, we make a copy of the original image storing the copy as output.
rectangle
# draw a 2px thick red rectangle surrounding the face
output = image.copy()
cv2.rectangle(output, (320, 60), (420, 160), (0, 0, 255), 2)  # using pre-calculated coordinates
# arguments: the image; the starting pixel coordinate (top-left); the ending pixel coordinate (bottom-right); a BGR color tuple; the line thickness (a negative value draws a solid rectangle)
cv2.imshow("Rectangle", output)
cv2.waitKey(0)
a solid blue circle
# draw a blue 20px (filled in) circle on the image centered at
# x=300,y=150
output = image.copy()
cv2.circle(output, (300, 150), 20, (255, 0, 0), -1)
# arguments: the image; the circle's center coordinate; the circle radius in pixels; the color; the line thickness (-1 fills the circle)
cv2.imshow("Circle", output)
cv2.waitKey(0)
a red line
# draw a 5px thick red line from x=60,y=20 to x=400,y=200
output = image.copy()
cv2.line(output, (60, 20), (400, 200), (0, 0, 255), 5)
cv2.imshow("Line", output)
cv2.waitKey(0)
putText
# draw green text on the image
output = image.copy()
cv2.putText(output, "OpenCV + Jurassic Park!!!", (10, 25),
cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
# arguments: the image; the text to draw; the pixel where the text starts; the font; the font size multiplier; the text color; the stroke thickness
cv2.imshow("Text", output)
cv2.waitKey(0)
Complete code for the steps above
import cv2
import os  # import the os module
os.chdir('/xxx/practice' )  # change the current working directory
image = cv2.imread("2.jpg")  # read the image
(h,w,d) =image.shape
cv2.imshow("Image",image)
cv2.waitKey(0)  # wait for a key press
(B,G,R)=image[100,50]  # 255 -> white; 0 -> black
print("R={},G={},B={}".format(R,G,B))
# regions of interest
roi = image[400:760,320:720]  # crop a region: image[startY:endY,startX:endX]
cv2.imshow("ROI",roi)
cv2.waitKey(0)
# resize the image
resized = cv2.resize(image,(800,800))
cv2.imshow("Fixed Resizing",resized)
cv2.waitKey(0)
# resize while keeping the aspect ratio
r = 800.0/w  # compute the scaling ratio
dim = (800,int(h*r))
resized = cv2.resize(image,dim)
cv2.imshow("Aspect Ratio Resize",resized)
cv2.waitKey(0)
# rotate the image
center = (w//2,h//2)  # use // to get integers
M = cv2.getRotationMatrix2D(center,-45,1.0)  # M is the rotation matrix computed by OpenCV; -45 means 45 degrees clockwise
rotated = cv2.warpAffine(image,M,(w,h))
cv2.imshow("Rotation Image",rotated)
cv2.waitKey(0)
# smooth (blur) the image
blurred = cv2.GaussianBlur(image,(11,11),0)
cv2.imshow("Blurred",blurred)
cv2.waitKey(0)
# draw on the image
output = image.copy()  # make a copy
cv2.rectangle(output,(320,60),(420,160),(0,0,255),2)
# arguments: image to draw on, top-left pixel where drawing starts, bottom-right pixel where drawing ends, BGR tuple, line thickness (a negative value gives a solid rectangle)
cv2.imshow("Rectangle",output)
cv2.waitKey(0)
# draw a solid blue circle
output2 = image.copy()
cv2.circle(output2,(300,150),20,(255,0,0),-1)
# arguments: image, circle center, circle radius, circle color, line thickness
cv2.imshow("Circle",output2)
cv2.waitKey(0)
# draw a red line
output3 = image.copy()
cv2.line(output3,(60,20),(400,200),(0,0,255),5)
cv2.imshow("Line",output3)
cv2.waitKey(0)
# overlay text
output4 = image.copy()
cv2.putText(output4,"xxxxxxxxxxxxx",(10,25),
cv2.FONT_HERSHEY_SIMPLEX,0.7,(0,255,0),2)
# arguments: image, text to draw, pixel where the text starts, font, font size multiplier, text color, thickness
cv2.imshow("Text",output4)
cv2.waitKey(0)
Counting objects
GOAL:
- Count the number of Tetris blocks
- Convert images to grayscale
- Perform edge detection
- Threshold a grayscale image
- Find, count, and draw contours
- Conduct erosion and dilation
- Mask an image
The input image path has to be supplied from outside the script, on the command line.
# import the necessary packages
import argparse  # command line argument parsing package
# argparse comes with all installations of Python
import imutils
import cv2
import os
os.chdir('/Users/huahua/Documents/code/py-practice' )  # change the current working directory
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()  # instantiate ArgumentParser as ap
ap.add_argument("-i", "--image", required=True,
help="path to input image")
# we add our only argument, --image
# we specify both the shorthand (-i) and longhand (--image) versions; either flag can be used on the command line
# the help string gives additional information and is shown when the script is run with --help in the terminal
args = vars(ap.parse_args())  # tell Python and the library to parse the command line arguments
# vars converts the parsed arguments into a Python dictionary: the keys are the argument names and the values are the values supplied on the command line
# display a friendly message to the user
print("Hi there {}, it's nice to meet you!".format(args["image"]))
# args["image"] holds the value passed to --image
# terminal input: $ python example.py --image YourNameHere
# output: Hi there YourNameHere, it's nice to meet you!
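The parsing step can be tried without a terminal by handing parse_args an explicit list. This is a minimal sketch; the file names a.png and b.png are placeholders.

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
                help="path to input image")

# Passing a list to parse_args stands in for typing the flags in a terminal.
args = vars(ap.parse_args(["--image", "a.png"]))
print(args)  # {'image': 'a.png'}

# The short flag fills the same dictionary key.
short = vars(ap.parse_args(["-i", "b.png"]))
print(short["image"])  # b.png
```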
Converting an image to grayscale
# args["image"]:the path to the input image
# load the input image (whose path was supplied via command line
# argument) and display the image to our screen
image = cv2.imread(args["image"])
cv2.imshow("Image", image)
cv2.waitKey(0)
# convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow("Gray", gray)
cv2.waitKey(0)
Run the following command in the terminal to get the grayscale image
python opencv-study2.py --image a.png
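Edge detection
- Edge detection is useful for finding the boundaries of objects in an image; it is effective for segmentation purposes.
- In this case, we use the Canny algorithm.
# edge detection
# by applying edge detection we can find the outlines of objects in
# images
edged = cv2.Canny(gray, 30, 150)  # Canny algorithm
# parameters: the grayscale image; the minimum threshold; the maximum threshold; the Sobel kernel size (default 3)
# Different values for the minimum and maximum thresholds will return different edge maps.
# Canny is a multi-stage process that includes gradient computation,
# non-maximum suppression, and double thresholding.
# The threshold arguments control the sensitivity and accuracy of the detection.
# 30 is the low threshold: gradient values below it are treated as non-edges.
# 150 is the high threshold: gradient values above it are treated as strong edges.
# Gradient values between the two thresholds count as edges only if they
# are connected to a strong edge; otherwise they are treated as non-edges.
# Lower thresholds produce more edges but may include more noise.
# Higher thresholds filter out weaker edges but may lose some edge detail.
# Adjust the thresholds to get a good edge detection result.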
cv2.imshow("Edged", edged)
cv2.waitKey(0)
- Notice how the edges of the Tetris blocks themselves are revealed along with the sub-blocks that make up each Tetris block.
Thresholding
- Thresholding can help us to remove lighter or darker regions and contours of images.
# thresholding
# threshold the image by setting all pixel values less than or equal to
# 225 to 255 (white; foreground) and all pixel values greater than 225
# to 0 (black; background), thereby segmenting the image
thresh = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)[1]
# cv2.THRESH_BINARY_INV is the inverse of cv2.THRESH_BINARY: pixels above
# the threshold become 0 and all other pixels become the maximum value.
# Pixels in the gray image greater than 225 are set to 0 (black),
# which corresponds to the background of the image.
# Pixels at or below 225 are set to 255 (white), which corresponds to
# the foreground of the image (i.e., the Tetris blocks themselves).
# cv2.threshold returns a tuple; [1] selects the thresholded binary image.
# In the result, the bright regions are the objects and the dark regions are the background.
cv2.imshow("Thresh", thresh)
cv2.waitKey(0)
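The inverse-binary rule is easy to misread, so here is a NumPy-only sketch of the same rule applied to a tiny hand-made array. It is meant as an equivalent of the thresholding rule for illustration, not as a replacement for cv2.threshold.

```python
import numpy as np

# Tiny grayscale sample with values on both sides of the threshold,
# including 225 itself.
gray = np.array([[10, 225],
                 [226, 255]], dtype=np.uint8)

# Same rule as cv2.THRESH_BINARY_INV with thresh=225 and maxval=255:
# values above 225 become 0, everything else becomes 255.
thresh = np.where(gray > 225, 0, 255).astype(np.uint8)
print(thresh.tolist())  # [[255, 255], [0, 0]]
```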
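Detecting and drawing contours
# use the thresholded image to detect the contours in the image,
# finding all foreground (white) pixels
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_SIMPLE)
# cv2.findContours() is OpenCV's contour detection function.
# cv2.RETR_EXTERNAL is the retrieval mode: only external contours are detected.
# cv2.CHAIN_APPROX_SIMPLE is the contour approximation method: it
# simplifies the boundary points of each contour to save memory.
# The call detects the contours in the binary input image; the list of
# contours is part of the value it returns.
cnts = imutils.grab_contours(cnts)  # compatibility line (different OpenCV versions)
# imutils.grab_contours() is a helper from the imutils library that
# extracts the contour list from the value returned by
# cv2.findContours() and assigns it back to cnts.
# This keeps the way contours are obtained consistent across OpenCV versions.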
output = image.copy()
# loop over the contours
for c in cnts:  # we draw each 'c' from the 'cnts' list on the image
    # draw each contour on the output image with a 3px thick purple
    # outline, then display the output contours one at a time
    cv2.drawContours(output, [c], -1, (240, 0, 159), 3)
    # cv2.drawContours draws on output, the image passed as the first argument.
    # [c] is a list holding the contour to draw; -1 means draw every
    # contour in that list.
    # (240, 0, 159) is the outline color in BGR order (purple); 3 is the
    # line thickness.
    cv2.imshow("Contours", output)
    cv2.waitKey(0)  # a key press is needed after each contour, so they are drawn one by one
# draw the total number of contours found in purple
text = "I found {} objects!".format(len(cnts))
# a text string containing the number of shape contours.
# len(cnts): counting the total number of objects in this image
cv2.putText(output, text, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 0.7,
(240, 0, 159), 2)
cv2.imshow("Contours", output)
cv2.waitKey(0)
Erosions and dilations
- Erosions and dilations are typically used to reduce noise in binary images (a side effect of thresholding).
- To reduce the size of foreground objects we can erode away pixels over a given number of iterations.
Erosion: eroding the contours makes them smaller or, with enough iterations, makes them disappear; this is typically useful for removing small blobs from a mask image. Use cv2.erode.
Erosion is a morphological image operation: it compares each pixel with its surrounding pixels and updates the pixel's value according to the erosion kernel. It can be used to remove noise from an image, shrink objects, or break connections between objects.
# erosion and dilation
# we apply erosions to reduce the size of foreground objects
mask = thresh.copy()
mask = cv2.erode(mask, None, iterations=5)
# cv2.erode() is OpenCV's erosion function.
# None means the default erosion kernel is used; a custom kernel can be passed instead.
# iterations is the number of times erosion is applied; here 5
# iterations reduce the contour sizes.
cv2.imshow("Eroded", mask)
cv2.waitKey(0)
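# Repeated erosion shrinks or separates the white regions of the image.
# Effect: the white regions get smaller, shrinking inward from their edges.
Dilation: to connect nearby contours, dilate the image with cv2.dilate.
Dilation is a morphological image operation: it compares each pixel with its surrounding pixels and updates the pixel's value according to the dilation kernel. It can be used to fill holes in an image, enlarge objects, or connect neighboring objects.
# dilation enlarges the foreground regions
# similarly, dilations can increase the size of the foreground objects
mask = thresh.copy()
mask = cv2.dilate(mask, None, iterations=5)
# cv2.dilate() is OpenCV's dilation function.
# None means the default dilation kernel is used; a custom kernel can be passed instead.
# Repeated dilation enlarges or connects the white regions of the image.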
cv2.imshow("Dilated", mask)
cv2.waitKey(0)
Masking and bitwise operations
Masks allow us to “mask out” regions of an image we are uninterested in. We call them “masks” because they hide regions of images we do not care about.
When the thresholded image is used as the mask on our original image, the colored regions reappear while the rest of the image is “masked out”.
- Masking: selecting specific regions or pixels of an image according to a binary mask image. The binary mask is applied to the original image so that only the pixels whose mask value is non-zero are kept in the output. It is used for region-of-interest extraction, pixel selection, and mask compositing.
- Bitwise operations: bit-by-bit operations on pixels, namely AND, OR, XOR, and NOT. They are typically used for merging, separating, masking, and repairing images, and give fine-grained, pixel-level control.
# a typical operation we may want to apply is to take our mask and
# apply a bitwise AND to our input image, keeping only the masked
# regions
mask = thresh.copy()
output = cv2.bitwise_and(image, image, mask=mask)  # bitwise AND of the image with itself, restricted by the mask
# cv2.bitwise_and() is OpenCV's bitwise AND function.
# mask is the binary image used for masking; it decides which pixels are kept.
# ANDing image with itself leaves its pixel values unchanged, so the
# mask alone decides the result: where the mask is white (non-zero) the
# original pixel is kept, and everywhere else the output pixel is set
# to black (zero).
cv2.imshow("Output", output)
cv2.waitKey(0)
# Masking extracts the region of interest from the original image:
# only the original pixels under the white part of the mask are kept.
New output: the background is black now and our foreground consists of colored pixels, that is, the pixels selected by our mask image.
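The keep-or-zero rule can be reproduced with plain NumPy on a tiny example. This is an illustrative equivalent of the masking step on synthetic data, not the cv2.bitwise_and call itself.

```python
import numpy as np

# Synthetic 2x2 BGR image where every pixel has the same purple color.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[:] = (240, 0, 159)

# Binary mask: keep the left column, hide the right column.
mask = np.array([[255, 0],
                 [255, 0]], dtype=np.uint8)

# Keep the original pixel where the mask is non-zero, zero elsewhere.
output = np.where(mask[:, :, None] > 0, image, 0).astype(np.uint8)

print(output[0, 0].tolist())  # [240, 0, 159]
print(output[0, 1].tolist())  # [0, 0, 0]
```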