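I have long used OpenCV from C++. I have read some Python OpenCV programs but never written one myself, so here is a summary.
Image format
OpenCV's Python API is built on NumPy, and its core data structure is the ndarray. Image data is usually either single-channel (grayscale images, binary images, etc.) or three-channel (color images). The figure below illustrates the data structure:
Load the image below and inspect it:
Code: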
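import numpy as np  # import NumPy
import cv2  # import OpenCV
img = cv2.imread("1.jpg")
assert img is not None, "could not read 1.jpg"  # imread returns None on failure
print("dtype", "size = rows (h) * cols (w) * channels", "shape", "ndim")
print(img.dtype,'\t',img.size,'\t',img.shape,'\t',img.ndim)
cv2.imshow("original image",img)
cv2.waitKey(0)
cv2.destroyAllWindows()  # close the windows; otherwise the program may hang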
Result:
Generate an image ourselves to verify:
img = np.zeros((320,240),dtype=np.uint8)  # unsigned 8-bit integers; 320 rows (h) by 240 columns (w)
img = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)  # expand the single channel to three channels
print("dtype", "size = rows (h) * cols (w) * channels", "shape", "ndim")
print(img.dtype,'\t',img.size,'\t',img.shape,'\t',img.ndim)
cv2.imshow("black image",img)
cv2.waitKey(0)
cv2.destroyAllWindows()
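Result:
About the pixel data format:
- Images are organized as H*W*C (rows, columns, channels)
- Pixel values lie in [0,255] for the default 8-bit images
- An unsigned data type (uint8) is used
- Color (RGB) images are stored with their channels in BGR order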
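The points in this list can be checked with NumPy alone. The following is a minimal sketch rather than anything OpenCV-specific; the 320x240 size is carried over from the example above, and the variable names are only illustrative.

```python
import numpy as np

# A 320-row (h) by 240-column (w) image with 3 channels, stored as H*W*C.
img = np.zeros((320, 240, 3), dtype=np.uint8)

# Channel order is B, G, R, so index 2 is the red channel.
img[:, :, 2] = 255  # every pixel becomes pure red

print(img.dtype)            # uint8
print(img.shape)            # (320, 240, 3)
print(img.size)             # 230400 = 320 * 240 * 3
print(img[0, 0].tolist())   # [0, 0, 255], i.e. B=0, G=0, R=255
```

Note that size counts array elements, so for a three-channel image it is three times the number of pixels.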
The code above uses the imread function:
retval = cv.imread( filename[, flags] )
The flags are the same in C++ and Python; the C++ enum is as follows:
//! Imread flags
enum ImreadModes {
IMREAD_UNCHANGED = -1, //!< If set, return the loaded image as is (with alpha channel, otherwise it gets cropped).
IMREAD_GRAYSCALE = 0, //!< If set, always convert image to the single channel grayscale image.
IMREAD_COLOR = 1, //!< If set, always convert image to the 3 channel BGR color image.
IMREAD_ANYDEPTH = 2, //!< If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
IMREAD_ANYCOLOR = 4, //!< If set, the image is read in any possible color format.
IMREAD_LOAD_GDAL = 8, //!< If set, use the gdal driver for loading the image.
IMREAD_REDUCED_GRAYSCALE_2 = 16, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/2.
IMREAD_REDUCED_COLOR_2 = 17, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/2.
IMREAD_REDUCED_GRAYSCALE_4 = 32, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/4.
IMREAD_REDUCED_COLOR_4 = 33, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/4.
IMREAD_REDUCED_GRAYSCALE_8 = 64, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/8.
IMREAD_REDUCED_COLOR_8 = 65, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/8.
IMREAD_IGNORE_ORIENTATION = 128 //!< If set, do not rotate the image according to EXIF's orientation flag.
};
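Single-pixel operations
image = cv2.imread("1.jpg")
grayImage = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
# For a grayscale image
grayImage[0,0] = 255  # paint the top-left pixel white
# Reading: grayImage.item(0,0); with NumPy before 2.0, grayImage.itemset((0,0),255) also writes the value
# For a color image, set the blue channel to 255
image[0,0,0]=255
# Reading: image.item(0,0,0); with NumPy before 2.0, image.itemset((0,0,0),255) also writes the value
# Paint the top-left pixel white
image[0,0]=[255,255,255]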
Modifying channels
Slicing in Python and NumPy makes the code very elegant:
# Set the green channel to 0
image[:,:,1]=0
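The same slicing idea covers a few other channel operations. The sketch below uses a tiny synthetic image so it runs without any file on disk; the pixel values are arbitrary.

```python
import numpy as np

# Synthetic 2x2 BGR image: every pixel is (B=10, G=20, R=30).
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[:, :] = (10, 20, 30)

# Keep a copy of the green channel before zeroing it.
green = image[:, :, 1].copy()
image[:, :, 1] = 0  # green channel set to 0 for all pixels

# Reversing the last axis turns BGR into RGB (a view, no data copied).
rgb = image[:, :, ::-1]

print(green.tolist())        # [[20, 20], [20, 20]]
print(image[0, 0].tolist())  # [10, 0, 30]
print(rgb[0, 0].tolist())    # [30, 0, 10]
```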
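Extracting an ROI
In C++ an ROI is obtained through Rect:
// The four Rect arguments are x, y, width, height; (x, y) is the top-left corner of the rectangle
imageROI = image(Rect(500, 250, 100, 150));
Python again uses slicing, with rows (y) first and columns (x) second:
imgROI = image[250:400,500:600]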
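Since Rect takes (x, y, width, height) while NumPy indexes rows before columns, it is easy to swap the two. A minimal sketch of the mapping follows; roi_from_rect is a hypothetical helper written for this post, not an OpenCV function, and the 480x640 image size is an assumption.

```python
import numpy as np

def roi_from_rect(image, x, y, width, height):
    """Hypothetical helper: Rect(x, y, width, height) as a NumPy slice."""
    # Rows are indexed by y and columns by x, so y comes first.
    return image[y:y + height, x:x + width]

image = np.zeros((480, 640, 3), dtype=np.uint8)  # 480 rows, 640 columns
roi = roi_from_rect(image, 500, 250, 100, 150)
print(roi.shape)  # (150, 100, 3)

# The slice is a view: writing to it changes the original image.
roi[:] = 255
print(image[250, 500].tolist())  # [255, 255, 255]
print(image[249, 500].tolist())  # [0, 0, 0]
```

If an independent copy is wanted, calling .copy() on the slice avoids modifying the source image.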
Reading and writing video
In C++:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/videoio.hpp>
using namespace cv;
int main()
{
    VideoCapture capture("1.mp4");
    // The size given to the writer must match the frames written to it
    VideoWriter writer("VideoTest.avi", VideoWriter::fourcc('M', 'J', 'P', 'G'),
                       25.0, Size(640, 480), true);
    Mat frame;
    while (capture.isOpened())
    {
        capture >> frame;
        if (frame.empty())  // no more frames
            break;
        // process the frame
        // ...
        writer << frame;
        imshow("video", frame);
        if (waitKey(20) == 27)  // Esc
        {
            break;
        }
    }
    return 0;
}
Python is similar:
# Read an mp4 video frame by frame and write it to another file using YUV color encoding (I420)
videoCapture=cv2.VideoCapture('1.mp4')
fps=videoCapture.get(cv2.CAP_PROP_FPS)
print(fps)
size=(int(videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
      int(videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
videoWriter=cv2.VideoWriter(
    'MyOutputVid.avi',cv2.VideoWriter_fourcc('I','4','2','0'),fps,size)
success,frame=videoCapture.read()
print(success)
while success:  # loop until there are no more frames
    # process the frame
    # ...
    videoWriter.write(frame)
    success,frame=videoCapture.read()
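Mainstream formats currently supported (actual availability depends on the codecs installed on the system):
- I420 -> .avi  # uncompressed YUV color encoding
- PIM1 -> .avi  # MPEG-1 encoding
- XVID -> .avi  # MPEG-4 encoding
- THEO -> .ogv  # Ogg Theora
- FLV1 -> .flv  # Flash video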
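Differences between I420 and MPEG:
- I420: video in an uncompressed image format (raw frames). It uses few system resources because nothing has to be decoded, and it needs no decoder. The drawbacks are a somewhat lower frame rate (limited by the bandwidth USB allocates) and large output files.
- MPEG: comparable to JPEG image compression. The advantage is a high frame rate (video starts quickly, exposure is fast). The drawbacks are blocky artifacts in the image and the need for a decoder, which consumes PC system resources.
MPEG-1 is the video compression standard for VCD; MPEG-2 is the standard for DVD and Super VCD; MPEG-4 is one of the video compression standards for network video.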
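To put a number on "large output files", here is a rough calculation of raw I420 data rates. I420 stores a full-resolution Y plane plus U and V planes at half resolution in each direction, which comes to 12 bits per pixel. The 640x480 size and 25 fps are taken from the C++ writer example above and are only example values.

```python
def i420_bytes_per_frame(width, height):
    """Raw I420 frame size: full-resolution Y plus quarter-size U and V planes."""
    y_plane = width * height
    u_plane = (width // 2) * (height // 2)
    v_plane = (width // 2) * (height // 2)
    return y_plane + u_plane + v_plane

width, height, fps = 640, 480, 25  # assumed example values
per_frame = i420_bytes_per_frame(width, height)
per_second = per_frame * fps

print(per_frame)              # 460800
print(per_second)             # 11520000
print(per_second * 10 / 1e6)  # 115.2 (megabytes for 10 s, ignoring container overhead)
```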
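Capturing from a camera
Replace the video filename with a device index to use a camera.
C++:
VideoCapture capture(0); // on a laptop, 0 usually opens the built-in camera and 1 an external camera
Python:
cameraCapture=cv2.VideoCapture(0)
Example: capture 10 s of video from the camera and save it to an avi file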
cameraCapture=cv2.VideoCapture(0)
fps=30  # the frame rate here is an assumed value
# fps=cameraCapture.get(cv2.CAP_PROP_FPS) sometimes returns 0, which means the device does not support this property; in that case the rate can be measured with a counter
size=(int(cameraCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
      int(cameraCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
videoWriter=cv2.VideoWriter(
    'MyOutputVid.avi',cv2.VideoWriter_fourcc('I','4','2','0'),fps,size)
success,frame=cameraCapture.read()
numFramesRemaining=10*fps-1
while success and numFramesRemaining > 0:
    videoWriter.write(frame)
    success,frame=cameraCapture.read()
    numFramesRemaining -= 1
cameraCapture.release()
videoWriter.release()
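When CAP_PROP_FPS returns 0, the frame rate can be estimated by counting frames over a measured time span. The following is one possible sketch: measure_fps is a hypothetical helper, and fake_read stands in for a real camera so the code runs without hardware. With a real device, cameraCapture.read would be passed instead.

```python
import time

def measure_fps(read_frame, num_frames=30):
    """Estimate frames per second by timing num_frames successful reads.

    read_frame is any callable returning (success, frame).
    """
    start = time.perf_counter()
    count = 0
    while count < num_frames:
        success, _ = read_frame()
        if not success:
            break
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0 or elapsed <= 0:
        return 0.0
    return count / elapsed

# Stand-in for a camera: each "read" takes about 10 ms and always succeeds.
def fake_read():
    time.sleep(0.01)
    return True, None

estimated = measure_fps(fake_read, num_frames=20)
print(round(estimated))  # at most 100, usually a little below
```

The first few frames from a real camera are often slower than steady state, so discarding them before timing may give a better estimate.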
Mouse events
As in C++, a callback function is used.
Straight to an example:
# Show camera frames in real time; stop on a mouse click or any key press
import numpy as np  # import NumPy
import cv2  # import OpenCV
clicked=False
def onMouse(event,x,y,flags,param):
    global clicked  # declare that the module-level variable is assigned here
    if event==cv2.EVENT_LBUTTONUP:
        clicked=True
cameraCapture=cv2.VideoCapture(0)
cv2.namedWindow('MyVideo')
cv2.setMouseCallback('MyVideo',onMouse)
print('show camera feed. click window or press any key to stop.')
success,frame=cameraCapture.read()
while success and cv2.waitKey(1) == -1 and not clicked:
    cv2.imshow('MyVideo',frame)
    success,frame=cameraCapture.read()
cv2.destroyWindow('MyVideo')
cameraCapture.release()
The global and nonlocal keywords:
- Scope in Python and the use of global
https://www.cnblogs.com/summer-cool/p/3884595.html
- The difference between the global and nonlocal keywords in Python
https://blog.csdn.net/xCyansun/article/details/79672634
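The argument of waitKey is the time to wait for a key press, in milliseconds (0 or a negative value waits indefinitely). The return value is -1 (no key was pressed) or an ASCII code (for example, 27 for the Esc key).
Python provides the ord function, which converts a character to its ASCII code. For example, ord('a') returns 97.
On some systems the return value of waitKey may be larger than the ASCII code. Keeping only the lowest byte of the return value ensures that just the ASCII code is obtained: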
keycode=cv2.waitKey(1)
if keycode!=-1:
    keycode&=0xFF
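The masking step can be tried without a window. In the sketch below, decode_key is a hypothetical helper, and the raw values such as 0x100071 are made-up examples of a return value with extra high bits set.

```python
def decode_key(raw):
    """Return the ASCII code from a waitKey result, or None if no key was pressed."""
    if raw == -1:
        return None
    return raw & 0xFF  # keep only the lowest byte

print(decode_key(-1))                    # None
print(decode_key(27))                    # 27 (Esc)
print(decode_key(0x100071) == ord('q'))  # True: the high bits are discarded
print(chr(decode_key(0x100061)))         # a
```

Checking for -1 before masking matters, because -1 & 0xFF is 255 and would otherwise look like a key code.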
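The Python OpenCV callback is onMouse(event,x,y,flags,param)
The C++ callback is void onMouse(int event, int x, int y, int flags, void* param);
Both take the same parameters:
- param: the user data passed in the cv2.setMouseCallback call
- x, y: coordinates of the mouse pointer in the image coordinate system
- event: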
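enum
{
    CV_EVENT_MOUSEMOVE =0,      // mouse move
    CV_EVENT_LBUTTONDOWN =1,    // left button pressed
    CV_EVENT_RBUTTONDOWN =2,    // right button pressed
    CV_EVENT_MBUTTONDOWN =3,    // middle button pressed
    CV_EVENT_LBUTTONUP =4,      // left button released
    CV_EVENT_RBUTTONUP =5,      // right button released
    CV_EVENT_MBUTTONUP =6,      // middle button released
    CV_EVENT_LBUTTONDBLCLK =7,  // left button double-click
    CV_EVENT_RBUTTONDBLCLK =8,  // right button double-click
    CV_EVENT_MBUTTONDBLCLK =9   // middle button double-click
};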
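- flags:
enum
{
    CV_EVENT_FLAG_LBUTTON =1,    // left button held down (dragging)
    CV_EVENT_FLAG_RBUTTON =2,    // right button held down (dragging)
    CV_EVENT_FLAG_MBUTTON =4,    // middle button held down (dragging)
    CV_EVENT_FLAG_CTRLKEY =8,    // CTRL pressed
    CV_EVENT_FLAG_SHIFTKEY =16,  // SHIFT pressed
    CV_EVENT_FLAG_ALTKEY =32     // ALT pressed
};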
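Because the flag values are powers of two, several of them can be combined in a single flags argument and tested with a bitwise AND. A small sketch of decoding such a value follows; describe_flags and FLAG_NAMES are hypothetical names, and the numeric values are copied from the enum above.

```python
# Values copied from the flag enum above.
FLAG_NAMES = {
    1: "LBUTTON",
    2: "RBUTTON",
    4: "MBUTTON",
    8: "CTRLKEY",
    16: "SHIFTKEY",
    32: "ALTKEY",
}

def describe_flags(flags):
    """Hypothetical helper: list the names of all bits set in flags."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]

print(describe_flags(0))            # []
print(describe_flags(1 | 8))        # ['LBUTTON', 'CTRLKEY']
print(describe_flags(2 | 16 | 32))  # ['RBUTTON', 'SHIFTKEY', 'ALTKEY']
```

Inside a callback, a test such as flags & 8 would tell whether CTRL was held while the event fired.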
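An explanation of callback functions on Zhihu:
https://www.zhihu.com/question/19801131
OpenCV itself does not provide any way to handle window events. For example, clicking a window's close button does not close the application. Because of OpenCV's limited event-handling and GUI capabilities, many developers prefer to integrate OpenCV into another application framework.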
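Other references
- OpenCV-Python Tutorials (pick the documentation that matches your python-opencv version)
https://docs.opencv.org/3.4.2/d6/d00/tutorial_py_root.html
- OpenCV 3 Computer Vision with Python (translated from the 2nd edition of the original book)
- How image data is organized when reading (manipulating) images with OpenCV from Python
https://blog.csdn.net/sinat_34165087/article/details/83051152
- How do Python's various imread functions differ in implementation and read speed?
https://www.zhihu.com/question/48762352?from=profile_question_card
- Reading video, opening a camera, saving video, and reversing video with OpenCV
https://blog.csdn.net/KeZeng2015/article/details/80309105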