Python OpenCV Basics

Article link: https://blog.csdn.net/weixin_36049506/article/details/86107532

I have always used C++ for OpenCV development. I had read some Python OpenCV programs but never written one myself, so this post summarizes the basics.

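Python OpenCV basics

Image format
Single-pixel operations
Modifying channels
Extracting an ROI
Reading/writing video
Capturing from a camera
Mouse events
Other references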

Image format

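OpenCV's Python API is built on NumPy, and its core data structure is the ndarray. Image data is usually either single-channel (grayscale images, binary images, and so on) or three-channel (color images). The figure below illustrates the data structure:
[Figure: image data structure]
Load the image below and inspect it:

Code: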

import numpy as np   # import NumPy
import cv2    # import OpenCV

img = cv2.imread("1.jpg")
print("dtype", "size = rows (h) * cols (w) * channels", "shape", "ndim")
print(img.dtype, '\t', img.size, '\t', img.shape, '\t', img.ndim)
cv2.imshow("original image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()  # close the windows, otherwise the program may hang

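Result: the script prints the dtype, element count, shape, and number of dimensions of the loaded image.

Generate an image ourselves to verify: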

img = np.zeros((320, 240), dtype=np.uint8)  # 320 rows, 240 columns, unsigned 8-bit integers
img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
print("dtype", "size = rows (h) * cols (w) * channels", "shape", "ndim")
print(img.dtype, '\t', img.size, '\t', img.shape, '\t', img.ndim)
cv2.imshow("black image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result: the dtype is uint8, the size is 230400 (320 * 240 * 3), the shape is (320, 240, 3), and ndim is 3.

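About the pixel data format:

Images are laid out as H*W*C (height, width, channels)
Pixel values of 8-bit images lie in [0, 255]
An unsigned data type is used (uint8)
Color images store their channels in BGR order, not RGB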
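
The four points above can be checked on a tiny synthetic array. This is a minimal sketch that uses only NumPy; the array contents are made up for illustration and do not come from the article's image.

```python
import numpy as np

# A synthetic 2x3 color image: 2 rows (H), 3 columns (W), 3 channels (C).
img = np.zeros((2, 3, 3), dtype=np.uint8)
height, width, channels = img.shape

# OpenCV stores color pixels as B, G, R, so index 0 is the blue channel.
img[0, 0] = [255, 0, 0]  # pure blue in BGR order

# Reversing the last axis turns BGR into RGB (useful for RGB-based libraries).
rgb = img[:, :, ::-1]

# uint8 arithmetic wraps around modulo 256 instead of clipping at 255.
wrapped = np.array([250], dtype=np.uint8) + np.uint8(10)

print(height, width, channels)
print(rgb[0, 0], wrapped)
```

The wrap-around in the last step is a common source of bugs when brightening an image, so it is usually safer to convert to a wider type or clip explicitly before adding.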

The code above uses the imread function:
retval = cv.imread( filename[, flags] )
The flags are the same in C++ and Python. The C++ enum is:

//! Imread flags
enum ImreadModes {
       IMREAD_UNCHANGED            = -1, //!< If set, return the loaded image as is (with alpha channel, otherwise it gets cropped).
       IMREAD_GRAYSCALE            = 0,  //!< If set, always convert image to the single channel grayscale image.
       IMREAD_COLOR                = 1,  //!< If set, always convert image to the 3 channel BGR color image.
       IMREAD_ANYDEPTH             = 2,  //!< If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
       IMREAD_ANYCOLOR             = 4,  //!< If set, the image is read in any possible color format.
       IMREAD_LOAD_GDAL            = 8,  //!< If set, use the gdal driver for loading the image.
       IMREAD_REDUCED_GRAYSCALE_2  = 16, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/2.
       IMREAD_REDUCED_COLOR_2      = 17, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/2.
       IMREAD_REDUCED_GRAYSCALE_4  = 32, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/4.
       IMREAD_REDUCED_COLOR_4      = 33, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/4.
       IMREAD_REDUCED_GRAYSCALE_8  = 64, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/8.
       IMREAD_REDUCED_COLOR_8      = 65, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/8.
       IMREAD_IGNORE_ORIENTATION   = 128 //!< If set, do not rotate the image according to EXIF's orientation flag.
     };

Single-pixel operations

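image = cv2.imread("1.jpg")
grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Grayscale image
grayImage[0, 0] = 255  # paint the top-left pixel white
# Equivalent to grayImage.itemset((0, 0), 255); grayImage.item(0, 0) reads the value back
# (itemset is not available in NumPy 2.0 and later)

# Color image: set the blue channel of the top-left pixel to 255
image[0, 0, 0] = 255
# Equivalent to image.itemset((0, 0, 0), 255); image.item(0, 0, 0) reads the value back
# Paint the top-left pixel white
image[0, 0] = [255, 255, 255]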

Modifying channels

Python and NumPy slicing make this code very concise.

# Set the green channel to 0
image[:,:,1]=0
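
The same slicing idea extends to other channel edits. The following sketch runs on a synthetic image (the values are made up) and shows zeroing a channel, copying a channel out, and keeping only red.

```python
import numpy as np

# Synthetic 4x4 BGR image filled with a mid-gray value.
image = np.full((4, 4, 3), 128, dtype=np.uint8)

# Zero the green channel in place (index 1 in BGR order).
image[:, :, 1] = 0

# Copy the blue channel out; without .copy() the slice would be a view
# and later writes to `image` would show up in it.
blue = image[:, :, 0].copy()

# Keep only red by zeroing blue and green (channels 0 and 1) in a copy.
red_only = image.copy()
red_only[:, :, :2] = 0

print(image[0, 0], blue.shape, red_only[0, 0])
```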

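Extracting an ROI

In C++, an ROI is taken with a Rect:

// The four Rect arguments are: x, y, width, height; (x, y) is the top-left corner of the rectangle
imageROI = image(Rect(500, 250, 100, 150));

Python uses slicing again. Rows (y) come first and columns (x) second, so the same rectangle is:

imgROI = image[250:400, 500:600]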
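
Because NumPy indexes rows before columns, a Rect(x, y, width, height) corresponds to the slice image[y:y+height, x:x+width]. The helper below is a small sketch of that mapping; crop_rect is a hypothetical name, not an OpenCV function, and the image is synthetic.

```python
import numpy as np

def crop_rect(image, x, y, width, height):
    """Return the ROI for an (x, y, width, height) rectangle as a NumPy view."""
    return image[y:y + height, x:x + width]

# Synthetic BGR image with 600 rows and 800 columns.
image = np.zeros((600, 800, 3), dtype=np.uint8)

# Same rectangle as Rect(500, 250, 100, 150) in the C++ snippet.
roi = crop_rect(image, 500, 250, 100, 150)
print(roi.shape)  # (150, 100, 3): 150 rows (height) by 100 columns (width)

# The slice is a view: painting the ROI white also changes `image`.
roi[:] = 255
print(image[250, 500])  # [255 255 255]

# Use .copy() when the ROI must be independent of the source image.
roi_copy = crop_rect(image, 500, 250, 100, 150).copy()
```

Returning a view keeps cropping cheap, which matches how a C++ Mat ROI shares data with its parent.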

Reading/writing video

In C++:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

using namespace cv;

int main()
{
    VideoCapture capture("1.mp4");
    // The writer's frame size must match the size of the frames written to it
    VideoWriter writer("VideoTest.avi", VideoWriter::fourcc('M', 'J', 'P', 'G'),
        25.0, Size(640, 480), true);
    Mat frame;

    while (capture.isOpened())
    {
        capture >> frame;
        if (frame.empty())  // no more frames
        {
            break;
        }
        // Process the frame here
        // ...

        writer << frame;
        imshow("video", frame);
        if (waitKey(20) == 27)  // Esc stops playback
        {
            break;
        }
    }
    return 0;
}

Python is similar:

# Read an mp4 video frame by frame and write it to another file using the uncompressed YUV encoding (I420)
videoCapture = cv2.VideoCapture('1.mp4')
fps = videoCapture.get(cv2.CAP_PROP_FPS)
print(fps)
size = (int(videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
videoWriter = cv2.VideoWriter(
    'MyOutputVid.avi', cv2.VideoWriter_fourcc('I', '4', '2', '0'), fps, size)
success, frame = videoCapture.read()
print(success)
while success:  # loop until there are no more frames
    # Process the frame here
    # ...
    videoWriter.write(frame)
    success, frame = videoCapture.read()
videoCapture.release()
videoWriter.release()  # flush and close the output file
Commonly supported formats (FourCC code -> file extension):

I420 -> .avi  # uncompressed YUV encoding
PIM1 -> .avi  # MPEG-1 encoding
XVID -> .avi  # MPEG-4 encoding
THEO -> .ogv  # Ogg Theora
FLV1 -> .flv  # Flash video
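
A FourCC code is just four characters packed into a 32-bit integer, lowest byte first. The pure-Python sketch below reproduces that packing to make the codes less opaque; pack_fourcc and unpack_fourcc are hypothetical helper names, and the result should match what cv2.VideoWriter_fourcc returns for the same characters.

```python
def pack_fourcc(code):
    """Pack a four-character codec code into a little-endian 32-bit integer."""
    if len(code) != 4:
        raise ValueError("a FourCC code needs exactly four characters")
    value = 0
    for position, char in enumerate(code):
        value |= (ord(char) & 0xFF) << (8 * position)
    return value

def unpack_fourcc(value):
    """Recover the four characters from a packed FourCC integer."""
    return "".join(chr((value >> (8 * i)) & 0xFF) for i in range(4))

mjpg = pack_fourcc("MJPG")
print(mjpg)                                # 1196444237
print(unpack_fourcc(mjpg))                 # MJPG
print(unpack_fourcc(pack_fourcc("I420")))  # I420
```

The unpacking direction is handy for decoding the number returned by videoCapture.get(cv2.CAP_PROP_FOURCC) after converting it to int.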

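Differences between I420 and MPEG:

I420: an uncompressed video format (raw frames). It uses few system resources because no decoding is needed, and it requires no decoder. The drawbacks are a somewhat lower frame rate (limited by the bandwidth allocated over USB) and large output files.
MPEG: comparable to JPEG compression for still images. The advantage is a high frame rate (video starts quickly, fast exposure). The drawbacks are blocky artifacts in the picture and the need for a decoder, which consumes PC resources.

MPEG-1 is the video compression standard used for VCD; MPEG-2 is the standard for DVD and Super VCD; MPEG-4 is one of the video compression standards for network video.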

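Capturing from a camera

Replace the video file name with a device index to open a camera.
C++:

VideoCapture capture(0); // on a laptop, 0 usually opens the built-in camera and 1 an external one

Python:

cameraCapture=cv2.VideoCapture(0)

Example: capture 10 s of video from the camera and save it to an avi file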

cameraCapture = cv2.VideoCapture(0)
fps = 30  # the frame rate here is an assumed value
# cameraCapture.get(cv2.CAP_PROP_FPS) sometimes returns 0, which means the device does not support this property; in that case, measure the frame rate with a counter
size = (int(cameraCapture.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cameraCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
videoWriter = cv2.VideoWriter(
    'MyOutputVid.avi', cv2.VideoWriter_fourcc('I', '4', '2', '0'), fps, size)
success, frame = cameraCapture.read()
numFramesRemaining = 10 * fps - 1
while success and numFramesRemaining > 0:
    videoWriter.write(frame)
    success, frame = cameraCapture.read()
    numFramesRemaining -= 1
cameraCapture.release()
videoWriter.release()  # flush and close the output file
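
When the device does not report a frame rate, one option is to time a batch of reads and divide. The sketch below shows that counter approach; estimate_fps is a hypothetical helper, and fake_read stands in for cameraCapture.read so the example runs without a camera.

```python
import time

def estimate_fps(read_frame, num_frames=60, clock=time.perf_counter):
    """Estimate frames per second by timing num_frames calls to read_frame.

    read_frame must behave like VideoCapture.read and return (success, frame).
    """
    start = clock()
    grabbed = 0
    for _ in range(num_frames):
        success, _frame = read_frame()
        if not success:
            break
        grabbed += 1
    elapsed = clock() - start
    return grabbed / elapsed if elapsed > 0 else 0.0

def fake_read():
    """Stand-in for cameraCapture.read that takes about a millisecond."""
    time.sleep(0.001)
    return True, None

print(estimate_fps(fake_read, num_frames=20) > 0)
```

With a real device the call would be estimate_fps(cameraCapture.read), and the measured value could replace the assumed fps when creating the writer.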

Mouse events

As in C++, mouse events are handled with a callback function.
Here is an example:

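# Show camera frames live; stop on a mouse click or any key press
import numpy as np   # import NumPy
import cv2    # import OpenCV

clicked=False
def onMouse(event,x,y,flags,param):
    global clicked  # assign to the module-level variable rather than a new local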
    if event==cv2.EVENT_LBUTTONUP:
        clicked=True
        
cameraCapture=cv2.VideoCapture(0)
cv2.namedWindow('MyVideo')
cv2.setMouseCallback('MyVideo',onMouse)

print ('show camera feed. click window or press any key to stop.')
success,frame=cameraCapture.read()
while success and cv2.waitKey(1) == -1 and not clicked:
    cv2.imshow('MyVideo',frame)
    success,frame=cameraCapture.read()
    
cv2.destroyWindow('MyVideo')
cameraCapture.release()

The global and nonlocal keywords:

Scopes in Python and how to use global
https://www.cnblogs.com/summer-cool/p/3884595.html
The difference between the global and nonlocal keywords in Python
https://blog.csdn.net/xCyansun/article/details/79672634
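
As an alternative to a module-level flag, the click state can live in a closure and be updated with nonlocal. This is a sketch of that variant, not the article's method; make_click_tracker is a hypothetical name, and the event constant is written out as the literal 4 so the example runs without OpenCV.

```python
EVENT_LBUTTONUP = 4  # same value as cv2.EVENT_LBUTTONUP

def make_click_tracker():
    """Return a mouse callback plus a function that reports whether it fired."""
    clicked = False

    def on_mouse(event, x, y, flags, param):
        nonlocal clicked  # rebind the variable of the enclosing function
        if event == EVENT_LBUTTONUP:
            clicked = True

    def was_clicked():
        return clicked

    return on_mouse, was_clicked

on_mouse, was_clicked = make_click_tracker()
on_mouse(0, 10, 20, 0, None)                # mouse move: ignored
print(was_clicked())                        # False
on_mouse(EVENT_LBUTTONUP, 10, 20, 0, None)  # left button released
print(was_clicked())                        # True
```

In a real program on_mouse would be passed to cv2.setMouseCallback and the loop condition would call was_clicked() instead of reading a global.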

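The argument to waitKey is the time to wait for a key press, in milliseconds. The return value is -1 (no key was pressed) or the key's ASCII code (for example 27 for Esc).
Python's built-in ord function converts a character to its ASCII code; for example, ord('a') returns 97.
On some systems waitKey may return a value larger than the ASCII code. Reading only the lowest byte of the return value guarantees that you get just the ASCII code:

keycode=cv2.waitKey(1)
if keycode!=-1:
    keycode&=0xFF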
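
The masking step can be wrapped in a small helper so that "no key" and "some key" stay distinguishable. A minimal sketch follows; normalize_key is a hypothetical name and the raw values are made-up examples of what waitKey might return.

```python
def normalize_key(raw):
    """Return the low byte of a waitKey result, or None when no key was pressed."""
    if raw == -1:
        return None
    return raw & 0xFF

ESC = 27

print(normalize_key(-1))               # None
print(normalize_key(0x10001B) == ESC)  # True: high bits are discarded
print(normalize_key(ord('a')))         # 97
```

Checking for -1 before masking matters, because -1 & 0xFF is 255 and would otherwise look like a key press.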

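The Python OpenCV callback is onMouse(event,x,y,flags,param)
The C++ callback is void onMouse(int event, int x, int y, int flags, void* param);
Both take the same parameters:

param: the optional user data passed in the call to cv2.setMouseCallback
x, y: the mouse pointer coordinates in the image coordinate system
event: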

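enum
{
    CV_EVENT_MOUSEMOVE      =0, // mouse moved
    CV_EVENT_LBUTTONDOWN    =1, // left button pressed
    CV_EVENT_RBUTTONDOWN    =2, // right button pressed
    CV_EVENT_MBUTTONDOWN    =3, // middle button pressed
    CV_EVENT_LBUTTONUP      =4, // left button released
    CV_EVENT_RBUTTONUP      =5, // right button released
    CV_EVENT_MBUTTONUP      =6, // middle button released
    CV_EVENT_LBUTTONDBLCLK  =7, // left button double-clicked
    CV_EVENT_RBUTTONDBLCLK  =8, // right button double-clicked
    CV_EVENT_MBUTTONDBLCLK  =9  // middle button double-clicked
};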

flags:

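enum
{
    CV_EVENT_FLAG_LBUTTON   =1,  // left button held down (dragging)
    CV_EVENT_FLAG_RBUTTON   =2,  // right button held down (dragging)
    CV_EVENT_FLAG_MBUTTON   =4,  // middle button held down (dragging)
    CV_EVENT_FLAG_CTRLKEY   =8,  // Ctrl pressed
    CV_EVENT_FLAG_SHIFTKEY  =16, // Shift pressed
    CV_EVENT_FLAG_ALTKEY    =32  // Alt pressed
};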
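
The flag values are powers of two, so several can be set at once and each is tested with a bitwise AND. The sketch below decodes a flags value into names; the dictionary simply mirrors the enum above, and decode_flags is a hypothetical helper.

```python
# Bit values copied from the enum above (cv2.EVENT_FLAG_* use the same numbers).
EVENT_FLAGS = {
    1: "LBUTTON",
    2: "RBUTTON",
    4: "MBUTTON",
    8: "CTRLKEY",
    16: "SHIFTKEY",
    32: "ALTKEY",
}

def decode_flags(flags):
    """Return the names of all flag bits that are set in `flags`."""
    return [name for bit, name in EVENT_FLAGS.items() if flags & bit]

print(decode_flags(0))       # []
print(decode_flags(1 | 8))   # ['LBUTTON', 'CTRLKEY']
print(decode_flags(16 | 2))  # ['RBUTTON', 'SHIFTKEY']
```

Inside a callback, a test such as flags & 8 therefore tells whether Ctrl was held while the mouse event occurred.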

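An explanation of callback functions on Zhihu:
https://www.zhihu.com/question/19801131

OpenCV itself provides no way to handle window events. For example, clicking a window's close button does not close the application. Because of OpenCV's limited event handling and GUI capabilities, many developers prefer to integrate OpenCV into another application framework.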

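Other references

OpenCV-Python Tutorials (choose the documentation that matches your python-opencv version)
https://docs.opencv.org/3.4.2/d6/d00/tutorial_py_root.html
Learning OpenCV 3 Computer Vision with Python, Second Edition (Chinese translation)
How image data is organized when reading (and manipulating) images with OpenCV in Python
https://blog.csdn.net/sinat_34165087/article/details/83051152
How do Python's various imread functions differ in implementation and read speed?
https://www.zhihu.com/question/48762352?from=profile_question_card
Reading video, opening a camera, saving video, and reversing video with OpenCV
https://blog.csdn.net/KeZeng2015/article/details/80309105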