Extracting depth edges from a disparity image pair, and packing multiple images into x_train, y_train

tubesystem

Last modified: 2022-12-05 22:43:02

Category: Tools and Software. Tags: computer vision, deep learning, artificial intelligence

First published: 2022-11-12 23:07:25

Article link: https://blog.csdn.net/u012065954/article/details/127826978

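The disparity images (the stereo pair) are the x (feature) of the neural network, one image for each eye.

The computation takes the left-eye (or right-eye) image as the main view (the reference) and slides the other eye's image slowly and horizontally across it.

At certain shifts some regions coincide, which corresponds to both eyes fixating on the same object.

The horizontal shift at which a region coincides (the disparity) determines its depth. The formula also involves the focal length and the distance between the two eyes (the baseline).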
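The relationship between disparity and depth can be sketched as follows. This is a minimal example assuming the standard pinhole stereo model Z = f * B / d; the function name `depth_from_disparity` and the focal length and baseline values are illustrative assumptions, not values from this article.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z (metres) from disparity d (pixels): Z = f * B / d.

    focal_px is the focal length in pixels and baseline_m is the distance
    between the two cameras in metres.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero means the point is at infinity")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    FOCAL_PX = 400.0    # assumed focal length in pixels (hypothetical)
    BASELINE_M = 0.06   # assumed distance between the cameras in metres (hypothetical)
    for d in (3, 5, 99):
        z = depth_from_disparity(d, FOCAL_PX, BASELINE_M)
        print(f"disparity {d:3d} px -> depth {z:.3f} m")
```

A larger disparity gives a smaller depth, which matches the observation that near objects need a much larger shift before they coincide than distant ones.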

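(CSDN blog: the relationship between disparity and depth)

(Principles of binocular ranging, CSDN blog)

(A binocular distance measurement method based on YoloV3 deep learning)

(Primary visual cortex, Zhihu)

(Depth estimation, Zhihu)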

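Tip: to speed up the computation, convert the color images to grayscale, as shown below.

Tip: to speed up the computation, stack the shifted overlap images into a single 3D array. Here is a 2D slice of that 3D array:

Next, of the three dimensions (layer, row, column), slice along the row dimension and convolve each slice to find the coincidence positions. Here is a 2D slice:

The convolution kernel is the one named kernel in the program. It still has room for improvement.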
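The stacking and row slicing described above can be sketched with NumPy alone on a tiny synthetic pair. This is only an illustration under stated assumptions: `shift_left` and `build_cost_volume` are hypothetical helpers, the shift is a plain zero-filled column shift rather than cv2.warpAffine, and the best shift is picked by the smallest summed difference instead of the convolution kernel used in the program.

```python
import numpy as np


def shift_left(img, s):
    """Shift a 2D image s pixels to the left, filling the vacated columns with zeros."""
    out = np.zeros_like(img)
    if s == 0:
        out[:] = img
    else:
        out[:, :-s] = img[:, s:]
    return out


def build_cost_volume(main, other, shifts):
    """Stack |main - shifted(other)| for every shift into a (layers, rows, cols) array."""
    volume = np.zeros((len(shifts),) + main.shape, dtype=np.uint8)
    for layer, s in enumerate(shifts):
        # Subtract in int16 so that negative differences do not wrap around in uint8.
        diff = np.abs(main.astype(np.int16) - shift_left(other, s).astype(np.int16))
        volume[layer] = diff.astype(np.uint8)
    return volume


# Synthetic pair: the same bright bar appears 3 pixels further right in the left image.
imgR = np.zeros((4, 12), dtype=np.uint8)
imgL = np.zeros((4, 12), dtype=np.uint8)
imgR[:, 4:6] = 200
imgL[:, 7:9] = 200

volume = build_cost_volume(imgR, imgL, shifts=range(6))
row_slice = volume[:, 2, :]                       # 2D slice along the row dimension: (layers, cols)
best_shift = int(np.argmin(row_slice.sum(axis=1)))
print(volume.shape, best_shift)
```

In the slice, each row is one shift and each column is one image column, so a coincidence shows up as the place where the two difference traces of an edge cross.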

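Note the two small white dots near the top edge of the image:

Reducing the coincidence positions back to 2D gives the result, a depth map that can serve as the y (label) of the neural network. Unfortunately this simple algorithm only colors edges (the farther, the bluer; the nearer, the redder); filling in whole objects requires further processing.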
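The reduction from the 3D stack back to a 2D map can be sketched as below. This is a minimal NumPy illustration, assuming the strongest response along the layer axis marks the coincidence; `collapse_to_disparity` and `to_uint8` are hypothetical helper names, and the linear scaling to 0..255 is one possible choice rather than the fixed factor used in the program.

```python
import numpy as np


def collapse_to_disparity(volume, step=1):
    """Per pixel, pick the layer with the strongest response and convert it to a shift in pixels."""
    layer = np.argmax(volume, axis=0)   # index array with an integer dtype, not uint8
    return layer * step


def to_uint8(disparity, max_disparity):
    """Scale disparities linearly to 0..255 so they can be fed to a color map."""
    scaled = disparity.astype(np.float32) * (255.0 / max_disparity)
    return np.clip(scaled, 0, 255).astype(np.uint8)


# Tiny stack: 4 layers, 2 rows, 3 columns, with two responding pixels.
volume = np.zeros((4, 2, 3), dtype=np.uint8)
volume[1, 0, 0] = 50    # pixel (0, 0) responds at layer 1
volume[3, 1, 2] = 80    # pixel (1, 2) responds at layer 3

disparity = collapse_to_disparity(volume, step=2)   # assume each layer is a 2-pixel shift
depth_image = to_uint8(disparity, max_disparity=6)
print(disparity)
print(depth_image)
```

Pixels with no response in any layer come out as layer 0, so they cannot be told apart from a true zero shift; this is one reason only the edges carry usable values.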

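Below is the manually colored result: the original image (main view) recolored according to the colors in the figure above, bluer for farther and redder for nearer.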

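# Changed to 320 x 240 resolution

# Use convolution to remove the rainbow artifacts. The kernel detects where edges overlap along the shift ("time") axis, which gives the disparity at the moment of coincidence.

# TODO: only edges carry depth information, surfaces do not, so the result contains edges only. TODO: propagate the disparity to the right; how? (until the next extremum?)


# TODO: the sliding distance needs to change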





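''' Pseudocode:

frame2[0:240, 0:320]                               left and right eye images, 320x240

cv2.cvtColor(imgL, cv2.COLOR_BGR2GRAY)             convert to grayscale

cv2.addWeighted()                                  blend the left and right eye images

                                                  #  stack shift=1 ... shift=200 into a 3D array

                                                  #  reduce 3D to 2D: pick the extremum and record its layer index



                                                  #  convert the 2D result into a depth map

                                                  #  convert the depth map into a color map

'''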

##############      

import cv2
import numpy as np
import time

start = time.time()

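dx, dy = 0, -1    # dx: rightward shift of the left-eye image (not used below, the loop shifts by -2*i); dy = -1: fixed vertical offset that corrects the misalignment
  #  dx=-3   the tower crane outside the window coincides
  #  dx=-5   the green plant at the far end of the office coincides
  #  dx=-99  a nose 15 cm from the camera coincides



DATA3D  = np.zeros((30, 240, 320), dtype=np.uint8)    # stack of left/right absolute differences: layer (shift) x height x width; allocated once up front to avoid copying data around
DATA3D2 = np.zeros((30, 240, 320), dtype=np.uint8)    # stack of convolved slices (coincidence points)

DATA2D = np.zeros((240, 320), dtype=np.uint8)         # height x width; each cell holds the disparity layer of the best match
# print(DATA3D.shape)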
# print(DATA3D.shape)

keyInput=0


########################################################################### 

kernel = np.array((                                             #  convolution kernel used to find where the left and right images coincide
        [-1, -1,  1,  1,  0],
        [-1, -1,  1,  0, -1],
        [-1, -1,  0, -1, -1],
        [-1,  0,  1, -1, -1],
        [ 0,  1,  1, -1, -1]),
        dtype="float32") / 1

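##############   Load the images ############################


for k in range(0, 1 ):  
    print("Reading the left and right images -------------")
    imgL = cv2.imread("imgL00009.png")                         # read image L and image R
    imgR = cv2.imread("imgR00009.png") 
    if imgL is None or imgR is None:                           # cv2.imread returns None when a file cannot be read
        raise FileNotFoundError("imgL00009.png or imgR00009.png could not be read")
#########################################################

    imgL_gray = cv2.cvtColor(imgL, cv2.COLOR_BGR2GRAY)         # convert color image L to grayscale
    ##  rows, cols, ch = imgL.shape                             # for color images
    rows, cols = imgL_gray.shape
    ## imgL_gray=imgL_gray.astype(np.int16)

    imgR_gray = cv2.cvtColor(imgR, cv2.COLOR_BGR2GRAY)         # convert color image R to grayscale
    cv2.imshow('imgR_gray', imgR_gray)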
    
###########################################################################################   slide + stack   

    for i in range(2,31):                                                       # TODO: the sliding distance needs to change
        print("Subtracting the shifted image and stacking -------------------", i)
        MAT = np.float32([[1, 0, -2*i], [0, 1, dy]])                            # translation matrix: shift by (-2*i, dy) pixels
        imgL_gray_shift = cv2.warpAffine(imgL_gray, MAT, (cols, rows))          # affine warp that shifts the image; the border is filled with black by default
        # imgL_gray_shift = cv2.warpAffine(imgL_gray, MAT, (cols, rows), borderValue=(255,255,255))   # white fill

        imgMerge = cv2.absdiff(imgR_gray, imgL_gray_shift)                      # absolute difference of the two images

        '''                                                                     # debug: the triple quotes must be aligned with the surrounding indentation
        ## AND together imgMorphClose0 and imgMorphClose1

        # imgMerge= cv2.bitwise_xor(imgR_Threshold ,imgL_Threshold)   ## XOR
        # imgMerge= cv2.bitwise_and(imgR_Threshold ,imgL_Threshold)   ## AND
        # imgMerge= cv2.bitwise_or(imgR_Threshold ,imgL_Threshold)      ## OR
        
        # imgMerge= cv2.add(imgR_Threshold ,imgL_Threshold)           ## add; same effect as OR
        # imgMerge= cv2.bitwise_not(imgR_Threshold)   ## NOT
        '''
        DATA3D[i-2,:,:] = imgMerge                                              # store the difference as layer i-2 of DATA3D

###########################################################################################    convolve + stack 

    for i in range(0,240):
        probe2D = DATA3D[:, i, :]                               # slice DATA3D along the row dimension (of layer, row, column)
        imgProbe2DMin = cv2.filter2D(probe2D, -1, kernel)       # convolve to find the coincidence points; cv2.filter2D(source, output depth, kernel), output depth -1 keeps the input type
        DATA3D2[:,i,:] = imgProbe2DMin                          # store into DATA3D2 along the same dimension

    ############################################  #   show a cross-section of the 3D arrays    
    imgTemp = DATA3D[:,200,:] *90                               ##   cross-section before convolution; *90 amplifies the signal, otherwise nothing is visible
    cv2.imshow( 'imgTemp', imgTemp  )

    imgTemp1 = DATA3D2[:,200,:] *90                             ##   cross-section after convolution; *90 amplifies the signal, otherwise nothing is visible
    cv2.imshow( 'imgTemp1', imgTemp1  )


   ########################### # find the extremum

    # np.argmax returns the layer index of the strongest response as int64.
    # Scale by 9 for visibility, clamp to 255 so layer 29 cannot wrap around, then cast back to uint8.
    DATA2D = np.minimum(np.argmax(DATA3D2, axis=0) * 9, 255).astype(np.uint8)
        # print(DATA2D.dtype)                         
        # print( DATA2D.shape )                     

    outPutColor = cv2.applyColorMap(DATA2D, cv2.COLORMAP_JET)                      # COLORMAP_JET = 2, blue to red
                                                                                # COLORMAP_RAINBOW = 4, red to blue

    cv2.imshow( 'outPutColor',  outPutColor  )                         # if DATA2D was declared without a dtype, or its dtype changed during the computation, cast it to uint8 before this point


#########  keyboard handling, doubles as a timer  ########################################
    keyInput = cv2.waitKey(0) & 0xFF                           # wait for a key press; the argument is a delay in ms, and waitKey(0) waits indefinitely
    if keyInput == ord('q') or keyInput == 27:                 # keys: q or ESC
        break

cv2.destroyAllWindows()

#   (y_train ).shape ========================  (100, 240, 320, 1)
#  (x_train ).shape ========================  (200, 240, 320, 1)


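# Occlusion is caused by differences in distance.
# Generate the training set: 10 images
# Each image contains 5 filled, colored rectangles
#   Output the depth information of each image (label)
#             TODO: too many dtype conversions      --- .astype(np.uint8)      --- z_depth = int(z_depth_sort[i] * 3)
#             TODO: can z_depth = int(z_depth_sort[i] * 3) overflow? The pseudo-color seems to clip the peak already
#        TODO: formula for converting disparity data to depth data = ??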

import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.callbacks import LambdaCallback   
from PIL import Image













################################################################
# Machines without a GPU do not need this block
# When computing on a GPU, this block works around the GPU memory error  UnknownError:  Failed to get convolution algorithm. 
    # This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
    #  [[node sequential/cnn_layer/Conv2D (defined at tmp/ipykernel_13733/2736373417.py:104) ]] [Op:__inference_distributed_function_799]
from tensorflow.compat.v1.keras.backend  import set_session
config=tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True 
sess=tf.compat.v1.Session(config=config) 
set_session(sess)
tf.keras.backend.clear_session() # clear the session
###############################################################

GENERATE_NUMBER = 100    # number of samples (image pairs) to generate

x_train = np.zeros([GENERATE_NUMBER*2, 240, 320, 1], dtype=np.uint8)    # left and right images, two per sample; zeros instead of np.empty so unused slots are not garbage
y_train = np.zeros([GENERATE_NUMBER,   240, 320, 1], dtype=np.uint8)    # one depth label per sample


RECT_NUMBER = 5                                       # rectangles per image, as stated in the notes above (the original loop reused GENERATE_NUMBER here)

for n in range(GENERATE_NUMBER):                      # one iteration per generated sample
    print(n)

    imgL      = np.zeros((240, 320, 1), np.uint8)     # blank 240 x 320 canvases
    imgR      = np.zeros((240, 320, 1), np.uint8)
    imgLabelR = np.zeros((240, 320, 1), np.uint8)

    # One disparity per rectangle (it stands in for the distance from the camera), sorted ascending
    # so that far rectangles are drawn first and nearer ones occlude them.
    # debug: .astype(np.uint8) works on an array but not on a single number
    #        (AttributeError: 'int' object has no attribute 'astype')
    z_depth_sort = np.sort(np.random.randint(100, size=RECT_NUMBER))
    print("Depths after sorting ============", z_depth_sort)

    for i in range(RECT_NUMBER):                      # draw every rectangle on the canvases
        x1, y1 = (int(v) for v in np.random.randint(300, size=2))    # two corner points define one rectangle
        x2, y2 = (int(v) for v in np.random.randint(600, size=2))
        Grey_color = int(np.random.randint(255))      # random grey level
        d = int(z_depth_sort[i])                      # horizontal shift between the left and right views
        print("x1, x1 - d ----------", x1, x1 - d)

        # cv2.rectangle(image, corner 1, corner 2, color, thickness); thickness = -1 draws a filled block
        # debug: "Scalar value for argument 'color' is not numeric" appeared when the color was outside the
        #        np.uint8 range (0, 255) or was passed as a list such as [200, 399]; converting it to a tuple fixed it
        cv2.rectangle(imgL, (x1, y1), (x2, y2), Grey_color, -1)
        cv2.rectangle(imgR, (x1 - d, y1), (x2 - d, y2), Grey_color, -1)

        # print(" type(z_depth_sort[i])----------", type(z_depth_sort[i]))
        # print("type(Grey_color )----------", type(Grey_color))
        z_depth = min(d * 3, 255)                     # scale the label for visibility and clamp it to the uint8 range so it cannot overflow
        cv2.rectangle(imgLabelR, (x1 - d, y1), (x2 - d, y2), z_depth, -1)

    '''
    cv2.imwrite (("imgL"+ str(n).zfill(5) +".png"), imgL )                # save the left-eye image
    cv2.imwrite (("imgR"+ str(n).zfill(5) +".png"), imgR )                # save the right-eye image
    cv2.imwrite (("depthColorR "+ str(n).zfill(5) +".png"),  imgLabelR )  # save the right-eye depth label image
    '''

    # Store the images in the training arrays, indexed by the sample number n.
    # debug: indexing with the inner loop variable i wrote every sample into the same slot and could raise
    #        IndexError: index 198 is out of bounds for axis 0 with size 20
    x_train[2*n]     = imgL
    x_train[2*n + 1] = imgR
    y_train[n]       = imgLabelR

    # print (' y_train ==========',  y_train)
    print(' (y_train ).shape ======================== ', y_train.shape)

np.save("x_train.npy", x_train)                     # save to file; read it back with b = np.load("filename.npy")
np.save("y_train.npy", y_train)        

print(' (x_train ).shape ======================== ', x_train.shape)
print(' (y_train ).shape========================== ', y_train.shape)
'''    
    cv2.imshow("imageL", imgL)                          # show the images
    cv2.imshow("imageR", imgR)


    cv2.waitKey (0)                                     # wait for a key press (this is how long the images stay on screen)
    cv2.destroyAllWindows()                             # release resources
'''



'''
#Data preprocessing               
#Reshape
x_train4D = x_train.reshape(x_train.shape[0],28,28,1).astype('float32') 
x_test4D = x_test.reshape(x_test.shape[0],28,28,1).astype('float32') 
#Normalize the pixel values
x_train, x_test = x_train4D / 255.0, x_test4D / 255.0
 
 
#Build the model
model = tf.keras.models.Sequential([
    # tf.keras.layers.Conv2D(filters=16 , kernel_size=(20,20), padding='VALID',input_shape=(28,28,1),  activation='relu',name="cnn_layer"),      #  kernel size and number of filters
    #tf.keras.layers.Conv2D(filters=16 , kernel_size=(28,28), padding='VALID',input_shape=(28,28,1),  activation='sigmoid',name="cnn_layer"),      #  kernel size and number of filters
    tf.keras.layers.Conv2D(filters=16 , kernel_size=(25,25), padding='VALID',input_shape=(28,28,1), dilation_rate=1,  activation='sigmoid',name="cnn_layer"),      #  kernel size and number of filters
 
    # VALID  valid
    tf.keras.layers.Flatten(),                                  #  the convolution layer outputs multi-dimensional data while a fully connected layer takes 1D data; Flatten reduces the dimensions
    #tf.keras.layers.Dense(10,activation='softmax')
    #tf.keras.layers.Dense(10,activation='sigmoid')             #   relu did not work, the network learned nothing; sigmoid and softmax were able to learn
    
])
 
 
 
 
#Print the model
print(model.summary())                 # print the model
 
#Training configuration
model.compile(loss='sparse_categorical_crossentropy',optimizer='adam', metrics=['accuracy']) 
 
 
'''