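Checking whether two line segments intersect (vector cross product method + Python implementation)
Theoretical derivation reference (C++): https://zhuanlan.zhihu.com/p/598112630
Points to watch: whether the segments are collinear, whether they have an intersection, and how to compute the intersection point.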
def cross2d(ptA, ptB):
    """
    Compute the 2D cross product of two vectors.
    """
    return ptA[0] * ptB[1] - ptA[1] * ptB[0]


def isInsert(ptA, ptB, ptC, ptD):
    """
    Check whether segments AB and CD intersect.
    https://zhuanlan.zhihu.com/p/598112630
    ptA, ptB, ptC, ptD = (0,0), (0,2), (1,1), (0,1)     -> True, intersection (0.0, 1.0)
    ptA, ptB, ptC, ptD = (0,0), (0,2), (1,1), (1e-8,1)  -> False
    """
    AB = (ptB[0]-ptA[0], ptB[1]-ptA[1])  # AB = B - A
    CD = (ptD[0]-ptC[0], ptD[1]-ptC[1])  # CD = D - C
    det = cross2d(CD, AB)
    # Parallel lines; note that collinear, overlapping segments also return False here
    if abs(det) < 1e-14:
        return False
    AC = (ptC[0]-ptA[0], ptC[1]-ptA[1])  # AC = C - A
    t = cross2d(CD, AC) / det
    u = cross2d(AB, AC) / det
    if 0 <= t <= 1 and 0 <= u <= 1:
        # intr1 = (ptA[0]+AB[0]*t, ptA[1]+AB[1]*t)
        # intr2 = (ptC[0]+CD[0]*u, ptC[1]+CD[1]*u)
        # print(intr1, intr2)
        return True  # intersection P = A + AB * t = C + CD * u
    return False
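The commented-out lines in isInsert hint at how to recover the intersection point itself. A minimal sketch of that idea follows; `segment_intersection` and its `eps` parameter are hypothetical names introduced here, not part of the original code, and parallel (including collinear) segments are still reported as "no intersection".

```python
def cross2d(u, v):
    """2D cross product (the z-component of the 3D cross product)."""
    return u[0] * v[1] - u[1] * v[0]


def segment_intersection(pt_a, pt_b, pt_c, pt_d, eps=1e-14):
    """Return the intersection point of segments AB and CD, or None.

    Hypothetical helper: it uses the same t/u parameterization as
    isInsert, but returns the point P = A + AB * t instead of a bool.
    """
    ab = (pt_b[0] - pt_a[0], pt_b[1] - pt_a[1])
    cd = (pt_d[0] - pt_c[0], pt_d[1] - pt_c[1])
    det = cross2d(cd, ab)
    if abs(det) < eps:  # parallel or collinear: treated as no intersection
        return None
    ac = (pt_c[0] - pt_a[0], pt_c[1] - pt_a[1])
    t = cross2d(cd, ac) / det  # position along AB, in [0, 1] if on the segment
    u = cross2d(ab, ac) / det  # position along CD, in [0, 1] if on the segment
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (pt_a[0] + ab[0] * t, pt_a[1] + ab[1] * t)
    return None


print(segment_intersection((0, 0), (0, 2), (1, 1), (0, 1)))     # (0.0, 1.0)
print(segment_intersection((0, 0), (0, 2), (1, 1), (1e-8, 1)))  # None
```

If collinear overlap matters for your use case, it needs a separate check (for example, comparing the projected 1D intervals), which this sketch does not attempt.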
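Calculating an angle from three point coordinates (angle ABC)
Original link: https://blog.csdn.net/shengyutou/article/details/119670615
Mathematical principle: let m and n be two nonzero vectors, and write their angle as <m,n> (or with a letter such as α, β, θ, ...).
① By the vector formula: $\cos\langle m,n\rangle = \dfrac{m \cdot n}{|m|\,|n|}$
② If the vectors are given in coordinates, $m=(x_1,y_1,z_1)$ and $n=(x_2,y_2,z_2)$,
then $m \cdot n = x_1x_2 + y_1y_2 + z_1z_2$,
$|m| = \sqrt{x_1^2+y_1^2+z_1^2}$, $|n| = \sqrt{x_2^2+y_2^2+z_2^2}$.
Substituting these into ① gives:
$\cos\langle m,n\rangle = \dfrac{x_1x_2 + y_1y_2 + z_1z_2}{\sqrt{x_1^2+y_1^2+z_1^2}\,\sqrt{x_2^2+y_2^2+z_2^2}}$
The formula above is stated for 3D coordinates; setting z = 0 gives the formula for plane vectors.
The angle between two vectors lies in the range [0, π].
When the angle is acute, cos θ > 0; when it is obtuse, cos θ < 0; a right angle gives cos θ = 0.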
import math


def cal_angle(point_a, point_b, point_c):
    """
    Compute the angle at point b from three point coordinates.
    point a
    point b ∠
    point c
    :param point_a, point_b, point_c: lists, either 2D [x, y] or 3D [x, y, z]
    :return: the angle at vertex b, in degrees
    """
    a_x, b_x, c_x = point_a[0], point_b[0], point_c[0]  # x coordinates of a, b, c
    a_y, b_y, c_y = point_a[1], point_b[1], point_c[1]  # y coordinates of a, b, c
    if len(point_a) == 3 == len(point_b) == len(point_c):
        # the points are 3D
        a_z, b_z, c_z = point_a[2], point_b[2], point_c[2]  # z coordinates of a, b, c
    else:
        a_z, b_z, c_z = 0, 0, 0  # the points are 2D, so z defaults to 0
    # vectors m = (x1, y1, z1) = a - b and n = (x2, y2, z2) = c - b
    x1, y1, z1 = (a_x - b_x), (a_y - b_y), (a_z - b_z)
    x2, y2, z2 = (c_x - b_x), (c_y - b_y), (c_z - b_z)
    # cosine of the angle between the two vectors, i.e. the angle at vertex b
    # (raises ZeroDivisionError if a or c coincides with b)
    cos_b = (x1 * x2 + y1 * y2 + z1 * z2) / (
        math.sqrt(x1 ** 2 + y1 ** 2 + z1 ** 2) * math.sqrt(x2 ** 2 + y2 ** 2 + z2 ** 2))
    # clamp to [-1, 1] so rounding error cannot push acos out of its domain
    cos_b = max(-1.0, min(1.0, cos_b))
    B = math.degrees(math.acos(cos_b))  # angle at vertex b, in degrees
    return B
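For the 2D case, the same angle can also be obtained from the dot product and the cross product with `math.atan2`, which avoids both the division and the `acos` domain issue. This is only a sketch of an alternative, not the method of the linked article; the name `angle_abc_2d` is made up for illustration.

```python
import math


def angle_abc_2d(point_a, point_b, point_c):
    """Angle at vertex b in degrees, in [0, 180], for 2D points.

    Uses atan2(|m x n|, m . n) with m = a - b and n = c - b.
    """
    x1, y1 = point_a[0] - point_b[0], point_a[1] - point_b[1]
    x2, y2 = point_c[0] - point_b[0], point_c[1] - point_b[1]
    dot = x1 * x2 + y1 * y2    # |m||n| cos(theta)
    cross = x1 * y2 - y1 * x2  # |m||n| sin(theta), signed
    return math.degrees(math.atan2(abs(cross), dot))


print(angle_abc_2d((1, 0), (0, 0), (0, 1)))   # right angle
print(angle_abc_2d((1, 0), (0, 0), (-1, 0)))  # straight angle
```

Dropping the `abs()` would give a signed angle in (-180, 180], which is useful when the turning direction matters.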
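Judging the "distance" between two intervals (intersecting: gap <= 0; close enough: gap <= 2*pad)
Principle: if two intervals do not intersect, the larger of the two start points must be greater than the smaller of the two end points. The same test applies after both intervals are expanded by pad on each side, which is equivalent to allowing a gap of up to 2*pad between the original intervals.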
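def isOverlap(s1, e1, s2, e2, pad=None):
    """
    Check whether two intervals overlap or are very close to each other (endpoints included).
    If two intervals do not intersect, the larger start point is greater than the smaller end point.
    Args:
        s1: start of interval 1
        e1: end of interval 1
        s2: start of interval 2
        e2: end of interval 2
        pad: amount by which both intervals are expanded on each side
    Returns: bool
    """
    # normalize so that start <= end
    if s1 > e1:
        s1, e1 = e1, s1
    if s2 > e2:
        s2, e2 = e2, s2
    if pad is not None:
        s1, e1 = s1-pad, e1+pad
        s2, e2 = s2-pad, e2+pad
    return max(s1, s2) <= min(e1, e2)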
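The quantity being compared can also be returned directly as a signed gap, which makes the "distance" reading explicit. A small sketch, where `interval_gap` is a hypothetical helper name rather than part of the original code:

```python
def interval_gap(s1, e1, s2, e2):
    """Signed gap between two closed intervals.

    Negative: the intervals overlap by that amount.
    Zero: they touch at an endpoint.
    Positive: they are separated by that distance.
    """
    s1, e1 = min(s1, e1), max(s1, e1)  # normalize so start <= end
    s2, e2 = min(s2, e2), max(s2, e2)
    return max(s1, s2) - min(e1, e2)


print(interval_gap(0, 5, 3, 8))  # -2
print(interval_gap(0, 5, 5, 9))  # 0
print(interval_gap(0, 5, 7, 9))  # 2

# Expanding both intervals by pad on each side is the same as
# accepting any gap of at most 2 * pad.
pad = 1
print(interval_gap(0, 5, 7, 9) <= 2 * pad)  # True
```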
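Expanding an interval around its center (with boundary limits)
What if an interval is too small? Pad it on the left and right! In image processing, always clamp to the boundaries to avoid going out of range.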
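def get_min_length_size(size, min_length, size_range):
    """
    Given two points l and r inside the range [n, m], find the center-expanded
    points l' and r' that satisfy min_length while staying inside the range.
    Args:
        size: (l, r)
        min_length: minimum length
        size_range: (n, m)
    """
    l, r = size
    length = r-l
    if length < min_length:
        # shift the left edge by half of the missing length, clamped to the lower bound
        l = max(size_range[0], l-(min_length-length)//2)
        r = l + min_length
        if r > size_range[1]:
            # hit the upper bound: pin the right edge and push the left edge back
            r = size_range[1]
            l = max(size_range[0], r-min_length)
    return int(l), int(r)


# Usage fragment: expand boxes that are too small to at least 20x20.
# px, py, pw, ph, width, height and crack_boxs are defined by the surrounding code.
px, px1 = get_min_length_size((px, px+pw), 20, (1,width-2))
py, py1 = get_min_length_size((py, py+ph), 20, (1,height-2))
crack_boxs.append((px, py, px1, py1))  # (x1,y1,x2,y2)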
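To see how the clamping behaves near the boundaries, here is a compact restatement of the same logic with a few traced cases. The function name `expand_to_min_length` and its flattened argument list are illustrative assumptions, not the original signature.

```python
def expand_to_min_length(l, r, min_length, lo, hi):
    """Grow [l, r] around its center to min_length, staying inside [lo, hi]."""
    length = r - l
    if length < min_length:
        l = max(lo, l - (min_length - length) // 2)
        r = l + min_length
        if r > hi:
            r = hi
            l = max(lo, r - min_length)
    return int(l), int(r)


print(expand_to_min_length(40, 50, 20, 1, 98))  # (35, 55): free to grow on both sides
print(expand_to_min_length(5, 10, 20, 1, 98))   # (1, 21): clamped at the lower bound
print(expand_to_min_length(90, 95, 20, 1, 98))  # (78, 98): clamped at the upper bound
print(expand_to_min_length(3, 5, 20, 0, 10))    # (0, 10): range shorter than min_length
```

The last case shows that the result can be shorter than min_length when the allowed range itself is too small, so callers may still want to check the returned length.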
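Best RoI for expanding a bbox around its center (cropping)
Given a center point and the crop width and height, get the (best) xyxy coordinates of the crop, ready for cropping the image.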
def get_best_crop_position_of_center(center_xy, img_w, img_h, crop_w, crop_h):
    # Assumes crop_w <= img_w and crop_h <= img_h; otherwise the returned
    # top-left corner becomes negative.
    pt = center_xy
    x0, y0 = max(0, pt[0] - crop_w // 2), max(0, pt[1] - crop_h // 2)  # top-left corner, >= (0, 0)
    x1, y1 = min(x0 + crop_w, img_w), min(y0 + crop_h, img_h)  # bottom-right corner
    return [int(x1-crop_w), int(y1-crop_h), int(x1), int(y1)]
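A sketch of how such coordinates might be used to slice a NumPy image. `crop_around_center` is a hypothetical wrapper that folds the coordinate computation and the slicing together, and it additionally shrinks the crop if it is larger than the image, which the original function does not do.

```python
import numpy as np


def crop_around_center(img, center_xy, crop_w, crop_h):
    """Return (crop, (x0, y0, x1, y1)) for a window centered near center_xy.

    The window is shifted, not shrunk, when it would cross the image border.
    Assumption: a crop larger than the image is reduced to the image size.
    """
    img_h, img_w = img.shape[:2]
    crop_w, crop_h = min(crop_w, img_w), min(crop_h, img_h)
    x0 = min(max(0, int(center_xy[0]) - crop_w // 2), img_w - crop_w)
    y0 = min(max(0, int(center_xy[1]) - crop_h // 2), img_h - crop_h)
    x1, y1 = x0 + crop_w, y0 + crop_h
    return img[y0:y1, x0:x1], (x0, y0, x1, y1)


img = np.zeros((100, 200, 3), dtype=np.uint8)  # height 100, width 200
crop, box = crop_around_center(img, (195, 95), 64, 64)
print(crop.shape, box)  # (64, 64, 3) (136, 36, 200, 100)
```

Note that NumPy indexing is row-first, so the slice is `img[y0:y1, x0:x1]` even though the box is stored as xyxy.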
Intersection area of two bboxes (for IoU)
def intersect(rec1, rec2):
    """
    Compute the intersection area of 2 rectangles
    (both given as top-left and bottom-right points).
    Args:
        rec1: [x0, y0, x1, y1]
        rec2: [x0, y0, x1, y1]
    """
    return max(min(rec1[2], rec2[2]) - max(rec1[0], rec2[0]), 0) * \
           max(min(rec1[3], rec2[3]) - max(rec1[1], rec2[1]), 0)
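The intersection area is only the numerator of IoU; the denominator is the union area, i.e. the sum of both areas minus the intersection. A minimal sketch under that standard definition, where `box_iou` is a name chosen here for illustration:

```python
def box_iou(rec1, rec2):
    """IoU of two axis-aligned boxes given as [x0, y0, x1, y1]."""
    inter_w = max(min(rec1[2], rec2[2]) - max(rec1[0], rec2[0]), 0)
    inter_h = max(min(rec1[3], rec2[3]) - max(rec1[1], rec2[1]), 0)
    inter = inter_w * inter_h
    area1 = (rec1[2] - rec1[0]) * (rec1[3] - rec1[1])
    area2 = (rec2[2] - rec2[0]) * (rec2[3] - rec2[1])
    union = area1 + area2 - inter
    # Guard against two degenerate (zero-area) boxes
    return inter / union if union > 0 else 0.0


print(box_iou([0, 0, 2, 2], [1, 1, 3, 3]))  # 1/7, about 0.1429
print(box_iou([0, 0, 2, 2], [0, 0, 2, 2]))  # 1.0
print(box_iou([0, 0, 1, 1], [2, 2, 3, 3]))  # 0.0
```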
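Computing a scaled and shifted RoI from two coordinate points
def get_xyxy_scale_shift(pt1, pt2, scale_xy=1.0, shift=0, imgW=0, imgH=0):
    """
    Given two coordinate points, return the RoI coordinates after scaling and shifting.
    :param pt1, pt2: the two coordinate points
    :param scale_xy: scale factor, used to map back to the original image
    :param shift: offset by which the shorter side is enlarged (the longer side is unchanged)
    :param imgW: width limit for the RoI coordinates
    :param imgH: height limit for the RoI coordinates
    """
    x0, y0, x1, y1 = min(pt1[0], pt2[0]), min(pt1[1], pt2[1]), max(pt1[0], pt2[0]), max(pt1[1], pt2[1])  # top-left, bottom-right
    x0, y0, x1, y1 = round(x0 * scale_xy), round(y0 * scale_xy), round(x1 * scale_xy), round(y1 * scale_xy)  # scale, then round to int
    if x1 - x0 == y1 - y0:
        # square: enlarge both sides
        x0, x1 = x0 - shift, x1 + shift
        y0, y1 = y0 - shift, y1 + shift
    elif x1 - x0 < y1 - y0:
        # width is the shorter side
        x0, x1 = x0 - shift, x1 + shift
    else:
        # height is the shorter side
        y0, y1 = y0 - shift, y1 + shift
    if imgW > 0:
        x0 = min(max(0, x0), imgW)
        x1 = min(max(0, x1), imgW)
    if imgH > 0:
        y0 = min(max(0, y0), imgH)
        y1 = min(max(0, y1), imgH)
    # int(v + 0.5) rounds half up only for non-negative v, which holds once imgW/imgH clamping is on
    return int(x0+0.5), int(y0+0.5), int(x1+0.5), int(y1+0.5)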
The function above can be used to draw a rectangle on an image:
import cv2
import numpy as np


def draw_RoI(img: np.ndarray, pt1, pt2, scale_xy=1.0, shift=0, color=None, thickness=None):
    if color is None:
        color = (0, 255, 0)
    imgH, imgW = img.shape[:2]
    x0, y0, x1, y1 = get_xyxy_scale_shift(pt1, pt2, scale_xy, shift, imgW, imgH)
    cv2.rectangle(img, (x0, y0), (x1, y1), color, thickness)
    return x0, y0, x1, y1
A small custom function for parsing RGB into BGR
def rgb2bgr(rgb):
    if isinstance(rgb, (list, tuple)):
        rgb_list = []
        for val in rgb:
            if isinstance(val, str) and val.strip() != '':
                rgb_list.append(int(val.strip()))
            elif isinstance(val, int):
                rgb_list.append(val)
        return rgb_list[::-1]
    elif isinstance(rgb, str):
        # e.g. "255, 0, 0" -> [0, 0, 255]
        bgr = [int(val.strip()) for val in rgb.split(',') if val.strip() != ''][::-1]
        return bgr
    else:
        raise ValueError("error in converting RGB[" + str(rgb) + "] to BGR")
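Sliding window crops
Given the number of windows horizontally and vertically, adaptively compute each window's width and height as well as the sliding stride, center-align the grid, and return the coordinates of every window.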
def make_grids(img, grid_x, grid_y, dx=0, dy=0):
    """
    Make grids along the x-axis and y-axis.
    Given the number of windows horizontally and vertically, adaptively compute each
    window's width and height, center-align the grid, and return every window's coordinates.
    Args:
        img: ndarray
        grid_x: number of grids along the x-axis
        grid_y: number of grids along the y-axis
        dx: shrinking size along the x-axis, half the horizontal gap between windows
        dy: shrinking size along the y-axis, half the vertical gap between windows
    Returns:
        [[grid_box]], where
        grid_box = (upleft_pt, downright_pt) = ((x0, y0), (x1, y1))
    """
    grid_boxs = []
    h, w = img.shape[:2]
    left_pad, up_pad = (w % grid_x) // 2, (h % grid_y) // 2  # split the remainder evenly to center the grid
    box_w, box_h = w // grid_x, h // grid_y
    for hi in range(grid_y):
        row_boxs = [((left_pad+wi*box_w+dx, up_pad+hi*box_h+dy),
                     (left_pad+(wi+1)*box_w-dx, up_pad+(hi+1)*box_h-dy))
                    for wi in range(grid_x)]
        grid_boxs.append(row_boxs)
    return grid_boxs
import numpy as np


def make_grids_sliding(img, grid_x, grid_y, box_w, box_h):
    """
    Given the number of windows horizontally and vertically plus the window size,
    build overlapping sliding windows that sit flush against all four image borders.
    Args:
        img: ndarray
        grid_x: number of grids along the x-axis
        grid_y: number of grids along the y-axis
        box_w: width of each box
        box_h: height of each box
    Returns:
        [grid_box] (flat list, because rows are added with .extend), where
        grid_box = (upleft_pt, downright_pt) = ((x0, y0), (x1, y1))
    Examples:
        [if rows are added with .append: nested list of rows]
        grid_boxs = make_grids_sliding(srcImg, 4, 3, 1280, 1280)
        for idy, row_boxs in enumerate(grid_boxs):
            for idx, ((x0, y0), (x1, y1)) in enumerate(row_boxs):
                cv2.circle(srcImg, ((x0+x1)//2, (y0+y1)//2), 20, color, -1)
        [if rows are added with .extend, as below: flat list]
        grid_boxs = make_grids_sliding(srcImg, 4, 3, 1280, 1280)
        for (x0, y0), (x1, y1) in grid_boxs:
            cv2.circle(srcImg, ((x0+x1)//2, (y0+y1)//2), 20, color, -1)
    """
    grid_boxs = []
    h, w = img.shape[:2]
    # box_h, box_w = min(h, box_h), min(w, box_w)  # ensures: window size <= image size
    lt_x0y0, rd_x0y0 = (0, 0), (max(0, w-box_w), max(0, h-box_h))  # top-left corners of the first and last windows
    x0linspace = [int(x0) for x0 in np.linspace(lt_x0y0[0], rd_x0y0[0], grid_x)]
    y0linspace = [int(y0) for y0 in np.linspace(lt_x0y0[1], rd_x0y0[1], grid_y)]
    for y0 in y0linspace:
        row_boxs = [((x0, y0), (x0+box_w, y0+box_h))
                    for x0 in x0linspace]  # top-left, bottom-right
        grid_boxs.extend(row_boxs)  # use .append instead for a nested list of rows
    return grid_boxs
if __name__ == '__main__':
    srcImg = np.zeros((2000, 2000, 3), dtype=np.uint8)
    grid_boxs = make_grids_sliding(srcImg, 2, 2, 1280, 1280)
    print(grid_boxs)
    # When the windows were computed on a crop (crop_srcImg), map grid_boxs back
    # to srcImg by adding the crop's top-left offset (px0, py0)
    px0, py0 = 10, 10
    for idy, row_boxs in enumerate(grid_boxs):
        (x0, y0), (x1, y1) = row_boxs
        grid_boxs[idy] = ((x0 + px0, y0 + py0), (x1 + px0, y1 + py0))
    print(grid_boxs)
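The stride and overlap of the sliding windows follow directly from the evenly spaced start positions. The sketch below isolates that one-dimensional computation; `window_starts` is a hypothetical helper, not a function from the original post.

```python
import numpy as np


def window_starts(length, box, n):
    """Start offsets of n windows of size `box` spread evenly over `length`.

    The first window starts at 0 and the last one ends at `length`
    (when box <= length), mirroring the np.linspace call used above.
    """
    last = max(0, length - box)
    return [int(v) for v in np.linspace(0, last, n)]


starts = window_starts(2000, 1280, 2)
print(starts)  # [0, 720]

# Overlap between neighbours = end of one window minus start of the next.
overlaps = [a + 1280 - b for a, b in zip(starts, starts[1:])]
print(overlaps)  # [560]
```

A negative overlap would mean the windows leave uncovered gaps, which is a sign that the window count is too small for the chosen window size.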
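VOC colors and palette
Bit operations neatly generate a graded RGB color table (the first entries differ by 128 gray levels per channel), which is far quicker than writing out a lookup table by hand.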
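def create_pascal_label_colormap():
    """
    Label colormap for the class labels of the PASCAL VOC segmentation dataset.
    Returns:
        A colormap for visualizing segmentation results
    Examples:
        colormap = create_pascal_label_colormap()
        color = colormap[idx].tolist()  # [r, g, b]; reverse it for OpenCV's BGR order
        # segmentation result label.shape=(1024,1024), rendered image vis.shape=(1024,1024,3)
        vis = colormap[label]
    """
    colormap = np.zeros((256, 3), dtype=int)
    ind = np.arange(256, dtype=int)
    # Spread the bits of the label index over the three channels,
    # filling each channel from its most significant bit downward
    for shift in reversed(range(8)):
        for channel in range(3):
            colormap[:, channel] |= ((ind >> channel) & 1) << shift
        ind >>= 3
    return colormap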
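To make the bit trick easier to follow, here is the same computation for a single label in plain Python, with no NumPy. `voc_color` is a name introduced for this illustration; it is meant to match one row of the colormap above.

```python
def voc_color(label):
    """Return the (r, g, b) color for one label index in 0..255.

    Bits 0, 1, 2 of the index go to the top bit of R, G, B;
    bits 3, 4, 5 go to the next bit down, and so on.
    """
    r = g = b = 0
    ind = label
    for shift in reversed(range(8)):
        r |= (ind & 1) << shift
        g |= ((ind >> 1) & 1) << shift
        b |= ((ind >> 2) & 1) << shift
        ind >>= 3
    return (r, g, b)


for label in (0, 1, 2, 3, 4, 15):
    print(label, voc_color(label))
# 0 (0, 0, 0)
# 1 (128, 0, 0)
# 2 (0, 128, 0)
# 3 (128, 128, 0)
# 4 (0, 0, 128)
# 15 (192, 128, 128)
```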
To be continued
…