Cropping image edges with Python: automatically crop a scanned image to the ROI inside an inner black border

I've been looking into OpenCV and Pillow (and ImageMagick outside of Python, especially Fred's ImageMagick Scripts) to achieve the following:

Automatic identification of an inner black border in scanned images, and cropping of the images to that border. Below is a blacked-out example: the first is the "original", and the second, with a red highlight around the black border, is what I am looking to achieve:

[Image: original scan]

[Image: scan with the inner black border highlighted in red]

The problem is that the border is not on the outside of the images, and the scans differ greatly in quality, meaning the border is never in the same spot and it's not possible to crop at fixed pixel coordinates.

Edit: I'm looking for a way to crop the image, keeping only what is inside the black border (the part that is blurred right now).

I'm looking for help on a) whether such a crop is possible and b) how to do it, preferably with Python.

Thanks!

Solution

Here is a pretty simple way to do that in ImageMagick:

- Get the center coordinates.
- Clone the image and do the following on the clone:
  - Threshold the image so that the inside of the black lines is white.
  - (If necessary, use -connected-components to merge smaller black features into the white in the center.)
  - Apply some morphology open to make sure that the black lines are continuous.
  - Floodfill the image with red, starting at the center.
  - Convert non-red to black and red to white.
- Put the processed clone into the alpha channel of the input.

Input:

[Image: input scan]

center=$(convert img.jpg -format "%[fx:w/2],%[fx:h/2]\n" info:)

convert img.jpg \
\( +clone -auto-level -threshold 35% \
-morphology open disk:5 \
-fill red -draw "color $center floodfill" -alpha off \
-fill black +opaque red -fill white -opaque red \) \
-alpha off -compose copy_opacity -composite result.png

[Image: result with the interior kept and the outside transparent]

Here is Python Wand code that is equivalent to the above:

#!/bin/python3.7
from wand.image import Image
from wand.drawing import Drawing
from wand.color import Color
from wand.display import display

with Image(filename='black_rect.jpg') as img:
    with img.clone() as copied:
        # Threshold so the interior of the black border becomes white
        copied.auto_level()
        copied.threshold(threshold=0.35)
        # Morphology open to keep the black lines continuous
        copied.morphology(method='open', kernel='disk:5')
        centx = round(0.5 * copied.width)
        centy = round(0.5 * copied.height)
        # Floodfill with red, starting at the center
        with Drawing() as draw:
            draw.fill_color = 'red'
            draw.color(x=centx, y=centy, paint_method='floodfill')
            draw(copied)
        # Non-red -> black, red -> white
        copied.opaque_paint(target='red', fill='black', fuzz=0.0, invert=True)
        copied.opaque_paint(target='red', fill='white', fuzz=0.0, invert=False)
        display(copied)
        # Put the processed clone into the alpha channel of the input
        copied.alpha_channel = 'copy'
        img.composite(copied, left=0, top=0, operator='copy_alpha')
        img.format = 'png'
        display(img)
        img.save(filename='black_rect_interior.png')
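The question asks for an actual crop rather than transparency outside the border. That final step is not part of the original answer, but a minimal follow-up sketch using Wand's trim(), assuming the output file produced above, would be:

from wand.image import Image

# Trim the fully transparent margin left by the copy_alpha compose,
# keeping only the interior of the black border
with Image(filename='black_rect_interior.png') as img:
    img.trim()
    img.save(filename='black_rect_cropped.png')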

For OpenCV, I would suggest that the following processing could be one way to do it (a rough Python sketch follows the list). Sorry, I am not proficient with OpenCV.

- Threshold the image so that the inside of the black lines is white.
- Apply some morphology open to make sure that the black lines are continuous.
- Get the contours of the white regions.
- Get the largest interior contour and fill the inside with white.
- Put that result into the alpha channel of the input.
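A minimal sketch of those steps, not from the original answer: the 35% threshold and disk size mirror the ImageMagick command above, and selecting the contour that contains the image center stands in for "largest interior contour" (it plays the same role as the floodfill from the center).

import cv2
import numpy as np

img = cv2.imread('img.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold so the interior of the black border becomes white
_, thresh = cv2.threshold(gray, int(0.35 * 255), 255, cv2.THRESH_BINARY)

# Morphology open to keep the black border lines continuous
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
opened = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# Contours of the white regions; keep the smallest one that contains
# the image center -- that is the interior of the border, not the page
h, w = gray.shape
contours, _ = cv2.findContours(opened, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
inner = min(
    (c for c in contours if cv2.pointPolygonTest(c, (w / 2, h / 2), False) > 0),
    key=cv2.contourArea,
)

# Fill the interior contour with white on a black mask and put it
# into the alpha channel of the input
mask = np.zeros_like(gray)
cv2.drawContours(mask, [inner], -1, 255, cv2.FILLED)
result = cv2.cvtColor(img, cv2.COLOR_BGR2BGRA)
result[:, :, 3] = mask
cv2.imwrite('result_opencv.png', result)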

ADDITION:

For those interested, here is a longer method that would be conducive to perspective rectification. I do something similar to what nathancy has done, but in ImageMagick.

First, threshold the image and do morphology open to be sure the black lines are continuous. Then do connected components to get the ID number of the largest white region, and extract that region:

id=$(convert img.jpg -auto-level -threshold 35% \
-morphology open disk:5 -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:verbose=true \
-connected-components 8 null: | grep "gray(255)" | head -n 1 | awk '{print $1}' | sed 's/[:]*$//')
echo $id

convert img.jpg -auto-level -threshold 35% \
-morphology open disk:5 -type bilevel \
-define connected-components:mean-color=true \
-define connected-components:keep=$id \
-connected-components 8 \
-alpha extract -morphology erode disk:5 \
region.png

[Image: extracted largest white region]

Now do Canny edge detection and a Hough line transform. Here I save the Canny image, the Hough lines drawn in red, the lines overlaid on the image, and the line information, which is written to the .mvg file.

convert region.png \
\( +clone -canny 0x1+10%+30% +write region_canny.png \
-background none -fill red -stroke red -strokewidth 2 \
-hough-lines 9x9+400 +write region_lines.png +write lines.mvg \) \
-compose over -composite region_hough.png

convert region_lines.png -alpha extract region_bw_lines.png

Contents of lines.mvg:

# Hough line transform: 9x9+400
viewbox 0 0 2000 2829
# x1,y1 x2,y2 # count angle distance
line 0,202.862 2000,272.704  # 763 92 824
line 204.881,0 106.09,2829  # 990 2 1156
line 1783.84,0 1685.05,2829  # 450 2 2734
line 0,2620.34 2000,2690.18  # 604 92 3240

[Images: region_canny.png, region_lines.png, region_hough.png, region_bw_lines.png]

Next I use a script that I wrote to do corner detection. Here I use the Harris detector.

corners=$(corners -m harris -t 40 -d 5 -p yes region_bw_lines.png region_bw_lines_corners.png)
echo "$corners"
pt=1 coords=195.8,207.8
pt=2 coords=1772.8,262.8
pt=3 coords=111.5,2622.5
pt=4 coords=1688.5,2677.5

Next I extract the corner coordinates and sort them in clockwise order. The following is some code I wrote, converted from here:

list=$(echo "$corners" | sed -n 's/^.*=\(.*\)$/\1/p' | tr "\n" " " | sed 's/[ ]*$//')
echo "$list"
195.8,207.8 1772.8,262.8 111.5,2622.5 1688.5,2677.5

# sort on x
xlist=`echo "$list" | tr " " "\n" | sort -n -t "," -k1,1`
leftmost=`echo "$xlist" | head -n 2`
rightmost=`echo "$xlist" | tail -n +3`
rightmost1=`echo "$rightmost" | head -n 1`
rightmost2=`echo "$rightmost" | tail -n +2`

# sort leftmost on y
leftmost2=`echo "$leftmost" | sort -n -t "," -k2,2`
topleft=`echo "$leftmost2" | head -n 1`
btmleft=`echo "$leftmost2" | tail -n +2`

# get distance from topleft to rightmost1 and rightmost2; largest is bottom right
topleftx=`echo "$topleft" | cut -d, -f1`
toplefty=`echo "$topleft" | cut -d, -f2`
rightmost1x=`echo "$rightmost1" | cut -d, -f1`
rightmost1y=`echo "$rightmost1" | cut -d, -f2`
rightmost2x=`echo "$rightmost2" | cut -d, -f1`
rightmost2y=`echo "$rightmost2" | cut -d, -f2`
dist1=`convert xc: -format "%[fx:hypot(($topleftx-$rightmost1x),($toplefty-$rightmost1y))]" info:`
dist2=`convert xc: -format "%[fx:hypot(($topleftx-$rightmost2x),($toplefty-$rightmost2y))]" info:`
test=`convert xc: -format "%[fx:$dist1>$dist2?1:0]" info:`
if [ $test -eq 1 ]; then
    btmright=$rightmost1
    topright=$rightmost2
else
    btmright=$rightmost2
    topright=$rightmost1
fi
sort_corners="$topleft $topright $btmright $btmleft"
echo $sort_corners
195.8,207.8 1772.8,262.8 1688.5,2677.5 111.5,2622.5

Finally, I use the corner coordinates to draw a white filled polygon on a black background and put that result into the alpha channel of the input image.

convert img.jpg \
\( +clone -fill black -colorize 100 \
-fill white -draw "polygon $sort_corners" \) \
-alpha off -compose copy_opacity -composite result.png

[Image: final result with the interior of the border kept]
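The perspective rectification mentioned at the start of this addition is not shown above; a minimal sketch using OpenCV with the four sorted corners from the output above (the 1600x2400 output size is an arbitrary assumption) could look like this:

import cv2
import numpy as np

# The four corners in clockwise order (top-left, top-right,
# bottom-right, bottom-left), taken from the sorted output above
src = np.float32([[195.8, 207.8],
                  [1772.8, 262.8],
                  [1688.5, 2677.5],
                  [111.5, 2622.5]])

# Target rectangle; 1600x2400 is an assumed output size
w, h = 1600, 2400
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

# Map the quadrilateral onto the rectangle and warp the input
img = cv2.imread('img.jpg')
M = cv2.getPerspectiveTransform(src, dst)
rectified = cv2.warpPerspective(img, M, (w, h))
cv2.imwrite('rectified.png', rectified)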
