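First, you will need one web page: 构建自定义人脸识别数据集的三种训练方法 ("Three training methods for building a custom face recognition dataset"), published on 腾讯云开发者社区 (Tencent Cloud Developer Community).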
A build_face_detect.py file:
from imutils.video import VideoStream
import argparse
import imutils
import time
import cv2
import os

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-c", "--cascade", required=True,
    help="path to where the face cascade resides")
ap.add_argument("-o", "--output", required=True,
    help="path to output directory")
args = vars(ap.parse_args())

# load OpenCV's Haar cascade for face detection from disk
# (the path comes from the required --cascade argument)
detector = cv2.CascadeClassifier(args["cascade"])

# initialize the video stream, allow the camera sensor to warm up,
# and initialize the total number of example faces written to disk
# thus far
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
# vs = VideoStream(usePiCamera=True).start()
time.sleep(2.0)
total = 0

# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream, clone it (just
    # in case we want to write it to disk), and then resize the frame
    # so we can apply face detection faster
    frame = vs.read()
    orig = frame.copy()
    frame = imutils.resize(frame, width=400)

    # detect faces in the grayscale frame
    rects = detector.detectMultiScale(
        cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), scaleFactor=1.1,
        minNeighbors=5, minSize=(30, 30))

    # loop over the face detections and draw them on the frame
    for (x, y, w, h) in rects:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `k` key was pressed, write the *original* frame to disk
    # so we can later process it and use it for face recognition
    if key == ord("k"):
        p = os.path.sep.join([args["output"], "{}.png".format(
            str(total).zfill(5))])
        cv2.imwrite(p, orig)
        total += 1

    # if the `q` key was pressed, break from the loop
    elif key == ord("q"):
        break

# report how many images were saved, then release the window and the camera
print("[INFO] {} face images stored".format(total))
cv2.destroyAllWindows()
vs.stop()
A Python file that converts the JSON annotations to YOLOv5 label format:
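import json
import os

# maps each label name to its YOLO class id; left empty in the original,
# so fill it in for your own classes before running
name2id = {}


def convert(img_size, box):
    """Convert pixel corners (x1, y1, x2, y2) to normalized YOLO (x, y, w, h)."""
    dw = 1. / img_size[0]
    dh = 1. / img_size[1]
    # the "- 1" is a carry-over from converters written for 1-based
    # VOC pixel coordinates
    x = (box[0] + box[2]) / 2.0 - 1
    y = (box[1] + box[3]) / 2.0 - 1
    w = box[2] - box[0]
    h = box[3] - box[1]
    return (x * dw, y * dh, w * dw, h * dh)


def decode_json(json_folder_path, json_name):
    # name the label file after the JSON file's base name so that it
    # matches the image name; the output prefix was left empty in the original
    txt_name = '' + os.path.splitext(json_name)[0] + '.txt'
    json_path = os.path.join(json_folder_path, json_name)
    with open(json_path, 'r', encoding='gb2312') as json_file:
        data = json.load(json_file)

    img_w = data['imageWidth']
    img_h = data['imageHeight']

    with open(txt_name, 'w') as txt_file:
        for i in data['shapes']:
            label_name = i['label']
            if i['shape_type'] == 'rectangle':
                # the two corners can be stored in either order, so sort
                # them to keep the width and height positive
                x1, x2 = sorted((int(i['points'][0][0]), int(i['points'][1][0])))
                y1, y2 = sorted((int(i['points'][0][1]), int(i['points'][1][1])))
                bb = (x1, y1, x2, y2)
                bbox = convert((img_w, img_h), bb)
                txt_file.write(str(name2id[label_name]) + " " + " ".join([str(a) for a in bbox]) + '\n')


if __name__ == "__main__":
    # folder that holds the JSON annotation files (left empty in the original)
    json_folder_path = ''
    json_names = os.listdir(json_folder_path)
    for json_name in json_names:
        # skip anything in the folder that is not a JSON annotation
        if json_name.endswith('.json'):
            decode_json(json_folder_path, json_name)

You will also need to be comfortable working in the terminal.

To train:
python train.py --batch 64 --data mydata_path.yaml --weights yolov5s.pt --device 0

To run detection with your trained weights (--source 0 reads from the webcam):
python detect.py --weights yourown_model.pt --source 0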
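Before training, it can help to check a few generated label lines by mapping them back to pixel coordinates. The sketch below is only an illustration: the function name yolo_line_to_corners and the sample values are assumptions, not part of the original scripts. It uses the standard YOLO definition (class id, then normalized center x, center y, width, height), so labels written by the conversion script above come back shifted by one pixel because of its "- 1" offset.

```python
def yolo_line_to_corners(line, img_w, img_h):
    """Map one YOLO label line back to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    # Normalized center and size, each in the range 0..1.
    cx, cy, w, h = (float(v) for v in parts[1:5])
    # Scale back to pixels and move from center/size to corner form.
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return class_id, x1, y1, x2, y2


# A box centered in a 640x480 image, a quarter as wide and half as tall.
print(yolo_line_to_corners("0 0.5 0.5 0.25 0.5", 640, 480))
# → (0, 240.0, 120.0, 400.0, 360.0)
```

If the recovered corners land far from the face in the image, the usual suspects are a wrong image size or swapped corner order in the annotation.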
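One thing to note: when preparing the data, always provide a validation set, otherwise YOLO will not run. A separate set for detect (testing) is optional.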
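One way to make sure a validation set exists is to split the image/label pairs yourself and write the dataset file that --data points to. The following is a minimal sketch under several assumptions: the function names, the 20% validation ratio and the folder names are hypothetical, the images are the .png files saved by the capture script, and the layout (images/train, images/val, labels/train, labels/val) follows the convention YOLOv5 uses to locate labels by swapping "images" for "labels" in each image path.

```python
import json
import random
import shutil
from pathlib import Path


def split_dataset(image_dir, label_dir, out_dir, val_ratio=0.2, seed=0):
    """Copy image/label pairs into images/{train,val} and labels/{train,val}.

    Returns (number of training pairs, number of validation pairs).
    """
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)

    # Keep only images that have a matching YOLO .txt label file.
    pairs = []
    for img in sorted(image_dir.glob("*.png")):
        lbl = label_dir / (img.stem + ".txt")
        if lbl.exists():
            pairs.append((img, lbl))

    # Shuffle reproducibly, then reserve at least one pair for validation.
    random.Random(seed).shuffle(pairs)
    n_val = max(1, int(len(pairs) * val_ratio)) if pairs else 0
    subsets = {"val": pairs[:n_val], "train": pairs[n_val:]}

    for name, items in subsets.items():
        img_out = out_dir / "images" / name
        lbl_out = out_dir / "labels" / name
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        for img, lbl in items:
            shutil.copy(img, img_out / img.name)
            shutil.copy(lbl, lbl_out / lbl.name)

    return len(subsets["train"]), len(subsets["val"])


def write_data_yaml(out_dir, class_names, yaml_path):
    """Write a minimal dataset YAML with train/val image folders and class names."""
    out_dir = Path(out_dir).resolve()
    lines = [
        "train: " + (out_dir / "images" / "train").as_posix(),
        "val: " + (out_dir / "images" / "val").as_posix(),
        "nc: " + str(len(class_names)),
        # A JSON list is also a valid YAML flow sequence.
        "names: " + json.dumps(list(class_names)),
    ]
    Path(yaml_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

For example, with hypothetical folders, split_dataset("dataset", "labels_txt", "mydata") followed by write_data_yaml("mydata", ["face"], "mydata_path.yaml") would produce a file that can be passed to train.py with --data. The order of names should match the class ids used in name2id.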
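I won't go into more detail here; if you have questions, just leave a comment. This is part of my graduation thesis, so corrections and friendly discussion are welcome.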