Aligning face images

Some people have asked me how I aligned the face images in my articles. I was too embarrassed to hand out my ImageMagick hack, so I decided to rewrite it as a Python script. You don't need to copy and paste it: the script ships in the source folder of my Guide on Face Recognition as crop_face.py.

The code is easy to use. To scale, rotate and crop a face image you just call CropFace(image, eye_left, eye_right, offset_pct, dest_sz), where:

  • eye_left is the position of the left eye
  • eye_right is the position of the right eye
  • offset_pct is the fraction of the output image you want to keep as margin next to the eyes, given as (horizontal, vertical)
  • dest_sz is the size of the output image

If you use the same offset_pct and dest_sz for all of your images, they will all be aligned at the eyes.
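To make the parameters concrete, here is a small sketch of the arithmetic CropFace performs internally (the function name alignment_params is mine, and the eye positions are taken from the Arnold Schwarzenegger example): with offset_pct=(0.2,0.2) and dest_sz=(200,200) the horizontal margin is 40 px, so the eyes end up 200 - 2·40 = 120 px apart in every output image.

```python
import math

def alignment_params(eye_left, eye_right, offset_pct, dest_sz):
    """Mirror the arithmetic inside CropFace: margins in the output image,
    the reference eye distance, and the crop scale factor."""
    offset_h = math.floor(offset_pct[0] * dest_sz[0])   # horizontal margin, px
    offset_v = math.floor(offset_pct[1] * dest_sz[1])   # vertical margin, px
    reference = dest_sz[0] - 2.0 * offset_h             # eye distance in the output
    dist = math.hypot(eye_right[0] - eye_left[0],
                      eye_right[1] - eye_left[1])       # eye distance in the input
    scale = dist / reference                            # input pixels per output pixel
    return offset_h, offset_v, reference, scale

print(alignment_params((252, 364), (420, 366), (0.2, 0.2), (200, 200)))
# reference comes out as 120.0: every aligned image has its eyes 120 px apart
```

Because reference depends only on offset_pct and dest_sz, not on the input image, all outputs produced with the same two parameters share the same eye geometry.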

Example

Imagine we are given this photo of Arnold Schwarzenegger, which is in the Public Domain. The (x,y) positions of the eyes are approximately (252,364) for the left eye and (420,366) for the right eye. Now you only need to define the horizontal offset, the vertical offset and the size your scaled, rotated and cropped face should have. Here are some examples:

Configuration (horizontal offset, vertical offset, dest_sz) — each result image shows the cropped, scaled and rotated face:

  • 0.1 (10%), 0.1 (10%), (200,200)
  • 0.2 (20%), 0.2 (20%), (200,200)
  • 0.3 (30%), 0.3 (30%), (200,200)
  • 0.2 (20%), 0.2 (20%), (70,70)

crop_face.py

#!/usr/bin/env python
# Software License Agreement (BSD License)
#
# Copyright (c) 2012, Philipp Wagner
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above
#    copyright notice, this list of conditions and the following
#    disclaimer in the documentation and/or other materials provided
#    with the distribution.
#  * Neither the name of the author(s) nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.

import math

from PIL import Image  # classic PIL used a top-level "import Image"

def Distance(p1,p2):
  dx = p2[0] - p1[0]
  dy = p2[1] - p1[1]
  return math.sqrt(dx*dx+dy*dy)

def ScaleRotateTranslate(image, angle, center = None, new_center = None, scale = None, resample=Image.BICUBIC):
  # builds the inverse affine mapping that Image.transform expects: rotate by
  # `angle` (radians) about `center`, scale, and move `center` to `new_center`
  if (scale is None) and (center is None):
    return image.rotate(angle=angle, resample=resample)
  nx,ny = x,y = center
  sx=sy=1.0
  if new_center:
    (nx,ny) = new_center
  if scale:
    (sx,sy) = (scale, scale)
  cosine = math.cos(angle)
  sine = math.sin(angle)
  a = cosine/sx
  b = sine/sx
  c = x-nx*a-ny*b
  d = -sine/sy
  e = cosine/sy
  f = y-nx*d-ny*e
  return image.transform(image.size, Image.AFFINE, (a,b,c,d,e,f), resample=resample)

def CropFace(image, eye_left=(0,0), eye_right=(0,0), offset_pct=(0.2,0.2), dest_sz = (70,70)):
  # calculate offsets in original image
  offset_h = math.floor(float(offset_pct[0])*dest_sz[0])
  offset_v = math.floor(float(offset_pct[1])*dest_sz[1])
  # get the direction
  eye_direction = (eye_right[0] - eye_left[0], eye_right[1] - eye_left[1])
  # calc rotation angle in radians
  rotation = -math.atan2(float(eye_direction[1]),float(eye_direction[0]))
  # distance between them
  dist = Distance(eye_left, eye_right)
  # calculate the reference eye-width
  reference = dest_sz[0] - 2.0*offset_h
  # scale factor
  scale = float(dist)/float(reference)
  # rotate original around the left eye
  image = ScaleRotateTranslate(image, center=eye_left, angle=rotation)
  # crop the rotated image
  crop_xy = (eye_left[0] - scale*offset_h, eye_left[1] - scale*offset_v)
  crop_size = (dest_sz[0]*scale, dest_sz[1]*scale)
  image = image.crop((int(crop_xy[0]), int(crop_xy[1]), int(crop_xy[0]+crop_size[0]), int(crop_xy[1]+crop_size[1])))
  # resize it
  image = image.resize(dest_sz, Image.LANCZOS)  # Image.ANTIALIAS in old PIL/Pillow
  return image

if __name__ == "__main__":
  image =  Image.open("arnie.jpg")
  CropFace(image, eye_left=(252,364), eye_right=(420,366), offset_pct=(0.1,0.1), dest_sz=(200,200)).save("arnie_10_10_200_200.jpg")
  CropFace(image, eye_left=(252,364), eye_right=(420,366), offset_pct=(0.2,0.2), dest_sz=(200,200)).save("arnie_20_20_200_200.jpg")
  CropFace(image, eye_left=(252,364), eye_right=(420,366), offset_pct=(0.3,0.3), dest_sz=(200,200)).save("arnie_30_30_200_200.jpg")
  CropFace(image, eye_left=(252,364), eye_right=(420,366), offset_pct=(0.2,0.2)).save("arnie_20_20_70_70.jpg")

OpenCV affine transformation

http://www.opencv.org.cn/opencvdoc/2.3.2/html/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html
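The tutorial linked above is built around cv2.getRotationMatrix2D. Its 2x3 matrix can also be written out by hand from the formula in the OpenCV documentation; the following self-contained Python sketch (plain floats, no OpenCV required; the helper names are mine) builds that matrix and checks the eye geometry. It also shows that, in OpenCV's image coordinates, the positive atan2 angle already levels the eyes, so no negation is needed when using getRotationMatrix2D:

```python
import math

def rotation_matrix(center, angle_deg, scale):
    """The 2x3 affine matrix cv2.getRotationMatrix2D(center, angle_deg, scale)
    returns: rotate counter-clockwise by angle_deg about center, then scale."""
    a = math.radians(angle_deg)
    alpha = scale * math.cos(a)
    beta = scale * math.sin(a)
    cx, cy = center
    return [[alpha,  beta, (1 - alpha) * cx - beta * cy],
            [-beta, alpha,  beta * cx + (1 - alpha) * cy]]

def apply_affine(M, p):
    # apply a 2x3 affine matrix to a 2D point
    x, y = p
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])

# rotating the right eye about the left eye by the eye angle
# must leave both eyes on the same horizontal line
eye_left, eye_right = (252.0, 364.0), (420.0, 366.0)
angle = math.degrees(math.atan2(eye_right[1] - eye_left[1],
                                eye_right[0] - eye_left[0]))
M = rotation_matrix(eye_left, angle, 1.0)
print(apply_affine(M, eye_right))  # the y coordinate equals eye_left's y (364.0)
```

The same matrix, handed to warpAffine, performs the rotation step of the alignment; the scaling is then done by cropping a scale-sized window and resizing it.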

C# code

Image<Bgr, byte> cutImg(Image<Bgr, byte> img, Point eyeleft, Point eyeright)
{
    double[] offset = { 0.2, 0.2 };
    int dest_w = 100;
    int dest_h = 100;
    // rotation angle that levels the eyes
    double angle = -Math.Atan2(eyeright.Y - eyeleft.Y, eyeright.X - eyeleft.X);
    // eye distance in the source image
    double Dik = Math.Sqrt(Math.Pow(eyeleft.X - eyeright.X, 2) + Math.Pow(eyeleft.Y - eyeright.Y, 2));
    int offset_h = Convert.ToInt16(offset[0] * dest_w);
    int offset_v = Convert.ToInt16(offset[1] * dest_h);
    double reference = dest_w - 2.0 * offset_h;   // eye distance in the output
    double scale = Dik / reference;
    // rotate around the left eye
    img = img.Rotate(angle / Math.PI * 180, eyeleft, INTER.CV_INTER_AREA, new Bgr(Color.Black), true);
    // crop a scale-sized window around the eyes (the height uses dest_h)
    Rectangle rect = new Rectangle(
        new Point(Convert.ToInt16(eyeleft.X - scale * offset_h), Convert.ToInt16(eyeleft.Y - scale * offset_v)),
        new Size(Convert.ToInt16(dest_w * scale), Convert.ToInt16(dest_h * scale)));
    img.ROI = rect;
    img = img.Resize(dest_w, dest_h, Emgu.CV.CvEnum.INTER.CV_INTER_AREA);
    return img;
}

C++ code

Mat cutImage(Mat src, Point2f eyeleft, Point2f eyeright)
{
	Mat dst = Mat::zeros(src.rows, src.cols, src.type());
	float offset[2] = { 0.2f, 0.2f };
	int dest_w = 100;
	int dest_h = 100;
	// with getRotationMatrix2D a positive angle levels the eyes, so no negation here
	float angle = atan2(eyeright.y - eyeleft.y, eyeright.x - eyeleft.x);
	// eye distance in the source image
	float Dik = sqrt(pow(eyeleft.x - eyeright.x, 2) + pow(eyeleft.y - eyeright.y, 2));
	int offset_h = offset[0] * dest_w;
	int offset_v = offset[1] * dest_h;
	float reference = dest_w - 2.0f * offset_h;  // eye distance in the output
	float scale = Dik / reference;
	angle = angle / CV_PI * 180.0;
	// rotate only (scale 1.0): the scaling happens by cropping a scale-sized
	// window and resizing it, as in the Python version
	Mat rot_mat = getRotationMatrix2D(eyeleft, angle, 1.0);
	warpAffine(src, dst, rot_mat, src.size());
	// crop around the left eye (the height uses dest_h)
	Rect rect((int)(eyeleft.x - scale * offset_h), (int)(eyeleft.y - scale * offset_v),
	          (int)(dest_w * scale), (int)(dest_h * scale));
	dst = dst(rect);
	Mat dst2 = Mat(Size(dest_w, dest_h), src.type());
	resize(dst, dst2, dst2.size());
	//namedWindow("warp_window", CV_WINDOW_AUTOSIZE);
	//imshow("warp_window", dst2);
	//waitKey(0);
	return dst2;
}
