Dlib: shape_predictor and full_object_detection

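dlib::shape_predictor and dlib::full_object_detection are two classes in dlib. I needed both recently for a face-related project, which was also my first time using Dlib, an open-source machine learning library that was new to me.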

I have only just started learning it myself, so I will update this post later.

First, the source for shape_predictor:

class shape_predictor
    {
        /*!
            WHAT THIS OBJECT REPRESENTS
                This object is a tool that takes in an image region containing some object
                and outputs a set of point locations that define the pose of the object.
                The classic example of this is human face pose prediction, where you take
                an image of a human face as input and are expected to identify the
                locations of important facial landmarks such as the corners of the mouth
                and eyes, tip of the nose, and so forth. 
                // In other words, shape_predictor takes a region of an image as input and outputs a set of point locations that describe the pose of the object in that region. Its main job is therefore to represent an object's pose.

                To create useful instantiations of this object you need to use the
                shape_predictor_trainer object defined below to train a shape_predictor
                using a set of training images, each annotated with shapes you want to
                predict.

            THREAD SAFETY
                No synchronization is required when using this object.  In particular, a
                single instance of this object can be used from multiple threads at the
                same time.  
        !*/

    public:

        shape_predictor (
        );
        /*!
            ensures
                - #num_parts() == 0
                - #num_features() == 0
        !*/

        unsigned long num_parts (
        ) const;
        /*!
            ensures
                - returns the number of parts in the shapes predicted by this object.
        !*/

        unsigned long num_features (
        ) const;
        /*!
            ensures
                - Returns the dimensionality of the feature vector output by operator().
                  This number is the total number of trees in this object times the number
                  of leaves on each tree.  
        !*/

        // This is the important method!
        // It overloads operator(), which makes a shape_predictor a function object. It returns a full_object_detection, a class described further down.
        template <typename image_type, typename T, typename U>
        full_object_detection operator()(
            const image_type& img,
            const rectangle& rect,
            std::vector<std::pair<T,U> >& feats
        ) const;
        /*!
            requires
                - image_type == an image object that implements the interface defined in
                  dlib/image_processing/generic_image.h 
                - T is some unsigned integral type (e.g. unsigned int).
                - U is any scalar type capable of storing the value 1 (e.g. float).
            ensures
                - Runs the shape prediction algorithm on the part of the image contained in
                  the given bounding rectangle.  So it will try and fit the shape model to
                  the contents of the given rectangle in the image.  For example, if there
                  is a human face inside the rectangle and you use a face landmarking shape
                  model then this function will return the locations of the face landmarks
                  as the parts.  So the return value is a full_object_detection DET such
                  that:
                    - DET.get_rect() == rect
                    - DET.num_parts() == num_parts()
                    - for all valid i:
                        - DET.part(i) == the location in img for the i-th part of the shape
                          predicted by this object.
                - #feats == a sparse vector that records which leaf each tree used to make
                  the shape prediction.   Moreover, it is an indicator vector, Therefore,
                  for all valid i:
                    - #feats[i].second == 1
                  Further, #feats is a vector from the space of num_features() dimensional
                  vectors.  The output shape positions can be represented as the dot
                  product between #feats and a weight vector.  Therefore, #feats encodes
                  all the information from img that was used to predict the returned shape
                  object.
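                  // This is stated quite clearly: the shape prediction algorithm is run on the part of the image inside the given rectangle, in order to fit the shape model to its contents. Taking a face as the example, if the rectangle contains a face, the face landmarks are returned as the parts of the face object. So the return value of this method is a full_object_detection, and the comment above ties the properties of the two classes together.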
        !*/

        template <typename image_type>
        full_object_detection operator()(
            const image_type& img,
            const rectangle& rect
        ) const;
        /*!
            requires
                - image_type == an image object that implements the interface defined in
                  dlib/image_processing/generic_image.h 
            ensures
                - Calling this function is equivalent to calling (*this)(img, rect, ignored)
                  where the 3d argument is discarded.
        !*/

    };

    void serialize (const shape_predictor& item, std::ostream& out);
    void deserialize (shape_predictor& item, std::istream& in);
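
The comment on the three-argument operator() says that #feats is a sparse indicator vector and that the predicted shape can be written as a dot product between #feats and a weight vector. To make that concrete, here is a small sketch in plain C++ with no dlib calls. The helper name sparse_dot, the toy weights, and the way tree and leaf numbers are mapped to indices are all assumptions made for illustration, not dlib's actual implementation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Sparse indicator vector in the layout the header describes for #feats:
// each entry is (index, value), and every value is 1.
using sparse_vector = std::vector<std::pair<unsigned long, float>>;

// Hypothetical helper (not part of dlib): dot product of a sparse vector
// with a dense weight vector whose length plays the role of num_features().
float sparse_dot(const sparse_vector& feats, const std::vector<float>& weights)
{
    float sum = 0.0f;
    for (const auto& entry : feats)
    {
        assert(entry.first < weights.size());
        sum += entry.second * weights[entry.first];
    }
    return sum;
}

// Toy numbers: 2 trees with 4 leaves each, so 8 features in total.
// Assumed index layout: tree * leaves_per_tree + leaf.
// Tree 0 ended in leaf 1 (index 1), tree 1 ended in leaf 2 (index 6).
float demo_sparse_dot()
{
    const sparse_vector feats = {{1, 1.0f}, {6, 1.0f}};
    const std::vector<float> weights =
        {0.5f, 2.0f, 0.0f, 0.0f, 0.0f, 0.0f, -0.5f, 1.0f};
    return sparse_dot(feats, weights); // 2.0 + (-0.5) = 1.5
}
```

Because every stored value is 1, the dot product simply sums the weights at the indices of the leaves that were reached, which is why only one entry per tree needs to be stored.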

Next is full_object_detection:

class full_object_detection
    {
        /*!
            WHAT THIS OBJECT REPRESENTS
                This object represents the location of an object in an image along with the
                positions of each of its constituent parts.
            // full_object_detection belongs to the object detection part of Dlib, so it necessarily carries a bounding-box attribute, namely rect. Beyond that, dlib::full_object_detection also represents the positions of the object's constituent parts, so it carries a second attribute, parts.
        !*/

    public:

        full_object_detection(
            const rectangle& rect,
            const std::vector<point>& parts
        );
        /*!
            ensures
                - #get_rect() == rect
                - #num_parts() == parts.size()
                - for all valid i:
                    - part(i) == parts[i]
        !*/

        full_object_detection(
        );
        /*!
            ensures
                - #get_rect().is_empty() == true
                - #num_parts() == 0
        !*/

        explicit full_object_detection(
            const rectangle& rect
        );
        /*!
            ensures
                - #get_rect() == rect
                - #num_parts() == 0
        !*/

        const rectangle& get_rect(
        ) const;
        /*!
            ensures
                - returns the rectangle that indicates where this object is.  In general,
                  this should be the bounding box for the object.
        !*/

        rectangle& get_rect(
        ); 
        /*!
            ensures
                - returns the rectangle that indicates where this object is.  In general,
                  this should be the bounding box for the object.
        !*/

        unsigned long num_parts(
        ) const;
        /*!
            ensures
                - returns the number of parts in this object.  
        !*/

        const point& part(
            unsigned long idx
        ) const; 
        /*!
            requires
                - idx < num_parts()
            ensures
                - returns the location of the center of the idx-th part of this object.
                  Note that it is valid for a part to be "not present".  This is indicated
                  when the return value of part() is equal to OBJECT_PART_NOT_PRESENT. 
                  This is useful for modeling object parts that are not always observed.
          // Returns the center position of the idx-th part of the object.
        !*/

        point& part(
            unsigned long idx
        ); 
        /*!
            requires
                - idx < num_parts()
            ensures
                - returns the location of the center of the idx-th part of this object.
                  Note that it is valid for a part to be "not present".  This is indicated
                  when the return value of part() is equal to OBJECT_PART_NOT_PRESENT. 
                  This is useful for modeling object parts that are not always observed.
        !*/
    };

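A common usage pattern looks like this:
first use a face detection model to get a bounding box, e.g. rect;
then use an already loaded shape model, pose_, to get a full_object_detection object:

shape = (*pose_)(img, rect);  // returns a full_object_detection object, shape

Then read the locations of the parts of shape, i.e. the parts of the object inside the bounding box rect, for example

dlib::point p = shape.part(333);  // the index must be < shape.num_parts()

which gives you the facial points you want.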
