Face XY project

The goal of this project is to build an image regression model that predicts the X and Y coordinates of a facial feature in a live image.

Interactive Tool Startup Steps

You will implement the project by collecting your own data using a clickable image display tool, training a model to find the XY coordinates of the feature, and then testing and updating your model as needed using images from the live camera. Since you are collecting two values for each category, the model may require more training and data to get a satisfactory result.

Be patient! Building your model is an iterative process.

  • Step 1: Open The Notebook
    To get started, navigate to the regression folder in your JupyterLab interface and double-click the regression_interactive.ipynb notebook to open it.
  • Step 2: Execute All Of The Code Blocks
    The notebook is designed to be reusable for any XY regression task you wish to build. Step through the code blocks and execute them one at a time.
      1. Camera

        This block sets the size of the images and starts the camera. If your camera is already active in this notebook or in another notebook, first shut down the kernel in the active notebook before running this code cell. Make sure that the correct camera type is selected for execution (USB). This cell may take several seconds to execute.

      2. Task

        You get to define your TASK and CATEGORIES parameters here, as well as how many datasets you want to track. For the Face XY Project, this has already been defined for you as the face task with categories of nose, left_eye, and right_eye. Each category in the XY regression tool requires both an X and a Y value. Go ahead and execute the cell. Subdirectories for each category are created to store the example images you collect. The file names of the images will contain the XY coordinates that you tag the images with during the data collection step. This cell should only take a few seconds to execute.

      3. Data Collection

        You’ll collect images for your categories with a special clickable image widget set up in this cell. As you click the “nose” or “eye” in the live feed image, the data image filename is automatically annotated and saved using the X and Y coordinates from the click.

      4. Model

        The model is set to the same pre-trained ResNet18 model for this project:

        model = torchvision.models.resnet18(pretrained=True)

        For more information on available PyTorch pre-trained models, see the PyTorch documentation. In addition to choosing the model, the last layer of the model is modified to output only the number of values that we are training for. In the case of the Face XY Project, this is twice the number of categories, since each category requires both an X and a Y coordinate (i.e. nose X, nose Y, left_eye X, left_eye Y, right_eye X, and right_eye Y).

        output_dim = 2 * len(dataset.categories)

        model.fc = torch.nn.Linear(512, output_dim)

        This code cell may take several seconds to execute.

      5. Live Execution

        This code block sets up threading to run the model in the background so that you can view the live camera feed and visualize the model performance in real time. This cell should only take a few seconds to execute. For this project, a blue circle will be overlaid on the live image at the model's predicted location of the selected feature.

      6. Training and Evaluation

        The training code cell sets the hyper-parameters for the model training (number of epochs, batch size, learning rate, momentum) and loads the images for training or evaluation. The regression version is very similar to the simple classification training, though the loss is calculated differently. The mean squared error over the X and Y value errors is calculated and used as the loss for backpropagation in training to improve the model. This code cell may take several seconds to execute.

      7. Display the Interactive Tool!

        This is the last code cell. All that's left to do is pack all the widgets into one comprehensive tool and display it. This cell may take several seconds to run and should display the full tool for you to work with. There are three image windows. Initially, only the left camera feed is populated. The middle window will display the most recent annotated snapshot image once you start collecting data. The right-most window will display the live prediction view once the model has been trained.
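The output sizing from the Model cell and the loss from the Training and Evaluation cell can be illustrated without PyTorch. The following is a minimal sketch, not the notebook's code: it assumes the outputs are stored as consecutive (x, y) pairs in category order, and that each image is labelled for one category only, so the error is taken over that category's pair. The helper names `output_slice` and `xy_mse` are hypothetical.

```python
# Category order as defined in the Task cell.
CATEGORIES = ["nose", "left_eye", "right_eye"]


def output_slice(category, categories=CATEGORIES):
    """Return the slice of the model output holding (x, y) for one category.

    Assumption: outputs are consecutive (x, y) pairs in category order.
    The notebook's actual layout may differ.
    """
    idx = categories.index(category)
    return slice(2 * idx, 2 * idx + 2)


def xy_mse(output, category, target_xy):
    """Mean squared error over the X and Y errors for one labelled sample."""
    pred_x, pred_y = output[output_slice(category)]
    target_x, target_y = target_xy
    return ((pred_x - target_x) ** 2 + (pred_y - target_y) ** 2) / 2


# Two outputs per category, matching output_dim = 2 * len(dataset.categories).
output_dim = 2 * len(CATEGORIES)

# A made-up output vector standing in for one forward pass of the model.
output = [0.10, -0.20, 0.50, 0.40, -0.30, 0.00]

# The sample is labelled left_eye, so only outputs 2 and 3 contribute.
loss = xy_mse(output, "left_eye", (0.50, 0.20))
print(output_dim, round(loss, 4))
```

In this example the X prediction is exact and the Y prediction is off by 0.2, so the loss is 0.2 squared divided by 2, about 0.02.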

  • Step 3: Collect Data, Train, Test

    Position the camera in front of your face and collect initial data. Point the mouse cursor at the target feature that matches the category you've selected (such as the nose), then click to collect data. The annotated snapshot you just collected will appear in the middle display box. As you collect each image, vary your head position and pose:

      1. Add 20 images of your nose with the nose category selected
      2. Add 20 images of your left eye with the left_eye category selected
      3. Add 20 images of your right eye with the right_eye category selected
      4. Set the number of epochs to 10 and click the train button
      5. Once the training is complete, try the live view and observe the prediction. A blue circle should appear on the feature selected.
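Each click in the steps above saves a snapshot whose file name carries the clicked X and Y coordinates. A minimal sketch of that round trip follows. The `<x>_<y>_<unique id>.jpg` pattern and the helper names `make_filename` and `parse_xy` are assumptions for illustration; check the notebook's dataset code for the pattern it really uses.

```python
import uuid


def make_filename(x, y):
    """Build an image file name that carries the click coordinates.

    Assumption: the pattern is "<x>_<y>_<unique id>.jpg".
    """
    return "%d_%d_%s.jpg" % (x, y, uuid.uuid4().hex)


def parse_xy(filename):
    """Recover the (x, y) pixel coordinates from a name built as above."""
    # Split only on the first two underscores; the rest is the unique id.
    x_text, y_text, _rest = filename.split("_", 2)
    return int(x_text), int(y_text)


name = make_filename(112, 87)
print(name)            # 112_87_ followed by a random id and .jpg
print(parse_xy(name))  # (112, 87)
```

Storing the label in the file name means no separate annotation file is needed: the training code can rebuild every (image, x, y) pair from a directory listing.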
  • Step 4: Improve Your Model

    Use the live inference as a guide to improve your model! The live feed shows the model's prediction. As you move your head, does the target circle correctly follow your nose (or left_eye, right_eye)? If not, then click the correct location and add data. After you've added some data for a new scenario, train the model some more. For example:

      • Move the camera so that the face is closer. Is the performance of the predictor still good? If not, try adding some data for each category (10 each) and retrain (5 epochs). Does this help? You can experiment with more data and more training.
      • Move the camera to provide a different background. Is the performance of the predictor still good? If not, try adding some data for each category (10 each) and retrain (5 epochs). Does this help? You can experiment with more data and more training.
      • Are there any other scenarios in which you think the model might not perform well? Try them out!
      • Can you get a friend to try your model? Does it work the same? You know the drill: more data and training!
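When checking the scenarios above, it can help to put a number on how far the circle is from the feature. The sketch below is one way to do that and is not part of the tool. It assumes the model outputs coordinates normalized to the range [-1, 1] and that images are 224 by 224 pixels; neither is stated in this walkthrough. The helper names `to_pixels` and `pixel_error` are hypothetical.

```python
import math

IMAGE_SIZE = 224  # assumed square image size in pixels


def to_pixels(norm_x, norm_y, size=IMAGE_SIZE):
    """Map coordinates from [-1, 1] to pixel positions in [0, size]."""
    px = int(round((norm_x / 2.0 + 0.5) * size))
    py = int(round((norm_y / 2.0 + 0.5) * size))
    return px, py


def pixel_error(predicted, clicked):
    """Straight-line distance in pixels between prediction and click."""
    return math.hypot(predicted[0] - clicked[0], predicted[1] - clicked[1])


# A made-up prediction: horizontally centered, a quarter of the way down.
predicted = to_pixels(0.0, -0.5)
# A made-up click marking where the feature really is.
error = pixel_error(predicted, (115, 60))
print(predicted, error)  # (112, 56) 5.0
```

A consistently large error in one scenario, such as a new background, suggests that scenario is where additional data will help most.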
  • Step 5: Save Your Model

    When you are satisfied with your model, save it by entering a name in the "model path" box and clicking "save model".
