Face XY project

The goal of this project is to build an image regression model that can predict the X and Y coordinates of a facial feature in a live image.

Interactive Tool Startup Steps

You will implement the project by collecting your own data using a clickable image display tool, training a model to find the XY coordinates of the feature, and then testing and updating your model as needed using images from the live camera. Since you are collecting two values for each category, the model may require more training and data to get a satisfactory result.

Be patient! Building your model is an iterative process.

  • Step 1: Open The Notebook
    To get started, navigate to the regression folder in your JupyterLab interface and double-click the regression_interactive.ipynb notebook to open it.
  • Step 2: Execute All Of The Code Blocks
    The notebook is designed to be reusable for any XY regression task you wish to build. Step through the code blocks and execute them one at a time.
      1. Camera

        This block sets the size of the images and starts the camera. If your camera is already active in this notebook or in another notebook, first shut down the kernel in the active notebook before running this code cell. Make sure that the correct camera type is selected for execution (USB). This cell may take several seconds to execute.

      2. Task

        You define your TASK and CATEGORIES parameters here, as well as how many datasets you want to track. For the Face XY Project, this has already been defined for you as the face task with categories of nose, left_eye, and right_eye. Each category in the XY regression tool requires both an X and a Y value. Go ahead and execute the cell. Subdirectories for each category are created to store the example images you collect. The file names of the images will contain the XY coordinates that you tag the images with during the data collection step. This cell should only take a few seconds to execute.

      3. Data Collection

        You’ll collect images for your categories with a special clickable image widget set up in this cell. As you click the “nose” or “eye” in the live feed image, the data image filename is automatically annotated and saved using the X and Y coordinates from the click.
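
As a rough illustration of how click coordinates can end up in a file name, the sketch below encodes an (x, y) pair into a name and parses it back. The `<x>_<y>_<id>.jpg` pattern and the helper names `make_filename` and `parse_filename` are assumptions made for illustration; the notebook's actual naming scheme may differ.

```python
import uuid


def make_filename(x, y):
    """Build a file name that carries the clicked pixel coordinates.

    The "<x>_<y>_<unique id>.jpg" layout is a hypothetical convention.
    """
    return f"{int(x)}_{int(y)}_{uuid.uuid4().hex}.jpg"


def parse_filename(name):
    """Recover the (x, y) pixel coordinates stored in a file name."""
    x_text, y_text, _rest = name.split("_", 2)
    return int(x_text), int(y_text)


name = make_filename(112, 87)
print(parse_filename(name))  # (112, 87)
```

Keeping the label in the file name means each image is self-describing, so no separate annotation file has to stay in sync with the image folder.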

      4. Model

        The model is set to the same pre-trained ResNet18 model for this project:

        model = torchvision.models.resnet18(pretrained=True)

        For more information on available PyTorch pre-trained models, see the PyTorch documentation. In addition to choosing the model, the last layer of the model is modified to output only the number of values that we are training for. In the case of the Face XY Project, that is twice the number of categories, since each requires both an X and a Y coordinate (i.e. nose X, nose Y, left_eye X, left_eye Y, right_eye X, and right_eye Y).

        output_dim = 2 * len(dataset.categories)

        model.fc = torch.nn.Linear(512, output_dim)

        This code cell may take several seconds to execute.
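
To make the output layout concrete, here is a small sketch of how a flat vector of `2 * len(categories)` values can be read as one (X, Y) pair per category. The interleaved ordering (X then Y for each category, in list order) and the helper name `xy_for_category` are assumptions; the text above does not specify the ordering.

```python
CATEGORIES = ["nose", "left_eye", "right_eye"]


def xy_for_category(output, category, categories=CATEGORIES):
    """Return the (x, y) pair for one category from a flat output vector.

    Assumes the layout [x0, y0, x1, y1, ...] in category order.
    """
    if len(output) != 2 * len(categories):
        raise ValueError("expected two values per category")
    i = categories.index(category)
    return output[2 * i], output[2 * i + 1]


# Six made-up output values: two per category.
output = [0.1, -0.2, -0.4, 0.3, 0.5, 0.25]
print(xy_for_category(output, "left_eye"))  # (-0.4, 0.3)
```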

      5. Live Execution

        This code block sets up threading to run the model in the background so that you can view the live camera feed and visualize the model performance in real time. For this project, a blue circle marks the model's predicted location of the selected feature. This cell should only take a few seconds to execute.
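
Drawing the circle requires turning the model's output into a pixel position. The sketch below assumes the model predicts coordinates normalized to [-1, 1] and that the image is 224 x 224 pixels; neither detail is stated above, so treat both, along with the helper names `to_normalized` and `to_pixel`, as assumptions.

```python
def to_normalized(pixel, size):
    """Map a pixel coordinate in [0, size] to the range [-1, 1]."""
    return 2.0 * (pixel / size - 0.5)


def to_pixel(value, size):
    """Map a normalized coordinate in [-1, 1] back to a pixel index."""
    return int(round(size * (value / 2.0 + 0.5)))


WIDTH = HEIGHT = 224  # assumed image size

x_norm = to_normalized(168, WIDTH)   # 0.5
y_norm = to_normalized(56, HEIGHT)   # -0.5
print(to_pixel(x_norm, WIDTH), to_pixel(y_norm, HEIGHT))  # 168 56
```

Normalizing the targets keeps them in a small, centered range regardless of image size, which is a common choice for regression outputs.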

      6. Training and Evaluation

        The training code cell sets the hyperparameters for model training (number of epochs, batch size, learning rate, momentum) and loads the images for training or evaluation. The regression version is very similar to the simple classification training, though the loss is calculated differently: the mean squared error between the predicted and labeled X and Y values is used as the loss for backpropagation during training. This code cell may take several seconds to execute.
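
The loss described here can be written out in a few lines. This is a plain-Python sketch of mean squared error over one sample's X and Y values, using made-up numbers; a training framework would compute the same quantity on tensors so that gradients can flow back through the model.

```python
def mse(predicted, target):
    """Mean of the squared differences between two equal-length sequences."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


predicted_xy = [0.5, -0.25]  # model output (x, y), made-up values
target_xy = [0.0, 0.25]      # labeled click (x, y), made-up values
print(mse(predicted_xy, target_xy))  # 0.25
```

A prediction that lands exactly on the labeled point gives a loss of 0, and the loss grows with the square of the distance along each axis.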

      7. Display the Interactive Tool!

        This is the last code cell. All that's left to do is pack all the widgets into one comprehensive tool and display it. This cell may take several seconds to run and should display the full tool for you to work with. There are three image windows. Initially, only the left camera feed is populated. The middle window will display the most recent annotated snapshot image once you start collecting data. The right-most window will display the live prediction view once the model has been trained.

  • Step 3: Collect Data, Train, Test

    Position the camera in front of your face and collect initial data. Point to the target feature with the mouse cursor that matches the category you've selected (such as the nose). Click to collect data. The annotated snapshot you just collected will appear in the middle display box. As you collect each image, vary your head position and pose:

      1. Add 20 images of your nose with the nose category selected
      2. Add 20 images of your left eye with the left_eye category selected
      3. Add 20 images of your right eye with the right_eye category selected
      4. Set the number of epochs to 10 and click the train button
      5. Once the training is complete, try the live view and observe the prediction. A blue circle should appear on the feature selected.
  • Step 4: Improve Your Model

    Use the live inference as a guide to improve your model! The live feed shows the model's prediction. As you move your head, does the target circle correctly follow your nose (or left_eye, right_eye)? If not, then click the correct location and add data. After you've added some data for a new scenario, train the model some more. For example:

      • Move the camera so that the face is closer. Is the performance of the predictor still good? If not, try adding some data for each category (10 each) and retrain (5 epochs). Does this help? You can experiment with more data and more training.
      • Move the camera to provide a different background. Is the performance of the predictor still good? If not, try adding some data for each category (10 each) and retrain (5 epochs). Does this help? You can experiment with more data and more training.
      • Are there any other scenarios in which you think the model might not perform well? Try them out!
      • Can you get a friend to try your model? Does it work the same? You know the drill: more data and training!
  • Step 5: Save Your Model

    When you are satisfied with your model, save it by entering a name in the "model path" box and clicking "save model".
