by Mark Mansur

How to build an image recognition iOS app with Apple’s CoreML and Vision APIs

With the release of CoreML and new Vision APIs at this year’s Apple Worldwide Developers Conference, machine learning has never been easier to get into. Today I’m going to show you how to build a simple image recognition app.

We will learn how to gain access to the iPhone’s camera and how to pass what the camera is seeing into a machine learning model for analysis. We’ll do all this programmatically, without the use of storyboards! Crazy, I know.

Here is a look at what we are going to accomplish today: a live camera feed with a label that displays the model’s best guess for what the camera is seeing.

Step 1: Create a new project.

Fire up Xcode and create a new single view application. Give it a name, perhaps “ImageRecognition.” Choose Swift as the main language and save your new project.

Step 2: Say goodbye to the storyboard.

For this tutorial, we are going to do everything programmatically, without the need for the storyboard. Maybe I’ll explain why in another article.

Delete Main.storyboard.

Navigate to Info.plist and scroll down to Deployment Info. We need to tell Xcode we are no longer using the storyboard.

Delete the “Main Interface” entry.

Without the storyboard we need to manually create the app window and root view controller.

Add the following to the application() function in AppDelegate.swift:

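A minimal sketch of what that looks like, using the window property the Xcode template already declares (the root view controller class name, ViewController, is an assumption):

```swift
func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
    // With no storyboard, we create the window ourselves.
    window = UIWindow(frame: UIScreen.main.bounds)
    window?.rootViewController = ViewController()
    window?.makeKeyAndVisible()
    return true
}
```
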
We manually create the app window with UIWindow(), create our view controller, and tell the window to use it as its root view controller.

The app should now build and run without the storyboard.

⚙️ Step 3: Set up AVCaptureSession.

Before we start, import UIKit, AVFoundation and Vision. The AVCaptureSession object handles capture activity and manages the flow of data between input devices (such as the rear camera) and outputs.

We are going to start by creating a function to set up our capture session.

Create setupCaptureSession() inside ViewController.swift and instantiate a new AVCaptureSession.

Don’t forget to call this new function from viewDidLoad().

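Here is a sketch of the view controller at this point; the property and function names are just one option:

```swift
import UIKit
import AVFoundation
import Vision

class ViewController: UIViewController {
    // Handles capture activity and the flow of data from inputs to outputs.
    let captureSession = AVCaptureSession()

    override func viewDidLoad() {
        super.viewDidLoad()
        setupCaptureSession()
    }

    func setupCaptureSession() {
        // Input and output setup from the next steps goes here.
    }
}
```
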
Next, we are going to need a reference to the rear camera. We can use a DiscoverySession to query available capture devices based on our search criteria.

Add the following code:

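A sketch of that query, inside setupCaptureSession(), assuming we want the built-in wide-angle camera on the back of the device:

```swift
// Query the capture devices that match our search criteria.
let availableDevices = AVCaptureDevice.DiscoverySession(
    deviceTypes: [.builtInWideAngleCamera],
    mediaType: .video,
    position: .back
).devices
```
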
availableDevices now contains the list of available devices matching our search criteria.

We now need to gain access to our captureDevice and add it as an input to our captureSession.

Add an input to the capture session.

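A sketch of that step; AVCaptureDeviceInput(device:) can throw, so it is wrapped in a do-catch:

```swift
do {
    // The first matching device is the rear-facing camera.
    if let captureDevice = availableDevices.first {
        let captureDeviceInput = try AVCaptureDeviceInput(device: captureDevice)
        captureSession.addInput(captureDeviceInput)
    }
} catch {
    print(error.localizedDescription)
}
```
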
The first available device will be the rear-facing camera. We create a new AVCaptureDeviceInput using our capture device and add it to the capture session.

Now that we have our input set up, we can move on to outputting what the camera is capturing.

Add a video output to our capture session.

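A sketch, continuing in setupCaptureSession() (the dataOutput name is just a suggestion):

```swift
// An output that captures video frames and exposes them for processing.
let dataOutput = AVCaptureVideoDataOutput()
captureSession.addOutput(dataOutput)
```
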
AVCaptureVideoDataOutput is an output that captures video. It also provides us access to the frames being captured for processing with a delegate method we will see later.

Next, we need to add the capture session’s output as a sublayer to our view.

Add the capture session output as a sublayer to the view controller’s view.

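A sketch of that step:

```swift
// Show the camera feed by adding a preview layer to our view.
let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
previewLayer.frame = view.frame
view.layer.addSublayer(previewLayer)

// Start the flow of data from the inputs to the outputs.
captureSession.startRunning()
```
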
We create a preview layer based on our capture session and add this layer as a sublayer to our view. captureSession.startRunning() starts the flow from the inputs to the outputs that we connected earlier.

Step 4: Permission to use the camera? Permission granted.

Nearly everyone has opened an app for the first time and has been prompted to allow the app to use the camera. Starting in iOS 10, our app will crash if we don’t prompt the user before attempting to access the camera.

Navigate to Info.plist and add a new key named NSCameraUsageDescription. In the value column, simply explain to the user why your app needs camera access.

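If you prefer editing the plist source directly, the entry looks something like this (the description string is only an example):

```xml
<key>NSCameraUsageDescription</key>
<string>This app uses the camera to recognize objects in real time.</string>
```
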
Now, when the user launches the app for the first time they will be prompted to allow access to the camera.

Step 5: Getting the model.

The heart of this project is most likely the machine learning model. The model must be able to take in an image and give us back a prediction of what the image is. You can find free trained models here. The one I chose is ResNet50.

Once you obtain your model, drag and drop it into Xcode. It will automatically generate the necessary classes, providing you an interface to interact with your model.

Step 6: Image analysis.

To analyze what the camera is seeing, we need to somehow gain access to the frames being captured by the camera.

Conforming to AVCaptureVideoDataOutputSampleBufferDelegate gives us an interface to interact with the camera and be notified every time a frame is captured.

Conform ViewController to the AVCaptureVideoDataOutputSampleBufferDelegate.

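The class declaration then becomes:

```swift
class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
    // ...
}
```
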
We need to tell our video output that ViewController is its sample buffer delegate.

Add the following line in setupCaptureSession():

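Assuming the output from Step 3 is named dataOutput, the line looks like this:

```swift
// Deliver captured frames to ViewController on a dedicated background queue.
dataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
```
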
Add the following function:

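A sketch of the delegate method, assuming the Resnet50 class generated in Step 5 and the label property we will add in Step 7:

```swift
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // Wrap the CoreML model so it can be used with the Vision framework.
    guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }

    // Ask Vision to classify the frame, then show the top result in our label.
    let request = VNCoreMLRequest(model: model) { finishedRequest, error in
        guard let results = finishedRequest.results as? [VNClassificationObservation],
              let firstObservation = results.first else { return }

        // UI updates must happen on the main thread.
        DispatchQueue.main.async {
            self.label.text = firstObservation.identifier
        }
    }

    // Vision needs a CVPixelBuffer, so convert the frame before analysis.
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
}
```
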
Each time a frame is captured, the delegate is notified by calling captureOutput(). This is a perfect place to do our image analysis with CoreML.

First, we create a VNCoreMLModel, which is essentially a CoreML model used with the Vision framework. We create it with a Resnet50 model.

Next, we create our Vision request. In the completion handler, we update the onscreen UILabel with the identifier returned by the model. We then convert the frame passed to us from a CMSampleBuffer to a CVPixelBuffer, which is the format our model needs for analysis.

Lastly, we perform the Vision request with a VNImageRequestHandler.

Step 7: Create a label.

The last step is to create a UILabel containing the model’s prediction.

Create a new UILabel and position it using constraints.

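A sketch, using Auto Layout anchors; the font, color, and position are just one option:

```swift
// A label pinned near the bottom of the screen to display predictions.
let label: UILabel = {
    let label = UILabel()
    label.textColor = .white
    label.font = UIFont.systemFont(ofSize: 30)
    label.translatesAutoresizingMaskIntoConstraints = false
    return label
}()

func setupLabel() {
    view.addSubview(label)
    label.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
    label.bottomAnchor.constraint(equalTo: view.bottomAnchor, constant: -50).isActive = true
}
```
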
Don’t forget to add the label as a subview and call setupLabel() from within viewDidLoad().

You can download the completed project from GitHub here.

Like what you see? Give this post a thumbs up, follow me on Twitter or GitHub, or check out my personal page.

Translated from: https://www.freecodecamp.org/news/ios-coreml-vision-image-recognition-3619cf319d0b/
