Teasing CoreML on iOS: Is This a Vase?

Sodaslay

Original article: http://www.jianshu.com/p/0cbf4d17ac88

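CoreML is the machine learning framework Apple introduced at WWDC 2017. What can it actually do, and can it recognize a vase? Let's find out.

Models

CoreML uses a model format that Apple defined itself, with the file extension .mlmodel. With the CoreML framework and a model file, machine learning features can be built at the app level.

Apple's developer site already offers four models for download.

Demo

Apple provides a demo project that requires Xcode 9 and iOS 11.

I downloaded it and ran it. I have to say Apple is very developer-friendly here: drag the model file into the project and Xcode automatically generates an interface file: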

 
        import CoreML

        class MarsHabitatPricerInput : MLFeatureProvider {
            var solarPanels: Double
            var greenhouses: Double
            var size: Double

            var featureNames: Set<String> {
                get {
                    return ["solarPanels", "greenhouses", "size"]
                }
            }

            func featureValue(for featureName: String) -> MLFeatureValue? {
                if (featureName == "solarPanels") {
                    return MLFeatureValue(double: solarPanels)
                }
                if (featureName == "greenhouses") {
                    return MLFeatureValue(double: greenhouses)
                }
                if (featureName == "size") {
                    return MLFeatureValue(double: size)
                }
                return nil
            }

            init(solarPanels: Double, greenhouses: Double, size: Double) {
                self.solarPanels = solarPanels
                self.greenhouses = greenhouses
                self.size = size
            }
        }

        class MarsHabitatPricerOutput : MLFeatureProvider {
            let price: Double

            var featureNames: Set<String> {
                get {
                    return ["price"]
                }
            }

            func featureValue(for featureName: String) -> MLFeatureValue? {
                if (featureName == "price") {
                    return MLFeatureValue(double: price)
                }
                return nil
            }

            init(price: Double) {
                self.price = price
            }
        }

        @objc class MarsHabitatPricer:NSObject {
            var model: MLModel

            init(contentsOf url: URL) throws {
                self.model = try MLModel(contentsOf: url)
            }

            convenience override init() {
                let bundle = Bundle(for: MarsHabitatPricer.self)
                let assetPath = bundle.url(forResource: "MarsHabitatPricer", withExtension:"mlmodelc")
                try! self.init(contentsOf: assetPath!)
            }

            func prediction(input: MarsHabitatPricerInput) throws -> MarsHabitatPricerOutput {
                let outFeatures = try model.prediction(from: input)
                let result = MarsHabitatPricerOutput(price: outFeatures.featureValue(for: "price")!.doubleValue)
                return result
            }

            func prediction(solarPanels: Double, greenhouses: Double, size: Double) throws -> MarsHabitatPricerOutput {
                let input_ = MarsHabitatPricerInput(solarPanels: solarPanels, greenhouses: greenhouses, size: size)
                return try self.prediction(input: input_)
            }
        }

This article was first published on the author's personal blog as iOS-CoreML-初探. Please credit the source when reposting.

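As the generated file shows, it mainly defines the input, the output, and the prediction methods. Calling it is simple: just pass in the parameters.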
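As a minimal sketch of what "just pass in the parameters" might look like, the snippet below calls the generated `prediction(solarPanels:greenhouses:size:)` method from the interface file. It assumes MarsHabitatPricer.mlmodel has been added to the app target so that Xcode generates the `MarsHabitatPricer` class. The helper name `estimatePrice` and the sample values are made up for illustration.

```swift
import CoreML

// Hypothetical helper wrapping the Xcode-generated MarsHabitatPricer class.
// Assumes MarsHabitatPricer.mlmodel is part of the app target.
func estimatePrice(solarPanels: Double, greenhouses: Double, size: Double) -> Double? {
    let model = MarsHabitatPricer()
    do {
        let output = try model.prediction(solarPanels: solarPanels,
                                          greenhouses: greenhouses,
                                          size: size)
        return output.price
    } catch {
        // prediction(...) throws if the input does not match the model description.
        print("Prediction failed: \(error)")
        return nil
    }
}

// Sample inputs are illustrative only.
if let price = estimatePrice(solarPanels: 1.5, greenhouses: 2, size: 1_000) {
    print("Predicted price: \(price)")
}
```

Returning an optional instead of crashing keeps a failed prediction from taking the app down, which is a design choice of this sketch rather than something the demo does.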

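However, these interface files do not show up in the file navigator on the left side of Xcode.

After some digging, it turns out they are generated under the DerivedData directory, presumably to keep things tidier for developers.

Running the demo shows that its main feature is predicting a price.

That seems a little underwhelming...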

Resnet50

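We have not used the four models from Apple's site yet, so let's see what they can do. From a quick look they are mainly for object recognition. OK, time to write some code.

First download the Resnet50 model, then create a new Swift project and drag the model in:

The model description shows that it is a neural network classifier: the input is a 224 × 224 pixel image and the output is the classification result.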
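The expected input size can also be read in code rather than from the Xcode model viewer. The following is a rough sketch, assuming `modelURL` points at a compiled .mlmodelc bundle; the function name `imageInputSizes` is hypothetical.

```swift
import CoreML

// Returns the pixel size each image input of a compiled Core ML model expects.
// `modelURL` is assumed to point at a compiled .mlmodelc bundle.
func imageInputSizes(modelURL: URL) throws -> [String: (width: Int, height: Int)] {
    let model = try MLModel(contentsOf: modelURL)
    var sizes: [String: (width: Int, height: Int)] = [:]
    for (name, feature) in model.modelDescription.inputDescriptionsByName {
        // imageConstraint is nil for non-image inputs such as Double features.
        guard let constraint = feature.imageConstraint else { continue }
        sizes[name] = (width: constraint.pixelsWide, height: constraint.pixelsHigh)
    }
    return sizes
}

// Usage sketch: the resource name matches the one used by the generated Resnet50 class.
if let url = Bundle.main.url(forResource: "Resnet50", withExtension: "mlmodelc"),
   let sizes = try? imageInputSizes(modelURL: url) {
    for (name, size) in sizes {
        print("\(name): \(size.width) x \(size.height)")
    }
}
```

For Resnet50 this should report 224 x 224 for the `image` input, in line with the model description.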

The automatically generated interface file:

 
        import CoreML

        class Resnet50Input : MLFeatureProvider {
            var image: CVPixelBuffer

            var featureNames: Set<String> {
                get {
                    return ["image"]
                }
            }

            func featureValue(for featureName: String) -> MLFeatureValue? {
                if (featureName == "image") {
                    return MLFeatureValue(pixelBuffer: image)
                }
                return nil
            }

            init(image: CVPixelBuffer) {
                self.image = image
            }
        }

        class Resnet50Output : MLFeatureProvider {
            let classLabelProbs: [String : Double]
            let classLabel: String

            var featureNames: Set<String> {
                get {
                    return ["classLabelProbs", "classLabel"]
                }
            }

            func featureValue(for featureName: String) -> MLFeatureValue? {
                if (featureName == "classLabelProbs") {
                    return try! MLFeatureValue(dictionary: classLabelProbs as [NSObject : NSNumber])
                }
                if (featureName == "classLabel") {
                    return MLFeatureValue(string: classLabel)
                }
                return nil
            }

            init(classLabelProbs: [String : Double], classLabel: String) {
                self.classLabelProbs = classLabelProbs
                self.classLabel = classLabel
            }
        }

        @objc class Resnet50:NSObject {
            var model: MLModel

            init(contentsOf url: URL) throws {
                self.model = try MLModel(contentsOf: url)
            }

            convenience override init() {
                let bundle = Bundle(for: Resnet50.self)
                let assetPath = bundle.url(forResource: "Resnet50", withExtension:"mlmodelc")
                try! self.init(contentsOf: assetPath!)
            }

            func prediction(input: Resnet50Input) throws -> Resnet50Output {
                let outFeatures = try model.prediction(from: input)
                let result = Resnet50Output(classLabelProbs: outFeatures.featureValue(for: "classLabelProbs")!.dictionaryValue as! [String : Double], classLabel: outFeatures.featureValue(for: "classLabel")!.stringValue)
                return result
            }

            func prediction(image: CVPixelBuffer) throws -> Resnet50Output {
                let input_ = Resnet50Input(image: image)
                return try self.prediction(input: input_)
            }
        }

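OK, it wants a photo, and specifically one of type CVPixelBuffer.

Picking a photo from the album every time is tedious, so let's go straight to the camera. Copy the main classes from Apple's AVCam sample into the project.

Then hide a few buttons in CameraViewController that are not needed: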

 
        self.recordButton.isHidden = true
        self.captureModeControl.isHidden = true
        self.livePhotoModeButton.isHidden = true
        self.depthDataDeliveryButton.isHidden = true

The AVCapturePhotoCaptureDelegate callback for a finished capture is:

 
        func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?)

Looking at the definition of AVCapturePhoto, it happens to have properties of type CVPixelBuffer:

Let's pass one straight in:

 
        // Predict
        if let pixelBuffer = photo.previewPixelBuffer {
            guard let resnet50Output = try? model.prediction(image: pixelBuffer) else {
                fatalError("Unexpected runtime error.")
            }
        }

Everything looks perfect and it compiles. Run it, tap the capture button, and... it crashes with this exception:

 
        [core] Error Domain=com.apple.CoreML Code=1 "Input image feature image does not match model description" UserInfo={NSLocalizedDescription=Input image feature image does not match model description, NSUnderlyingError=0x1c0643420 {Error Domain=com.apple.CoreML Code=1 "Image is not valid width 224, instead is 852" UserInfo={NSLocalizedDescription=Image is not valid width 224, instead is 852}}}

Oops, I forgot to change the size. Find photoSettings and add the width and height:

 
        if !photoSettings.availablePreviewPhotoPixelFormatTypes.isEmpty {
            photoSettings.previewPhotoFormat = [kCVPixelBufferPixelFormatTypeKey as String: photoSettings.availablePreviewPhotoPixelFormatTypes.first!,
                   kCVPixelBufferWidthKey as String : NSNumber(value:224),
                   kCVPixelBufferHeightKey as String : NSNumber(value:224)]
        }

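Run it again and, WTF, it reports the same error. A quick Google search suggests that the width and height keys apparently have no effect when set from Swift.

No way around it, then: turn the captured photo into a UIImage, resize it, and convert it back to a CVPixelBuffer. Let's try: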

 
        photoData = photo.fileDataRepresentation()
        // Convert the photo data to an image
        guard let photoData = photoData else {
            return
        }
        let image = UIImage(data: photoData)
        // Resize to the 224 x 224 input the model expects
        let newWidth: CGFloat = 224.0
        let newHeight: CGFloat = 224.0
        UIGraphicsBeginImageContext(CGSize(width: newWidth, height: newHeight))
        image?.draw(in: CGRect(x: 0, y: 0, width: newWidth, height: newHeight))
        let newImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        guard let finalImage = newImage else {
            return
        }
        // Predict (pixelBufferFromImage(image:) is a helper not shown in this article)
        guard let resnet50Output = try? model.prediction(image: pixelBufferFromImage(image: finalImage)) else {
            fatalError("Unexpected runtime error.")
        }
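The `pixelBufferFromImage(image:)` helper used in the snippet is not shown in the article. One possible way to write such a conversion is sketched below. It is a guess at the general approach, not the author's implementation. The name `makePixelBuffer(from:width:height:)`, the 32ARGB pixel format, and the optional return type are all assumptions of this sketch.

```swift
import UIKit
import CoreVideo

// Hypothetical stand-in for the article's pixelBufferFromImage(image:) helper.
// Renders a UIImage into a newly created 32ARGB pixel buffer of the given size.
func makePixelBuffer(from image: UIImage, width: Int = 224, height: Int = 224) -> CVPixelBuffer? {
    guard let cgImage = image.cgImage else { return nil }

    // Ask Core Video for a buffer that Core Graphics can draw into.
    let attributes: [String: Any] = [
        kCVPixelBufferCGImageCompatibilityKey as String: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey as String: true
    ]

    var buffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32ARGB,
                                     attributes as CFDictionary, &buffer)
    guard status == kCVReturnSuccess, let pixelBuffer = buffer else { return nil }

    // Lock the buffer while Core Graphics writes into its memory.
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    guard let context = CGContext(
        data: CVPixelBufferGetBaseAddress(pixelBuffer),
        width: width,
        height: height,
        bitsPerComponent: 8,
        bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
    ) else { return nil }

    // Drawing into the full rect scales the image to width x height.
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return pixelBuffer
}
```

Because the sketch scales while drawing, it could in principle take the full-size photo directly, which would make the separate UIGraphics resize step optional. Either way, a non-square photo is stretched to fit the square input.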

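Run it again. OK, everything works perfectly.

Finally, for a better user experience, pause and restart the camera stream so that the preview does not keep moving during recognition, and show the recognition result in an alert. See the source code at the end of the article for details.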
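The author's actual implementation is in the linked source. As a rough sketch of the idea only, pausing the stream and showing the result might look like the following. The function name `showResult`, its parameters, and the use of a global dispatch queue are assumptions. AVCam itself manages the session on its own dedicated queue, which would be the more natural place for these calls.

```swift
import AVFoundation
import UIKit

// Sketch only: `session` and `presenter` are assumed to be the app's capture
// session and the view controller currently on screen. Call on the main thread.
func showResult(_ label: String, session: AVCaptureSession, presenter: UIViewController) {
    // Stop the camera stream so the preview freezes while the result is shown.
    // stopRunning() and startRunning() block, so keep them off the main thread.
    DispatchQueue.global(qos: .userInitiated).async {
        session.stopRunning()
    }

    let alert = UIAlertController(title: "Result", message: label, preferredStyle: .alert)
    alert.addAction(UIAlertAction(title: "OK", style: .default) { _ in
        // Restart the stream once the user dismisses the alert.
        DispatchQueue.global(qos: .userInitiated).async {
            session.startRunning()
        }
    })
    presenter.present(alert, animated: true)
}
```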

Time to play. Let's try a ballpoint pen first:

It was recognized as an eraser. Fair enough, it does look a bit like one.
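When a result looks off like this, it can help to look at more than the single best label. The generated Resnet50Output also exposes `classLabelProbs`, a dictionary from label to probability. The sketch below picks the most probable entries from such a dictionary. The function name `topPredictions` and the sample numbers are made up for illustration.

```swift
// Returns the `count` most probable labels from a label-to-probability
// dictionary such as Resnet50Output.classLabelProbs, highest first.
func topPredictions(_ probabilities: [String: Double],
                    count: Int = 3) -> [(label: String, probability: Double)] {
    return probabilities
        .sorted { $0.value > $1.value }
        .prefix(max(0, count))          // guard against negative counts
        .map { (label: $0.key, probability: $0.value) }
}

// Made-up probabilities for illustration; real values come from the model.
let sample = ["eraser": 0.46, "ballpoint": 0.31, "vase": 0.02]
for prediction in topPredictions(sample, count: 2) {
    print("\(prediction.label): \(prediction.probability)")
}
```

Showing the runners-up in the alert makes it easier to tell a near miss from a wild guess.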

Now let's try a small potted plant:
