What type is observation.boundingBox in a VNBarcodeObservation result?

Juejin Frontend (掘金前端)

Author: 1024小神

November 14, 2025, 15:20

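Hi everyone. My open-source project PakePlus packages web pages and Vue/React projects into desktop and mobile apps under 5 MB in just a few minutes. Official site: pakeplus.com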

observation.boundingBox is of type CGRect.

The CGRect structure

CGRect is a struct from the Core Graphics framework that represents a rectangular region:

public struct CGRect {
    public var origin: CGPoint
    public var size: CGSize
}

Behavior in the Vision framework

In the Vision framework, boundingBox is expressed in a normalized coordinate system:

let barcodeRequest = VNDetectBarcodesRequest { request, error in
    guard let results = request.results as? [VNBarcodeObservation] else { return }
    
    for observation in results {
        let boundingBox: CGRect = observation.boundingBox
        print("boundingBox: \(boundingBox)")
        
        // Access the individual properties
        print("origin: \(boundingBox.origin)")      // CGPoint
        print("size: \(boundingBox.size)")          // CGSize
        print("x: \(boundingBox.origin.x)")         // CGFloat
        print("y: \(boundingBox.origin.y)")         // CGFloat
        print("width: \(boundingBox.size.width)")   // CGFloat
        print("height: \(boundingBox.size.height)") // CGFloat
        
        // Other convenience properties
        print("minX: \(boundingBox.minX)")
        print("minY: \(boundingBox.minY)")
        print("maxX: \(boundingBox.maxX)")
        print("maxY: \(boundingBox.maxY)")
        print("midX: \(boundingBox.midX)")
        print("midY: \(boundingBox.midY)")
    }
}

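The normalized coordinate system

Vision's boundingBox uses normalized coordinates:

Coordinate range: (0.0, 0.0) to (1.0, 1.0)
Origin: bottom-left corner (unlike UIKit, whose origin is the top-left corner)
x: left to right (0.0 = left edge, 1.0 = right edge)
y: bottom to top (0.0 = bottom edge, 1.0 = top edge)

// Example boundingBox value
let exampleBox = CGRect(x: 0.2, y: 0.3, width: 0.4, height: 0.3)
// This means:
// - the box starts 20% of the image width in from the left edge
// - the box starts 30% of the image height up from the bottom edge
// - its width is 40% of the image width
// - its height is 30% of the image height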

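Coordinate conversion

Because normalized coordinates differ from the UIKit coordinate system, a conversion is needed:

1. Converting to image pixel coordinates

func convertToImageCoordinates(boundingBox: CGRect, imageSize: CGSize) -> CGRect {
    // Vision normalized → image pixel coordinates (the origin is still the bottom-left corner)
    let imageRect = VNImageRectForNormalizedRect(
        boundingBox, 
        Int(imageSize.width), 
        Int(imageSize.height)
    )
    return imageRect
}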

2. Converting to UIKit view coordinates

func convertToViewCoordinates(boundingBox: CGRect, viewSize: CGSize) -> CGRect {
    // Vision → UIKit coordinates (top-left origin): scale to the view and flip the y axis.
    // This assumes the image fills the view exactly (same aspect ratio, no letterboxing).
    let viewRect = CGRect(
        x: boundingBox.origin.x * viewSize.width,
        y: (1 - boundingBox.origin.y - boundingBox.size.height) * viewSize.height,
        width: boundingBox.size.width * viewSize.width,
        height: boundingBox.size.height * viewSize.height
    )
    return viewRect
}
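To make the two conversions concrete, here is a small worked example with numbers. It is only a sketch: pixelRect and topLeftRect are hypothetical helper names, and pixelRect reproduces the plain scaling by hand (rather than calling VNImageRectForNormalizedRect) so that the snippet runs without the Vision framework.

```swift
import Foundation

/// Hypothetical helper: scales a normalized rect to pixels and keeps
/// Vision's bottom-left origin.
func pixelRect(for normalized: CGRect, imageSize: CGSize) -> CGRect {
    CGRect(x: normalized.minX * imageSize.width,
           y: normalized.minY * imageSize.height,
           width: normalized.width * imageSize.width,
           height: normalized.height * imageSize.height)
}

/// Hypothetical helper: same scaling, but flipped vertically for a
/// top-left origin such as UIKit's.
func topLeftRect(for normalized: CGRect, containerSize: CGSize) -> CGRect {
    CGRect(x: normalized.minX * containerSize.width,
           y: (1 - normalized.maxY) * containerSize.height,
           width: normalized.width * containerSize.width,
           height: normalized.height * containerSize.height)
}

let box = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.25)
let size = CGSize(width: 1000, height: 800)

// Bottom-left origin: x 250, y 200, width 500, height 200
let bottomLeft = pixelRect(for: box, imageSize: size)
// Top-left origin: x 250, y 400, width 500, height 200
let topLeft = topLeftRect(for: box, containerSize: size)
print(bottomLeft, topLeft)
```

Only the y value differs between the two results: the box sits 200 pixels above the bottom edge, which is the same as 400 pixels below the top edge of an 800-pixel-tall image.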

Complete usage example

import Vision
import UIKit

class BarcodeDetector {
    func processBarcodeObservation(_ observation: VNBarcodeObservation, 
                                 imageSize: CGSize, 
                                 targetView: UIView) {
        
        let boundingBox: CGRect = observation.boundingBox
        
        // 1. Print the raw boundingBox
        print("Raw boundingBox: \(boundingBox)")
        
        // 2. Convert to image coordinates
        let imageRect = VNImageRectForNormalizedRect(
            boundingBox,
            Int(imageSize.width),
            Int(imageSize.height)
        )
        print("Image coordinates: \(imageRect)")
        
        // 3. Convert to view coordinates (for drawing on screen)
        let viewRect = convertToViewRect(boundingBox: boundingBox, 
                                       viewSize: targetView.bounds.size)
        print("View coordinates: \(viewRect)")
        
        // 4. Draw the bounding box in the UI
        drawBoundingBox(on: targetView, rect: viewRect)
    }
    
    private func convertToViewRect(boundingBox: CGRect, viewSize: CGSize) -> CGRect {
        return CGRect(
            x: boundingBox.origin.x * viewSize.width,
            y: (1 - boundingBox.origin.y - boundingBox.size.height) * viewSize.height,
            width: boundingBox.size.width * viewSize.width,
            height: boundingBox.size.height * viewSize.height
        )
    }
    
    private func drawBoundingBox(on view: UIView, rect: CGRect) {
        // Remove any previously drawn bounding box
        view.layer.sublayers?.removeAll(where: { $0.name == "boundingBox" })
        
        // Create a new layer for the bounding box
        let boxLayer = CAShapeLayer()
        boxLayer.name = "boundingBox"
        boxLayer.frame = rect
        boxLayer.borderColor = UIColor.green.cgColor
        boxLayer.borderWidth = 2.0
        boxLayer.backgroundColor = UIColor.clear.cgColor
        
        view.layer.addSublayer(boxLayer)
    }
}
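The view conversion in the example above treats the image as if it filled the target view exactly. When the image is shown aspect-fit (letterboxed and centered), the displayed image area has to be computed first. The following is a minimal sketch under that assumption; aspectFitViewRect is a hypothetical helper, not part of Vision or UIKit.

```swift
import Foundation

/// Hypothetical helper: maps a normalized Vision rect into a view that
/// shows the image aspect-fit (scaled uniformly and centered).
func aspectFitViewRect(for normalized: CGRect,
                       imageSize: CGSize,
                       viewSize: CGSize) -> CGRect {
    guard imageSize.width > 0, imageSize.height > 0 else { return .null }

    // Uniform scale that makes the whole image fit inside the view.
    let scale = min(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    let shown = CGSize(width: imageSize.width * scale,
                       height: imageSize.height * scale)

    // Empty margins left by centering the scaled image.
    let offsetX = (viewSize.width - shown.width) / 2
    let offsetY = (viewSize.height - shown.height) / 2

    return CGRect(x: offsetX + normalized.minX * shown.width,
                  y: offsetY + (1 - normalized.maxY) * shown.height, // flip y
                  width: normalized.width * shown.width,
                  height: normalized.height * shown.height)
}

// An 800x400 image in a 400x400 view is drawn as 400x200 with 100-point
// bars above and below, so the result is x 100, y 150, width 200, height 100.
let fitted = aspectFitViewRect(for: CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5),
                               imageSize: CGSize(width: 800, height: 400),
                               viewSize: CGSize(width: 400, height: 400))
print(fitted)
```

Aspect-fill and other content modes need a different scale and offset, so this sketch only covers the aspect-fit case.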

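Important notes

Coordinate system difference: Vision uses a bottom-left origin, UIKit a top-left origin
Normalized range: coordinate values lie in the range 0.0-1.0
Empty rectangle check: verify that boundingBox is valid
Bounds handling: make sure the converted coordinates stay within the valid range

// Check that boundingBox is valid
if boundingBox.isNull || boundingBox.isInfinite {
    print("Invalid boundingBox")
    return
}

// Check that it lies within the valid range
if boundingBox.minX < 0 || boundingBox.maxX > 1 || 
   boundingBox.minY < 0 || boundingBox.maxY > 1 {
    print("boundingBox is outside the valid range")
}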
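Instead of only reporting an out-of-range box, one option is to clip it to the unit square before converting it. This is a sketch of that idea, assuming clipping is acceptable for your use case; clampedToUnit is a hypothetical helper name.

```swift
import Foundation

/// The full normalized coordinate space used by Vision.
let unitRect = CGRect(x: 0, y: 0, width: 1, height: 1)

/// Hypothetical helper: returns the part of `box` that lies inside the
/// unit square, or nil when the box is invalid or nothing usable remains.
func clampedToUnit(_ box: CGRect) -> CGRect? {
    guard !box.isNull, !box.isInfinite else { return nil }
    let clipped = box.intersection(unitRect)
    return (clipped.isNull || clipped.isEmpty) ? nil : clipped
}

// A box that sticks out past the right and bottom edges is cut back to
// x 0.75, y 0, width 0.25, height 0.25.
let clamped = clampedToUnit(CGRect(x: 0.75, y: -0.25, width: 0.5, height: 0.5))
print(clamped ?? .null)
```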

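Summary: observation.boundingBox is a CGRect that uses a normalized coordinate system to describe the position and size of the detected object within the image. It must be converted to the appropriate coordinate space before it can be used in a UIKit interface.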

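Hi, I'm 1024小神. Feel free to message me about the tech group, freelance group, or stock group, or just to make friends. If you found this article useful, a like, comment, and follow is the best support you can give me.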

What type of data is request.results in the Vision framework?

Juejin Frontend (掘金前端)

Author: 1024小神

November 14, 2025, 14:48

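In the Vision framework, request.results is of type [VNObservation]? (an optional array of observation objects).

Base type

// The base type of request.results
let results: [VNObservation]? = request.results

Concrete subclass types

Depending on the Vision request, the objects in the results array are instances of different VNObservation subclasses: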

1. Barcode detection - VNDetectBarcodesRequest

let barcodeRequest = VNDetectBarcodesRequest { request, error in
    // Downcast to the concrete observation type
    guard let results = request.results as? [VNBarcodeObservation] else { return }
    
    for barcode in results {
        print("Barcode type: \(barcode.symbology.rawValue)")
        print("Barcode payload: \(barcode.payloadStringValue ?? "")")
        print("Confidence: \(barcode.confidence)")
        print("Bounding box: \(barcode.boundingBox)")
    }
}

2. Text recognition - VNRecognizeTextRequest

let textRequest = VNRecognizeTextRequest { request, error in
    guard let results = request.results as? [VNRecognizedTextObservation] else { return }
    
    for observation in results {
        // Take the best candidate for the recognized text
        let topCandidates = observation.topCandidates(1)
        if let recognizedText = topCandidates.first {
            print("Recognized text: \(recognizedText.string)")
            print("Confidence: \(recognizedText.confidence)")
        }
    }
}

3. Face detection - VNDetectFaceRectanglesRequest

let faceRequest = VNDetectFaceRectanglesRequest { request, error in
    guard let results = request.results as? [VNFaceObservation] else { return }
    
    for face in results {
        print("Face location: \(face.boundingBox)")
        print("Confidence: \(face.confidence)")
    }
}

4. Rectangle detection - VNDetectRectanglesRequest

let rectangleRequest = VNDetectRectanglesRequest { request, error in
    guard let results = request.results as? [VNRectangleObservation] else { return }
    
    for rectangle in results {
        print("Rectangle location: \(rectangle.boundingBox)")
        print("Top left: \(rectangle.topLeft)")
        print("Top right: \(rectangle.topRight)")
        print("Bottom left: \(rectangle.bottomLeft)")
        print("Bottom right: \(rectangle.bottomRight)")
    }
}

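Complete type handling example

// handleBarcodeResults, handleTextResults, handleFaceResults, and
// handleRectangleResults stand for your own handler functions.
func handleVisionResults(request: VNRequest, error: Error?) {
    if let error = error {
        print("Vision request error: \(error)")
        return
    }
    
    // First check whether there are any results
    guard let results = request.results, !results.isEmpty else {
        print("Nothing detected")
        return
    }
    
    // Handle the results according to the request type
    switch request {
    case is VNDetectBarcodesRequest:
        handleBarcodeResults(results as! [VNBarcodeObservation])
        
    case is VNRecognizeTextRequest:
        handleTextResults(results as! [VNRecognizedTextObservation])
        
    case is VNDetectFaceRectanglesRequest:
        handleFaceResults(results as! [VNFaceObservation])
        
    case is VNDetectRectanglesRequest:
        handleRectangleResults(results as! [VNRectangleObservation])
        
    default:
        print("Unknown request type")
        // Generic handling. boundingBox is declared on VNDetectedObjectObservation,
        // not on the VNObservation base class, so a conditional cast is needed.
        for observation in results {
            if let object = observation as? VNDetectedObjectObservation {
                print("Detected object - confidence: \(object.confidence), location: \(object.boundingBox)")
            } else {
                print("Observation - confidence: \(observation.confidence)")
            }
        }
    }
}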

Handling type casts safely

To avoid crashes from forced casts, prefer a safe approach:

func safeHandleResults(request: VNRequest) {
    // An empty array casts successfully to any element type, so rule it out first
    guard let results = request.results, !results.isEmpty else { return }
    
    // The safe way: conditional casts
    if let barcodeResults = results as? [VNBarcodeObservation] {
        handleBarcodes(barcodeResults)
    } else if let textResults = results as? [VNRecognizedTextObservation] {
        handleText(textResults)
    } else if let faceResults = results as? [VNFaceObservation] {
        handleFaces(faceResults)
    } else {
        // Generic handling
        for observation in results {
            print("Base observation: \(observation)")
        }
    }
}
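Casting the whole array has one subtlety: an empty array casts successfully to any element type, so the first branch would match whenever nothing was detected. An alternative is to match each observation individually. The sketch below assumes a recent SDK where results is typed as [VNObservation]?; describe and describeAll are hypothetical helper names.

```swift
import Vision

/// Hypothetical helper: describes one observation by its most specific
/// known type.
func describe(_ observation: VNObservation) -> String {
    switch observation {
    // The more specific subclasses come first: barcode and text
    // observations are themselves rectangle observations.
    case let barcode as VNBarcodeObservation:
        return "barcode: \(barcode.payloadStringValue ?? "n/a")"
    case let text as VNRecognizedTextObservation:
        return "text: \(text.topCandidates(1).first?.string ?? "n/a")"
    case let face as VNFaceObservation:
        return "face at \(face.boundingBox)"
    case let rectangle as VNRectangleObservation:
        return "rectangle at \(rectangle.boundingBox)"
    case let object as VNDetectedObjectObservation:
        return "object at \(object.boundingBox)"
    default:
        return "observation with confidence \(observation.confidence)"
    }
}

/// Hypothetical helper: describes every result of a finished request.
func describeAll(_ request: VNRequest) -> [String] {
    (request.results ?? []).map(describe)
}
```

Per-element matching costs one cast per observation, but it behaves the same for empty and non-empty results and needs no forced cast.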

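Common properties of VNObservation

Every VNObservation subclass shares a few common properties, such as uuid and confidence. boundingBox is not one of them: it is declared on VNDetectedObjectObservation, the superclass of the barcode, text, face, and rectangle observations used above.

for observation in request.results ?? [] {
    print("UUID: \(observation.uuid)")
    print("Confidence: \(observation.confidence)") // 0.0 to 1.0
    
    // boundingBox exists only on VNDetectedObjectObservation and its subclasses
    guard let object = observation as? VNDetectedObjectObservation else { continue }
    print("Bounding box: \(object.boundingBox)") // normalized coordinates ((0,0) to (1,1))
    
    // Convert the bounding box to pixel coordinates for a specific image
    let imageSize = CGSize(width: 1000, height: 800) // example size; use the real image size
    let boundingBoxInPixels = VNImageRectForNormalizedRect(
        object.boundingBox, 
        Int(imageSize.width), 
        Int(imageSize.height)
    )
    print("Pixel coordinates: \(boundingBoxInPixels)")
}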

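Summary

Base type: [VNObservation]?
Downcast to the concrete subclass to access type-specific features
Different request types return different VNObservation subclasses
Always optional: results is nil until the request has produced results, and it can be an empty array when nothing was detected
Common properties such as uuid and confidence exist on every observation; boundingBox exists on VNDetectedObjectObservation and its subclasses

This design lets the Vision framework keep type safety while offering a unified interface for a variety of computer vision tasks.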

What are VNDetectBarcodesRequest and VNImageRequestHandler in Swift? What do they do, and what does VN stand for?

Juejin Frontend (掘金前端)

Author: 1024小神

November 14, 2025, 14:31

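In Swift, VNDetectBarcodesRequest and VNImageRequestHandler are classes in the Vision framework, used for computer vision tasks. VN is short for Vision.

Vision framework overview

Vision is Apple's framework for performing computer vision tasks, including:

Face detection
Barcode recognition
Text recognition
Image analysis
Object tracking, and more

VNImageRequestHandler - the image request handler

Purpose: processes an image and performs Vision requests on it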

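import Vision
import UIKit

// Create an image request handler
// (this snippet is meant to live inside a function, hence the guard/return)
let image = UIImage(named: "barcode_image")
guard let cgImage = image?.cgImage else { return }

let requestHandler = VNImageRequestHandler(cgImage: cgImage)

// A handler can also be created from other sources
// (imageURL, ciImage, and pixelBuffer are placeholders for your own values)
let requestHandlerFromURL = VNImageRequestHandler(url: imageURL)
let requestHandlerFromCIImage = VNImageRequestHandler(ciImage: ciImage)
let requestHandlerFromBuffer = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)

// Perform the requests (barcodeRequest and textRequest are created as shown later)
do {
    try requestHandler.perform([barcodeRequest, textRequest])
} catch {
    print("Processing failed: \(error)")
}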
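Because perform(_:) runs synchronously, calling it on the main thread can stall the UI on large images. A common pattern is to move the call to a background queue. The following is a sketch of that pattern, not a prescribed API; performInBackground is a hypothetical helper name.

```swift
import Vision
import Dispatch

/// Hypothetical helper: runs Vision requests off the main thread.
/// `perform(_:)` is synchronous, so each request's completion handler has
/// already run by the time it returns.
func performInBackground(_ requests: [VNRequest],
                         on cgImage: CGImage,
                         completion: @escaping (Error?) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform(requests)
            completion(nil)    // called on the background queue
        } catch {
            completion(error)  // called on the background queue
        }
    }
}
```

The completion closure is invoked on the background queue, so any UI update inside it should hop back with DispatchQueue.main.async.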

VNDetectBarcodesRequest - the barcode detection request

Purpose: detects and decodes barcodes in an image

// Create a barcode detection request
let barcodeRequest = VNDetectBarcodesRequest { request, error in
    if let error = error {
        print("Barcode detection error: \(error)")
        return
    }
    
    // Process the detection results
    guard let results = request.results as? [VNBarcodeObservation] else { return }
    
    for observation in results {
        print("Barcode detected:")
        print("Type: \(observation.symbology.rawValue)")
        print("Payload: \(observation.payloadStringValue ?? "no payload")")
        print("Location: \(observation.boundingBox)")
        
        // Corner points of the barcode. These are non-optional CGPoint values in
        // normalized coordinates, so no optional binding is needed.
        print("Corners: \(observation.topLeft), \(observation.topRight), \(observation.bottomLeft), \(observation.bottomRight)")
    }
}

// Configure request options (optional)
// Pinning a revision is optional; when unset, Vision uses the default revision of the SDK you build against
barcodeRequest.revision = VNDetectBarcodesRequestRevision1
// Restrict detection to specific barcode types
barcodeRequest.symbologies = [.QR, .code128, .EAN13]
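The corner points printed above are normalized with a bottom-left origin, just like boundingBox. As a rough sketch, a single corner can be mapped to pixel coordinates with Vision's VNImagePointForNormalizedPoint and then flipped for a top-left origin; pixelPoint is a hypothetical helper name.

```swift
import Vision

/// Hypothetical helper: maps a normalized Vision point to pixel
/// coordinates with a top-left origin.
func pixelPoint(for normalized: CGPoint,
                imageWidth: Int,
                imageHeight: Int) -> CGPoint {
    // Scale to pixels; the result still has a bottom-left origin.
    let scaled = VNImagePointForNormalizedPoint(normalized, imageWidth, imageHeight)
    // Flip the y axis for top-left-origin drawing.
    return CGPoint(x: scaled.x, y: CGFloat(imageHeight) - scaled.y)
}

// A corner at (0.25, 0.25) in a 1000x800 image lies 250 pixels from the
// left edge and 600 pixels from the top edge.
let corner = pixelPoint(for: CGPoint(x: 0.25, y: 0.25), imageWidth: 1000, imageHeight: 800)
print(corner)
```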

Complete usage example

import Vision
import UIKit

class BarcodeScanner {
    func detectBarcodes(in image: UIImage) {
        guard let cgImage = image.cgImage else { return }
        
        // Create the barcode detection request
        let barcodeRequest = VNDetectBarcodesRequest { request, error in
            self.handleBarcodeResults(request: request, error: error)
        }
        
        // Configure the barcode types to look for
        barcodeRequest.symbologies = [.QR, .PDF417, .code128]
        
        // Create the image request handler and perform the request
        let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        
        do {
            try requestHandler.perform([barcodeRequest])
        } catch {
            print("Barcode detection failed: \(error)")
        }
    }
    
    private func handleBarcodeResults(request: VNRequest, error: Error?) {
        if let error = error {
            print("Processing error: \(error)")
            return
        }
        
        // A successful cast can still yield an empty array, so check for that too
        guard let results = request.results as? [VNBarcodeObservation], !results.isEmpty else {
            print("No barcodes detected")
            return
        }
        
        for barcode in results {
            print("""
            Barcode detected:
            Type: \(barcode.symbology.rawValue)
            Payload: \(barcode.payloadStringValue ?? "N/A")
            Confidence: \(barcode.confidence)
            """)
            
            // In a real app, this is where the barcode data would be handled
            if let payload = barcode.payloadStringValue {
                processBarcodePayload(payload, type: barcode.symbology)
            }
        }
    }
    
    private func processBarcodePayload(_ payload: String, type: VNBarcodeSymbology) {
        switch type {
        case .QR:
            print("QR code payload: \(payload)")
            // Handle URLs, contact details, and so on
        case .code128:
            print("Code128 payload: \(payload)")
            // Handle product codes and so on
        default:
            print("Other barcode type: \(payload)")
        }
    }
}

Supported barcode types

The Vision framework supports many barcode types, including:

let supportedSymbologies: [VNBarcodeSymbology] = [
    .Aztec,        // Aztec code
    .code39,       // Code 39
    .code93,       // Code 93
    .code128,      // Code 128
    .dataMatrix,   // Data Matrix
    .EAN8,         // EAN-8
    .EAN13,        // EAN-13
    .PDF417,       // PDF417
    .QR,           // QR code
    .UPCE,         // UPC-E
    .ITF14,        // ITF-14
    .codabar       // Codabar (requires iOS 15 or later)
]
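The set of supported symbologies depends on the OS version and request revision, so a hard-coded list can drift. On iOS 15 / macOS 12 and later, the request itself can be asked at runtime. This is a minimal sketch under that availability assumption; availableSymbologies is a hypothetical helper name.

```swift
import Vision

/// Hypothetical helper: asks a barcode request which symbologies it
/// supports on the current OS; returns an empty array if the query fails.
@available(iOS 15.0, macOS 12.0, tvOS 15.0, *)
func availableSymbologies() -> [VNBarcodeSymbology] {
    let request = VNDetectBarcodesRequest()
    return (try? request.supportedSymbologies()) ?? []
}

if #available(iOS 15.0, macOS 12.0, tvOS 15.0, *) {
    for symbology in availableSymbologies() {
        print(symbology.rawValue)
    }
}
```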

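Other commonly used Vision requests

Besides barcode detection, the Vision framework offers other detection features:

// Text recognition
let textRequest = VNRecognizeTextRequest { request, error in
    // Handle the recognized text
}

// Face detection
let faceRequest = VNDetectFaceRectanglesRequest { request, error in
    // Handle the detected faces
}

// Rectangle detection (finds rectangular shapes, not arbitrary objects)
let objectRequest = VNDetectRectanglesRequest { request, error in
    // Handle the detected rectangles
}

// Perform several requests at once
do {
    try requestHandler.perform([barcodeRequest, textRequest, faceRequest])
} catch {
    print("Request failed: \(error)")
}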

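Advantages

High performance: can take advantage of on-device hardware acceleration such as the Neural Engine
High accuracy: built on machine learning models
Easy to use: a simple API design
Real-time processing: supports processing live camera streams
Privacy: processing happens on the device and data is not uploaded

The Vision framework gives iOS/macOS apps powerful computer vision capabilities, letting developers implement complex features such as barcode and text recognition with little effort.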
