Project Training Log: Backend Design of the File Transfer OCR Interface

m0_63187203

Last modified: 2024-05-31 10:13:30

Tags: ocr

First published: 2024-05-30 17:56:08

Article link: https://blog.csdn.net/m0_63187203/article/details/139317581

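I. Requirements Analysis

File upload: the user uploads a file to the server through the front-end interface.
OCR processing: after receiving the file, the server runs OCR on it, extracts the text, and returns it to the client.
Interface definition: specify the interface between client and server, including request and response formats.
Error handling: handle the errors that can occur during file upload and OCR.
Security: keep data safe while files are transferred and processed.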

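II. System Architecture

The system uses a client-server architecture and communicates through a RESTful API. The main components are:

Front end: uploads files and requests OCR processing.
Server: receives file upload requests and calls the OCR module to recognize text.
OCR module: performs OCR on the uploaded files.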

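III. Interface Definition

1. File Upload Interface

Method: POST
URL: /upload (the sample controllers later in this article mount it under /api, that is, /api/upload)

Request header: Content-Type: multipart/form-data

Request parameters:

file: the file to upload

Request example: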

POST /upload HTTP/1.1
Host: example.com
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="document.pdf"
Content-Type: application/pdf

(file content)
--boundary--

Response:

Status code: 200 OK
Response body:

{
  "message": "File uploaded successfully",
  "file_id": "12345"
}
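
To make the multipart layout of the request above concrete, here is a minimal sketch that assembles such a body by hand using only the Java standard library. The class name MultipartBodySketch and its build method are hypothetical helpers for illustration, not part of the project; a real client would normally let an HTTP library generate the body and pick a random boundary.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a multipart/form-data body with a single file part.
final class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    static byte[] build(String boundary, String fieldName, String filename,
                        String contentType, byte[] content) {
        String head = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + filename + "\"" + CRLF
                + "Content-Type: " + contentType + CRLF
                + CRLF; // a blank line separates the part headers from the part content
        String tail = CRLF + "--" + boundary + "--" + CRLF; // closing delimiter

        byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        byte[] tailBytes = tail.getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(content, 0, content.length);
        out.write(tailBytes, 0, tailBytes.length);
        return out.toByteArray();
    }
}
```

The returned bytes would be sent as the request body, with the same boundary string repeated in the Content-Type header.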

2. OCR Processing Interface

Method: GET
URL: /process_ocr

Request parameters:

file_id: the file ID returned by the upload request

Request example:

GET /process_ocr?file_id=12345 HTTP/1.1
Host: example.com

Response:

Status code: 200 OK
Response body:

{
  "file_id": "12345",
  "ocr_result": "recognized text content"
}
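
Because file_id travels as a query parameter, a client should percent-encode it before building the URL. The sketch below shows one way to do that with the standard library; the class and method names are hypothetical. Encoding matters most when the identifier contains slashes or spaces, which is the case in the later Tess4J listings where the stored path is returned as the identifier.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the request path for the OCR endpoint.
final class OcrRequestUrlSketch {
    static String processOcrPath(String fileId) {
        // URLEncoder applies form encoding: '/' becomes %2F and a space becomes '+'
        return "/process_ocr?file_id=" + URLEncoder.encode(fileId, StandardCharsets.UTF_8);
    }
}
```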

IV. Flow Diagram

sequenceDiagram
    participant Client
    participant Server
    participant OCRModule
    
    Client->>Server: Upload file (POST /upload)
    Server-->>Client: Return file_id (JSON)
    Client->>Server: Request OCR processing (GET /process_ocr?file_id=12345)
    Server->>OCRModule: Run OCR
    OCRModule-->>Server: Return recognition result
    Server-->>Client: Return OCR result (JSON)
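
The two-step exchange in the diagram can be sketched as a small in-memory model. This is only an illustration of the flow, under the assumption that uploads are kept in a map and the OCR module is any function from bytes to text; OcrFlowSketch and its members are hypothetical names, and the real server stores files on disk.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-memory model of the upload / process_ocr exchange.
final class OcrFlowSketch {
    private final Map<String, byte[]> uploads = new ConcurrentHashMap<>();
    private final Function<byte[], String> ocrEngine; // stands in for the OCR module

    OcrFlowSketch(Function<byte[], String> ocrEngine) {
        this.ocrEngine = ocrEngine;
    }

    // Step 1 (POST /upload): store the content and hand back a file_id.
    String upload(byte[] content) {
        String fileId = UUID.randomUUID().toString();
        uploads.put(fileId, content.clone());
        return fileId;
    }

    // Step 2 (GET /process_ocr): look up the file and run OCR on it.
    String processOcr(String fileId) {
        byte[] content = uploads.get(fileId);
        if (content == null) {
            throw new NoSuchElementException("Unknown file_id: " + fileId);
        }
        return ocrEngine.apply(content);
    }
}
```

Keeping the upload and the OCR call as separate steps lets the client retry recognition without sending the file again.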

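V. Error Handling and Security

Error Handling

File upload failure: return status code 400 together with an error message.
OCR processing failure: return status code 500 together with an error message.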
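
As a rough sketch of the rule above, the helper below picks a status code for a failure and builds a small JSON error body. It assumes that invalid uploads are signalled with IllegalArgumentException and that every other failure counts as a server-side error; both the class name and that convention are hypothetical. Note that the sample exception classes later in this article both map to 500, so an upload-specific 400 would need its own exception type.

```java
// Hypothetical helper: maps failures to status codes and JSON error bodies.
final class ErrorMappingSketch {
    static int statusFor(Throwable error) {
        if (error instanceof IllegalArgumentException) {
            return 400; // assumed convention: rejected upload, client error
        }
        return 500; // storage or OCR failure on the server
    }

    static String errorBody(int status, String message) {
        return "{\"status\": " + status + ", \"error\": \"" + escape(message) + "\"}";
    }

    // Escapes quotes, backslashes and control characters for use inside a JSON string.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```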

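Security

Use HTTPS: protects data in transit.
Virus scanning: scan uploaded files for viruses before processing them.
File size and type limits: reduce the risk of attacks through malicious files.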
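
The size and type limits could be checked along the following lines. This is a sketch under stated assumptions: the 10 MiB limit and the accepted types (PDF, PNG, JPEG) are illustrative choices, not requirements from the project, and the class name is hypothetical. The type check looks at the leading bytes of the content instead of trusting the client-supplied Content-Type. Spring Boot can also enforce a size limit through the spring.servlet.multipart.max-file-size property.

```java
import java.util.Arrays;

// Hypothetical helper: rejects empty, oversized or unrecognized uploads.
final class UploadValidationSketch {
    static final long MAX_BYTES = 10L * 1024 * 1024; // assumed limit of 10 MiB

    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G'};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // Returns a MIME type based on the leading bytes, or null if unrecognized.
    static String detectType(byte[] content) {
        if (startsWith(content, PDF)) return "application/pdf";
        if (startsWith(content, PNG)) return "image/png";
        if (startsWith(content, JPEG)) return "image/jpeg";
        return null;
    }

    static void validate(byte[] content) {
        if (content.length == 0) throw new IllegalArgumentException("Empty file");
        if (content.length > MAX_BYTES) throw new IllegalArgumentException("File too large");
        if (detectType(content) == null) throw new IllegalArgumentException("Unsupported file type");
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Holding the whole file in a byte array keeps the sketch short; for large uploads it would be better to read only the first few bytes from the input stream.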

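VI. Sample Code Skeleton

File Upload Interface

Controller

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api")
public class FileController {

    private final FileService fileService;

    public FileController(FileService fileService) {
        this.fileService = fileService;
    }

    @PostMapping("/upload")
    public ResponseEntity<?> uploadFile(@RequestParam("file") MultipartFile file) {
        // Store the file and return its ID
        String fileId = fileService.storeFile(file);
        return ResponseEntity.ok(new UploadResponse("File uploaded successfully", fileId));
    }
}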

Service

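import java.io.IOException;
import java.io.InputStream;
import java.nio.file.*;
import java.util.UUID;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;

@Service
public class FileService {
    public String storeFile(MultipartFile file) {
        // Store the file and return its ID
        String fileId = UUID.randomUUID().toString();
        // The file is assumed to be stored under the "uploads" directory
        Path filePath = Paths.get("uploads", fileId);
        try (InputStream in = file.getInputStream()) {
            Files.createDirectories(filePath.getParent()); // make sure "uploads" exists
            Files.copy(in, filePath, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new FileStorageException("Failed to store file", e);
        }
        return fileId;
    }
}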

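OCR Processing Interface

Controller

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api")
public class OCRController {

    private final OCRService ocrService;

    public OCRController(OCRService ocrService) {
        this.ocrService = ocrService;
    }

    @GetMapping("/process_ocr")
    public ResponseEntity<?> processOCR(@RequestParam("file_id") String fileId) {
        // Run OCR on the previously uploaded file
        String ocrResult = ocrService.processFile(fileId);
        return ResponseEntity.ok(new OCRResponse(fileId, ocrResult));
    }
}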

Service

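import java.nio.file.Path;
import java.nio.file.Paths;
import org.springframework.stereotype.Service;

@Service
public class OCRService {
    public String processFile(String fileId) {
        // The file is assumed to be stored under "uploads" with its ID as the file name
        Path filePath = Paths.get("uploads", fileId);
        try {
            // Call the OCR library and return its result.
            // "ocrLibrary" is a placeholder: no concrete OCR library is wired in at this stage.
            String ocrResult = ocrLibrary.process(filePath);
            return ocrResult;
        } catch (Exception e) {
            throw new OCRProcessingException("Failed to process OCR", e);
        }
    }
}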

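Custom Response Classes

import com.fasterxml.jackson.annotation.JsonProperty;

public class UploadResponse {
    private String message;
    private String fileId;

    public UploadResponse(String message, String fileId) {
        this.message = message;
        this.fileId = fileId;
    }

    // Getters are required for JSON serialization; setters are omitted here
    public String getMessage() { return message; }

    @JsonProperty("file_id") // matches the key used in the interface definition
    public String getFileId() { return fileId; }
}

public class OCRResponse {
    private String fileId;
    private String ocrResult;

    public OCRResponse(String fileId, String ocrResult) {
        this.fileId = fileId;
        this.ocrResult = ocrResult;
    }

    @JsonProperty("file_id")
    public String getFileId() { return fileId; }

    @JsonProperty("ocr_result")
    public String getOcrResult() { return ocrResult; }
}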

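These response classes are only examples used for my own testing; they will later be aligned with the project's existing response class design.

Exception Handling

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ResponseStatus;

@ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)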
public class FileStorageException extends RuntimeException {
    public FileStorageException(String message, Throwable cause) {
        super(message, cause);
    }
}

@ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
public class OCRProcessingException extends RuntimeException {
    public OCRProcessingException(String message, Throwable cause) {
        super(message, cause);
    }
}

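The OCR logic above does not use a concrete OCR library yet. The next section fills in the actual OCR processing.

VII. Concrete Backend Implementation

We will use the Tesseract OCR library for OCR processing. First make sure Tesseract OCR is installed on the system, then add the corresponding dependency to the project.

Adding the Dependency

Add the Tess4J dependency (a Java wrapper for Tesseract) to the pom.xml of the Spring Boot project: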

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.5.5</version>
</dependency>

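File Upload Interface

Controller

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api")
public class FileController {

    private final FileService fileService;

    public FileController(FileService fileService) {
        this.fileService = fileService;
    }

    @PostMapping("/upload")
    public ResponseEntity<?> uploadFile(@RequestParam("file") MultipartFile file) {
        // In this version storeFile returns the stored path, which serves as the file identifier
        String fileId = fileService.storeFile(file);
        return ResponseEntity.ok(new UploadResponse("File uploaded successfully", fileId));
    }
}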

Service

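import java.io.IOException;
import java.io.InputStream;
import java.nio.file.*;
import java.util.UUID;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;

@Service
public class FileService {
    public String storeFile(MultipartFile file) {
        String fileId = UUID.randomUUID().toString();
        // Caution: the original file name comes from the client and is used here unsanitized
        Path filePath = Paths.get("uploads", fileId + "-" + file.getOriginalFilename());
        try (InputStream in = file.getInputStream()) {
            Files.createDirectories(filePath.getParent()); // make sure "uploads" exists
            Files.copy(in, filePath, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new FileStorageException("Failed to store file", e);
        }
        // Returns the stored path (not the bare ID) so the OCR service can open the file directly
        return filePath.toString();
    }
}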
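
Since the stored name embeds file.getOriginalFilename(), which is supplied by the client, it is worth cleaning that value before using it in a path. The following is a minimal sketch of one conservative approach; SafeFileNameSketch is a hypothetical helper, and the whitelist is an assumption (it also replaces non-ASCII characters such as Chinese file names with underscores, which may not be desirable).

```java
// Hypothetical helper: reduces a client-supplied file name to a safe last segment.
final class SafeFileNameSketch {
    static String sanitize(String originalFilename) {
        if (originalFilename == null) {
            return "upload";
        }
        // Drop any directory part, treating both '/' and '\' as separators
        int cut = Math.max(originalFilename.lastIndexOf('/'), originalFilename.lastIndexOf('\\'));
        String name = originalFilename.substring(cut + 1);
        // Replace everything outside a conservative whitelist
        name = name.replaceAll("[^A-Za-z0-9._-]", "_");
        // Avoid empty names and names made only of dots, such as "." or ".."
        if (name.isEmpty() || name.matches("\\.+")) {
            return "upload";
        }
        return name;
    }
}
```

With such a helper, the stored name would be built as fileId + "-" + SafeFileNameSketch.sanitize(file.getOriginalFilename()).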

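OCR Processing Interface

Controller

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api")
public class OCRController {

    private final OCRService ocrService;

    public OCRController(OCRService ocrService) {
        this.ocrService = ocrService;
    }

    @GetMapping("/process_ocr")
    public ResponseEntity<?> processOCR(@RequestParam("file_id") String fileId) {
        // file_id carries the stored path returned by the upload endpoint
        String ocrResult = ocrService.processFile(fileId);
        return ResponseEntity.ok(new OCRResponse(fileId, ocrResult));
    }
}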

Service

import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;
import org.springframework.stereotype.Service;

@Service
public class OCRService {

    private final Tesseract tesseract;

    public OCRService() {
        tesseract = new Tesseract();
        tesseract.setDatapath("path_to_tessdata"); // path to the tessdata directory
        tesseract.setLanguage("eng"); // recognition language
    }

    public String processFile(String filePath) {
        try {
            return tesseract.doOCR(new File(filePath));
        } catch (TesseractException e) {
            throw new OCRProcessingException("Failed to process OCR", e);
        }
    }
}
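
Because processFile opens whatever path the request supplies, a client could ask the server to read files outside the upload directory. One possible guard is sketched below, under the assumption that the client sends only a name relative to the upload directory (which differs from the listing above, where the full stored path is passed). UploadPathSketch and resolveInside are hypothetical names.

```java
import java.nio.file.Path;

// Hypothetical helper: resolves a requested name and rejects paths outside the base directory.
final class UploadPathSketch {
    static Path resolveInside(Path baseDir, String requested) {
        Path base = baseDir.toAbsolutePath().normalize();
        // normalize() collapses ".." segments so that startsWith compares real locations
        Path candidate = base.resolve(requested).normalize();
        if (!candidate.startsWith(base) || candidate.equals(base)) {
            throw new IllegalArgumentException("Path escapes upload directory: " + requested);
        }
        return candidate;
    }
}
```

The service would then call doOCR on resolveInside(Paths.get("uploads"), name).toFile() instead of on the raw request value.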

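Error handling, status codes, and similar concerns will be refined later using the backend's existing utility classes.

After testing the two endpoints separately, the next step is to run OCR right after the upload:

We adjust the backend code so that OCR is triggered as soon as the file is uploaded and the result is returned to the front end.

Merged File Upload and OCR Processing Interface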

Controller

import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api")
public class FileController {

    private final FileService fileService;
    private final OCRService ocrService;

    public FileController(FileService fileService, OCRService ocrService) {
        this.fileService = fileService;
        this.ocrService = ocrService;
    }

    // Alternative mapping with explicit media types (the class-level "/api" prefix still applies):
    //@PostMapping(value = "/upload", consumes = MediaType.MULTIPART_FORM_DATA_VALUE, produces = MediaType.APPLICATION_JSON_VALUE)
    @PostMapping("/upload")
    public ResponseEntity<OCRResponse> uploadAndProcessFile(@RequestParam("file") MultipartFile file) {
        // Store the upload, then run OCR on the stored file in the same request
        String filePath = fileService.storeFile(file);
        String ocrResult = ocrService.processFile(filePath);
        return ResponseEntity.ok()
                .contentType(MediaType.APPLICATION_JSON)
                .body(new OCRResponse(filePath, ocrResult));
    }
}
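
With the merged endpoint the OCR result is returned in the same response, so the stored file may no longer be needed once recognition finishes. If that holds for the use case (an assumption, since the article does not say whether uploads are kept), the file could be staged temporarily and deleted afterwards, roughly as follows. TempUploadSketch is a hypothetical helper; the suffix parameter exists because some OCR libraries infer the input format from the file extension.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

// Hypothetical helper: stages content in a temp file, runs OCR on it, then deletes the file.
final class TempUploadSketch {
    static String processAndDelete(byte[] content, String suffix, Function<Path, String> ocr) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("upload-", suffix);
            Files.write(tmp, content);
            return ocr.apply(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to stage upload", e);
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp); // clean up whether or not OCR succeeded
                } catch (IOException ignored) {
                    // best-effort cleanup; the OCR result or original error takes priority
                }
            }
        }
    }
}
```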

Service

FileService and OCRService are unchanged from the Tess4J versions shown earlier in this section.

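VIII. Summary

This article analyzed the design and implementation of the file transfer and OCR processing interfaces, covering requirements analysis, system architecture, interface definition, the flow diagram, error handling, security, and sample code. With these steps, the file upload feature in a chat interface gains OCR processing capability. The design applies to scenarios that need to process documents, giving users convenient file upload and text recognition. The front-end example continues in the next article.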
