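The Vosk models can transcribe basic speech into text, and the recognition accuracy is decent. It is simple to use: just unzip the model package. I have tested it on both Windows and Linux. If you want the package, feel free to contact me :)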
VOSKMODELPATH is the location of the unzipped Vosk model package that I configured.
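The getWord method in this post calls two helpers, read and toShort, that are not shown. It uses them to pull two bytes at offset 22 from the WAV file. In a canonical WAV header (where the "fmt " chunk immediately follows the RIFF header) that field is the channel count, stored little-endian. Below is a minimal sketch of what such helpers might look like. The class name WavHeader and the channelCount method are my own hypothetical names, and the helper bodies are assumptions, not the original implementation.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-ins for the read/toShort helpers used by getWord.
class WavHeader {

    // Read `length` bytes starting at byte offset `pos` of the file.
    static byte[] read(RandomAccessFile raf, int pos, int length) throws IOException {
        byte[] buf = new byte[length];
        raf.seek(pos);
        raf.readFully(buf);
        return buf;
    }

    // Interpret two bytes as a little-endian 16-bit value,
    // which is how WAV headers store their numeric fields.
    static short toShort(byte[] b) {
        return ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).getShort();
    }

    // Assumes a canonical header layout: channel count at bytes 22-23.
    static short channelCount(String wavPath) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(wavPath, "r")) {
            return toShort(read(raf, 22, 2));
        }
    }
}
```

Using try-with-resources here closes the file even if reading fails, which the explicit rdf.close() call would not do on an exception. Files with extra chunks before "fmt " would need real chunk parsing instead of a fixed offset.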
public static String getWord(String path) {
    String filePath = voiceToAudio(path);
    File f = new File(filePath);
    try {
        Assert.isTrue(StringUtils.hasLength(VOSKMODELPATH), "Invalid Vosk model!");
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        // Resample to 16 kHz
        reSamplingAndSave(bytes, filePath);
        // Read 2 bytes at offset 22 of the WAV header (the channel count in a standard header)
        RandomAccessFile rdf = new RandomAccessFile(f, "r");
        short track = toShort(read(rdf, 22, 2));
        rdf.close();
        LibVosk.setLogLevel(LogLevel.WARNINGS);
        Model model = new Model(VOSKMODELPATH);
        InputStream ais = AudioSystem.getAudioInputStream(new BufferedInputStream(new FileInputStream(filePath)));
        // The sample rate is the audio sample rate multiplied by the channel count
        Recognize