# Voice Input on Linux by a Roundabout Route: A Full Record of My Pitfalls and Workarounds, Using the Sogou Voice Module

kai__csdn

Published 2025-02-09 17:28:36

Tags: linux, operations, servers, experience sharing, learning, python

Article link: https://blog.csdn.net/kai__csdn/article/details/145534241

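Background: When Linux Meets Voice Input

While learning Linux, I found that it has no good voice input option: a feature taken for granted on Windows and Mac turns out to be a luxury in the Linux ecosystem. The Linux builds of the mainstream input methods mostly ship without the voice module, and deploying a local ASR engine is fairly involved (I have not tried that yet). This post describes a workaround: running the voice module of Sogou Input Method on its own.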

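1. Where the Conventional Options Fall Short

1.1 Input Method Survey

iFlytek Input Method: the Linux version has no voice module
Sogou Input Method: the official .deb package keeps only the basic features
Baidu Input Method: the same feature is missing
Local ASR engine: requires configuring an acoustic model, a language model, a decoder, and more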

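1.2 Trying a Wine Install

After installing a Wine environment from the official Deepin repository, I installed Sogou Input Method directly inside Wine.
Download link: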

https://ime-sec.gtimg.com/202502091719/f92ce9aed37af208602894a6339f5991/pc/dl/gzindex/1737187571/sogou_pinyin_15.1a.exe?f=pinyinbanner&dcs=c1cd74d7553a75f3c36d485945646588

2. An Unexpected Discovery

In the ~/.wine/drive_c/Program Files/SogouInput directory, I found a standalone component:

Components/
└── VoiceInput/
    └── 1.0.0.52/
        ├── VoiceInput.exe  # core voice input module
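
Because the version folder (1.0.0.52 here) may differ between Sogou releases, the executable can be looked up instead of hard-coded. This is a minimal sketch that assumes the directory layout shown above. The function name `find_voice_input` and the `Program Files*` wildcard (meant to also cover `Program Files (x86)`) are my own assumptions, not something the installer guarantees.

```python
from pathlib import Path


def find_voice_input(prefix="~/.wine"):
    """Return every VoiceInput.exe found under a Wine prefix, sorted by path.

    Assumes the directory layout shown in the tree above; the wildcards
    cover other version folders and "Program Files (x86)".
    """
    root = Path(prefix).expanduser() / "drive_c"
    pattern = "Program Files*/SogouInput/Components/VoiceInput/*/VoiceInput.exe"
    return sorted(root.glob(pattern))
```

Calling `find_voice_input("~/.sogou_voice")` would search a non-default prefix. An empty list means nothing matched, in which case the layout on your machine probably differs from the one above.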

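2.2 Verifying That It Runs Standalone

Run it directly by pointing Wine at a specific prefix. Set WINEPREFIX to the prefix where Sogou was actually installed (Wine's default prefix is ~/.wine):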

env WINEPREFIX=~/.sogou_voice wine "C:/Program Files/SogouInput/Components/VoiceInput/1.0.0.52/VoiceInput.exe"

The voice input window popped up, and a microphone test showed it responding normally.
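
The same launch can be scripted from Python, which is convenient if the recognizer and a helper script should start together. This is a minimal sketch, assuming `wine` is on PATH and reusing the prefix and path from the command above. `build_launch` and `launch_voice_input` are hypothetical helper names introduced here.

```python
import os
import subprocess

# Values copied from the command above; adjust them to your own install.
DEFAULT_PREFIX = "~/.sogou_voice"
DEFAULT_EXE = "C:/Program Files/SogouInput/Components/VoiceInput/1.0.0.52/VoiceInput.exe"


def build_launch(prefix=DEFAULT_PREFIX, exe=DEFAULT_EXE):
    """Return (argv, env) for starting VoiceInput.exe inside the given Wine prefix."""
    env = dict(os.environ)
    env["WINEPREFIX"] = os.path.expanduser(prefix)
    return ["wine", exe], env


def launch_voice_input(prefix=DEFAULT_PREFIX, exe=DEFAULT_EXE):
    """Start VoiceInput.exe in the background and return the Popen handle."""
    argv, env = build_launch(prefix, exe)
    return subprocess.Popen(argv, env=env)
```

Keeping `build_launch` separate from the actual `Popen` call makes it possible to inspect the command and environment without starting Wine.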

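3. Automating the Fragmented Output

3.1 One Small Problem

The recognition results are written to the clipboard one short phrase at a time, which means:

Frequent manual pasting (Ctrl+V) is required
The text arrives in fragments while you are dictating
Continuous voice input is not possible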

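3.2 An Automated Solution

A Python monitoring script handles the pasting automatically, built on:

pyperclip: cross-platform clipboard access, polled for changes
xdotool: X11 window-system automation
psutil: process status detection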

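First, install the dependencies. The script imports pyperclip, which on Linux also relies on a clipboard tool such as xclip, so install both alongside xdotool and psutil:

sudo apt install xdotool xclip python3-psutil python3-pyperclip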
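
Before starting the script, it can help to confirm that everything it relies on is present. The sketch below is my own addition and not part of the original setup. `missing_dependencies` is a hypothetical helper, and its default name lists simply mirror the tools mentioned above.

```python
import importlib.util
import shutil


def missing_dependencies(commands=("xdotool",), modules=("pyperclip", "psutil")):
    """Return the names of required commands and Python modules that are absent."""
    missing = [cmd for cmd in commands if shutil.which(cmd) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    absent = missing_dependencies()
    print("Missing:", ", ".join(absent) if absent else "nothing")
```

An empty result only shows that the command and modules can be found, not that the clipboard or X11 session is working.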

Then start the voice recognition module,
and after that run the Python auto-paste script:

import time
import subprocess
from argparse import ArgumentParser

import pyperclip
from psutil import process_iter

TARGET_EXE = "VoiceInput.exe"

def is_voice_input_active():
    """Return True if a process running VoiceInput.exe (under Wine) exists."""
    for proc in process_iter(['name', 'cmdline']):
        name = proc.info['name'] or ''
        cmdline = proc.info['cmdline'] or []
        # Wine may report the process as wine-preloader or as the .exe itself,
        # and the command line holds the full path, so match by substring.
        if TARGET_EXE in name or any(TARGET_EXE in arg for arg in cmdline):
            return True
    return False

def monitor_clipboard(paste_delay=0.2):
    """Poll the clipboard and send Ctrl+V whenever VoiceInput updates it."""
    last_content = pyperclip.paste()
    while True:
        current_content = pyperclip.paste()
        if current_content != last_content:
            if current_content and is_voice_input_active():
                time.sleep(paste_delay)  # wait for the content to be fully written
                subprocess.run(['xdotool', 'key', 'ctrl+v'])
            # Always record the new content so an unrelated copy is not pasted later
            last_content = current_content
        time.sleep(0.1)

if __name__ == "__main__":
    parser = ArgumentParser(description="Auto-paste when VoiceInput updates clipboard.")
    parser.add_argument('--delay', type=float, default=0.2, help="Delay before pasting (seconds)")
    args = parser.parse_args()
    monitor_clipboard(args.delay)
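
The loop above is hard to try out without a microphone and an X11 session. One way to check the change-detection logic in isolation is to separate the decision from its side effects. This is a sketch of that idea rather than a replacement for the script. `ClipboardWatcher` and its callable parameters are hypothetical names introduced here.

```python
class ClipboardWatcher:
    """Paste only when the clipboard changed while voice input is running."""

    def __init__(self, read_clipboard, is_active, paste):
        self.read_clipboard = read_clipboard  # e.g. pyperclip.paste
        self.is_active = is_active            # e.g. is_voice_input_active
        self.paste = paste                    # e.g. a function that calls xdotool
        self.last = read_clipboard()

    def poll(self):
        """Check the clipboard once; return True if a paste was triggered."""
        current = self.read_clipboard()
        if current == self.last:
            return False
        self.last = current  # remember every change, pasted or not
        if current and self.is_active():
            self.paste()
            return True
        return False
```

With fake callables passed in for the three parameters, `poll()` can be driven step by step in a test, and the real functions can be plugged in unchanged for actual use.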