Converting PO (Portable Object) Files to Excel/CSV to Automate Translation File Updates

bier_盖子

Last modified 2023-03-28 13:28:15

Tags: excel, automation, python

First published 2023-03-28 13:22:16

Article link: https://blog.csdn.net/teddy_dewei_lu/article/details/129814111

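This article presents a script written in Python that converts PO files to Excel and CSV so they can be translated online in a TMS, and converts the translated files back to PO. The main libraries it depends on are Flask-Babel, openpyxl, pandas and polib. The script exposes export and import subcommands on the command line and supports the excel, csv and json formats.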

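I had not logged in for so long that I forgot my password, and I was too lazy to reset it, so this blog sat idle until I recently typed the right password by accident. So I am back, after many years without an update. This post covers something I dealt with lately: translating PO files had become too much of a hassle.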

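It started when a recent project used Python. Python's best-known i18n tool is pybabel, and pybabel generates PO files. Translators then need a dedicated tool to edit them, which is not friendly. Our company also has its own in-house TMS, where Excel or CSV files can be imported, translated online, exported, and put back into our own projects. I searched online for a while and found nothing that fit my needs, so I decided to write a converter myself, for my own convenience and for the translators'.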

That is enough background. Let's get started.

requirements.txt needs to list these packages:

Flask-Babel==2.0.0
openpyxl==3.1.2
pandas==1.5.1
polib==1.2.0
argparse==1.4.0
colorama==0.4.6

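Flask-Babel: an i18n solution. Flask-Babel is a Flask extension that adds i18n and l10n support to any Flask application with the help of babel, pytz and speaklater. It has built-in support for date formatting and time zones, as well as a very simple and friendly gettext translation interface.
openpyxl: a Python library dedicated to reading and writing Excel files.
pandas: a well-known Python library for data analysis and processing.
polib: a library for manipulating, creating and modifying gettext files (pot, po and mo files). It is the main library this project relies on.
argparse: a Python module for writing user-friendly command-line interfaces, and the other key library in this project. It has been part of the standard library since Python 3.2, so the pinned PyPI package is only needed on older interpreters.
colorama: a Python library for colored terminal text and cursor positioning (optional).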

Our main goal is to convert between PO files and translation files through custom Python commands.

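First, the main program. I will just paste the code. Because I use colorama, I overrode the parser's error handling; if you do not use it, you can remove the related code.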

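import argparse
import os
from colorama import init, Fore

# args_out and args_in are the handlers shown later in this article;
# import them here from wherever you keep them.

# Directory containing this script
dir_path = os.path.dirname(os.path.abspath(__file__))

# Optional custom colorama initialization
# init(autoreset=True)

class ThrowingArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        # Override error(); adapt to your needs. Raising (rather than silently
        # returning) lets the try/except below print the help text.
        raise ValueError(message)

parser = ThrowingArgumentParser(
    description=f"{Fore.BLUE}Convert po files to excel/csv files, and excel/csv/json files to po files{Fore.RESET}",
    formatter_class=argparse.RawTextHelpFormatter
)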

try:
    subparsers = parser.add_subparsers(
        help=f"""{Fore.LIGHTCYAN_EX}lang_dump.py out [-h] \n{Fore.LIGHTCYAN_EX}lang_dump.py in [-h]""",
        description=f"{Fore.BLUE}out converts po files to excel/csv | in converts translated files back to po files{Fore.RESET}",
    )

    # Export to excel/csv files
    parser_out = subparsers.add_parser("out", formatter_class=argparse.RawTextHelpFormatter)
    parser_out.add_argument(
        "-t", "--type", choices=['excel', 'csv'], type=str, default="excel,csv",
        help=f"""{Fore.BLUE}File format to export; only excel and csv are currently supported, and both are written by default
{Fore.LIGHTBLUE_EX}example:{Fore.LIGHTCYAN_EX} lang_dump.py out -t excel{Fore.RESET}""")
    parser_out.set_defaults(func=args_out)

    # Convert back to po files
    parser_in = subparsers.add_parser("in", formatter_class=argparse.RawTextHelpFormatter)
    parser_in.add_argument(
        "-t", "--type", choices=['excel', 'csv', 'json'], type=str, required=True,
        help=f"""{Fore.BLUE}Type of the input files; only excel (*.xlsx), csv and json are supported
{Fore.LIGHTBLUE_EX}example:{Fore.LIGHTCYAN_EX} lang_dump.py in -t json{Fore.RESET}""")
    parser_in.add_argument(
        "-f", "--file", action="append", required=True,
        help=f"""{Fore.BLUE}Name of an input file, preferably placed in the current directory; repeat -f for multiple translation files
{Fore.LIGHTBLUE_EX}example:{Fore.LIGHTCYAN_EX} lang_dump.py in -f zh-CN.json -f en.json{Fore.RESET}""")
    parser_in.add_argument(
        "-l", "--lang", action="append", choices=['zh', 'en'], required=True,
        help=f"""{Fore.BLUE}Language of each file, in the same order as the --file arguments; repeat -l for multiple language packs
{Fore.LIGHTBLUE_EX}example:{Fore.LIGHTCYAN_EX} lang_dump.py in -l zh -l en{Fore.RESET}""")
    parser_in.set_defaults(func=args_in)

    # Parse the arguments and dispatch to the selected handler
    args = parser.parse_args()
    args.func(args)
except Exception as ex:
    parser.print_help()
    raise
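
The dispatch pattern used above can be tried in isolation. The following is a minimal sketch, not the author's script: the handler names `handle_out` and `handle_in` and the `ParserError` exception are hypothetical stand-ins, and colorama is left out so that only the standard library is needed. It shows `error()` raising instead of exiting, and `set_defaults(func=...)` routing each subcommand to its handler.

```python
import argparse


class ParserError(Exception):
    """Hypothetical exception raised in place of argparse's default exit."""


class ThrowingArgumentParser(argparse.ArgumentParser):
    def error(self, message):
        # argparse normally prints usage and exits; raise so the caller decides
        raise ParserError(message)


def handle_out(args):
    # Stand-in for args_out: report which formats were requested
    return ("out", args.type.split(","))


def handle_in(args):
    # Stand-in for args_in: pair each language with its file, in order
    return ("in", list(zip(args.lang, args.file)))


def build_parser():
    parser = ThrowingArgumentParser(prog="lang_dump.py")
    # Subparsers inherit the parser class, so they raise ParserError too
    subparsers = parser.add_subparsers(dest="command")

    parser_out = subparsers.add_parser("out")
    parser_out.add_argument("-t", "--type", choices=["excel", "csv"], default="excel,csv")
    parser_out.set_defaults(func=handle_out)

    parser_in = subparsers.add_parser("in")
    parser_in.add_argument("-t", "--type", choices=["excel", "csv", "json"], required=True)
    parser_in.add_argument("-f", "--file", action="append", required=True)
    parser_in.add_argument("-l", "--lang", action="append", choices=["zh", "en"], required=True)
    parser_in.set_defaults(func=handle_in)
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `run(["out", "-t", "csv"])` reaches `handle_out`, while `run(["in", "-t", "json"])` raises `ParserError` because `-f` and `-l` are missing. Note that argparse does not check a default value against `choices`, which is why the combined default `"excel,csv"` is accepted.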

My export code is also fairly simple and easy to follow.

It reads the PO file contents with polib, appends them row by row to an Excel sheet through openpyxl, and finally saves the Excel file.

import os
import shutil

import pandas as pd
import polib
from colorama import init, Fore
from openpyxl.workbook import Workbook

dir_path = "..."

def args_out(args):
    """po to excel/csv"""
    # Requested formats; the default "excel,csv" yields both
    tp_list = str(args.type).split(",")
    # Refresh the language packs first
    command = f"""
        cd {dir_path}/../.. &&
        pybabel extract -F babel.cfg -o messages.pot . && 
        pybabel update -i messages.pot -d translations 
        """
    os.system(command)

    wb = Workbook()
    # grab the active worksheet
    ws = wb.active
    ws.title = 'sheet1'
    # Column order of the translation template: KEY (message key), zh-cn (Chinese), en (English)
    ws.append(['KEY', 'zh-cn', 'en'])

    zh_po_file = f'{dir_path}/../../translations/zh/LC_MESSAGES/messages.po'
    en_po_file = f'{dir_path}/../../translations/en/LC_MESSAGES/messages.po'
    zh_po = polib.pofile(zh_po_file, encoding='UTF-8')
    en_po = polib.pofile(en_po_file, encoding='UTF-8')
    # print(zh_po.fuzzy_entries())

    # Index the English entries by msgid (first occurrence wins) to avoid a nested loop
    en_by_id = {}
    for en_msg in en_po:
        en_by_id.setdefault(en_msg.msgid, en_msg)

    for zh_msg in zh_po:
        en_msg = en_by_id.get(zh_msg.msgid)
        if en_msg is not None:
            ws.append([zh_msg.msgid, zh_msg.msgstr, en_msg.msgstr])

    excel_file = 'messages.xlsx'
    out_path = f'{dir_path}/out'
    if os.path.exists(out_path):
        shutil.rmtree(f'{out_path}/')
    os.makedirs(out_path)

    wb.save(f'{out_path}/{excel_file}')
    wb.close()

    if 'csv' in tp_list:
        # The CSV is derived from the workbook that was just saved
        excel_data = pd.read_excel(f'{out_path}/{excel_file}', index_col=0)
        excel_data.to_csv(f'{out_path}/messages.csv', encoding='utf-8')
    if 'excel' not in tp_list:
        # Only csv was requested, so drop the intermediate workbook
        os.remove(f'{out_path}/{excel_file}')

    print(Fore.LIGHTGREEN_EX + "\n-----------> out success\n")
    print(Fore.LIGHTGREEN_EX + f"Exported {args.type} file(s) successfully, output directory: {out_path}")
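
The core of the export is a join of the two catalogs on `msgid`. As a rough sketch of that step on its own, the following uses plain dictionaries in place of polib entries and the standard-library `csv` module in place of openpyxl and pandas. `merge_catalogs` and `rows_to_csv` are hypothetical helper names, and the sample strings are invented.

```python
import csv
import io


def merge_catalogs(zh, en):
    """Join two {msgid: msgstr} mappings on msgid, keeping the zh order.

    Keys missing from en are skipped, mirroring the export loop above.
    """
    rows = [["KEY", "zh-cn", "en"]]
    for msgid, zh_str in zh.items():
        if msgid in en:
            rows.append([msgid, zh_str, en[msgid]])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes commas and newlines."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Invented sample catalogs
zh_catalog = {"Login": "登录", "Hello, world": "你好，世界"}
en_catalog = {"Login": "Login", "Hello, world": "Hello, world", "Unused": ""}
csv_text = rows_to_csv(merge_catalogs(zh_catalog, en_catalog))
```

With polib, each mapping could be built from a catalog as `{e.msgid: e.msgstr for e in polib.pofile(path)}`. Writing the CSV directly like this would avoid re-reading the workbook, at the cost of keeping two writers in sync.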

My import code is also fairly simple. The idea is that json/excel(*.xlsx)/csv files can be imported and converted into PO files.

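My JSON files are split into zh.json and en.json and handled separately. Excel/CSV files store the language data in column order and are then processed with pandas (PS: you only appreciate pandas once you have used it; the way it handles data is truly elegant). I will not write out the CSV and Excel handling here, since it is all basic code.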

import json
import os

import polib
from colorama import init, Fore

dir_path = "..."

class CommandArgvError(Exception):
    pass

def args_in(args):
    """excel/csv/json to po"""
    if len(args.file) != len(args.lang):
        raise CommandArgvError("the number of -f and -l arguments must match")

    # Files and languages are paired by position
    for lang, file_name in zip(args.lang, args.file):
        po = polib.POFile()
        po_file = f'{dir_path}/../../translations/{lang}/LC_MESSAGES/messages.po'
        if args.type == 'json':
            # JSON handling: parse the whole file once, since a pretty-printed
            # file is not valid JSON line by line
            with open(f'{dir_path}/{file_name}', 'r', encoding='utf-8') as json_file:
                try:
                    json_data = json.load(json_file)
                except json.JSONDecodeError as ex:
                    raise CommandArgvError("malformed json file") from ex

            for key in json_data:
                entry = polib.POEntry(
                    msgid=key,
                    msgstr=json_data[key],
                )
                po.append(entry)

        if args.type == 'csv':
            # csv handling omitted
            pass

        if args.type == 'excel':
            # excel handling omitted
            pass

        po.save(po_file)

    # Run the pybabel update and compile steps
    command = f"""
    cd {dir_path}/../.. &&
    pybabel update -i messages.pot -d translations && 
    pybabel compile -d translations 
    """
    os.system(command)

    print(Fore.LIGHTGREEN_EX + "\n-----------> in success\n")
    print(Fore.LIGHTGREEN_EX + "Files converted successfully, language packs updated")
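
The CSV branch is left out above. One possible way to fill it, offered as a sketch rather than the author's code, is to read the exported `KEY, zh-cn, en` layout and split it into one mapping per language, which can then go through the same `polib.POEntry` loop as the JSON data. The author used pandas for this step; this version uses the standard-library `csv` module so it runs without extra packages. The helper name `csv_to_catalogs` and the column-to-language table are assumptions.

```python
import csv
import io

# Assumed mapping from the exported column headers to the --lang values
COLUMN_TO_LANG = {"zh-cn": "zh", "en": "en"}


def csv_to_catalogs(csv_text):
    """Split CSV text with KEY/zh-cn/en columns into {lang: {msgid: msgstr}}."""
    catalogs = {lang: {} for lang in COLUMN_TO_LANG.values()}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        msgid = row["KEY"]
        if not msgid:
            # Skip rows without a key
            continue
        for column, lang in COLUMN_TO_LANG.items():
            # Missing or empty cells become empty translations
            catalogs[lang][msgid] = row.get(column) or ""
    return catalogs


# Invented sample in the exported layout
sample = 'KEY,zh-cn,en\nLogin,登录,Login\n"Hello, world",你好，世界,"Hello, world"\nLogout,,Logout\n'
catalogs = csv_to_catalogs(sample)
```

Here `catalogs["zh"]` and `catalogs["en"]` each play the role that `json_data` plays in the JSON branch, so one CSV file could feed both language packs.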

Finally, run the command to see the help text.

$ python lang_dump.py -h
usage: lang_dump.py [-h] {out,in} ...

Convert po files to excel/csv files, and excel/csv/json files to po files

optional arguments:
  -h, --help  show this help message and exit

subcommands:
  out converts po files to excel/csv | in converts translated files back to po files

  {out,in}    lang_dump.py out [-h] 
              lang_dump.py in [-h]

Because the export defaults to excel/csv, both files are generated when no argument is passed.

Export the language pack (only Excel and CSV output is supported)

$ python lang_dump.py out -h
usage: lang_dump.py out [-h] [-t {excel,csv}]

optional arguments:
  -h, --help            show this help message and exit
  -t {excel,csv}, --type {excel,csv}
                        File format to export; only excel and csv are currently supported, and both are written by default
                        example: lang_dump.py out -t excel

Example

$ python lang_dump.py out
.
.
.

-----------> out success


Exported excel,csv file(s) successfully, output directory: /home/teddy/PythonProjects/admin-dataportal-api/commands/language/out

Convert translated files back to PO files (only excel/csv/json input is supported)

$ python lang_dump.py in -h 
usage: lang_dump.py in [-h] -t {excel,csv,json} -f FILE -l {zh,en}

optional arguments:
  -h, --help            show this help message and exit
  -t {excel,csv,json}, --type {excel,csv,json}
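                        Type of the input files; only excel (*.xlsx), csv and json are supported
                        example: lang_dump.py in -t json
  -f FILE, --file FILE  Name of an input file, preferably placed in the current directory; repeat -f for multiple translation files
                        example: lang_dump.py in -f zh-CN.json -f en.json
  -l {zh,en}, --lang {zh,en}
                        Language of each file, in the same order as the --file arguments; repeat -l for multiple language packs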
                        example: lang_dump.py in -l zh -l en

For import I only show the JSON usage; the other formats work much the same way.

Example 1 (JSON files; the language order must match the file order)

$ python lang_dump.py in -t json -f zh-CN.json -f en.json -l zh -l en

updating catalog translations/en/LC_MESSAGES/messages.po based on messages.pot
updating catalog translations/zh/LC_MESSAGES/messages.po based on messages.pot
compiling catalog translations/en/LC_MESSAGES/messages.po to translations/en/LC_MESSAGES/messages.mo
compiling catalog translations/zh/LC_MESSAGES/messages.po to translations/zh/LC_MESSAGES/messages.mo

-----------> in success

Files converted successfully, language packs updated
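
The contents of `zh-CN.json` and `en.json` are not shown in this article. Judging from the import code, which uses each key as `msgid` and each value as `msgstr`, each file is presumably a flat JSON object. A small sketch under that assumption, with an invented sample and a hypothetical `load_flat_catalog` helper that rejects nested values before they reach polib:

```python
import json


def load_flat_catalog(text):
    """Parse JSON text and require a flat {msgid: msgstr} object of strings."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"translation for {key!r} must be a string")
    return data


# Invented sample showing the assumed file layout
sample = """
{
    "Login": "登录",
    "Hello, world": "你好，世界"
}
"""
catalog = load_flat_catalog(sample)
```

A check like this could run right after the file is loaded, so that a nested object or a numeric value is reported as an input error instead of surfacing later when the PO file is saved.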