[Notes] Converting between Chinese characters and pinyin (with tone marks and stroke order), 20,842 characters in total

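This document records the process of building a tool that converts between Chinese characters, pinyin, and stroke order. It covers scraping pinyin and stroke-order data from the web, creating a .NET Core console project in VS2019, designing the SQLite database model, and implementing character-to-pinyin conversion, pinyin-to-character conversion, and lookup of a character's radical and stroke order.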

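1  Scraping pinyin and stroke order

The pinyin was scraped from https://zidian.900cha.com/. Data file: "Chinese character pinyin with tone marks and stroke order", 20,842 characters in total (the five characters "壭亪寽兯嚸" are not included).

The stroke order was scraped from http://bs.kaishicha.com/. Data file: "Chinese character stroke order", 20,842 characters in total (the five characters "壭亪寽兯嚸" are not included).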

using System.IO;

public class CharUnit
{
    /// <summary>
    /// The Chinese character
    /// </summary>
    public char Char;
    /// <summary>
    /// Radical
    /// </summary>
    public char Radical;
    /// <summary>
    /// Total stroke count
    /// </summary>
    public byte StrokeCount;
    /// <summary>
    /// Stroke order
    /// </summary>
    public string Strokes;
    /// <summary>
    /// Number of pinyin readings
    /// </summary>
    public byte PinyinCount;
    /// <summary>
    /// Pinyin readings
    /// </summary>
    public string[] PinyinList;

    public static CharUnit Deserialize(BinaryReader binaryReader)
    {
        var charUnit = new CharUnit();
        charUnit.Char = binaryReader.ReadChar();
        charUnit.Radical = binaryReader.ReadChar();
        charUnit.StrokeCount = binaryReader.ReadByte();
        charUnit.Strokes = binaryReader.ReadString();
        charUnit.PinyinCount = binaryReader.ReadByte();
        charUnit.PinyinList = new string[(int)charUnit.PinyinCount];
        for (int i = 0; i < (int)charUnit.PinyinCount; i++)
        {
            charUnit.PinyinList[i] = binaryReader.ReadString();
        }
        return charUnit;
    }

    public void Serialize(BinaryWriter binaryWriter)
    {
        binaryWriter.Write(this.Char);
        binaryWriter.Write(this.Radical);
        binaryWriter.Write(this.StrokeCount);
        binaryWriter.Write(this.Strokes);
        binaryWriter.Write(this.PinyinCount);
        for (int i = 0; i < (int)this.PinyinCount; i++)
        {
            binaryWriter.Write(this.PinyinList[i]);
        }
    }
}
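
To show how records in this binary layout might be stored and then used for lookups in both directions, here is a minimal sketch. It is not the author's code: `CharRecord` is a hypothetical stand-in that mirrors the field order of `CharUnit`, the `Int32` record-count prefix is an assumed file layout, and the stroke strings in the usage note are made up for illustration, since the real format of the data file is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in mirroring the CharUnit wire layout:
// char, radical, stroke count, strokes, pinyin count, pinyin strings.
public sealed class CharRecord
{
    public char Char;
    public char Radical;
    public byte StrokeCount;
    public string Strokes = "";
    public string[] PinyinList = Array.Empty<string>();

    public void Serialize(BinaryWriter writer)
    {
        writer.Write(Char);
        writer.Write(Radical);
        writer.Write(StrokeCount);
        writer.Write(Strokes);
        // The count is derived from the array so the two cannot disagree.
        writer.Write((byte)PinyinList.Length);
        foreach (var pinyin in PinyinList) writer.Write(pinyin);
    }

    public static CharRecord Deserialize(BinaryReader reader)
    {
        var record = new CharRecord();
        record.Char = reader.ReadChar();
        record.Radical = reader.ReadChar();
        record.StrokeCount = reader.ReadByte();
        record.Strokes = reader.ReadString();
        int count = reader.ReadByte();
        record.PinyinList = new string[count];
        for (int i = 0; i < count; i++) record.PinyinList[i] = reader.ReadString();
        return record;
    }
}

public static class CharStore
{
    // Assumed layout: an Int32 record count followed by the records.
    public static byte[] Save(IReadOnlyList<CharRecord> records)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(records.Count);
                foreach (var record in records) record.Serialize(writer);
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    public static List<CharRecord> Load(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            int count = reader.ReadInt32();
            var records = new List<CharRecord>(count);
            for (int i = 0; i < count; i++) records.Add(CharRecord.Deserialize(reader));
            return records;
        }
    }

    // Character -> all of its pinyin readings.
    public static Dictionary<char, string[]> CharToPinyin(IEnumerable<CharRecord> records)
        => records.ToDictionary(r => r.Char, r => r.PinyinList);

    // Pinyin reading -> every character that has that reading.
    public static Dictionary<string, List<char>> PinyinToChars(IEnumerable<CharRecord> records)
        => records
            .SelectMany(r => r.PinyinList.Select(p => (Pinyin: p, Char: r.Char)))
            .GroupBy(x => x.Pinyin)
            .ToDictionary(g => g.Key, g => g.Select(x => x.Char).ToList());
}
```

For example, saving and reloading records for 中 (zhōng, zhòng), 十 (shí) and 石 (shí) and then calling `CharStore.PinyinToChars` would map "shí" to both 十 and 石, while `CharStore.CharToPinyin` would map 中 to its two readings. An in-memory dictionary like this is only one option; the rest of the notes use a SQLite database instead.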

2  Create a new .NET Core console project in VS2019 and add the following NuGet packages

Microsoft.EntityFrameworkCore              // EF Core
Microsoft.EntityFrameworkCore.Design       // manage data migrations from the NuGet console
Microsoft.EntityFrameworkCore.Tools        // manage data migrations from the NuGet console
Microsoft.EntityFrameworkCore.Sqlite       // SQLite
Microsoft.EntityFrameworkCore.Sqlite.Core  // SQLite
HtmlAgilityPack                            // XPath

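3  Five tables in total: characters, radicals, stroke orders, pinyin, and a pinyin-character many-to-many join table. Radical to character is one-to-many; stroke order to character is one-to-one.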
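
As a rough picture of how these five tables and their relationships might be mapped with the EF Core Fluent API, here is a minimal sketch. It is not the author's model: it lives in its own hypothetical `SchemaSketch` namespace, and every name other than `ChineseChar`, `ChineseCharId`, `PinYin`, `PinYinChar` and `PinYinChars` (including `HanziContext`, `StrokeOrder`, the `Name` columns and the `hanzi.db` file name) is an assumption.

```csharp
using System.Collections.Generic;
using Microsoft.EntityFrameworkCore;

namespace SchemaSketch
{
    public class Radical
    {
        public int RadicalId { get; set; }
        public string Name { get; set; }
        public List<ChineseChar> ChineseChars { get; set; } = new List<ChineseChar>();
    }

    public class StrokeOrder
    {
        public int StrokeOrderId { get; set; }
        public string Strokes { get; set; }
        public int ChineseCharId { get; set; }
        public ChineseChar ChineseChar { get; set; }
    }

    public class ChineseChar
    {
        public int ChineseCharId { get; set; }
        public string Char { get; set; }
        public int RadicalId { get; set; }
        public Radical Radical { get; set; }
        public StrokeOrder StrokeOrder { get; set; }
        public List<PinYinChar> PinYinChars { get; set; } = new List<PinYinChar>();
    }

    public class PinYin
    {
        public int PinYinId { get; set; }
        public string Name { get; set; }
        public List<PinYinChar> PinYinChars { get; set; } = new List<PinYinChar>();
    }

    // Join entity for the pinyin <-> character many-to-many relationship.
    public class PinYinChar
    {
        public int PinYinId { get; set; }
        public PinYin PinYin { get; set; }
        public int ChineseCharId { get; set; }
        public ChineseChar ChineseChar { get; set; }
    }

    public class HanziContext : DbContext
    {
        public DbSet<ChineseChar> ChineseChars { get; set; }
        public DbSet<Radical> Radicals { get; set; }
        public DbSet<StrokeOrder> StrokeOrders { get; set; }
        public DbSet<PinYin> PinYins { get; set; }
        public DbSet<PinYinChar> PinYinChars { get; set; }

        // The database file name is a placeholder.
        protected override void OnConfiguring(DbContextOptionsBuilder options)
            => options.UseSqlite("Data Source=hanzi.db");

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            // Radical 1 : N ChineseChar
            modelBuilder.Entity<ChineseChar>()
                .HasOne(c => c.Radical)
                .WithMany(r => r.ChineseChars)
                .HasForeignKey(c => c.RadicalId);

            // ChineseChar 1 : 1 StrokeOrder (the foreign key sits on StrokeOrder)
            modelBuilder.Entity<ChineseChar>()
                .HasOne(c => c.StrokeOrder)
                .WithOne(s => s.ChineseChar)
                .HasForeignKey<StrokeOrder>(s => s.ChineseCharId);

            // PinYin N : M ChineseChar through PinYinChar, with a composite key
            modelBuilder.Entity<PinYinChar>()
                .HasKey(pc => new { pc.PinYinId, pc.ChineseCharId });
            modelBuilder.Entity<PinYinChar>()
                .HasOne(pc => pc.PinYin)
                .WithMany(p => p.PinYinChars)
                .HasForeignKey(pc => pc.PinYinId);
            modelBuilder.Entity<PinYinChar>()
                .HasOne(pc => pc.ChineseChar)
                .WithMany(c => c.PinYinChars)
                .HasForeignKey(pc => pc.ChineseCharId);
        }
    }
}
```

The explicit join entity is used here because the author's class below also goes through `PinYinChar`; the composite key keeps each pinyin-character pair unique.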

public class ChineseChar
{
    public ChineseChar()
        => PinYins = new JoinCollectionFacade<PinYin, PinYinChar>(
            PinYinChars,
            pyc => pyc.PinYin,
            py => new PinYinChar { PinYin = py, ChineseChar = this });

    public int ChineseCharId { get; set; }

    [Column(TypeName = "NCHAR(1)"), Required]
    public