Predicting Gender from Chinese Names with Python Machine Learning

码农的世界，你不懂

Article link: https://blog.csdn.net/u010395024/article/details/103726725

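This article shows how to use machine learning in Python to predict a person's gender from their Chinese name. It covers data preprocessing, feature extraction, model training, and prediction.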

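The complete code is below. You need to collect the data for name.csv yourself; the loading code expects a header row followed by rows of the form name,gender.

import tensorflow as tf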

name_dataset = './name.csv'
train_x = []
train_y = []
with open(name_dataset, 'r', encoding='UTF-8') as f:
    first_line = True
    for line in f:
        # Skip the CSV header row
        if first_line is True:
            first_line = False
            continue
        sample = line.strip().split(',')
        if len(sample) == 2:
            train_x.append(sample[0])
            if sample[1] == '男':
                train_y.append([0, 1])  # male
            else:
                train_y.append([1, 0])  # female
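
The loading loop above can also be written with the standard `csv` module. The following is a minimal sketch, not the author's code: the helper name `load_names` and the two sample rows are hypothetical, and the header text `name,gender` is an assumption about how name.csv is laid out.

```python
import csv
import io

def load_names(lines):
    """Parse 'name,gender' rows into a list of names and one-hot labels."""
    names, labels = [], []
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) != 2:
            continue  # ignore malformed rows, as the original loop does
        name, gender = row[0].strip(), row[1].strip()
        names.append(name)
        # Same encoding as the original: [0, 1] for male, [1, 0] otherwise
        labels.append([0, 1] if gender == '男' else [1, 0])
    return names, labels

# Hypothetical sample data, only to illustrate the assumed file layout
sample_csv = io.StringIO("name,gender\n张伟,男\n王芳,女\n")
names, labels = load_names(sample_csv)
print(names)
print(labels)
```

To read a real file, pass an open file object instead, for example `open('./name.csv', encoding='UTF-8', newline='')`.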

max_name_length = max([len(name) for name in train_x])
# print("Longest name length in characters: ", max_name_length)
# Override the computed maximum with a fixed length of 8
max_name_length = 8
counter = 0
vocabulary = {}
# Count how often each character appears across all names
for name in train_x:
    counter += 1
    tokens = [word for word in name]
    for word in tokens:
        if word in vocabulary:
            vocabulary[word] += 1
        else:
            vocabulary[word] = 1
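
Character counts like these are commonly turned into integer ids so that each name becomes a fixed-length vector. The listing is cut off at this point, so the following is only a sketch of one common approach, not the author's continuation: `build_vocab`, `encode`, the sample names, and the choice of id 0 for padding and unknown characters are all assumptions.

```python
from collections import Counter

def build_vocab(names):
    """Map each character to an integer id, most frequent first; id 0 is reserved."""
    counts = Counter(ch for name in names for ch in name)
    # Sort by descending frequency, then by character for a stable order
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: i + 1 for i, ch in enumerate(ordered)}

def encode(name, vocab, max_len=8):
    """Convert a name to a list of ids, truncated or zero-padded to max_len."""
    # Characters missing from the vocabulary share id 0 with padding (assumption)
    ids = [vocab.get(ch, 0) for ch in name[:max_len]]
    return ids + [0] * (max_len - len(ids))

names = ['张伟', '王伟', '王芳']  # hypothetical sample names
vocab = build_vocab(names)
vec = encode('王伟', vocab)
print(vec)
```

The default `max_len=8` matches the fixed `max_name_length` set earlier, so every encoded name has the same length regardless of how many characters it contains.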

vocabul