Official Pretrained Language Model Files: A Curated Collection [roberta_wwm, bert_wwm, bert, xlnet, ...]

TensorFlow versions of the models:
Link: https://pan.baidu.com/s/10tjVfypoQy6G_mkZK6cqOQ?pwd=yljr
Extraction code: yljr
| Model file | Model name | Corpus | Framework | Download | GitHub repo | Loading method | HuggingFace ID |
|---|---|---|---|---|---|---|---|
| chinese_roberta_wwm_large_ext_L-24_H-1024_A-16 | RoBERTa-wwm-ext-large, Chinese | Chinese Wikipedia plus other encyclopedia, news, and QA data, about 5.4B tokens in total | TensorFlow | Baidu Netdisk (extraction code required) | GitHub - ymcui/Chinese-BERT-wwm: Pre-Training with Whole Word Masking for Chinese BERT | | |
| hfl/chinese-roberta-wwm-ext-large | | | PyTorch | https://huggingface.co/hfl/chinese-roberta-wwm-ext-large | | | hfl/chinese-roberta-wwm-ext-large |
| chinese_roberta_wwm_ext_L-12_H-768_A-12 | RoBERTa-wwm-ext, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | |
| hfl/chinese-roberta-wwm-ext | | | PyTorch | https://huggingface.co/hfl/chinese-roberta-wwm-ext | | | hfl/chinese-roberta-wwm-ext |
| chinese_bert_wwm_ext_L-12_H-768_A-12 | BERT-wwm-ext, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | |
| hfl/chinese-bert-wwm-ext | | | PyTorch | https://huggingface.co/hfl/chinese-bert-wwm-ext | | | hfl/chinese-bert-wwm-ext |
| chinese_bert_wwm_L-12_H-768_A-12 | BERT-wwm, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | |
| hfl/chinese-bert-wwm | | | PyTorch | https://huggingface.co/hfl/chinese-bert-wwm | | | hfl/chinese-bert-wwm |
| roberta_zh_l12 | RoBERTa_zh_L12 | 30 GB of raw text, nearly 300 million sentences, 10 billion Chinese characters (tokens), yielding about 250 million training instances; covers news, community QA, and several encyclopedia sources | TensorFlow | roberta_zh_l12.zip (Baidu Netdisk) | GitHub - brightmart/roberta_zh: RoBERTa for Chinese | Load directly with BERT | |
| roeberta_zh_L-24_H-1024_A-16 | RoBERTa-zh-Large | | TensorFlow | roeberta_zh_L-24_H-1024_A-16.zip (Baidu Netdisk) | | Load directly with BERT | |
| RoBERTa_zh_L12_PyTorch | RoBERTa_zh_L12 | | PyTorch | RoBERTa_zh_L12_PyTorch.zip (Baidu Netdisk) | | Load directly with the PyTorch version of BERT | |
| chinese_L-12_H-768_A-12 | bert_base, Chinese | Chinese Wikipedia | | https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip | GitHub - google-research/bert: TensorFlow code and pre-trained models for BERT | | |
| uncased_L-2_H-128_A-2 | bert_tiny, one of the 24 bert_uncased models | | | https://storage.googleapis.com/bert_models/2020_02_20/all_bert_models.zip | | | |
| uncased_L-4_H-256_A-4 | bert_mini, one of the 24 bert_uncased models | | | | | | |
| uncased_L-4_H-512_A-8 | bert_small, one of the 24 bert_uncased models | | | | | | |
| uncased_L-8_H-512_A-8 | bert_medium, one of the 24 bert_uncased models | | | | | | |
| uncased_L-12_H-768_A-12 | bert_base, one of the 24 bert_uncased models | | | | | | |
| chinese_xlnet_mid_L-24_H-768_A-12 | XLNet-mid, Chinese | Chinese Wikipedia plus other encyclopedia, news, and QA data, about 5.4B tokens in total | TensorFlow | Baidu Netdisk (extraction code required) | GitHub - ymcui/Chinese-XLNet: Pre-Trained Chinese XLNet | | hfl/chinese-xlnet-mid |
| hfl/chinese-xlnet-mid | | | PyTorch | https://huggingface.co/hfl/chinese-xlnet-mid | | | |
| chinese_xlnet_base_L-12_H-768_A-12 | XLNet-base, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | hfl/chinese-xlnet-base |
| hfl/chinese-xlnet-base | | | PyTorch | https://huggingface.co/hfl/chinese-xlnet-base | | | |
| albert_tiny_zh | albert_tiny_zh | | | https://storage.googleapis.com/albert_zh/albert_tiny.zip | GitHub - brightmart/albert_zh: A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS (Chinese pretrained ALBERT models) | | |
| albert_tiny_489k | albert_tiny_zh (trained longer, about 2 billion cumulative training samples) | | | https://storage.googleapis.com/albert_zh/albert_tiny_489k.zip | | | |
| albert_tiny_zh_google | albert_tiny_google_zh (about 1 billion cumulative samples, Google version) | | | https://storage.googleapis.com/albert_zh/albert_tiny_zh_google.zip | | | |
| albert_small_zh_google | albert_small_google_zh (about 1 billion cumulative samples, Google version) | | | https://storage.googleapis.com/albert_zh/albert_small_zh_google.zip | | | |
| albert_large_zh | albert_large_zh | | | https://storage.googleapis.com/albert_zh/albert_large_zh.zip | | | |
| albert_base_zh | albert_base_zh (small trial version) | | | https://storage.googleapis.com/albert_zh/albert_base_zh.zip | | | |
| albert_base_zh_additional_36k_steps | albert_base_zh (trained on an additional 150 million instances, i.e. 36k steps at batch size 4096) | | | https://storage.googleapis.com/albert_zh/albert_base_zh_additional_36k_steps.zip | | | |
| albert_xlarge_zh_177k | albert_xlarge_zh_177k | | | https://storage.googleapis.com/albert_zh/albert_xlarge_zh_177k.zip | | | |
| albert_xlarge_zh_183k | albert_xlarge_zh_183k (try this one first) | | | https://storage.googleapis.com/albert_zh/albert_xlarge_zh_183k.zip | | | |
| voidful/albert_chinese_tiny | | | PyTorch | https://huggingface.co/voidful/albert_chinese_tiny | | | voidful/albert_chinese_tiny |
| voidful/albert_chinese_small | | | PyTorch | https://huggingface.co/voidful/albert_chinese_small | | | voidful/albert_chinese_small |
| voidful/albert_chinese_base | | | PyTorch | https://huggingface.co/voidful/albert_chinese_base | | | voidful/albert_chinese_base |
| voidful/albert_chinese_large | | | PyTorch | https://huggingface.co/voidful/albert_chinese_large | | | voidful/albert_chinese_large |
| voidful/albert_chinese_xlarge | | | PyTorch | https://huggingface.co/voidful/albert_chinese_xlarge | | | voidful/albert_chinese_xlarge |
| voidful/albert_chinese_xxlarge | | | PyTorch | https://huggingface.co/voidful/albert_chinese_xxlarge | | | voidful/albert_chinese_xxlarge |
| electra_180g_large | ELECTRA-180g-large, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | GitHub - ymcui/Chinese-ELECTRA: Pre-trained Chinese ELECTRA | | hfl/chinese-electra-180g-large-discriminator |
| hfl/chinese-electra-180g-large-discriminator | | | PyTorch | https://huggingface.co/hfl/chinese-electra-180g-large-discriminator | | | |
| electra_180g_base | ELECTRA-180g-base, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | hfl/chinese-electra-180g-base-discriminator |
| hfl/chinese-electra-180g-base-discriminator | | | PyTorch | https://huggingface.co/hfl/chinese-electra-180g-base-discriminator | | | |
| electra_180g_small_ex | ELECTRA-180g-small-ex, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | hfl/chinese-electra-180g-small-ex-discriminator |
| hfl/chinese-electra-180g-small-ex-discriminator | | | PyTorch | https://huggingface.co/hfl/chinese-electra-180g-small-ex-discriminator | | | |
| electra_180g_small | ELECTRA-small, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | hfl/chinese-electra-180g-small-discriminator |
| hfl/chinese-electra-180g-small-discriminator | | | PyTorch | https://huggingface.co/hfl/chinese-electra-180g-small-discriminator | | | |
| chinese_macbert_large | MacBERT-large, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | GitHub - ymcui/MacBERT: Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT) | | hfl/chinese-macbert-large |
| hfl/chinese-macbert-large | | | PyTorch | https://huggingface.co/hfl/chinese-macbert-large | | | |
| chinese_macbert_base | MacBERT-base, Chinese | | TensorFlow | Baidu Netdisk (extraction code required) | | | hfl/chinese-macbert-base |
| hfl/chinese-macbert-base | | | PyTorch | https://huggingface.co/hfl/chinese-macbert-base | | | |
| Erlangshen-SimCSE-110M-Chinese | | | PyTorch | | | | |
| unsup-simcse-bert-base-uncased | | | PyTorch | | | | |
| sup-simcse-roberta-large | | | PyTorch | | | | |
| sup-simcse-bert-base-uncased | | | PyTorch | | | | |
| albert-base-chinese-cluecorpussmall | | | PyTorch | | | | |
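
The "HuggingFace ID" column above can be used directly with the Hugging Face transformers library. Below is a minimal sketch, assuming transformers and PyTorch are installed; it uses hfl/chinese-roberta-wwm-ext from the table as the example. Note that the HFL *-wwm checkpoints are BERT-architecture models, so they are loaded with the Bert* classes rather than the Roberta* classes.

```python
# Minimal sketch: load one of the HuggingFace IDs from the table above.
# Assumes `transformers` and `torch` are installed.
import torch
from transformers import BertTokenizer, BertModel

# The HFL RoBERTa-wwm checkpoints ship a BERT-style config and vocab,
# so BertTokenizer/BertModel are the right classes here.
tokenizer = BertTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
model = BertModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

inputs = tokenizer("今天天气不错", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Hidden states of the last layer: (batch_size, sequence_length, 768)
print(outputs.last_hidden_state.shape)
```

The voidful/albert_chinese_* checkpoints listed above are commonly loaded the same way, except the weights go through AlbertModel while the vocabulary still uses BertTokenizer; check each model card before relying on this.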
RoBERTa Chinese Pretrained Models

Overview

RoBERTa is an improved version of BERT: by refining the training objective and data generation, training longer, using larger batches, and using more data, it reaches state-of-the-art results, and its checkpoints can be loaded directly with BERT code. The roberta_zh project implements RoBERTa pretraining on large-scale Chinese text in TensorFlow and also provides PyTorch pretrained models and loading instructions.

Chinese pretrained RoBERTa models: downloads

- 6-layer RoBERTa trial version, RoBERTa-zh-Layer6: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT, about 200 MB.
- Recommended and verified: RoBERTa-zh-Large.
- RoBERTa-zh-Large: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT.
- RoBERTa-zh-Large: Google Drive or Baidu Netdisk, PyTorch version, loads directly with the PyTorch version of BERT.
- RoBERTa_zh_L12: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT.
- RoBERTa_zh_L12: Google Drive or Baidu Netdisk, PyTorch version, loads directly with the PyTorch version of BERT (see the loading sketch at the end of this section).
- Roberta_l24_zh_base: TensorFlow version, loads directly with BERT. Training data for the 24-layer base version: 10 GB of text, including news, community QA, and several encyclopedia sources.

Training data for the 24/12-layer RoBERTa: 30 GB of raw text, nearly 300 million sentences, 10 billion Chinese characters (tokens), producing about 250 million training instances; it covers news, community QA, and several encyclopedia sources. This project uses the same training data as the 24-layer Chinese pretrained XLNet project, XLNet_zh.

What is RoBERTa: a robustly optimized method for pretraining natural language processing (NLP) systems that improves on Bidirectional Encoder Representations from Transformers (BERT), the self-supervised method released by Google in 2018. RoBERTa produces state-of-the-art results on the widely used General Language Understanding Evaluation (GLUE) benchmark: it delivers top performance on the MNLI, QNLI, RTE, STS-B and RACE tasks and a sizable improvement on GLUE overall. With a score of 88.5, RoBERTa ranked first on the GLUE leaderboard, on par with the earlier XLNet-Large.

Performance comparison

Internet news sentiment analysis: CCF-Sentiment-Analysis

| Model | Online F1 |
|---|---|
| BERT | 80.3 |
| BERT-wwm-ext | 80.5 |
| XLNet | 79.6 |
| RoBERTa-mid | 80.5 |
| RoBERTa-large (max_seq_length=512, split_num=1) | 81.25 |

Note: results come from guoday's open-source project; see the CCF Internet News Sentiment Analysis competition for the dataset and task description.

Natural language inference: XNLI

| Model | Dev | Test |
|---|---|---|
| BERT | 77.8 (77.4) | 77.8 (77.5) |
| ERNIE | 79.7 (79.4) | 78.6 (78.2) |
| BERT-wwm | 79.0 (78.4) | 78.2 (78.0) |
| BERT-wwm-ext | 79.4 (78.6) | 78.7 (78.3) |
| XLNet | 79.2 | 78.7 |
| RoBERTa-zh-base | 79.8 | 78.8 |
| RoBERTa-zh-Large | 80.2 (80.0) | 79.9 (79.5) |

Note: RoBERTa_l24_zh was only run twice, so its performance may still improve; the BERT-wwm-ext and XLNet results are taken from their respective source projects; RoBERTa-zh-base refers to the 12-layer Chinese RoBERTa model.

Sentence pair matching: LCQMC

| Model | Dev | Test |
|---|---|---|
| BERT | 89.4 (88.4) | 86.9 (86.4) |
| ERNIE | 89.8 (89.6) | 87.2 (87.0) |
| BERT-wwm | 89.4 (89.2) | 87.0 (86.8) |
| BERT-wwm-ext | - | - |
| RoBERTa-zh-base | 88.7 | 87.0 |
| RoBERTa-zh-Large | 89.9 (89.6) | 87.2 (86.7) |
| RoBERTa-zh-Large (20w_steps) | 89.7 | 87.0 |

Note: RoBERTa_l24_zh was only run twice, so its performance may still improve; the number of training epochs was kept consistent with the paper.

Reading comprehension tests

For reading comprehension tasks, the best hyperparameters found so far for both BERT and RoBERTa are epochs=2, batch=32, lr=3e-5, warmup=0.1. cmrc20
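
Several entries above say the checkpoints "load directly with BERT" or with "the PyTorch version of BERT". The sketch below illustrates what that means for a locally downloaded PyTorch checkpoint; the folder name is hypothetical, and the file layout (config.json, vocab.txt, pytorch_model.bin) is an assumption about the unpacked archive.

```python
# Minimal sketch: load a RoBERTa_zh PyTorch checkpoint with the plain BERT classes.
from transformers import BertTokenizer, BertModel

# Hypothetical local folder where RoBERTa_zh_L12_PyTorch.zip was unpacked;
# assumed to contain the standard BERT files (config.json, vocab.txt,
# pytorch_model.bin), which is what "loads directly with BERT" implies.
local_dir = "./RoBERTa_zh_L12_PyTorch"

tokenizer = BertTokenizer.from_pretrained(local_dir)
model = BertModel.from_pretrained(local_dir)

print(model.config.num_hidden_layers)  # expected: 12 for the L12 checkpoint
```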