[PaddlePaddle text classification sample code] Fine-tuning a pretrained model for Chinese text classification

1. Machine environment

GPU: one NVIDIA RTX 3090
Driver: 460.73.01
CUDA version: 11.2
cuDNN version: 8.2.0
paddlepaddle version: paddlepaddle-gpu==2.1.3

2. Set up the code environment

# Create and activate the conda environment
conda create -n paddlenlp python=3.7
conda activate paddlenlp

# Install paddlepaddle
conda install paddlepaddle-gpu==2.1.3 cudatoolkit=11.2 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

# Install PaddleNLP
pip install --upgrade paddlenlp

# Get the text classification example code
git clone https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP/examples/text_classification/pretrained_models/
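
Before training, it can help to confirm that both packages import cleanly in the new environment. The following is only a minimal sketch: the helper name package_version is hypothetical and is not part of PaddlePaddle or PaddleNLP. Paddle also provides its own self-check, paddle.utils.run_check(), which additionally exercises the GPU.

```python
import importlib


def package_version(name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def main():
    # "paddle" and "paddlenlp" are the import names of the packages installed above.
    for name in ("paddle", "paddlenlp"):
        version = package_version(name)
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")


if __name__ == "__main__":
    main()
```

If either line reports NOT INSTALLED, the conda environment is probably not activated, or the install step above did not complete.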

3. Train the model

python -m paddle.distributed.launch --gpus "0" train.py --device gpu --save_dir ./checkpoints

Final training output:

eval loss: 0.22238, accu: 0.94167
global step 810, epoch: 3, batch: 210, loss: 0.02384, accu: 0.98438, speed: 3.04 step/s
global step 820, epoch: 3, batch: 220, loss: 0.05576, accu: 0.97969, speed: 16.15 step/s
global step 830, epoch: 3, batch: 230, loss: 0.03062, accu: 0.98229, speed: 16.15 step/s
global step 840, epoch: 3, batch: 240, loss: 0.06318, accu: 0.98438, speed: 16.10 step/s
global step 850, epoch: 3, batch: 250, loss: 0.16337, accu: 0.98438, speed: 15.47 step/s
global step 860, epoch: 3, batch: 260, loss: 0.07645, accu: 0.98490, speed: 15.83 step/s
global step 870, epoch: 3, batch: 270, loss: 0.02989, accu: 0.98661, speed: 16.12 step/s
global step 880, epoch: 3, batch: 280, loss: 0.14670, accu: 0.98711, speed: 15.94 step/s
global step 890, epoch: 3, batch: 290, loss: 0.06679, accu: 0.98507, speed: 16.11 step/s
global step 900, epoch: 3, batch: 300, loss: 0.07022, accu: 0.98594, speed: 16.18 step/s
eval loss: 0.21322, accu: 0.94667
INFO 2021-10-21 10:49:35,730 launch.py:268] Local processes completed.
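
To compare runs, the log format above can be parsed with a couple of regular expressions. This is a sketch that assumes the line layout shown in this log; the function name parse_training_log is hypothetical, and the patterns would need adjusting if train.py prints a different format.

```python
import re

# Patterns mirror the two line formats seen in the training log above.
EVAL_RE = re.compile(r"^eval loss: ([\d.]+), accu: ([\d.]+)")
STEP_RE = re.compile(
    r"^global step (\d+), epoch: (\d+), batch: (\d+), "
    r"loss: ([\d.]+), accu: ([\d.]+), speed: ([\d.]+) step/s"
)


def parse_training_log(text):
    """Split a training log into per-step records and evaluation results."""
    steps, evals = [], []
    for line in text.splitlines():
        line = line.strip()
        m = STEP_RE.match(line)
        if m:
            steps.append({
                "step": int(m.group(1)),
                "epoch": int(m.group(2)),
                "batch": int(m.group(3)),
                "loss": float(m.group(4)),
                "accu": float(m.group(5)),
                "speed": float(m.group(6)),
            })
            continue
        m = EVAL_RE.match(line)
        if m:
            evals.append({"loss": float(m.group(1)), "accu": float(m.group(2))})
    return steps, evals


# A few lines copied from the log above.
sample = """\
eval loss: 0.22238, accu: 0.94167
global step 890, epoch: 3, batch: 290, loss: 0.06679, accu: 0.98507, speed: 16.11 step/s
global step 900, epoch: 3, batch: 300, loss: 0.07022, accu: 0.98594, speed: 16.18 step/s
eval loss: 0.21322, accu: 0.94667
"""
steps, evals = parse_training_log(sample)
best = max(evals, key=lambda e: e["accu"])
print(f"{len(steps)} step lines, best eval accu: {best['accu']}")
```

Lines that match neither pattern, such as the launch.py INFO line, are skipped.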

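4. Export the model and run prediction

# Run prediction with the exported inference model, expected in ./export
python deploy/python/predict.py --model_dir=./export

Output: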

[2021-10-21 10:55:41,299] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/vocab.txt
[2021-10-21 10:55:41,299] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/spm_cased_simp_sampled.model
[2021-10-21 10:55:41,299] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/dict.wordseg.pickle
W1021 10:55:44.848312 31977 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.2, Runtime API Version: 11.2
W1021 10:55:44.850705 31977 device_context.cc:422] device: 0, cuDNN Version: 8.1.
Data: 这个宾馆比较陈旧了,特价的房间也很一般。总体来说一般 	 Label: negative
Data: 怀着十分激动的心情放映,可是看着看着发现,在放映完毕后,出现一集米老鼠的动画片!开始还怀疑是不是赠送的个别现象,可是后来发现每张DVD后面都有!真不知道生产商怎么想的,我想看的是猫和老鼠,不是米老鼠!如果厂家是想赠送的话,那就全套米老鼠和唐老鸭都赠送,只在每张DVD后面添加一集算什么??简直是画蛇添足!! 	 Label: negative

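5. Deploy inference with the Paddle Serving API

# Install system dependencies (OpenSSL 1.0.2k libraries; writing to /usr/lib requires root)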

wget https://paddle-serving.bj.bcebos.com/others/centos_ssl.tar && \
    tar xf centos_ssl.tar && rm -rf centos_ssl.tar && \
    mv libcrypto.so.1.0.2k /usr/lib/libcrypto.so.1.0.2k && mv libssl.so.1.0.2k /usr/lib/libssl.so.1.0.2k && \
    ln -sf /usr/lib/libcrypto.so.1.0.2k /usr/lib/libcrypto.so.10 && \
    ln -sf /usr/lib/libssl.so.1.0.2k /usr/lib/libssl.so.10 && \
    ln -sf /usr/lib/libcrypto.so.10 /usr/lib/libcrypto.so && \
    ln -sf /usr/lib/libssl.so.10 /usr/lib/libssl.so
    
# Install the Python dependencies

pip install paddle-serving-app paddle-serving-client paddle-serving-server-gpu


# Export the model and configuration for Serving
python -u deploy/serving/export_servable_model.py \
    --inference_model_dir ./export/ \
    --model_file inference.pdmodel \
    --params_file inference.pdiparams
    
# Start the service
python -m paddle_serving_server.serve \
    --model ./serving_server \
    --port 8090 \
    --gpu_id 0
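
Before running the client, it may be worth checking that the service is actually listening. The sketch below uses only the Python standard library; is_port_open is a hypothetical helper name, and the host and port are taken from the --port 8090 value above. It only confirms that a TCP connection succeeds, not that the model has finished loading.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port match the service started with --port 8090.
    print("serving reachable:", is_port_open("127.0.0.1", 8090))
```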

6. Client prediction

python deploy/serving/client.py \
    --client_config_file ./serving_client/serving_client_conf.prototxt \
    --server_ip_port 127.0.0.1:8090 \
    --max_seq_length 128

Prediction results:

[2021-10-20 16:51:27,305] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/vocab.txt
[2021-10-20 16:51:27,306] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/spm_cased_simp_sampled.model
[2021-10-20 16:51:27,306] [    INFO] - Already cached /home/ubuntu/.paddlenlp/models/ernie-tiny/dict.wordseg.pickle
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1020 16:51:30.886700 29786 naming_service_thread.cpp:209] brpc::policy::ListNamingService("127.0.0.1:8090"): added 1
Data: 这个宾馆比较陈旧了,特价的房间也很一般。总体来说一般 	 Label: negative
Data: 怀着十分激动的心情放映,可是看着看着发现,在放映完毕后,出现一集米老鼠的动画片 	 Label: positive
Data: 作为老的四星酒店,房间依然很整洁,相当不错。机场接机服务很好,可以在车上办理入住手续,节省时间。 	 Label: positive
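
When many texts are sent to the service, the "Data: ... Label: ..." lines can be collected and tallied. This is a sketch that assumes the output layout shown above; parse_predictions is a hypothetical helper, and the three review strings are placeholder texts, not real model output.

```python
import re
from collections import Counter

# Matches lines such as "Data: <text> \t Label: negative" printed by the client.
PRED_RE = re.compile(r"^Data: (.*?)\s*Label: (\w+)\s*$")


def parse_predictions(text):
    """Return (text, label) pairs for every prediction line in the output."""
    pairs = []
    for line in text.splitlines():
        m = PRED_RE.match(line)
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs


# Placeholder input: one non-prediction log line plus three made-up predictions.
sample = (
    "WARNING: Logging before InitGoogleLogging() is written to STDERR\n"
    "Data: sample review one \t Label: negative\n"
    "Data: sample review two \t Label: positive\n"
    "Data: sample review three \t Label: positive\n"
)
pairs = parse_predictions(sample)
counts = Counter(label for _, label in pairs)
print(dict(counts))
```

Log lines that do not start with "Data:" are ignored, so the client's full output can be passed in unfiltered.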

Reference:

1. Fine-tuning a pretrained model for Chinese text classification
