1、doccano runtime setup
Environment: Python 3.9.6 (Python >= 3.9 is required; check with python --version)
References:
PaddleNLP/doccano.md at develop · PaddlePaddle/PaddleNLP (github.com)
doccano/doccano: Open source annotation tool for machine learning practitioners. (github.com)
1.1、Virtual environment setup
1.1.1、Switch the system Python to the required version
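Before going further, it can help to confirm that the interpreter on the PATH really is the expected one. The following is a minimal sketch, assuming the >= 3.9 requirement stated at the top of these notes; the names MIN_VERSION and meets_minimum are made up for this illustration.

```python
import sys

# Minimum version taken from the requirement stated in these notes (Python >= 3.9).
MIN_VERSION = (3, 9)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Show which interpreter is running and whether it satisfies the requirement.
print("Python", sys.version.split()[0], "at", sys.executable)
print("meets minimum:", meets_minimum())
```

Run it with the same python command that will be used to create the virtual environment, so the check covers the right interpreter.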
1.1.2、Open a command line in the target directory and run the following command (DRE is the name of the virtual environment doccano will run in)
python -m venv DRE
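1.1.3、On the command line, go to the Scripts directory and activate the virtual environment (a new terminal opened automatically after activation)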
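To check that activation took effect, one option is to compare sys.prefix with sys.base_prefix, which differ inside a venv. This is only a sketch; the helper name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Inside a venv, sys.prefix points at the venv while sys.base_prefix
    still points at the base installation, so the two values differ."""
    return prefix != base_prefix


# Report the running interpreter and whether it belongs to a virtual environment.
print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

If the second line reports False, the python being run is still the system interpreter and the activation step should be repeated.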
1.1.4、Upgrade pip
python -m pip install --upgrade pip
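1.1.5、Create a requirements.txt file under Scripts (to install doccano and its dependencies)
-->Alternatively, doccano can be installed directly with the following command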
pip install doccano
-->Note that djangorestframework 3.14.0 breaks the environment, so run the following command
pip install "djangorestframework!=3.14.0"
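To see which djangorestframework version actually ended up in the environment, a small check along these lines may be useful. It is a sketch only: it uses the standard-library importlib.metadata module and a plain string comparison against 3.14.0, which is simpler than pip's full version matching. The helper names are invented for this example.

```python
from importlib import metadata

# Release reported as problematic in these notes.
BAD_VERSION = "3.14.0"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_excluded(version, bad=BAD_VERSION):
    """True when `version` is exactly the excluded release (string comparison)."""
    return version == bad


found = installed_version("djangorestframework")
if found is None:
    print("djangorestframework is not installed")
elif is_excluded(found):
    print("djangorestframework", found, "is the excluded release; install another version")
else:
    print("djangorestframework", found, "is not the excluded release")
```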
-->To install the dependencies from requirements.txt, continue below; otherwise skip to 1.5.7 (doccano initialization)
-->Write the dependencies into requirements.txt (listed below)
amqp==5.1.1
asgiref==3.5.2
auto-labeling-pipeline==0.1.21
billiard==3.6.4.0
boto3==1.24.80
botocore==1.27.80
cachetools==5.2.0
celery==5.2.7
certifi==2022.9.24
chardet==4.0.0
charset-normalizer==2.1.1
click==8.1.3
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
colorama==0.4.5
coreapi==2.3.3
coreschema==0.0.4
defusedxml==0.7.1
dj-database-url==0.5.0
dj-rest-auth==2.2.5
Django==4.1.1
django-celery-results==2.2.0
django-cleanup==6.0.0
django-cors-headers==3.13.0
django-drf-filepond==0.4.1
django-filter==21.1
django-health-check==3.17.0
django-polymorphic==3.1.0
django-rest-polymorphic==0.1.10
django-storages==1.13.1
djangorestframework==3.13.1
djangorestframework-xml==2.0.0
doccano==1.8.0
drf-yasg==1.21.3
environs==9.5.0
et-xmlfile==1.1.0
filetype==1.1.0
furl==2.1.3
google-api-core==2.10.1
google-auth==2.11.1
google-cloud-core==2.3.2
google-cloud-storage==2.5.0
google-crc32c==1.5.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.4
greenlet==1.1.3
gunicorn==20.1.0
idna==3.4
inflection==0.5.1
itypes==1.2.0
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.2.0
kombu==5.2.4
lml==0.1.0
MarkupSafe==2.1.1
marshmallow==3.18.0
numpy==1.23.3
openpyxl==3.0.10
orderedmultidict==1.0.1
packaging==21.3
pandas==1.5.0
prompt-toolkit==3.0.31
protobuf==4.21.6
pyasn1==0.4.8
pyasn1-modules==0.2.8
pydantic==1.10.2
pyexcel==0.7.0
pyexcel-io==0.6.6
pyexcel-xlsx==0.6.0
pyparsing==3.0.9
python-dateutil==2.8.2
python-dotenv==0.21.0
pytz==2022.2.1
requests==2.28.1
rsa==4.9
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.6
s3transfer==0.6.0
scikit-learn==1.1.2
scipy==1.9.1
seqeval==1.2.2
shortuuid==1.0.9
six==1.16.0
SQLAlchemy==1.4.41
sqlparse==0.4.3
texttable==1.6.4
threadpoolctl==3.1.0
typing_extensions==4.3.0
tzdata==2022.2
uritemplate==4.1.1
urllib3==1.26.12
vine==5.0.0
waitress==2.1.2
wcwidth==0.2.5
whitenoise==6.2.0
1.5.6、Install the dependencies
-->Run the following command
pip install -r requirements.txt
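After the install finishes, the pinned versions can be compared with what is actually installed. The sketch below assumes every line in the file has the simple name==version form used in the list above; parse_pins and find_mismatches are hypothetical helper names, and the sample string is a three-line excerpt of that list rather than the full file.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks, comments
    and any line that is not an exact pin."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for every unsatisfied pin."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Excerpt of the pinned list; for the real file use
# parse_pins(open("requirements.txt", encoding="utf-8").read()).
sample = """\
Django==4.1.1
djangorestframework==3.13.1
doccano==1.8.0
"""
for name, (pinned, installed) in find_mismatches(parse_pins(sample)).items():
    print(f"{name}: pinned {pinned}, installed {installed}")
```

No output means every pin in the sample is satisfied; a line with installed None means the package is missing from the environment.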
-->Successful installation (screenshot)
1.5.7、The doccano runtime is now set up; next, initialize doccano
After installation, run the following commands:
# Initialize database.
doccano init
# Create a super user.
doccano createuser --username admin --password pass
# Start a web server.
doccano webserver --port 8000
In another terminal, run the following command:
# Start the task queue to handle file upload/download.
doccano task
Go to http://127.0.0.1:8000/.
-->Successful initialization (screenshot)
-->Run the following command in a terminal
doccano webserver --port 8000
-->In another terminal, run the following command, then open 127.0.0.1:8000 in a browser
doccano task
-->doccano is now initialized successfully
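If the browser cannot open the page, a quick way to tell whether the web server is listening at all is to attempt a TCP connection to the port. This is a minimal sketch using the host and port from the commands above; port_is_open is a made-up helper name, and a successful connection only shows that something is listening on the port, not that it is doccano.

```python
import socket


def port_is_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


print("something is listening on 127.0.0.1:8000:", port_is_open())
```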
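2、Quick launch
Approach: use .bat batch files
-->Pick a directory and create a scripts folder in it
-->Add the path of the scripts folder to the system PATH
-->Create a task.txt file with the following content (C:\env\python\DRE is the location of the doccano runtime)
cd /d C:\env\python\DRE\Scripts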
call activate.bat
doccano task
-->Create a webserver.txt file with the following content
cd /d C:\env\python\DRE\Scripts
call activate.bat
doccano webserver --port 8000
-->Change the extension of both files above to .bat, then create a startdoccano.txt file in any location with the following content
@echo off
%1(start /min cmd.exe /c %0 :& exit )
start webserver.bat
start task.bat
-->Rename startdoccano.txt to startdoccano.bat; double-clicking startdoccano.bat then starts doccano
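The same idea can also be sketched as a single Python script instead of three bat files. This is an illustration under assumptions, not part of the original procedure: it assumes pip placed a doccano.exe launcher in the virtual environment's Scripts folder (the usual behaviour for console scripts on Windows), and the names SCRIPTS_DIR, build_commands and launch are hypothetical.

```python
import subprocess
from pathlib import PureWindowsPath

# Scripts folder of the virtual environment, as used in the bat files above.
SCRIPTS_DIR = r"C:\env\python\DRE\Scripts"


def build_commands(scripts_dir=SCRIPTS_DIR, port=8000):
    """Return the two command lines, pointing at the venv's own doccano launcher
    (assumed to be doccano.exe inside the Scripts folder)."""
    doccano = str(PureWindowsPath(scripts_dir) / "doccano.exe")
    return [
        [doccano, "webserver", "--port", str(port)],
        [doccano, "task"],
    ]


def launch(commands):
    """Start every command as its own process and return the process handles."""
    return [subprocess.Popen(cmd) for cmd in commands]


# Dry run: only print what would be started.
# Call launch(build_commands()) to actually start both processes.
for cmd in build_commands():
    print(" ".join(cmd))
```

Calling the launcher inside the virtual environment directly avoids the separate activate step, since that executable is tied to the environment's interpreter.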