#获取项目报告管理-数据检查报告-数据清洗进度报表,然后将受试者状态添加到表格中,连同数据集一起作为原始数据。
#导入所需的库
import pandas as pd
from datetime import datetime
now=datetime.now()
import re
#读入page列表和数据集给药表单
writer=pd.ExcelWriter(r'C:\Users\XinXinbuX280\Desktop\E4F4P\output\E4F4缺失页面统计.xlsx')
pagefile=r'C:\Users\XinXinbuX280\Desktop\E4F4P\rawdata\PROD_leadingpharm2021012_数据清洗进度报表_zh-CN_20220112.xlsx'
datafile=r'C:\Users\XinXinbuX280\Desktop\E4F4P\rawdata\ODM_Datasets_XLS_Site-All_zh-CN_20220112084029.xlsx'
page=pd.read_excel(pagefile)
data=pd.read_excel(datafile,'EX')
exdate={} #受试者给药日期
predict_visit={} #给药受试者对应的应进行到的访视
for i in range(1,len(data)):
exdate[data['Subject_Id'][i]]=datetime.strptime(data['EXSTDAT'][i],'%Y-%m-%d')
#时间差对应的访视期
visit={'1':[-14,-3],'2':[-2,-2],'3':[-1,-1],'4':[1,1],'5':[2,2],'6':[3,3],'7':[4,4],'8':[5,5],'9':[6,6],'10':[7,7],'11':[8,11],'12':[12,14],'13':[15,21],'14':[22,28],'15':[29,35],'16':[36,49],'21':[50,67],'22':[68,1000]}
for i in exdate.keys():
for n in visit.keys():
if (now-exdate[i]).days>=visit[n][0] and (now-exdate[i]).days<=visit[n][1]:
predict_visit[i]=n
print(n)
page_v=[]
#将每一条记录对应的应进行到的访视期标记在最后一列。--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
for i in range(len(page)):
page_v.append(' ')
if page['受试者状态'][i]=='已入组' and page['筛选号'][i] not in exdate.keys():
page_v[i]=4
if page['受试者状态'][i]=='已入组' and page['筛选号'][i] in exdate.keys():
page_v[i]=predict_visit[page['筛选号'][i]]
if page['受试者状态'][i]=='筛选失败':
page_v[i]='3'
if page['受试者状态'][i]=='筛选中':
page_v[i]='1'
if page['受试者状态'][i]=='已完成':
page_v[i]='19'
if page['受试者状态'][i]=='提前退出':
page_v[i]='4'
page.insert(len(page.columns),'应进行到的访视期',page_v)
page.to_excel(writer,'进行到的访视',index=False)
#删除提前退出访视页面、已删除状态页面,除必填页面外的筛败受试者空白页-------------------------------------------------------
fail=['DM--人口学信息, #1','IC--知情同意, #1','SV--受试者访视信息, #1','RAND--筛选结论, #1']
for i in range(len(page)):
if page['受试者状态'][i] != '提前退出' and page['访视'][i]=='V17--提前退出':
page.drop(i,inplace=True)
continue
if page['表单状态'][i]=='已删除':
page.drop(i,inplace=True)
continue
if page['表单状态'][i]=='空白页':
if page['受试者状态'][i]=='筛选失败':
if page['表单名称'][i] not in fail:
page.drop(i,inplace=True)
continue
if page['表单名称'][i]=='SV--受试者访视信息, #1' and page['访视'][i] != 'V1--筛选期V1(D-14~D-2)':
page.drop(i,inplace=True)
continue
#入组受试者未进行到的访视空白页删除-------------------------------------------------------------------------------------
for i in page.index:
#V20单独计算,若应进行到的访视小于等于V11(
if page['表单状态'][i]=='空白页' and page['访视'][i] =='V20--观察期V20(D10)':
if page['访视'][i]=='V20--观察期V20(D10)' and int(page['应进行到的访视期'][i])<=11:
page.drop(i,inplace=True)
continue
elif page['表单状态'][i]=='空白页' and page['受试者状态'][i]=='提前退出':
if len(re.findall('观察期',page['访视'][i])) !=0:
page.drop(i,inplace=True)
continue
elif page['表单状态'][i]=='空白页' and page['受试者状态'][i]=='筛选中':
page.drop(i,inplace=True)
continue
elif page['表单状态'][i]=='空白页' and page['受试者状态'][i]=='已入组' and page['访视'][i] not in ['V19--共同页','V18--计划外访视','V18.01--计划外访视'] and int(re.findall('V(\d+)',page['访视'][i])[0])>int(page['应进行到的访视期'][i]):
page.drop(i,inplace=True)
continue
elif page['表单状态'][i]=='空白页' and page['受试者状态'][i]=='已入组' and page['访视'][i] =='V19--共同页' and page['表单名称'][i]=='TS--试验完成情况, #1':
page.drop(i,inplace=True)
continue
# elif page['表单状态'][i]=='空白页' and page['受试者状态'][i]=='已入组' and page['访视'][i]=='V19--共同页':
# page.drop(i,inplace=True)
# continue
else:
continue
page.to_excel(writer,index=False,sheet_name='保留页面')
#统计-------------------------------------------------------------------------------------------------------------------------------
writer.save()
Python用于临床数据管理-根据访视时间窗计算缺失页面
最新推荐文章于 2024-07-29 14:37:54 发布