data for exp

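1. lisv2: test rateplan

2. rtm: compute the similarity between rooms of different hotels to get a score, then check against the room attributes whether that similarity score is consistent

roomtypesimilarity, mostsimilarityroom, fullranking (ranking)

3. abs: testing

4. prefill: scrape hotels to get room information and train the scraping model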

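import datetime
import time

import numpy as np
import pandas as pd

# getDistance() is called below but is not included in this post; it must be defined or imported before running.

if __name__ == '__main__':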
    start = datetime.datetime.now()

    SCRIPT = 'script'
    # Raw strings keep the backslashes in Windows paths from being read as escape sequences
    EXCEL_PATH_ONE = r'C:\PythonRelatedProject\compare\compare.xlsx'

    MANUAL = 'manual'
    EXCEL_PATH_RESULT = r'C:\PythonRelatedProject\compare\compare.xlsx'

    dataFrame_script = pd.read_excel(EXCEL_PATH_ONE, sheet_name=SCRIPT)
    dataFrame_Manual = pd.read_excel(EXCEL_PATH_RESULT, sheet_name=MANUAL)
    # Replace empty, missing (NaN) and 'unknown' values with 0.
    # Assigning the result back avoids calling replace(inplace=True) on a column selection.
    for frame in (dataFrame_script, dataFrame_Manual):
        frame['RoomSize'] = frame['RoomSize'].replace(['', 'unknown'], 0).fillna(0)
    for column in ('RoomType', 'BedType', 'Smoking'):
        dataFrame_Manual[column] = dataFrame_Manual[column].replace('', 0)

    #dataFrame_Result.drop(['ExtraAttributes', 'NumberOfRoomType'], axis=1, inplace=True)

    distance_result_list = list()
    count_script = 0
    # URL of the most recently matched hotel; None until the first match
    temp_var = None
    for index, row_script in dataFrame_script.iterrows():
        # A script row whose URL differs from the last matched manual URL belongs to a new hotel,
        # so restart the room counter for that hotel
        if temp_var != row_script['URL']:
            count_script = 0
        # Each script room is compared with every manual room of the same hotel, so restart the manual counter
        count_manual = 0
        count_script += 1
        # compared becomes True once this script row has matched at least one manual row
        compared = False
        # The inner loop stops once the comparison for this hotel is finished
        for index_m, row_manual in dataFrame_Manual.iterrows():
            # if row['URL'].strip() == rowr['URL'].strip():
            if row_script['URL'] == row_manual['URL']:
                # Mark this script row as compared
                compared = True
                # Remember the URL of the hotel currently being compared
                temp_var = row_manual['URL']
                count_manual += 1
                # df = pd.DataFrame(columns=['URL', 'RoomName', 'RoomType', 'RoomClass', 'RoomSize', 'BedType', 'Wheelchair', 'Smoking', 'View'])
                df = pd.DataFrame(columns=['URL', 'RoomType', 'RoomClass', 'RoomSize', 'BedType', 'Smoking', 'View'])
                df.loc[0] = row_script
                df.loc[1] = row_manual
                compareobjects = f'{count_script} : {count_manual}'
                print(f"{index} {index_m} {row_script['URL']}  {compareobjects}")
                distance_result_list.extend(getDistance(df, 0, compareobjects))
            else:
                # The URL no longer matches, so the manual rows for this hotel have ended
                count_manual = 0
            # Stop scanning once this script row has been compared with every manual room of the same hotel
            if compared and count_manual == 0:
                break

    # Passing the list directly keeps the original value types and also works when the list is empty
    df_distance_result = pd.DataFrame(distance_result_list, columns=['script', 'manual', 'difference', 'compareobjects'])
    # Write the comparison result to an Excel file
    now = time.strftime("%Y-%m-%d-%H_%M_%S", time.localtime())
    # Use .xlsx and a context manager: writer.save() and the .xls writer were removed in pandas 2.0
    with pd.ExcelWriter(r'C:\PythonRelatedProject\compare\distance_' + now + '.xlsx') as writer:
        df_distance_result.to_excel(writer)
    end = datetime.datetime.now()
    print('Running time: %s Seconds' % (end - start).total_seconds())
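
The script above calls getDistance(df, 0, compareobjects), which this post does not include. Below is a minimal sketch of what such a helper might look like, not the author's implementation. The name get_distance, the ATTRIBUTES list, reading the second argument as the position of the script row, and the 0/1 difference metric are all assumptions made for illustration. The only thing taken from the script is the output shape: one [script, manual, difference, compareobjects] row per comparison.

```python
import pandas as pd

# Hypothetical stand-in for getDistance(); the real function is not shown in the post.
# The attribute list mirrors the columns the script copies into its two-row DataFrame.
ATTRIBUTES = ['RoomType', 'RoomClass', 'RoomSize', 'BedType', 'Smoking', 'View']


def get_distance(df, start, compareobjects):
    """Compare row `start` (script) with row `start + 1` (manual), attribute by attribute.

    Returns one [script, manual, difference, compareobjects] row per attribute.
    The difference is 0 when the values match and 1 otherwise (an assumed metric).
    """
    script_row = df.iloc[start]
    manual_row = df.iloc[start + 1]
    rows = []
    for attribute in ATTRIBUTES:
        script_value = script_row[attribute]
        manual_value = manual_row[attribute]
        # Compare as stripped, lower-cased strings so 'King ' and 'king' count as equal
        same = str(script_value).strip().lower() == str(manual_value).strip().lower()
        rows.append([script_value, manual_value, 0 if same else 1, compareobjects])
    return rows


# Made-up sample rows for one hotel: the first is the script result, the second the manual label
pair = pd.DataFrame(
    [
        ['http://example.com/h1', 'Deluxe', 'Standard', 30, 'King', 'No', 'Sea'],
        ['http://example.com/h1', 'deluxe', 'Standard', 32, 'King ', 'No', 'City'],
    ],
    columns=['URL'] + ATTRIBUTES,
)
result = get_distance(pair, 0, '1 : 1')
print(pd.DataFrame(result, columns=['script', 'manual', 'difference', 'compareobjects']))
```

With a helper of this shape, summing the difference column per compareobjects value gives the number of attributes on which the script and the manual label disagree for each room pair.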