Analytic Hierarchy Process (AHP) Explained: Python Code and a Comprehensive Real-Estate Evaluation Algorithm

This article walks through the code for the analytic hierarchy process (AHP) and shows how to apply it to a comprehensive evaluation of residential developments.


Please credit the original source when reposting!


Contents

1. AHP Code
1.1 A simple way to generate the judgment matrix
1.1.1 Code
1.1.2 Explanation
1.2 Getting the maximum eigenvalue of the judgment matrix and its eigenvector
1.2.1 Code
1.2.2 Explanation
1.3 Getting the RI value
1.3.1 Code
1.3.2 Explanation
1.4 The consistency check
1.4.1 Code
1.4.2 Explanation
1.5 Normalizing the eigenvector of the maximum eigenvalue
1.5.1 Code
1.5.2 Explanation
1.6 Putting the steps together
1.6.1 Code
1.6.2 Explanation

2. A Comprehensive Real-Estate Evaluation Algorithm Based on a Restructured Multi-Level AHP
2.1 Notes
2.2 Paper
2.3 Open source
2.4 Code
2.5 Data


1. AHP Code

1.1 A simple way to generate the judgment matrix

1.1.1 Code

import numpy as np


def get_judgement_matrix(scores):
    '''
    get the judgement matrix according to personal scores.
    :param scores: a list; each item is a score from 1 to 10 giving the importance of a sub-indicator.
    :return: judgement matrix whose items range from 1 to 9.

    - more: in the judgement matrix:
    1 means the two sub-indicators are equally important.

    3 means the first sub-indicator is slightly more important than the other.

    5 means the first sub-indicator is clearly more important than the other.

    7 means the first sub-indicator is strongly more important than the other.

    9 means the first sub-indicator is extremely more important than the other.

    2, 4, 6 and 8 denote intermediate degrees.
    '''

    # scores range from 1 to 10
    length = len(scores)

    array = np.zeros((length, length))
    for i in range(0, length):
        for j in range(0, length):
            point1 = scores[i]
            point2 = scores[j]
            deta = point1 - point2  # difference of the two importance scores
            if deta < 0:
                continue
            elif deta == 0 or deta == 1:
                array[i][j] = 1
                array[j][i] = 1
            else:
                array[i][j] = deta
                array[j][i] = 1 / deta

    return array

1.1.2 Explanation

The traditional approach builds the judgment matrix by hand, filling in every element to state how much more important one indicator is than another; here the matrix is generated automatically from the difference between importance scores.

Step 1: score the importance of each indicator from 1 to 10 and store the scores in a list.

Step 2: pass the list to the function to obtain the judgment matrix.
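For example, with three hypothetical sub-indicator scores of 9, 5 and 3, the function produces the judgment matrix below: a score gap of 4 becomes the matrix entry 4, and the symmetric entry is its reciprocal.

scores = [9, 5, 3]            # hypothetical importance scores for three sub-indicators
A = get_judgement_matrix(scores)
print(A)
# [[1.    4.    6.   ]
#  [0.25  1.    2.   ]
#  [0.167 0.5   1.   ]]   (values rounded)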

1.2 Getting the maximum eigenvalue of the judgment matrix and its eigenvector

1.2.1 Code

def get_tezheng(array):
    '''
    get the max eigenvalue and the corresponding eigenvector
    :param array: judgement matrix
    :return: max eigenvalue and the corresponding eigenvector
    '''
    # the principal eigenvalue of a positive reciprocal matrix is real
    # (Perron-Frobenius), so pick the eigenvalue with the largest real part
    te_val, te_vector = np.linalg.eig(array)
    index = int(np.argmax(te_val.real))
    max_val = te_val[index].real
    max_vector = te_vector[:, index]

    return max_val, max_vector

1.2.2 Explanation

Simply pass the judgment matrix to the function: it returns the principal (largest) eigenvalue and the corresponding eigenvector.
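Continuing the hypothetical example from section 1.1:

A = get_judgement_matrix([9, 5, 3])
max_val, max_vector = get_tezheng(A)
print(max_val)      # slightly above 3 (the matrix order), since the matrix is nearly consistent
print(max_vector)   # the corresponding (unnormalized) eigenvector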

1.3 Getting the RI value

1.3.1 Code

def RImatrix(n):
    '''
    get the RI value according to the matrix order
    :param n: matrix order
    :return: random consistency index RI of an n-order matrix
    '''
    n1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
    n2 = [0, 0, 0.52, 0.89, 1.12, 1.26, 1.36, 1.41, 1.46, 1.49, 1.52, 1.54, 1.56, 1.58, 1.59, 1.60]
    d = dict(zip(n1, n2))
    return d[n]

1.3.2 Explanation

This is a dictionary lookup that returns the random consistency index RI for a judgment matrix of the given order.
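For example:

print(RImatrix(5))  # 1.12
print(RImatrix(2))  # 0, so the consistency check in the next section is skipped for 1x1 and 2x2 matrices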

1.4 The consistency check

1.4.1 Code

def consitstence(max_val, RI, n):
    '''
    use the CR indicator to test the consistency of a matrix.
    :param max_val: the max eigenvalue of the judgement matrix
    :param RI: random consistency index
    :param n: matrix order
    :return: True or False, whether the matrix passes the consistency check
    '''
    CI = (max_val - n) / (n - 1)
    if RI == 0:
        return True
    else:
        CR = CI / RI
        if CR < 0.10:
            return True
        else:
            return False

1.4.2 Explanation

The consistency index is CI = (λmax − n) / (n − 1) and the consistency ratio is CR = CI / RI. If CR is below 0.1 the judgment matrix passes the consistency check; otherwise it fails. For 1×1 and 2×2 matrices RI is 0 and the check is skipped.
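A quick check with the hypothetical matrix from section 1.1:

A = get_judgement_matrix([9, 5, 3])
max_val, _ = get_tezheng(A)
n = A.shape[0]
CI = (max_val - n) / (n - 1)
CR = CI / RImatrix(n)
print(CR)                                     # well below 0.1 for this matrix
print(consitstence(max_val, RImatrix(n), n))  # True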

1.5 Normalizing the eigenvector of the maximum eigenvalue

1.5.1 Code

def normalize_vector(max_vector):
    '''
    normalize the vector so that its elements sum to 1.0
    :param max_vector: an eigenvector
    :return: the normalized eigenvector
    '''
    vector = []
    for i in max_vector:
        vector.append(i.real)
    vector_after_normalization = []
    sum0 = np.sum(vector)
    for i in range(len(vector)):
        vector_after_normalization.append(vector[i] / sum0)
    vector_after_normalization = np.array(vector_after_normalization)

    return vector_after_normalization

1.5.2 Explanation

The normalized vector is the weight vector of the sub-indicators; its elements sum to 1.
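For the hypothetical example, the eigenvector returned by numpy has unit Euclidean length and may be of complex dtype; after normalization it becomes the weight vector:

A = get_judgement_matrix([9, 5, 3])
max_val, max_vector = get_tezheng(A)
w = normalize_vector(max_vector)
print(w, w.sum())   # three positive weights summing to 1.0, largest for the highest-scored indicator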

1.6 Putting the steps together

1.6.1 Code

def get_weight(score):
    '''
    get the weight vector according to personal scores.
    :param score: a list; each item is a score from 1 to 10 giving the importance of a sub-indicator.
    :return: a numpy array of weights between 0.0 and 1.0 that sum to 1.
    '''
    n = len(score)
    array = get_judgement_matrix(score)
    max_val, max_vector = get_tezheng(array)
    RI = RImatrix(n)
    if consitstence(max_val, RI, n):
        feature_weight = normalize_vector(max_vector)
        return feature_weight
    else:
        # fall back to equal weights if the consistency check fails
        return np.array([1 / n] * n)

1.6.2 Explanation

This wraps the previous steps into one function: given the vector of sub-indicator scores, it returns the vector of importance weights. If the consistency check fails, it falls back to equal weights.
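Putting it all together with the hypothetical scores used throughout this section:

weights = get_weight([9, 5, 3])
print(weights)      # roughly [0.70, 0.19, 0.11]: the higher the score, the larger the weight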

2. A Comprehensive Real-Estate Evaluation Algorithm Based on a Restructured Multi-Level AHP

2.1 Notes

  • This was an undergraduate innovation project from 2018-2019. We crawled all the residential-development listings for the Changchun area from Anjuke, analysed the data, and closed the project with a paper and a software copyright registration.
  • The method works quite well and applies to both structured and unstructured data: numerical features, discrete features, text, and so on.
  • The paper improvises a normalization function based on "psychological satisfaction" by splicing a sigmoid onto a linear segment. It looks rather crude in hindsight; my math is not great, so readers with a stronger background are welcome to design a better normalization function. A closed-form sketch of the idea is given right after this list.
  • A development has quite a few second- and third-level indicators. Back then the data were fairly simple; today the data are much more complex and the indicators far more varied. In my current internship I am building a comprehensive indicator system for postal services with four to five levels and over a hundred indicators, which is a headache to look at.
  • AHP appears frequently in mathematical-modeling contests; you can obtain the weights simply by calling the get_weight function, which is quite convenient.
  • Undergraduate innovation projects are worth doing. Ours was rushed, so my advice is to finish early once the project is approved instead of leaving it until the end.
  • Only the algorithm is open-sourced, not the Java platform (a personalized house-buying assistant). The design is that users enter their preference scores on the platform, the JSON preference data are sent to the Python algorithm over HTTP, and the evaluation results are returned. The Java part may be open-sourced later, so stay tuned.
  • I also plan to open-source all the projects from my undergraduate years: machine learning (clustering, classification, decision trees, SVM, L2 regression, ARIMA, and so on), deep learning (LSTM traffic-flow and parcel-volume prediction, TextCNN text classification, knowledge-graph embedding (KGE) and KG-completion algorithms), and engineering work such as an e-commerce platform built on a WeChat mini program. There is quite a lot, so I will organize and summarize it and set up a personal homepage later.
  • Follows, likes, bookmarks, and questions in the comments are all welcome.
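For readers curious about what that normalization looks like, below is a closed-form sketch of the idea. It mirrors the getScore function in section 2.4, which obtains the same sigmoid parameter numerically with scipy's fsolve; the expectation points (6000, 0.8) and (12000, 0.3) used in the example are the unit-price defaults from the script's dataDict and are purely illustrative.

import math


def satisfaction_normalize(x, point1, point2):
    # point1 = (x1, y1) and point2 = (x2, y2) are the user's expectation points.
    # Between them the score changes linearly; outside them it follows a sigmoid,
    # so very small or very large raw values saturate instead of dominating.
    (x1, y1), (x2, y2) = point1, point2
    if x < x1:
        a = math.log(y1 / (1 - y1)) / x1      # chosen so that sigmoid(a * x1) == y1
        return 1 / (1 + math.exp(a * (x - 2 * x1)))
    if x <= x2:
        return y1 + (x - x1) * (y2 - y1) / (x2 - x1)   # linear segment
    a = math.log(y2 / (1 - y2)) / x2          # chosen so that sigmoid(a * x2) == y2
    return 1 / (1 + math.exp(-a * x))


# unit price: 6000 CNY/m2 maps to satisfaction 0.8, 12000 to 0.3, outliers saturate
for price in (4000, 6000, 9000, 12000, 15000):
    print(price, round(satisfaction_normalize(price, (6000, 0.8), (12000, 0.3)), 3))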

2.2 Paper

基于多层次分析法的楼盘综合水平评价算法 (A comprehensive real-estate evaluation algorithm based on multi-level AHP)

2.3 Open source

Open-source project (algorithm part only)

2.4 Code

# -*- coding: utf-8 -*-
# @Time : 2019/1/5 11:28
# @Author : RedCedar
# @File : run.py
# @Software: PyCharm
# @note:

import pandas as pd
from math import *
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
from scipy.optimize import  fsolve
import math

mpl.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
mpl.rcParams['axes.unicode_minus'] = False  # display the minus sign correctly


def get_judgement_matrix(scores):
    '''
    get the judgement matrix according to personal scores.
    :param scores: a list; each item is a score from 1 to 10 giving the importance of a sub-indicator.
    :return: judgement matrix whose items range from 1 to 9.

    - more: in the judgement matrix:
    1 means the two sub-indicators are equally important.

    3 means the first sub-indicator is slightly more important than the other.

    5 means the first sub-indicator is clearly more important than the other.

    7 means the first sub-indicator is strongly more important than the other.

    9 means the first sub-indicator is extremely more important than the other.

    2, 4, 6 and 8 denote intermediate degrees.
    '''

    # scores range from 1 to 10
    length = len(scores)

    array = np.zeros((length, length))
    for i in range(0, length):
        for j in range(0, length):
            point1 = scores[i]
            point2 = scores[j]
            deta = point1 - point2  # difference of the two importance scores
            if deta < 0:
                continue
            elif deta == 0 or deta == 1:
                array[i][j] = 1
                array[j][i] = 1
            else:
                array[i][j] = deta
                array[j][i] = 1 / deta

    return array


def get_tezheng(array):
    '''
    get the max eigenvalue and the corresponding eigenvector
    :param array: judgement matrix
    :return: max eigenvalue and the corresponding eigenvector
    '''
    # the principal eigenvalue of a positive reciprocal matrix is real
    # (Perron-Frobenius), so pick the eigenvalue with the largest real part
    te_val, te_vector = np.linalg.eig(array)
    index = int(np.argmax(te_val.real))
    max_val = te_val[index].real
    max_vector = te_vector[:, index]

    return max_val, max_vector


def RImatrix(n):
    '''
    get the RI value according to the matrix order
    :param n: matrix order
    :return: random consistency index RI of an n-order matrix
    '''
    n1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
    n2 = [0, 0, 0.52, 0.89, 1.12, 1.26, 1.36, 1.41, 1.46, 1.49, 1.52, 1.54, 1.56, 1.58, 1.59, 1.60]
    d = dict(zip(n1, n2))
    return d[n]


def consitstence(max_val, RI, n):
    '''
    use the CR indicator to test the consistency of a matrix.
    :param max_val: the max eigenvalue of the judgement matrix
    :param RI: random consistency index
    :param n: matrix order
    :return: True or False, whether the matrix passes the consistency check
    '''
    CI = (max_val - n) / (n - 1)
    if RI == 0:
        return True
    else:
        CR = CI / RI
        if CR < 0.10:
            return True
        else:
            return False


def minMax(array):
    # min-max scaling: map the values linearly onto the range [0, 1]
    result = []
    for x in array:
        x = float(x - np.min(array)) / (np.max(array) - np.min(array))
        result.append(x)
    return np.array(result)


def normalize_vector(max_vector):
    '''
    normalize the vector so that its elements sum to 1.0
    :param max_vector: an eigenvector
    :return: the normalized eigenvector
    '''
    vector = []
    for i in max_vector:
        vector.append(i.real)
    vector_after_normalization = []
    sum0 = np.sum(vector)
    for i in range(len(vector)):
        vector_after_normalization.append(vector[i] / sum0)
    vector_after_normalization = np.array(vector_after_normalization)

    return vector_after_normalization


def get_weight(score):
    '''
    get the weight vector according to personal scores.
    :param score: a list; each item is a score from 1 to 10 giving the importance of a sub-indicator.
    :return: a numpy array of weights between 0.0 and 1.0 that sum to 1.
    '''
    n = len(score)
    array = get_judgement_matrix(score)
    max_val, max_vector = get_tezheng(array)
    RI = RImatrix(n)
    if consitstence(max_val, RI, n):
        feature_weight = normalize_vector(max_vector)
        return feature_weight
    else:
        # fall back to equal weights if the consistency check fails
        return np.array([1 / n] * n)


def getScore(array, point1, point2):
    '''
    a normalization function based on human psychological satisfaction
    :param array: list, element is indicator's original value
    :param point1: the left expectation point, a list, [x1,y1]
    :param point2: the right expectation point, a list, [x2,y2]
    :return: normalized array
    '''
    x1 = point1[0]
    x2 = point2[0]
    y1 = point1[1]
    y2 = point2[1]

    def f1(a):
        equation1 = 1 / (1 + math.exp(-a * x1)) - y1
        return equation1

    def f2(a):
        equation1 = 1 / (1 + math.exp(-a * x2)) - y2
        return equation1

    # store the normalized values
    values = []

    for i in array:
        try:
            # accept both scalars and one-element rows (e.g. a 2-D column slice)
            i = i[0]
        except (TypeError, IndexError):
            pass
        if i < x1:
            sol3_fsolve = fsolve(f1, [0])
            a = sol3_fsolve[0]
            value = 1 / (1 + math.exp(a * (i - 2 * x1)))
        elif x1 <= i and i <= x2:
            value = (i - x1) * (y2 - y1) / (x2 - x1) + y1
        else:
            sol3_fsolve = fsolve(f2, [0])
            a = sol3_fsolve[0]
            value = 1 / (1 + math.exp(-a * i))
        values.append(value)

    # plt.scatter(array, values)
    # plt.show()
    return values


def show_score(value, title=''):
    x = np.linspace(1, len(value) + 1, len(value))
    plt.scatter(x, value)
    plt.title(title)
    plt.show()


def result(dataDict):
    def price_score():
        '''

        :return: the price index of each development
        '''

        # unit price (单价)
        df = pd.read_csv('all.csv', index_col=0)
        eachPrice_array = df.loc[:, ['单价']].values
        # print('eachPrice_array')
        # print(eachPrice_array)

        # unit-price score
        eachPrice_value = getScore(eachPrice_array[:, 0], [dataDict['price[each_price_min]'], 0.8],
                                   [dataDict['price[each_price_max]'], 0.3])

        # property management fee (物业费)
        propertyFee_array = df.loc[:, ['物业费']].values
        # property-fee score
        propertyFee_value = getScore(propertyFee_array, [dataDict['price[property_price_min]'], 0.8],
                                     [dataDict['price[property_price_max]'], 0.3])

        # sub-indicators: unit price, property management fee
        price_score = [dataDict['price[price_each]'], dataDict['price[price_property]']]
        price_weight = get_weight(price_score)

        prices_values = []
        for i in range(0, len(propertyFee_array)):
            a1 = eachPrice_value[i]
            a2 = propertyFee_value[i]
            price_value = price_weight * [a1, a2]
            prices_values.append(sum(price_value))

        # show_score(prices_values,'价格指数')
        return prices_values

    def traffic_score():
        df = pd.read_csv('all.csv', index_col=0)
        array = df.loc[:, ['站点数目', '平均距离', '平均线路数目', '平均打分', '平均轨道数目']].values

        # score for the number of transit stops (站点数目)
        zhandianshumu_value = getScore(array[:, 0], [dataDict['traffic[zhandianshumu_min]'], 0.2],
                                       [dataDict['traffic[zhandianshumu_max]'], 0.8])

        # score for the average distance to stops (平均距离)
        pinjunjuli_value = getScore(array[:, 1], [dataDict['traffic[pingjunjuli_min]'], 0.8],
                                    [dataDict['traffic[pingjunjuli_max]'], 0.4])

        # score for the average number of bus lines (平均线路数目)
        pingjunxianlu_value = getScore(array[:, 2], [dataDict['traffic[pingjungongjiaoxianlu_min]'], 0.2],
                                       [dataDict['traffic[pingjungongjiaoxianlu_max]'], 0.8])

        # score for the average transit rating (平均打分)
        pingjundafen_value = getScore(array[:, 3], [dataDict['traffic[pingjunjiaotongdefen_min]'], 0.2],
                                      [dataDict['traffic[pingjunjiaotongdefen_max]'], 0.8])

        # score for the average number of rail lines (平均轨道数目)
        pingjunguidao_value = getScore(array[:, 4], [dataDict['traffic[pingjunguidaojiaotong_min]'], 0.2],
                                       [dataDict['traffic[pingjunguidaojiaotong_max]'], 0.8])

        # sub-indicators: number of stops, average distance, average number of bus lines, average rating, average number of rail lines
        traffic_score = [dataDict['traffic[traffic_zhandianshumu]'], dataDict['traffic[traffic_pingjunjuli]'],
                         dataDict['traffic[traffic_pingjungongjiaoxianlu]'],
                         dataDict['traffic[traffic_pingjunjiaotongdefen]'],
                         dataDict['traffic[traffic_pingjunguidaojiaotong]']]
        # weights of the traffic factors
        traffic_weight = get_weight(traffic_score)

        traffic_values = []
        for i in range(0, len(array)):
            a1 = zhandianshumu_value[i]
            a2 = pinjunjuli_value[i]
            a3 = pingjunxianlu_value[i]
            a4 = pingjundafen_value[i]
            a5 = pingjunguidao_value[i]
            traffic_value = traffic_weight * [a1, a2, a3, a4, a5]
            traffic_values.append(sum(traffic_value))
        # show_score(traffic_values,'交通指数')
        return traffic_values

    def community_score():
        '''
        hardware facilities of the development (小区设施)
        :return:
        '''
        # plot ratio (容积率) score: the lower the plot ratio, the better
        # reference ranges: detached villas 0.2~0.5; townhouses 0.4~0.7;
        # multi-storey housing (6 floors or fewer) 0.8~1.2; 11-storey mid-rises 1.5~2.0;
        # 18-storey high-rises 1.8~2.5; buildings of 19+ storeys 2.4~4.5;
        # a residential compound with a plot ratio below 1.0 is classed as non-ordinary housing
        df = pd.read_csv('all.csv', index_col=0)
        plotRate_array = df.loc[:, ['容积率']].values
        plotRate_value = getScore(plotRate_array[:, 0], [dataDict['community[plotRate_min]'], 0.8],
                                  [dataDict['community[plotRate_max]'], 0.3])

        # green-space ratio (绿化率)
        greenRate_array = df.loc[:, ['绿化率']].values
        greenRate_value = getScore(greenRate_array[:, 0], [dataDict['community[greenRate_min]'], 0.4],
                                   [dataDict['community[greenRate_max]'], 0.8])

        # parking-space ratio (车位比)
        parkProportion_array = df.loc[:, ['车位比']].values
        parkProportion_value = getScore(parkProportion_array[:, 0], [dataDict['community[parkProportion_min]'], 0.2],
                                        [dataDict['community[parkProportion_max]'], 0.8])

        # sub-indicators: plot ratio, green-space ratio, parking-space ratio
        community_score = [dataDict['community[community_plotRate]'], dataDict['community[community_greenRate]'],
                           dataDict['community[community_parkProportion]']]
        community_weight = get_weight(community_score)

        community_values = []
        for i in range(0, len(parkProportion_array)):
            a1 = plotRate_value[i]
            a2 = greenRate_value[i]
            a3 = parkProportion_value[i]
            community_value = community_weight * [a1, a2, a3]
            community_values.append(sum(community_value))
        # show_score(community_values,'小区设施')
        return community_values

    def muti_score():
        # building-type score
        # '高层', '花园洋房', '别墅_建筑类型', '多层','住宅','商住','店铺','购物中心','商业街'
        buildingType_score = [dataDict['muti[buildType_gaoCeng]'],
                              dataDict['muti[buildType_huaYuanYangFang]'],
                              dataDict['muti[buildType_bieShu]'],
                              dataDict['muti[buildType_duoCeng]'],
                              dataDict['muti[buildType_zhuZhai]'],
                              dataDict['muti[buildType_shangZhu]'],
                              dataDict['muti[buildType_dianPu]'],
                              dataDict['muti[buildType_gouWuZhongXin]'],
                              dataDict['muti[buildType_shangYeJie]']]
        buildingType_weight = get_weight(buildingType_score)

        df = pd.read_csv('all.csv', index_col=0)
        bulidingType_array = df.loc[:, ['高层', '花园洋房', '别墅_建筑类型', '多层', '住宅', '商住', '店铺', '购物中心', '商业街']].values
        buildingType_value = []
        for i in bulidingType_array:
            buildingType_value.append(np.sum(buildingType_weight * i))
        buildingType_value = minMax(buildingType_value)

        # development-feature score (the names below are the CSV column names)
        # 其他0,低总价15, 公交枢纽14, 低密度13, 婚房12, 投资地产11,
        #  公园10, 商场9, 公寓8, 大型超市7, 轨交房6, 大型社区5, 品牌开发商4, 改善房3, 学校2, 刚需房1
        feature_score = [1,
                         dataDict['muti[feature_diZongJia]'],
                         dataDict['muti[feature_gongJiaoShuNiu]'],
                         dataDict['muti[feature_diMiDu]'],
                         dataDict['muti[feature_hunFang]'],
                         dataDict['muti[feature_touZiDiChan]'],
                         dataDict['muti[feature_gongYuan]'],
                         dataDict['muti[feature_shangChang]'],
                         dataDict['muti[feature_gongYu]'],
                         dataDict['muti[feature_daXingChaoShi]'],
                         dataDict['muti[feature_guiJiaoFang]'],
                         dataDict['muti[feature_daXingSheQu]'],
                         dataDict['muti[feature_pinPaiKaiFaShang]'],
                         dataDict['muti[feature_gaiShanFang]'],
                         dataDict['muti[feature_xueXiao]'],
                         dataDict['muti[feature_gangXuFang]']]
        feature_weight = get_weight(feature_score)

        feature_array = df.loc[:,
                        ['其他', '刚需房', '学校', '改善房', '品牌开发商', '大型社区', '轨交房', '大型超市', '公寓', '商场', '公园', '投资地产', '婚房',
                         '低密度', '公交枢纽', '低总价']].values
        feature_value = []
        for i in feature_array:
            feature_value.append(np.sum(feature_weight * i))
        feature_value = minMax(feature_value)

        # house-type (floor-plan) score
        # 1室	2室	3室	4室	5室	6室	7室	别墅	商户
        houseType_score = [dataDict['muti[houseType_1]'],
                           dataDict['muti[houseType_2]'],
                           dataDict['muti[houseType_3]'],
                           dataDict['muti[houseType_4]'],
                           dataDict['muti[houseType_5]'],
                           dataDict['muti[houseType_6]'],
                           dataDict['muti[houseType_7]'],
                           dataDict['muti[houseType_bieShu]'],
                           dataDict['muti[houseType_shangHu]']]
        houseType_weight = get_weight(houseType_score)

        houseType_array = df.loc[:, ['1室', '2室', '3室', '4室', '5室', '6室', '7室', '别墅_户型', '商户']].values
        houseType_value = []
        for i in houseType_array:
            value = i * houseType_weight
            value = np.sum(value)
            houseType_value.append(value)
        houseType_value = getScore(houseType_value, [0.4, 0.2], [2, 0.8])

        # sub-indicators: building type, development features, house type
        muti_score = [dataDict['muti[muti_buildType]'], dataDict['muti[muti_feature]'],
                      dataDict['muti[muti_houseType]']]
        muti_weight = get_weight(muti_score)

        muti_values = []
        for i in range(0, len(bulidingType_array)):
            a1 = buildingType_value[i]
            a2 = feature_value[i]
            a3 = houseType_value[i]
            muti_value = muti_weight * [a1, a2, a3]
            muti_values.append(sum(muti_value))
        # show_score(muti_values,'楼盘多样性')
        return muti_values

    def location_score():
        def manhatten(point1, point2):
            def haversine(point1, point2):  # lon1, lat1, lon2, lat2 (decimal degrees)
                """
                Calculate the great circle distance between two points
                on the earth (specified in decimal degrees)
                """
                # convert decimal degrees to radians
                lon1 = point1[0]
                lat1 = point1[1]
                lon2 = point2[0]
                lat2 = point2[1]
                lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

                # the haversine formula
                dlon = lon2 - lon1
                dlat = lat2 - lat1
                a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
                c = 2 * asin(sqrt(a))
                r = 6371  # mean Earth radius in kilometres
                return c * r

            p = [point2[0], point1[1]]
            return haversine(point1, p) + haversine(p, point2)

        # district score (地域得分)
        region_score = [1,
                        dataDict['location[region_nanGuan]'],
                        dataDict['location[region_zhaoYang]'],
                        dataDict['location[region_jingYue]'],
                        dataDict['location[region_lvYuan]'],
                        dataDict['location[region_erDao]'],
                        dataDict['location[region_gaoXin]'],
                        dataDict['location[region_jingKai]'],
                        dataDict['location[region_kuanCheng]'],
                        dataDict['location[region_qiKai]'],
                        1]
        region_weight = get_weight(region_score)

        df = pd.read_csv('all.csv', index_col=0)
        region_array = df.loc[:, ['其他', '南关', '朝阳', '净月', '绿园', '二道', '高新', '经开', '宽城', '汽开', '长春周边']].values
        region_value = []
        for i in region_array:
            region_value.append(np.sum(region_weight * i))
        region_value = minMax(region_value)

        # average Manhattan distance to a set of reference points in the city
        # longitude and latitude
        jw_array = df.loc[:, ['经度', '纬度']].values
        points = [[125.330665, 43.917598],
                  [125.313642, 43.898338],
                  [125.457405, 43.808973],
                  [125.453503, 43.778884],
                  [125.307394, 43.876633],
                  [125.296092, 43.870066],
                  [125.322061, 43.872801],
                  [125.330277, 43.897067],
                  [125.31221, 43.882698],
                  [125.453503, 43.778884]]
        jw_value = []
        for i in jw_array:
            manhatten_distance = 0
            for j in points:
                manhatten_distance += manhatten(i, j)
            # average Manhattan distance

            avg_manhatten_distance = manhatten_distance / len(points)
            jw_value.append(avg_manhatten_distance)

        jw_value = getScore(jw_value, [dataDict['location[manhatten_min]'], 0.7],
                            [dataDict['location[manhatten_max]'], 0.4])

        # sub-indicators: district score, average Manhattan distance
        location_score = [dataDict['location[location_region]'], dataDict['location[location_manhatten]']]
        location_weight = get_weight(location_score)

        location_values = []
        for i in range(0, len(region_array)):
            a1 = region_value[i]
            a2 = jw_value[i]
            location_value = location_weight * [a1, a2]
            location_values.append(sum(location_value))
        # show_score(location_values,'地域指数')
        return location_values

    # # convert the string values to numbers
    # for key, value in list(dataDict.items()):
    #     dataDict[key] = float(value)

    ps = price_score()
    ts = traffic_score()
    cs = community_score()
    ms = muti_score()
    ls = location_score()
    ps = np.array(ps).reshape(-1)  # flatten the price scores (one value per development)

    V = [dataDict['final[final_price]'],
         dataDict['final[final_traffic]'],
         dataDict['final[final_community]'],
         dataDict['final[final_muti]'],
         dataDict['final[final_location]']]
    W = get_weight(V)

    M = []
    data = []
    df = pd.read_csv('all.csv', index_col=0)
    names = df.loc[:, ['楼盘名称']].values[:, 0]
    prices = df.loc[:, ['单价']].values[:, 0]
    for i in range(0, len(ps)):
        a1 = ps[i]
        a2 = ts[i]
        a3 = cs[i]
        a4 = ms[i]
        a5 = ls[i]
        q = W * [a1, a2, a3, a4, a5]
        m = sum(q)
        M.append(m)
        name = names[i]
        price = prices[i]
        data.append([name, price, a1, a2, a3, a4, a5, m, price / m])
    df = pd.DataFrame(data, columns=['楼盘名称', '价格', '价格指数', '交通指数', '小区设施', '楼盘多样性', '地域指数', '楼盘综合水平', '价格/综合水平'])
    df.to_excel('comprehensive evaluation.xlsx')
    return data


if __name__ == '__main__':
    '''
    you can fill in your personal scores in dataDict
    '''
    dataDict = {'final[final_price]': 8,
                'final[final_traffic]': 6,
                'final[final_community]': 5,
                'final[final_muti]': 3,
                'final[final_location]': 4,
                'price[each_price_min]': 6000,
                'price[each_price_max]': 12000,
                'price[property_price_min]': 0.5,
                'price[property_price_max]': 1.4,
                'traffic[zhandianshumu_min]': 2,
                'traffic[zhandianshumu_max]': 6,
                'traffic[pingjunjuli_min]': 500,
                'traffic[pingjunjuli_max]': 1200,
                'traffic[pingjungongjiaoxianlu_min]': 2,
                'traffic[pingjungongjiaoxianlu_max]': 5,
                'traffic[pingjunjiaotongdefen_min]': 3,
                'traffic[pingjunjiaotongdefen_max]': 6,
                'traffic[pingjunguidaojiaotong_min]': 1,
                'traffic[pingjunguidaojiaotong_max]': 3,
                'community[plotRate_min]': 0.8,
                'community[plotRate_max]': 2.2,
                'community[greenRate_min]': 0.5,
                'community[greenRate_max]': 1.1,
                'community[parkProportion_min]': 0.5,
                'community[parkProportion_max]': 1.0,
                }
    dataDict['price[price_each]']=7
    dataDict['price[price_property]']=3

    dataDict['traffic[traffic_zhandianshumu]']=5
    dataDict['traffic[traffic_pingjunjuli]']=6
    dataDict['traffic[traffic_pingjungongjiaoxianlu]']=4
    dataDict['traffic[traffic_pingjunjiaotongdefen]']=8
    dataDict['traffic[traffic_pingjunguidaojiaotong]']=3

    dataDict['community[community_plotRate]']=4
    dataDict['community[community_greenRate]']=7
    dataDict['community[community_parkProportion]']=5

    dataDict['muti[muti_buildType]']=5
    dataDict['muti[muti_feature]']=5
    dataDict['muti[muti_houseType]']=5
    dataDict['location[manhatten_min]']=1000
    dataDict['location[manhatten_max]']=3000

    dataDict['location[location_region]']=7
    dataDict['location[location_manhatten]']=4

    dataDict['muti[buildType_gaoCeng]'] = 1
    dataDict['muti[buildType_huaYuanYangFang]'] = 0
    dataDict['muti[buildType_bieShu]'] = 0
    dataDict['muti[buildType_duoCeng]'] = 0
    dataDict['muti[buildType_zhuZhai]'] = 0
    dataDict['muti[buildType_shangZhu]'] = 1
    dataDict['muti[buildType_dianPu]'] = 1
    dataDict['muti[buildType_gouWuZhongXin]'] = 0
    dataDict['muti[buildType_shangYeJie]'] = 1

    dataDict['muti[feature_diZongJia]'] = 1
    dataDict['muti[feature_gongJiaoShuNiu]'] = 1
    dataDict['muti[feature_diMiDu]'] = 0
    dataDict['muti[feature_hunFang]'] = 1
    dataDict['muti[feature_touZiDiChan]'] = 1
    dataDict['muti[feature_gongYuan]'] = 1
    dataDict['muti[feature_shangChang]'] = 0
    dataDict['muti[feature_gongYu]'] = 0
    dataDict['muti[feature_daXingChaoShi]'] = 1
    dataDict['muti[feature_guiJiaoFang]'] = 1
    dataDict['muti[feature_daXingSheQu]'] = 1
    dataDict['muti[feature_pinPaiKaiFaShang]'] = 0
    dataDict['muti[feature_gaiShanFang]'] = 1
    dataDict['muti[feature_xueXiao]'] = 1
    dataDict['muti[feature_gangXuFang]'] = 1

    dataDict['muti[houseType_1]'] = 1
    dataDict['muti[houseType_2]'] = 1
    dataDict['muti[houseType_3]'] = 0
    dataDict['muti[houseType_4]'] = 1
    dataDict['muti[houseType_5]'] = 0
    dataDict['muti[houseType_6]'] = 0
    dataDict['muti[houseType_7]'] = 1
    dataDict['muti[houseType_bieShu]'] = 0
    dataDict['muti[houseType_shangHu]'] = 0

    dataDict['location[region_nanGuan]'] = 7
    dataDict['location[region_zhaoYang]'] = 9
    dataDict['location[region_jingYue]'] = 5
    dataDict['location[region_lvYuan]'] = 3
    dataDict['location[region_erDao]'] = 4
    dataDict['location[region_gaoXin]'] = 4
    dataDict['location[region_jingKai]'] = 3
    dataDict['location[region_kuanCheng]'] = 2
    dataDict['location[region_qiKai]'] = 2
    data = result(dataDict)
    print(data)

2.5 Data

The data fed to the algorithm are listed here: the column names followed by one sample row.

,楼盘名称,经度,纬度,总价,单价,待售,期房在售,售罄,现房在售,尾盘,其他,刚需房,学校,改善房,品牌开发商,大型社区,轨交房,大型超市,公寓,商场,公园,投资地产,婚房,低密度,公交枢纽,低总价,住宅,别墅_属性,商住,店铺,购物中心,商业街,其他,南关,朝阳,净月,绿园,二道,高新,经开,宽城,汽开,长春周边,高层,花园洋房,别墅_建筑类型,多层,产权年限,容积率,绿化率,居住人数,在建中,已竣工,已封顶,规划中,物业费,车位数,车位比,1室,2室,3室,4室,5室,6室,7室,别墅_户型,商户,开盘年份,开盘月份,开盘日,交盘年份,交盘月份,交盘日,站点数目,平均距离,平均线路数目,平均打分,平均轨道数目
0,天茂湖,125.236257,43.771314,3509790.881,12000,0,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,70,1.1,0.4,200,1,0,0,0,1,500,0,0,1,4,3,1,0,0,0,0,2018,5,26,2018.184675,9.608352987,30.01640875,12,1792,1.333333333,1.333333333,0
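As a quick sanity check, the CSV can be loaded with pandas and a few of the columns used by result() inspected. This is a minimal sketch and assumes the crawled all.csv file sits in the working directory.

import pandas as pd

df = pd.read_csv('all.csv', index_col=0)
print(df.shape)  # one row per development
print(df.loc[:, ['楼盘名称', '单价', '物业费', '容积率', '绿化率', '车位比']].head())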

Part of the resulting comprehensive evaluation (the table written to comprehensive evaluation.xlsx by the script above) is omitted here.

The raw data were quite messy and took a long time to clean, so they are not shown either.
