Machine Learning Experiments (10): Localization from WiFi Fingerprints with Autoencoders and a Neural Network, Part 1 (TensorFlow version)

风雪夜归子

Category: Machine Learning Experiments. Tags: Autoencoders, Neural Network

Article link: https://blog.csdn.net/u013719780/article/details/53261125

Autoencoders and Neural Network for Place recognition with WiFi fingerprints

This post is a set of reading notes on the paper by Michał Nowicki and Jan Wietrzykowski.

Paper: https://arxiv.org/pdf/1611.02049v1.pdf

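Many real-world applications need to know where a user is in order to provide a service, so automatic user localization has been an active research topic in recent years. Automatic localization means estimating the user's position (latitude, longitude, and altitude). Outdoors the problem is comparatively easy, because mobile devices carry sensors such as GPS. Indoor localization, however, still faces many difficulties and remains an open problem, mainly because GPS signals are attenuated or lost inside buildings.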

The experiments in this post use the UJIIndoorLoc dataset. Its details are as follows:

Attribute Information:

Attribute 001 (WAP001): Intensity value for WAP001. Negative integer values from -104 to 0 and +100. Positive value 100 used if WAP001 was not detected.
....
Attribute 520 (WAP520): Intensity value for WAP520. Negative integer values from -104 to 0 and +100. Positive value 100 used if WAP520 was not detected.
Attribute 521 (Longitude): Longitude. Negative real values from -7695.9387549299299000 to -7299.786516730871000.
Attribute 522 (Latitude): Latitude. Positive real values from 4864745.7450159714 to 4865017.3646842018.
Attribute 523 (Floor): Altitude in floors inside the building. Integer values from 0 to 4.
Attribute 524 (BuildingID): ID to identify the building. Measures were taken in three different buildings. Categorical integer values from 0 to 2.
Attribute 525 (SpaceID): Internal ID number to identify the Space (office, corridor, classroom) where the capture was taken. Categorical integer values.
Attribute 526 (RelativePosition): Relative position with respect to the Space (1 - Inside, 2 - Outside in Front of the door). Categorical integer values.
Attribute 527 (UserID): User identifier (see below). Categorical integer values.
Attribute 528 (PhoneID): Android device identifier (see below). Categorical integer values.
Attribute 529 (Timestamp): UNIX Time when the capture was taken. Integer value.
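
The attribute list says that `+100` is a sentinel meaning "access point not detected", while real readings lie between -104 and 0. The sketch below shows one way to inspect and remap that sentinel on a tiny made-up frame. Note the assumptions: the toy values, the three-column layout, and the replacement value `-110` are all hypothetical, and the notebook in this post does not perform this step (it passes the raw values, including `+100`, straight to `scale`).

```python
import numpy as np
import pandas as pd

# Toy stand-in for three WAP columns; the real file has WAP001..WAP520.
toy = pd.DataFrame({
    "WAP001": [-60, 100, -85],
    "WAP002": [100, 100, -70],
    "WAP003": [-104, -45, 100],
})

NOT_DETECTED = 100   # sentinel documented in the attribute list
WEAK_SIGNAL = -110   # assumed replacement, just below the weakest reading (-104)

# Remap the sentinel so that "not detected" sorts below every real reading.
cleaned = toy.replace(NOT_DETECTED, WEAK_SIGNAL)

# Count how many access points each scan actually heard.
detected_per_scan = (toy != NOT_DETECTED).sum(axis=1)

print(cleaned)
print(detected_per_scan.tolist())
```

Whether remapping helps is an empirical question; it is shown here only to make the meaning of the sentinel concrete.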

UserID  Anonymized user              Height (cm)

0       USER0000 (Validation User)   N/A
1       USER0001                     170
2       USER0002                     176
3       USER0003                     172
4       USER0004                     174
5       USER0005                     184
6       USER0006                     180
7       USER0007                     160
8       USER0008                     176
9       USER0009                     177
10      USER0010                     186
11      USER0011                     176
12      USER0012                     158
13      USER0013                     174
14      USER0014                     173
15      USER0015                     174
16      USER0016                     171
17      USER0017                     166
18      USER0018                     162

PhoneID  Android Device       Android Ver.   UserID

0        Celkon A27           4.0.4(6577)    0
1        GT-I8160             2.3.6          8
2        GT-I8160             4.1.2          0
3        GT-I9100             4.0.4          5
4        GT-I9300             4.1.2          0
5        GT-I9505             4.2.2          0
6        GT-S5360             2.3.6          7
7        GT-S6500             2.3.6          14
8        Galaxy Nexus         4.2.2          10
9        Galaxy Nexus         4.3            0
10       HTC Desire HD        2.3.5          18
11       HTC One              4.1.2          15
12       HTC One              4.2.2          0
13       HTC Wildfire S       2.3.5          0,11
14       LT22i                4.0.4          0,1,9,16
15       LT22i                4.1.2          0
16       LT26i                4.0.4          3
17       M1005D               4.0.4          13
18       MT11i                2.3.4          4
19       Nexus 4              4.2.2          6
20       Nexus 4              4.3            0
21       Nexus S              4.1.2          0
22       Orange Monte Carlo   2.3.5          17
23       Transformer TF101    4.0.3          2
24       bq Curie             4.1.1          12

    In [1]: 
  

import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import scale

    In [2]: 
  

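dataset = pd.read_csv("trainingData.csv", header=0)
# Columns 0..519 hold the WAP001..WAP520 signal strengths.
# .iloc replaces the .ix indexer, which has been removed from recent pandas releases.
features = scale(np.asarray(dataset.iloc[:, 0:520]))
# One class per (building, floor) pair: building 1 + floor 3 becomes the string "13".
labels = np.asarray(dataset["BUILDINGID"].map(str) + dataset["FLOOR"].map(str))
# One-hot encode the class strings.
labels = np.asarray(pd.get_dummies(labels))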

/Applications/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py:420: DataConversionWarning: Data with input dtype int64 was converted to float64 by the scale function.
  warnings.warn(msg, DataConversionWarning)
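
To make the label construction concrete, here is the same recipe applied to a five-row toy frame. The toy values are made up for illustration; only the column names `BUILDINGID` and `FLOOR` come from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: two scans in building 0, one in building 1, two in building 2.
toy = pd.DataFrame({
    "BUILDINGID": [0, 0, 1, 2, 2],
    "FLOOR":      [0, 3, 1, 4, 4],
})

# Same recipe as the cell above: "<building><floor>" strings, then one-hot.
combined = toy["BUILDINGID"].map(str) + toy["FLOOR"].map(str)
one_hot = pd.get_dummies(combined)

class_names = list(one_hot.columns)           # sorted unique class strings
labels = np.asarray(one_hot, dtype=np.float32)

print(class_names)
print(labels.shape)
```

Each row of `labels` contains exactly one 1, and the number of columns equals the number of distinct building-floor pairs that occur in the data.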

Split the UJIIndoorLoc training data set into a training set and a validation set

    In [3]: 
  

train_val_split = np.random.rand(len(features)) < 0.70
train_x = features[train_val_split]
train_y = labels[train_val_split]
val_x = features[~train_val_split]
val_y = labels[~train_val_split]
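
The split above draws one uniform random number per row and keeps the rows below 0.70 for training, so the split is roughly, not exactly, 70/30, and it changes on every run. A minimal sketch of the same idea on placeholder data, with a fixed seed added for reproducibility (the seed and the toy array are assumptions, not part of the original cell):

```python
import numpy as np

rng = np.random.RandomState(0)        # fixed seed: an addition, the original is unseeded
n_samples = 1000
features = rng.randn(n_samples, 4)    # placeholder data standing in for the scaled WAP columns

# Boolean mask: True marks a training row, False a validation row.
mask = rng.rand(n_samples) < 0.70
train_x, val_x = features[mask], features[~mask]
train_fraction = mask.mean()

print(train_x.shape[0], val_x.shape[0], round(train_fraction, 3))
```

Because `mask` and `~mask` are complementary, every row lands in exactly one of the two sets.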

Use the UJIIndoorLoc validation data set as the test set

    In [4]: 
  

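test_dataset = pd.read_csv("validationData.csv", header=0)
# Same preprocessing as for the training file; note that scale() here uses the
# test file's own per-column mean and standard deviation.
test_features = scale(np.asarray(test_dataset.iloc[:, 0:520]))
test_labels = np.asarray(test_dataset["BUILDINGID"].map(str) + test_dataset["FLOOR"].map(str))
test_labels = np.asarray(pd.get_dummies(test_labels))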

/Applications/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py:420: DataConversionWarning: Data with input dtype int64 was converted to float64 by the scale function.
  warnings.warn(msg, DataConversionWarning)
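
The cell above standardizes the test file with its own statistics. A common alternative is to compute the mean and standard deviation on the training data once and reuse them for the test data, so that both pass through the same transformation. The sketch below contrasts the two on synthetic arrays; it is not what this notebook does, the numbers are made up, and the reported accuracies further down come from the independent scaling.

```python
import numpy as np

rng = np.random.RandomState(0)
# Synthetic signal strengths: the "test" readings are shifted by +10 on purpose.
train = rng.normal(loc=-80.0, scale=10.0, size=(200, 3))
test = rng.normal(loc=-70.0, scale=10.0, size=(50, 3))

# Statistics from the training data only.
mean = train.mean(axis=0)
std = train.std(axis=0)
std[std == 0.0] = 1.0   # guard against constant columns

train_scaled = (train - mean) / std
test_shared = (test - mean) / std                                   # reuse training statistics
test_independent = (test - test.mean(axis=0)) / test.std(axis=0)    # scale the test set on its own

print(test_shared.mean(axis=0))        # the shift between the two sets is preserved
print(test_independent.mean(axis=0))   # the shift is removed, every column is centered at 0
```

With scikit-learn the shared variant is usually written with `StandardScaler`: `fit` on the training features, then `transform` on both sets.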

    In [5]: 
  

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev = 0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.0, shape = shape)
    return tf.Variable(initial)

    In [6]: 
  

n_input = 520 
n_hidden_1 = 256 
n_hidden_2 = 128 
n_hidden_3 = 64 

n_classes = labels.shape[1]

learning_rate = 0.01
training_epochs = 20
batch_size = 10

total_batches = dataset.shape[0] // batch_size

    In [7]: 
  

X = tf.placeholder(tf.float32, shape=[None,n_input])
Y = tf.placeholder(tf.float32,[None,n_classes])

# --------------------- Encoder Variables --------------- #

e_weights_h1 = weight_variable([n_input, n_hidden_1])
e_biases_h1 = bias_variable([n_hidden_1])

e_weights_h2 = weight_variable([n_hidden_1, n_hidden_2])
e_biases_h2 = bias_variable([n_hidden_2])

e_weights_h3 = weight_variable([n_hidden_2, n_hidden_3])
e_biases_h3 = bias_variable([n_hidden_3])

# --------------------- Decoder Variables --------------- #

d_weights_h1 = weight_variable([n_hidden_3, n_hidden_2])
d_biases_h1 = bias_variable([n_hidden_2])

d_weights_h2 = weight_variable([n_hidden_2, n_hidden_1])
d_biases_h2 = bias_variable([n_hidden_1])

d_weights_h3 = weight_variable([n_hidden_1, n_input])
d_biases_h3 = bias_variable([n_input])

# --------------------- DNN Variables ------------------ #

dnn_weights_h1 = weight_variable([n_hidden_3, n_hidden_2])
dnn_biases_h1 = bias_variable([n_hidden_2])

dnn_weights_h2 = weight_variable([n_hidden_2, n_hidden_2])
dnn_biases_h2 = bias_variable([n_hidden_2])

dnn_weights_out = weight_variable([n_hidden_2, n_classes])
dnn_biases_out = bias_variable([n_classes])
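
For a sense of scale, the variables declared above can be counted from the layer sizes alone. The sketch below assumes 13 output classes (the notebook reads `n_classes` from `labels.shape[1]`, so the real value depends on the data); the helper name `count_parameters` is made up for this example.

```python
layer_sizes = {
    "encoder":    [520, 256, 128, 64],
    "decoder":    [64, 128, 256, 520],
    "classifier": [64, 128, 128, 13],   # 13 output classes is an assumption
}

def count_parameters(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

param_counts = {name: count_parameters(sizes) for name, sizes in layer_sizes.items()}
for name, count in param_counts.items():
    print(name, count)
```

The encoder and decoder dominate the parameter budget; the classifier that sits on top of the 64-dimensional code is small by comparison.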

    In [8]: 
  

def encode(x):
    l1 = tf.nn.tanh(tf.add(tf.matmul(x,e_weights_h1),e_biases_h1))
    l2 = tf.nn.tanh(tf.add(tf.matmul(l1,e_weights_h2),e_biases_h2))
    l3 = tf.nn.tanh(tf.add(tf.matmul(l2,e_weights_h3),e_biases_h3))
    return l3
    
def decode(x):
    l1 = tf.nn.tanh(tf.add(tf.matmul(x,d_weights_h1),d_biases_h1))
    l2 = tf.nn.tanh(tf.add(tf.matmul(l1,d_weights_h2),d_biases_h2))
    l3 = tf.nn.tanh(tf.add(tf.matmul(l2,d_weights_h3),d_biases_h3))
    return l3

def dnn(x):
    l1 = tf.nn.tanh(tf.add(tf.matmul(x,dnn_weights_h1),dnn_biases_h1))
    l2 = tf.nn.tanh(tf.add(tf.matmul(l1,dnn_weights_h2),dnn_biases_h2))
    out = tf.nn.softmax(tf.add(tf.matmul(l2,dnn_weights_out),dnn_biases_out))
    return out
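
The three functions above share one pattern: a stack of `tanh` layers, with the classifier ending in a softmax. A NumPy sketch of the same forward pass can help check the shapes without building a TensorFlow graph. It is an illustration only: the weights are random (plain normal rather than truncated normal), `n_classes = 13` is assumed, and the helper names `dense`, `softmax`, and `forward` are made up.

```python
import numpy as np

rng = np.random.RandomState(0)

def dense(n_in, n_out):
    """Random weight matrix (stddev 0.1) and zero bias for one layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, layers, final_activation):
    """tanh on every layer except the last, which uses final_activation."""
    for w, b in layers[:-1]:
        x = np.tanh(x @ w + b)
    w, b = layers[-1]
    return final_activation(x @ w + b)

n_classes = 13  # assumed; the notebook takes it from labels.shape[1]
encoder = [dense(520, 256), dense(256, 128), dense(128, 64)]
decoder = [dense(64, 128), dense(128, 256), dense(256, 520)]
classifier = [dense(64, 128), dense(128, 128), dense(128, n_classes)]

x = rng.randn(5, 520)                                  # five fake fingerprints
code = forward(x, encoder, np.tanh)                    # like encode(X)
reconstruction = forward(code, decoder, np.tanh)       # like decode(encoded)
probabilities = forward(code, classifier, softmax)     # like dnn(encoded)

print(code.shape, reconstruction.shape, probabilities.shape)
```

Note that the decoder also ends in `tanh`, so every reconstructed value lies in (-1, 1), while the standardized inputs are not bounded to that range.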

    In [9]: 
  

encoded = encode(X)
decoded = decode(encoded) 
y_ = dnn(encoded)

    In [10]: 
  

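# Unsupervised objective: mean squared reconstruction error of the autoencoder.
us_cost_function = tf.reduce_mean(tf.pow(X - decoded, 2))
# Supervised objective: cross-entropy summed over the batch (y_ already holds softmax probabilities).
s_cost_function = -tf.reduce_sum(Y * tf.log(y_))
# Plain gradient descent for both stages, one optimizer op per objective.
us_optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(us_cost_function)
s_optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(s_cost_function)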
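
The two objectives can be checked by hand on tiny arrays. The NumPy sketch below mirrors them; the clipping inside `cross_entropy` is an addition for numerical safety (it keeps `log` away from 0) and is not part of the cell above, and all input values are made up.

```python
import numpy as np

def reconstruction_loss(x, decoded):
    """Mean squared error over all elements, like us_cost_function."""
    return np.mean((x - decoded) ** 2)

def cross_entropy(y_true, y_prob, eps=1e-10):
    """Summed cross-entropy, like s_cost_function, with probabilities clipped to [eps, 1]."""
    return -np.sum(y_true * np.log(np.clip(y_prob, eps, 1.0)))

x = np.array([[1.0, -1.0], [0.0, 2.0]])
decoded = np.array([[0.5, -0.5], [0.0, 1.0]])
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_prob = np.array([[0.5, 0.5], [0.25, 0.75]])

mse = reconstruction_loss(x, decoded)   # (0.25 + 0.25 + 0 + 1) / 4
ce = cross_entropy(y_true, y_prob)      # -(ln 0.5 + ln 0.75)

print(mse, ce)
```

Because the cross-entropy is summed rather than averaged, its value grows with the batch size, which is worth keeping in mind when comparing losses across different batch sizes.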

    In [11]: 
  

correct_prediction = tf.equal(tf.argmax(y_,1), tf.argmax(Y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
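
The accuracy op compares the index of the largest predicted probability with the index of the 1 in the one-hot label. The same computation in NumPy, on four made-up predictions:

```python
import numpy as np

y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6],
                   [0.4, 0.5, 0.1],
                   [0.2, 0.2, 0.6]])
y_true = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0],
                   [0, 0, 1]])

# Predicted class vs. true class, row by row.
correct = np.argmax(y_prob, axis=1) == np.argmax(y_true, axis=1)
accuracy = correct.astype(np.float32).mean()

print(correct.tolist(), accuracy)
```

Three of the four rows match, so the accuracy is 0.75.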

Model architecture

Figure taken from the paper: https://arxiv.org/pdf/1611.02049v1.pdf

    In [12]: 
  

with tf.Session() as session:
    # TF 1.x graph API; tf.initialize_all_variables() is the deprecated older name.
    tf.global_variables_initializer().run()

    # ------------ 1. Training Autoencoders - Unsupervised Learning ----------- #
    # Pre-train encoder + decoder on all training-file fingerprints (no labels needed).
    for epoch in range(training_epochs):
        epoch_costs = np.empty(0)
        for b in range(total_batches):
            # Wrap around so that the slice never runs past the end of the array.
            offset = (b * batch_size) % (features.shape[0] - batch_size)
            batch_x = features[offset:(offset + batch_size), :]
            _, c = session.run([us_optimizer, us_cost_function], feed_dict={X: batch_x})
            epoch_costs = np.append(epoch_costs, c)
        print("Epoch:  %d  Loss:  %s" % (epoch, np.mean(epoch_costs)))
    print("Unsupervised pre-training finished...")

    # ---------------- 2. Training NN - Supervised Learning ------------------ #
    # Train the classifier on top of the encoder; the encoder weights are updated as well.
    for epoch in range(training_epochs):
        epoch_costs = np.empty(0)
        for b in range(total_batches):
            offset = (b * batch_size) % (train_x.shape[0] - batch_size)
            batch_x = train_x[offset:(offset + batch_size), :]
            batch_y = train_y[offset:(offset + batch_size), :]
            _, c = session.run([s_optimizer, s_cost_function], feed_dict={X: batch_x, Y: batch_y})
            epoch_costs = np.append(epoch_costs, c)
        train_acc = session.run(accuracy, feed_dict={X: train_x, Y: train_y})
        val_acc = session.run(accuracy, feed_dict={X: val_x, Y: val_y})
        print("Epoch:  %d  Loss:  %s  Training Accuracy:  %s Validation Accuracy: %s"
              % (epoch, np.mean(epoch_costs), train_acc, val_acc))

    print("Supervised training finished...")

    test_acc = session.run(accuracy, feed_dict={X: test_features, Y: test_labels})
    print("\nTesting Accuracy: %s" % test_acc)

Epoch:  0  Loss:  0.946417506465
Epoch:  1  Loss:  0.872724663348
Epoch:  2  Loss:  0.834939743301
Epoch:  3  Loss:  0.812426232725
Epoch:  4  Loss:  0.797482786783
Epoch:  5  Loss:  0.786710632076
Epoch:  6  Loss:  0.778407857031
Epoch:  7  Loss:  0.771705506587
Epoch:  8  Loss:  0.766137119669
Epoch:  9  Loss:  0.761425737017
Epoch:  10  Loss:  0.757388042674
Epoch:  11  Loss:  0.753892278756
Epoch:  12  Loss:  0.750838549624
Epoch:  13  Loss:  0.748148437356
Epoch:  14  Loss:  0.745759032577
Epoch:  15  Loss:  0.743619203037
Epoch:  16  Loss:  0.741687947782
Epoch:  17  Loss:  0.739931871977
Epoch:  18  Loss:  0.738323841833
Epoch:  19  Loss:  0.736841905176
Unsupervised pre-training finished...
Epoch:  0  Loss:  4.27480528807  Training Accuracy:  0.507781 Validation Accuracy: 0.504976
Epoch:  1  Loss:  2.49841824394  Training Accuracy:  0.689392 Validation Accuracy: 0.675493
Epoch:  2  Loss:  1.63081804872  Training Accuracy:  0.772202 Validation Accuracy: 0.757801
Epoch:  3  Loss:  1.13079869366  Training Accuracy:  0.820174 Validation Accuracy: 0.80334
Epoch:  4  Loss:  0.957438500986  Training Accuracy:  0.845874 Validation Accuracy: 0.828808
Epoch:  5  Loss:  0.759610852076  Training Accuracy:  0.891919 Validation Accuracy: 0.871648
Epoch:  6  Loss:  0.668730932672  Training Accuracy:  0.834166 Validation Accuracy: 0.815483
Epoch:  7  Loss:  0.644024800692  Training Accuracy:  0.895131 Validation Accuracy: 0.869118
Epoch:  8  Loss:  0.586599285754  Training Accuracy:  0.926613 Validation Accuracy: 0.908416
Epoch:  9  Loss:  0.368939029375  Training Accuracy:  0.937607 Validation Accuracy: 0.914825
Epoch:  10  Loss:  0.383492833295  Training Accuracy:  0.932967 Validation Accuracy: 0.913307
Epoch:  11  Loss:  0.348123138894  Training Accuracy:  0.945174 Validation Accuracy: 0.918705
Epoch:  12  Loss:  0.397730887604  Training Accuracy:  0.948672 Validation Accuracy: 0.924776
Epoch:  13  Loss:  0.316421336578  Training Accuracy:  0.952741 Validation Accuracy: 0.924945
Epoch:  14  Loss:  0.29667654338  Training Accuracy:  0.95474 Validation Accuracy: 0.926632
Epoch:  15  Loss:  0.279483787604  Training Accuracy:  0.948101 Validation Accuracy: 0.920729
Epoch:  16  Loss:  0.331246199433  Training Accuracy:  0.971945 Validation Accuracy: 0.945859
Epoch:  17  Loss:  0.250471480337  Training Accuracy:  0.967875 Validation Accuracy: 0.943161
Epoch:  18  Loss:  0.221882153683  Training Accuracy:  0.970874 Validation Accuracy: 0.946365
Epoch:  19  Loss:  0.245788279902  Training Accuracy:  0.977013 Validation Accuracy: 0.950076
Supervised training finished...

Testing Accuracy: 0.717372
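
One detail of the training loop is easy to miss: the batch offset is computed as `(b * batch_size) % (n_rows - batch_size)`, and `total_batches` is derived from the full training file even in the supervised stage, where only the roughly 70% training split is used. The sketch below replays that arithmetic with toy numbers to show how the offsets wrap around. The sizes are made up and the helper name `batch_offsets` is hypothetical.

```python
def batch_offsets(n_rows, batch_size, total_batches):
    """Start index of each mini-batch, using the same formula as the training loop."""
    return [(b * batch_size) % (n_rows - batch_size) for b in range(total_batches)]

# Toy numbers: a 10-row "file" gives total_batches = 5 with batch_size = 2,
# but only 7 of those rows ended up in the training split.
offsets = batch_offsets(n_rows=7, batch_size=2, total_batches=5)

# Which rows are visited at least once during one epoch?
covered = sorted({row for start in offsets for row in range(start, start + 2)})

print(offsets)
print(covered)
```

Two things follow from the formula: after wrapping, batches start at shifted positions and overlap earlier ones, and the largest possible offset is `n_rows - batch_size - 1`, so the last row of the array is never part of a batch.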