The following is a simple example of the KNN (k-nearest neighbors) algorithm.
Our goal is to build a machine learning model that can learn from the measurements of irises whose species is known, so that we can predict the species of a new iris.
Because we have measurements for which we know the correct species of iris, this is a supervised learning problem. In this problem, we want to predict one of several options (the species of iris), so this is an example of a classification problem. The possible outputs (the different species of iris) are called classes. Every iris in the dataset belongs to one of three classes, so this is a three-class classification problem.
The desired output for a single data point (an iris) is the species of that flower. For a particular data point, the species it belongs to is called its label.
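Before building the model, it can help to look at the classes and labels described above. The following is a small sketch that only inspects the dataset bundled with scikit-learn; the keys used here (`target_names`, `feature_names`, `data`, `target`) are the ones provided by `load_iris`.

```python
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# The three classes (species) the model can choose between
print("classes:", list(iris_dataset['target_names']))
# Each row is one iris; each column is one measurement
print("features:", iris_dataset['feature_names'])
print("data shape:", iris_dataset['data'].shape)
# The labels are stored as integers, one per flower
print("first five labels:", iris_dataset['target'][:5])
```

The dataset holds 150 flowers with 4 measurements each, and every label is one of the integers 0, 1, or 2, which index into the list of class names.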
Here is the code that builds and trains the model:
# iris.py
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()

# Split the dataset into two parts: 75% for training, 25% for testing
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# Define the KNN classifier (use only the single nearest neighbor)
knn = KNeighborsClassifier(n_neighbors=1)

# Train the model on the training set
knn.fit(X_train, y_train)
After training, the knn object holds the fitted model, and we can now use it to make predictions.
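As a quick sanity check on the 75%/25% split, the shapes of the resulting arrays can be printed. This is a minimal sketch that repeats the split with the same `random_state=0`; when no size is given, `train_test_split` holds out 25% of the rows for testing by default.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# Number of rows (flowers) and columns (measurements) in each part
print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)
```

Of the 150 flowers, 112 end up in the training set and 38 in the test set.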
Suppose that we found an iris in the wild with a sepal length of 5 cm, a sepal width of 2.9 cm, a petal length of 1 cm, and a petal width of 0.2 cm.
Now let’s predict what species it would be.
# Continues from the code above
# scikit-learn expects a 2D array: one row per sample, one column per feature
X_new = np.array([[5, 2.9, 1, 0.2]])
prediction = knn.predict(X_new)
print("prediction:{}".format(prediction))
print("predicted target name:{}".format(
    iris_dataset['target_names'][prediction]))
Here is the output:
prediction:[0]
predicted target name:['setosa']
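To see what the classifier is doing with n_neighbors=1, the same prediction can be reproduced by hand: find the training flower closest to the new one and return its label. The sketch below assumes plain Euclidean distance, which matches the default metric of `KNeighborsClassifier`; the names `distances`, `nearest`, and `manual_prediction` are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

X_new = np.array([[5, 2.9, 1, 0.2]])

# Euclidean distance from the new flower to every training flower
distances = np.sqrt(((X_train - X_new) ** 2).sum(axis=1))
# Index of the single closest training flower
nearest = np.argmin(distances)
# With one neighbor, the prediction is simply that flower's label
manual_prediction = y_train[nearest]
print("nearest training label:", manual_prediction)
print("species:", iris_dataset['target_names'][manual_prediction])
```

With more than one neighbor, the classifier would instead take a vote among the labels of the k closest training flowers.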
Finally, we should measure how well the model works by computing its accuracy on the test set, that is, the fraction of test flowers whose species is predicted correctly.
y_pred = knn.predict(X_test)
print("test set score:{:.2f}".format(np.mean(y_pred==y_test)))
The output is:
test set score:0.97
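The same accuracy can also be obtained with the classifier's own `score` method, which for classifiers returns the mean accuracy on the given data. The sketch below rebuilds the model so that it runs on its own and compares the two ways of computing the number.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

# Accuracy computed by hand: fraction of correct test predictions
manual_accuracy = np.mean(knn.predict(X_test) == y_test)
# Accuracy computed by scikit-learn
score_accuracy = knn.score(X_test, y_test)
print("test set score:{:.2f}".format(score_accuracy))
```

Both values agree, so `knn.score(X_test, y_test)` is a convenient shortcut for the manual computation.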