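This is a small project that classifies images with K-NN after reducing their dimensionality with PCA.
As usual, first things first: the code is on GitHub.
The data is the first 1000 images from the first batch of CIFAR-10.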
import sys

K = int(sys.argv[1])
D = int(sys.argv[2])
N = int(sys.argv[3])
file = sys.argv[4]
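These stand for, respectively: K, the number of neighbors; D, the number of dimensions PCA reduces the data to; N, the split point, so that images 0 to N are the test set and images N to 1000 are the training set; and file, the path to the data file.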
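To make the split concrete, here is a tiny sketch of how N divides the 1000 images. The values of K, D and N below are made up for illustration, and the script name in the comment is hypothetical, since the post does not name it.

```python
# Hypothetical values, as if the script were run like:
#   python knn_pca.py 5 50 100 data_batch_1
K, D, N = 5, 50, 100
TOTAL = 1000  # only the first 1000 images of the batch are used

test_idx = range(0, N)       # images 0 .. N-1 become the test set
train_idx = range(N, TOTAL)  # images N .. 999 become the training set

print(len(test_idx), len(train_idx))  # 100 900
```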
Load the data and split it into train and test sets.
import pickle
import numpy as np

# Extract the data from data_batch_1
def unpickle(file):
    with open(file, 'rb') as fo:
        # encoding='latin1' lets Python 3 read the batch, which was pickled with Python 2
        batch = pickle.load(fo, encoding='latin1')
    return batch

# Divide into test set (first n images) and train set (images n to 1000)
def divide_train_and_test(batch, n):
    data = batch['data']
    # Each row holds 3072 values (3 channels x 32 x 32); reshape to (count, 32, 32, 3)
    data_test = data[:n]
    data_test = data_test.reshape(n, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
    data_train = data[n:1000]
    data_train = data_train.reshape(1000 - n, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
    labels = batch['labels']
    labels_test = np.array(labels[:n])
    labels_train = np.array(labels[n:1000])
    return data_test, data_train, labels_test, labels_train
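The reshape and transpose step is easy to get wrong, so here is a small check on synthetic data rather than the real batch. It assumes the CIFAR-10 row layout of 1024 red values, then 1024 green, then 1024 blue; `rows_to_images` is a helper name invented for this sketch and is not part of the project code.

```python
import numpy as np

def rows_to_images(rows):
    """Turn (n, 3072) CIFAR-style rows into (n, 32, 32, 3) float images."""
    n = rows.shape[0]
    return rows.reshape(n, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")

# Synthetic stand-in for a batch: every red value is 1, green is 2, blue is 3.
row = np.concatenate([np.full(1024, 1), np.full(1024, 2), np.full(1024, 3)]).astype(np.uint8)
fake_rows = np.tile(row, (4, 1))  # 4 identical fake images

images = rows_to_images(fake_rows)
print(images.shape)     # (4, 32, 32, 3)
print(images[0, 0, 0])  # [1. 2. 3.]
```

If the transpose were left out, the last axis would not hold the three color channels of one pixel, and the grayscale conversion in the next step would mix unrelated values.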
Then convert the images from RGB to grayscale.