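I am building a clustering algorithm with the K-Prototypes algorithm in a Django application.
For now I am testing everything with fake data, both to understand how it works and to verify the results.
My clustering and prediction functions are: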
def ClusterCreation(request,*args):
    global kproto
    # random categorical data
    data = np.array([
        [0,'a',4],
        [1,'e',3],
        [3,'ffe',7],
        [5,'fdfd',16]
    ])
    kproto = KPrototypes(n_clusters=2, init='Cao', verbose=2)
    clusters = kproto.fit_predict(data, categorical=[1,2])
    # Create CSV with cluster statistics
    clusterStatisticsCSV(kproto)
    for argument in args:
        if argument is not None:
            return
    # Print the cluster centroids
    return HttpResponse('Clustering ok')

def ClusterPrediction(request):
    global kproto
    if (kproto==0):
        ClusterCreation(None,1)
    # random point to fit
    data = np.array([0,'a',4])
    fit_label = kproto.predict(data, categorical=[0,1]) #categorical is the Index of columns that contain categorical data
    # Print the cluster centroids
    return HttpResponse('Point '+str(data)+' is in cluster '+str(fit_label))
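The ClusterCreation function runs without problems, but I have now added the ability to predict the cluster of a new data point.
You will also see a function called clusterStatisticsCSV; it works without problems and is just a simple CSV export.
I get the following error log: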
Initialization method and algorithm are deterministic. Setting n_init to 1.
dz01 | Init: initializing centroids
dz01 | Init: initializing clusters
dz01 | Starting iterations...
dz01 | Run: 1, iteration: 1/100, moves: 0, ncost: 8.50723954060097
dz01 | Internal Server Error: /cluster/clusterPrediction/
dz01 | Traceback (most recent call last):
dz01 | File "/usr/local/lib/python3.5/site-packages/django/core/handlers/exception.py", line 35, in inner
dz01 | response = get_response(request)
dz01 | File "/usr/local/lib/python3.5/site-packages/django/core/handlers/base.py", line 128, in _get_response
dz01 | response = self.process_exception_by_middleware(e, request)
dz01 | File "/usr/local/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_response
dz01 | response = wrapped_callback(request, *callback_args, **callback_kwargs)
dz01 | File "/src/cluster/views.py", line 62, in ClusterPrediction
dz01 | fit_label = kproto.predict(data, categorical=[0,1]) #categorical is the Index of columns that contain categorical data
dz01 | File "/usr/local/lib/python3.5/site-packages/kmodes/kprototypes.py", line 438, in predict
dz01 | Xnum, Xcat = _split_num_cat(X, categorical)
dz01 | File "/usr/local/lib/python3.5/site-packages/kmodes/kprototypes.py", line 44, in _split_num_cat
dz01 | Xnum = np.asanyarray(X[:, [ii for ii in range(X.shape[1])
dz01 | IndexError: tuple index out of range
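I can see where the error is raised; I think it has to do with:
kproto.predict(data, categorical=[0,1])
or with the ClusterCreation function, since that may be wrong as well, in which case the clusters would be wrong too.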
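For what it's worth, the last frame of the traceback indexes X.shape[1], and the point passed to predict is a 1-D array, whose shape tuple has only one entry. The sketch below uses only NumPy (no kmodes or Django) to reproduce that IndexError and to show one way of turning the point into a single-row 2-D array. The names point and point_2d are just illustrative, and whether kproto.predict then returns the expected label is an assumption that is not checked here.

```python
import numpy as np

# Same point as in ClusterPrediction: mixing ints and strings makes
# NumPy store every value as a string in a 1-D array.
point = np.array([0, 'a', 4])
print(point.shape)  # a 1-D shape: only one entry in the tuple

# Asking for the second dimension of a 1-D shape fails the same way
# as the last frame of the traceback.
try:
    point.shape[1]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# One row, three columns: the same layout as each row of the training data.
point_2d = point.reshape(1, -1)
print(point_2d.shape)
```

Note also that ClusterCreation fits with categorical=[1,2] while ClusterPrediction passes categorical=[0,1]; presumably the two should name the same columns, but that is a separate issue from the IndexError.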