[Machine Learning] Constructing a K-Nearest-Neighbor Graph with sklearn

Author: 一穷二白到年薪百万

Last modified: 2023-02-09 18:06:49

First published: 2023-02-03 16:50:25

Article link: https://blog.csdn.net/zfhsfdhdfajhsr/article/details/128871411

1 The K-nearest-neighbor graph


sklearn.neighbors.kneighbors_graph(X, n_neighbors, *, mode='connectivity', metric='minkowski', p=2, metric_params=None, include_self=False, n_jobs=None)

from sklearn.neighbors import kneighbors_graph

X = [[0], [3], [1]]
A = kneighbors_graph(X, 2, mode='connectivity', include_self=True)
A.toarray()
# array([[1., 0., 1.],
#        [0., 1., 1.],
#        [1., 0., 1.]])
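
kneighbors_graph returns a SciPy sparse matrix in CSR format, which is why the snippet above ends with toarray(). For larger datasets you usually do not want to densify it. The following is a minimal sketch, not part of the original post, of reading each sample's neighbor indices straight from the CSR arrays; the variable name neighbors is only illustrative.

```python
from scipy.sparse import issparse
from sklearn.neighbors import kneighbors_graph

X = [[0], [3], [1]]
A = kneighbors_graph(X, 2, mode='connectivity', include_self=True)

# The result is sparse: only the n_samples * n_neighbors edges are stored.
print(issparse(A), A.shape, A.nnz)

# In CSR format, row i's column indices live in indices[indptr[i]:indptr[i + 1]].
neighbors = [
    sorted(A.indices[A.indptr[i]:A.indptr[i + 1]].tolist())
    for i in range(A.shape[0])
]
print(neighbors)
```

Because include_self=True is passed here, every sample appears in its own neighbor list.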

2 Parameters

X: array-like of shape (n_samples, n_features) or BallTree. Sample data, in the form of a numpy array or a precomputed BallTree.
n_neighbors: int. Number of neighbors for each sample.
mode: {'connectivity', 'distance'}, default='connectivity'. Type of returned matrix: 'connectivity' will return the connectivity matrix with ones and zeros, and 'distance' will return the distances between neighbors according to the given metric.
metric: str, default='minkowski'. Metric to use for distance computation. Default is "minkowski", which results in the standard Euclidean distance when p = 2. See the documentation of scipy.spatial.distance and the metrics listed in distance_metrics for valid metric values.
p: int, default=2. Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used.
metric_params: dict, default=None. Additional keyword arguments for the metric function.
include_self: bool or 'auto', default=False. Whether or not to mark each sample as the first nearest neighbor to itself. If 'auto', then True is used for mode='connectivity' and False for mode='distance'.
n_jobs: int, default=None. The number of parallel jobs to run for neighbors search. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.
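
To make the mode, p and include_self parameters more concrete, here is a small sketch that is not from the original post. The three 2-D points are made up for illustration and chosen so that no two distances are tied.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# Illustrative data (an assumption, not from the original article).
X = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 2.0]])

# mode='distance' stores the distance to each neighbor instead of 1.
D_l2 = kneighbors_graph(X, 1, mode='distance', p=2).toarray()  # Euclidean
D_l1 = kneighbors_graph(X, 1, mode='distance', p=1).toarray()  # Manhattan

# include_self='auto' resolves to True in connectivity mode, so each
# sample counts as its own first neighbor and the diagonal is filled.
C_auto = kneighbors_graph(X, 2, mode='connectivity', include_self='auto').toarray()

# With the default include_self=False the diagonal stays empty.
C_noself = kneighbors_graph(X, 1, mode='connectivity').toarray()

print(D_l2)
print(D_l1)
print(C_auto)
print(C_noself)
```

Note that the graph is directed: sample 2 lists sample 1 as its nearest neighbor, but sample 1 lists sample 0, so the matrix is generally not symmetric.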

3 Example

>>> from sklearn.neighbors import NearestNeighbors
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree').fit(X)
>>> distances, indices = nbrs.kneighbors(X)
>>> indices
array([[0, 1],
       [1, 0],
       [2, 1],
       [3, 4],
       [4, 3],
       [5, 4]])
>>> distances
array([[0.        , 1.        ],
       [0.        , 1.        ],
       [0.        , 1.41421356],
       [0.        , 1.        ],
       [0.        , 1.        ],
       [0.        , 1.41421356]])
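
The example above uses the NearestNeighbors estimator rather than kneighbors_graph itself. As a hedged sketch of how the two relate (the names G_est, G_fn and G_manual are illustrative only), the same connectivity graph can be obtained from the fitted estimator, from the standalone function, or by hand from the indices array:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors, kneighbors_graph

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
nbrs = NearestNeighbors(n_neighbors=2, algorithm='ball_tree').fit(X)

# Querying with X itself means each point is found as its own nearest neighbor.
G_est = nbrs.kneighbors_graph(X).toarray()

# The standalone function gives the same graph when include_self=True.
G_fn = kneighbors_graph(X, 2, mode='connectivity', include_self=True).toarray()

# Building the adjacency matrix by hand from the neighbor indices.
_, indices = nbrs.kneighbors(X)
n = len(X)
G_manual = np.zeros((n, n))
G_manual[np.arange(n)[:, None], indices] = 1.0

print(G_est)
print(np.array_equal(G_est, G_fn), np.array_equal(G_est, G_manual))
```

Each row of the resulting matrix has ones exactly at the column positions listed in the corresponding row of indices.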