pandas.factorize

Latest recommended article published 2024-08-04 20:25:04

傅华涛Fu

Category: Python data analysis. Tags: pandas

Article link: https://blog.csdn.net/fu_jian_ping/article/details/108018959


Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.factorize.html

pandas.factorize

Maps identical nominal (categorical) values in a Series to the same integer code.

pandas.factorize(values, sort=False, na_sentinel=-1, size_hint=None, dropna=True)

Encode the object as an enumerated type or categorical variable.

This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available as both a top-level function pandas.factorize(), and as a method Series.factorize() and Index.factorize().
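As a minimal sketch of the two calling styles described above, the following factorizes the same data once with the top-level function and once with the Series method. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: three distinct labels, with "b" repeated.
values = np.array(["b", "b", "a", "c", "b"], dtype=object)

# Top-level function: codes are assigned in order of first appearance.
codes, uniques = pd.factorize(values)
print(codes)    # [0 0 1 2 0]
print(uniques)  # ['b' 'a' 'c']

# Method form on a Series: same codes, but uniques comes back as an Index.
s_codes, s_uniques = pd.Series(values).factorize()
print(s_codes)          # [0 0 1 2 0]
print(list(s_uniques))  # ['b', 'a', 'c']
```

Each distinct value receives one integer, so equal values always share a code.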

Parameters

values : sequence

A 1-D sequence. Sequences that aren’t pandas objects are coerced to ndarrays before factorization.

sort : bool, default False

Sort uniques and shuffle codes to maintain the relationship.

na_sentinel : int, default -1

Value written into codes to mark missing (“not found”) values.

size_hint : int, optional

Hint to the hashtable sizer.
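To illustrate the sort parameter, here is a small sketch on made-up data. It leaves na_sentinel at its default on purpose, because pandas 2.0 and later replace that argument with use_na_sentinel, and the default behaves the same either way.

```python
import numpy as np
import pandas as pd

values = np.array(["b", "b", "a", "c", "b"], dtype=object)

# sort=False (default): uniques keep the order of first appearance.
codes, uniques = pd.factorize(values)
print(codes, uniques)  # [0 0 1 2 0] ['b' 'a' 'c']

# sort=True: uniques are sorted, and codes are renumbered to match.
sorted_codes, sorted_uniques = pd.factorize(values, sort=True)
print(sorted_codes, sorted_uniques)  # [1 1 0 2 1] ['a' 'b' 'c']

# Either way, indexing uniques by codes reproduces the input.
assert (sorted_uniques[sorted_codes] == values).all()
```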

Returns

codes : ndarray

An integer ndarray that’s an indexer into uniques. uniques.take(codes) will have the same values as values.

uniques : ndarray, Index, or Categorical

The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.
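The return types described above can be checked with a short sketch. The inputs are invented for illustration; the three cases are a plain ndarray, a pandas Series, and a Categorical.

```python
import numpy as np
import pandas as pd

# Plain ndarray in: uniques is a 1-D ndarray.
arr = np.array(["b", "a", "b"], dtype=object)
codes_arr, uniques_arr = pd.factorize(arr)
print(type(uniques_arr).__name__)  # ndarray

# Series in (a pandas object other than Categorical): uniques is an Index.
ser = pd.Series(["a", "a", "c"])
codes_ser, uniques_ser = pd.factorize(ser)
print(isinstance(uniques_ser, pd.Index))  # True

# Categorical in: uniques is a Categorical that keeps the full category list.
cat = pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"])
codes_cat, uniques_cat = pd.factorize(cat)
print(codes_cat)                     # [0 0 1]
print(list(uniques_cat))             # ['a', 'c']
print(list(uniques_cat.categories))  # ['a', 'b', 'c']

# With no missing values, uniques.take(codes) reproduces the input.
restored = uniques_arr.take(codes_arr)
print(list(restored))  # ['b', 'a', 'b']
```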

Note

Even if there’s a missing value in values, uniques will not contain an entry for it.
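The note can be seen in a small sketch with one missing entry in made-up data. The restoration step at the end is an illustrative assumption about how one might undo the encoding, not something the reference page prescribes.

```python
import numpy as np
import pandas as pd

values = np.array(["b", None, "a", "c", "b"], dtype=object)

codes, uniques = pd.factorize(values)
print(codes)    # [ 0 -1  1  2  0]
print(uniques)  # ['b' 'a' 'c']  (no entry for the missing value)

# A code of -1 is also a valid negative index, so uniques.take(codes)
# would silently pick the last unique. Mask the sentinel positions instead.
missing = codes == -1
restored = np.where(missing, None, uniques.take(codes))
print(restored.tolist())  # ['b', None, 'a', 'c', 'b']
```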