pandas_profiling

Latest recommended article published on 2024-08-12 08:32:46

奈落2

Article tags: data analysis

Article link: https://blog.csdn.net/w2486696/article/details/106975472

pandas_profiling extends the pandas DataFrame with df.profile_report() for quick data analysis.
The report consists of the following parts:

Type inference
Essentials: type, unique values, missing values
Quantile statistics: minimum, Q1, median, Q3, maximum, range, interquartile range
Descriptive statistics: mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
Most frequent values
Histogram
Correlations: highlighting of highly correlated variables; Spearman, Pearson and Kendall matrices
Missing values: matrix, count, heatmap and dendrogram of missing values
Text analysis: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
File and Image analysis: extract file sizes, creation dates and dimensions, and scan for truncated images or those containing EXIF information
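
To make the quantile and descriptive statistics in the list above concrete, here is a minimal sketch that computes several of them for a toy column using only Python's standard library. The sample values and the `summarize` helper are made up for illustration, and the conventions chosen here (inclusive quantile interpolation, sample standard deviation) are assumptions that may differ from what the report itself uses.

```python
import statistics


def summarize(values):
    """Compute a few of the statistics listed above for a numeric column."""
    data = sorted(values)
    # Quartiles; the "inclusive" method interpolates between the observed
    # minimum and maximum (an assumption, other conventions exist).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation
    return {
        "min": data[0],
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
        "mean": mean,
        "mode": statistics.mode(data),
        "std": std,
        "sum": sum(data),
        # Median absolute deviation: median of |x - median|.
        "MAD": statistics.median(abs(x - median) for x in data),
        # Coefficient of variation: standard deviation relative to the mean.
        "CV": std / mean,
    }


stats = summarize([2, 4, 4, 5, 7, 9, 11])  # hypothetical sample column
for name, value in stats.items():
    print(f"{name}: {value}")
```

Kurtosis and skewness are left out to keep the sketch short; the report computes all of these per column automatically.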

from pandas_profiling import ProfileReport

# df is an existing pandas DataFrame; put the report title in title=""
profile = ProfileReport(df, title="")

To avoid excessive computation on large datasets, use minimal mode:

profile = ProfileReport(large_dataset, minimal=True)
profile.to_file("output.html")
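
One way to apply minimal mode only when it is needed is to decide from the DataFrame's shape. The sketch below is an illustration, not part of the library: `profile_options`, `build_report` and the `max_cells` cut-off are hypothetical names and values. Only `ProfileReport(..., minimal=True)` comes from the snippet above.

```python
def profile_options(n_rows, n_cols, max_cells=1_000_000):
    """Return ProfileReport keyword arguments for a dataset of the given shape.

    max_cells is a hypothetical cut-off, not a library default.
    """
    return {"minimal": n_rows * n_cols > max_cells}


def build_report(df, title="Profiling Report"):
    """Build a report, switching to minimal mode for large DataFrames."""
    # Imported lazily so profile_options can be used without the library.
    from pandas_profiling import ProfileReport

    n_rows, n_cols = df.shape
    return ProfileReport(df, title=title, **profile_options(n_rows, n_cols))


print(profile_options(1_000, 10))      # small dataset: full report
print(profile_options(5_000_000, 40))  # large dataset: minimal report
```

With a DataFrame at hand, `build_report(df).to_file("output.html")` would then write the report as before.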

The report's appearance can also be configured; for details, see the GitHub page (Explore deeper).

Command-line usage:

pandas_profiling input_file output_file

The remaining arguments are easier to follow once you come back to them later.
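
If the command line needs to be driven from a script, the same command can be assembled and run with `subprocess`. This is a minimal sketch: the helper names and the file names `data.csv` and `report.html` are placeholders, and the code assumes the `pandas_profiling` executable is installed and on `PATH`.

```python
import shutil
import subprocess


def profiling_command(input_file, output_file):
    """Build the argument list for the pandas_profiling command shown above."""
    return ["pandas_profiling", str(input_file), str(output_file)]


def run_profiling(input_file, output_file):
    """Run the CLI if it is on PATH; return False when it is not installed."""
    cmd = profiling_command(input_file, output_file)
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True


# Placeholder file names, printed rather than executed.
print(profiling_command("data.csv", "report.html"))
```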

profile.to_file("your_report.html")

Or, as JSON:

# As a string
json_data = profile.to_json()

# As a file
profile.to_file("your_report.json")
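
The JSON output is plain text, so it can be read back with the standard library for further processing. The sketch below uses a small stand-in string instead of a real report, and `top_level_sections` is a hypothetical helper. The keys in the stand-in are invented and do not describe the actual report layout.

```python
import json
import tempfile
from pathlib import Path


def top_level_sections(json_text):
    """Return the sorted top-level keys of a JSON report given as a string."""
    return sorted(json.loads(json_text))


# Stand-in for profile.to_json(); a real report has its own keys and values.
fake_json = json.dumps({"table": {"n": 3}, "variables": {"a": {"type": "NUM"}}})
print(top_level_sections(fake_json))

# The same helper works on a file such as the one written by
# profile.to_file("your_report.json").
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "your_report.json"
    path.write_text(fake_json, encoding="utf-8")
    print(top_level_sections(path.read_text(encoding="utf-8")))
```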

Currently recognized data types

For more detail, see visions.

Once the integration is set up, right-clicking a data file generates report.html directly.
See the GitHub page for details.