Python Cool Library Tour: Third-Party Library Pandas (008)

神奇夜光杯

Article link: https://blog.csdn.net/ygb_1024/article/details/140223497

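I. Usage in Detail

16. The pandas.DataFrame.to_json function

16-1. Syntax

16-2. Parameters

16-3. Function

16-4. Return value

16-5. Notes

16-6. Usage

16-6-1. Data preparation

16-6-2. Code example

16-6-3. Output

17. The pandas.read_html function

17-1. Syntax

17-2. Parameters

17-3. Function

17-4. Return value

17-5. Notes

17-6. Usage

17-6-1. Data preparation

17-6-2. Code example

17-6-3. Output

18. The pandas.DataFrame.to_html function

18-1. Syntax

18-2. Parameters

18-3. Function

18-4. Return value

18-5. Notes

18-6. Usage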
I. Usage in Detail

16. The pandas.DataFrame.to_json function

16-1. Syntax

# 16. The pandas.DataFrame.to_json function
DataFrame.to_json(path_or_buf=None, *, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=None, indent=None, storage_options=None, mode='w')
Convert the object to a JSON string.

Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps.

Parameters:
path_or_buf : str, path object, file-like object, or None, default None
String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.

orient : str
Indication of expected JSON string format.

Series:

default is ‘index’

allowed values are: {‘split’, ‘records’, ‘index’, ‘table’}.

DataFrame:

default is ‘columns’

allowed values are: {‘split’, ‘records’, ‘index’, ‘columns’, ‘values’, ‘table’}.

The format of the JSON string:

‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}

‘records’ : list like [{column -> value}, … , {column -> value}]

‘index’ : dict like {index -> {column -> value}}

‘columns’ : dict like {column -> {index -> value}}

‘values’ : just the values array

‘table’ : dict like {‘schema’: {schema}, ‘data’: {data}}

Describing the data, where data component is like orient='records'.

date_format : {None, ‘epoch’, ‘iso’}
Type of date conversion. ‘epoch’ = epoch milliseconds, ‘iso’ = ISO8601. The default depends on the orient. For orient='table', the default is ‘iso’. For all other orients, the default is ‘epoch’.

double_precision : int, default 10
The number of decimal places to use when encoding floating point values. The possible maximal value is 15. Passing double_precision greater than 15 will raise a ValueError.

force_ascii : bool, default True
Force encoded string to be ASCII.

date_unit : str, default ‘ms’ (milliseconds)
The time unit to encode to, governs timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for second, millisecond, microsecond, and nanosecond respectively.

default_handler : callable, default None
Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serialisable object.

lines : bool, default False
If ‘orient’ is ‘records’ write out line-delimited json format. Will throw ValueError if incorrect ‘orient’ since others are not list-like.

compression : str or dict, default ‘infer’
For on-the-fly compression of the output data. If ‘infer’ and ‘path_or_buf’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.

New in version 1.5.0: Added support for .tar files.

Changed in version 1.4.0: Zstandard support.

index : bool or None, default None
The index is only used when ‘orient’ is ‘split’, ‘index’, ‘columns’, or ‘table’. Of these, ‘index’ and ‘columns’ do not support index=False.

indent : int, optional
Length of whitespace used to indent each record.

storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer here.

mode : str, default ‘w’ (writing)
Specify the IO mode for output when supplying a path_or_buf. Accepted args are ‘w’ (writing) and ‘a’ (append) only. mode=’a’ is only supported when lines is True and orient is ‘records’.

Returns:
None or str
If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None.

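16-2. Parameters

16-2-1. path_or_buf (optional, default None): A string, path object, or file-like object. A string or path object names the file the JSON is written to; a file-like object receives the data through its write() method; None makes the function return the JSON as a string.

16-2-2. orient (optional, default None): A string indicating the expected layout of the JSON output.

16-2-2-1. 'split': dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}.

16-2-2-2. 'records': list like [{column -> value}, ... , {column -> value}].

16-2-2-3. 'index': dict like {index -> {column -> value}}, where the index labels are the keys of the JSON object.

16-2-2-4. 'columns': dict like {column -> {index -> value}}.

16-2-2-5. 'values': just the array of values.

16-2-2-6. 'table': dict like {'schema': {schema}, 'data': {data}}, where the data component is laid out as in orient='records'.

16-2-2-7. If orient is not specified, nothing is inferred from the data: a DataFrame uses 'columns' and a Series uses 'index'.

16-2-3. date_format (optional, default None): One of None, 'epoch', or 'iso', selecting how dates are converted. 'epoch' writes epoch milliseconds and 'iso' writes ISO 8601 strings. With None, the default depends on orient: 'iso' for orient='table' and 'epoch' for every other orient.

16-2-4. double_precision (optional, default 10): An integer giving the number of decimal places used when encoding floating point values. The maximum is 15; a larger value raises a ValueError.

16-2-5. force_ascii (optional, default True): A Boolean that forces the encoded string to be ASCII, so non-ASCII characters are escaped.

16-2-6. date_unit (optional, default 'ms'): A string giving the time unit to encode to, which governs both timestamp and ISO 8601 precision. 's', 'ms', 'us', and 'ns' stand for second, millisecond, microsecond, and nanosecond.

16-2-7. default_handler (optional, default None): A callable invoked when an object cannot otherwise be converted to a form suitable for JSON. It receives the object as its single argument and must return a serializable object.

16-2-8. lines (optional, default False): A Boolean. When True together with orient='records', the output is line-delimited JSON with one record per line. Any other orient raises a ValueError, because the other layouts are not list-like.

16-2-9. compression (optional, default 'infer'): A string, dict, or None that selects on-the-fly compression of the output. With 'infer' and a path-like path_or_buf, the method is detected from the file extension (for example .gz, .bz2, .zip, .xz, .zst, or .tar); otherwise no compression is applied. A dict must carry a 'method' key, and its remaining entries are forwarded to the compression library.

16-2-10. index (optional, default None): A Boolean or None that controls whether the index is written. It is only used when orient is 'split', 'index', 'columns', or 'table', and of these 'index' and 'columns' do not support index=False.

16-2-11. indent (optional, default None): An integer or None giving the length of the whitespace used to indent each record. With None, the output is not indented.

16-2-12. storage_options (optional, default None): A dict of extra options for a particular storage connection, such as host, port, username, and password (for example, credentials for AWS S3).

16-2-13. mode (optional, default 'w'): A string. 'w' writes the file, overwriting it if it exists, and 'a' appends to it. mode='a' is only supported when lines is True and orient is 'records'.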
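The orient layouts listed above are easiest to tell apart side by side. The following is a minimal sketch, assuming a small two-row frame with a string index (the frame and its labels are made up for illustration); it serializes the same data once per orient.

```python
import json

import pandas as pd

# Illustrative data only: two columns and a labelled index.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# Serialize the same frame once per documented orient.
orients = ["split", "records", "index", "columns", "values", "table"]
outputs = {orient: df.to_json(orient=orient) for orient in orients}

for orient, text in outputs.items():
    print(f"{orient:8s} {text}")

# Leaving orient out on a DataFrame gives the same text as orient="columns".
default_output = df.to_json()
print(default_output == outputs["columns"])
```

Note that 'records' and 'values' drop the index labels, so they suit consumers that only need the rows.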

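16-3. Function

Converts a Pandas DataFrame to JSON, either writing the result to a file or returning it as a string.

16-4. Return value

16-4-1. If path_or_buf is None (the default), the function returns a string containing the JSON data.

16-4-2. If path_or_buf is a file path or file-like object, the function returns None and writes the JSON data to that file or object instead.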
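Both return paths can be checked in a few lines. This is a small sketch, assuming an in-memory io.StringIO buffer stands in for a real file.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# No target given: the JSON text is the return value.
as_text = df.to_json(orient="records")

# File-like target given: the text is written to it and None comes back.
buffer = StringIO()
result = df.to_json(buffer, orient="records")

print(type(as_text).__name__)
print(result)
print(buffer.getvalue() == as_text)
```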

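16-5. Notes

This function is useful in data analysis and data science, especially when DataFrame contents have to be exported to a front-end application or a web service, or exchanged with code written in another language.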
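When the JSON is consumed by another language, the date encoding is usually the first thing to decide. Below is a minimal sketch, assuming a single made-up timestamp column named when; it contrasts the 'epoch' and 'iso' settings of date_format.

```python
import json

import pandas as pd

# Illustrative data: one timestamp and one float.
df = pd.DataFrame({"when": pd.to_datetime(["2024-01-01"]), "value": [1.5]})

# 'epoch' writes an integer count of date_unit (milliseconds by default).
epoch_records = json.loads(df.to_json(orient="records", date_format="epoch"))

# 'iso' writes an ISO 8601 string that most JSON consumers can parse directly.
iso_records = json.loads(df.to_json(orient="records", date_format="iso"))

print(epoch_records[0]["when"])
print(iso_records[0]["when"])
```

Passing date_format explicitly avoids depending on the orient-specific default.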

16-6. Usage

16-6-1. Data preparation

None

16-6-2. Code example

# 16. The pandas.DataFrame.to_json function
# 16-1. Return the JSON as a string
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})
json_str = df.to_json(orient='records')
print(json_str) # Output: [{"A":1,"B":4},{"A":2,"B":5},{"A":3,"B":6}]

# 16-2. Write to a file
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})
df.to_json('data.json', orient='records', lines=True)
# Creates data.json in the current working directory, with one JSON record per line

16-6-3. Output

# 16. The pandas.DataFrame.to_json function
# 16-1. Return the JSON as a string
# [{"A":1,"B":4},{"A":2,"B":5},{"A":3,"B":6}]

# 16-2. Write to a file
# data.json is created in the current working directory, with one JSON record per line
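The file written in 16-2 is line-delimited JSON rather than a single JSON array. The sketch below, which assumes an in-memory string instead of data.json, shows what that text looks like and how pandas.read_json can load it back with lines=True.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# lines=True requires orient="records"; each row becomes one JSON object per line.
text = df.to_json(orient="records", lines=True)
print(text)

# Read the line-delimited text back; StringIO makes it a file-like object.
restored = pd.read_json(StringIO(text), lines=True)
print(restored)
```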

17. The pandas.read_html function

17-1. Syntax

# 17. The pandas.read_html function
pandas.read_html(io, *, match='.+', flavor=None, header=None, index_col=None, skiprows=None, attrs=None, parse_dates=False, thousands=',', encoding=None, decimal='.', converters=None, na_values=None, keep_default_na=True, displayed_only=True, extract_links=None, dtype_backend=_NoDefault.no_default, storage_options=None)
Read HTML tables into a list of DataFrame objects.

Parameters:
io : str, path object, or file-like object
String, path object (implementing os.PathLike[str]), or file-like object implementing a string read() function. The string can represent a URL or the HTML itself. Note that lxml only accepts the http, ftp and file url protocols. If you have a URL that starts with 'https' you might try removing the 's'.

Deprecated since version 2.1.0: Passing html literal strings is deprecated. Wrap literal string/bytes input in io.StringIO/io.BytesIO instead.

match : str or compiled regular expression, optional
The set of tables containing text matching this regex or string will be returned. Unless the HTML is extremely simple you will probably need to pass a non-empty string here. Defaults to ‘.+’ (match any non-empty string). The default value will return all tables contained on a page. This value is converted to a regular expression so that there is consistent behavior between Beautiful Soup and lxml.

flavor : {“lxml”, “html5lib”, “bs4”} or list-like, optional
The parsing engine (or list of parsing engines) to use. ‘bs4’ and ‘html5lib’ are synonymous with each other, they are both there for backwards compatibility. The default of None tries to use lxml to parse and if that fails it falls back on bs4 + html5lib.

header : int or list-like, optional
The row (or list of rows for a MultiIndex) to use to make the columns headers.

index_col : int or list-like, optional
The column (or list of columns) to use to create the index.

skiprows : int, list-like or slice, optional
Number of rows to skip after parsing the column integer. 0-based. If a sequence of integers or a slice is given, will skip the rows indexed by that sequence. Note that a single element sequence means ‘skip the nth row’ whereas an integer means ‘skip n rows’.

attrs : dict, optional
This is a dictionary of attributes that you can pass to use to identify the table in the HTML. These are not checked for validity before being passed to lxml or Beautiful Soup. However, these attributes must be valid HTML table attributes to work correctly. For example,

attrs = {'id': 'table'}
is a valid attribute dictionary because the ‘id’ HTML tag attribute is a valid HTML attribute for any HTML tag as per this document.

attrs = {'asdf': 'table'}
is not a valid attribute dictionary because ‘asdf’ is not a valid HTML attribute even if it is a valid XML attribute. Valid HTML 4.01 table attributes can be found here. A working draft of the HTML 5 spec can be found here. It contains the latest information on table attributes for the modern web.

parse_dates : bool, optional
See read_csv() for more details.

thousands : str, optional
Separator to use to parse thousands. Defaults to ','.

encoding : str, optional
The encoding used to decode the web page. Defaults to None. None preserves the previous encoding behavior, which depends on the underlying parser library (e.g., the parser library will try to use the encoding provided by the document).

decimal : str, default ‘.’
Character to recognize as decimal point (e.g. use ‘,’ for European data).

converters : dict, default None
Dict of functions for converting values in certain columns. Keys can either be integers or column labels, values are functions that take one input argument, the cell (not column) content, and return the transformed content.

na_values : iterable, default None
Custom NA values.

keep_default_na : bool, default True
If na_values are specified and keep_default_na is False the default NaN values are overridden, otherwise they’re appended to.

displayed_only : bool, default True
Whether elements with “display: none” should be parsed.

extract_links : {None, “all”, “header”, “body”, “footer”}
Table elements in the specified section(s) with <a> tags will have their href extracted.

New in version 1.5.0.

dtype_backend : {‘numpy_nullable’, ‘pyarrow’}, default ‘numpy_nullable’
Back-end data type applied to the resultant DataFrame (still experimental). Behaviour is as follows:

"numpy_nullable": returns nullable-dtype-backed DataFrame (default).

"pyarrow": returns pyarrow-backed nullable ArrowDtype DataFrame.

New in version 2.0.

storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer here.

New in version 2.1.0.

Returns:
dfs
A list of DataFrames.

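17-2. Parameters

17-2-1. io (required): A string, path object, or file-like object giving the source of the HTML: a URL, a file path, or an object with a read() method. Passing literal HTML as a string is deprecated since pandas 2.1.0; wrap it in io.StringIO (or io.BytesIO for bytes) instead.

17-2-2. match (optional, default '.+'): A string or compiled regular expression. Only tables containing text that matches it are returned. The default '.+' matches any non-empty string, so every table on the page is returned.

17-2-3. flavor (optional, default None): 'lxml', 'html5lib', 'bs4', or a list of these, naming the parsing engine to use. 'bs4' and 'html5lib' are synonymous. With None, pandas tries lxml first and falls back to bs4 + html5lib if that fails.

17-2-4. header (optional, default None): An integer or list-like giving the row (or, for a MultiIndex, the list of rows) to use as the column headers. With None, pandas works out the header from the table's own header cells, as the output in 17-6-3 shows.

17-2-5. index_col (optional, default None): An integer or list-like giving the column (or list of columns) used to create the row index. With None, no column is used as the index.

17-2-6. skiprows (optional, default None): An integer, list-like, or slice, 0-based. An integer n skips n rows, while a sequence or slice skips the rows at those positions, so a single-element sequence means "skip the nth row".

17-2-7. attrs (optional, default None): A dict of HTML attributes used to identify the table, for example {'id': 'table'}. The attributes are not validated before being passed to the parser, but they must be valid HTML table attributes to work correctly.

17-2-8. parse_dates (optional, default False): Controls date parsing; see read_csv() for details.

17-2-9. thousands (optional, default ','): A string giving the separator used to parse thousands.

17-2-10. encoding (optional, default None): A string giving the encoding used to decode the web page. With None, the behavior depends on the underlying parser, which typically tries the encoding declared by the document.

17-2-11. decimal (optional, default '.'): A string giving the character recognized as the decimal point, for example ',' for European data.

17-2-12. converters (optional, default None): A dict of functions for converting values in certain columns. Keys are integers or column labels; each value is a function that takes the content of one cell (not the whole column) and returns the transformed content.

17-2-13. na_values (optional, default None): An iterable of custom values to treat as missing (NaN).

17-2-14. keep_default_na (optional, default True): A Boolean. If na_values is specified and keep_default_na is False, the default NaN values are overridden; otherwise the custom values are appended to the defaults.

17-2-15. displayed_only (optional, default True): A Boolean. If True, elements styled with "display: none" are not parsed.

17-2-16. extract_links (optional, default None): One of None, 'all', 'header', 'body', or 'footer'. Table elements in the specified section(s) that contain <a> tags have their href extracted. New in version 1.5.0.

17-2-17. dtype_backend (optional): 'numpy_nullable' or 'pyarrow', selecting the back-end data type applied to the resulting DataFrame. The option is still experimental and was added in version 2.0.

17-2-18. storage_options (optional, default None): A dict of extra options for a storage back end such as S3 or GCS. New in version 2.1.0.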
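The match and attrs parameters described above both narrow down which tables come back. Here is a minimal sketch, assuming a made-up page with two tables whose ids (staff, prices) are hypothetical, and assuming an HTML parser such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical page with two tables; the ids are invented for this example.
html = """
<table id="staff">
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Zhang San</td><td>30</td></tr>
</table>
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>1,250</td></tr>
</table>
"""

# match keeps only the tables whose text matches the regular expression.
by_text = pd.read_html(StringIO(html), match="Price")

# attrs keeps only the tables whose HTML attributes match.
by_attr = pd.read_html(StringIO(html), attrs={"id": "staff"})

print(by_text[0])
print(by_attr[0])
```

The Price cell is written with a thousands separator, which the default thousands=',' strips while parsing the number.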

17-3. Function

Reads the tables in an HTML document and parses each one into a Pandas DataFrame, returning them together as a list.

17-4. Return value

A list of DataFrames, one per table found. The result is a list even when the document contains a single table, so take the first element (for example dfs[0]) to get that table.
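That the result is always a list is easy to confirm. The following is a small sketch, assuming two tiny hand-written tables and an installed HTML parser such as lxml.

```python
from io import StringIO

import pandas as pd

# Two minimal tables written by hand for illustration.
one_table = "<table><tr><th>A</th></tr><tr><td>1</td></tr></table>"
two_tables = one_table + "<table><tr><th>B</th></tr><tr><td>2</td></tr></table>"

single = pd.read_html(StringIO(one_table))
double = pd.read_html(StringIO(two_tables))

# Both results are lists; only their lengths differ.
print(type(single).__name__, len(single))
print(type(double).__name__, len(double))
```

If the document contains no table at all, read_html raises a ValueError instead of returning an empty list.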

17-5. Notes

This function is well suited to extracting tabular data from web pages for analysis.

17-6. Usage

17-6-1. Data preparation

None

17-6-2. Code example

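# 17. The pandas.read_html function
from io import StringIO
import pandas as pd
# Sample HTML content containing one simple table
html_content = """
<html>
<head><title>示例表格</title></head>
<body>

<table border="1">
  <tr>
    <th>姓名</th>
    <th>年龄</th>
    <th>职业</th>
  </tr>
  <tr>
    <td>张三</td>
    <td>30</td>
    <td>工程师</td>
  </tr>
  <tr>
    <td>李四</td>
    <td>25</td>
    <td>设计师</td>
  </tr>
</table>

</body>
</html>
"""
# Read the table with pandas.read_html
# Literal HTML is wrapped in StringIO because passing it as a plain string is deprecated since pandas 2.1.0
# Note: read_html returns a list of DataFrames, because an HTML document can contain several tables
dfs = pd.read_html(StringIO(html_content))
# The HTML here has a single table, so take the first DataFrame
df = dfs[0]
# Show the DataFrame
print(df)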

17-6-3. Output

# 17. The pandas.read_html function
#    姓名  年龄   职业
# 0  张三  30  工程师
# 1  李四  25  设计师

18. The pandas.DataFrame.to_html function

18-1. Syntax

# 18. The pandas.DataFrame.to_html function
DataFrame.to_html(buf=None, *, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, justify=None, max_rows=None, max_cols=None, show_dimensions=False, decimal='.', bold_rows=True, classes=None, escape=True, notebook=False, border=None, table_id=None, render_links=False, encoding=None)
Render a DataFrame as an HTML table.

Parameters:
buf
str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.

columns
array-like, optional, default None
The subset of columns to write. Writes all columns by default.

col_space
str or int, list or dict of int or str, optional
The minimum width of each column in CSS length units. An int is assumed to be px units.

header
bool, optional
Whether to print column labels, default True.

index
bool, optional, default True
Whether to print index (row) labels.

na_rep
str, optional, default ‘NaN’
String representation of NaN to use.

formatters
list, tuple or dict of one-param. functions, optional
Formatter functions to apply to columns’ elements by position or name. The result of each function must be a unicode string. List/tuple must be of length equal to the number of columns.

float_format
one-parameter function, optional, default None
Formatter function to apply to columns’ elements if they are floats. This function must return a unicode string and will be applied only to the non-NaN elements, with NaN being handled by na_rep.

sparsify
bool, optional, default True
Set to False for a DataFrame with a hierarchical index to print every multiindex key at each row.

index_names
bool, optional, default True
Prints the names of the indexes.

justify
str, default None
How to justify the column labels. If None uses the option from the print configuration (controlled by set_option), ‘right’ out of the box. Valid values are

left

right

center

justify

justify-all

start

end

inherit

match-parent

initial

unset.

max_rows
int, optional
Maximum number of rows to display in the console.

max_cols
int, optional
Maximum number of columns to display in the console.

show_dimensions
bool, default False
Display DataFrame dimensions (number of rows by number of columns).

decimal
str, default ‘.’
Character recognized as decimal separator, e.g. ‘,’ in Europe.

bold_rows
bool, default True
Make the row labels bold in the output.

classes
str or list or tuple, default None
CSS class(es) to apply to the resulting html table.

escape
bool, default True
Convert the characters <, >, and & to HTML-safe sequences.

notebook
{True, False}, default False
Whether the generated HTML is for IPython Notebook.

border
int
A border=border attribute is included in the opening <table> tag. Default pd.options.display.html.border.

table_id
str, optional
A css id is included in the opening <table> tag if specified.

render_links
bool, default False
Convert URLs to HTML links.

encoding
str, default “utf-8”
Set character encoding.

Returns:
str or None
If buf is None, returns the result as a string. Otherwise returns None.

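18-2. Parameters

18-2-1. buf (optional, default None): A string, Path, or StringIO-like object to write the generated HTML to. With None (the default), the HTML is returned as a string.

18-2-2. columns (optional, default None): The subset of columns to write. With None (the default), all columns are written.

18-2-3. col_space (optional, default None): The minimum width of each column in CSS length units, given as a string or integer, or as a list or dict of them. An integer is taken to be pixels.

18-2-4. header (optional, default True): Whether to write the column labels as the header of the HTML table.

18-2-5. index (optional, default True): Whether to write the row index labels into the HTML table.

18-2-6. na_rep (optional, default 'NaN'): The string used to represent missing values.

18-2-7. formatters (optional, default None): A list, tuple, or dict of one-parameter functions applied to the elements of the columns by position or by name. Each function must return a string, and a list or tuple must have as many entries as there are columns.

18-2-8. float_format (optional, default None): A one-parameter function applied to floating point elements, for example '{:.2f}'.format to keep two decimal places. It must return a string.

18-2-9. sparsify (optional, default None): For a DataFrame with a hierarchical index, set this to False to print every MultiIndex key on each row.

18-2-10. index_names (optional, default True): Whether to write the names of the indexes.

18-2-11. justify (optional, default None): How to justify the column labels. With None, the value from the print configuration (controlled by set_option) is used, which is 'right' out of the box. Valid values are left, right, center, justify, justify-all, start, end, inherit, match-parent, initial, and unset.

18-2-12. max_rows/max_cols (optional, default None): The maximum number of rows and columns to display. With None (the default), no limit is applied.

18-2-13. show_dimensions (optional, default False): Whether to display the DataFrame dimensions (number of rows by number of columns).

18-2-14. decimal (optional, default '.'): The character recognized as the decimal separator, for example ',' in Europe.

18-2-15. bold_rows (optional, default True): Whether to make the row labels (index) bold.

18-2-16. classes (optional, default None): A string, list, or tuple of CSS class names to apply to the generated HTML table.

18-2-17. escape (optional, default True): Whether to convert the characters <, >, and & to HTML-safe sequences, which helps guard against cross-site scripting (XSS).

18-2-18. notebook (optional, default False): Whether the generated HTML is intended for IPython Notebook.

18-2-19. border (optional, default None): The value of the border attribute written in the opening <table> tag. With None (the default), pd.options.display.html.border is used.

18-2-20. table_id (optional, default None): An id written in the opening <table> tag, which CSS or JavaScript can use to refer to the table.

18-2-21. render_links (optional, default False): Whether to convert URLs to clickable HTML links.

18-2-22. encoding (optional, default None): The character encoding to use; when it is not given, "utf-8" is used.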
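Several of the parameters above are commonly combined when a table is embedded in a styled page. This is a minimal sketch under stated assumptions: the data, the CSS class names (table, table-striped), and the id scores are all hypothetical.

```python
import pandas as pd

# Illustrative data; "a<b" is included to show what escape=True (the default) does.
df = pd.DataFrame({"name": ["a<b", "c"], "score": [1.5, 2.25]})

html = df.to_html(
    index=False,                         # leave out the row-label column
    classes=["table", "table-striped"],  # added after the default "dataframe" class
    table_id="scores",                   # hypothetical id for CSS or JavaScript
    float_format="{:.2f}".format,        # one-argument callable returning a string
)
print(html)
```

Because escape is left at True, the < in the first name is written as &lt; rather than being interpreted as markup.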

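18-3. Function

Renders a Pandas DataFrame as an HTML table.

18-4. Return value

The return value depends on the buf parameter.

18-4-1. If buf is None (the default), to_html returns a string containing the HTML table. The string can be printed to the console, saved to a file, or embedded in a web page or document.

18-4-2. If buf is a path, a file object (such as a handle returned by open), or a StringIO object, to_html writes the HTML table to it and returns None. To view the table afterwards, work with that target, for example by closing the file or reading the contents of the StringIO object.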
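The two cases can be compared directly. A small sketch, assuming an io.StringIO buffer in place of a real file:

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# buf omitted: the HTML comes back as a string.
html_text = df.to_html()

# buf given: the HTML is written to the buffer and None is returned.
buffer = StringIO()
result = df.to_html(buf=buffer)

print(result)
print(buffer.getvalue() == html_text)
```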

18-5. Notes

None

18-6. Usage

18-6-1. Data preparation

None

18-6-2. Code example

# 18. The pandas.DataFrame.to_html function
# 18-1. Return a string
import pandas as pd
# Create a simple DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})
# Convert the DataFrame to an HTML string
html_str = df.to_html()
# Print the HTML string
print(html_str)

# 18-2. Write to a file
import pandas as pd
# Create a simple DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})
# Open a file for writing
with open('example.html', 'w') as f:
    # Write the DataFrame to the file
    df.to_html(buf=f)
    # The HTML table has now been written to 'example.html'

18-6-3. Output

# 18. The pandas.DataFrame.to_html function
# 18-1. Return a string
# <table border="1" class="dataframe">
#   <thead>
#     <tr style="text-align: right;">
#       <th></th>
#       <th>A</th>
#       <th>B</th>
#       <th>C</th>
#     </tr>
#   </thead>
#   <tbody>
#     <tr>
#       <th>0</th>
#       <td>1</td>
#       <td>4</td>
#       <td>7</td>
#     </tr>
#     <tr>
#       <th>1</th>
#       <td>2</td>
#       <td>5</td>
#       <td>8</td>
#     </tr>
#     <tr>
#       <th>2</th>
#       <td>3</td>
#       <td>6</td>
#       <td>9</td>
#     </tr>
#   </tbody>
# </table>

# 18-2. Write to a file
# The HTML table is written to 'example.html' in the current working directory
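Since to_html writes a table and read_html parses one, the two functions can be chained as a quick round-trip check. The sketch below assumes the same three-column frame as above, keeps everything in memory, and needs an HTML parser such as lxml to be installed.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# index=False keeps the row labels out of the table, so no extra column comes back.
html = df.to_html(index=False)

# read_html returns a list, so take the first (and only) table.
restored = pd.read_html(StringIO(html))[0]
print(restored)
```

With the default index=True, the row labels are written as an extra leading column, which can be turned back into the index by passing index_col=0 to read_html.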