Python help(csv): the csv help documentation

今夕何夕2112

Category: Notes. Tags: python

Article link: https://blog.csdn.net/lovefor999/article/details/130778555

python 3.11.2 
2023-05-20

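This post reproduces the output of the help(csv) command in Python, together with some translation and my own notes.
Reference: https://docs.python.org/zh-cn/3/library/csv.html
If you just want to learn quickly how to read and write CSV files with the csv module, see the URL above; the official documentation is detailed and practical.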

Help on module csv:
(Help documentation for the csv module.)

NAME
    csv - CSV parsing and writing.

Module name: csv, for parsing and writing CSV.

MODULE REFERENCE
    https://docs.python.org/3.11/library/csv.html
    
    The following documentation is automatically generated from the Python
    source files. It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations. When in doubt, consult the module reference at the
    location listed above.

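Module reference
The documentation below is generated automatically from the Python source files. It may be incomplete or inaccurate, may include features that are really implementation details, and may differ between Python implementations. When in doubt, consult the module reference at the URL listed above.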

DESCRIPTION
    This module provides classes that assist in the reading and writing
    of Comma Separated Value (CSV) files, and implements the interface
    described by PEP 305.  Although many CSV files are simple to parse,
    the format is not formally defined by a stable specification and
    is subtle enough that parsing lines of a CSV file with something
    like line.split(",") is bound to fail.  The module supports three
    basic APIs: reading, writing, and registration of dialects.

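Description
This module provides classes that help with reading and writing CSV files, and implements the interface described in PEP 305. Although many CSV files are simple to parse, the format is not formally defined by a stable specification, and it is subtle enough that parsing the lines of a CSV file with something like line.split(",") is bound to fail. The module supports three basic APIs: reading CSV files, writing CSV files, and registering dialects.
Note: dialect is the name of both a parameter and a class, as in csv.writer(csvfile, dialect='excel') and class Dialect(...), so the term is used as-is from here on.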
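
To make the line.split(",") warning concrete, here is a minimal sketch. The sample line is made up for illustration; it only assumes a field that contains the delimiter inside quotes.

```python
import csv

# A made-up line whose second field contains a comma inside quotes.
line = 'Smith,"Portland, OR",42'

# Naive splitting breaks the quoted field at its embedded comma.
naive = line.split(",")

# csv.reader accepts any iterable of lines and honours the quoting.
parsed = next(csv.reader([line]))

print(naive)   # ['Smith', '"Portland', ' OR"', '42']
print(parsed)  # ['Smith', 'Portland, OR', '42']
```

The naive split yields four pieces and leaves the quote characters in place, while the reader returns the three intended fields.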

DIALECT REGISTRATION:

    Readers and writers support a dialect argument, which is a convenient
    handle on a group of settings.  When the dialect argument is a string,
    it identifies one of the dialects previously registered with the module.
    If it is a class or instance, the attributes of the argument are used as
    the settings for the reader or writer:

        class excel:
            delimiter = ','
            quotechar = '"'
            escapechar = None
            doublequote = True
            skipinitialspace = False
            lineterminator = '\r\n'
            quoting = QUOTE_MINIMAL

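Dialect registration
Readers and writers (csv.reader/csv.writer) both accept a parameter named dialect, which is a convenient handle on a group of settings. When the dialect argument is a string, it names one of the dialects previously registered with the module. When it is a class or an instance, its attributes are used as the settings for the reader or writer (the excel class shown above is an example).
Note: one or more settings can also be given individually, for example csv.reader(csv_file, delimiter='|', skipinitialspace=True).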
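
The three ways of supplying settings described above can be sketched as follows. The tab-separated sample lines are invented for the illustration; "excel-tab" and csv.excel_tab are the names the module itself registers.

```python
import csv

# Invented tab-separated input lines.
lines = ["a\tb\tc", "1\t2\t3"]

# 1) dialect given as the name of a registered dialect
by_name = list(csv.reader(lines, dialect="excel-tab"))

# 2) dialect given as a class whose attributes supply the settings
by_class = list(csv.reader(lines, dialect=csv.excel_tab))

# 3) default 'excel' dialect with one setting overridden by keyword
by_keyword = list(csv.reader(lines, delimiter="\t"))

print(by_name)  # [['a', 'b', 'c'], ['1', '2', '3']]
print(by_name == by_class == by_keyword)  # True
```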

SETTINGS:
        * quotechar - specifies a one-character string to use as the
            quoting character.  It defaults to '"'.
            
        * delimiter - specifies a one-character string to use as the
            field separator.  It defaults to ','.
        * skipinitialspace - specifies how to interpret spaces which
            immediately follow a delimiter.  It defaults to False, which
            means that spaces immediately following a delimiter is part
            of the following field.
        * lineterminator -  specifies the character sequence which should
            terminate rows.
        * quoting - controls when quotes should be generated by the writer.
            It can take on any of the following module constants:

            csv.QUOTE_MINIMAL means only when required, for example, when a
                field contains either the quotechar or the delimiter
            csv.QUOTE_ALL means that quotes are always placed around fields.
            csv.QUOTE_NONNUMERIC means that quotes are always placed around
                fields which do not parse as integers or floating point
                numbers.
            csv.QUOTE_NONE means that quotes are never placed around fields.
        * escapechar - specifies a one-character string used to escape
            the delimiter when quoting is set to QUOTE_NONE.
        * doublequote - controls the handling of quotes inside fields.  When
            True, two consecutive quotes are interpreted as one during read,
            and when writing, each quote character embedded in the data is
            written as two quotes

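Settings (the items that can be configured):
quotechar: a one-character string used as the quoting character. Defaults to the ASCII double quote ".
delimiter: a one-character string used as the field separator. Defaults to the ASCII comma ,.
Note: below, "quoting character" and "field separator" refer to the one-character strings set by these two items.
skipinitialspace: how to treat spaces that immediately follow a delimiter. Defaults to False, meaning those spaces are part of the following field. (When set to True, spaces immediately following a delimiter are ignored, including runs of several spaces.)
lineterminator: the character sequence that terminates rows.
quoting: controls when the writer generates (inserts) the quoting character (quotechar). It can be any of the following module constants:
csv.QUOTE_MINIMAL: help text: only when required, for example when a field contains the quoting character or the delimiter. Official documentation: instructs writer objects to quote only those fields that contain special characters such as the delimiter, the quotechar, or any of the characters in lineterminator.
csv.QUOTE_ALL: always put quotechar around fields.
csv.QUOTE_NONNUMERIC: official documentation: instructs writer objects to quote all non-numeric fields, and instructs the reader to convert all unquoted fields to type float.
csv.QUOTE_NONE: never put quotes around fields.
escapechar: a one-character string used to escape the delimiter when quoting is set to QUOTE_NONE.
doublequote: controls how quoting characters inside fields are handled. When True, two consecutive quoting characters are read as one, and on writing each quoting character in the data is written as two.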
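
As a rough illustration of the quoting constants, the sketch below writes the same row under each mode. The row contents and the helper name render are made up; lineterminator is overridden to "\n" only to keep the printed output short.

```python
import csv
import io

# Made-up row: plain text, a float, a field with a quote, a field with a comma.
row = ["abc", 3.5, 'q"t', "x,y"]

def render(**fmtparams):
    """Write `row` once with the given format parameters and return the text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmtparams).writerow(row)
    return buf.getvalue()

minimal = render(quoting=csv.QUOTE_MINIMAL)
everything = render(quoting=csv.QUOTE_ALL)
nonnumeric = render(quoting=csv.QUOTE_NONNUMERIC)
# With QUOTE_NONE an escapechar is needed to write the quote and the comma.
escaped = render(quoting=csv.QUOTE_NONE, escapechar="\\")

print(minimal, end="")     # abc,3.5,"q""t","x,y"
print(everything, end="")  # "abc","3.5","q""t","x,y"
print(nonnumeric, end="")  # "abc",3.5,"q""t","x,y"
print(escaped, end="")     # abc,3.5,q\"t,x\,y
```

The doubled quote in the first three outputs comes from doublequote=True, the default in the excel dialect.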

CLASSES
    builtins.Exception(builtins.BaseException)
        _csv.Error
    builtins.object
        Dialect
            excel
                excel_tab
            unix_dialect
        DictReader
        DictWriter
        Sniffer

Classes:
Built-in exception class: Error
Classes derived from the built-in object:
Dialect, DictReader, DictWriter, Sniffer
Dialect has the subclasses excel and unix_dialect
excel has the subclass excel_tab

Note: much of what follows can be skipped; jump straight to the end.

    class Dialect(builtins.object)
    (the Dialect class)
     |  Describe a CSV dialect.
     |  (Describes a CSV dialect.)
     |
     |  This must be subclassed (see csv.excel).  Valid attributes are:
     |  delimiter, quotechar, escapechar, doublequote, skipinitialspace,
     |  lineterminator, quoting.
     |  (It must be subclassed. The valid attributes are the seven listed in the two lines above.)
     |
     |  Methods defined here:
     |  (The methods of the class are defined here.)
     |
     |  __init__(self)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |      (Initializes the instance.)
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |  (Data descriptors are defined here.)
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |      (Dictionary of instance variables, if defined.)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |      (List of weak references to the object, if defined.)
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |  (Data and other attributes are defined here.)
     |
     |  delimiter = None
     |
     |  doublequote = None
     |
     |  escapechar = None
     |
     |  lineterminator = None
     |
     |  quotechar = None
     |
     |  quoting = None
     |
     |  skipinitialspace = None

The Dialect class has all seven attributes set to None; to supply the dialect parameter of csv.reader/csv.writer, use a subclass of Dialect.
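
A minimal sketch of such a subclass follows. PipeDialect and the sample text are hypothetical and not part of the csv module; the attribute values are one possible choice for pipe-separated data.

```python
import csv
import io

class PipeDialect(csv.Dialect):
    """Hypothetical dialect for pipe-separated files (not part of csv)."""
    delimiter = "|"
    quotechar = '"'
    escapechar = None
    doublequote = True
    skipinitialspace = True      # drop spaces that follow each delimiter
    lineterminator = "\n"
    quoting = csv.QUOTE_MINIMAL

# Invented sample text with a space after every pipe.
sample = io.StringIO("a| b| c\n1| 2| 3\n")
rows = list(csv.reader(sample, dialect=PipeDialect))

print(rows)  # [['a', 'b', 'c'], ['1', '2', '3']]
```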

    class DictReader(builtins.object)
     |  DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
     |
     |  Methods defined here:
     |
     |  __init__(self, f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  __iter__(self)
     |
     |  __next__(self)
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |
     |  fieldnames

The DictReader class. Its constructor signature is DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds). It defines __iter__ and __next__, so it can be used as an iterator.
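
The following sketch shows how the fieldnames, restkey and restval parameters interact. The CSV text and the values "overflow" and "n/a" are made up for the example.

```python
import csv
import io

# Invented data: one short row and one row with an extra field.
data = io.StringIO("name,age\nalice,30\nbob\ncarol,41,extra\n")

# fieldnames is omitted, so the first row supplies the keys.
reader = csv.DictReader(data, restkey="overflow", restval="n/a")
rows = list(reader)

print(reader.fieldnames)  # ['name', 'age']
print(rows[0])            # {'name': 'alice', 'age': '30'}
print(rows[1])            # {'name': 'bob', 'age': 'n/a'}
print(rows[2])            # {'name': 'carol', 'age': '41', 'overflow': ['extra']}
```

Missing values are filled with restval, and surplus values are collected into a list stored under restkey.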

    class DictWriter(builtins.object)
     |  DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
     |
     |  Methods defined here:
     |
     |  __init__(self, f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  writeheader(self)
     |
     |  writerow(self, rowdict)
     |
     |  writerows(self, rowdicts)
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)

The DictWriter class. Its constructor signature is DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds). Its methods are writeheader(self), writerow(self, rowdict) and writerows(self, rowdicts), which write the header row, a single row, and multiple rows respectively.
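
A short sketch of the three methods, using an in-memory buffer instead of a real file. The records and the restval "?" are invented; lineterminator is passed through to the underlying writer only to keep the output compact.

```python
import csv
import io

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "age"], restval="?",
                        lineterminator="\n")

writer.writeheader()                                  # header row
writer.writerow({"name": "alice", "age": 30})         # one row
writer.writerows([{"name": "bob"},                    # several rows; the
                  {"name": "carol", "age": 41}])      # missing age becomes "?"

# With the default extrasaction='raise', unknown keys are rejected.
rejected = None
try:
    writer.writerow({"name": "dave", "nickname": "d"})
except ValueError as exc:
    rejected = str(exc)

print(buf.getvalue())
print(rejected)
```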

    class Error(builtins.Exception)
     |  Method resolution order:
     |      Error
     |      builtins.Exception
     |      builtins.BaseException
     |      builtins.object
     |
     |  Methods inherited from builtins.Exception:
     |  (Methods inherited from builtins.Exception.)
     |
     |  __init__(self, /, *args, **kwargs)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Static methods inherited from builtins.Exception:
     |  (Static methods inherited from builtins.Exception.)
     |
     |  __new__(*args, **kwargs) from builtins.type
     |      Create and return a new object.  See help(type) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from builtins.BaseException:
     |
     |  __delattr__(self, name, /)
     |      Implement delattr(self, name).
     |
     |  __getattribute__(self, name, /)
     |      Return getattr(self, name).
     |
     |  __reduce__(...)
     |      Helper for pickle.
     |
     |  __repr__(self, /)
     |      Return repr(self).
     |
     |  __setattr__(self, name, value, /)
     |      Implement setattr(self, name, value).
     |
     |  __setstate__(...)
     |
     |  __str__(self, /)
     |      Return str(self).
     |
     |  add_note(...)
     |      Exception.add_note(note) --
     |      add a note to the exception
     |
     |  with_traceback(...)
     |      Exception.with_traceback(tb) --
     |      set self.__traceback__ to tb and return self.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from builtins.BaseException:
     |
     |  __cause__
     |      exception cause
     |
     |  __context__
     |      exception context
     |
     |  __dict__
     |
     |  __suppress_context__
     |
     |  __traceback__
     |
     |  args

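Inheritance chain: built-in object -> BaseException -> Exception -> Error.
Note that Error inherits the implementations of delattr(self, name), getattr(self, name), setattr(self, name, value), str(self) and add_note(...) from BaseException.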
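
Because Error sits below Exception, it can be caught like any other exception. The sketch below triggers it by asking for a dialect name that was never registered; the name "no-such-dialect" is made up.

```python
import csv

caught = None
try:
    # "no-such-dialect" is an invented name that is not registered.
    csv.get_dialect("no-such-dialect")
except csv.Error as exc:
    caught = exc

print(type(caught).__mro__)  # Error, Exception, BaseException, object
print(str(caught))           # message text supplied by the module
```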

    class Sniffer(builtins.object)
     |  "Sniffs" the format of a CSV file (i.e. delimiter, quotechar)
     |  ("Sniffs" the format of a CSV file, e.g. the field separator and the quoting character.)
     |
     |  Returns a Dialect object.
     |  (Returns a Dialect object.)
     |
     |  Methods defined here:
     |
     |  __init__(self)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  has_header(self, sample)
     |      (Returns whether the sample has a header row; here and below, sample may be a string.)
     |
     |  sniff(self, sample, delimiters=None)
     |      Returns a dialect (or None) corresponding to the sample
     |      (Returns the Dialect subclass, or None, that corresponds to the sample.)
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)

The Sniffer class has two methods.
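
A small sketch of both methods on an invented semicolon-separated sample. Sniffer works by heuristics, so the result depends on the sample; restricting the candidates with delimiters=";," is a choice made here to keep the guess stable.

```python
import csv
import io

# Invented sample: semicolon-separated, first row looks like a header.
sample = "name;age\nalice;30\nbob;25\ncarol;41\n"

sniffer = csv.Sniffer()
dialect = sniffer.sniff(sample, delimiters=";,")   # returns a Dialect subclass
header = sniffer.has_header(sample)

# The sniffed dialect can be passed straight to a reader.
rows = list(csv.reader(io.StringIO(sample), dialect))

print(dialect.delimiter)
print(header)
print(rows[1])
```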

Below are the subclasses of Dialect: excel, its subclass excel_tab, and unix_dialect. They differ only in their attributes.

    class excel(Dialect)
     |  Describe the usual properties of Excel-generated CSV files.
     |
     |  Method resolution order:
     |      excel
     |      Dialect
     |      builtins.object
     |
     |  Data and other attributes defined here:
     |
     |  delimiter = ','
     |
     |  doublequote = True
     |
     |  lineterminator = '\r\n'
     |
     |  quotechar = '"'
     |
     |  quoting = 0
     |
     |  skipinitialspace = False
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from Dialect:
     |
     |  __init__(self)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from Dialect:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes inherited from Dialect:
     |
     |  escapechar = None

    class excel_tab(excel)
     |  Describe the usual properties of Excel-generated TAB-delimited files.
     |
     |  Method resolution order:
     |      excel_tab
     |      excel
     |      Dialect
     |      builtins.object
     |
     |  Data and other attributes defined here:
     |
     |  delimiter = '\t'
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes inherited from excel:
     |
     |  doublequote = True
     |
     |  lineterminator = '\r\n'
     |
     |  quotechar = '"'
     |
     |  quoting = 0
     |
     |  skipinitialspace = False
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from Dialect:
     |
     |  __init__(self)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from Dialect:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes inherited from Dialect:
     |
     |  escapechar = None

    class unix_dialect(Dialect)
     |  Describe the usual properties of Unix-generated CSV files.
     |
     |  Method resolution order:
     |      unix_dialect
     |      Dialect
     |      builtins.object
     |
     |  Data and other attributes defined here:
     |
     |  delimiter = ','
     |
     |  doublequote = True
     |
     |  lineterminator = '\n'
     |
     |  quotechar = '"'
     |
     |  quoting = 1
     |
     |  skipinitialspace = False
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from Dialect:
     |
     |  __init__(self)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors inherited from Dialect:
     |
     |  __dict__
     |      dictionary for instance variables (if defined)
     |
     |  __weakref__
     |      list of weak references to the object (if defined)
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes inherited from Dialect:
     |
     |  escapechar = None
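
To see what the attribute differences between the three built-in dialects mean in practice, this sketch writes one made-up row under each registered name ("excel", "excel-tab", "unix") and shows the raw text with repr.

```python
import csv
import io

def render(dialect_name):
    """Write one made-up row with the named dialect and return the raw text."""
    buf = io.StringIO()
    csv.writer(buf, dialect=dialect_name).writerow(["a", "b,c", 1])
    return buf.getvalue()

outputs = {name: render(name) for name in ("excel", "excel-tab", "unix")}

print(repr(outputs["excel"]))      # 'a,"b,c",1\r\n'
print(repr(outputs["excel-tab"]))  # 'a\tb,c\t1\r\n'
print(repr(outputs["unix"]))       # '"a","b,c","1"\n'
```

The unix dialect quotes every field because its quoting attribute is 1 (QUOTE_ALL), and it ends rows with '\n' instead of '\r\n'.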

Some functions:

FUNCTIONS
    field_size_limit(...)
        Sets an upper limit on parsed fields.
        (Sets the upper limit on the size of parsed fields.)

            csv.field_size_limit([limit])

        Returns old limit. If limit is not given, no new limit is set and
        the old limit is returned
        (Returns the old limit. If limit is not given, no new limit is set and the old limit is returned.)
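
The query-and-set behaviour of field_size_limit can be sketched like this. The limit of 10 and the input line are arbitrary values chosen for the demonstration, and the original limit is restored at the end.

```python
import csv

old_limit = csv.field_size_limit()     # no argument: only query the limit
previous = csv.field_size_limit(10)    # set a new limit, get the old one back

try:
    # The second field is longer than the arbitrary 10-character limit.
    list(csv.reader(["short,this field is far too long"]))
    too_long = False
except csv.Error:
    too_long = True
finally:
    csv.field_size_limit(old_limit)    # restore the original limit

print(previous == old_limit)  # True
print(too_long)               # True
```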

    get_dialect(name)
        Return the dialect instance associated with name.
        (Returns the dialect instance associated with name.)

        dialect = csv.get_dialect(name)

    list_dialects()
        Return a list of all known dialect names.
        (Returns a list of the names of all known dialects.)

        names = csv.list_dialects()
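
A quick sketch of the two lookup functions, assuming only the dialects that the module registers by default:

```python
import csv

names = csv.list_dialects()        # names of all registered dialects
excel = csv.get_dialect("excel")   # look one of them up by name

print(sorted(names))
print(repr(excel.delimiter), excel.quoting == csv.QUOTE_MINIMAL)
```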

    reader(...)
        csv_reader = reader(iterable [, dialect='excel']
                                [optional keyword args])
            for row in csv_reader:
                process(row)

        The "iterable" argument can be any object that returns a line
        of input for each iteration, such as a file object or a list.  The
        optional "dialect" parameter is discussed below.  The function
        also accepts optional keyword arguments which override settings
        provided by the dialect.
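        (The "iterable" argument can be any object that returns one line of
        input per iteration, such as a file object or a list. The optional
        "dialect" parameter is discussed below. The function also accepts
        optional keyword arguments, which override the settings provided by
        the dialect.)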

        The returned object is an iterator.  Each iteration returns a row
        of the CSV file (which can span multiple input lines).
        (The returned object is an iterator. Each iteration returns one row
        of the CSV file, which may span several input lines.)

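
The remark that one row can span several input lines is easiest to see with a quoted field that contains a newline. The input list below is invented and stands in for the lines of a file.

```python
import csv

# Four input lines, but the quoted comment field spans two of them.
lines = ['id,comment\n', '1,"first line\n', 'second line"\n', '2,plain\n']

rows = list(csv.reader(lines))

print(len(lines), len(rows))  # 4 3
print(rows[1])                # ['1', 'first line\nsecond line']
```
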
    register_dialect(...)
        Create a mapping from a string name to a dialect class.
        (Creates a mapping from a string name to a dialect class.)
        dialect = csv.register_dialect(name[, dialect[, **fmtparams]])

    unregister_dialect(name)
        Delete the name/dialect mapping associated with a string name.
        (Deletes the name -> dialect mapping associated with name.)
        csv.unregister_dialect(name)
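
A sketch of the register / use / unregister cycle. The dialect name "pipes" and its settings are hypothetical, chosen only for the illustration.

```python
import csv

# Register a hypothetical dialect purely from keyword settings.
csv.register_dialect("pipes", delimiter="|", quoting=csv.QUOTE_NONE)
registered = "pipes" in csv.list_dialects()

# The registered name can now be used wherever a dialect is expected.
rows = list(csv.reader(["a|b|c"], dialect="pipes"))

# Remove the mapping again so the global registry is left unchanged.
csv.unregister_dialect("pipes")
still_registered = "pipes" in csv.list_dialects()

print(registered, rows, still_registered)  # True [['a', 'b', 'c']] False
```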

    writer(...)
        csv_writer = csv.writer(fileobj [, dialect='excel']
                                    [optional keyword args])
            for row in sequence:
                csv_writer.writerow(row)

            [or]

            csv_writer = csv.writer(fileobj [, dialect='excel']
                                    [optional keyword args])
            csv_writer.writerows(rows)

        The "fileobj" argument can be any object that supports the file API.
        (The "fileobj" argument can be any object that supports the file API.)
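
Both calling patterns from the help text can be sketched with an in-memory buffer standing in for the file object; the rows are made up.

```python
import csv
import io

buf = io.StringIO()            # any object with a write() method works
w = csv.writer(buf)            # default dialect is 'excel'

w.writerow(["name", "note"])                           # one row at a time
w.writerows([["alice", "likes, commas"], ["bob", ""]]) # or many rows at once

text = buf.getvalue()
print(repr(text))  # 'name,note\r\nalice,"likes, commas"\r\nbob,\r\n'
```

When the target is a real file, the official documentation recommends opening it with newline='' so that the '\r\n' line terminator is not translated a second time.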

DATA
    QUOTE_ALL = 1
    QUOTE_MINIMAL = 0
    QUOTE_NONE = 3
    QUOTE_NONNUMERIC = 2
    __all__ = ['QUOTE_MINIMAL', 'QUOTE_ALL', 'QUOTE_NONNUMERIC', 'QUOTE_NO...

VERSION
    1.0

FILE
    /usr/lib/python3.11/csv.py