lotrus28..
5
尝试读取带有空格,逗号和引号的制表符分隔表时,我遇到了类似的问题:
1115794 4218 "k__Bacteria", "p__Firmicutes", "c__Bacilli", "o__Bacillales", "f__Bacillaceae", ""
1144102 3180 "k__Bacteria", "p__Firmicutes", "c__Bacilli", "o__Bacillales", "f__Bacillaceae", "g__Bacillus", ""
368444 2328 "k__Bacteria", "p__Bacteroidetes", "c__Bacteroidia", "o__Bacteroidales", "f__Bacteroidaceae", "g__Bacteroides", ""
import pandas as pd
# Same error for read_table
counts = pd.read_csv(path_counts, sep='\t', index_col=2, header=None, engine = 'c')
pandas.io.common.CParserError: Error tokenizing data. C error: out of memory
这说明它与C解析引擎(默认引擎)有关。也许更改为python会改变一切
counts = pd.read_table(path_counts, sep=&#