c++ rapidcsv

https://github.com/d99kris/rapidcsv

Rapidcsv
Rapidcsv is a C++ header-only library for CSV parsing. While the name was admittedly inspired by the rapidjson project, the objectives are not the same. The goal of rapidcsv is to be an easy-to-use CSV library enabling rapid development. If optimal performance (be it CPU or memory usage) is the priority, a CSV parser implemented for the specific use case is likely to perform better.

Example Usage
Here is a simple example that reads a CSV file and gets the ‘Close’ column as a vector of floats, and also gets a specific cell.

colrowhdr.csv content:

Date,Open,High,Low,Close,Volume,Adj Close
2017-02-24,64.529999,64.800003,64.139999,64.620003,21705200,64.620003
2017-02-23,64.419998,64.730003,64.190002,64.620003,20235200,64.620003
2017-02-22,64.330002,64.389999,64.050003,64.360001,19259700,64.360001
2017-02-21,64.610001,64.949997,64.449997,64.489998,19384900,64.489998
2017-02-17,64.470001,64.690002,64.300003,64.620003,21234600,64.620003

ex001.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/colrowhdr.csv");

  std::vector<float> close = doc.GetColumn<float>("Close");
  std::cout << "Read " << close.size() << " values." << std::endl;

  long long volume = doc.GetCell<long long>("Volume", "2017-02-22");
  std::cout << "Volume " << volume << " on 2017-02-22." << std::endl;
}

Refer to section More Examples below for more examples. The tests directory also contains many simple usage examples.

Supported Platforms
Rapidcsv is implemented using C++11 with the intention of being portable. It’s been tested on:

macOS Mojave 10.14
Ubuntu 18.04 LTS
Windows 7 / Visual Studio 2015

Installation
Simply copy src/rapidcsv.h to your project/include directory and include it.

More Examples
Several of the following examples are also provided in the examples/ directory and can be executed directly under Linux and macOS thanks to a shebang-hack. Example running ex001.cpp:

./examples/ex001.cpp

Reading a File without Headers
By default rapidcsv treats the first row as column headers, and the first column as row headers. This allows accessing rows, columns and cells by their labels; for example, GetCell("Close", "2017-02-22") gets the cell in the column labelled "Close" at the row labelled "2017-02-22". Sometimes one may prefer to access the first row and/or column as data, and to address cells only by their row and column index. To do so, one needs to use LabelParams and set pColumnNameIdx and/or pRowNameIdx to -1 (disabled).

Column Headers Only
colhdr.csv content:

Open,High,Low,Close,Volume,Adj Close
64.529999,64.800003,64.139999,64.620003,21705200,64.620003
64.419998,64.730003,64.190002,64.620003,20235200,64.620003
64.330002,64.389999,64.050003,64.360001,19259700,64.360001
64.610001,64.949997,64.449997,64.489998,19384900,64.489998
64.470001,64.690002,64.300003,64.620003,21234600,64.620003

ex002.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/colhdr.csv", rapidcsv::LabelParams(0, -1));

  std::vector<float> col = doc.GetColumn<float>("Close");
  std::cout << "Read " << col.size() << " values." << std::endl;
}

Row Headers Only
rowhdr.csv content:

2017-02-24,64.529999,64.800003,64.139999,64.620003,21705200,64.620003
2017-02-23,64.419998,64.730003,64.190002,64.620003,20235200,64.620003
2017-02-22,64.330002,64.389999,64.050003,64.360001,19259700,64.360001
2017-02-21,64.610001,64.949997,64.449997,64.489998,19384900,64.489998
2017-02-17,64.470001,64.690002,64.300003,64.620003,21234600,64.620003

ex003.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/rowhdr.csv", rapidcsv::LabelParams(-1, 0));

  std::vector<std::string> row = doc.GetRow<std::string>("2017-02-22");
  std::cout << "Read " << row.size() << " values." << std::endl;
}

No Headers
nohdr.csv content:

64.529999,64.800003,64.139999,64.620003,21705200,64.620003
64.419998,64.730003,64.190002,64.620003,20235200,64.620003
64.330002,64.389999,64.050003,64.360001,19259700,64.360001
64.610001,64.949997,64.449997,64.489998,19384900,64.489998
64.470001,64.690002,64.300003,64.620003,21234600,64.620003

ex004.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/nohdr.csv", rapidcsv::LabelParams(-1, -1));

  std::vector<float> close = doc.GetColumn<float>(5);
  std::cout << "Read " << close.size() << " values." << std::endl;

  long long volume = doc.GetCell<long long>(4, 2);
  std::cout << "Volume " << volume << " on 2017-02-22." << std::endl;
}

Reading a File with Custom Separator
To read files with a custom separator (i.e. not a comma), one needs to specify the SeparatorParams argument. The following example reads a file using a semicolon as the separator.

semi.csv content:

Date;Open;High;Low;Close;Volume;Adj Close
2017-02-24;64.529999;64.800003;64.139999;64.620003;21705200;64.620003
2017-02-23;64.419998;64.730003;64.190002;64.620003;20235200;64.620003
2017-02-22;64.330002;64.389999;64.050003;64.360001;19259700;64.360001
2017-02-21;64.610001;64.949997;64.449997;64.489998;19384900;64.489998
2017-02-17;64.470001;64.690002;64.300003;64.620003;21234600;64.620003

ex005.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/semi.csv", rapidcsv::LabelParams(),
                         rapidcsv::SeparatorParams(';'));

  std::vector<float> close = doc.GetColumn<float>("Close");
  std::cout << "Read " << close.size() << " values." << std::endl;

  long long volume = doc.GetCell<long long>("Volume", "2017-02-22");
  std::cout << "Volume " << volume << " on 2017-02-22." << std::endl;
}
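
To make the role of the separator character concrete, here is a minimal standalone sketch of splitting one line on a configurable separator. SplitLine is a hypothetical helper written for illustration only; it is not part of rapidcsv, does not reflect how rapidcsv parses internally, and deliberately ignores quoted fields.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of rapidcsv): split one line on a
// single-character separator. Quoted fields are NOT handled here.
std::vector<std::string> SplitLine(const std::string& line, char separator)
{
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true)
  {
    const std::string::size_type pos = line.find(separator, start);
    if (pos == std::string::npos)
    {
      // No more separators: the rest of the line is the last field
      // (it may be empty).
      fields.push_back(line.substr(start));
      break;
    }
    fields.push_back(line.substr(start, pos - start));
    start = pos + 1;
  }
  return fields;
}
```

For example, SplitLine("2017-02-22;64.330002;64.389999", ';') yields three fields, and empty fields between two adjacent separators are preserved as empty strings.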

Supported Get/Set Datatypes
The internal cell representation in the Document class is using std::string and when other types are requested, standard conversion routines are used. All standard conversions are relatively straight-forward, with the exception of char for which rapidcsv interprets the cell’s (first) byte as a character. The following example illustrates the supported datatypes.

colrowhdr.csv content:

Date,Open,High,Low,Close,Volume,Adj Close
2017-02-24,64.529999,64.800003,64.139999,64.620003,21705200,64.620003
2017-02-23,64.419998,64.730003,64.190002,64.620003,20235200,64.620003
2017-02-22,64.330002,64.389999,64.050003,64.360001,19259700,64.360001
2017-02-21,64.610001,64.949997,64.449997,64.489998,19384900,64.489998
2017-02-17,64.470001,64.690002,64.300003,64.620003,21234600,64.620003

ex006.cpp content:

#include <iostream>
#include <vector>
#include "rapidcsv.h"

int main()
{
  rapidcsv::Document doc("examples/colrowhdr.csv");

  std::cout << doc.GetCell<std::string>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<int>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<long>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<long long>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<unsigned>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<unsigned long>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<unsigned long long>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<float>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<double>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<long double>("Volume", "2017-02-22") << std::endl;
  std::cout << doc.GetCell<char>("Volume", "2017-02-22") << std::endl;
}
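
The difference between the numeric conversions and the char conversion can be sketched with standard library routines alone. The helper names below are hypothetical, and the sketch only assumes that the conversions behave like std::stoll and std::stod and that char takes the first byte, as described above.

```cpp
#include <string>

// Hypothetical helpers illustrating the conversions described above.

// Integer cells are parsed as numbers.
long long CellToLongLong(const std::string& cell)
{
  return std::stoll(cell);
}

// Floating-point cells are parsed as numbers too.
double CellToDouble(const std::string& cell)
{
  return std::stod(cell);
}

// For char, the first byte of the cell is returned as-is instead of
// being parsed; an empty cell yields '\0' in this sketch.
char CellToChar(const std::string& cell)
{
  return cell.empty() ? '\0' : cell[0];
}
```

With the cell text "19259700", CellToLongLong returns the number 19259700 while CellToChar returns the character '1'.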

Custom Data Conversion
One may override conversion routines (or add new ones) by implementing ToVal() and ToStr(). Here is an example overriding int conversion, to instead provide two decimal fixed-point numbers. See tests/test035.cpp for a complete program example.

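#include <cmath>    // std::round
#include <iomanip>  // std::setprecision
#include <sstream>  // std::ostringstream
#include <string>   // std::stof
#include "rapidcsv.h"

namespace rapidcsv
{
  // Parse a decimal string such as "64.62" into hundredths (6462).
  template<>
  void Converter<int>::ToVal(const std::string& pStr, int& pVal) const
  {
    pVal = static_cast<int>(std::round(100.0 * std::stof(pStr)));
  }

  // Format hundredths (6462) back into a two-decimal string ("64.62").
  template<>
  void Converter<int>::ToStr(const int& pVal, std::string& pStr) const
  {
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << static_cast<float>(pVal) / 100.0f;
    pStr = out.str();
  }
}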
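
The fixed-point round trip performed by the two specializations above can be tried out without rapidcsv. The following is a minimal sketch using hypothetical free functions (StrToHundredths and HundredthsToStr are names invented here); it uses double rather than float, which is a choice of this sketch and not of the library.

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical free function: "64.62" -> 6462 (value stored as hundredths).
int StrToHundredths(const std::string& str)
{
  return static_cast<int>(std::lround(100.0 * std::stod(str)));
}

// Hypothetical free function: 6462 -> "64.62" (always two decimals).
std::string HundredthsToStr(int val)
{
  std::ostringstream out;
  out << std::fixed << std::setprecision(2) << static_cast<double>(val) / 100.0;
  return out.str();
}
```

Rounding (rather than truncating) in the string-to-value direction matters, because a product such as 100.0 * 64.62 may land marginally below or above the exact integer.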

Reading CSV Data from a Stream or String
In addition to specifying a filename, rapidcsv supports constructing a Document from a stream and, indirectly through stringstream, from a string. Here is a simple example reading CSV data from a string:

ex007.cpp content:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include "rapidcsv.h"

int main()
{
  const std::string csv =
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2017-02-24,64.529999,64.800003,64.139999,64.620003,21705200,64.620003\n"
    "2017-02-23,64.419998,64.730003,64.190002,64.620003,20235200,64.620003\n"
    "2017-02-22,64.330002,64.389999,64.050003,64.360001,19259700,64.360001\n"
    "2017-02-21,64.610001,64.949997,64.449997,64.489998,19384900,64.489998\n"
    "2017-02-17,64.470001,64.690002,64.300003,64.620003,21234600,64.620003\n"
    ;

  std::stringstream sstream(csv);
  rapidcsv::Document doc(sstream);

  std::vector<float> close = doc.GetColumn<float>("Close");
  std::cout << "Read " << close.size() << " values." << std::endl;

  long long volume = doc.GetCell<long long>("Volume", "2017-02-22");
  std::cout << "Volume " << volume << " on 2017-02-22." << std::endl;
}

Reading a File with Invalid Numbers (e.g. Empty Cells) as Numeric Data
By default rapidcsv throws an exception if one tries to access non-numeric data as a numeric datatype, as it basically propagates the underlying conversion routines’ exceptions to the calling application.

The reason for this is to ensure data correctness. If one wants to be able to read data with invalid numbers as numeric datatypes, one can use ConverterParams to configure the converter to default to a numeric value. The value is configurable; by default it is std::numeric_limits<long double>::signaling_NaN() for float types, and 0 for integer types. Example:

rapidcsv::Document doc("file.csv", rapidcsv::LabelParams(),
                       rapidcsv::SeparatorParams(),
                       rapidcsv::ConverterParams(true));

Check if a Column Exists
Rapidcsv provides the methods GetColumnNames() and GetRowNames() to retrieve the column and row names. To check whether a particular column name exists one can for example do:

rapidcsv::Document doc("file.csv");
std::vector<std::string> columnNames = doc.GetColumnNames();
bool column_A_exists =
  (std::find(columnNames.begin(), columnNames.end(), "A") != columnNames.end());
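
If the check is needed in several places, it can be wrapped in a small function. ContainsName below is a hypothetical helper name; it works on any vector of names, so it applies equally to the result of GetColumnNames() or GetRowNames().

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true if `name` appears in `names`.
bool ContainsName(const std::vector<std::string>& names, const std::string& name)
{
  return std::find(names.begin(), names.end(), name) != names.end();
}
```

Usage would then look like ContainsName(doc.GetColumnNames(), "A").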

UTF-16 and UTF-8
Rapidcsv’s preferred encoding for non-ASCII text is UTF-8. UTF-16 LE and UTF-16 BE can be read and written by rapidcsv on systems where the codecvt header is present. Define HAS_CODECVT before including rapidcsv.h in order to enable the functionality. The rapidcsv unit tests automatically detect the presence of codecvt and set HAS_CODECVT as needed, see CMakeLists.txt for reference. When enabled, the UTF-16 encoding of any loaded file is automatically detected.
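
One common way to detect UTF-16 input is to look for a byte-order mark (BOM) at the start of the data: the bytes FF FE indicate little-endian and FE FF indicate big-endian. The sketch below illustrates that general technique with hypothetical names (Utf16Bom, DetectUtf16Bom); it is an assumption that this resembles what rapidcsv does, and it is not taken from the library's source.

```cpp
#include <string>

enum class Utf16Bom { None, LittleEndian, BigEndian };

// Hypothetical helper: inspect the first two bytes for a UTF-16 BOM.
Utf16Bom DetectUtf16Bom(const std::string& bytes)
{
  if (bytes.size() >= 2)
  {
    const unsigned char b0 = static_cast<unsigned char>(bytes[0]);
    const unsigned char b1 = static_cast<unsigned char>(bytes[1]);
    if (b0 == 0xFF && b1 == 0xFE) return Utf16Bom::LittleEndian;
    if (b0 == 0xFE && b1 == 0xFF) return Utf16Bom::BigEndian;
  }
  // No BOM found: treat the data as UTF-8 / ASCII in this sketch.
  return Utf16Bom::None;
}
```

Data without a BOM cannot be classified this way, which is why the sketch falls back to None rather than guessing.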

API Documentation
The following classes make up the Rapidcsv interface:

class rapidcsv::Document
class rapidcsv::SeparatorParams
class rapidcsv::LabelParams
class rapidcsv::ConverterParams
class rapidcsv::no_converter
class rapidcsv::Converter< T >

Technical Details
Rapidcsv uses cmake for its tests. Commands to build and execute the test suite:

mkdir -p build && cd build && cmake .. && make && ctest -C unit --output-on-failure && ctest -C perf --verbose ; cd -

Rapidcsv uses doxyman2md to generate its API documentation:

doxyman2md src doc

Rapidcsv uses Uncrustify to ensure consistent code formatting:

uncrustify -c uncrustify.cfg --no-backup src/rapidcsv.h

Alternatives
There are many CSV parsers for C++, for example:

CSV Parser
CSVparser
Fast C++ CSV Parser
Vince’s CSV Parser

License
Rapidcsv is distributed under the BSD 3-Clause license. See LICENSE file.

Contributions
Bugs, PRs, etc. are welcome on the GitHub project page: https://github.com/d99kris/rapidcsv

Keywords
c++, c++11, csv parser, comma separated values, single header library.
