Using Perl to convert an Excel Workbook into a MongoDB, with each Worksheet converted into a separate MongoDB collection

A year or so back I was asked to have a play with MongoDB; within half an hour I had downloaded, installed and started the daemon, and had a console window open.

After an hour or two of playing at the command line I had created a database or two, a couple of collections and a number of handcrafted JSON documents, at which point I went in search of a GUI and found RockMongo.

Another half hour of playing and I had a Web-based GUI that's great for ad-hoc queries and admin tasks, but you're still left to handcraft and enter your own JSON documents via a textarea box. At which point I realised that if I was to evaluate the map-reduce functionality or attempt to join data from two collections, let alone identify and evaluate any Business Reporting tools, I would need to start cutting some code and convert an existing data source.

Data wasn't really an issue, as the company's business was data, so it was a simple choice between hooking up to a DB or using one or more of the many Excel reports kicking around, and then picking a language. After a quick surf of the MongoDB site and a read of the Perl tutorial I chose the latter and Perl.

To install the necessary Perl libraries, enter:

cpan YAML Data::Dumper Spreadsheet::ParseExcel Tie::IxHash Encode Scalar::Util  JSON MongoDB MongoDB::OID File::Basename 

After a quick play with both the MongoDB and Spreadsheet::ParseExcel examples and a little bit of thought, I had hacked together a very basic (and slightly naughty: it blindly inserts without checking the status) command line tool that will happily convert an Excel Workbook (XLS, not XLSX) into:

A Database - named after the file

A Series of collections - One per Worksheet present in the  Workbook, and named accordingly

A Series of Documents in each Collection, where each document represents one row from a Worksheet, with its key names taken from the Cell (Column) names in Row 1 of the sheet
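
The row-to-document mapping described above can be sketched without Excel or MongoDB at all. This is only an illustration using in-memory arrays: rows_to_docs and the sample sheet are hypothetical names and data, not part of the original tool.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a sheet (array of rows) into a list of documents,
# taking the key names from the first row.
sub rows_to_docs {
    my ($rows) = @_;
    my ($header, @data) = @$rows;
    my @docs;
    for my $row (@data) {
        my %doc;
        for my $i (0 .. $#$header) {
            my $key = $header->[$i];
            next unless defined $key && length $key;   # skip unnamed columns
            next unless defined $row->[$i];            # skip empty cells
            $doc{$key} = $row->[$i];
        }
        push @docs, \%doc;
    }
    return @docs;
}

# Made-up sample data: row 1 holds the field names.
my @sheet = (
    [ 'name',  'city'   ],
    [ 'Alice', 'London' ],
    [ 'Bob',   undef    ],
);
my @docs = rows_to_docs(\@sheet);
print scalar(@docs), " documents\n";
```

Empty cells are simply left out of the document, which suits a schemaless store: a row only carries the fields it actually has.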

Anyway enough of a rant, some code.

#!/usr/bin/perl -w
# Purpose: Insert each Worksheet, in an Excel Workbook, into a MongoDB database of the same name as the Excel(.xls) file.
#          The worksheet names are mapped to the collection names, and the column names to the document hash labels.
#          Assumes each sheet is named and that the first ROW on each sheet contains the hash(field) names.
#          Written against the MongoDB::Connection API; recent driver releases use MongoDB::MongoClient and insert_one instead.
#
 
use strict;
use Spreadsheet::ParseExcel;
use MongoDB;
use MongoDB::OID;
use Tie::IxHash;
use File::Basename;
 
die "You must provide a filename to $0 to be parsed as an Excel file\n" unless @ARGV;
 
my $sDbName              = basename($ARGV[0]);             # strip any directory part
   $sDbName              =~ s/\.xls$//i;                   # strip only a trailing .xls
my $oExcel               = Spreadsheet::ParseExcel->new;
my $oBook                = $oExcel->parse($ARGV[0]) or die "Unable to parse $ARGV[0]: ", $oExcel->error(), "\n";
my $oConn                = MongoDB::Connection->new(host => 'some.server:27017');
my $oDB                  = $oConn->get_database($sDbName);
my ($sColName, %hNewDoc, $hColToInsertInto, $sFieldName, $oWkS, $oWkC, $oWkH, $iDocs);
 
print "FILE  :", $oBook->{File} , "\n";
print "DB: $sDbName\n";
print "Collection Count :", $oBook->{SheetCount} , "\n";
 
for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++)
{
 $oWkS                   = $oBook->{Worksheet}[$iSheet];
 $sColName               = $oWkS->{Name};
 $hColToInsertInto       = $oDB->get_collection($sColName);
 $iDocs                  = 0;
 print "Collection(WorkSheet name):", $sColName, "\n";
 # Row MinRow holds the field names, so the data starts on the row after it.
 for(my $iR   = ($oWkS->{MinRow} || 0) + 1 ; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ;  $iR++)
 {
  tie ( %hNewDoc, "Tie::IxHash");                          # fresh, insertion-ordered hash per row
  for(my $iC = $oWkS->{MinCol} ; defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++)
  {
   $oWkH                 = $oWkS->{Cells}[$oWkS->{MinRow}][$iC];   # header cell, may be empty
   $sFieldName           = $oWkH ? $oWkH->Value : undef;
   $oWkC                 = $oWkS->{Cells}[$iR][$iC];
   $hNewDoc{$sFieldName} = $oWkC->Value if($oWkC && $sFieldName);
  }
  $hColToInsertInto->insert(\%hNewDoc);
  $iDocs++;
 }
 print "Documents inserted(Rows):", $iDocs, "\n";
}

Change the connection ($oConn) string to suit, and if needed add a user-id and password to the arguments.
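
As a rough sketch of the credentials point, the connection arguments could be assembled from the environment rather than hard-coded. This assumes the username and password attributes of MongoDB::Connection; connection_args and the MONGO_HOST, MONGO_USER and MONGO_PASS variable names are hypothetical.

```perl
use strict;
use warnings;

# Build the argument list for MongoDB::Connection->new.
# MONGO_HOST, MONGO_USER and MONGO_PASS are hypothetical variable names.
sub connection_args {
    my (%env) = @_;
    my %args = (host => $env{MONGO_HOST} || 'localhost:27017');
    # Only pass credentials when both halves are present.
    if (defined $env{MONGO_USER} && defined $env{MONGO_PASS}) {
        $args{username} = $env{MONGO_USER};
        $args{password} = $env{MONGO_PASS};
    }
    return %args;
}

my %args = connection_args(%ENV);
# my $oConn = MongoDB::Connection->new(%args);   # in place of the hard-coded host
print "Connecting to $args{host}\n";
```

Keeping the password out of the script also makes it easier to share the tool.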

If you need XLSX support, a quick switch to Spreadsheet::XLSX is all that's needed. Alternatively, it only takes a few lines of code to detect the filetype and call the appropriate library.
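
One possible shape for that detection is sketched below, keyed purely on the file extension. parser_for and open_workbook are hypothetical helper names, and the sketch assumes Spreadsheet::XLSX is installed if an .xlsx file is actually opened.

```perl
use strict;
use warnings;

# Pick a parser class from the file extension (hypothetical helper).
sub parser_for {
    my ($file) = @_;
    return 'Spreadsheet::XLSX'       if $file =~ /\.xlsx$/i;
    return 'Spreadsheet::ParseExcel' if $file =~ /\.xls$/i;
    die "Unsupported file type: $file\n";
}

# Load the chosen library only when needed and return its workbook object.
sub open_workbook {
    my ($file) = @_;
    my $class = parser_for($file);
    if ($class eq 'Spreadsheet::XLSX') {
        require Spreadsheet::XLSX;
        return Spreadsheet::XLSX->new($file);
    }
    require Spreadsheet::ParseExcel;
    return Spreadsheet::ParseExcel->new->parse($file);
}

print parser_for('report.xlsx'), "\n";
```

If the database name is derived from the filename, the suffix-stripping regexp would need to cover .xlsx as well.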

The above is a simple hack that assumes everything in a cell is a string / scalar. If preserving type is important, a little function with a few regexps and a few if statements can ensure numbers / dates remain in the applicable format when written to the DB.
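
A minimal sketch of such a function follows. coerce_cell is a hypothetical name, and the choice to turn YYYY-MM-DD dates into UTC epoch seconds is an assumption made for illustration; to store a real BSON date the driver would, as far as I know, want a DateTime object instead.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: return a number, an epoch timestamp or the original text.
sub coerce_cell {
    my ($text) = @_;
    return $text unless defined $text;
    # Plain integers and decimals; "007" keeps its leading zeros as a string.
    return $text + 0 if $text =~ /^-?(?:0|[1-9]\d*)(?:\.\d+)?$/;
    # ISO-style dates become epoch seconds (UTC); invalid dates stay as text.
    if ($text =~ /^(\d{4})-(\d{2})-(\d{2})$/) {
        my $epoch = eval { timegm(0, 0, 0, $3, $2 - 1, $1) };
        return $epoch if defined $epoch;
    }
    return $text;
}

print coerce_cell('42') + 1, "\n";
```

In the script it would wrap the cell value at the point of assignment, for example coerce_cell($oWkC->Value). Values with leading zeros, such as phone numbers or account codes, are deliberately left as strings.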

Apparently the command line is scary, so if asked to share, consider wrapping your logic in a CGI script / upload form :)

The script should return some output along the following lines:

arober11@wibble:~/src/perl> ./mongoTST.pl testData.xls
FILE  :testData.xls
DB: testData
Collection Count :1
Collection(WorkSheet name):Sheet1
Documents inserted(Rows):244
arober11@wibble:~/src/perl>

Translated from: https://www.experts-exchange.com/articles/10313/Using-Perl-to-convert-an-Excel-Workbook-into-a-MongoDB-with-each-Worksheet-converted-into-a-separate-MongoDB-collection.html