ruby on excel

Showing posts with label excel. Show all posts
Showing posts with label excel. Show all posts

Wednesday, January 28, 2009

Automating Excel with Ruby: One Cell, Multiple Fonts

As a loyal reader mentions in the comments to my article about Formatting Worksheets, he wanted to apply different formatting to separate words in a single cell. Specifically, he had cells containing a surname, a comma, and then first name, such as:


Jeter, Derek
Matsui, Hideki
Teixeira, Mark

His goal was to format the last name in bold, while leaving the first name in normal weight font:

Jeter, Derek
Matsui, Hideki
Teixeira, Mark

(BTW, the sample names are mine. How many days until Opening Day? Or at least until pitchers and catchers report? But, alas, I digress...)

My suggested solution was to use the Characters object to get a reference to a specific subset of characters within the cell, rather than to all of the cell's text.

Call the Characters(start, length) method on a Range object (usually a cell) to return a reference to the specific characters. Note that the Characters object's index is 1-based.

Once you have your subset of characters you can apply Font formatting much like you would with a Cell object, setting the Font object's Name, Size, Bold, ColorIndex, and other properties.

So, assuming you have a worksheet object, ws, and you want to format the values in cells A1 through A3 as described above, try the following code:

ws.Range('A1:A3').Cells.each do |cell|
# find the index of the first comma:
comma = cell.Value.to_s.index(',')
# get the range of characters before the comma
lastname = cell.Characters(1, comma)
# get the range of characters after the comma
firstname = cell.Characters(comma + 2, cell.Value.to_s.size)
# apply formatting:
lastname.Font.Bold = true
firstname.Font.Bold = false
end

There you have it. Let me know if you have questions or comments.

Thanks for stopping by!

 

Sunday, July 6, 2008

Working with Win32OLE Constants

Excel (and Word, Outlook, etc.) has hundreds of built-in constants that represent numeric values. When reviewing code written in Visual Basic, you may see these constants passed when calling methods or setting property values:


mychart.ChartType = xlColumnClustered

But the above line of code won't work on its own in Ruby, as xlColumnClustered won't be recognized as a constant. So, when translating this code to Ruby, how do you get it to work?

Do-It-Yourself

In my code examples here, I usually either provide the actual value...

mychart.ChartType = 51

...or explicitly assign the value to a constant or variable myself:

xlColumnClustered = 51
mychart.ChartType = xlColumnClustered

In Ruby, constants must begin with an upper-case letter, so ' xlColumnClustered' is really a variable in the last example above. To make it a constant, I should actually name it something like ' XlColumnClustered', with an upper-case X.

How did I know that the Excel constant xlColumnClustered equals 51? Well, I simply googled ' Const xlColumnClustered' and quickly found examples where the constant was being explicitly declared in VB/VBA code ("Const xlColumnClustered = 51"). Googling for ' Excel Constants' will return numerous pages that list all the the Excel constants and their corresponding values. Microsoft provides their own listing here.

Loading the Win32OLE Constants

But you don't have to do it yourself. Ruby's win32ole library allows you to load an object's built-in constants into a class or module. To do so, first create an empty class or module:

class ExcelConst
end

Then call the WIN32OLE.const_load method. Pass this method your previously-defined Excel application object and your new ExcelConst class:

WIN32OLE.const_load(excel, ExcelConst)

This loads the Excel application object's built-in constants into your ExcelConst class, but each constant will now begin with an upper-case letter, as required in Ruby. Now you can call Excel's built-in constants from your new ExcelConst class. So our original example...

mychart.ChartType = xlColumnClustered

...works with only a slight modification, inserting the name of our ExcelConst class and capitalizing the first letter of the constant:

mychart.ChartType = ExcelConst::XlColumnClustered

There you have it. This same method works for loading constants from other win32 application objects, such as Word or Outlook.

By the way, you can review the docs for the WIN32OLE library, including the const_load method, here.

I hope you found this useful. Feel free to post a comment here or email me if you have questions, comments, or suggestions for future articles (or the book).

Thanks for stopping by!

 

Sunday, June 22, 2008

Automating Excel: Creating Charts

They say a picture is worth a thousand words, and sometimes a chart can make your point more effectively than the raw data behind it can. This is especially true when comparing relative values such as monthly revenue data or team statistics.

In our last episode, we looked at how to define titles in existing Excel charts. Sticking with the Excel Charts theme, let's now investigate how to create new charts in Excel.

A chart is a visualization of data, and for this example, the data represents Runs Scored and Runs Allowed for American League Baseball teams, as reported on the mlb.com website. Our Excel worksheet contains a row for each team, with columns for Runs Scored and Runs Allowed:



We'll use the win32ole library for automating Excel, and we'll connect to a running instance of Excel and use the already open 'mlb_stats.xls' workbook:


require 'win32ole'
xl = WIN32OLE.connect('Excel.Application')
wb = xl.Workbooks('mlb_stats.xls')

Let's define some parameter variables that we'll use later:

xlColumns = 2
xlColumnClustered = 51
xlWhite = 2
xlRed = 3
xlBlue = 5
xlGray = 15

To add a new Chart object to the workbook's Charts collection, call the Charts.Add method:

mychart = wb.Charts.Add
mychart.Name = "MLB Scoring"

The Charts.Add method returns a reference to the newly-created Chart object, which we've assigned to the variable mychart.

To delete an existing Chart object, call the Charts( chart).Delete method, where chart is the name or (1-based) index of the chart to delete:

wb.Charts("MLB Scoring").Delete
wb.Charts(1).Delete

Naturally, we can't produce a chart without data, and we use the SetSourceData method to define the source of the data that the chart will represent. This method takes 2 arguments, [1] a range of cells from a worksheet, and [2] a number that indicates whether to plot the graph by rows (1) or by columns (2):

mychart.SetSourceData wb.Worksheets("Runs Scored and Allowed").Range("A1:C15"), xlColumns

There are many different types of charts to choose from, and for our purpose, we'll use a 2-dimensional column [aka vertical bars] chart, setting the ChartType property to 51 (via our previously-defined xlColumnClustered variable):

mychart.ChartType = xlColumnClustered

Now, you could execute your code at this point and create a new chart. But let's tweak the colors a bit. The SeriesCollection object holds each series of data and its related properties. Our chart has 2 series, one for Runs Scored, and one for Runs Allowed. Let's make the Runs Scored columns blue and the Runs Allowed columns red, via the Interior.ColorIndex property:

mychart.SeriesCollection(1).Interior.ColorIndex = xlBlue
mychart.SeriesCollection(2).Interior.ColorIndex = xlRed

We could also dictate the name that is displayed for each series in the legend:

mychart.SeriesCollection(1).Name = "Runs Scored"
mychart.SeriesCollection(2).Name = "Runs Allowed "

If you do not define a name for each series, Excel will try to pull it from your source data worksheet.

The PlotArea represents the area on which the data (columns, in our case) is plotted. The ChartArea is the area surrounding the PlotArea, on which the title, axes, and legend are placed. For demonstration purposes, let's define specific colors for these objects by setting their Interior.ColorIndex property:

mychart.ChartArea.Interior.ColorIndex = xlWhite
mychart.ChartArea.Border.ColorIndex = xlBlue
mychart.PlotArea.Interior.ColorIndex = xlGray
mychart.PlotArea.Border.ColorIndex = xlWhite

Finally, we'll add a title to the top of the chart, and format the text:

mychart.HasTitle = true
mychart.ChartTitle.Characters.Text = "American League - Runs Scored vs. Runs Allowed"
mychart.ChartTitle.Font.Name = 'Verdana'
mychart.ChartTitle.Font.Size = 16
mychart.ChartTitle.Font.Bold = true

Note that we first have to set the HasTitle property to true. Without first doing this, trying to define the various ChartTitle properties will raise an error.

Our complete code looks like this, and produces a chart that looks like this:



Not a bad start, eh? A quick glance at this chart tells you that Boston and Chicago are doing very well, run-wise, and that it could be a very long season for Kansas City and Seattle. And we see that Texas scores a tremendous amount of runs, but allows even more.

I encourage you to investigate the vast array of chart methods and properties. Towards that end, you might want to check out Jon Peltier's Chart tutorials, which I have found to be helpful.

Questions? Comments? Suggestions? Post a comment here or send me an email.

Thanks for stopping by!

 

Sunday, June 15, 2008

Automating Excel: Chart, Axis, and Legend Titles

A reader writes, "I need to use Ruby to automate Excel... How can I set the chart title, axes title, or legend name, etc in the chart by using Ruby?" I confess that my knowledge of Excel charts is limited, but let's dive in and see what we can learn...

As usual, you'll start by using the win32ole library. For our example, we'll connect to a running instance of Excel:


require 'win32ole'
xl = WIN32OLE.connect('Excel.Application')

Referencing a Chart Object

Let's assume that ws is your worksheet object. You can obtain a reference to a Chart object either...

...via the Charts collection, if it is a chart sheet...

chart = ws.Charts(1).Chart

or via the ChartObjects collection, if it is an embedded chart (not a chart sheet)...

chart = ws.ChartObjects(1).Chart

...or via the ActiveChart method if the chart is the currently active object:

chart = xl.ActiveChart

Setting the Chart Title

In order to set the value of a chart's title, you must ensure that the HasTitle property is enabled:

chart.HasTitle = true

Then set the value of the ChartTitle.Text property:

chart.ChartTitle.Text = "2008 Sales"

Setting the Axes Title

Select one of the chart's axes by calling the Chart object's Axes method and passing it an integer representing either the category, value, or series axis:

XL_CATEGORY = 1
XL_VALUE = 2
XL_SERIESAXIS = 3
axis = chart.Axes(XL_CATEGORY)

Much like with the chart title above, in order to set the value of an axis' title, you must ensure that the HasTitle property is enabled:

axis.HasTitle = true

Then set the value of the ChartTitle.Text property:

axis.AxisTitle.Text = "July Sales"

Setting the Legend Names

You can reference a legend via the Chart object's SeriesCollection method, passing it the (1-based) index of the item you wish to reference:

chart.SeriesCollection(1).Name = "Jan"
chart.SeriesCollection(2).Name = "Feb"
chart.SeriesCollection(3).Name = "Mar"

And there you have it. If you found this to be useful, you may want to check out this Microsoft MSDN Chart object reference.

Got a question, comment, or suggestion? Post your comment here or send me an email.

Thanks for stopping by!

 

Sunday, January 27, 2008

RubyGarden Archives: Scripting Excel

Editor's Note: Once upon a time, there was a website named RubyGarden.org, which contained many helpful links and articles. The website has recently dropped off the face of the earth. The following "Scripting Excel" article was salvaged from the Google cache and is provided here in its entirety.



Although the Pickaxe book has some examples of scripting Excel, I still had quite a lot of digging to do for some of the scripts I needed. I got some by looking at the M$ docs (poor as they are**), and some by looking at Perl examples. So this page is for collecting examples of code that others might adapt (and add to).

See also: ScriptingOutlook, ScriptingAccess

o You can download the Office XP Excel Object Reference help file as part of an expanded help bundle here [1]. It's large [19MB], but it includes a bunch of VBA help files that I couldn't track down otherwise. The file you need for Excel items is VBAXL10.CHM. It appears to have complete docs on all the Excel objects. -- ChrisMorris

o A great resource for general Excel and programming related questions (well with VBA) are Chip Pearsons [Excel Pages]. I learned a lot from his article [Cell References In A Range] for example. -- BernhardLeicher

First of all:

require 'win32ole'

Opening spreadsheets, accessing workbooks and worksheets

excel = WIN32OLE::new('excel.Application')
workbook = excel.Workbooks.Open('c:/examples/spreadsheet.xls')
worksheet = workbook.Worksheets(1) #get hold of the first worksheet
worksheet.Select #bring it to the front -need sometimes to run macros,
# not for working with a worksheet from ruby
excel['Visible'] = true #make visible, set to false to make invisible
# again. Don't need it to be visible for script to work

reading data from spreadsheet

worksheet.Range('a12')['Value'] #get value of single cell
data = worksheet.Range('a1:c12')['Value'] #read into 2D array

finding the first empty row (using empty column A)

line = '1'
while worksheet.Range("a#{line}")['Value']
line.succ!
end #line now holds row number of first empty row

or to read as you go

line = '1'
data = []
while worksheet.Range("a#{line}")['Value']
data << worksheet.Range("a#{line}:d#{line}")['Value']
line.succ!
end

writing data into spreadsheet, example

worksheet.Range('e2')['Value'] = Time.now.strftime
'%d/%m/%Y' #single value
worksheet.Range('a5:c5')['Value'] = ['Test', '25', 'result']

loading all Excel constants into a class

class ExcelConst
end
WIN32OLE.const_load(excel, ExcelConst)

Now the constant xlDown is accessible as

ExcelConst::XlDown

To find out what constants to use you can use this script. You run it by passing in a string which is matched against the constant names.

require 'win32ole'

module ExcelConsts
end

excel = WIN32OLE.new("Excel.Application")
WIN32OLE.const_load(excel, ExcelConsts)
excel.quit()

puts 'Matches for: ' + ARGV[0]
ExcelConsts.constants.each {|const|
match = const.match(/#{ARGV[0]}/)
value = eval("ExcelConsts::#{const}")
puts ' '*4 + const + ' => ' + value.to_s unless match.nil?
}

An example would be looking for the constant to center text. I ran

ruby search_excel_consts.rb Center

and the following results came up:

XlCenterAcrossSelection => 7
XlVAlignCenter => -4108
XlCenter => -4108
XlLabelPositionCenter => -4108
XlPhoneticAlignCenter => 2
XlHAlignCetner => -4108
XlHAlignCenterAcrossSelection => 7

calling macros

excel.Run('SortByNumber')

Setting background colour

worksheet.Range('a3:f5').Interior['ColorIndex'] = 36 #pale yellow
# Set background color back to uncoloured (rnicz)
worksheet.Range('a3:f5').Interior['ColorIndex'] = -4142 # XlColorIndexNone constant
# or use Excel constant to set background color back to uncoloured
worksheet.Range('a3:f5').Interior['ColorIndex'] = ExcelConst::XlColorIndexNone

Adding Formulae

emptyRow = 15
worksheet.Range("t#{emptyRow}")['Formula'] = "=(Q#{emptyRow}+L#{emptyRow}+I#{emptyRow}+S#{emptyRow})"

saving changes

workbook.Close(1)
# or
workbook.SaveAs 'myfile.xls'
# default path is the system defined My Documents folder

ending session

excel.Quit

If you're experimenting from within irb and are having problems with processes hanging around after you've called excel.Quit - try deleting the reference to excel and invoking the garbage collector.

excel.Quit
excel = nil
GC.start

Hopefully this is of some use. Please add anything else you have discovered.

Some further stuff that I learned so far...
It partly overlaps with what ChrisMorris already wrote, maybe we can merge it later on. -- BernhardLeicher

Start Excel, create new workbook and save it:

require 'win32ole'
excel = WIN32OLE.new("excel.application")
excel.visible = true # in case you want to see what happens
workbook = excel.workbooks.add
workbook.saveas('c:/examples/spreadsheet1.xls')
workbook.close

Or, suppose Excel is already started and a few Excel files are opened (=workbooks in Excel jargon): Connect to the running instance of Excel, activate one of the workbooks and write something.

This also shows that Excel collections can be iterated very handy with "each", and that collections can sometimes be indexed by number or by name:

excel = WIN32OLE.connect("excel.application")
excel.workbooks.each{|wb|puts wb.name} # loop through workbooks and display names
excel.workbooks(1).activate # activate by number
excel.workbooks("Mappe1").activate # or by name
excel.range("b5").value="soso" # write something to cell B5

Connecting to Excel is particularly good fun when done interactively from irb: You instantly see what happens!

irb(main):001:0> require 'win32ole'
true
irb(main):002:0> excel=WIN32OLE.connect('excel.application')
#
irb(main):003:0> excel.workbooks.each{|wb|puts wb.name}
PERSONL.XLS
Mappe1
Mappe2
Mappe3
nil
irb(main):004:0> excel.workbooks(1).name
"PERSONL.XLS"
irb(main):005:0> excel.workbooks("Mappe1").activate
true
irb(main):006:0> excel.range("b5").value="soso"
nil

Excel => workbook => worksheet => range(cell)

What always bugged me when browsing through examples were the various ways of referring to a particular cell on a particular worksheet in a particular workbook. When Excel is started and "Mappe1" is the currently active workbook and "Tabelle1" is the currently active worksheet, all following statements do the same thing:

excel.workbooks("Mappe1").worksheets("Tabelle1").range("a1").value
excel.worksheets("Tabelle1").range("a1").value
excel.activeworkbook.activesheet.range("a1").value
excel.activesheet.range("a1").value
excel.range("a1").value

My confusion was probably caused by the fact that a lot of properties/methods can be called on "excel" directly and then default to the currently active workbook/worksheet. It's more a matter of taste to specify "activesheet" and "activeworkbook" or not.

And regarding the hierarchy, it seems to be as simple as: When Excel is up and running, it contains 0 (no file opened) or more workbooks (Workbook? Just Excel jargon for an Excel file!), with each workbook/file containing 1 or more worksheets.

Various methods for addressing a cell or range of cells

In the Excel object model there isn't something like a cell object, it's all covered by the "Range" object. A range can represent only one cell or a whole bunch of them (a column, a row, a rectangular block of cells, ....).

Let's assume for the following examples, that "sheet" contains a reference to an Excel worksheet, obtained e.g. by:

require 'win32ole'
excel = WIN32OLE.connect('excel.application') # connect to running instance of Excel
sheet = excel.activesheet

sheet.range(cellname/cell[, cellname/cell])

A range can be obtained by using the worksheet's Range property. A range with only one cell:

sheet.range("a1")

Or a rectangular block of cells (A1 to C3):

sheet.range("a1", "c3")

Same with one argument:

sheet.range("a1:c3")

Whole Column A:

sheet.range("a:a")

Whole Row 3:

sheet.range("3:3")

A range itself has a range property, thus allowing to write:

sheet.range("c3").range("a1").address # >> "$C$3"

Doesn't make much sense, one might note that the second range's address becomes relative to the first range.

sheet.cells(rowindex, columnindex)

This is all wonderful, but shouldn't there be a way of addressing cells by row and column number? The worksheet's Cells property does that: It gets you a range with one cell by specifying the row and column number. The indices are counted from 1, so Cells(1,1) gives you cell A1:

sheet.cells(3,1).address # >> "$C$3"

Combined with range to get range A1:C3:

sheet.range(sheet.cells(1,1), sheet.cells(3,3))

And when applied to another range, row and column index become relative to the first range:

sheet.range("b2").cells(2,2).address # >> "$C$3"

The index can be negative:

sheet.range("c3").cells(-1,-1).address # >> "$A$1"
range.offset(rowoffset, columnoffset)

If you have a range, this can be used to return another range that is y rows and x columns away (or offset) from this one. This time the offsets count from 0:

sheet.range("a1").offset(0,0).address # >> "$A$1"
sheet.range("b5").offset(1,1).address # >> "$C$6"

While offset somehow reminds of the cells function, this might make a difference: If range contains a block of cells, an offset block of cells is returned too:

sheet.range("a1:c3").offset(0,2).address # >> "$C$1:$E$3"

Negative offsets can be specified too:

sheet.range("b2").offset(-1,-1).address # >> "$A$1"

Getting cell values

There isn't only one method for obtaining a cell's value, but at least three of them: text, value, value2. So which should one use, what is the difference between them?

Sidenote:
An Excel cell's content is somewhat relative, what you see isn't necessarily what is actually inside the cell, because a cell's content is displayed according to a specified format. A cell might contain "0.12345", but is displayed as "0.12" or "0.12 DM" or whatever. It might be good to know, that internally a cell's content is either a text or a floating point number. That's it, nothing else.

Just for curiosity:
Dates are represented internally as floating point values too (more details at Chip Pearson's site: HTTP://www.cpearson.com/excel/datetime.htm):³59³ "Excel stores dates and times as a number representing the number of days since 1900-Jan-0, plus a fractional portion of a 24 hour day: ddddd.tttttt . This is called a serial date, or serial date-time."
So if the content is 37936.0 and its format is "date", it's displayed as "11.11.03" or "Nov 2003".

For the following examples let's assume a content of:

A1 => 10.12345 => formatted with 2 decimal digits => 10.12
B1 => 10.12345 => formatted as currency => 10,00 DM
C1 => 11.11.03 => date => 11.11.03

range.text

Text property retrieves the value as it is displayed, as a string. It's readonly, so only for getting values. Because my country settings are "German", floats are displayed with a comma.

sheet.range("a1").text # >> "10,12"
sheet.range("b1").text # >> "10,12 DM"
sheet.range("c1").text # >> "11.11.03"

range.value

This is the result when retrieved with value:

sheet.range("a1").value # >> 10.12345
sheet.range("b1").value # >> "10,1235"
sheet.range("c1").value # >> "2003/11/11 00:00:00"

Retrieves the "internal" value (A1 becomes a float), whereby currency and date are still returned as strings albeit somewhat different than before.

range.value2

According to the Excel documentation value2 behaves just like value, but additionally retrieves currency and dates as doubles (the actual internal content):

sheet.range("a1").value2 # >> 10.12345
sheet.range("b1").value2 # >> 10.12345
sheet.range("c1").value2 # >> 37936.0

Yes, seems to work as advertised.

Setting values

Seems that only "value" is useful here. An integer or float arrives as expected as number in Excel:

sheet.range("a1").value = 1.2345
sheet.range("a1").value = 2

For strings Excel does the same processing that it does when something is interactively entered. It thus depends on how Excel interprets the string:

sheet.range("a1").value = "10.11.2003" # becomes a date
sheet.range("a1").value = "1,2345" # becomes a number (at least with German country settings)

Iterating ranges...

...with each

Ranges can be iterated with each:

sheet.range("a1:a10").each{|cell|puts cell.value}

If the range is a block of cells the iteration goes from left to right, then down one line, and so on:

sheet.range("a1:b5").each{|cell|puts cell.value}

Iterating block of cells by row and output the first cell of each row:

sheet.range("b3:c7").rows.each{|r|puts r.cells(1,1).value}

and by column:

sheet.range("b3:c7").columns.each{|c|puts c.cells(1,1).value}

Result of row.value is an array within an array:

sheet.range("b3:c7").rows.each{|r|puts r.value.inspect} # >> [1.0, 10.0]?

...with activecell

Like moving around an Excel sheet with the cursor. Moving down one cell:

sheet.activecell.offset(1,0).activate

Walking down from the active cell until an empty cell is encountered:

sheet.activecell.offset(1,0).activate while excel.activecell.value

...with an index

range = sheet.range("b3:c7")
noofrows = range.rows.count
(1..noofrows).each{|i|puts range.cells(i,1).value}

Named Ranges

Named ranges make Excel spreadsheets more usable for the end user. To create a named range "myRange":

sheet.names.Add( { 'Name' => 'myRange', 'RefersTo' => 'A2:A216' } )

One problem! This doesn't work. Use a Range object for RefersTo?, not a String:

myRange = sheet.Range( 'A2:A216' )
sheet.names.Add( { 'Name' => 'myRange', 'RefersTo' => myRange } )

How do you use named ranges in ruby? Named ranges are kept in the worksheet as well as the workbook. You may need to check both locations.

Something like the following works for named ranges manually defined by the user:

rangeString = workbook.names( 'Sheet1!myRange' ).Value

1. Remove "=" prefix (e.g. "=Sheet1!$A$2:$A$4")

rangeString = rangeString.slice( 1, rangeString.length - 1 ) if ( rangeString =~ /^=/ ) worksheet.range( rangeString ).value = 'testing...'

Finding Data Regions...

Data don't always start in row 1, column A, the number of columns might not be fixed, and the number of rows is most often variable. There are two handy methods that can help to find that "block of data".

Let's assume, that B3:C7 contain data.

...with CurrentRegion

Given any cell inside the "data block", CurrentRegion?() finds the surrounding contiguous data region and returns its range:

sheet.range("b5").currentregion.address # >> "$B$3:$C$7"

...by "jumping around"

There's a shortcut key in Excel "+", that allows you to jump to the end/start of regions with content. With our example and the cursor in B3, pressing + would jump the cursor to cell B7, pressing that shortcut again would move the cursor to the last line of the spreadsheet to B65536. There's an equivalent method: "End()".

Finding the last row with data:

sheet.range("b3").end(-4121).address # >> "$B$7"

The parameter indicates the direction, the Excel constants are:

xlDown = -4121
xlToLeft = -4159
xlToRight = -4161
xlUp = -4162

Saving to CSV (or other formats)

Note that Excel can show quite a lot of warnings / confirm request. To supress these:

excel.DisplayAlerts = false

Then:

workbook.SaveAs 'myfile.csv', xlCSV

where xlCSV = 6.

Here are some common file formats:

xlCSV=6
xlCSVMac=22
xlCSVMSDOS=24
xlCSVWindows=23
xlCurrentPlatformText=-4158
xlExcel9795=43
xlTextMSDOS=21
xlTextPrinter=36
xlTextWindows=20

See also: ScriptingOutlook, ScriptingAccess

- How about OpenOffice and Ruby scripting? Anything in this area? The examples all seem to rely on

require 'win32ole'

and I believe this will only work on Windows OSes.


 

Wednesday, January 2, 2008

Parsing Spreadsheets with the Roo Gem

I've talked at length about using the win32ole library to automate Microsoft Excel. But there are alternatives for accessing data in Excel spreadsheets -- some of which don't even require Excel to be installed. One of these is the roo gem, which allows you to extract data from Excel, OpenOffice and Google spreadsheets. Roo provides read-only access to Excel and OpenOffice spreadsheets, but both read and write access to Google spreadsheets.

To install the roo gem, including its dependencies, open a console window and enter:


gem install roo -y

Require the roo library in your script:

require 'roo'

To parse an Excel worksheet, we first create an instance of the Excel workbook object by calling the Excel.new method and passing it the path and filename:

xl = Excel.new('C:/my_workbook.xls')

The next step is to define which worksheet in the workbook we will be working with. We do this by setting the Excel.default_sheet value to one of the worksheets in the Excel.sheets array:

xl.default_sheet = xl.sheets[0]

To extract the value from a particular cell, call the cell method, passing it the row number and either the column number or column letter. Examples:

val = xl.cell(3, 5)
val = xl.cell(3, 'E')

The row method returns an array of values from the specified row number, so...

values = xl.row(3)

...returns the values from the third row.

Similarly, the column method returns an array of values from a column:

values = xl.column(5)

Get the full details from the roo homepage, and the roo Rdoc page.

That's all for now. As always, feel free to post a comment here or email me with questions, comments, or suggestions.

Thanks for stopping by!

Digg my article

 

Friday, September 14, 2007

Ruby & Excel: 911 Characters Only, Please

I was writing some code this week to query an SQL Server database and write the results to an Excel workbook.

I was defining a range in an Excel worksheet and inserting data using code something like this:


worksheet.Range("A2").Resize(data.size, data[0].size).Value = data

The above code defines a range starting at cell A2 and extending the width and length of my 2-dimensional array, then inserts my array into that range of cells.

This worked great -- about 95 percent of the time. But occasionally it would bomb. My debugging uncovered the fact that it would always and only bomb if a string exceeding 911 characters was inserted into a cell.

Some quick googling revealed that this is a known Excel 2003 bug, acknowledged by Microsoft. Microsoft's suggested workaround is "don't do that"; don't try to insert more than 911 characters into a cell as part of an array. Gee, thanks.

I was using the method above because it is significantly faster than inserting data one cell at a time. This isn't a big deal for 100 lines x 10 columns of data, but when you're inserting a million cells of data, there's a huge time difference. I could successfully insert the data one cell at a time, but my users might have to take extended coffee breaks.

I could write the data to a tab-delimited, comma-delimited, or XML text file; then open it in Excel and tweak it if necessary. But I hoped to avoid that if possible.

In case you're wondering, I resolved it with a compromise. Rather than insert the data all at once or cell-by-cell, I inserted it row-by-row:

worksheet.Range("A#{r + 2}").Resize(1, row.size).Value = row

This will still raise an exception when attempting to insert more than 911 characters into a cell. But we'll handle those exceptions by inserting that row's data cell-by-cell.

The revised code looks something like this:

data.each_with_index do |row, r|
begin
# try to insert the entire array into the row
worksheet.Range("A#{r + 2}").Resize(1, row.size).Value = row
rescue
# if exception, then insert each cell individually
row.each_with_index do |field, c|
worksheet.Cells(r + 2, c + 1).Value = field
end
end
end

It's not as fast as an all-at-once insert, but much faster than a cell-by-cell insert.

By the way, the '+1' offset for column number is because Ruby arrays have zero-based indexes, while Excel's column indexes are 1-based. And the '+2' offset for row number is for the same reason, plus 1 to skip the first row of the worksheet, which contains the field names.

So, insert data into worksheets as arrays when you can. But don't let the 911-Characters Bug bite.

 

Friday, June 29, 2007

Using Ruby & ADO to Work with Excel Worksheets

In an earlier article, I discussed using ActiveX Data Objects (ADO) to access a Microsoft Access database. A reader commented that ADO can also be used to access data in an Excel worksheet. Here's a brief demonstration...

As usual, we'll use the win32ole library:


require 'win32ole'

Create a new ADODB.Connection object:

connection = WIN32OLE.new('ADODB.Connection')

To open a connection to your Excel workbook, we'll call the Connection object's Open method and pass it a connection string. You can use same the Microsoft Jet driver used for accessing an MS Access database, but we need to append an "Extended Property" to specify that this is an Excel woorkbook:

conn_string = 'Provider=Microsoft.Jet.OLEDB.4.0;'
conn_string << 'Data Source=c:/my_folder/my_workbook.xls;'
conn_string << 'Extended Properties=Excel 8.0;'
connection.Open(conn_string)

Now, we'll create an ADO recordset object:

recordset = WIN32OLE.new('ADODB.Recordset')

When calling the RecordSet object's Open method, pass it your SQL statement and the open connection object. When working with an Excel worksheet as your table, append '$' to the worksheet table name and wrap it in brackets:

recordset.Open("select * from [Sheet1$];", connection)

The Recordset object's GetRows method returns an array of columns (not rows, as you might expect), so we'll use the Ruby array's transpose method to convert it to an array of rows:

data = recordset.GetRows.transpose

Close the Connection object by calling its Close method:

connection.close

There you have it! My thanks to reader Khaoz for the suggestion of using ADO with Excel.

Other articles about working with ADO can be found under the ado label to the right.

As always, feel free to post a comment here or email me with questions, comments, or suggestions.

Thanks for stopping by!


Digg my article

 

Sunday, June 24, 2007

Automating Excel with Ruby: Formatting Worksheets

If you're going to automate the generation of an Excel worksheet, you might as well make it look good. There are a near-infinite number of methods that can be called upon to format rows, columns, ranges, and cells. Let's take a look at some of the most common.

To format the first row of a worksheet as bold-font, set the Font.Bold value:


worksheet.Rows(1).Font.Bold = true

To format numbers in a range, set the NumberFormat value.

To format column D as Currency:

worksheet.Columns(4).NumberFormat = "$###,##0.00"

To format column A as a date (mm/dd/yy):

worksheet.Column("A").NumberFormat = "mm/dd/yy"

Alternatively, you can set the Style value:

worksheet.Column("A").Style = "Currency"
worksheet.Column("B").Style = "Percent"

To set the alignment on a range, set its HorizontalAlignment value:

worksheet.Rows(1).HorizontalAlignment = 2 # Left
worksheet.Columns("A:F").HorizontalAlignment = 4 # Right
worksheet.Cells(3, 5).HorizontalAlignment = -4108 # Center

To auto-fit the width of a column, or the height of a row, call its AutoFit method:

worksheet.Columns.AutoFit
worksheet.Rows(1).AutoFit

To set the width of a column, set its ColumnWidth value:

worksheet.Columns(4).ColumnWidth = 25.0

To set the height of a row, set its RowHeight value:

worksheet.Rows.RowHeight = 15.0

To apply highlighting to a range, set its Interior.ColorIndex value:

worksheet.Rows(10).Interior.ColorIndex = 6 # Yellow

That's all for now, but feel free to post a comment here or send me email with questions, comments, or suggested topics.

Thanks for stopping by!

Digg my article

 

Thursday, April 26, 2007

Ruby & Excel: The InputBox Hack

A ruby-talk (comp.lang.ruby) reader asked how to get user input on Windows without resorting to console input.

While you can use Windows API calls to create a Yes/No message box, there doesn't appear to be a simple API means for creating a dialog for accepting user string input.

As others on the thread have mentioned, you can use one of the GUI toolkits (ie, wxRuby) and, if you have other GUI needs, this is the best option. If, however, the only GUI need you have is for a single input dialog, you could use the InputBox function from Excel, via the win32ole library (assuming Excel is installed):


require 'win32ole'

def get_input(prompt='', title='')
excel = WIN32OLE.new('Excel.Application')
response = excel.InputBox(prompt, title)
excel.Quit
excel = nil
return response
end

response = get_input('My Prompt', 'My Title')

Granted, this is somewhat of a hack if you're loading Excel into memory merely to display an input dialog. On the other hand, it may be simpler than adding an entire GUI library to your project, especially if you'll eventually be wrapping up all those dependencies into a distributed package, a la rubyscript2exe.

 

Wednesday, April 25, 2007

Ruby & Excel: Inserting and Deleting Rows and Columns

A reader has asked how to insert a row into a worksheet.

To insert a row, call the Insert method on the Row object where you wish to insert a new row. So this...


worksheet.Rows(2).Insert

...will move row 2 down one row and insert a new row above it.

You can insert multiple rows by passing the Insert method a colon-separated string, so...

worksheet.Rows('2:5').Insert

...inserts 4 new rows and moves rows 2 to 5 down. Note that this is a string and must therefore be quoted.

Similarly, to insert a new column, call the Insert method on the Column object where you wish to insert a new column. So either of these...

worksheet.Columns(2).Insert
worksheet.Columns('B').Insert

...will move column B/2 over one column (to the right) and insert a new column B.

This...

worksheet.Columns('2:5').Insert
worksheet.Columns('B:E').Insert

...inserts 4 new columns and moves columns 2 to 5 to the right. Note that this is a string and must therefore be quoted.

Deleting Rows and Columns works the same way, using the Delete method...

worksheet.Rows(2).Delete
worksheet.Rows('2:5').Delete
worksheet.Columns(2).Delete
worksheet.Columns('B').Delete
worksheet.Columns('C:E').Delete

Keep in mind that after inserting or deleting rows, your row (or column) indexes change -- when you delete Rows(2), Rows(3) becomes Rows(2), etc.

That's all for now. Got a question or suggestion? Post a comment or send me email.

Thanks for stopping by!

 

Sunday, April 15, 2007

Automating Excel with Ruby: Defining a Range and Inserting Data

As I previously mentioned, calling the Value method on an Excel Range object returns an array of arrays of values. Here's a quick way to get a 2-dimensional array of all values from a worksheet...


data = worksheet.UsedRange.Value

But, as I failed to mention in that article, you can also insert a 2-dimensional array into a Range of cells:

worksheet.Range("A2:G7").Value = data

If you have a lot of data to insert, this will be much faster than inputting the data cell-by-cell. But it is important that the range of cells be the same size (number of rows, number of columns) as your 2d array.

Your Array: Number of Rows

Assuming you have data, a 2-dimensional array, you can determine the number of rows by calling data.size:

rows = data.size


Your Array: Number of Columns

You can determine the number of columns by calling size on one of the data array's rows:

cols = data[0].size


Defining Your Worksheet Range

But to use the Range method, you need to define the column letters (ie, " A2: G7"). You probably already know that you want your starting point to be cell A1 or A2. But how do you programmatically determine the final column letter for an array that is 104 columns wide?

Each cell (or row, or column) object has an Address method that returns its location in letter-number format, so...

worksheet.Cells(5, 3).Address

...returns "$C$5", and we can use that string in our upcoming Range definition. So if you have a range that is 30 columns wide by 100 rows in length, which we want to insert into a Range, starting at cell A1:

first = worksheet.Cells(1,1).Address # => "$A$1"
last = worksheet.Cells(100,30).Address # => "$AD$100"
worksheet.Range("#{first}:#{last}").Value = data

If your worksheet has a header row and you will be inserting your data after that, bump up the row value by 1:

first = worksheet.Cells(2,1).Address # => "$A$2"
last = worksheet.Cells(101,30).Address # => "$AD$101"
worksheet.Range("#{first}:#{last}").Value = data


I hope that some of you find this useful. My thanks goes to Charlie for suggesting the topic.

 

Saturday, April 7, 2007

Ruby & WIN32OLE: Inspecting Objects

So, you know how to create the major WIN32OLE objects that you need, such as Excel's application, workbook, and worksheet objects. But what can you really do with them? Ruby allows you to use Object.methods to get a list of methods that can be called on the Object:


"Baseball".methods

...returns the following array of String methods...

["methods", "instance_eval", "%", "rindex", "map", "<<", "split", "any?",
"dup", "sort", "strip", "size", "instance_variables", "downcase", "min", "gsub!",
"count", "include?", "succ!", "instance_of?", "extend", "downcase!", "intern",
...
"object_id", "length", "entries", "chomp", "=~", "require", "public_methods",
"upcase", "sub!", "squeeze", "__send__", "upcase!", "crypt", "delete!", "equal?",
"freeze", "detect", "zip", "[]", "lstrip!", "center", "[]=", "to_f"]

For easier browsing, you can sort this array by calling the sort method...

"Baseball".methods.sort

This is handy when you need to call a Ruby method that you don't recall the exact name of. Scan the list of methods for the object you're working with.

Call this on an object created through win32ole, and you see this...

excel = WIN32OLE.new('Excel.Application')
excel.methods
["methods", "instance_eval", "dup", "_setproperty", "instance_variables",
"instance_of?", "extend", "eql?", "ole_func_methods", "each", "hash", "id",
"singleton_methods", "setproperty", "taint", "frozen?", "instance_variable_get",
...
"ole_get_methods", "nil?", "untaint", "gem", "send", "ole_obj_help", "display",
"inspect", "clone", "=~", "object_id", "_getproperty", "require",
"public_methods", "__send__", "equal?", "freeze", "ole_put_methods", "[]", "[]="]

Looks like the fairly generic Object methods, doesn't it? What would be helpful, in regards to OLE/COM automation, would be a method that returns the OLE methods of the object. Fortunately, the win32ole library provides just that in Object.ole_methods. This method returns an array of WIN32OLE objects. To get a list of methods for the Excel application object...

excel = WIN32OLE.new('Excel.Application')
excel.ole_methods
=> [QueryInterface, AddRef, Release, GetTypeInfoCount, GetTypeInfo, GetIDsOfNames,
Invoke, Application, Creator, Parent, ActiveCell, ActiveChart, ActiveDialog,
ActiveMenuBar, ActivePrinter, ActivePrinter, ActiveSheet, ActiveWindow,
...
CalculateFull, FindFile, CalculationVersion, ShowWindowsInTaskbar, ShowW
indowsInTaskbar, FeatureInstall, FeatureInstall, GetTypeInfoCount, GetTypeInfo,
GetIDsOfNames, Invoke]

You can't sort this array of WIN32OLE objects as-is, but you can convert each element to a string and then sort the array of strings...

excel.ole_methods.collect!{ |e| e.to_s }.sort
["ActivateMicrosoftApp", "ActiveCell", "ActiveChart", "ActiveDialog",
"ActiveMenuBar", "ActivePrinter", "ActivePrinter", "ActiveSheet", "ActiveWindow",
"ActiveWorkbook", "AddChartAutoFormat", "AddCustomList", "AddIns", "AddRef",
...
"Width", "Width", "WindowState", "WindowState", "Windows", "WindowsForPens",
"Workbooks", "WorksheetFunction", "Worksheets", "_Default", "_Evaluate",
"_FindFile", "_Run2", "_WSFunction", "_Wait"]

Note that a similar method exists, ole_get_methods, which does not return the exact same results as ole_methods. The WIN32OLE documentation does not make clear (to me) the difference between these two methods. If ole_methods does not provide what you are looking for, try ole_get_methods.

A graphical alternative to the above Ruby methods is to use an OLE Object Browser, such as the one included with Excel's Visual Basic Editor. To launch it from Excel or Word, select Tools => Macro => Visual Basic Editor. From the VB Editor, select View => Object Browser. Similar third-party and open source OLE browsers are also available.

Okay, now you've got a list of WIN32OLE methods reference you can do something with. Browse through this list of meaningful method names and you'll be on the road to gaining some valuable insight into all that you can do with a given WIN32OLE object. Now that you have a method name, go to Google and search for it (ie, excel DisplayInfoWindow) and you'll find details on the method/property and how you may use it.

 

Friday, April 6, 2007

Automating Excel with Ruby: Rows, Columns, Ranges, and Cells

An Excel worksheet object contains rows, columns, ranges, and cells. Let's look at some of the most common methods for working with these objects.

The Cells method of the Worksheet object returns a single cell. It is called with the row number and column number...


cell = worksheet.Cells(2, 5)

The Range method (of either the Application or Worksheet object) returns a collection of cells. It is called with a string value of First Cell and Last Cell, separate by a colon. The following code returns a collection of 20 cells in the rectangle from cell A1 to cell D5....

range = worksheet.Range("A1:D5")

The worksheet.UsedRange method returns a collection of cells, from the upper-left-most used cell to the lower-right-most used cell. This is useful, because the used range is usually much smaller then the entire range of cells in a worksheet.

A Worksheet or Range object also contains a (1-based index) collection of Rows and a (1-based index) collection of Columns, which you can iterate over...

for row in worksheet.Rows
# ...code...
end

for column in worksheet.Columns
# ...code...
end

for row in worksheet.Range("A1:D5").Rows
# ...code...
end

for row in worksheet.UsedRange.Rows
# ...code...
end

To get the number of rows or columns in a range, call the range's Count method...

(1..worksheet.UsedRange.Rows.Count).each do |row|
#...code...
end

You can reference a single Row or Column by index. For example, to apply a bold font to the first row of the worksheet...

worksheet.Rows(1).Font.Bold = true

To set the width of column 4 (D) to 25 pixels...

worksheet.Columns(4).ColumnWidth = 25.0

A cell object has numerous child objects and methods. The Value (read/write) method returns a Cell's value as it is stored internally. A cell's Text (read-only) method returns the value as it is currently displayed. For example, if cell B4 has the value "1234.56789" formatted as Currency, the Value and Text methods return different values...

worksheet.Cells(4, 2).Value # => "1234.56789"
worksheet.Cells(4, 2).Text # => "$1234.57"

Calling the Value method on a Range object returns an array of arrays of values. Here's a quick way to get a 2-dimensional array of all values in a worksheet...

data = worksheet.UsedRange.Value

You cannot call the Text method on a Range object.

As previously mentioned, one of the best hands-on ways of learning the various Excel objects and methods/properties is to record a macro in Excel, then review the macro's VBA code and translate it to Ruby.

As always, feel free to post comments or send email if you have questions, comments, or suggestions for future topics.

 

Sunday, April 1, 2007

Automating Excel with Ruby: The Worksheet Object

Yesterday, we looked at the Excel Workbook object. Today, we'll work with the Worksheets collection and Worksheet objects.

The Workbook object contains a (1-based index) collection of all open worksheets. You could think of the workbook.Worksheets() statement as a method which returns the collection of Worksheets (if called with no argument) or a single Worksheet (if called with an index or name argument).

To create a new worksheet, call the Add method of the Worksheets collection:


worksheet = workbook.Worksheets.Add

You can reference a single worksheet by index:

worksheet = workbook.Worksheets(1)

...or by name:

worksheet = workbook.Worksheets('Sheet1')

... or you can reference the currently active (selected) Worksheet:

worksheet = workbook.ActiveSheet

The Worksheets collection is not like a standard Ruby array. It's has a 1-based index, has no Size property, and does not recognize the '-1' index value. But it has a Count property in lieu of the Size property, so to reference the last worksheet in the collection, you could do this:

worksheet = workbook.Worksheets(workbook.Worksheets.Count)

You can, however, iterate over the Worksheets collections:

for worksheet in workbook.Worksheets
# ...code...
end

Note that the Worksheets collections may contain hidden worksheets. To hide or unhide a Worksheet, set its Visible property:

worksheet.Visible = false
workbook.Worksheets(1).Visible = true

You can get or set the name of the worksheet:

worksheet.Name = 'Amish Computer Scientists'

Note that the worksheet name cannot exceed 31 characters, and cannot include these characters: / / ? : * [ ]

To move a worksheet, call its Move method, passing the Worksheet object that you want to move this worksheet in front of. For example, to move 'Sheet1' to the slot before 'Sheet3':

workbook.Worksheets('Sheet1').Move(workbook.Worksheets('Sheet3'))

...or to move the third worksheet to the first slot:

workbook.Worksheets(3).Move(workbook.Worksheets(1))

To copy a worksheet, use the Copy command as you would the Move command:

workbook.Worksheets(3).Copy(workbook.Worksheets(1))

To print a worksheet, call the Worksheet object's PrintOut method:

worksheet.PrintOut

To delete a worksheet, call its Delete method:

worksheet.Delete

This will normally result in an Alert box being displayed, asking you to confirm the deletion. To disable such alerts, set the Excel Application object's DisplayAlerts property:

excel.DisplayAlerts = false

That's all for now. Soon we'll look at rows, columns, and cells. As always, feel free to leave a comment or send me email if there are specific items you would like to see discussed here.

 

Saturday, March 31, 2007

Automating Excel with Ruby: The Workbook Object

Previously, we looked at the Excel Application object. Today, we'll take it down a level and investigate the Workbooks collection and Workbook objects.

The Excel application object contains a (1-based index) collection of all open workbooks. You could think of the excel.Workbooks() statement as a method which returns the collection of Workbooks (if called with no argument) or a single Workbook (if called with an index or name argument).

To create a new workbook, call the Add method of the Workbooks collection:


workbook = excel.Workbooks.Add

To open an existing workbook, call the Open method of the Workbooks collection and pass it the filename of the workbook:

workbook = excel.Workbooks.Open('c:/temp/MyWorkbook.xls')

You can reference a single workbook (that's already open) by index:

workbook = excel.Workbooks(1)

...or by filename:

workbook = excel.Workbooks('c:/temp/MyWorkbook.xls')

... or you can reference the currently active (selected) Workbook:

workbook = excel.ActiveWorkbook

You can, of course, iterate over the Workbooks collection:

for workbook in excel.Workbooks
# ...code...
end

To save a new workbook, call its SaveAs method, supplying a filename:

workbook.SaveAs('c:/temp/YourWorkbook.xls')

To save changes to a previously-saved workbook, just call its Save method:

workbook.Save

To close a workbook after saving changes:

workbook.Close

Next, we'll move down to the next logical level and work with the Excel Worksheet object.

 

Friday, March 30, 2007

Automating Excel with Ruby: The Application Object

Automating Microsoft Excel can involve hundreds of objects, each with its own properties and methods. Let's start by looking at the top-level Application object.

As previously mentioned here, you'll use the win32ole library to create a new instance of the Excel application object:


require 'win32ole'
excel = WIN32OLE.new('Excel.Application')

...or to connect to an existing instance of the Excel application object:

excel = WIN32OLE.connect('Excel.Application')

Note that a new instance of Excel will be not visible by default. To show it, set the application object's Visible property to true:

excel.Visible = true

You may want to first hide Excel until your data input and formatting process is complete, then make it visible. This may speed things up, and prevents the user from interfering with your program (and vice versa).

To make Excel visible, but block user input, set the Application object's Interactive property:

excel.Interactive = false

To turn off screen updating, set the ScreenUpdating object's Interactive property:

excel.ScreenUpdating = false

This can help speed up processes that involve updating a large number of cells. Just don't forget to return ScreenUpdating and Interactive to true when completed.

You may wish to define the number of worksheets included in a new workbook, to ensure you have the exact number of worksheets that you will be using. This can be done by getting or setting the application object's SheetsInNewWorkbook value:

excel.SheetsInNewWorkbook = 3

This value is retained in the application beyond this session. So, if you're including this code in an app to be run on end-users' machines, it may be polite to reset the SheetsInNewWorkbook value to what the user had previously set, after adding your workbook:

number_of_sheets = excel.SheetsInNewWorkbook
excel.SheetsInNewWorkbook = 3
workbook = excel.Workbooks.Add
excel.SheetsInNewWorkbook = number_of_sheets

Actions such as deleting worksheets or overwriting existing files may result in an alert box asking you to confirm the action. To disable such alerts, set the Excel Application object's DisplayAlerts property:

excel.DisplayAlerts = false

To exit Excel, call the application object's Quit method:

excel.Quit

That's all for now, as I try to keep individual posts brief. Next, we'll look at the Workbook object. Feel free to leave a comment or send me email if there are specific items you would like to see discussed here.

Thanks for stopping by!

Digg my article

 

Saturday, March 17, 2007

Automating Excel with Ruby

EXAMPLE: Assuming that data represents a 2-dimensional array that you wish to write to a Microsoft Excel worksheet, here's a code snippet, with comments...


# Require the WIN32OLE library
require 'win32ole'
# Create an instance of the Excel application object
xl = WIN32OLE.new('Excel.Application')
# Make Excel visible
xl.Visible = 1
# Add a new Workbook object
wb = xl.Workbooks.Add
# Get the first Worksheet
ws = wb.Worksheets(1)
# Set the name of the worksheet tab
ws.Name = 'Sample Worksheet'
# For each row in the data set
data.each_with_index do |row, r|
# For each field in the row
row.each_with_index do |field, c|
# Write the data to the Worksheet
ws.Cells(r+1, c+1).Value = field.to_s
end
end
# Save the workbook
wb.SaveAs('c:/temp/workbook.xls')
# Close the workbook
wb.Close
# Quit Excel
xl.Quit

A key to automating Excel is understanding the Excel Object Model, which defines the objects, their properties and methods. Here's a few methods and resources that may help...

Record an Excel macro, then view the resulting VBA code to determine the necessary objects, properties, and methods.

Pick up a copy of ' Writing Excel Macros with VBA' by Steven Roman.

Google for 'Excel Object Model' to find info on the Excel Object Model.

 

Saturday, March 3, 2007

Ruby, Excel, and SQLite (instead of MS Access)

I used to rely on Microsoft Access for my desktop database needs, but have recently converted to SQLite and its associated Ruby library. SQLite is fast, light, and powerful. Combining SQLite with Ruby gives me the ability to quickly handle most data processing tasks.

Example: Let's say you've been given an Excel workbook with multiple worksheets of data, and you need to cross-reference data between worksheets. In many such cases, you'll want to import the data into a real database and use SQL.

Here's a brief, unpolished snippet of code that reads data from an open Excel workbook and creates an SQLite database with a table for each worksheet in the Excel workbook:


require 'win32ole'
require 'sqlite3'

# Connect to a running instance of Excel
xl = WIN32OLE.connect('Excel.Application')
# Get the active workbook
wb = xl.ActiveWorkbook
# Create the SQLite3 database
db = SQLite3::Database.new('excel.db')
# Create a database table for each worksheet
# in the workbook
wb.Worksheets.each do |ws|
# Grab all values from worksheet into a
# 2-dimensional array
data = ws.UsedRange.Value
# Grab first row of data to use as field names
field_names = data.shift
# Create database table using worksheet name and
# field names
db.execute("CREATE TABLE [#{ws.Name}] /
( #{field_names.join(',')} );")
# For each row of data...
data.each do |row|
# ...single-quote all field values...
row.collect! { |f| f = "'" + f.to_s + "'" }
# ...and insert a new record into the
# database table
db.execute("INSERT INTO [#{ws.Name}] VALUES /
( #{row.join(',')} );")
end
end

Now, you've got your data loaded into a real database, quickly and easily, and can get to work on it... via Ruby or the SQLite command line interface.

I would be happy to provide more info on using SQLite3 with Ruby, but why the lucky stiff gives a pretty good jump-start here.

 

Saturday, February 17, 2007

Simple Output to Excel-Compatible Files

Need to output data to an Excel-compatible file -- but don't need to do any formatting of the resulting worksheet? You could simply write the data to a tab-delimited text file and give it the ".xls" filename extension...


# where recordset is a database recordset,
# array of arrays, etc:
xls = open('worksheet.xls', 'w')
recordset.each do |row|
xls.puts(row.join("/t"))
end
xls.close

There are, of course, several alternatives involving CSV, XML, or WIN32OLE. Sometimes I will output the data to a tab-delimited text file, then use WIN32OLE to open the file in Excel, apply formatting, and then save in native Excel format...

EXCEL_FORMAT= -4143
xl = WIN32OLE.new('Excel.Application')
wb = xl.Workbooks.Open('worksheet.xls')
# insert formatting code here...
xl.DisplayAlerts = 0
wb.SaveAs('worksheet.xls', EXCEL_FORMAT)
xl.DisplayAlerts = 1

This is often much faster than writing the data directly to a new Excel workbook via WIN32OLE.

 

 

<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script> <script type="text/javascript"></script> <script src="http://www.blogger.com/widgets/3202355009-widgets.js" type="text/javascript"></script> <script type="text/javascript"></script>

评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值