Memory Management in R: A Few Tips and Tricks

This post discusses a few strategies that I have used to manage memory in R.
Stack Overflow Tips
Stack Overflow has a thread on Memory Management Tricks. I tend to follow these suggestions:
  • .ls.objects(): There's a nice function (.ls.objects()) that lists the objects in the workspace that use the most memory, along with their sizes. It's good for flagging memory-hogging objects that can be deleted.
  • Use scripts: Hadley Wickham suggests recording all R actions as a script and rerunning the script to restore all objects and thus remove temporary objects created in the process of programming the script.
  • Import and Save: Josh Reich mentions the strategy of importing data and then saving these imported objects to disk (see post for details).
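The import-and-save idea can be sketched roughly as follows. This is only a minimal illustration and not necessarily the exact procedure described in the Stack Overflow thread: the file paths and the toy data are hypothetical, and saveRDS()/readRDS() are just one of several ways to store an imported object on disk.

```r
# Hypothetical file paths; temporary files stand in for a real raw data file.
raw_file   <- tempfile(fileext = ".csv")
cache_file <- tempfile(fileext = ".rds")

# Create a small toy CSV so the sketch runs on its own.
write.csv(data.frame(id = 1:5, rt = c(310, 295, 402, 288, 350)),
          raw_file, row.names = FALSE)

# Step 1 (run once): import the raw data and save the imported object to disk.
keys <- read.csv(raw_file)
saveRDS(keys, cache_file)

# Step 2: drop the object from the workspace when it is not needed.
rm(keys)

# Step 3 (later): reload the saved object instead of re-importing the raw file.
keys <- readRDS(cache_file)
```

Reloading the saved object avoids repeating the import step, and any temporary objects created during the import are not carried along.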
Additional Tricks That I Use


Develop code on a subset of the data: I've recently been processing logs of key presses from an experiment on skill acquisition. There are a million records. To speed up testing and developing my code, I extract a subset of the data and write the code against that. Many people use the same approach when fitting models, where a model on the full dataset would take hours to run. The strategy is to build the model on a subset and then run it on the full dataset.
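A minimal sketch of this workflow is shown below. The data frame, its column names, and the summarise_rt() function are all hypothetical stand-ins; the point is only that the same function is developed on a small random subset and then run on the full data.

```r
set.seed(1)  # make the random subset reproducible

# Toy stand-in for the full log of key presses (hypothetical columns).
full_data <- data.frame(
  subject = sample(1:50, 100000, replace = TRUE),
  rt      = rexp(100000, rate = 1 / 300)
)

# Work on a small random subset while writing and debugging the code.
dev_data <- full_data[sample(nrow(full_data), 1000), ]

# Hypothetical analysis step: mean response time per subject.
summarise_rt <- function(d) aggregate(rt ~ subject, data = d, FUN = mean)

dev_result  <- summarise_rt(dev_data)   # fast; use this while developing
full_result <- summarise_rt(full_data)  # run once the code is working
```

A random sample is used here so that the subset resembles the full data; taking the first rows with head() is quicker to write but may not be representative if the records are ordered.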

A tweaked version of .ls.objects:
I slightly tweaked the .ls.objects() function. I find it useful to see the size of objects in terms of megabytes. Thus, when I run into the issue of using too much memory, I'll run this function and see if any of the objects using a lot of memory should be removed from the workspace (optionally saving to disk first).

.ls.objects <- function (pos = 1, pattern, order.by = "Size", decreasing=TRUE, head = TRUE, n = 10) {
  # based on postings by Petr Pikal and David Hinds to the r-help list in 2004
  # modified by: Dirk Eddelbuettel (http://stackoverflow.com/questions/1358003/tricks-to-manage-the-available-memory-in-an-r-session) 
  # I then gave it a few tweaks (show size as megabytes and use defaults that I like)
  # Returns a data frame of the objects and their associated storage needs.
  napply <- function(names, fn) sapply(names, function(x)
          fn(get(x, pos = pos)))
  names <- ls(pos = pos, pattern = pattern)
  obj.class <- napply(names, function(x) as.character(class(x))[1])
  obj.mode <- napply(names, mode)
  obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class)
  obj.size <- napply(names, object.size) / 10^6 # megabytes
  obj.dim <- t(napply(names, function(x)
            as.numeric(dim(x))[1:2]))
  vec <- is.na(obj.dim)[, 1] & (obj.type != "function")
  obj.dim[vec, 1] <- napply(names, length)[vec]
  out <- data.frame(obj.type, obj.size, obj.dim)
  names(out) <- c("Type", "Size", "Rows", "Columns")
  out <- out[order(out[[order.by]], decreasing=decreasing), ]
  if (head)
    out <- head(out, n)
  out
}
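Once a large object has been flagged, the clean-up step might look like the sketch below. It is an illustration under assumptions rather than part of the function above: the object big and the file location are hypothetical, and the size is computed directly with object.size() so that the snippet runs on its own.

```r
# Hypothetical large object standing in for real data (about 8 MB of doubles).
big <- numeric(1e6)

# Check its size in megabytes, the same units that .ls.objects() reports.
size_mb <- as.numeric(object.size(big)) / 10^6

# Optionally save to disk first, then remove it and let R reclaim the memory.
backup_file <- tempfile(fileext = ".rds")  # hypothetical location
saveRDS(big, backup_file)
rm(big)
invisible(gc())

# The object can be restored later if it is needed again.
big <- readRDS(backup_file)
```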