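datapasta is an R package that streamlines copying and pasting data into and out of R. It simplifies data import and conversion, cuts down on manual reformatting, and makes data tidying faster.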
Features
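- Paste Excel, CSV, or other tabular data straight into R code: clipboard contents are converted directly into a data.frame, tibble, vector, and similar structures, with no manual formatting.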
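- Turn R objects back into pasteable text: functions such as `dpasta()` write an existing data.frame, tibble, or vector out as the R code that re-creates it, which is convenient for sharing in reports, documents, or reproducible examples.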
- RStudio addins: convert the clipboard into the chosen format with one click from inside RStudio, speeding up both data entry and export.
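The conversions listed above are also exposed as ordinary functions, which can help when the addins menu is not at hand. Below is a minimal sketch, assuming datapasta is installed; `samples` is a small illustrative data frame built from the first two samples of the example dataset used later, and the column names `accession` and `title` are my own choice. As far as I understand, `dpasta()` prints to the console outside RStudio and inserts at the cursor inside it.

```r
# Illustrative two-row data frame (column names are hypothetical).
samples <- data.frame(
  accession = c("GSM5268284", "GSM5268285"),
  title = c(
    "Patient SC003's normal tissue [N3-6-6]",
    "Patient SC005's normal tissue [N5-6-6]"
  ),
  stringsAsFactors = FALSE
)

# dpasta() writes out the R code that re-creates an existing object.
# The call is guarded so the script still runs if datapasta is missing.
if (requireNamespace("datapasta", quietly = TRUE)) {
  datapasta::dpasta(samples)
}
```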
Walkthrough
1. Import
Example data: GSE173468
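rm(list = ls())  # clear the workspace (optional)
# Install datapasta only if it is not already available
if (!requireNamespace("datapasta", quietly = TRUE)) install.packages("datapasta")
library(datapasta)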
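On the GEO web page for the series, copy the content boxed in red (the sample list of GSM accessions and titles).
In RStudio, go to Tools > Addins > Browse Addins and choose the format in which to paste the data.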
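2. Data preprocessing
Option 1: Paste as data.frame
Note that the column names need to be fixed: the first copied row was treated as the header, so the first sample (GSM5268284) and its title ended up as column names.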
data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
GSM5268284 = c("GSM5268285",
"GSM5268286","GSM5268287",
"GSM5268288","GSM5268289","GSM5268290",
"GSM5268291","GSM5268292",
"GSM5268293","GSM5268294",
"GSM5268295","GSM5268296","GSM5268297",
"GSM5268298","GSM5268299",
"GSM5268300","GSM5268301"),
`Patient.SC003's.normal.tissue.[N3-6-6]` = c("Patient SC005's normal tissue [N5-6-6]",
"Patient SC006's normal tissue [N6-6-6]",
"Patient SC010's tumor tissue [T10-6-6]",
"Patient SC013's tumor tissue [T13-6-6]",
"Patient SC014's tumor tissue [T14-6-6]",
"Patient SC001's tumor tissue [T1-6-6]",
"Patient SC019's tumor tissue [T19-6-6]",
"Patient SC022's tumor tissue [T22_6-6]",
"Patient SC025's tumor tissue [T25_6-6]",
"Patient SC026's tumor tissue [T26_6-6]",
"Patient SC027's tumor tissue [T27_6-6]",
"Patient SC029's metastatic tumor tissue [T29Met_6-6]",
"Patient SC029's primary tumor tissue [T29Primer_6-6]",
"Patient SC003's tumor tissue [T3-6-6]",
"Patient SC005's tumor tissue [T5-6-6]",
"Patient SC006's tumor tissue [T6-6-6]",
"Patient SC008's tumor tissue [T8-6-6]")
)
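One way to repair the header is sketched below in base R. It uses a two-row stand-in for the pasted result; the column names `accession` and `title` are my own choice, not something datapasta or GEO prescribes. The idea is to rebuild the sample that was absorbed into the header, rename the columns, and put that sample back on top.

```r
# Stand-in for the pasted result, trimmed to two rows for brevity.
pasted <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  GSM5268284 = c("GSM5268285", "GSM5268286"),
  `Patient.SC003's.normal.tissue.[N3-6-6]` = c(
    "Patient SC005's normal tissue [N5-6-6]",
    "Patient SC006's normal tissue [N6-6-6]"
  )
)

# The current column names are really the first sample. Rebuild that row,
# restoring the spaces that were replaced by dots in the title.
first_sample <- data.frame(
  accession = names(pasted)[1],
  title = gsub(".", " ", names(pasted)[2], fixed = TRUE),
  stringsAsFactors = FALSE
)

# Give the pasted rows proper names (hypothetical), then put the
# recovered sample back on top.
names(pasted) <- c("accession", "title")
samples <- rbind(first_sample, pasted)
rownames(samples) <- NULL
```

Alternatively, the header can simply be edited by hand in the pasted code, which is often quicker for a one-off table.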
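Option 2: Paste as tribble
Note that the column names need to be fixed here as well. Besides clicking through the Tools menu, you can also trigger the addins with keyboard shortcuts, although these may conflict with shortcuts that are already in use.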
tibble::tribble(
~GSM5268284, ~`Patient.SC003's.normal.tissue.[N3-6-6]`,
"GSM5268285", "Patient SC005's normal tissue [N5-6-6]",
"GSM5268286", "Patient SC006's normal tissue [N6-6-6]",
"GSM5268287", "Patient SC010's tumor tissue [T10-6-6]",
"GSM5268288", "Patient SC013's tumor tissue [T13-6-6]",
"GSM5268289", "Patient SC014's tumor tissue [T14-6-6]",
"GSM5268290", "Patient SC001's tumor tissue [T1-6-6]",
"GSM5268291", "Patient SC019's tumor tissue [T19-6-6]",
"GSM5268292", "Patient SC022's tumor tissue [T22_6-6]",
"GSM5268293", "Patient SC025's tumor tissue [T25_6-6]",
"GSM5268294", "Patient SC026's tumor tissue [T26_6-6]",
"GSM5268295", "Patient SC027's tumor tissue [T27_6-6]",
"GSM5268296", "Patient SC029's metastatic tumor tissue [T29Met_6-6]",
"GSM5268297", "Patient SC029's primary tumor tissue [T29Primer_6-6]",
"GSM5268298", "Patient SC003's tumor tissue [T3-6-6]",
"GSM5268299", "Patient SC005's tumor tissue [T5-6-6]",
"GSM5268300", "Patient SC006's tumor tissue [T6-6-6]",
"GSM5268301", "Patient SC008's tumor tissue [T8-6-6]"
)
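Because a tribble is laid out row by row, the header is easy to fix by hand. The sketch below shows what the corrected call might look like, abbreviated to the first three samples and assuming the tibble package is installed; the column names `accession` and `title` are illustrative choices.

```r
library(tibble)

# Header written by hand; the sample that had been absorbed into the
# header (GSM5268284) is moved back to the first data row.
samples <- tribble(
  ~accession,   ~title,
  "GSM5268284", "Patient SC003's normal tissue [N3-6-6]",
  "GSM5268285", "Patient SC005's normal tissue [N5-6-6]",
  "GSM5268286", "Patient SC006's normal tissue [N6-6-6]"
  # the remaining rows follow exactly as pasted
)
```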
The other paste formats give similar results, and there are demo videos on YouTube as well.
References:
- datapasta documentation: https://milesmcbain.github.io/datapasta/
- 医学和生信笔记: https://mp.weixin.qq.com/s/MapdrpUsqBY0WICxlth6bA
- 生信技能树: https://mp.weixin.qq.com/s/r6VFoAi3szvtg-wCZKrsHw