The R package rmarkdown calls pandoc-1.19.2 to do the conversion:
library(rmarkdown)
render("test.Rmd", "html_document")
I tried it and found that the converted HTML looked poor.
So I decided to write my own R Markdown to HTML converter in Python.
Rmd2htm.py
# -*- coding: utf-8 -*-
import os, sys
import re

# Expect exactly one argument: the .Rmd file to convert.
if len(sys.argv) == 2:
    f1 = sys.argv[1]
else:
    print('usage: Rmd2htm.py file1.Rmd')
    sys.exit(1)
if not os.path.exists(f1):
    print('Error: %s not found\n' % f1)
    sys.exit(1)
# Accept only the .Rmd / .rmd extension.
fn, ext = os.path.splitext(f1)
if ext != '.Rmd' and ext != '.rmd':
    print('Error: %s is not .Rmd' % ext)
    sys.exit(1)
head = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style type="text/css">
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.05);
}
</style>
</head>
<body>
"""
foot = """
</body>
</html>
"""
fp = open(f1, 'r')
f2 = fn +'.htm'
fp2 = open(f2, 'w')
fp2.write(head)
iscode = False
for line in fp:
    aline = line.strip()
    # Fences and chunk content are checked first, so blank lines and "---"
    # inside a chunk are copied verbatim instead of becoming <p> or <hr/>.
    if aline.startswith("```{r"):
        fp2.write("<code><pre>\n")      # start of an R code chunk
        iscode = True
    elif aline.startswith("```"):
        # A bare fence closes an open block, otherwise it opens a new one.
        if iscode:
            fp2.write("</pre></code>\n")
            iscode = False
        else:
            fp2.write("<code><pre>\n")
            iscode = True
    elif iscode:
        fp2.write(line)
    elif len(aline) == 0:
        fp2.write("<p>\n")              # blank line: paragraph break
    elif aline.startswith("---"):
        fp2.write("<hr/>\n")            # horizontal rule
    else:
        # Longer heading prefixes are tested first so "###" is not taken for "#".
        if aline.startswith("######"): aline = "<h6>"+aline[6:]+" </h6>"
        elif aline.startswith("#####"): aline = "<h5>"+aline[5:]+" </h5>"
        elif aline.startswith("####"): aline = "<h4>"+aline[4:]+" </h4>"
        elif aline.startswith("###"): aline = "<h3>"+aline[3:]+" </h3>"
        elif aline.startswith("##"): aline = "<h2>"+aline[2:]+" </h2>"
        elif aline.startswith("#"): aline = "<h1>"+aline[1:]+" </h1>"
        elif aline.startswith("+"): aline = "<li>"+aline[1:]+" </li>"
        elif aline.startswith("-"): aline = "<li>"+aline[1:]+" </li>"
        elif aline.startswith("**"):
            # Bold line: only the first closing "**" is replaced.
            aline = "<strong>"+aline[2:].replace("**","</strong>",1)
            if aline.find("`",0) > -1:
                aline = aline.replace("`r","<code>").replace("`","</code>")
        elif aline.startswith("*"): aline = "<li>"+aline[1:]+" </li>"
        elif aline.find("*",0) > -1:
            # Emphasis: convert the first *...* pair on the line.
            i = aline.find("*",0)
            j = aline.find("*",i+1)
            aline = aline.replace("*","<em>",1)
            if j>i: aline = aline.replace("*","</em>",1)
        else:
            # Inline R code written as `r expr`.
            if aline.find("`",0) > -1:
                aline = aline.replace("`r","<code>").replace("`","</code>")
        # Headings and list items end their own line; other text gets <br>.
        if aline.startswith("<h") or aline.startswith("<li>"): fp2.write(aline+"\n")
        else: fp2.write(aline+"<br>\n")
fp.close()
fp2.write(foot)
fp2.close()
print(f2)
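
The script converts inline code with two chained replace calls, which assumes that every backtick span starts with `r; a span such as `foo` would end up with two closing tags. As a minimal sketch of one alternative (not part of the original script), a single regular expression can handle both forms. The names INLINE_CODE and convert_inline_code are hypothetical, chosen only for this illustration.

```python
import re

# Hypothetical helper, not part of Rmd2htm.py.
# Matches a backtick span and skips an optional leading "r " marker.
INLINE_CODE = re.compile(r"`(?:r\s+)?([^`]+)`")

def convert_inline_code(text):
    """Wrap every backtick span in <code>...</code>."""
    return INLINE_CODE.sub(r"<code>\1</code>", text)

print(convert_inline_code("weight = `r b[[1]]` + `r b[[2]]` * height"))
# -> weight = <code>b[[1]]</code> + <code>b[[2]]</code> * height
print(convert_inline_code("call `render()` here"))
# -> call <code>render()</code> here
```

Unlike the chained replace, this drops the space after the r marker and leaves text without backticks untouched.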
Run it from cmd:
Rmd2htm.py test.Rmd
The test sample comes from [R in Action, 2nd] Chapter 22, on creating dynamic reports with R and Markdown:
# Regression Report
---
```{r echo=FALSE, results='hide'}
n <- nrow(women)
fit <- lm(weight ~ height, data=women)
sfit <- summary(fit)
b <- coefficients(fit)
```
Linear regression was used to model the relationship between
weights and height in a sample of *n* women. The equation
**weight = `r b[[1]]` + `r b[[2]]` * height**
accounted for `r round(sfit$r.squared,2)`% of the variance
in weights. The ANOVA table is given below.
---
```{r echo=FALSE, results='asis'}
library(xtable)
options(xtable.comment=FALSE)
print(xtable(sfit), type="html", html.table.attributes="border=0")
```
---
The regression is plotted in the following figure.
```{r echo=FALSE, fig.width=5, fig.height=4}
library(ggplot2)
ggplot(data=women, aes(x=height, y=weight)) +
geom_point() + geom_smooth(method="lm")
```
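
The chunks in this sample contain characters such as < (in the R assignment n <- nrow(women)), and the script copies chunk lines into the HTML unescaped. The following is a minimal sketch, assuming Python 3 and its standard html module, of how chunk lines could be escaped while keeping the same fence-toggling idea. FENCE and convert_chunks are hypothetical names, and the sketch handles only the fence logic, not headings or lists.

```python
import html

FENCE = "`" * 3  # three backticks, the Markdown code fence

def convert_chunks(lines):
    """Yield HTML lines, escaping everything between code fences."""
    in_code = False
    for line in lines:
        if line.strip().startswith(FENCE):
            # Every fence line toggles the state, whether it opens or closes.
            yield "</pre></code>\n" if in_code else "<code><pre>\n"
            in_code = not in_code
        elif in_code:
            # Escape &, < and > so the browser shows the R code literally.
            yield html.escape(line, quote=False)
        else:
            yield line

sample = [FENCE + "{r echo=FALSE}\n", "n <- nrow(women)\n", FENCE + "\n"]
print("".join(convert_chunks(sample)))
```

Writing the function as a generator keeps it independent of file handling, so it can be fed either an open file or a list of lines.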