Extracting web page content with Node.js

Latest recommended article published on 2024-04-04 10:05:57

huyinhou



Tags: jsdom nodejs windows vs2013

Article link: https://blog.csdn.net/hyhnoproblem/article/details/42252567


Today at work I wanted to use Node.js to extract the list of API functions from http://msdn.microsoft.com/zh-CN/library/windows/desktop/hh802935(v=vs.85).aspx and turn it into a help document.

As it turned out, the work machine has VS2005 installed, and the compile step kept failing while installing jsdom. node-jquery failed in the same way.

In the evening I tried again on my own laptop, which has VS2013 installed, and everything compiled without problems.

Here is the page-scraping code I wrote.

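var jsdom = require("jsdom");

// Note: jsdom.env() is the callback API of older jsdom releases.
jsdom.env("a.html",  // a file system path or a page URL can be used here
	["http://code.jquery.com/jquery.js"],  // script injected into the page so that window.$ exists
	function (errors, window) {
		// Stop if the page or the jQuery script could not be loaded.
		if (errors) {
			console.error(errors);
			return;
		}

		var $ = window.$;
		$("table tr").each(function() {
			// Skip rows that contain no <p> element.
			if ($(this).find("p").length <= 0) {
				return;
			}

			var tds = $(this).children("td");

			// First cell: the text label.
			console.log($(tds[0]).text());

			// Second cell: print the href and text of every link.
			var as = $(tds[1]).find("a");
			as.each(function() {
				console.log($(this).attr("href"), $(this).text());
			});
		});

		// Release the resources held by the window.
		window.close();
	}
);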
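
Since the compile step was the sticking point on the work machine, here is a minimal sketch of the same extraction that uses only plain JavaScript regular expressions, so nothing has to be built. It is not a replacement for a real DOM parser: it assumes simple, well-formed markup with no nested tables and double-quoted href attributes. The names stripTags, extractRows and sample are hypothetical helpers made up for this illustration, and unlike the jsdom version it skips rows that have no td cells at all.

```javascript
// Hypothetical helper (not part of jsdom): removes tags and collapses whitespace.
function stripTags(html) {
	return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: for every <tr> that contains a <p>, returns the text of
// the first <td> and the links found in the second <td>.
// Assumption: no nested tables, and href values are double-quoted.
function extractRows(html) {
	var result = [];
	var rowRe = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
	var rowMatch;
	while ((rowMatch = rowRe.exec(html)) !== null) {
		var rowHtml = rowMatch[1];
		// Same filter as the jsdom version: skip rows without a <p> element.
		if (!/<p\b/i.test(rowHtml)) {
			continue;
		}

		// Collect the inner HTML of each <td> in this row.
		var cells = [];
		var cellRe = /<td\b[^>]*>([\s\S]*?)<\/td>/gi;
		var cellMatch;
		while ((cellMatch = cellRe.exec(rowHtml)) !== null) {
			cells.push(cellMatch[1]);
		}
		if (cells.length === 0) {
			continue;
		}

		// Collect href and text for every <a> in the second cell.
		var links = [];
		var linkRe = /<a\b[^>]*\bhref\s*=\s*"([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi;
		var linkMatch;
		while ((linkMatch = linkRe.exec(cells[1] || "")) !== null) {
			links.push({ href: linkMatch[1], text: stripTags(linkMatch[2]) });
		}

		result.push({ name: stripTags(cells[0]), links: links });
	}
	return result;
}

// Made-up sample markup, only to show the call; real input would be read from a.html.
var sample =
	'<table>' +
	'<tr><th>Category</th><th>Functions</th></tr>' +
	'<tr><td><p>Group A</p></td>' +
	'<td><p><a href="/a1">FuncOne</a> <a href="/a2">FuncTwo</a></p></td></tr>' +
	'</table>';

extractRows(sample).forEach(function (row) {
	console.log(row.name);
	row.links.forEach(function (link) {
		console.log(link.href, link.text);
	});
});
```

For a real page the jsdom version is the more robust choice, because regular expressions break as soon as the markup gets irregular. The sketch is mainly useful on a machine where the native build is not an option.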