Learning Python Webkit DOM Bindings

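Scrapy cannot interpret JavaScript on its own, so it cannot scrape pages whose content is produced by AJAX or JS scripts. Several articles online recommend using Webkit as the Downloader, so I decided to look into it.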
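
To make the problem concrete, here is a minimal sketch that uses the standard library's html.parser as a stand-in for any parser that does not run JS. The page markup and the ItemCounter class are made up for illustration; this is not Scrapy code.

```python
from html.parser import HTMLParser

# Hypothetical page: the list items only exist after the script has run.
PAGE = """
<html><body>
  <ul id="items"></ul>
  <script>
    document.getElementById("items").innerHTML = "<li>first</li><li>second</li>";
  </script>
</body></html>
"""


class ItemCounter(HTMLParser):
    """Counts <li> start tags the way a parser without a JS engine sees them."""

    def __init__(self):
        super().__init__()
        self.li_count = 0
        self.script_chunks = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies arrive as plain text; they are never executed.
        if self._in_script:
            self.script_chunks.append(data)


counter = ItemCounter()
counter.feed(PAGE)
print("li elements seen by the parser:", counter.li_count)
print("markup hidden in script text:", "<li>" in "".join(counter.script_chunks))
```

The list items are present only as text inside the script, which is why a real browser engine such as Webkit is needed to get at them.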

Related: http://www.gnu.org/software/pythonwebkit/

Related: Scraping JS-generated pages with scrapy and webkit (http://blog.mdcsoft.cn/archives/201111/707.html)

The first step is installing python-webkit, so let's see what python-webkit actually is:

The Python Webkit DOM Project makes python a full peer of javascript when
it comes to accessing and manipulating the full features available to
Webkit, such as HTML5.  Everything that can be done with javascript,
such as getElementsByTagName and appendChild, event callbacks through
onclick, timeout callbacks through window.setTimeout, and even AJAX
using XMLHttpRequest, can also be done from python.
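In short:
Python Webkit makes Python a full peer of JavaScript. Anything JavaScript can do against Webkit's DOM can be done directly from Python, for example getElementsByTagName, appendChild, onclick event callbacks, window.setTimeout timeouts, and even AJAX through XMLHttpRequest.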


What is Python-Webkit?

Python-Webkit is a python extension to Webkit to add full, complete access to Webkit's DOM - Document Object Model. On its own, however, Python-Webkit doesn't actually do anything, because it is only through WebkitDFB, WebkitGTK or WebkitQt4 that Webkit "Document Objects" are actually created (and displayed, on-screen). Thus it is necessary to make a small patch to each of PyWebkitGTK and PyWebkitQt4, to "break out" access to the DOM, but for WebkitDFB, as it is very new, has its own c-based python module, included as part of PythonWebkit. Both PyWebkitDFB and PyWebkitGTK have been done, already.
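In short:
Python-Webkit is a Python extension to Webkit that gives complete access to Webkit's DOM. On its own it does nothing, because the document objects are only created and displayed through WebkitDFB, WebkitGTK or WebkitQt4. PyWebkitGTK and PyWebkitQt4 each need a small patch to expose the DOM, while WebkitDFB has its own C-based Python module shipped as part of PythonWebkit.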

WebKitDFB is WebKit on top of DirectFB without using GTK+ or Qt


How is Python-Webkit used?

Here is a simple example of modifying
that script to show the addition of a button, a text node and even event
handling.  To any Web Developer who has used javascript, this should
look incredibly familiar:

    def _button_click_event(self, event):
        print "button click", event

    def _mouse_over_event(self, event):
        print "mouse over", event, event.x, event.y

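    def _view_load_finished_cb(self, view, frame):
        # Callback for the view's load-finished signal; the frame exposes the DOM.
        doc = frame.get_dom_document()
        nodes = doc.getElementsByTagName('body')
        body = nodes.item(0)

        # Build <div><button>hello</button></div> and hook up the click handler.
        d = doc.createElement("div")
        b = doc.createElement("Button")
        b.innerHTML = "hello"
        b.onclick = self._button_click_event
        d.appendChild(b)
        # Append a text node and the new div to <body>.
        txt = doc.createTextNode("hello world")
        body.appendChild(txt)
        body.appendChild(d)
        body.tabIndex = 5
        # Listener-style registration, left commented out in the original:
        #body.addEventListener("mouseover", self._mouse_over_event, False)
        body.onmouseover = self._mouse_over_event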
In short:
The snippet above is a simple example that modifies a script to add a button and a text node and to handle events.

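Code walkthrough:
1. Get the DOM document from the frame
2. Get the body element via getElementsByTagName
3. Create new elements and a text node and append them to the body
4. Attach event handlers (onclick and onmouseover; the addEventListener call is commented out)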
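
The snippet assigns onclick / onmouseover and keeps the addEventListener line commented out. In plain DOM terms the difference is that an on<event> property holds a single handler, while addEventListener keeps a list of them. A toy model in pure Python shows the idea; ToyEventTarget and its dispatch method are hypothetical, not part of pythonwebkit, and capture/bubble phases are not modelled.

```python
class ToyEventTarget:
    """Pure-Python toy model of how a DOM node stores event handlers."""

    def __init__(self):
        self.onclick = None      # single slot: a new assignment replaces the old handler
        self._listeners = {}     # event name -> list of callbacks

    def addEventListener(self, name, callback, use_capture=False):
        # The capture flag is accepted for familiarity and ignored in this toy.
        self._listeners.setdefault(name, []).append(callback)

    def dispatch(self, name, event):
        # Call the property-style handler first, then every registered listener.
        handler = getattr(self, "on" + name, None)
        if handler is not None:
            handler(event)
        for callback in self._listeners.get(name, []):
            callback(event)


calls = []
button = ToyEventTarget()
button.onclick = lambda event: calls.append("first onclick")
button.onclick = lambda event: calls.append("second onclick")  # replaces the first
button.addEventListener("click", lambda event: calls.append("listener A"), False)
button.addEventListener("click", lambda event: calls.append("listener B"), False)
button.dispatch("click", {"x": 1, "y": 2})
print(calls)
```

So the property form is fine when one handler per event is enough, as in the snippet; the listener form is the one to reach for when several callbacks must coexist.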

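Python pulls the DOM out of Webkit and modifies it on the fly, and Webkit then displays the result. My guess is that the scrapy integration works in a similar way.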
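
The take-out-and-modify part can be tried without installing Webkit, since the standard library's xml.dom.minidom offers the same W3C-style calls (getElementsByTagName, createElement, createTextNode, appendChild). The sketch below is only a stand-in under that assumption: the page string and the add_button_and_text helper are made up, minidom has no innerHTML and no event handling, and nothing is rendered.

```python
from xml.dom import minidom

# Stand-in for the page Webkit would have loaded (hypothetical markup).
PAGE = "<html><head><title>demo</title></head><body><p>original</p></body></html>"


def add_button_and_text(doc):
    """Repeat the DOM edits of the walkthrough on an already parsed document."""
    body = doc.getElementsByTagName("body").item(0)   # locate <body>
    div = doc.createElement("div")
    button = doc.createElement("button")
    # minidom has no innerHTML, so the label is added as a text node.
    button.appendChild(doc.createTextNode("hello"))
    div.appendChild(button)
    body.appendChild(doc.createTextNode("hello world"))
    body.appendChild(div)
    return body


doc = minidom.parseString(PAGE)   # plays the role of frame.get_dom_document()
body = add_button_and_text(doc)
print(body.toxml())
```

With pythonwebkit the same edits are applied to the live document, so the browser view updates as well.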





