WebBrowserProgramming - PythonInfo Wiki
Web Browser Programming in Python
TODO: merge in and research these, found on comp.lang.python
Yes, Python can do it... on Windows.
Two example approaches:
ActiveScripting (PythonScript), included in PyWin32
Gestalt (which mixes Python, Ruby and JavaScript, via Silverlight)
This topic covers ways in which Python can be used to control, create or manipulate the content within a user's web browser, or within a web-based technology such as WebKit (the engine behind Safari, Midori, the OLPC Browser, Adobe AIR, Google Chrome and much more), XULRunner (the engine behind Firefox and much more), MSHTML (the engine behind IE and much more) and KDE's KHTMLPart.
To clarify what belongs on this page, here are some examples of the types of technology that can and cannot be added:
Specifically excluded is technology that simply generates static HTML content. An HTML pretty-printer library is out, for example, because the resulting HTML uses the browser only for "display" purposes rather than as an "application execution environment". Such technologies can instead be found at WebClientProgramming.
Plugins for Web Browsers that provide direct access to the DOM model of the web browser. In exactly the same way that most web browsers have JavaScript by default as a language that can directly access the DOM model of the web browser, a plugin or other system that can do the same thing (with Python using to Firefox.
AppCelerator's Titanium provides support for , using IronPython and Silverlight.
IronPython by itself also provides support for .
Firebreath is an NPAPI plugin that extends access to the full features of DOM programming out to other programming languages, including Python.
Python Wrappers around Web "Libraries" and Browser Technology
This section describes projects where you can (or have to) create your own web browser application in Python. It includes web browser "engines" that have Python interfaces to access, control and present web pages and web-relevant rich media content (such as Adobe Flash).
PythonWebKit - PythonWebKit is a Python wrapper around WebKit that provides direct access to the DOM model. PyWebKitGtk has been incorporated into the build, rather than being built separately. Unlike the patched version of PyWebKitGtk, PythonWebKit does not go via GObject to access DOM functions but instead calls the WebKit DOM functions directly.
PyWebKitGtk - PyWebKitGtk is a Python wrapper around WebKit that embeds the WebKit "engine" as a GTK widget. The standard version of PyWebKitGtk does not provide access to the DOM model: it treats WebKit as a hands-off widget that can be used to write your own web browser (see demobrowser.py). However, a patch to WebKit and a corresponding patch to PyWebKitGtk will soon bring DOM model manipulation to Python: see PyjamasDesktop for details.
PyWebkitQt4 - PyWebkitQt4 is also a Python wrapper around WebKit, this time as a Qt4 widget. An extremely limited subset of bindings to the DOM model has been added to PyWebkitQt4, along with the means to execute JavaScript code snippets. Whilst in principle this sounds like a fantastic idea, in practice it is painful to work with, especially for event callbacks and for anything beyond the most basic DOM manipulation. PyWebkitQt4 is best avoided for significant DOM manipulation, and is best treated as nothing more than a means to display HTML (and other web-based media such as Flash and Java applications).
PyKDE - KDE contains Python bindings to KHTMLPart (which is very similar to WebKit). This allows you to embed HTML into an application window. The Python bindings to the DOM model are obtuse, to say the least; PyKHTML makes them much more tolerable (see dom.py). However, there are limitations in PyKDE's DOM bindings (which many people will never encounter) that you should investigate thoroughly before using PyKDE for seriously heavy-duty DOM model manipulation. To avoid those limitations, ensure that the entire KDE platform is compiled with C++ RTTI enabled (most distributions disable it by default).
WebKit with the Objective-C bindings (MacOS X users only). Webkit itself has Objective-C bindings, on MacOS X. MacOS X's Objective-C technology comes with automatic bindings to all major programming languages, including Python (using pyobjc). Consequently, you can directly manipulate the DOM model from Python. However, unlike the use of MSHTML, and unlike XULrunner and the patched version of WebKit, the Objective-C WebKit bindings are limited to just the DOM model, and are limited to strict accordance with the W3C standards (rather than the de-facto standards defined by real-world JavaScript usage). So, for example, XMLHttpRequest is not included in the Objective-C bindings (whereas it is in XULRunner); and the embed element takes width and height strictly as integers, rather than accepting 100px and stripping off px.
HulaHop provides Python access to DOM model manipulation - via XUL/Gecko Interfaces. HulaHop is part of the OLPC Sugar Project, but is available stand-alone. It depends on python-xpcom (part of XULRunner).
PyWin32 comtypes can be used (with care!) to create an MSHTML IWebBrowser2 ActiveX window and thus provide access to the full DOM features of the MSHTML (Trident) engine. PyjamasDesktop uses this technique to create the mshtml.py port. Note that creation and use of XMLHttpRequest is also shown in PyjamasDesktop's mshtml.py.
python-wxWebKit is beginning to provide Python access to DOM model manipulation, via Python bindings that are auto-generated using SWIG. The goal of the project is to provide full access to the entire DOM model; as of May 2011, this goal is approximately 25% complete.
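The WebKit Objective-C entry above notes that those bindings take width and height strictly as integers, rather than accepting 100px and stripping off px. A minimal sketch of how calling code might normalise such values first is given below. The helper name css_px_to_int is hypothetical and is not part of pyobjc, WebKit or any project listed here; it assumes that only plain pixel lengths need to be handled.

```python
import re

# Hypothetical helper for illustration; not part of pyobjc or WebKit.
# Accepts an optional "px" suffix (any case) and surrounding whitespace.
_PX_PATTERN = re.compile(r"^\s*(\d+)\s*(?:px)?\s*$", re.IGNORECASE)


def css_px_to_int(value):
    """Convert 100, "100" or "100px" to the integer 100.

    Raises ValueError for anything that is not a plain pixel length
    (for example "50%" or "2em"), since a strict binding would not
    accept those either.
    """
    if isinstance(value, int):
        return value
    match = _PX_PATTERN.match(str(value))
    if match is None:
        raise ValueError("not a plain pixel length: %r" % (value,))
    return int(match.group(1))
```

Calling css_px_to_int on each width and height value before handing it to a strict binding keeps the JavaScript-style leniency in your own code rather than expecting it from the binding.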
Python Wrappers around Web Browser "Test Suite" Libraries
This section describes projects that let you test web applications from the command line, using Python bindings.
Selenium, the browser test suite, has Python bindings: Install HOWTO. Selenium is a suite of tools to automate web app testing across many platforms.
Windmill is a web testing tool designed to let you painlessly automate and debug your web application. Like Selenium, it also has Python bindings.
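As a rough illustration of driving a browser from Python with Selenium, here is a minimal sketch. It assumes the selenium package and a working Firefox driver are installed. The function names page_title and firefox_page_title are hypothetical, chosen for this example. The Selenium import is deferred so that the helper can also be exercised with a stand-in driver object.

```python
def page_title(driver, url):
    """Load url in the given WebDriver-like object and return the page title.

    Any object with a get(url) method and a title attribute will do,
    which makes the function easy to exercise without a real browser.
    """
    driver.get(url)
    return driver.title


def firefox_page_title(url):
    """Start Firefox via Selenium, fetch the page title, then quit.

    Assumes the selenium package and a Firefox driver are installed.
    """
    # Imported lazily so this module still loads without Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        return page_title(driver, url)
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

With Selenium installed, firefox_page_title("http://www.python.org/") would open the page in Firefox and return its title. A test script would typically compare that value against an expected string.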