Scrapy
hitman.banker
Thinking in Architecture and Art
Site Analysis Note 19
1. Static Resource HTTP Response Header
cache-control: public, max-age=30758400
cf-cache-status: HIT
cf-ray: 1afc29518836124f-HKG
content-encoding: gzip
content-type: text/css
date: Wed, 28 Jan 2015 09:28:
Original · 2015-01-28 17:52:17 · 874 views · 0 comments
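As a rough illustration of how the `cache-control` value in this excerpt can be read, the sketch below splits the header into directives and converts `max-age` from seconds to days. The helper name `parse_cache_control` is hypothetical and not part of the original note; it handles only simple comma-separated directives.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directives.

    Hypothetical helper for illustration; flag directives map to True,
    valued directives (name=value) map to their string value.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


# Header value taken from the excerpt above.
cc = parse_cache_control("public, max-age=30758400")

# max-age is expressed in seconds; 86400 seconds make one day.
max_age_days = int(cc["max-age"]) // 86400
print(cc, max_age_days)
```

With the value from the excerpt, `max-age` works out to 356 days, so the stylesheet is meant to be cached for close to a year.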
One Cause of java.net.SocketTimeoutException: Read timed out
When I tried to fetch a document from a website using jsoup, the call hung for several seconds and then failed with this error:
java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Met
Original · 2015-03-27 16:28:45 · 1640 views · 0 comments
Example for Simple Login
1. Open Chrome and its developer tools.
2. Perform the login in Chrome and record the network traffic.
3. Analyze the HTTP request and HTTP response.
Request Headers
POST /member/xlogin.ph
Original · 2015-03-27 19:05:05 · 701 views · 0 comments
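Once the recorded login request has been analyzed, it can be replayed from a script. The sketch below is only an outline using Python's standard library: the URL `https://example.com/login` and the form field names `username` and `password` are placeholders, not values from the original post, and must be replaced with whatever the recorded request actually shows. The example builds the request without sending it.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Placeholder endpoint; substitute the URL seen in the developer tools.
LOGIN_URL = "https://example.com/login"


def build_login_request(url, username, password):
    """Build a form-encoded POST request that mimics a browser login."""
    # Field names are assumptions; copy the real ones from the recorded form data.
    form = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(url, data=form, headers=headers, method="POST")


def make_session_opener():
    """Return an opener that keeps cookies, so the login session persists."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


request = build_login_request(LOGIN_URL, "alice", "secret")
print(request.get_method(), request.full_url, request.data)
```

To actually log in, the request would be passed to `opener.open(request)` from `make_session_opener()`, and the same opener reused for later pages so the session cookie is sent along.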
Great toolset for Web Scraping
www.freeformatter.com is very handy for web scraping work. At the moment I'm using its XPath tester to validate my XPath expressions.
http://www.freeformatter.com/xpath-tester.html
Original · 2015-12-31 14:58:00 · 442 views · 0 comments
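A similar check can be done locally. The sketch below is an assumption-laden example rather than a replacement for the online tester: it uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath, and the sample markup and expression are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup; real pages are often not valid XML.
SAMPLE = """<html><body>
<div class="post"><a href="/a">First</a></div>
<div class="post"><a href="/b">Second</a></div>
<div class="ad"><a href="/c">Ad</a></div>
</body></html>"""

root = ET.fromstring(SAMPLE)

# Select links inside div elements whose class attribute equals "post".
links = [a.get("href") for a in root.findall(".//div[@class='post']/a")]
print(links)
```

For full XPath 1.0 support, or for HTML that is not well-formed, a library such as lxml is the usual choice.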
Python Scraping Tools
scrapy: application framework for web scraping and crawling
beautifulsoup: library for parsing HTML
mechanize
lxml
Original · 2016-01-05 06:52:01 · 835 views · 0 comments