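Installing CutyCapt on CentOS 5.8

CutyCapt is a tool for capturing screenshots of web pages on Linux. It requires Qt to be installed first. The steps below are for 64-bit CentOS 5.8.

Motivation: some pages are so long that taking screenshots of them by hand is too slow.

For example, capturing the 163 home page by hand takes several screenshots, which is tedious.

I originally tried fetching the page content with curl, but many of the linked images could not be fetched that way.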

 

1. Add the atrpms yum repository

vi /etc/yum.repos.d/atrpms.repo

[atrpms]

baseurl=http://dl.atrpms.net/el$releasever-$basearch/atrpms/testing

enabled=1

gpgcheck=0
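
The repo file above can also be written without opening an editor. The following is a minimal sketch, assuming a POSIX shell; the function name `write_atrpms_repo` and its optional directory argument are illustrative and not part of the original steps.

```shell
#!/bin/sh
# Write the atrpms repo definition into the given directory
# (default: /etc/yum.repos.d). The file content matches the listing above.
# write_atrpms_repo is a hypothetical helper name.
write_atrpms_repo() {
    repo_dir=${1:-/etc/yum.repos.d}
    # The quoted 'EOF' keeps $releasever and $basearch literal, so that
    # yum (not the shell) expands them.
    cat > "$repo_dir/atrpms.repo" <<'EOF'
[atrpms]
baseurl=http://dl.atrpms.net/el$releasever-$basearch/atrpms/testing
enabled=1
gpgcheck=0
EOF
}
```

Calling `write_atrpms_repo` as root with no argument writes /etc/yum.repos.d/atrpms.repo; passing a scratch directory first lets you inspect the result before touching the system.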

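2. Install qt47 and the related packages (download the packages listed below and force-install them; the repository above carries two versions, which easily leads to version conflicts)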

qt47-4.7.2-1_18.el5   

qt47-devel-4.7.2-1_18.el5

qt47-x11-4.7.2-1_18.el5

qt47-webkit-4.7.2-1_18.el5

qt47-webkit-devel-4.7.2-1_18.el5

phonon-backend-gstreamer-4.7.2-1_18.el5


rpm -Uvh --force --nodeps qt47-devel-4.7.2-1_18.el5.x86_64.rpm
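
Only one rpm command is shown above; the other packages are installed the same way. As a rough sketch, the whole list could be installed in one transaction. This assumes every downloaded file follows the same `<name>-<version>.x86_64.rpm` naming as the qt47-devel file and sits in the current directory; the helper names `qt47_rpm_files` and `install_qt47` are made up for illustration.

```shell
#!/bin/sh
# Package names and version copied from the list above.
QT47_VERSION="4.7.2-1_18.el5"
QT47_PACKAGES="qt47 qt47-devel qt47-x11 qt47-webkit qt47-webkit-devel phonon-backend-gstreamer"

# Print the expected RPM file name for each package, one per line.
# Assumption: every file is named <name>-<version>.x86_64.rpm.
qt47_rpm_files() {
    for pkg in $QT47_PACKAGES; do
        printf '%s-%s.x86_64.rpm\n' "$pkg" "$QT47_VERSION"
    done
}

# Install all packages in a single rpm transaction, but only if every
# file is present in the current directory.
install_qt47() {
    for f in $(qt47_rpm_files); do
        [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
    done
    rpm -Uvh --force --nodeps $(qt47_rpm_files)
}
```

Run `install_qt47` as root from the download directory. Note that `--force --nodeps` skips rpm's dependency checks, so a missing file would otherwise only show up later as a build or link error.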


3. Install CutyCapt

Fetch the source with svn
#yum install subversion

svn co https://cutycapt.svn.sourceforge.net/svnroot/cutycapt

mv cutycapt/CutyCapt /usr/local/cutycapt

cd /usr/local/cutycapt/

# Many guides online simply run qmake at this step, but that failed for me because the default qmake on this system belongs to Qt 3

qmake-qt47  
make
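
To avoid picking the Qt 3 qmake by accident, the choice can be scripted. This is a sketch only: the function name `pick_qmake` is hypothetical, and the check on plain `qmake` assumes that a Qt 4 qmake reports a line containing "Qt version 4" when run with `-v`.

```shell
#!/bin/sh
# Print the name of a Qt 4 capable qmake, preferring qmake-qt47.
# pick_qmake is a hypothetical helper name.
pick_qmake() {
    if command -v qmake-qt47 >/dev/null 2>&1; then
        echo qmake-qt47
    elif qmake -v 2>&1 | grep -q 'Qt version 4'; then
        # Assumption: Qt 4's qmake prints "Using Qt version 4.x.y ..."
        echo qmake
    else
        echo "no Qt 4 qmake found" >&2
        return 1
    fi
}

# Usage (not run here): build with whichever qmake was found.
# QMAKE=$(pick_qmake) && "$QMAKE" && make
```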

#* Running make may fail with the following error

# make  

g++ -Wl,-O1 -o CutyCapt CutyCapt.o moc_CutyCapt.o    -L/usr/lib64/qt47 -lQtWebKit -lQtSvg -L/usr/lib64/qt47 -lQtGui -lQtNetwork -lQtCore -lpthread

/usr/lib64/qt47/libQtWebKit.so: undefined reference to `sqlite3_prepare16_v2'

/usr/lib64/qt47/libQtWebKit.so: undefined reference to `sqlite3_column_value'

collect2: ld returned 1 exit status

make: *** [CutyCapt] Error 1

Fix:
Upgrade sqlite to 3.6 in place; do not uninstall and reinstall it.          # On CentOS 6.0 the default is already 3.6
yum update sqlite
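
Before and after the update it can help to confirm which sqlite version is installed. The sketch below compares only the major and minor numbers; `version_at_least` is a hypothetical helper, and the usage lines assume the `sqlite3` command-line tool is installed and prints the version first when run with `-version`.

```shell
#!/bin/sh
# Return 0 if dotted version $1 is at least $2.
# Only the major and minor components are compared.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        if (h[1] + 0 > w[1] + 0) exit 0
        if (h[1] + 0 == w[1] + 0 && h[2] + 0 >= w[2] + 0) exit 0
        exit 1
    }'
}

# Usage (not run here; needs root on the CentOS host):
# have=$(sqlite3 -version)
# version_at_least "$have" 3.6 || yum update sqlite
```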

#* When make finishes, the CutyCapt executable has been built.


4. Runtime environment

# ./CutyCapt --help
CutyCapt: cannot connect to X server

#* Many guides online say to install xvfb-run.sh as well, but it does not actually need to be that complicated:

echo "export DISPLAY=':1.0'" >> /etc/profile
source /etc/profile
vncserver
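
The commands above rely on vncserver ending up on display :1, which to my knowledge is its default only when that display is free. A small sketch that passes the display number explicitly, so it always matches DISPLAY; the helper name `display_number` is hypothetical.

```shell
#!/bin/sh
# Derive the X display number (e.g. ":1") from a DISPLAY value (e.g. ":1.0").
# display_number is a hypothetical helper name.
display_number() {
    printf '%s\n' "$1" | sed 's/^.*\(:[0-9][0-9]*\).*$/\1/'
}

# Usage (not run here): start the VNC server on the display DISPLAY points at.
# export DISPLAY=':1.0'
# vncserver "$(display_number "$DISPLAY")"
```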

 

[root@zhaoyong cutycapt]# ./CutyCapt --help

 ---------------------------------------------------------------------

 Usage: CutyCapt --url=http://www.example.org/ --out=localfile.png           

 ---------------------------------------------------------------------

  --help                         Print this help page and exit               

  --url=<url>                    The URL to capture (http:...|file:...|...)  

  --out=<path>                   The target file (.png|pdf|ps|svg|jpeg|...)  

 --out-format=<f>              Like extension in --out, overrides heuristic

 --min-width=<int>             Minimal width for the image (default: 800)  

 --min-height=<int>            Minimal height for the image (default: 600)

  --max-wait=<ms>                Don't wait more than (default: 90000, inf: 0)

  --delay=<ms>                   After successful load, wait (default: 0)    

 --user-style-path=<path>      Location of user style sheet file, if any   

 --user-style-string=<css>     User style rules specified as text          

 --header=<name>:<value>        request header; repeatable; some can't be set

  --method=<get|post|put>        Specifies the request method (default: get)

 --body-string=<string>        Unencoded request body (default: none)      

 --body-base64=<base64>        Base64-encoded request body (default: none)

  --app-name=<name>              appName used in User-Agent; default is none

 --app-version=<version>       appVers used in User-Agent; default is none

 --user-agent=<string>         Override the User-Agent header Qt would set

 --javascript=<on|off>         JavaScript execution (default: on)          

  --java=<on|off>                Java execution (default: unknown)           

 --plugins=<on|off>            Plugin execution (default: unknown)         

 --private-browsing=<on|off>   Private browsing (default: unknown)         

 --auto-load-images=<on|off>   Automatic image loading (default: on)       

 --js-can-open-windows=<on|off> Script can open windows? (default: unknown)

 --js-can-access-clipboard=<on|off> Script clipboard privs (default: unknown)

 --print-backgrounds=<on|off>  Backgrounds in PDF/PS output (default: off)

 --zoom-factor=<float>         Page zoom factor (default: no zooming)      

 --zoom-text-only=<on|off>     Whether to zoom only the text (default: off)

  --http-proxy=<url>             Address for HTTP proxy server (default: none)

 ---------------------------------------------------------------------

  <f> is svg,ps,pdf,itext,html,rtree,png,jpeg,mng,tiff,gif,bmp,ppm,xbm,xpm    

 ---------------------------------------------------------------------

 http://cutycapt.sf.net - (c) 2003-2010 Bjoern Hoehrmann - bjoern@hoehrmann.de

Install the Chinese font package (so Chinese text renders in the screenshots)

# yum install fonts-chinese

Now you can capture any page you want

[root@zhaoyong ~]# cd /usr/local/cutycapt/

[root@zhaoyong cutycapt]# ./CutyCapt --url=http://www.163.com/ --out=/root/163.jpg     ---> the output path can be anywhere you like

 

Crop the full-page capture down to the first screen

[root@zhaoyong ~]# convert -crop 1024x768+0+0 163.jpg 1632.jpg

Shrink the image

[root@zhaoyong ~]# convert -resize 40%x40% 1632.jpg 1632.jpg
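
The capture, crop and resize steps can be chained in one small script. This is a sketch under stated assumptions, not a tested production script: `capture_thumb`, `run`, `DRY_RUN` and `CUTYCAPT` are names invented here, the CutyCapt path is the install location used earlier, and each step writes to its own file instead of overwriting its input.

```shell
#!/bin/sh
# Path to the CutyCapt binary built earlier (override via the environment).
CUTYCAPT=${CUTYCAPT:-/usr/local/cutycapt/CutyCapt}

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# capture_thumb URL BASE
# Produces BASE.jpg (full page), BASE-crop.jpg (first 1024x768 screen)
# and BASE-thumb.jpg (crop scaled to 40%). Stops at the first failure.
capture_thumb() {
    url=$1
    base=$2
    run "$CUTYCAPT" --url="$url" --out="$base.jpg" &&
    run convert -crop 1024x768+0+0 "$base.jpg" "$base-crop.jpg" &&
    run convert -resize 40%x40% "$base-crop.jpg" "$base-thumb.jpg"
}
```

For example, `capture_thumb http://www.163.com/ /root/163` would run the same three commands as above. Setting `DRY_RUN=1` first prints the commands without executing them, which is handy for checking paths before a long capture.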