I'm a Rogue, So Who Am I Afraid Of? (1)

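First, how it all happened:
I've recently been building a network security module that needs to process network traffic.
I started with the browser and captured its data.
For someone like me who knows nothing about the web, analyzing HTML is real torture.
Fortunately, packet capture tools already exist, so I captured the traffic and analyzed it.
Everything worked. Even data downloaded by the browser could be captured and analyzed. The chunked transfer mode was annoying, but it could be handled too.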
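
For reference, reassembling a chunked body can be sketched roughly like this. It is a minimal sketch, not the module's actual code: the function name decode_chunked is made up for illustration, it assumes the whole body is already in memory, and it ignores trailers.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a body that was sent with Transfer-Encoding: chunked."""
    out = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line holding the size in hex,
        # optionally followed by ";extensions".
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; any trailers after it are ignored here
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(out)
```

A malformed body makes body.index or int raise ValueError, which a real module would need to catch.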


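Finally it worked, and I breathed a long sigh of relief.
Then I suddenly found that some websites could not be processed. Huh, what's going on? Fine, back to capturing packets.
I found Content-Encoding: gzip. Ugh, obviously the body is compressed.
Now that's trouble: to process compressed data I have to decompress it, process it, and then compress it again.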
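
The decompress, process, recompress round trip would look something like the sketch below, using Python's standard gzip module. The names rewrite_gzip_body and transform are hypothetical, and it only covers gzip, not the other codings.

```python
import gzip
from typing import Callable


def rewrite_gzip_body(body: bytes, transform: Callable[[bytes], bytes]) -> bytes:
    """Decompress a gzip-encoded body, apply transform, and compress the result."""
    plain = gzip.decompress(body)
    return gzip.compress(transform(plain))
```

The recompressed body will usually have a different length, so any Content-Length header has to be updated to match.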


Let's look through the protocol spec first:
http://www.w3.org/Protocols/rfc2616/rfc2616.html


5.3 Request Header Fields


The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation.


       request-header = Accept                   ; Section 14.1
                      | Accept-Charset           ; Section 14.2
                      | Accept-Encoding          ; Section 14.3
                      | Accept-Language          ; Section 14.4
                      | Authorization            ; Section 14.8
                      | Expect                   ; Section 14.20
                      | From                     ; Section 14.22
                      | Host                     ; Section 14.23
                      | If-Match                 ; Section 14.24
                      | If-Modified-Since        ; Section 14.25
                      | If-None-Match            ; Section 14.26
                      | If-Range                 ; Section 14.27
                      | If-Unmodified-Since      ; Section 14.28
                      | Max-Forwards             ; Section 14.31
                      | Proxy-Authorization      ; Section 14.34
                      | Range                    ; Section 14.35
                      | Referer                  ; Section 14.36
                      | TE                       ; Section 14.39
                      | User-Agent               ; Section 14.43




14.3 Accept-Encoding


The Accept-Encoding request-header field is similar to Accept, but restricts the content-codings (section 3.5) that are acceptable in the response.


       Accept-Encoding  = "Accept-Encoding" ":"
                          1#( codings [ ";" "q" "=" qvalue ] )
       codings          = ( content-coding | "*" )
Examples of its use are:


       Accept-Encoding: compress, gzip
       Accept-Encoding:
       Accept-Encoding: *
       Accept-Encoding: compress;q=0.5, gzip;q=1.0
       Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
A server tests whether a content-coding is acceptable, according to an Accept-Encoding field, using these rules:


      1. If the content-coding is one of the content-codings listed in
         the Accept-Encoding field, then it is acceptable, unless it is
         accompanied by a qvalue of 0. (As defined in section 3.9, a
         qvalue of 0 means "not acceptable.")
      2. The special "*" symbol in an Accept-Encoding field matches any
         available content-coding not explicitly listed in the header
         field.
      3. If multiple content-codings are acceptable, then the acceptable
         content-coding with the highest non-zero qvalue is preferred.
      4. The "identity" content-coding is always acceptable, unless
         specifically refused because the Accept-Encoding field includes
         "identity;q=0", or because the field includes "*;q=0" and does
         not explicitly include the "identity" content-coding. If the
         Accept-Encoding field-value is empty, then only the "identity"
         encoding is acceptable.
If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.


If no Accept-Encoding field is present in a request, the server MAY assume that the client will accept any content coding. In this case, if "identity" is one of the available content-codings, then the server SHOULD use the "identity" content-coding, unless it has additional information that a different content-coding is meaningful to the client.


      Note: If the request does not include an Accept-Encoding field,
      and if the "identity" content-coding is unavailable, then
      content-codings commonly understood by HTTP/1.0 clients (i.e.,
      "gzip" and "compress") are preferred; some older clients
      improperly display messages sent with other content-codings.  The
      server might also make this decision based on information about
      the particular user-agent or client.
      Note: Most HTTP/1.0 applications do not recognize or obey qvalues
      associated with content-codings. This means that qvalues will not
      work and are not permitted with x-gzip or x-compress.
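
To make the four rules above concrete, here is a rough sketch of how a server might test one content-coding against an Accept-Encoding value. The function names parse_accept_encoding and is_acceptable are made up for this post, and the parsing is deliberately simple (no validation of qvalues).

```python
def parse_accept_encoding(value: str) -> dict:
    """Map each listed coding (lowercased) to its qvalue; the default is 1.0."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip().lower() == "q" and val.strip():
                q = float(val.strip())
        prefs[name.strip().lower()] = q
    return prefs


def is_acceptable(coding: str, value: str) -> bool:
    """Apply rules 1, 2 and 4 from section 14.3 to a single coding."""
    coding = coding.lower()
    prefs = parse_accept_encoding(value)
    if coding in prefs:
        return prefs[coding] > 0        # rule 1: listed, unless q=0
    if coding == "identity":
        return prefs.get("*", 1.0) > 0  # rule 4: refused only by "*;q=0"
    if "*" in prefs:
        return prefs["*"] > 0           # rule 2: "*" matches unlisted codings
    return False
```

Rule 3 (prefer the highest non-zero qvalue) would then be a matter of choosing among the codings for which is_acceptable returns True.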


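Damn, there are several compression algorithms, and every time support for a new one gets added, poor me will have to change my code too...
Forget it, I'll just keep being a rogue.
First try: strip the compression flag (the Accept-Encoding header) from the request, which tells the server that the client's browser does not support compression.
The server is so friendly that it returns uncompressed content. Haha, I'll handle it this way for now.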
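
Stripping the header from an intercepted request could be sketched as follows. This is only an illustration under a few assumptions: the function name strip_accept_encoding is hypothetical, the request is a complete raw HTTP/1.x message held in memory, and header lines are not folded across multiple lines.

```python
def strip_accept_encoding(request: bytes) -> bytes:
    """Drop every Accept-Encoding header line from a raw HTTP request."""
    head, sep, body = request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    kept = [lines[0]]  # request line, e.g. b"GET / HTTP/1.1"
    for line in lines[1:]:
        # Header names are case-insensitive, so compare in lowercase.
        if not line.lower().startswith(b"accept-encoding:"):
            kept.append(line)
    return b"\r\n".join(kept) + sep + body
```

Per the section 14.3 text quoted above, a server MAY still assume any coding is acceptable when the header is missing, so rewriting it to "Accept-Encoding: identity" is arguably the more explicit variant of the same trick.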


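Fine, your friendliness is exactly why I get to be a rogue.
PS: With the compression option removed, the data sent back by the server is much larger, typically around 2-3 times the size. Page loads are noticeably slower...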
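
To check the size difference on your own pages, a quick measurement along these lines works. The helper name compression_ratio and the sample payload are made up here; a synthetic, highly repetitive sample like this one compresses far better than a real page would, so feed it real captured bodies to get a meaningful number.

```python
import gzip


def compression_ratio(payload: bytes) -> float:
    """Return uncompressed size divided by gzip-compressed size."""
    return len(payload) / len(gzip.compress(payload))


# Synthetic stand-in for a captured response body.
sample = b"<html><body>" + b"<p>hello world</p>" * 200 + b"</body></html>"
print(f"{compression_ratio(sample):.1f}x larger without gzip")
```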


But never mind. I'm a rogue, so who am I afraid of!!!