w3af code analysis: the w3af thread pool implementation, debugging environment setup, and a Windows 7 development environment

leonard_wang

Categories: Python scripting language, penetration testing. Tags: web applications, SQL injection, security vulnerabilities

Article link: https://blog.csdn.net/Leonard_wang/article/details/52880936

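Introduction to w3af

w3af is a web application attack and audit framework. The project has more than 130 plugins, covering web crawling, SQL injection, cross-site scripting (XSS), local file inclusion (LFI), remote file inclusion (RFI), and more. Its goal is to provide a framework for finding and exploiting web application vulnerabilities that is easy to use and extend.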

Development environment:

Windows 7

The w3af Windows installer version

Visual Studio 2015 Community + the PTVS plugin (Python Tools for Visual Studio)

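Environment setup:

w3af requires Python 2.6. Once the w3af Windows installer has run, the installation directory already contains Python 2.6, so simply set that as the Python interpreter in VS2015. It is best to use the Python 2.6 that ships with the installer, because it already has several dependency packages installed. For configuring the Python interpreter in VS2015, see: https://github.com/Microsoft/PTVS/wiki/Selecting-and-Installing-Python-Interpreters

Before creating the Python project, add a .py extension to w3af/w3af_console under the w3af installation directory, because debugging starts from this file. Then create a new Python project with the directory containing w3af_console.py as the project root.

In Solution Explorer, right-click w3af_console.py and choose "Start with Debugging". If you can step through the code, the environment is ready for development.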

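Getting started

How w3af's console commands are initialized is not covered in detail here; interested readers can refer to the article linked here. We start from the point where the "start" command is entered in the console to launch a scan. Execution arrives here:

w3afCore.py: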

def start(self):
    '''
    The user interfaces call this method to start the whole scanning
    process.
    This method raises almost every possible exception, so please do your
    error handling!
    '''
    try:
        self._realStart()

Execution then enters self._realStart() and reaches this code:

for url in cf.cf.getData('targets'):
   try:
       #
       #    GET the initial target URLs in order to save them
       #    in a list and use them as our bootstrap URLs
       #
       response = self.uriOpener.GET(url, useCache=True)
       self._fuzzableRequestList += filter(
           get_curr_scope_pages, createFuzzableRequests(response))

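url is the target configured on the command line. After it is taken out, self.uriOpener.GET is called to obtain the HTTP response.
With the HTTP response in hand, the discovery plugins can be called: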

self._fuzzableRequestList = self._discover_and_bruteforce()

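Why fetch an HTTP response first and only then call the discovery plugins? Because the target is only a starting point; the page may contain more links, so the HTTP response has to be passed to discovery as its input.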
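The idea that one seed page feeds discovery, which in turn yields more pages, can be sketched as a breadth-first loop. This is a minimal Python 3 illustration, not w3af's code: FAKE_SITE, fetch_links and discover are hypothetical names, and the "site" is an in-memory dictionary instead of real HTTP traffic.

```python
from collections import deque

# Hypothetical in-memory "site": URL -> links found in that page.
FAKE_SITE = {
    "http://target/": ["http://target/a", "http://target/b"],
    "http://target/a": ["http://target/b", "http://target/c"],
    "http://target/b": [],
    "http://target/c": ["http://target/"],
}


def fetch_links(url):
    """Stand-in for 'GET the page and extract its links'."""
    return FAKE_SITE.get(url, [])


def discover(seed_urls):
    """Breadth-first discovery: every fetched page may add new URLs."""
    seen = set(seed_urls)
    pending = deque(seed_urls)
    found = []
    while pending:
        url = pending.popleft()
        found.append(url)
        for link in fetch_links(url):
            if link not in seen:      # never queue the same URL twice
                seen.add(link)
                pending.append(link)
    return found


pages = discover(["http://target/"])
```

Starting from the single seed, the loop ends up visiting every page reachable from it, which is why the initial response matters as input.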

The call chain that follows looks like this:

self._discover_and_bruteforce() calls self._discover(), which calls self._discoverWorker(), which brings us here:

try:
    # Perform the actual work
    pluginResult = plugin.discover_wrapper( fr )

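plugin.discover_wrapper() is a method of basePlugin, the base class of the discovery plugin classes. It calls its own self.discover(), which, as the author states in a comment, must be overridden by every discovery plugin class, so what actually runs is the self.discover() of the discovery plugin you configured. Here I configured webSpider. Two code blocks in webSpider's self.discover() are worth going through, because understanding them helps with the audit phase later.
self.discover() in webSpider.py:
The first code block: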

response = self._sendMutant( fuzzableRequest, analyze=False )

The second code block:

targs = (ref, fuzzableRequest, originalURL, possibly_broken)
self._tm.startFunction( target=self._verify_reference, args=targs, \
ownerObj=self )

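Let's start with the first one.

The body of self._sendMutant() is shown below. It does two things. First, it sends the HTTP request by calling the corresponding method of the UrlOpenerProxy class; self._urlOpener is an instance of UrlOpenerProxy. Second, it decides whether to call a callback, controlled by the analyze parameter of self._sendMutant(). webSpider passes False here, so no callback is called. The sqli plugin does not pass analyze, so the default value True applies; what happens in that case is covered later.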

def _sendMutant(self, mutant, analyze=True, grepResult=True,
                analyze_callback=None, useCache=True):
    '''
    Sends a mutant to the remote web server.
    '''
    #
    #
    #   IMPORTANT NOTE: If you touch something here, the whole framework may stop working!
    #
    #
    url = mutant.getURI()
    data = mutant.getData()

    # Also add the cookie header; this is needed by the mutantCookie
    headers = mutant.getHeaders()
    cookie = mutant.getCookie()
    if cookie:
        headers['Cookie'] = str(cookie)

    args = ( url, )
    method = mutant.getMethod()

    functor = getattr( self._urlOpener , method )
    # run functor , run !   ( forest gump flash )
    res = functor(*args, data=data, headers=headers,
                  grepResult=grepResult, useCache=useCache)

    if analyze:
        if analyze_callback:
            # The user specified a custom callback for analyzing the sendMutant result
            analyze_callback(mutant, res)
        else:
            # Calling the default callback
            self._analyzeResult(mutant, res)
    return res
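
The two mechanisms in _sendMutant, looking up the HTTP method by name with getattr() and optionally handing the response to a callback, can be isolated in a small Python 3 sketch. FakeOpener, Plugin and send are hypothetical stand-ins, not w3af classes; only the control flow mirrors the listing above.

```python
class FakeOpener:
    """Hypothetical stand-in for the URL opener: one method per HTTP verb."""

    def GET(self, url, data=None):
        return "GET %s" % url

    def POST(self, url, data=None):
        return "POST %s data=%s" % (url, data)


class Plugin:
    def __init__(self, opener):
        self._opener = opener
        self.analyzed = []

    def _analyze_result(self, request, response):
        # Default callback; an audit plugin would override this.
        self.analyzed.append(response)

    def send(self, method, url, data=None, analyze=True,
             analyze_callback=None):
        # Resolve the method by its name, as _sendMutant does.
        functor = getattr(self._opener, method)
        response = functor(url, data=data)
        if analyze:
            if analyze_callback:
                analyze_callback((method, url), response)
            else:
                self._analyze_result((method, url), response)
        return response


plugin = Plugin(FakeOpener())
first = plugin.send("GET", "http://target/", analyze=False)
second = plugin.send("POST", "http://target/", data="id=1")
```

With analyze=False (the webSpider case) nothing is recorded; with the default analyze=True (the sqli case) the response reaches the default callback.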

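The second one is the core of w3af: the thread pool implementation.

self._tm is actually threadManagerObj. It lives in basePlugin.py, is one of the most basic members of the framework, and is imported right when the program starts. As the name suggests, threadManagerObj is an instance of threadManager, defined at the very bottom of threadManager.py. So by the time the program starts, the threadManager instance that manages all threads is already initialized.

Now let's see what happens after self._tm.startFunction() is called: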

class threadManager:
    startFunction():
        self._initPool():
            self._threadPool = ThreadPool(5, 15)

As you can see, a thread pool object self._threadPool is initialized. The initialization (which actually happens in ThreadPoolImplementation) is:

class ThreadPool():
    self.requestsQueue = Queue.Queue()
    self.resultsQueue = Queue.Queue()
    self.workers = []
    self.workRequests = {}
    self.createWorkers(num_workers)

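self.createWorkers() initializes num_workers threads that do the actual work, i.e. instances of the WorkerThread class. WorkerThread's __init__ already calls the thread's start(), so the threads begin running at this point. The run function is where a thread does its work: it takes one item from the requestsQueue queue; if the queue is empty, the thread waits in a while loop; if there is an item, it calls the supplied callback, which is this line in WorkerThread's run function: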

self.resultQueue.put( (request, request.callable(*request.args, **request.kwds)) )

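This line first calls the callback request.callable(). callable comes from the callable parameter of the WorkRequest class, which in turn originates from the target parameter of threadManager's startFunction. The result is then put into resultQueue together with the request, as a tuple.
So where do the items in the request queue that WorkerThread reads from come from? They are added in self._tm.startFunction(). Here is its body, located in threadManager.py.

threadManager.py: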

 def startFunction(self, target, args=(), kwds={}, restrict=True, ownerObj=None):
     if not self._initialized:
         self._initPool()
         self._initialized = True

     if not self._maxThreads and restrict:
         # Just start the function
         if not self.informed:
             om.out.debug('Threading is disabled.' )
             self.informed = True
         target(*args, **kwds)
     else:
         # Assign a job to a thread in the thread pool
         wr = WorkRequest( target, args=args, kwds=kwds, ownerObj=ownerObj )
         self._threadPool.putRequest( wr )
         msg = '[thread manager] Successfully added function to threadpool. Work queue size: '
         msg += str(self._threadPool.requestsQueue.qsize())
         om.out.debug( msg )

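As you can see, the line wr = WorkRequest(...) (line 14 of the listing) creates the wr instance. This is the item that goes into the request queue; the next line puts it there via putRequest.

This is probably getting confusing, so let's recap:

basePlugin.py has an object that manages the thread pool, threadManagerObj, under the name tm. tm owns a thread pool, self._threadPool, and the pool has two queues, requestsQueue and resultsQueue, which hold the pending requests and the processed results respectively. The first call to tm's startFunction() starts the pool's threads (their run functions begin executing); while requestsQueue is empty, they wait in a while loop. Every call to tm's startFunction() also pushes one pending item onto requestsQueue. As soon as there is an item, a thread picks it up and processes it.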
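
The recap can be condensed into a runnable Python 3 sketch. It is a simplified model under stated assumptions, not w3af's implementation: the class and method names are my own, workers block on queue.get() instead of looping with a timeout, and error handling is left out (a job that raises would stall join() here).

```python
import queue
import threading


class WorkerThread(threading.Thread):
    """Takes (func, args) jobs from the request queue, stores results."""

    def __init__(self, requests_queue, results_queue):
        super().__init__(daemon=True)
        self.requests_queue = requests_queue
        self.results_queue = results_queue
        self.start()  # as in the article, the thread starts in __init__

    def run(self):
        while True:
            func, args = self.requests_queue.get()  # blocks while empty
            self.results_queue.put((func.__name__, func(*args)))
            self.requests_queue.task_done()


class ThreadManager:
    def __init__(self, num_workers=5):
        self._num_workers = num_workers
        self._initialized = False

    def _init_pool(self):
        self.requests_queue = queue.Queue()
        self.results_queue = queue.Queue()
        self.workers = [
            WorkerThread(self.requests_queue, self.results_queue)
            for _ in range(self._num_workers)
        ]

    def start_function(self, target, args=()):
        if not self._initialized:   # the pool is created on first use
            self._init_pool()
            self._initialized = True
        self.requests_queue.put((target, args))

    def join(self):
        self.requests_queue.join()  # wait until every job is finished


def square(n):
    return n * n


tm = ThreadManager(num_workers=3)
for n in range(5):
    tm.start_function(square, (n,))
tm.join()
results = sorted(tm.results_queue.get()[1] for _ in range(5))
```

The first start_function call creates the workers; every call, including the first, adds one job to the request queue, and idle workers pick jobs up as they arrive.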

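Note that the ThreadPool class uses the singleton design pattern, meaning there is only one thread pool in the whole running program. This single pool can be used by both discovery and audit plugins. For how this is implemented, look at its relationship with the ThreadPoolImplementation class. As for why a singleton is used, the author's comment says: "If two ThreadPools are created, the whole threading system is crazy…". In short, the singleton pattern effectively turns the ThreadPool instance into a global variable.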
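
One common way to build such a singleton is a thin facade class that forwards everything to a single shared implementation object. The Python 3 sketch below shows that pattern; it is an assumption about the general shape, not a copy of w3af's ThreadPool, and _Implementation and put_request are hypothetical names.

```python
class ThreadPool:
    """Facade: every ThreadPool() shares one hidden implementation."""

    class _Implementation:
        def __init__(self, num_workers):
            self.num_workers = num_workers
            self.requests = []

        def put_request(self, request):
            self.requests.append(request)

    _instance = None  # the single shared implementation object

    def __init__(self, num_workers=5):
        # Only the first construction creates the real pool.
        if ThreadPool._instance is None:
            ThreadPool._instance = ThreadPool._Implementation(num_workers)

    def __getattr__(self, name):
        # Invoked only for attributes missing on the facade itself;
        # forward them to the shared implementation.
        return getattr(ThreadPool._instance, name)


pool_a = ThreadPool(5)
pool_b = ThreadPool(99)   # argument ignored: the pool already exists
pool_a.put_request("job-1")
```

pool_a and pool_b are different facade objects, yet both see the same request list, which is the "global variable" effect described above.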

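Finally, how are the results stored, that is, how is resultsQueue handled?

Judging from the code, the only thing done with it is taking out items that carry an exception and raising them; beyond that the results are not used. I analyzed two plugins: webSpider, a discovery plugin, and sqli, which checks for SQL injection. After calling startFunction, both go on to call self._tm.join( self ), which eventually reaches ThreadPoolImplementation's poll function. As its code comments also indicate, poll is meant to process results, but judging from the arguments passed in, request.callback is None, so the callback call request.callback(request, result) (line 33 of the listing) never executes. The remaining logic takes an item from resultsQueue; if it belongs to the caller, poll raises any exception it carries and deletes the request from workRequests; otherwise it puts the item back into resultsQueue. I searched the whole project, and nothing outside threadPool.py uses resultsQueue.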

def poll(self, block=False, ownerObj=None, joinAll=False):
  """Process any new results in the queue."""
  while 1:
      try:
          # still results pending?
          if not joinAll:
              owned_work_reqs_len = \
                      len([wr for wr in self.workRequests.values() \
                           if id(wr.ownerObj) == id(ownerObj)])
          else:
              owned_work_reqs_len = len(self.workRequests)

          if not owned_work_reqs_len:
              raise NoResultsPending

          if DEBUG:
              msg = 'The object calling poll("%s") still owns %s work' \
              ' requests.' % (ownerObj, owned_work_reqs_len)
              om.out.debug(msg)

          # Are there still workers to process remaining requests?
          elif block and not self.workers:
              raise NoWorkersAvailable

          # Get back a new result from the queue where the workers put
          # their result.
          request, result = self.resultsQueue.get(block=block, timeout=1)

          if id(request.ownerObj) == id(ownerObj) or joinAll:
              try:
                  # and hand them to the callback, if any
                  if request.callback:
                      request.callback(request, result)

                  # Probably a sys.exc_info tuple of the form 
                  # (type, value, traceback) 
                  if type(result) is tuple and len(result) == 3:
                      exc_type, exc_val, tb = result
                      if type(exc_type) == types.TypeType and \
                          issubclass(exc_type, Exception):
                          # Raise here and handle it in the main thread
                          raise exc_type, exc_val, tb
              finally:
                  del self.workRequests[request.requestID]
          else:
              self.resultsQueue.put((request, result))

      except Queue.Empty:
          if DEBUG:
              msg = 'The results Queue is empty, breaking.'
              om.out.debug( msg )
          break
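
The owner filtering and exception re-raising in poll can be reproduced in a compact Python 3 sketch. Request, run_job and this simplified poll are hypothetical; unlike the listing, results owned by someone else are collected and put back only when poll returns, which avoids fetching the same foreign item repeatedly.

```python
import queue
import sys


class Request:
    def __init__(self, request_id, owner):
        self.request_id = request_id
        self.owner = owner


def run_job(request, func, results_queue):
    """What a worker does: store the result, or exc_info on failure."""
    try:
        result = func()
    except Exception:
        result = sys.exc_info()  # (type, value, traceback)
    results_queue.put((request, result))


def poll(results_queue, pending, owner):
    """Consume this owner's results; re-raise any stored exception."""
    not_mine = []
    try:
        while any(r.owner is owner for r in pending.values()):
            request, result = results_queue.get(block=False)
            if request.owner is not owner:
                not_mine.append((request, result))
                continue
            del pending[request.request_id]
            if (isinstance(result, tuple) and len(result) == 3
                    and isinstance(result[0], type)
                    and issubclass(result[0], Exception)):
                # Raise in the calling thread, keeping the traceback.
                raise result[1].with_traceback(result[2])
    except queue.Empty:
        pass
    finally:
        for item in not_mine:  # hand other owners their results back
            results_queue.put(item)


def boom():
    raise ValueError("bad response")


results = queue.Queue()
owner_a, owner_b = object(), object()
pending = {1: Request(1, owner_a), 2: Request(2, owner_b),
           3: Request(3, owner_a)}
run_job(pending[1], lambda: "ok", results)
run_job(pending[2], lambda: "other", results)
run_job(pending[3], boom, results)

try:
    poll(results, pending, owner_a)
    raised = None
except ValueError as exc:
    raised = str(exc)
```

The successful result "ok" is simply discarded, the failure surfaces in the caller as a ValueError, and owner_b's result stays in the queue, matching the observation that poll only cares about exceptions.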

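For webSpider, the results are already saved into self._fuzzableRequests while the WorkerThread processes the request; see the _verify_reference() function in webSpider.py.

For sqli, the function the WorkerThread runs is _sendMutant() from basePlugin.py, which at the end calls self._analyzeResult(mutant, res). This is the function that analyzes the result; it is overridden in sqli.py, where you can see the result being saved to the knowledge base as a vuln object.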

def _analyzeResult( self, mutant, response ):
 '''
 Analyze results of the _sendMutant method.
 '''
 #
 #   Only one thread at the time can enter here. This is because I want to report each
 #   vulnerability only once, and by only adding the "if self._hasNoBug" statement, that
 #   could not be done.
 #
 with self._plugin_lock:

     #
     #   I will only report the vulnerability once.
     #
     if self._hasNoBug( 'sqli' , 'sqli' , mutant.getURL() , mutant.getVar() ):

         sql_error_list = self._findsql_error( response )
         for sql_regex, sql_error_string, dbms_type in sql_error_list:
             if not sql_regex.search( mutant.getOriginalResponseBody() ):
                 # Create the vuln,
                 v = vuln.vuln( mutant )
                 v.setPluginName(self.getName())
                 v.setId( response.id )
                 v.setName( 'SQL injection vulnerability' )
                 v.setSeverity(severity.HIGH)
                 v.addToHighlight( sql_error_string )
                 v['error'] = sql_error_string
                 v['db'] = dbms_type
                 v.setDesc( 'SQL injection in a '+ v['db'] +' was found at: ' + mutant.foundAt() )
                 kb.kb.append( self, 'sqli', v )
                 break
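
The "report each vulnerability only once" comment relies on doing the check and the append under the same lock. A minimal Python 3 sketch of that idea follows; Auditor, findings and the matched string are hypothetical stand-ins for the plugin, the knowledge base and the SQL error regexes.

```python
import threading


class Auditor:
    def __init__(self):
        self._lock = threading.Lock()
        self.findings = []  # stand-in for the knowledge base

    def _has_no_bug(self, url, var):
        return (url, var) not in self.findings

    def analyze(self, url, var, body):
        # Check and append under one lock, so two threads cannot both
        # pass the check before either of them records the finding.
        with self._lock:
            if self._has_no_bug(url, var) and "SQL syntax" in body:
                self.findings.append((url, var))


auditor = Auditor()
threads = [
    threading.Thread(
        target=auditor.analyze,
        args=("http://target/item", "id",
              "You have an error in your SQL syntax"),
    )
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight worker threads see the same error page, but only one finding is recorded for the (URL, parameter) pair.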

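Summary

Both discovery plugins and audit plugins call threadManager's startFunction to put the thread pool to work; the function the threads run is given by startFunction's target parameter. They then call threadManager's join, which ends up in the pool's poll function to fetch the threads' results. As far as I can tell, the results themselves are not processed: poll only raises the exceptions found among them and puts results that belong to other callers back into resultsQueue. For the webSpider and sqli plugins, the results are already saved inside the function being executed. This is as far as I have read; other details remain to be studied.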
