Perl Underground 5

分享一下我老师大神的人工智能教程!零基础,通俗易懂!http://blog.csdn.net/jiangjunshow

也欢迎大家转载本篇文章。分享知识,造福人民,实现我们中华民族伟大复兴!

                 
                     $$$$$$$$$   $$$$$$$$$$$     $$$$$$$$$     $$$$            %%%%%%%%        X  x        $$$$$$$$$$$   $$$$$$$$$$$   $$$$$$$$$$$    $$$$           %%%%%%%%  x     H  H        $$$$    $$$$         $$$$   $$$$    $$$$   $$$$          %%   H    H  H    x   $$$$    $$$$         $$$$   $$$$    $$$$   $$$$         %%    H   H  H   H    $$$$    $$$$     $$$$$$$    $$$$    $$$$   $$$$         %%%%%     H  H  H  H     $$$$$$$$$$$      $$$$$$$    $$$$$$$$$$$    $$$$          %%%%%    %  X  HHHHHHHHH      $$$$$$$$$$           $$$$   $$$$$$$$$$     $$$$           %%   H HHHHHHHHH      $$$$                 $$$$   $$$$  $$$$     $$$$            %%      %%    HHHHHHHHHH      $$$$          $$$$$$$$$$$   $$$$   $$$$    $$$$$$$$$$$      %%     %%%      HHHHHHH       $$$$         $$$$$$$$$$$    $$$$    $$$$    $$$$$$$$$$$            %%%%                                                                                       %%%%%          $$$$     $$$$   $$$$      $$$$   $$$$$$$$$$   $$$$$$$$$$$     $$$$$$$$$$   %%     %%          $$$$     $$$$   $$$$$     $$$$   $$$$$$$$$$$   $$$$$$$$$$$   $$$$$$$$$$$$   %%   %%          $$$$     $$$$   $$$$$$    $$$$   $$$$    $$$$         $$$$   $$$$     $$$$   %% %%          $$$$     $$$$   $$$$$$$   $$$$   $$$$    $$$$         $$$$   $$$$     $$$$    %%%          $$$$     $$$$   $$$$ $$$  $$$$   $$$$    $$$$     $$$$$$$    $$$$     $$$$          $$$$     $$$$   $$$$  $$$ $$$$   $$$$    $$$$     $$$$$$$    $$$$$$$$$$$$    %%%%%%%          $$$$     $$$$   $$$$   $$$$$$$   $$$$    $$$$         $$$$   $$$$$$$$$$$    %%    %%          $$$$     $$$$   $$$$    $$$$$$   $$$$    $$$$         $$$$   $$$$   $$$$    %%%%%%%          $$$$$$$$$$$$$   $$$$     $$$$$   $$$$$$$$$$$   $$$$$$$$$$$   $$$$    $$$$   %%           $$$$$$$$$$$    $$$$      $$$$   $$$$$$$$$$   $$$$$$$$$$$    $$$$     $$$$   %%%%%%%  $$$$$$$$$     $$$$$$$$$$      $$$$$$$$$$$    $$$$     $$$$   $$$$      $$$$   $$$$$$$$$$$ $$$$$$$$$$$   $$$$$$$$$$$$    $$$$$$$$$$$$$   $$$$     $$$$   $$$$$     $$$$   $$$$$$$$$$$$ $$$$   $$$$   $$$$     $$$$   $$$$     $$$$   $$$$     $$$$   $$$$$$    $$$$   $$$$     $$$$ $$$$   $$$$   $$$$     $$$$   $$$$     $$$$   $$$$     $$$$   $$$$$$$   $$$$   $$$$     $$$$ $$$$          $$$$     $$$$   $$$$     $$$$   $$$$     $$$$   $$$$ $$$  $$$$   $$$$     $$$$ $$$$  $$$     $$$$$$$$$$$$    $$$$     $$$$   $$$$     $$$$   $$$$  $$$ $$$$   $$$$     $$$$ $$$$   $$$$   $$$$$$$$$$$     $$$$     $$$$   $$$$     $$$$   $$$$   $$$$$$$   $$$$     $$$$ $$$$   $$$$   $$$$   $$$$     $$$$     $$$$   $$$$     $$$$   $$$$    $$$$$$   $$$$     $$$$ $$$$$$$$$$    $$$$    $$$$    $$$$$$$$$$$$$   $$$$$$$$$$$$$   $$$$     $$$$$   $$$$$$$$$$$$  $$$$$$$$     $$$$     $$$$    $$$$$$$$$$$     $$$$$$$$$$$    $$$$      $$$$   $$$$$$$$$$$That's five, kids.[root@yourbox.anywhere]$ dateSat Mar  1 18:22:16 EST 2008[root@yourbox.anywhere]$ perl game-on.plInitiating...Dumping...$TOC[0x01] = rant(        Intro        => q{  What it's all about                       } );$TOC[0x02] = school(      PHC          => q{  trix are for kids                         } );$TOC[0x03] = school_you(  Damian       => q{  Damian on when to use OO                  } );$TOC[0x04] = rant(        Perl_5_10    => q{  It's here!                                } );$TOC[0x05] = school(      RS_IceShaman => q{  Web hax0rs combined their "skills"        } );$TOC[0x06] = school_you(  nwclark      => q{  Nicolas Clark on speed, old school        } );$TOC[0x07] = school(      n00b         => q{  The nick says it all                      } );$TOC[0x08] = school_you(  merlyn       => q{  Batman uses Scalar::Util and List::Util   } );$TOC[0x09] = school(      ilja         => q{  He poked his nose out again               } );$TOC[0x0A] = school_you(  LR           => q{  Higher-Order Functions                    } );$TOC[0x0B] = rant(        Intermission => q{  Laugh it up                               } );$TOC[0x0C] = school(      kokanin      => q{  PU5 goes retro, have you noticed?         } );$TOC[0x0D] = school_you(  broquaint    => q{  Closure on Closures                       } );$TOC[0x0E] = school(      str0ke       => q{  And of course str0ke contributed a piece  } );$TOC[0x0F] = school_you(  Abigail      => q{  Abigail's points on style                 } );$TOC[0x10] = school(      h4cky0u      => q{  If only they could code                   } );$TOC[0x11] = rant(        Advocacy     => q{  Perl rocks, no doubt.                     } );$TOC[0x12] = school_you(  Roy_Johnson  => q{  Iterators and recursion                   } );$TOC[0x13] = school(      Gumbie       => q{  Whatever makes him sleep at night         } );$TOC[0x14] = school_you(  grinder      => q{  grinder talks about 5.10                  } );$TOC[0x15] = rant(        Reading      => q{  Your reading list for this week           } );$TOC[0x16] = school(      hessamx      => q{  We are critical of friend and fan         } );$TOC[0x17] = school_you(  Ovid         => q{  Ovid's OO points                          } );$TOC[0x18] = school(      tssci        => q{  Some noobs who provide "security"         } );$TOC[0x19] = rant(        Outro        => q{  All good things come to an end            } );Schooling...-[0x01] # Welcome back to the show ---------------------------------------The official theme of Perl Underground 5 is the highly-anticipated, recently-released, Perl 5.10. This theme is more in spirit than in quantity: we have only a couple ofarticles on the topic. Besides that, we bring to you all the exciting Perl material that you can handle. Wehave impressive collections of bad code to create lessons from, and educational piecesby (mostly) established Perl experts. Let's get this party started.-[0x02] # PHC: Had better stuff to not publish ---------------------------#!/usr/bin/perl# usage: own-kyx.pl narc1.txt## this TEAM #PHRACK script will extract the email addresses # out of the narc*.txt files, enumerate the primary MX and NS # for each domain, and grab the SSHD and APACHE server version# from each of these hosts (if possible). ## For educational purposes only. Do not use.# lawl this is old shit (but not past the statute of limitations)# lets rag on old "TEAM #PHRACK"# strict and warnings bitchuse IO::Socket;# lawl you could just do @ARGV or die "...";if ($#ARGV<0) {die "you didn't supply a filename/n";}$nrq =$ARGV[0];# or my $nrq = shift or die "...";# this is probably the dirty way to do it, you could whitelist# with more accuracy and ease# look up qr// plzkthnx$msearch = '([^":/s<>()/;]*@[^":/s<>()/;/.]*.[^":/s<>()/;]*)';# very lame. use a lexical filehandle, specify the open method,# don't quote the variableopen (INF, "$nrq") or die $!;# //i is unnecessary, so is //g, and you could do this without # $&, let alone quoting it, and this is really the gross way to # do it in generalwhile(<INF>){      if (m,$msearch,ig){push(@targets, "$&");}            }close INF;# plus you can do this while you read the file, not read it all # firstforeach $victim (@targets) {        print "=====/t$victim /t=====/n"; my ($lusr, $domn) = split(/@/, $victim); $smtphost = `host -tMX $domn |cut -d/" /" -f7 | head -1`;# whats with random trailers? //e not even used here, you have# an empty replacement! dumbfucks $smtphost =~  s/[/r/n]+$//ge;        print ":: Primary MX located at $smtphost/n";        sshcheq($smtphost);        apachecheq($smtphost);        $nshost = `host -tNS $domn |cut -d/" /" -f4 | head -1`;# //e again? wtf?        $nshost =~  s/[/r/n]+$//ge;        sleep(3);        print ":: Primary NS located at $nshost/n";        sshcheq($nshost); apachecheq($nshost);        print "/n/n";# parens everywhere sleep(3);      }sub sshcheq {# I think someone is confused about where his paren is supposed to go! (my $sshost) = @_;        print ":: Testing $sshost for sshd version/n";# not a single good variable name in this script         $g = inet_aton($sshost); my $prot = 22;        socket(S,PF_INET,SOCK_STREAM,getprotobyname('tcp')) or die "$!/n";        if(connect(S,pack "SnA4x8",2,$prot,$g)) {# omg this line isn't too bad         my @in;         select(S); $|=1; print "/n";         while(<S>){ push @in, $_;}# @in = <S>; # lawl# Parse while reading the file         select(STDOUT); close(S); # man this is old school..                foreach $res (@in) {                 if ($res =~ /SSH/) {# MOST COMPLEX YOUR PROGRAM IS   chomp $res; print ":: SSHD version - $res/n";                        }  }         } else { return 0; } # coulda done this first and saved some     # in-den-tation}# same shit different subroutine, maybe you could have made them into one# with a pair of parameters HMM?sub apachecheq {        (my $whost) = @_;        print ":: Testing $whost for Apache version/n";        $g = inet_aton($whost); my $prot = 80;        socket(S,PF_INET,SOCK_STREAM,getprotobyname('tcp')) or die "$!/n";        if(connect(S,pack "SnA4x8",2,$prot,$g)) {                my @in;                select(S); $|=1; print "HEAD / HTTP/1.0/r/n/r/n";                while(<S>){ push @in, $_;}                select(STDOUT); close(S);                foreach $res (@in) {                        if ($res =~ /ache/) {                        chomp $res; print ":: HTTPD version - $res/n";                        }                }        } else { return 0; }}-[0x03] # Damian Conway's 10 considerations about using OO ---------------On Saturday, June 23rd, Damian Conway had a little free-for-all workshopthat he gave at College of DuPage in Wheaton, IL. Although the whole daywas fascinating, the most useful part for me was his discussion of ``Tencriteria for knowing when to use object-oriented design''. Apparently,Damian was once a member of Spinal Tap, because his list goes to eleven.Damian said that this list, in expanded form, is going to be part of thestandard Perl distribution soon.- Design is large, or is likely to become large- When data is aggregated into obvious structures, especially if there's a  lot of data in each aggregate  For instance, an IP address is not a good candidate: There's only 4 bytes  of information related to an IP address. An immigrant going through  customs has a lot of data related to him, such as name, country of origin,  luggage carried, destination, etc. - When types of data form a natural hierarchy that lets us use inheritance.  Inheritance is one of the most powerful feature of OO, and the ability to  use it is a flag. - When operations on data varies on data type  GIFs and JPGs might have their cropping done differently, even though  they're both graphics. - When it's likely you'll have to add data types later  OO gives you the room to expand in the future. - When interactions between data is best shown by operators  Some relations are best shown by using operators, which can be overloaded.- When implementation of components is likely to change, especially in the  same program- When the system design is already object-oriented- When huge numbers of clients use your code  If your code will be distributed to others who will use it, a standard  interface will make maintenence and safety easier. - When you have a piece of data on which many different operations are  applied  Graphics images, for instance, might be blurred, cropped, rotated, and  adjusted. - When the kinds of operations have standard names (check, process, etc)  Objects allow you to have a DB::check, ISBN::check, Shape::check, etc  without having conflicts between the types of check.-[0x04] # Perl 5.10 has arrived ------------------------------------------First, allow us to explain Perl versions, so you understand just what thismeans. Note, especially, that Perl 5.10 is not Perl 5.1, it's Perl 5.10,which comes after Perl 5.9. It's not Perl 6, it's the latest continuationof the Perl 5 language. Perl 6 is still coming.Major releases:Perl 1 was released in December 1987.Perl 2 was released in June 1988.Perl 3 was released in October 1989.Perl 4 was released in March 1991.Perl 5 (excluding alpha/beta/gamma releases) was released in October 1994.Now, at this point it might seem weird that Perl jumped four versions in seven years, yet in the 14 since then it has not moved on. Partially, it has, Perl 6 has been (roughly) specified and implemented. But it isn'tquite *here*, for various reasons. Secondly, jumping major versions for reasons such as publishing a book seems a bit silly, so they do not do it anymore. Perl 5 introduced a different way of versioning advances in Perl.Thirdly, Perl is more stable and mature now, the rate of growth has slowed.Perl 5.004 was released in May 1997.Perl 5.005 was released in July 1998.Perl 5.6 was released in March 2000. There was no Perl 5.2 or 5.4.Perl 5.8 was released in July 2002.Perl 5.10 has now been released, on December 18, 2007, 20 years to the dayafter Perl 1.That's one long story! The story is that now even decimals represent stablereleases, while odd ones (5.9) represent the working development version.See perlhist for much more detail.Perl 5.10 is a big deal. We have been using Perl 5.8 for six years now.Like any other Perl release, 5.10 has brought some things that will changehow we code Perl. It also brought some things that won't do that, and somethings that we might think better of in a few years.Here are a few of the good ones that you're likely to see.say(). say() is like Ruby puts(), or Python print(), or Perl 6 say(), etc.All it is is a print with a newline. It'll definitely be less of a pain inthe ass than print and a /n, and looks cleaner.The defined-or operator. Sometimes you want to set something to a value,like a configuration value, but also have a default. You can't always do:my $flag = $conf{flag} || $default;, because what if $conf{flag} isexplicably set to 0? So you end up doing: my $flag = defined $conf{flag} ?$conf{flag} : $default;. Here's the new way: my $flag = $conf{flag} //$default;Lexical $_. Instead of being worried about clobbering $_, we can createa lexical version and all is good, leading to shorter syntax.State variables. This is something we should have had a long time ago.They are similar in concept to C static variables. Better than using aclosure (which has also improved in Perl 5.10), usually.The notorious given statement: Perl finally has a switch statement. Kindof. Take a look, the syntax is kind of a hassle and will make you wonderwhy you aren't just using if blocks. Until you read how it uses smartmatching. The naming is smartly in-tune with the linguistic character ofPerl.Last and not least, smart matching!Possibly the single most pressing change in Perl 5.10 is smart matching.Smart matching is just that, you give two operands and Perl compares themin a natural way. Gives us a whole new area to be confused in, and tocreate data-dependent runtime bugs.perlsyn has been updated, and this is the juicy bit:~~~~~The behaviour of a smart match depends on what type of thing its argumentsare. It is always commutative, i.e. $a ~~ $b behaves the same as $b ~~ $a.The behaviour is determined by the following table: the first row thatapplies, in either order, determines the match behaviour.    $a     $b       Type of Match Implied    Matching Code    ======  =====     =====================    =============    (overloading trumps everything)    Code[+] Code[+]   referential equality     $a == $b    Any     Code[+]   scalar sub truth        $b->($a)    Hash    Hash      hash keys identical      [sort keys %$a]~~[sort keys %$b]    Hash    Array     hash slice existence     grep {exists $a->{$_}} @$b    Hash    Regex     hash key grep        grep /$b/, keys %$a    Hash    Any       hash entry existence     exists $a->{$b}    Array   Array     arrays are identical[*]    Array   Regex     array grep        grep /$b/, @$a    Array   Num       array contains number    grep $_ == $b, @$a    Array   Any       array contains string    grep $_ eq $b, @$a    Any     undef     undefined         !defined $a    Any     Regex     pattern match        $a =~ /$b/    Code()  Code()    results are equal        $a->() eq $b->()    Any     Code()    simple closure truth     $b->() # ignoring $a    Num     numish[!] numeric equality        $a == $b    Any     Str       string equality        $a eq $b    Any     Num       numeric equality        $a == $b    Any     Any       string equality        $a eq $b + - this must be a code reference whose prototype (if present) is not ""     (subs with a "" prototype are dealt with by the 'Code()' entry lower     down) * - that is, each element matches the element of same index in the other     array. If a circular reference is found, we fall back to referential     equality. ! - either a real number, or a string that looks like a numberThe "matching code" doesn't represent the real matching code, of course:it's just there to explain the intended meaning. Unlike grep, the smartmatch operator will short-circuit whenever it can.~~~~Smart matching is one of those fancy Perl 6 features that some peopledid not want backported to Perl 5. The official PU position is that whenPerl 6 comes to the show, the world will probably use it, sooner or later. But until then, don't hold anything back, Perl 5 is beautiful and we can continue to make it better.More on Perl 5.10 at the end of the zine. If you can't wait, check outthese pieces right now. Or do it later, but either way, read them. Thereis a lot more than just what we have summarized here.http://dev.perl.org/perl5/news/2007/perl-5.10.0.htmlhttp://search.cpan.org/dist/perl-5.10.0/pod/perl5100delta.pod-[0x05] # RSnake is RJoke, and IceShaman isn't much better ---------------#!/usr/bin/perl########################################## Fierce v0.9.9 - Beta 03/24/2007# By RSnake http://ha.ckers.org/fierce/# Threading and additions by IceShaman########################################## Finally, something with some length to it.. let's do this...use strict; # Nice, but no warnings?use Net::hostent;use Net::DNS;use IO::Socket;use Socket;use Getopt::Long; # props.    # command line optionsmy $class_c;my $delay = 0;my $dns;my $dns_file;my $dns_server;my @dns_servers;my $filename;  my $full_output;my $help; my $http_connect;  my $nopattern; my $range;my $search;my $suppress;my $tcp_timeout;my $threads;my $traverse; my $version;   my $wide;  my $wordlist;  # You know that my() can take a comma seperated list of arguments, right?     my @common_cnames;my $count_hostnames = 0;my @domain_ns;my $h;my @ip_and_hostname;my $logging;my %options = ();my $res = Net::DNS::Resolver->new;my $search_found;my %subnets;my %tested_names;my $this_ip;my $version_num = 'Version 0.9.9 - Beta 03/24/2007';my $webservers = 0;my $wildcard_dns;my @wildcards;my @zone;my $count;my %known_ips;my %known_names;my @output;my @thread;my $thread_support;# Wow, nice load of variables there.# Way to embrace the concept of lexical variables by having 40 of them beglobal $count = 0; # Why not set it to zero when you declare it?# ignore all errors while trying to load up thead stuffBEGIN {  $SIG{__DIE__}  = sub { };  $SIG{__WARN__} = sub { };}  # try and load thread modules, if it works import their functionsBEGIN {  eval {    require threads;    require threads::shared;    require Thread::Queue;    $thread_support = 1;  };  if ($@) { # got errors, no ithreads  :(      # awww... what a shame... there's always 505threads though    $thread_support = 0;  } else { #safe to haul in the threadding functions    import threads;    import threads::shared;    import Thread::Queue;  }}# turn errors back onBEGIN {  $SIG{__DIE__}  = 'DEFAULT';  $SIG{__WARN__} = 'DEFAULT';}# OK really, why did you need three BEGIN blocks?# Why not just use() them in the eval, because you catch failure # anyways?# Do you think your signal catching is actually useful here?# We will see more confusion as we gomy $result = GetOptions (    'dns=s'  => /$dns,    'file=s' => /$filename,   'suppress' => /$suppress,   'help'  => /$help,    'connect=s' => /$http_connect,   'range=s' => /$range,   'wide'  => /$wide,   'delay=i' => /$delay,   'dnsfile=s' => /$dns_file,   'dnsserver=s' => /$dns_server,   'version' => /$version,   'search=s' => /$search,   'wordlist=s' => /$wordlist,   'fulloutput' => /$full_output,   'nopattern' => /$nopattern,   'tcptimeout=i' => /$tcp_timeout,   'traverse=i' => /$traverse,   'threads=i' => /$threads,   );help()    if $help; # excellent oneliner therequit_early($version_num) if $version;if (!$dns && !$range) { # Try 'not' and 'and'  output("You have to use the -dns switch with a domain after it.");  quit_early("Type: perl fierce.pl -h for help");} elsif ($dns && $dns !~ /[a-z/d.-]/.[a-z]*/i) { # you want + not *  output("/n/tUhm, no. /"$dns/" is gimp. A bad domain can mess up yourday.");  quit_early("/tTry again.");}if ($filename && $filename ne '') { # If it has a value and if it's not equal to '' eh? # Does anyone else see the redundancy there?# If it passes the first condition, it will ALWAYS pass the second#  $logging = 1;  if (-e $filename) { # file exists    print "File already exists, do you want to overwrite it? [Y|N] ";    chomp(my $overwrite = <STDIN>);    if ($overwrite eq 'y' || $overwrite eq 'Y') {      open FILE, '>', $filename  or quit_early("Having trouble opening $filename anyway"); # nice. a 3 arg open and a good use of an 'or' !    } else { # Your paren style sucks.      quit_early('Okay, giving up');    }   } else {    open FILE, '>', $filename       or quit_early("Having trouble opening $filename");  } # man you could have made this cleaner, could have just done a# quit_early for a n/N and then open otherwise  output('Now logging to ' . $filename); } if ($http_connect) {  unless (-e $http_connect) {    open (HEADERS, "$http_connect") # Why'd you quote the scalar here, but        # not above? And don't you know about         # the security risks of using open()        # like this      or quit_early("Having trouble opening $http_connect");    close HEADERS; # uh... open... and close... Are you just testing that                   # you can? -r for that  } } # if user doesn't provide a number, they both end up at 0quit_early('Your delay tag must be a positive integer')    if ($delay && $delay != 0 && $delay !~ /^/d*$/); # Try 'and' instead of'&&'. Also, lose the parens.# You still don't understand how this works: if the first condition# passes, the second ALWAYS will.# what you probably think is happening is this:# if ( defined $delay && $delay != 0 && $delay !~ /^/d*$/)# But it isn't. You're just a noob.quit_early('Your thread tag must be a positive integer')   if ($threads && $threads != 0 && $threads !~ /^/d*$/);  # isn't if ($threads and not $thread_support) pretty smooth to read?# smooth like silkif ($threads && !$thread_support) {   quit_early('Perl is not configured to support ithreads');}  if ($dns_file) {  open (DNSFILE, '<', $dns_file)    or quit_early("Can't open $dns_file");   for (<DNSFILE>) {     chomp;     push @dns_servers, $_; # yucky sucky   }     if (@dns_servers) {     output("Using DNS servers from $dns_file");   } else {     output("DNS file $dns_file is empty, using default options");   }}# OK these guys are just too lame to profile much more of their code# We're gonna cut almost all of it out and just point out a few especially# funny parts# lol how about $tcp_timeout ||= 10; # or $res->tcp_timeout($tcp_timeout || 10 );if ($tcp_timeout) {  $res->tcp_timeout($tcp_timeout);} else {  $res->tcp_timeout(10);}# lawl someone meant > 255! Someone did not test his shitty code!  quit_early('The -t flag must contain an integer 0-255') if $traverse <255;# This line here makes those or's look kinda dumb, huh?  $wordlist = $wordlist || 'hosts.txt';  if (-e $wordlist) {    # user provided or default    open (WORDLIST, '<', $wordlist)   or     open (WORDLIST, '<', 'hosts.txt') or    quit_early("Can't open $wordlist or the default wordlist");# how about just ++ it? 0 + 1 = 1  if ( $subnets{"$bytes[0].$bytes[1].$bytes[2]"} ) {    $subnets{"$bytes[0].$bytes[1].$bytes[2]"}++;  } else {    $subnets{"$bytes[0].$bytes[1].$bytes[2]"} = 1;  }}# wasted variables, didn't check if the regex matched, used * instead of +  if ($wide) {    ($lowest, $highest) = (0, 255);  } else { # user provided range    if ($octet[3] =~ /(/d*)-(/d*)/) {      ($lowest, $highest) = ($1, $2);      quit_early("Your range doesn't make sense, try again")     }# WHAT COMPLEX FEATURES YOU LACK    #TODO: add port selection and range support    my $socket = new IO::Socket::INET (      PeerAddr => "$ip_and_hostname[0]",     PeerPort => 'http(80)',     Timeout  => 10,     Proto  => 'tcp',          )# It's just all very silly and stupid. To think that these guys wrote this up,# didn't clean it, didn't even test it, and then released it to the world like# it was big shit and they were bigger. kids, just keep your shitty code to# yourself. Or send it to us for PU+ certification.# RSnake needs to stick to his nice easy PHP world, where he can be a god # among retards. Same for IceShaman and HTS. Neither can play with grown-ups.-[0x06] # Nicolas Clark with some (old) notes on speed -------------------Nicholas Clark - When perl is not quite fast enoughIntroductionSo you have a perl script. And it's too slow. And you want to do somethingabout it. This is a talk about what you can do to speed it up, and alsohow you try to avoid the problem in the first place.Obvious thingsFind better algorithmYour code runs in the most efficient way that you can think of. But maybesomeone else looked at the problem from a completely different directionand found an algorithm that is 100 times faster. Are you sure you have thebest algorithm? Do some research. Throw more hardware at itIf the program doesn't have to run on many machines may be cheaper tothrow more hardware at it. After all, hardware is supposed to be cheap andprogrammers well paid. Perhaps you can gain performance by tuning yourhardware better; maybe compiling a custom kernel for your machine will beenough.mod_perlFor a CGI script that I wrote, I found that even after I'd shavedeverything off it that I could, the server could still only serve 2.5 persecond. The same server running the same script under mod_perl could serve25 per second. That's a factor of 10 speedup for very little effort. Andif your script isn't suitable for running under mod_perl there's alsofastcgi (which CGI.pm supports). And if your script isn't a CGI, you couldlook at the persistent perl daemon, package PPerl on CPAN.Rewrite in C, er C++, sorry Java, I mean C#, oops no ...Of course, one final "obvious" solution is to re-write your perl programin a language that runs as native code, such as C, C++, Java, C# orwhatever is currently flavour of the month. But these may not be practical or politically acceptable solutions.CompromisesSo you can compromise.XSYou may find that 95% of the time is spent in 5% of the code, doingsomething that perl is not that efficient at, such as bit shifting. So youcould write that bit in C, leave the rest in perl, and glue it togetherwith XS. But you'd have to learn XS and the perl API, and that's a lot ofwork.InlineOr you could use Inline. If you have to manipulate perl's internals thenyou'll still have to learn perl's API, but if all you need is to call outfrom perl to your pure C code, or someone else's C library then Inlinemakes it easy. Here's my perl script making a call to a perl function rot32. And here's aC function rot32 that takes 2 integers, rotates the first by the second,and returns an integer result. That's all you need! And you run it and itworks.    #!/usr/local/bin/perl -w    use strict;        printf "$_:/t%08X/t%08X/n", rot32 (0xdead, $_), rot32 (0xbeef, -$_)      foreach (0..31);        use Inline C => <<'EOC';        unsigned rot32 (unsigned val, int by) {      if (by >= 0) return (val >> by) | (val << (32 - by));      return (val << -by) | (val >> (32 + by));    }    EOC    __END__    0:     0000DEAD     0000BEEF    1:     80006F56     00017DDE    2:     400037AB     0002FBBC    3:     A0001BD5     0005F778    4:     D0000DEA     000BEEF0    ...Compile your own perl?Are you running your script on the perl supplied by the OS? Compiling yourown perl could make your script go faster. For example, when perl iscompiled with threading, all its internal variables are made thread safe,which slows them down a bit. If the perl is threaded, but you don't usethreads then you're paying that speed hit for no reason. Likewise, you mayhave a better compiler than the OS used. For example, I found that withgcc 3.2 some of my C code run 5% faster than with 2.9.5. [One of myhelpful hecklers in the audience said that he'd seen a 14% speedup, (if Iremember correctly) and if I remember correctly that was from recompilingthe perl interpreter itself]Different perl version?Try using a different perl version. Different releases of perl are fasterat different things. If you're using an old perl, try the latest version.If you're running the latest version but not using the newer features, tryan older version. Banish the demons of stupidityAre you using the best features of the language?hashesThere's a Larry Wall quote - Doing linear scans over an associative arrayis like trying to club someone to death with a loaded Uzi. I trust you're not doing that. But are you keeping your arrays nicelysorted so that you can do a binary search? That's fast. But using a hashshould be faster.regexpsIn languages without regexps you have to write explicit code to parsestrings. perl has regexps, and re-writing with them may make things 10times faster. Even using several with the /G anchor and the /gc flags maystill be faster.     if ( //G.../gc ) { ...    } elsif ( //G.../gc ) { ...    } elsif ( //G.../gc ) {pack and unpackpack and unpack have far too many features to remember. Look at themanpage - you may be able to replace entire subroutines with just oneunpack.undefundef. what do I mean undef? Are you calculating something only to throw it away?For example the script in the Encode module that compiles characterconversion tables would print out a warning if it saw the same charactertwice. If you or I build perl we'll just let those build warnings scrolloff the screen - we don't care - we can't do anything about it. And itturned out that keeping track of everything needed to generate thosewarnings was slowing things down considerably. So I added a flag todisable that code, and perl 5.8 defaults to use it, so it builds morequickly.IntermissionVarious helpful hecklers (most of London.pm who saw the talk (and I'mcounting David Adler as part of London.pm as he's subscribed to the list))wanted me to remind people that you really really don't want to beoptimising unless you absolutely have to. You're making your code harderto maintain, harder to extend, and easier to introduce new bugs into.Probably you've done something wrong to get to the point where you need tooptimise in the first place.I agree.Also, I'm not going to change the running order of the slides. There isn'ta good order to try to describe things in, and some of the ideas thatfollow are actually more "good practice" than optimisation techniques, sopossibly ought to come before the slides on finding slowness. I'll markwhat I think are good habits to get into, and once you understand thetechniques then I'd hope that you'd use them automatically when you firstwrite code. That way (hopefully) your code will never be so slow that youactually want to do some of the brute force optimising I describe here.TestsMust not introduce new bugsThe most important thing when you are optimising existing working code isnot to introduce new bugs.Use your full regression tests  :-) For this, you can use your full suite of regression tests. You do haveone, don't you? [At this point the audience is supposed to laugh nervously, because I'mbetting that very few people are in this desirable situation of havingcomprehensive tests written]Keep a copy of original programYou must keep a copy of your original program. It is your last resort ifall else fails. Check it into a version control system. Make an off sitebackup. Check that your backup is readable. You mustn't lose it. In the end, your ultimate test of whether you've not introduced new bugswhile optimising is to check that you get identical output from theoptimised version and the original. (With the optimised version takingless time). What causes slownessCPUIt's obvious that if you script hogs the CPU for 10 seconds solid, then tomake it go faster you'll need to reduce the CPU demand.RAMA lesser cause of slowness is memory. perl trades RAM for speedOne of the design decisions Larry made for perl was to trade memory forspeed, choosing algorithms that use more memory to run faster. So perltends to use more memory.getting slower (relative to CPU)CPUs keep getting faster. Memory is getting faster too. But not asquickly. So in relative terms memory is getting slower. [Larry was correctto choose to use more memory when he wrote perl5 over 10 years ago.However, in the future CPU speed will continue to diverge from RAM speed,so it might be an idea to revisit some of the CPU/RAM design trade offs inparrot]memory like a pyramidYou can never have enough memory, and it's never fast enough.Computer memory is like a pyramid. At the point you have the CPU and itsregisters, which are very small and very fast to access. Then you have 1or more levels of cache, which is larger, close by and fast to access.Then you have main memory, which is quite large, but further away soslower to access. Then at the base you have disk acting as virtual memory,which is huge, but very slow.Now, if your program is swapping out to disk, you'll realise, because theOS can tell you that it only took 10 seconds of CPU, but 60 secondselapsed, so you know it spent 50 seconds waiting for disk and that's yourspeed problem. But if your data is big enough to fit in main RAM, butdoesn't all sit in the cache, then the CPU will keep having to wait fordata from main RAM. And the OS timers I described count that in the CPUtime, so it may not be obvious that memory use is actually your problem.This is the original code for the part of the Encode compiler (enc2xs)that generates the warnings on duplicate characters:    if (exists $seen{$uch}) { warn sprintf("U%04X is %02X%02X and %02X%02X/n",       $val,$page,$ch,@{$seen{$uch}});    }    else { $seen{$uch} = [$page,$ch];    }It uses the hash %seen to remember all the Unicode characters that it hasprocessed. The first time that it meets a character it won't be in thehash, the exists is false, so the else block executes. It stores anarrayref containing the code page and character number in that page.That's three things per character, and there are a lot of characters inChinese.If it ever sees the same Unicode character again, it prints a warningmessage. The warning message is just a string, and this is the only placethat uses the data in %seen. So I changed the code - I pre-formatted thatbit of the error message, and stored a single scalar rather than thethree:    if (exists $seen{$uch}) { warn sprintf("U%04X is %02X%02X and %04X/n",       $val,$page,$ch,$seen{$uch});    }    else { $seen{$uch} = $page << 8 | $ch;    }That reduced the memory usage by a third, and it runs more quickly.Step by stepHow do you make things faster? Well, this is something of a black art,down to trial and error. I'll expand on aspects of these 4 points in thenext slides.What might be slow?You need to find things that are actually slow. It's no good wasting youreffort on things that are already fast - put it in where it will getmaximum reward.Think of re-writeBut not all slow things can be made faster, however much you swear atthem, so you can only actually speed things up if you can figure outanother way of doing the same thing that may be faster.Try itBut it may not. Check that it's faster and that it gives the same results.Note resultsEither way, note your results - I find a comment in the code is good. It'simportant if an idea didn't work, because it stops you or anyone elsegoing back and trying the same thing again. And it's important if a changedoes work, as it stops someone else (such as yourself next month) tidyingup an important optimisation and losing you that hard won speed gain. By having commented out slower code near the faster code you can look backand get ideas for other places you might optimise in the same way.Small easy thingsThese are things that I would consider good practice, so you ought to bedoing them as a matter of routine.AutoSplit and AutoLoaderIf you're writing modules use the AutoSplit and AutoLoader modules to makeperl only load the parts of your module that are actually being used by aparticular script. You get two gains - you don't waste CPU at start uploading the parts of your module that aren't used, and you don't waste theRAM holding the the structures that perl generates when it has compiledcode. So your modules load more quickly, and use less RAM. One potential problem is that the way AutoLoader brings in subroutinesmakes debugging confusing, which can be a problem. While developing, youcan disable AutoLoader by commenting out the __END__ statement marking thestart of your AutoLoaded subroutines. That way, they are loaded, compiledand debugged in the normal fashion.  ...  1;  # While debugging, disable AutoLoader like this:  # __END__  ...Of course, to do this you'll need another 1; at the end of the AutoLoadedsection to keep use happy, and possibly another __END__.Schwern notes that commenting out __END__ can cause surprises if the mainbody of your module is running under use strict; because now yourAutoLoaded subroutines will suddenly find themselves being run under usestrict. This is arguably a bug in the current AutoSplit - when it runs atinstall time to generate the files for AutoLoader to use it doesn't addlines such as use strict; or use warnings; to ensure that the split outsubroutines are in the same environment as was current at the __END__statement. This may be fixed in 5.10.Elizabeth Mattijsen notes that there are different memory use versusmemory shared issues when running under mod_perl, with different optimalsolutions depending on whether your apache is forking or threaded.=pod @ __END__If you are documenting your code with one big block of pod, then youprobably don't want to put it at the top of the file. The perl parser isvery fast at skipping pod, but it's not magic, so it still takes a littletime. Moreover, it has to read the pod from disk in order to ignore it.   #!perl -w  use strict;  =head1 You don't want to do that  big block of pod  =cut  ...  1;  __END__  =head1 You want to do thisIf you put your pod after an __END__ statement then the perl parser willnever even see it. This will save a small amount of CPU, but if you have alot of pod (>4K) then it might also mean that the last disk block(s) of afile are never even read in to RAM. This may gain you some speed. [Ahelpful heckler observed that modern raid systems may well be reading in64K chunks, and modern OSes are getting good at read ahead, so not readinga block as a result of =pod @ __END__ may actually be quite rare.]If you are putting your pod (and tests) next to their functions' code(which is probably a better approach anyway) then this advice is notrelevant to you.Needless importing is slowExporter is written in perl. It's fast, but not instant.Most modules are able to export lots of their functions and other symbolsinto your namespace to save you typing. If you have only one argument touse, such as    use POSIX;  # Exports all the defaultsthen POSIX will helpfully export its default list of symbols into yournamespace. If you have a list after the module name, then that is taken asa list of symbols to export. If the list is empty, no symbols areexported:    use POSIX (); # Exports nothing.You can still use all the functions and other symbols - you just have touse their full name, by typing POSIX:: at the front. Some people arguethat this actually makes your code clearer, as it is now obvious whereeach subroutine is defined. Independent of that, it's faster:use POSIX; use POSIX ();0.516s 0.355suse Socket; use Socket ();0.270s 0.231sPOSIX exports a lot of symbols by default. If you tell it to export none,it starts in 30% less time. Socket starts in 15% less time.regexpsavoid $&The $& variable returns the last text successfully matched in any regularexpression. It's not lexically scoped, so unlike the match variables $1etc it isn't reset when you leave a block. This means that to be correctperl has to keep track of it from any match, as perl has no idea when itmight be needed. As it involves taking a copy of the matched string, it'sexpensive for perl to keep track of. If you never mention $&, then perlknows it can cheat and never store it. But if you (or any module) mentions$& anywhere then perl has to keep track of it throughout the script, whichslows things down. So it's a good idea to capture the whole matchexplicitly if that's what you need.     $text =~ /.* rules/;    $line = $&;   # Now every match will copy $& - slow    $text =~ /(.* rules)/;    $line = $1;   # Didn't mention $& - fastavoid use English;use English gives helpful long names to all the punctuation variables.Unfortunately that includes aliasing $& to $MATCH which makes perl thinkthat it needs to copy every match into $&, even if you script neveractually uses it. In perl 5.8 you can say use English '-no_match_vars'; toavoid mentioning the naughty "word", but this isn't available in earlierversions of perl.avoid needless capturesAre you using parentheses for capturing, or just for grouping? Capturinginvolves perl copying the matched string into $1 etc, so it all you needis grouping use a the non-capturing (?:...) instead of the capturing(...)./.../o;If you define scalars with building blocks for your regexps, and then makeyour final regexp by interpolating them, then your final regexp isn'tgoing to change. However, perl doesn't realise this, because it sees thatthere are interpolated scalars each time it meets your regexp, and has noidea that their contents are the same as before. If your regexp doesn'tchange, then use the /o flag to tell perl, and it will never waste timechecking or recompiling it.but don't blow itYou can use the qr// operator to pre-compile your regexps. It often is theeasiest way to write regexp components to build up more complex regexps.Using it to build your regexps once is a good idea. But don't screw up(like parrot's assemble.pl did) by telling perl to recompile the sameregexp every time you enter a subroutine:     sub foo { my $reg1 = qr/.../; my $reg2 = qr/... $reg1 .../;You should pull those two regexp definitions out of the subroutine intopackage variables, or file scoped lexicals.Devel::DProfYou find what is slow by using a profiler. People often guess where theythink their program is slow, and get it hopelessly wrong. Use a profiler.Devel::DProf is in the perl core from version 5.6. If you're using anearlier perl you can get it from CPAN.You run your program with -d:DProf    perl5.8.0 -d:DProf enc2xs.orig -Q -O -o /dev/null ...which times things and stores the data in a file named tmon.out. Then yourun dprofpp to process the tmon.out file, and produce meaningful summaryinformation. This excerpt is the default length and format, but you canuse options to change things - see the man page. It also seems to show upa minor bug in dprofpp, because it manages to total things up to get 106%.While that's not right, it doesn't affect the explanation.    Total Elapsed Time = 66.85123 Seconds      User+System Time = 62.35543 Seconds    Exclusive Times    %Time ExclSec CumulS #Calls sec/call Csec/c  Name     106.   66.70 102.59 218881   0.0003 0.0005  main::enter     49.5   30.86 91.767      6   5.1443 15.294  main::compile_ucm     19.2   12.01  8.333  45242   0.0003 0.0002  main::encode_U     4.74   2.953  1.078  45242   0.0001 0.0000  utf8::unicode_to_native     4.16   2.595  0.718  45242   0.0001 0.0000  utf8::encode     0.09   0.055  0.054      5   0.0109 0.0108  main::BEGIN     0.01   0.008  0.008      1   0.0078 0.0078  Getopt::Std::getopts     0.00   0.000 -0.000      1   0.0000      -  Exporter::import     0.00   0.000 -0.000      3   0.0000      -  strict::bits     0.00   0.000 -0.000      1   0.0000      -  strict::import     0.00   0.000 -0.000      2   0.0000      -  strict::unimportAt the top of the list, the subroutine enter takes about half the totalCPU time, with 200,000 calls, each very fast. That makes it a goodcandidate to optimise, because all you have to do is make a slight changethat gives a small speedup, and that gain will be magnified 200,000 times.[It turned out that enter was tail recursive, and part of the speed gain Igot was by making it loop instead]Third on the list is encode_U, which with 45,000 calls is similar, andworth looking at. [Actually, it was trivial code and in the real enc2xs Iinlined it]utf8::unicode_to_native and utf8::encode are built-ins, so you won't beable to change that.Don't bother below there, as you've accounted for 90% of total programtime, so even if you did a perfect job on everything else, you could onlymake the program run 10% faster.compile_ucm is trickier - it's only called 6 times, so it's not obviouswhere to look for what's slow. Maybe there's a loop with many iterations.But now you're guessing, which isn't good.One trick is to break it into several subroutines, just for benchmarking,so that DProf gives you times for different bits. That way you can seewhere the juicy bits to optimise are.Devel::SmallProf should do line by line profiling, but every time I use itit seems to crash.BenchmarkNow you've identified the slow spots, you need to try alternative code tosee if you can find something faster. The Benchmark module makes thiseasy. A particularly good subroutine is cmpthese, which takes codesnippets and plots a chart. cmpthese was added to Benchmark with perl 5.6.So to compare two code snippets orig and new by running each for 10000times you'd do this:    use Benchmark ':all';        sub orig {       ...    }        sub new {       ...    }        cmpthese (10000, { orig => /&orig, new => /&new } );Benchmark runs both, times them, and then prints out a helpful comparisonchart:    Benchmark: timing 10000 iterations of new, orig...    new:  1 wallclock secs ( 0.70 usr + 0.00 sys =  0.70 CPU) @14222.22/s (n=10000)   orig:  4 wallclock secs ( 3.94 usr + 0.00 sys =  3.94 CPU) @2539.68/s (n=10000)     Rate orig  new    orig  2540/s   -- -82%    new  14222/s 460% --and it's plain to see that my new code is over 4 times as fast as myoriginal code.What causes slowness in perl?Actually, I didn't tell the whole truth earlier about what causes slownessin perl. [And astute hecklers such as Philip Newton had already told methis]When perl compilers your program it breaks it down into a sequence ofoperations it must perform, which are usually referred to as ops. So whenyou ask perl to compute $a = $b + $c it actually breaks it down into theseops:Fetch $b onto the stackFetch $c onto the stackAdd the top two things on the stack together; write the result to thestackFetch the address of $aPlace the thing on the top of stack into that addressComputers are fast at simple things like addition. But there is quite alot of overhead involved in keeping track of "which op am I currentlyperforming" and "where is the next op", and this book-keeping often swampsthe time taken to actually run the ops. So often in perl it's the numberof ops your program takes to perform its task that is more important thanthe CPU they use or the RAM it needs. The hit list isOpsCPURAMSo what were my example code snippets that I Benchmarked?It was code to split a line of hex (54726164696e67207374796c652f6d61) intogroups of 4 digits (5472 6164 696e ...) , and convert each to a number    sub orig {       map {hex $_} $line =~ /(....)/g;    }    sub new {       unpack "n*", pack "H*", $line;    }The two produce the same results:orig 21618, 24932, 26990, 26400, 29556, 31084, 25903, 28001, 26990, 29793,26990, 24930, 26988, 26996, 31008, 26223, 29216, 29552, 25957, 25646  new21618, 24932

给我老师的人工智能教程打call!http://blog.csdn.net/jiangjunshow
这里写图片描述
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
### 回答1: 以下是一个简单的Python代码,可以爬取https://samples.vx-underground.org/samples/Families/网址下的内容并将其下载到本地: ```python import requests url = "https://samples.vx-underground.org/samples/Families/" response = requests.get(url) if response.status_code == 200: contents = response.content with open("downloaded_content.zip", "wb") as f: f.write(contents) else: print("Failed to download content.") ``` 这个代码使用requests库向指定的URL发出GET请求,如果返回状态码为200,说明请求成功。然后,将请求返回的内容保存在一个变量中,再通过Python内置的`open()`函数将内容写入一个文件中,文件名为`downloaded_content.zip`,并指定以二进制模式写入。如果请求失败,则打印一条错误信息。 ### 回答2: 以下是一个用Python爬取网页内容并下载的示例代码: ```python import requests # 定义要爬取的网址 url = 'https://samples.vx-underground.org/samples/Families/' # 发起请求获取网页内容 response = requests.get(url) # 获取响应的二进制内容 content = response.content # 定义要保存的文件路径 file_path = 'downloaded_file.mp3' # 将响应的二进制内容保存到文件中 with open(file_path, 'wb') as file: file.write(content) print('文件下载完成!') ``` 以上代码使用了Python中的requests库来发送HTTP请求,并将网页响应的二进制内容保存到指定文件中。在示例中,我们指定了要下载的网址为'https://samples.vx-underground.org/samples/Families/',文件将被保存为'downloaded_file.mp3'。你可以根据实际需要修改这些值来适应不同的网址和文件路径。 ### 回答3: 以下是一个使用Python爬取https://samples.vx-underground.org/samples/Families/网站内容的示例代码: ```python import requests url = 'https://samples.vx-underground.org/samples/Families/' response = requests.get(url) if response.status_code == 200: content = response.content # 这里可以对获取到的内容进行进一步的处理,比如保存到文件或者解析网页等 print(content) else: print('请求失败') ``` 首先,我们使用`requests`库来发送GET请求获取网页的内容。将目标网址赋值给变量`url`,然后使用`requests.get(url)`发送请求。接着,通过判断响应的状态码`response.status_code`是否等于200来确定请求是否成功。 如果请求成功,就可以通过`response.content`获取网页的原始内容。你可以根据自己的需求对这个内容进行进一步的处理,比如保存到本地文件、解析网页等。 如果请求失败,会打印出"请求失败"的提示信息。你可以根据实际情况进行错误处理。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值