"To be a warrior is not a simple matter of wishing to be one. It is rather an endless struggle that will go on to the very last moment of our lives. Nobody is born a warrior, in exactly the same way that nobody is born an average man. We make ourselves into one or the other." --Kokoro
CPANPLUS is a newer API library for the CPAN repositories. It has a bunch of added functionality that CPAN doesn’t have. To install it, you can use CPAN:
perl -MCPAN -e shell
install CPANPLUS
Once done, exit and start the CPANPLUS shell:
perl -MCPANPLUS -e shell
CPANPLUS adds functionality on top of CPAN. Some of the nicer features: you can search by module (m Test::TCP) or author (a DMAKI) from within the shell; the results are numbered, so you can quickly install with “i 1” (install #1 in the list); it can automatically install all the dependencies a module needs; it can print error stacks (p); and it can update itself. I’m sure there are many other cool features that I’m not covering, but these are the main ones I tend to use.
It sometimes drives me crazy to sit and watch a CPAN module installation stop again and again just to ask whether you also want to install some other needed dependency: [yes] enter, [yes] enter, [yes] enter... The following CPAN shell settings cut down on the prompts:
cpan> o conf build_requires_install_policy yes
cpan> o conf commit
Or, using CPANPLUS, you can choose “yes to all” for each individual module.
Documentation
A unique feature of Perl is that it doesn’t come with just one man page, but with a whole slew of man pages that describe various aspects of Perl and serve as tutorials, reference manuals, and FAQ pages. Here are some of the most useful of these pages for beginners:
man perl
Main man page; lists the various auxiliary pages available.
man perlintro
Perl introduction for beginners.
man perlrequick
Perl regular expressions quick start.
man perlcheat
Perl cheat sheet (very neat!).
man perlfaq1
The first Perl FAQ page. (There are 9 such pages, perlfaq1 through perlfaq9.)
perldoc -q keyword
Extracts entries matching "keyword" from the perlfaq pages. Example: "perldoc -q reverse".
perldoc -f function
Man page for the Perl function "function". Example: "perldoc -f reverse".
perldoc -q books
The Perl books section of the Perl FAQ pages. A large listing of recommended books, classified by level.
My own top three recommendations are "Learning Perl" by Schwartz/Phoenix/Foy; "Programming Perl" by Wall/Christiansen/Orwant; and "The Perl Cookbook" (Christiansen/Torkington), all published by O'Reilly. The first is a beginner's tutorial; the second is a "must have" reference for anyone seriously into Perl; and the third is icing on the cake, with lots of nifty tricks.
Line-based operation
The simplest way to use perl in commandline mode is as a filter that operates on a file (or on standard input), manipulates the file one line at a time, and outputs the result to standard output, much as standard Unix utilities such as grep, sed, and awk do.
The basic structure of the command is one of the following:
perl -lape '.....' file
For each line in "file", apply the command(s) '.....' (e.g., a substitution), then print the line to standard output.
perl -lane '.....' file
For each line in "file", apply the command(s) '.....', but do not print the line. In this case, '.....' usually will contain a print command (possibly conditional), and only output generated by such an explicit print command will get printed.
The commandline options used here have the following meaning:
The “e” option indicates that the following string is to be interpreted as a perl script (i.e., sequence of commands). To prevent interfering with the shell, it is best to enclose the script in plain single quotes (').
The “l” (character “ell”, not the bar symbol) option ensures proper end-of-line handling: it strips the linebreak from each input line and adds it back when the line is printed. Without it, a pattern may unexpectedly match the linebreak, and explicit print commands do not end their output with one.
The “a” option causes perl to autosplit each line into an array of fields $F[0], $F[1], …, with blank space acting as default field separator. Note that in Perl, array indices start at 0, so the first array element has index 0.
The “p” and “n” options indicate whether or not each line is printed by default.
The following examples illustrate the use of Perl for line-by-line processing of files.
perl -lane 'print $F[1]' file
Print the second field of each line (i.e., the output consists of the second column of the file).
Note that $F[0] is the first field, $F[1] the second, etc.
perl -lane 'print $F[-1]' file
Print the last column of the file.
In Perl, negative array indices denote array elements counted from the right. Thus $F[-1] denotes the last field (column), $F[-2] the second last, etc.
perl -lane 'print "$F[2],$F[1]"' file
Print the second and third columns of the file in reverse order, separated by a comma.
perl -lpe 's/\s+/,/g' file
Replace any sequence of consecutive blank spaces by a comma.
This converts a tabular list with fields separated by blanks to one in which the fields are separated by commas. (The latter format is the csv format, a common spreadsheet format that can be used to import files into Excel.)
The s/.../.../g syntax is similar to that of sed. The "g" modifier in the substitution command denotes a "global" substitution; without it, only the first occurrence of the substitution pattern would get substituted. "\s" stands for any whitespace character (blank, tab, etc.). The plus sign "+" indicates that the substitution pattern should match one or more instances of "\s"; thus, any chunk of consecutive whitespace characters gets replaced by a single comma.
(The "a" (autosplit) option is not needed here since no use of the field array $F[...] is made; however, it would not hurt to leave it in.)
perl -pe 's/3/1/g' file
Replace 3 in the file by 1.
(Here the "a" (autosplit) option is not needed, nor is the "l" (end-of-line processing) option, though one could, of course, leave those options in.)
perl -i.bak -pe 's/3/1/g' file
The same, but with "in place" editing.
The "i" option is a powerful option of Perl that causes the commands to be performed on the file itself. Thus, there is no need to save the modified file under a temporary filename and then copy that file over the original file. In the above form of this option, the original version of the file is saved to a file with extension ".bak". Saving the original version to a backup file is a safety mechanism; the name of the backup file can be changed by replacing the string ".bak" by something else. If no such string is provided in the "-i" option, then the file is modified without backing up.
perl -lpe 's/\d+/NNN/g' file
Replace any string of digits by "NNN".
Here "\d" stands for any single digit; the plus sign indicates one or more instances of whatever precedes it. Thus, \d+ stands for any string of digits.
perl -lpe 's/^/$. /' file
Print the file, with line numbers prepended to each line.
The "$." variable denotes the line number; the caret symbol (^) denotes a match at the beginning of the line. In this case the substitution pattern in s/.../.../ is empty, so the "substitution" simply amounts to tacking on the replacement string at the beginning of the line.
perl -lpe 's/^\s+//' file
Delete any blank spaces at the beginning of each line.
perl -lpe 's/^\s+//;s/\s+$//' file
Same, but also delete any blank spaces at the end of each line.
The two substitutions specified by the s/.../.../ commands are separated by a semicolon and are executed sequentially. In the second substitution command, the dollar sign ($) plays a role analogous to the caret sign and denotes the end of the line.
perl -lane 'print if (/\d\d\d\d/)' file
Print all lines in file that contain (at least) four consecutive digits.
The string enclosed in /.../ is interpreted as a pattern that needs to be matched in order for the if clause to evaluate as true. The string \d\d\d\d stands for 4 consecutive digits. (This is a grep-like operation; grep can handle a simple pattern like this one too, but Perl's regular expressions are more expressive than those of basic grep.)
perl -lane 'print if (/\S/)' file
Print any line that contains a non-whitespace character. This effectively deletes blank lines (or lines containing only whitespace) from the file.
"\S" stands for the complement of "\s", i.e., any character that is not a whitespace.
perl -lane 'print length($_)' file
Print the length (measured in characters) of each input line.
perl -lane 'print if (length($_) > 40)' file
Print all lines in file that have length (measured in characters) greater than 40.
Operating on entire files
Perl’s power really shines when one wants to perform operations on chunks of files that extend over multiple lines (e.g., deleting line breaks in paragraphs). Standard Unix utilities like sed or awk are ill-suited for that, but with Perl this is easy by changing the record separator (which defaults to a linebreak) to something else using the ‘-0’ option. Of particular interest are the following cases:
Slurp mode: perl -0777
The "0777" string (note that "0" here is the digit 0) causes the record separator to be set to "undefined", which in turn causes Perl to operate on the entire file as if it were one line ("slurp mode").
Paragraph mode: perl -00
The "00" (two digits 0) string causes Perl to interpret one or more consecutive blank lines as record separator. Thus Perl operates on each paragraph as if it were a line.
Here are some examples using these modes:
perl -00 -lpe 's/\n/ /g' file
Delete all linebreaks within each paragraph, replacing them by a single blank space. The net effect is that each paragraph becomes a single line.
Here "\n" stands for a linebreak character.
perl -00 -lpe 's/\n/ /g; s/\.\s*/\.\n/g' file
Same, but after having deleted all linebreaks within paragraphs reinsert linebreaks at the end of each sentence. As a result, each sentence gets its own line.
The asterisk (*) in "\s*" denotes 0 or more instances of "\s". Thus, "\.\s*" matches a period, plus any whitespace following it.
The period is used here as an end-of-sentence marker. It must be escaped with a backslash (\.) since an unescaped period has a different meaning in Perl.
Same as before, but pipe the output into another command that prints out the number of "words" (in the sense of any consecutive string of nonblanks) for each sentence.
$#F denotes the last index in the array $F[...]. Since the indexing starts with 0, one has to add 1 to obtain the number of elements in this array.
perl -0777 -lape 's/\s+/,/g' file
Replace all whitespace in file by commas, crossing line boundaries.
The resulting file consists of a single long line, with fields separated by commas. Such a format may be useful for importing to other programs.
perl -0777 -lape 's/\s+/\n/g' file
Replace each chunk of one or more whitespace characters in file by a single newline.
The resulting file consists of one "word" per line. This is useful for getting word statistics, as in the next example.
A one-line word frequency counter: it generates a list of all distinct "words" in the file, with their frequency of occurrence, sorted from the most frequent to least frequent.
The "sort" command sorts the words alphabetically. The "uniq" command eliminates duplicate words; with the "-c" option it also prints the number of occurrences. The second "sort" command, with the "-nr" option, sorts the resulting file numerically in descending order. Finally, the "less" command shows the result one page at a time.
Cool stuff
perl -lne 'print if "$_" eq reverse' file
Finds all palindromic lines in file. In particular, if each line contains a single number, it displays all palindromes among these numbers. If applied to a dictionary file (such as /usr/share/lib/dict/words), with one word per line, it displays the palindromes among the words.
perl -e '$n=1;while ($n++){sleep 1;print "\n$n is prime" if (("p" x $n) !~ /^((p)\2+)\1+$/)}'
Print out all prime numbers, one per second. (Note that the entire command must be on a single line.)