Sunday, January 25, 2009

RankSearch - Part II: parsing the Perl command line

RankSearch: The design
The design of our little script is laid out in the comment header from the last post:

1. Get the parameters from the user
  • In a first version, ensure that all parameters are filled
  • Later, we can provide a default value for the search engine (Google)
  • Or even display all the results for a list of supported search engines
2. Launch an HTTP request with the parameters given by the user
  • Spawn a process able to communicate back to our script
  • It will probably be in the form of "http://$engine-blabla-search_criteria-moreblabla"
  • Need to investigate different URLs for different search engines
3. Parse and display the results transmitted by the HTTP process
  • Search for the target URL
  • Keep track of rank count
  • Launch new HTTP request with updated page number if target not found
  • Display result to user
Today, I'll strike off the first item of the list. Time to get interactive!
While looking for a way to ease the handling of user input, I discovered that Perl includes the Getopt::Long module by default. Its page on CPAN shows all possible uses of the module.
GetOptions needs a reference to each variable so that it can store the parsed value in it, so be careful not to omit the "\" character before the variable name (like I did at first).
We'll only use string inputs:
use Getopt::Long;
GetOptions ("engine=s" => \$SearchEngine);
This will store in $SearchEngine the parameter entered from the following perl command line:
perl "$(FULL_CURRENT_PATH)" --engine www.google.com --target damienlearnsperl.blogspot.com --keyword "learn perl"
or
perl "$(FULL_CURRENT_PATH)" -engine www.google.com -target damienlearnsperl.blogspot.com -keyword "learn perl"
or even
perl "$(FULL_CURRENT_PATH)" -e www.google.com -t damienlearnsperl.blogspot.com -k "learn perl"
(provided only one option name passed to GetOptions starts with "e", so that the abbreviation is unambiguous; the same goes for "t" and "k")
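To see the abbreviation rule in action without typing command lines, here is a minimal sketch that feeds a simulated argument list to GetOptionsFromArray, a function of Getopt::Long that parses an array instead of @ARGV. The "export" option in the second call is purely hypothetical and is not part of RankSearch; it is only there to make "-e" ambiguous.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Simulated command line, same values as in the examples above
my @args = ('-e', 'www.google.com',
            '-t', 'damienlearnsperl.blogspot.com',
            '-k', 'learn perl');

my ($SearchEngine, $TargetURL, $Keyword) = ('', '', '');

# Each "\$var" is a reference, so the parser can write into the variable
my $ok = GetOptionsFromArray(\@args,
    'engine=s'  => \$SearchEngine,
    'target=s'  => \$TargetURL,
    'keyword=s' => \$Keyword,
);
print "unique prefixes parsed: ", ($ok ? "yes" : "no"), "\n";
print "engine=$SearchEngine keyword=$Keyword\n";

# Hypothetical second option starting with "e": "-e" is now ambiguous,
# Getopt::Long warns on STDERR and the call returns false
my @args2 = ('-e', 'www.google.com');
my ($Engine2, $Export2) = ('', '');
my $ok2 = GetOptionsFromArray(\@args2,
    'engine=s' => \$Engine2,
    'export=s' => \$Export2,   # assumption, for illustration only
);
print "ambiguous prefix parsed: ", ($ok2 ? "yes" : "no"), "\n";
```

The return value is worth checking in a real script too: it is false whenever the command line contained something Getopt::Long could not make sense of.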

Here's the script:
#!/usr/bin/perl -w
# --------------------------------------------------
# File   : RankSearch.pl
# Author : DLP
# Date   : January 24th 2009
# Object : Looks in a search engine what is the rank
#          for a given website and a given keyword
# Input  : - Search engine URL, eg. "www.google.com"
#          - URL of website to monitor
#          - Search expression to investigate
# Bugs   : None
# To do  : - Launch http request
#          - Read html result
#          - Analyse result and display website rank
# --------------------------------------------------
use strict;
use Getopt::Long;   #Load module

# Global variable
my $PROG_NAME = "RankSearch";
my $VERSION   = "v0.0.2";
my $PROG_DATE = "January 25th 2009";

# --------------------------------------------------
# Main program
# --------------------------------------------------
# More global variables
my $SearchEngine = "";
my $TargetURL = "";
my $Keyword = "";

#Parse command line arguments
GetOptions ("engine=s"  => \$SearchEngine,  #string
    "target=s"  => \$TargetURL,
    "keyword=s" => \$Keyword);

# Check user input
if ($SearchEngine eq "" || $TargetURL eq "" || $Keyword eq "")
{
    print "
You must enter a valid string for:
--engine  = search engine URL
--target  = the target of the search
--keyword = the search criteria
";
    exit;
}

print "
$TargetURL is ranked nth on the $SearchEngine
search engine for the \"$Keyword\" criteria.\n";

__END__
Jan 24 2009 (0.0.1): first version of RankSearch
Jan 25 2009 (0.0.2): get params from command line

After getting the parameters, we check to see if anything was entered at all.
If $SearchEngine, $TargetURL or $Keyword is still an empty string, then we print an error message and exit the program. The operator for a logical OR is "||" or.. "or"! The two differ in precedence: "or" binds much more loosely than "||", which makes no difference in a simple condition like this one.
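The difference between "||" and "or" comes down to precedence, and it shows up as soon as an assignment is involved. The sketch below is only an illustration: default_engine() is a hypothetical helper, not something the RankSearch script defines.

```perl
use strict;
use warnings;

# Hypothetical helper returning a fallback value
sub default_engine { return "www.google.com" }

my $empty = "";

# "||" binds tighter than "=": the right-hand side is ($empty || default),
# so the fallback ends up in the variable
my $with_pipes = $empty || default_engine();

# "or" binds looser than "=": Perl reads (my $with_or = $empty) or default,
# so the assignment happens first and the fallback value is discarded
my $with_or = $empty or default_engine();

print "with ||: '$with_pipes'\n";   # prints: with ||: 'www.google.com'
print "with or: '$with_or'\n";      # prints: with or: ''
```

Inside a plain condition such as the one in the script, both operators give the same result; "or" is mostly seen in flow control, as in open(...) or die.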

In Notepad++ you can modify the execute command line (via the F6 shortcut) to:
perl "$(FULL_CURRENT_PATH)" -e www.google.com --target damienlearnsperl.blogspot.com --keyword learn perl

The result will appear as:

[Screenshot: Notepad++ execution console]
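Note that the criterion entered by the user was "learn perl" but the script displayed it as "learn". Without quotes, the command line is split at the blank space, so only "learn" is taken as the value of --keyword and "perl" is left over as a separate argument. We'll just have to make sure that double quotes (") are used when the string input contains a blank space.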
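The effect of the missing quotes can be reproduced with two simulated argument lists, again using GetOptionsFromArray from Getopt::Long. This is a sketch with made-up variable names, showing what the parser receives in each case and what it leaves behind.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Without quotes, the command line arrives as two separate words
my @unquoted = ('--keyword', 'learn', 'perl');
# With quotes, it arrives as a single argument containing a space
my @quoted   = ('--keyword', 'learn perl');

my ($kw_unquoted, $kw_quoted) = ('', '');
GetOptionsFromArray(\@unquoted, 'keyword=s' => \$kw_unquoted);
GetOptionsFromArray(\@quoted,   'keyword=s' => \$kw_quoted);

# Parsed options are removed from the array; unparsed words stay in it
print "unquoted: keyword='$kw_unquoted', leftover=(@unquoted)\n";
print "quoted  : keyword='$kw_quoted', leftover=(@quoted)\n";
```

In the real script the leftover words end up in @ARGV after GetOptions returns, so one possible safeguard (not implemented here) would be to print the usage message whenever @ARGV is not empty.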

French expression of the day:
"Ce que femme veut, Dieu le veut": A woman's will is God's will
As you can see, God and strong-minded women are universal.

Next posts:
  • More about CPAN
  • Our first Perl program - Part III: Launch a HTTP request
  • How to install Google Analytics on your Blogger blog
  • Our first Perl program - Part IV: Read results from a HTML page
  • Perl help resources
  • Our first Perl program - Part V: Result analysis
  • POD
  • Our first Perl program - Part VI: Add a GUI interface

1 comment :

  1. Thank you for both posts! It was really helpful and useful... I think that everyone which offers help to the "slower" users should be appreciated. keep up with your good work!
