Thursday, February 12, 2009

Hashes and regular expressions

StrawberryPerl v5.10.0.4
I installed StrawberryPerl v5.10.0.4 on top of my existing directory yesterday and it cleaned up my module installs. I didn't have that many installed but it is a good thing to know for next time.

Perl open source project: Padre
Someone commented on my previous post about a Perl open source project to look at. Padre, the Perl integrated development environment (IDE), does look like an interesting project. I will start reading about it.

Twit.pl v0.1.0: reading from file, hashes and regexes
Version 0.1.0 of twit.pl adds the ability to read your twitter and identica login info (username and password) from a file.
The text file must have the following format:
#Should I be writing down my password in a file?
Twitter:TwitterUserName:TwitterPassword
Identica:IdenticaUserName:IdenticaPassword
  • Lines starting with '#' are comments and will be ignored by the script
  • The last line starting with "Twitter" (in upper or lower case) will be parsed for username and password. The fields must be separated by a colon ':'
  • Same with "Identica"
  • Save it as "twit.txt" in the same directory as twit.pl or use the --file (or -f) option on the command line just like this:
    perl twit.pl -s "Reading DamienLearnsPerl!" -f ~/secretstuff/mytwitterpassword.txt
    or
    perl twit.pl -s "Watching the snow falling" -f "C:\temp\pass.txt"
With this in mind, let's see how we can parse the file inside our script:
# ------------------------------------------------------------------------------
# Main
# ------------------------------------------------------------------------------
my $Status;
my $PasswordFile;
my %TwitLogin;
my %IdenticaLogin;

#Parse command line arguments
GetOptions ("status=s" => \$Status,
            "file=s" => \$PasswordFile);

# Read Password file passed as argument or twit.txt by default
$PasswordFile = "twit.txt" unless ($PasswordFile);
open(LOGINFILE, $PasswordFile) or die "Cannot open \"$PasswordFile\" file: $!\n";
while (<LOGINFILE>) {
    my $line = $_;
    my $PlaceHolder;

    chomp $line; # Remove trailing newline character
    next if ($line =~ m/^#/); # Ignore lines starting with '#'
    if ($line =~ m/^twitter/i) { # /^ indicates the beginning of the line
        ($PlaceHolder, $TwitLogin{"UserName"}, $TwitLogin{"Password"}) = split (/:/, $line);
    }
    if ($line =~ m/^identica/i) { # /i to ignore alphabetic case
        ($PlaceHolder, $IdenticaLogin{"UserName"}, $IdenticaLogin{"Password"}) = split (/:/, $line);
    }
}
close (LOGINFILE);

say SendMessage("Twitter", $Status, %TwitLogin) if ($TwitterUse);
say SendMessage("Identica", $Status, %IdenticaLogin) if ($IdenticaUse);
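
For comparison, here is a minimal sketch of the same parsing loop wrapped in a subroutine. It uses the three-argument form of open with a lexical filehandle, a style often recommended in newer Perl code. The ReadLogins name and the layout of the returned hash are made up for this illustration and are not part of twit.pl. The sketch also passes a limit of 3 to split so that a password containing ':' is not cut short.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of twit.pl): parse a login file and
# return a hash of hashes keyed by the lowercase site name.
sub ReadLogins {
    my ($FileName) = @_;
    my %Logins;

    # Three-argument open with a lexical filehandle; '<' means read mode
    open(my $LoginFile, '<', $FileName)
        or die "Cannot open \"$FileName\" file: $!\n";
    while (my $Line = <$LoginFile>) {
        chomp $Line;                  # Remove trailing newline character
        next if ($Line =~ m/^#/);     # Ignore lines starting with '#'
        if ($Line =~ m/^(twitter|identica):/i) {
            my $Site = lc($1);        # "Twitter" and "twitter" share one key
            # The limit of 3 keeps any ':' inside the password intact
            my (undef, $UserName, $Password) = split(/:/, $Line, 3);
            $Logins{$Site} = { UserName => $UserName, Password => $Password };
        }
    }
    close($LoginFile);
    return %Logins;
}
```

With this variant, my %Logins = ReadLogins("twit.txt"); would give access to $Logins{twitter}{UserName} and $Logins{identica}{Password}.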


New Perl concepts
  • Notice how global scalar variables $TwitUser = 'twitterlogin' and $TwitPass = 'twitterpasswd' are replaced by %TwitLogin
  • The % in front of TwitLogin means that this variable is a hash. For more details about hashes, see here. Right now, we just need to know that a hash is an unordered collection of scalar values, each one stored under a unique string key. The goal is to use the TwitLogin hash to group the user name and password in a single entity. They are individually accessible as scalars with $TwitLogin{"UserName"} and $TwitLogin{"Password"}.
  • Note how the GetOptions call got an extra entry for the --file option, which takes the name of the file where the login information is stored
  • The line open(LOGINFILE, $PasswordFile) associates the LOGINFILE filehandle with the external file named in $PasswordFile. By default, it opens the file in input (read) mode. If the call succeeds, open returns a true value and the interpreter does not execute the part of the line after the or operator: the expression (a or b) is already known to be true when a is true, so Perl does not evaluate b.
  • The die "msg" function prints the message to STDERR and exits from the program with an error status
  • Perl stores the last system error in the special variable $!, so printing it in the die message tells you why the file could not be opened (for example, because it does not exist).
  • while (<LOGINFILE>) will read the file attached to the LOGINFILE filehandle one line at a time until the end of the file.
  • The current line being read is stored in $_. This is another special Perl variable for the default input.
  • The chomp function removes the newline character from a string. In the twit.txt file, each line is terminated by a newline character, so we need to remove it before we can do operations on the line string.
  • next will tell the while loop to skip the remaining statements within the block and go back to testing the loop condition (here, whether there is another line to read before the end of the file).
  • next if ($line =~ m/^#/); will execute the next command if the current line matches the regular expression between //. There is lots to say about regexes, I'll keep it for another post if you don't mind. ^ marks the beginning of the line and # is the '#' character. The statement then means: "go to the next line in the file if the current line starts with the '#' character".
  • Finally, the split function will separate the line into several strings, based on the separator character (':' in our case). The list ($PlaceHolder, $IdenticaLogin{"UserName"}, $IdenticaLogin{"Password"}) on the left of the assignment receives the three strings created from the single line.
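
To make the hash and error-handling points above a little more concrete, here is a small stand-alone sketch. The login values and the missing file path are made up for illustration. It records the error in a variable instead of calling die so that the snippet keeps running, and it uses a lexical filehandle ($Handle) rather than a bareword like LOGINFILE.

```perl
use strict;
use warnings;

# A hash groups related scalars under string keys (made-up values)
my %TwitLogin = (UserName => "twitterlogin", Password => "twitterpasswd");

# A single value is a scalar, hence the $ sigil when reading it
my $User = $TwitLogin{"UserName"};
print "User name is $User\n";

# keys returns the keys in no guaranteed order, so sort them before printing
foreach my $Key (sort keys %TwitLogin) {
    print "$Key => $TwitLogin{$Key}\n";
}

# open returns false on failure, and only then is the right-hand side
# of 'or' evaluated. $! holds the reason given by the operating system.
my $ErrorText = "";
open(my $Handle, "<", "/no/such/directory/twit.txt")
    or $ErrorText = "Cannot open file: $!";
print "$ErrorText\n" if ($ErrorText);
```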
Changes to the SendMessage routine
The last two arguments have been replaced by a hash:
# ------------------------------------------------------------------------------
# Name : SendMessage
# Comment : Sends message to chosen microblogging site
# Input : $_[0] = Input string with value
#             "twitter" -> twitter.com instance
#             "identica" -> identi.ca instance
#         $_[1] = Message string to be sent
#         $_[2..] = Hash with "UserName" and "Password" elements (flattened into key/value pairs)
# Output : Return string: "Error" if couldn't create object or string from SendUpdate
# ------------------------------------------------------------------------------
sub SendMessage {
    my $ReturnString;
    my $Instance;
    my ($SiteName, $Message, %Login) = @_;

    $Instance = CreateObject($SiteName, %Login);
    if ($Instance) {
        $ReturnString = SendUpdate($Instance, $Message);
    }
    else {
        $ReturnString = "Error with $SiteName creation process";
    }
    return $ReturnString;
} #End of SendMessage
  • The @_ array is unpacked into the individual variables $SiteName, $Message and %Login to improve readability. The hash has to come last because it takes all the remaining arguments
  • The argument list of CreateObject is also changed to accept a hash
# ------------------------------------------------------------------------------
# Name : CreateObject
# Comment : Creates and returns an instance of the Net::Twitter class
# Input : - Input string with value
# "twitter" -> twitter.com instance
# "identica" -> identi.ca instance
# all other values return an error
# - Hash with "UserName" and "Password"
# Output : Object newly created or 0 if error
# ------------------------------------------------------------------------------
sub CreateObject {
    my $SiteInstance = 0;
    my $NameString = shift;
    my %Login = @_;

    $NameString =~ tr/A-Z/a-z/;
    if ($NameString eq "twitter") {
        $SiteInstance = Net::Twitter->new(username => $Login{"UserName"}, password => $Login{"Password"});
    }
    elsif ($NameString eq "identica") {
        $SiteInstance = Net::Twitter->new(identica => 1, username => $Login{"UserName"}, password => $Login{"Password"});
    }
    return $SiteInstance;
} # End of CreateObject
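
Both SendMessage and CreateObject receive the login hash through @_. Perl flattens a hash passed this way into a plain list of keys and values, which is why the hash is always the last thing unpacked. Here is a minimal sketch of that behaviour, with a made-up DescribeLogin routine standing in for the real ones:

```perl
use strict;
use warnings;

# Hypothetical routine, used only to show how a hash travels through @_
sub DescribeLogin {
    my $ArgumentCount = scalar(@_);  # Number of scalars actually received
    my $SiteName = shift;            # Removes and returns the first argument
    my %Login = @_;                  # The remaining key/value pairs rebuild the hash
    return "$SiteName login $Login{UserName} ($ArgumentCount arguments)";
}

my %TwitLogin = (UserName => "twitterlogin", Password => "twitterpasswd");
my $Description = DescribeLogin("twitter", %TwitLogin);
print "$Description\n";  # 1 site name + 2 keys + 2 values = 5 arguments
```

An alternative that twit.pl does not use is to pass a reference (\%TwitLogin), which keeps the hash as a single argument.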
SendUpdate() is unchanged except for the inclusion of a suggestion from kreetrapper.
    my $SiteName = "twitter.com";
    $SiteName = "identi.ca" if ($Site->{identica});
can be written in a single line using the ? operator:
    my $SiteName = ($Site->{identica})?"identi.ca":"twitter.com";
  • The ? conditional operator evaluates the condition to its left. If it is true, "identi.ca" is assigned to $SiteName, otherwise "twitter.com" is stored in the variable.
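
Here is a minimal stand-alone sketch of the two forms side by side. The $Site hash reference is a made-up stand-in for the Net::Twitter object, so the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Stand-in for the Net::Twitter object: a plain hash reference with the
# identica flag set (an assumption for illustration)
my $Site = { identica => 1 };

# Two-statement form
my $LongForm = "twitter.com";
$LongForm = "identi.ca" if ($Site->{identica});

# Same result with the conditional operator
my $ShortForm = ($Site->{identica}) ? "identi.ca" : "twitter.com";

print "$LongForm $ShortForm\n";
```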
The complete script can be downloaded from here (or go to http://sites.google.com/site/damienlearnsperl/DLP-scripts to select your version).

Larry Wall quote of the day:
"Hubris itself will not let you be an artist."

Possible next posts:

  • How to install and use Google Analytics on your Blogger blog
  • Improving on twit.pl: using more of the Net::Twitter API
  • Improving on twit.pl: Graphical User interface
  • Perl help resources
  • POD
