RabbitFarm

2020-10-25

Perl Weekly Challenge 083

Part 1

You are given a string $S with 3 or more words. Write a script to find the length of the string except the first and last words ignoring whitespace.

Solution


use strict;
use warnings;
##
# You are given a string $S with 3 or more words.
# Write a script to find the length of the string 
# except the first and last words ignoring whitespace.
##
sub count_most_words{
    my ($s) = @_;
    my $count = 0;
    my @a = split(/\s/, $s);
    map {$count += tr/a-zA-Z//d} @a[1 .. (@a - 2)];
    return $count;
}

MAIN:{
    my $S;
    $S = "The Weekly Challenge";
    print "$S --> " . count_most_words($S) . "\n";
    $S = "The purpose of our lives is to be happy";
    print "$S --> " . count_most_words($S) . "\n";
}

Sample Run


$ perl perl/ch-1.pl
The Weekly Challenge --> 6
The purpose of our lives is to be happy --> 23

Notes

Anytime I need to count characters I immediately think of tr. tr will, of course, do character replacements but it’s return value is the number of characters which have been effected. Here with the d option the matching characters are just deleted. This code would work exactly just as well with the d but I figured it’d be actually less confusing to have it there and make it more clear what it was doing.

Part 2

You are given an array @A of positive numbers. Write a script to flip the sign of some members of the given array so that the sum of the all members is minimum non-negative.


use strict;
use warnings;
##
# You are given an array @A of positive numbers.
# Write a script to flip the sign of some members 
# of the given array so that the sum of the all 
# members is minimum non-negative.
##
sub try_all_flips{
    my(@a) = @_;
    my @minimum = (undef, undef, []); 
    for my $i (0 .. (2**(@a) - 1)){
        my $b = sprintf("%0" . @a . "b", $i); 
        my @b = split(//, $b);
        my $flip_count = 0;
        map {$flip_count++ if $_ == 1} @b;
        my @f;
        for my $i (0 .. (@b - 1)){
            if($b[$i] == 1){
                push @f, (-1) * $a[$i];
            }
            else{
                push @f, $a[$i];
            }
        }
        my $sum = unpack("%32I*", pack("I*", @f)); 
        if(!defined($minimum[0]) || ($sum <= $minimum[0] && $sum >= 0)){
            if(defined($minimum[0]) && $sum == $minimum[0] && $flip_count < $minimum[1]){
                $minimum[0] = $sum;
                $minimum[1] = $flip_count;
                $minimum[2] = \@f;
            }
            elsif(!defined($minimum[0])){
                $minimum[0] = $sum;
                $minimum[1] = $flip_count;
                $minimum[2] = \@f;
            }
            elsif($sum < $minimum[0]){
                $minimum[0] = $sum;
                $minimum[1] = $flip_count;
                $minimum[2] = \@f;
            }
        }
    }
    return @minimum;
}

MAIN:{
    my @A;
    my @minimum;
    @A = (3, 10, 8);
    @minimum = try_all_flips(@A);
    print "[". join(", ", @A) . "] --> ";
    print  " [". join(", ", @{$minimum[2]}) . "] = " . $minimum[0] ."\n";
    @A = (12, 2, 10);
    @minimum = try_all_flips(@A);
    print "[". join(", ", @A) . "] --> ";
    print  " [". join(", ", @{$minimum[2]}) . "] = " . $minimum[0] ."\n";
}

Sample Run


$ perl perl/ch-2.pl
[3, 10, 8] -->  [3, -10, 8] = 1
[12, 2, 10] -->  [-12, 2, 10] = 0

Notes

This is a brute force approach. I use the same method for generating all combinations that I used in Challenge 077 with the variation that here I use the generated combination to determine which elements of the list are to be flipped. After calculating the sum of each new list (with flipped elements) I check to see if this is a new minimum positive value and, if so, if it has been done with fewer flips.

posted at: 01:01 | path: /perl | permanent link to this entry

2020-10-18

Perl Weekly Challenge 082

Part 1

You are given 2 positive numbers $M and $N. Write a script to list all common factors of the given numbers.

Solution


use strict;
use warnings;
##
# You are given 2 positive numbers $M and $N.
# Write a script to list all common factors of the given numbers.
##
sub factor{
    my($n) = @_;
    my @factors = (1);
    foreach my $j (2..sqrt($n)){
        push @factors, $j if $n % $j == 0;
        push @factors, ($n / $j) if $n % $j == 0 && $j ** 2 != $n;
    }    
    return @factors;  
}

sub common_factors{
    my($m, $n) = @_;
    my @common_factors = grep { my $f = $_; grep { $f == $_ } @{$n}} @{$m};
    return @common_factors;
}


MAIN:{
    my $M = 12;
    my $N = 18;
    my @m_factors = factor($M);
    my @n_factors = factor($N);
    print "(" . join(",", common_factors(\@m_factors, \@n_factors)) . ")\n";
}

Sample Run


$ perl perl/ch-1.pl
(1,2,6,3)

Notes

I have used sub factor previously, back in Challenge 008. The most interesting thing in this solution is probably the nested grep’s. In order to nest them properly you need to create a local variable to hold the element being examined in the outer grep block. Here I use $f. Although we need just two grep’s here this trick can be used to nest them more deeply.

Part 2

You are given 3 strings; $A, $B and $C. Write a script to check if $C is created by an interleaving of $A and $B. Print 1 if check is success otherwise 0.


use strict;
use warnings;
##
# You are given 3 strings; $A, $B and $C.
# Write a script to check if $C is created by interleave $A and $B.
# Print 1 if check is success otherwise 0.
##
sub find_remove{
    my($s, $x) = @_;
    my $i = index($s, $x);
    if($i != -1){
        substr $s, $i, length($x), "";
        return $s;
    }
    return undef;
}
MAIN:{
    my $A = "XY";
    my $B = "X";
    my $C = "XXY";
    my $s = find_remove($C, $A);
    if($s && $s eq $B){
        print "1\n";
        exit;
    }
    else{
        $s = find_remove($C, $B);
        if($s && $s eq $A){
            print "1\n";
            exit;
        }
    }
    print "0\n";
}

Sample Run


$ perl perl/ch-2.pl
1

Notes

I believe this is the most straightforward way of tackling this problem. By checking for both $A and $B as substrings and removing them if found we can determine if there was an interleaving by checking to see if the other remains.

posted at: 01:42 | path: /perl | permanent link to this entry

2020-10-11

Perl Weekly Challenge 081

Part 1

You are given 2 strings, $A and $B. Write a script to find out common base strings in $A and $B.

Solution


use strict;
use warnings;
##
# You are given 2 strings, $A and $B.
# Write a script to find out common base strings in $A and $B.
## 
use boolean;

sub contains{
    my($s0) = @_;
    return sub{
        my($s) = @_;
        return [true, $s0] if($s =~ m/^($s0)+$/g);
        return [false, $s0];
    }
}

sub make_checks{
    my($s) = @_;
    my @letters = split(//, $s);
    my @checks;
    for my $i (0 .. (int(@letters/2 - 1))){
        push @checks, contains(join("", @letters[0 .. $i]));
    }
    return @checks;
}

MAIN:{
    my($A, $B);
    #$A = "aaaaaaa";
    #$B = "aaaaaaaaaaaaaaaaaa";
    $A = "abcdabcd";
    $B = "abcdabcdabcdabcd";
    my @checks = make_checks($A);
    for my $check (@checks){
        if($check->($A)->[0] && $check->($B)->[0]){
            print $check->($A)->[1] . "\n";
            exit;
        }
    }
}

Sample Run

$ perl perl/ch-1.pl
abcdabcd, abcdabcdabcdabcd --> (abcd abcdabcd)
aaa, aa --> (a)

Notes

I used a similar technique to what I used in Challenge 003 and Challenge 004 where I create an array of anonymous functions which each check different substrings. The functions are created using what is called currying whereby we pass in a parameter (or multiple parameters if we needed to!) to a function which creates a closure around those parameters and returns a new function. This is, I admit, not all that necessary! I repeat the method though out of a sense of nostalgia. We are now on Challenge 081!

Part 2

You are given file named input. Write a script to find the frequency of all the words. It should print the result as first column of each line should be the frequency of the word followed by all the words of that frequency arranged in lexicographical order. Also sort the words in the ascending order of frequency.


use strict;
use warnings;
##
# You are given file named input.
# Write a script to find the frequency of all the words.
# It should print the result as first column of each line should be the frequency of the 
# word followed by all the words of that frequency arranged in lexicographical order. Also 
# sort the words in the ascending order of frequency.
##
MAIN:{
    my %counts;
    my %count_words;
    my $s;
    {    local $/;
         $s = ;
    }
    $s =~ s/'s//g;
    $s =~ tr/."(),//d;
    $s =~ tr/-/ /;
    my @words = split(/\s+/, $s);
    for my $word (@words){
        $counts{$word}++;
    }
    for my $k (keys %counts){
        my $count = $counts{$k};
        push @{$count_words{$count}}, $k;
    }
    for my $k (sort keys %count_words){
        print $k . "\t" . join(" ",  sort {$a cmp $b} @{$count_words{$k}}) . "\n";
    }
}


__DATA__ 
West Side Story

The award-winning adaptation of the classic romantic tragedy "Romeo and
Juliet". The feuding families become two warring New York City gangs,
the white Jets led by Riff and the Latino Sharks, led by Bernardo. Their
hatred escalates to a point where neither can coexist with any form of
understanding. But when Riff's best friend (and former Jet) Tony and
Bernardo's younger sister Maria meet at a dance, no one can do anything
to stop their love. Maria and Tony begin meeting in secret, planning to
run away. Then the Sharks and Jets plan a rumble under the
highway--whoever wins gains control of the streets. Maria sends Tony to
stop it, hoping it can end the violence. It goes terribly wrong, and
before the lovers know what's happened, tragedy strikes and doesn't stop
until the climactic and heartbreaking ending.

Sample Run

$ perl perl/ch-2.pl
1   But City It Jet Juliet Latino New Romeo Side Story Their Then West York adaptation any anything at award away become before begin best classic climactic coexist control dance do doesn't end ending escalates families feuding form former friend gains gangs goes happened hatred heartbreaking highway hoping in know love lovers meet meeting neither no one plan planning point romantic rumble run secret sends sister streets strikes terribly their two under understanding until violence warring what when where white whoever winning wins with wrong younger
2   Bernardo Jets Riff Sharks The by it led tragedy
3   Maria Tony a can of stop
4   to
9   and the

Notes

I have to admit that sometimes, even after many years of using Perl, that if I don’t use a certain feature often enough that I end up getting a little surprised. The surprise here is that Perl is clever enough to know that if I am trying a push onto a hash value which is undef, such as when I first do a push @{$count_words{$count}}, $k; for a new value of $k, that a new array is created. No need to check for undef and create a new one manually. This is called autovivification and while a very common thing in Perl for whatever reason it managed to catch me a little by surprise this time around. Probably due to my working a lot in other languages recently that don;t have this feature! Gabor has a nice writeup on autovivification for anyone interested in reading more.

posted at: 17:56 | path: /perl | permanent link to this entry

2020-10-04

Perl Weekly Challenge 080

Part 1


use strict;
use warnings;
##
# You are given an unsorted list of integers @N.
# Write a script to find out the smallest positive number missing.
##
sub least_missing{
    my(@numbers) = @_;
    @numbers = sort @numbers;
    for my $i ($numbers[0] .. $numbers[@numbers - 1]){
        my @a = grep { $_ == $i } @numbers;
        return $i if(!@a && $i > 0);
    }
    return undef;
}

MAIN:{
    my @N;
    @N = (5, 2, -2, 0);
    my $least_missing = least_missing(@N);
    print "The least mising number from (" .
        join(",", @N) . ") is $least_missing\n";
    @N = (1, 8, -1);
    $least_missing = least_missing(@N);
    print "The least mising number from (" .
        join(",", @N) . ") is $least_missing\n";
    @N = (2, 0, -1);
    $least_missing = least_missing(@N);
    print "The least mising number from (" .
        join(",", @N) . ") is $least_missing\n";
}

Sample Run

$ perl perl/ch-1.pl
The least mising number from (5,2,-2,0) is 1
The least mising number from (1,8,-1) is 2
The least mising number from (2,0,-1) is 1

Notes

The list is given in arbitrary order so the first thing to do is to sort it. Once in sorted order iterate from the least number to the highest, incrementing by one at each step. Perl makes this easy with the range (aka flip-flop) operator. Each each iteration see if the current number is from the original list or not and if not, then if it is the smallest positive number not yet seen, which really just means the first positive number not from the original list.

As I am writing this I realize that it’d make sense to use grep to remove all the negative numbers from the list before even bothering to sort them. If the list were presented as, say, 1 million negative numbers and then three positive ones why waste doing anything with all the negatives!

Part 2


use strict;
use warnings;
##
# You are given rankings of @N candidates.
# Write a script to find out the total candies needed for all candidates. 
# You are asked to follow the rules below:
#     a) You must given at least one candy to each candidate.
#     b) Candidate with higher ranking get more candies than their immediate
#        neighbors on either side.
##
sub count_candies{
    my(@candidates) = @_;
    my $candies = @candidates;
    for my $i (0 .. (@candidates - 1)){
        if($candidates[$i - 1]){
            $candies++ if $candidates[$i] > $candidates[$i - 1];
        }   
        if($candidates[$i + 1]){
            $candies++ if $candidates[$i] > $candidates[$i + 1];
        }
    }
    return $candies;
}


MAIN:{
    my @N;
    my $number_candies;
    @N = (1, 2, 2);
    $number_candies = count_candies(@N);
    print "The number of candies for (" .
        join(",", @N) . ") is $number_candies\n";
    @N = (1, 4, 3, 2);
    $number_candies = count_candies(@N);
    print "The number of candies for (" .
        join(",", @N) . ") is $number_candies\n";
}

Sample Run

$ perl perl/ch-2.pl
The number of candies for (1,2,2) is 4
The number of candies for (1,4,3,2) is 7

Notes

I don’t think there are any surprises in this approach. In fact, I could not think of a better way, in terms of efficiency, than this. Still, this is not exactly exciting code to read!

posted at: 15:26 | path: /perl | permanent link to this entry

2020-09-27

Perl Weekly Challenge 079

Part 1

You are given a positive number $N. Write a script to count the total number of set bits of the binary representations of all numbers from 1 to $N and return $total_count_set_bit % 1000000007.

Solution

“ch-1.pl”

Sample Run

$ perl perl/ch-1.pl
5 % 1000000007 = 5
4 % 1000000007 = 4

Notes

The approach here is to continuously shift bits off to the right, checking to see if the bit about to be shifted off is set or not. This is a pretty standard pattern and it looks pretty much the same in C++ and Prolog too!

Part 2

You are given an array of positive numbers @N. Write a script to represent it as Histogram Chart and find out how much water it can trap.

“ch-2.pl”

Sample Run

ch-2.pl output

Notes

This is one of the more fun sorts of problems that come up in these challenges! It is somewhat similar to the “leader problem” from last week in that we are given an array of numbers and need to do a similar set of look ahead comparisons. Here we look ahead in the array to determine if what I call buckets exist. Whatever buckets are found are then used to compute the total volume as specified.

References

https://users.cs.cf.ac.uk/Dave.Marshall/PERL/node36.html

posted at: 17:34 | path: /perl | permanent link to this entry

2020-09-20

Perl Weekly Challenge 078

Part 1

You are given an array @A containing distinct integers. Write a script to find all leader elements in the array @A. Print (0) if none found.

Solution

“ch-1.pl”

Sample Run

$ perl perl/ch-1.pl 6
@A = (9,10,7,5,6,1)
Leaders = (10,7,6,1)
@A = (3,4,5)
Leaders = (5)

Notes

The approach here is to just repeating the checks for smaller elements. Nothing too fancy. In fact, I actually thought that there might be room for some fun and over engineered way of doing this but I couldn’t really come up with anything that wouldn’t just be obfuscated!

Part 2

You are given array @A containing positive numbers and @B containing one or more indices from the array @A. Write a script to left rotate @A so that the number at the first index of @B becomes the first element in the array. Similary, left rotate @A again so that the number at the second index of @B becomes the first element in the array.

“ch-2.pl”

Sample Run

$ perl perl/ch-2.pl
@A = (10,20,30,40,50)
@B = (3,4)
Rotations:
[40,50,10,20,30]
[50,10,20,30,40]
@A = (7,4,2,6,3)
@B = (1,3,4)
Rotations:
[4,2,6,3,7]
[6,3,7,4,2]
[3,7,4,2,6]

Notes

Another straight forward one. For each value in @B I shift and push the respective number of times.

posted at: 00:27 | path: /perl | permanent link to this entry

2020-09-13

Perl Weekly Challenge 077

Part 1

You are given a positive integer $N. Write a script to find out all possible combination of Fibonacci Numbers required to get $N on addition.

Solution

“ch-1.pl”

Sample Run

$ perl perl/ch-1.pl 6
1 + 5 = 6
1 + 2 + 3 = 6
1 + 5 = 6
1 + 2 + 3 = 6

$ perl perl/ch-1.pl 9
1 + 8 = 9
1 + 3 + 5 = 9
1 + 8 = 9
1 + 3 + 5 = 9
1 + 1 + 2 + 5 = 9

Notes

The approach here is to generate all Fibonacci terms up to $n and then evaluate all combinations of these terms and see which sum to $n. The most interesting part here is perhaps how the combinations are generated. Here we are making use of a convenient property that there are 2^n -1 subsets of the set {1..n}. So we generate numbers $b as an n-digit binary number using sprintf. We correspond the binary digits of k to indices into the Fibonacci sequence and select a Fibonacci term if its bit value is 1.

Also note that the Fibonacci sequence starts off as 1, 1, 2, 3, 5, … and so while we may see two 1s these are separate, and not repeated, Fibonacci terms.

Part 2

You are given m x n character matrix consists of O and X only. Write a script to count the total number of X surrounded by O only. Print 0 if none found.

“ch-2.pl”

Sample Run

perl perl/ch-2.pl
O O X
X O O
X O O
1 X found at Row 1 Column 3.

O O X O
X O O O
X O O X
O X O O
1st X found at Row 1 Column 3.
2nd X found at Row 3 Column 4.

Notes

I made use of Neil Bowers Lingua::EN::Numbers::Ordinate to make the output look a little nicer.

Otherwise think this is straight forward enough…for each test matrix identify the Xs and then check to see if they are “lonely”. The number of possible adjacent space to check is no more than eight as shown in the code so we check these all, if they exist.

posted at: 16:54 | path: /perl | permanent link to this entry

2020-09-06

Perl Weekly Challenge 076

I did this week's Perl Weekly Challenge in both Perl and Prolog. The Prolog solutions were good practice in shaking the rust off my logic programming but I won't discuss them here, to keep things short. The code for the Prolog solutions for Part 1 and Part 2 are on GitHub. 

Part 1

You are given a number $N. Write a script to find the minimum number of prime numbers required, whose summation gives you $N. For the sake of this task, please assume 1 is not a prime number.

For this solution I used a pre-computed list of the first 1000 prime numbers. Larger pre-computed lists are available and of course computing them directly is always an option too! For the purposes of this challenge it seemed a pre-computed list would be OK.

Code

ch-1.pl (with list of pre-computed primes cropped)
ch-1.pl (with list of pre-computed primes cropped)

Sample Run

$ perl perl/ch-1.pl 9
7 + 2 = 9

$ perl perl/ch-1.pl 87
83 + 2 + 2 = 87

Part 2

Write a script that takes two file names. The first file would contain word search grid as shown below. The second file contains list of words, one word per line. You could even use local dictionary file.

Print out a list of all words seen on the grid, looking both orthogonally and diagonally, backwards as well as forwards.

I opted to hard code the grid and load a small dictionary file. This file was obtained from http://www-personal.umich.edu/~jlawler/wordlist.html and contains 60,000 of the most common english words. Ultimately this yield a lot of awkward looking two and three letter words which, frankly, I do not personally consider all that common. I used the full dictionary but filter the results to only words with 4 or more letters.

ch-2.pl (with some utility functions cropped)
ch-2.pl (with some utility functions cropped)

The approach is straightforward:

Sample Run

$ perl perl/ch-2.pl
Found the following words: align,alls,ante,arare,aras,aras,argos,baas,bide,blunt,bosc,broad,buff,butea,cart,cess,cold,cord,demi,depart,departed,doth,dust,ebro,enter,etna,eves,filch,garlic,gila,goat,gram,grieve,grit,hani,hazard,heed,laic,lien,luge,lune,mali,malign,malignant,mall,mein,mero,midst,need,oats,ough,ought,ovary,part,parte,parted,quash,rape,rara,rare,rast,road,roccus,ruse,sara,sara,shed,shrine,slag,slug,social,spasm,spasmodic,succor,succors,theorem,togo,trap,tsar,vary,virus,visa,wigged,zing

posted at: 21:45 | path: /perl | permanent link to this entry

2020-08-30

Largest Rectangle Histogram

Perl Weekly Challenge 075 asks us to find the maximum rectangle of a histogram as follows

You are given an array of positive numbers @A.
Write a script to find the largest rectangle histogram created by the given array.

Looking at the above histogram, the largest rectangle (4 x 3) is formed by columns (4, 5, 3 and 7).
Output: 12

This is a fun sort of problem that seems to be right at home in a programming challenge and seldom seen in more typical scenarios!

My approach to this is as follows:

  1. Compare each value V in the histogram against the others. 
  2. If V is greater than the one being compared against set rectangle size to 0.
  3. If V is less than or equal to the one being compared against increase rectangle by V.
  4. We determine the maximum rectangle size by comparing all the computed rectangle sizes as they are generated.

In code:

ch-2.pl
ch-2.pl

Sample Run

sample output from ch-2.pl
sample output from ch-2.pl


posted at: 06:09 | path: /perl | permanent link to this entry

2020-08-27

All Combinations Equal to a Sum in Perl and Prolog

A problem that comes up surprisingly often is that we want to determine lists of numbers which sum to some given number S.

For example, let's look at Perl Weekly Challenge 075

You are given a set of coins @C, assuming you have infinite amount of each coin in the set.

Write a script to find how many ways you make sum $S using the coins from the set @C.

Example:
Input:     
   @C = (1, 2, 4)     
   $S = 6  
Output: 6
There are 6 possible ways to make sum 6.
a) (1, 1, 1, 1, 1, 1)
b) (1, 1, 1, 1, 2)
c) (1, 1, 2, 2)
d) (1, 1, 4)
e) (2, 2, 2)
f) (2, 4)

Such a problem is a nice application for a logic programming approach!

Here is a first attempt with a pure Prolog solution (with SWI-Prolog).

ch-1.p
ch-1.p

Sample Run

$ swipl -s prolog/ch-1.p -g main

[[1,1,1,1,1,1],[2,1,1,1,1],[1,2,1,1,1],[1,1,2,1,1],[2,2,1,1],[4,1,1],[1,1,1,2,1],[2,1,2,1],[1,2,2,1],[1,4,1],[1,1,1,1,2],[2,1,1,2],[1,2,1,2],[1,1,2,2],[2,2,2],[4,2],[1,1,4],[2,4]]

We have some duplicate solutions to clean up, but since this is a Perl centric challenge let's do it in Perl and while we're at it, we'll also have Perl handle the input and output.

ch-1.pl
ch-1.pl

Sample Run

$ perl perl/ch-1.pl 6 "1,2,4"
(1,1,1,1,2)
(1,1,4)
(1,1,1,1,1,1)
(1,1,2,2)
(2,2,2)
(2,4)

The deduplication is handled by storing the resulting lists as hash keys. In order to use array references as hash keys we need to use Hash::MultiKey. The target sum and coin value list are set in the Prolog code via string substitution. Of course these could have been set as parameters and passed in the normal way but this seemed slightly more fun!

posted at: 21:37 | path: /perl | permanent link to this entry

2020-08-09

n! has how many trailing zeroes?

Perl Weekly Challenge 072 asks 

You are given a positive integer $N (<= 10). Write a script to print number of trailing zeroes in $N!.

Note that the number of zeroes is equivalent to the number of factors of 10 in n!. For example, 4! = 4 * 3 * 2 * 1 which has none. 6! = 6 * 5 * 4 * 3 * 2 * 1 = 6 * 4 * 3 *1 * (5 * 2) has one. 10! = 10 * 9 * 8 * 7 * 6 * 4 * 3 * 1 * (5 * 2) has two.

In this case, for n <= 10, we can can these cases directly.

sub count_10s{
   my($n) = @_;
   return 2 if $n == 10;
   return 1 if $n >=5 && $n < 10;
   return 0 if $n < 5;
}
 

If we wanted to solve this for the general case we would have to follow an algorithm that would look like this (pseudo code)

Function count 10s   

    Pass In: n, a whole number   

    Doing: For each term of n! = 1 * 2 * 3 * ... * n perform a prime factorization.
                Save all prime factors for all terms to an array. Count each pair of
               (2,5). This is the number of factors of 10 in n!

    Pass Out: the number of factors of 10 in n! 

Endfunction 

posted at: 18:46 | path: /perl | permanent link to this entry

Using -s with Perl One Liners

For Perl Weekly Challenge 072 we are asked the following:

You are given a text file name $file and range $A -  $B where $A <= $B. Write a script to display lines range $A and $B in the given file.

I was a bit surprised to realize how this could be done with a one liner. In the example below input.txt is a plain text file which has 100 lines that all look like LX for 1 <= X <= 100.

$ perl -s -n -e 'print if $. >= $A && $. <= $B' — -A=4 -B=12 < input.txt

L4
L5
L6
L7
L8
L9
L10
L11
L12

The use of the special variable $. to track input line numbers is a common Perl idiom. What was surprising to me was that I was unfamiliar with the -s command line option. This option allows you to set variables on the command line. Anything after the is interpreted to be a variable initialized to the given value. You can see that in the example above where -A=4 and -B=12 creates variables $A and $B initialized to 4 and 12 respectively. 

If you do not set the variable to something then it is just initialized as true. For example:

$ perl -s -e 'print "$x\n";' — -x
1


posted at: 01:34 | path: /perl | permanent link to this entry

2019-05-26

Perl Weekly Challenge 009

This week's Perl Weekly Challenge presented a problem I hadn't really thought of before: how to do a ranking or what is mathematically referred to as a weak ordering.

Part 1

Sample Run

$ perl perl5/ch-1.pl
First square with five distinct digits: 12769 (= 113 * 113)

I think this was straightforward enough of a problem with a straightforward solution: start iterating through integers until you find one that has the property we want. The number must have a square root of at least 100 since it has at least five digits so that is our starting point. Counting the unique digits (see line 12) is done using a clever trick suggested by Prakash Kailasa via a Gabor Szabo Perl Maven blog. The trick is to create an anonymous hash with the values from the array as the keys. This will remove the duplicates since hash keys must be unique. Originally this could be done without the %{} around the map {$_ => 1} because for a time keys() could be applied to a hash reference however this is no longer allowed in perl-5.28.0 (I think it was removed in perl-5.14.0?).

Part 2

Sample Run

$ perl perl5/ch-2.pl
Name Score Rank
NEH 2 1
FED 3 2
OKN 3 2
DRZ 4 4
NPK 5 5
XRK 5 5
WXV 7 7
QHC 9 8
IEE 10 9
CYP 10 9
================
NEH 2 1
FED 3 3
OKN 3 3
DRZ 4 4
NPK 5 6
XRK 5 6
WXV 7 7
QHC 9 8
IEE 10 10
CYP 10 10
================
NEH 2 1
OKN 3 2
FED 3 2
DRZ 4 3
XRK 5 4
NPK 5 4
WXV 7 5
QHC 9 6
CYP 10 7
IEE 10 7

This presented some nice opportunities for creativity. It's a standing rule, as far as I know, that if a challenge statement has any ambiguity than you can simply interpret it as you like. This leads to solutions having a variety of styles and flavors. The problem statement didn't suggest what in particular we should rank so I decided to do a ranking of basic Perl objects. These are stored in an array, sorted, and then sent to the ranking functions. The sorting and ranking functions were written as generically as possible to allow for any sort of object as long as it has an accessor that returns a single numeric value. A rare bit of explicitly polymorphic code in Perl.

I think that the ranking logic would have been a little less convoluted with a better choice of data structures. I think a doubly linked list (i.e. with forward and previous references) would have been best. Still, the Standard and Dense rankings did not cause too much trouble to implement. The Modified ranking was the most complex. A concise definition from Wikipedia is that, for the Modified ranking, each item's ranking number is equal to the number of items ranked equal to it or above it. But what about ties for first place? And multi-way ties? Based on the sample output shown I think I achieved all the goals of the modified ranking but this is a good example of having someone else check your work! Especially a domain expert, someone who may have worked with this ranking many times in the past. Some searches didn't reveal much use of the Modified ranking. I did find a suggestion on how to address ties for first in Modifed Ranking here. Overall documentation of examples implementing the Modified ranking was otherwise not turning up in my searches.

I did make use of a core module which I do not ordinarily use, Tie::RefHash. For this particular problem I found it convenient to use the object references as hash keys and to do so requires the use of this module.

posted at: 12:47 | path: /perl | permanent link to this entry

2019-05-18

There is nothing wrong with use Threads;

Fight me

In a previous article I showed how a significant performance improvement can be made by re-implementing key parts of your Perl code in C++. This was done as an example of improving a brute force computation of the first five perfect numbers. In my writeup of that initial problem I discussed how the best way to improve the speed of that computation was an algorithmic improvement, using well known mathematical relationships to compute the perfect numbers directly and not doing a large scale brute force evaluation of tens of millions of numbers. Still, the fact remains that substantial performance improvements can be made through either one or both of (a) parallelizing the problem and spreading the computation out to run concurrently in pieces and (b) implementing the most computationally intense parts of the code in a compiled language such as C++. (b) was discussed previously, as mentioned above, here we'll take a look at (a). Specifically we'll look at using the perl Threads module. First, though, let's discuss the use of the Threads module.

Threads in Perl

If you want to take full advantage of your hardware and divide computationally intensive tasks across all your CPU cores, and have your code be fully portable, then you need to use Threads. Perhaps the only other option would be to use an MPI implementation, but that introduces significant complexity. Ignoring portability you could you use a fork(), perhaps using one of several modules that makes using fork more convenient. If you choose to ignore having any meaningful benefit to dividing your computationally intense code into different threads of execution than there are modules such as Coro which provide co-operative threads. Co-operative threads share the same CPU so using them for a significant computation would be for cosmetic code organizing purposes only. For non-computational tasks they may provide some benefit beyond the merely cosmetic. For example, when reading from several data sources you can avoid reading them completely in sequence and instead read a little from each until all reads complete. Note that this mediocre sort of threading is as good as you will ever get if you are a Python developer. Well, at least if you use the main CPython implementation of that language. Jython makes full use of the JVM's very nice multi-threading abilities. Most Python developers don't know or care about this limitation because they are mostly just calling wrappers to powerful libraries written in C++ such as TensorFlow. Those libraries of course make full use of C++'s multi-threading as well as GPU APIs. 

Now, if you look at the Threads documentation you will find it's written in this extremely hand wringing tone all but begging you not to use them. This is ridiculous and the result of an absurd argument from people that are loud pedantic jerks mean well. The fact is that the Threads module is not implemented in the way an experienced programmer might expect and so it comes with a decent list of use cases which must be avoided. This is mostly because each thread is given its own Perl interpreter. Many all of these cases to avoid are fairly obscure such as "don't use END blocks in a thread". Yeah, sure. Thanks for the heads up.

Indeed I am not aware of any actual horror stories of the Threads module causing trouble in practice. Those that are aware of their limitations either work around them as necessary or write test cases in an effort to bemoan these limitations.

If you're breaking up a tough computation into smaller pieces to take advantage of multiple CPUs Threads provide a simple interface to unleash a lot of processing power. Just read the documentation first and you'll be fine. 

The Code

Original code but now threaded

This is the same code as used in the initial problem but with Threads. Getting a rough idea of run time with time we see that it takes about 78 minutes. That is coming down from the original take ~257m. So a little over a 3x speedup. 

$ time perl ch-1t.pl
6 28 496 8128 33550336 

real 77m58.142s
user 268m17.491s
sys 0m21.845s

We already saw that a substantial speedup is obtained by re-writing the factoring and perfect checking in C++. What if we combined that code with Threads?

Native Library + Threads

$ time perl -Iext -Iext/blib/arch/auto/Perfect ch-1xt.pl
6 28 496 8128 33550336 

real 5m40.543s
user 20m6.924s
sys 0m4.594s

Great! We are now able to find the first five perfect numbers with brute force in a little over 5½ minutes. That's a roughly 45x speedup from the original pure (single threaded) perl code.

Notes:

Summary

This is the final chapter in a story that started by taking a naive computational approach to identifying perfect numbers and improving it first by re-implementing parts in C++ and then by introducing Threads. Both lead to noteworthy improvement in performance, in fact, a very significant improvement when taken together. 


posted at: 13:10 | path: /perl | permanent link to this entry

2019-05-16

Integrating C++ and Perl with SWIG

In another article I mentioned that it's possible to call code written in C++ from Perl. Perl offers two ways to do that:

  1. Perl's XS (eXternal Subroutines)
  2. SWIG

Here SWIG will be used. I like SWIG better than XS because it is somewhat more straightforward to integrate with C++ (XS is much more C-centric) and also because the steps necessary to wrap C++ code for use by Perl can be readily modified to work with other languages whereas XS will only let you integrate with Perl. In other words, if you have a significant C++ library that you would like to make available to a variety of programming languages SWIG is your best bet!

This came up when discussing the solution to one of the Perl Weekly Challenges in which the problem was to computer Perfect Numbers. The issue is that Perfect Numbers occur very sparsely and a brute force method to find them when written in even a fast interpreted language such as Perl will take a long time as it searches through many millions of numbers. The preferred solution to speeding up that computation is to use a little mathematical sophistication and use one of the several well known methods for computing Perfect Numbers directly. Algorithmic improvements lead to the greatest performance gains.

Often times, however, in practice a significant algorithmic improvement is not forthcoming or the time to spend researching it is disallowed by, say, project schedules. In these cases what we may want to do is write the computationally intensive parts of the code in a compiled language such as C++ and call it from our Perl code. This will allow the use of Perl to more rapidly develop the system as a whole while, as needed, certain parts are re-written in a way that improve overall performance.

The original code is shown in the previous article. That script, written only in Perl took nearly four and half hours to find the first five Perfect Numbers. 

$ time perl perl5/ch-1.pl    
6 28 496 8128 33550336 

real 257m33.621s
user 255m52.169s
sys 0m24.603s

Let's see what speedup we get from re-implementing the most computationally intense parts of the code in C++.

Overview

First be sure to install swig! Otherwise it's assumed you already have a C++ compiler and Perl installation.

I am using OS X 10.14 "Mojave" with perlbrew installation of perl-5.28. I use macports and have installed gcc9.

The swig installation is done in two parts: 

    $ port install swig
   $ port install swig-perl

Other systems, such as Linux, will have similarly named packages available.

I am using swig 3.0.12. swig 4.0 was released a few weeks prior to this writing but is not yet available on macports. That is fine, none of the latest features are necessary.

Also, I will save the extra issues related to creating a distributable module (e.g. suitable for uploading to CPAN) for a later article. 

Required Project Files

  1. All the above files are linked to GitHub if you'd like to download the code.
  2. Other files will be generated by swig but these are the ones that we must create.

Building (the easier way)

  1. swig -c++ -perl perfect.i  generates perfect.pm and perfect_wrap.cxx. perfect.pm is the Perl side of your new module. perfect_wrap.cxx is, as the name implies, the native wrapper side of your module. Both of these files are generated based on what you've put into your interface file (e.g. perfect.i). If you want to modify these files do so by changing your interface file!
  2. make Makefile.PL this will create a Makefile. This step is optional in the sense that you can definitely compile and link everything yourself but MakeMaker does some nice extra stuff, such as create all the necessary perl packaging files. It's amazing how convenient this is!
  3. make assuming you ran the command in the previous step this will compile and link the C++ code to a dynamic library as well as package it. At this point you might consider yourself done.
  4. make install this will install your new module in a location that Perl can easily find it for your use. This is definitely optional. While still developing your module you'll likely want to leave it in a local directory for testing.

Note that we didn't create any tests in this basic use case. That should be something you add in later and when you do be sure to run a make test before you install to make sure everything is OK. 

Also, none of the above is concerned with distribution of your module. To do so requires a little more packaging which we will discuss later. For now we are focussing on building everything up to a usable state for a single developer.

Running

After going through all the above you will have a directory that looks something like this.

MakeMaker did us quite a favor in creating several additional files and directories. To run the script ch-1x.pl from within this directory you would execute as shown next. The -I arguments are necessary to tell perl where to find the perfect.pm (Perl module) and perfect.dylib (OS X dynamically linked library) files.

Sample Run

perl -I. -Iblib/arch/auto/Perfect ch-1x.pl
6 28 496 8128 33550336 

That seemed to run quicker. But I wasn't watching closely. What does time tell us?

time perl -I. -Iblib/arch/auto/Perfect ch-1x.pl
6 28 496 8128 33550336 

real 19m31.171s

user 19m24.311s

sys 0m2.196s

That's great! That is about 14x faster than the pure Perl version! Again, for this particular problem this is not the method for getting the absolute best performance but it definitely shows that some great performance gains can be achieved by re-writing computationally intensive parts of your system in C++ vs plain Perl.

Details

Let's take a deeper look at what we did, starting by examine the files required for this project.

Required Project Files

We are declaring a pretty basic class. A constructor, destructor, and single method isPerfect().

We don't do much with the constructor or destructor, the code is really all in the isPerfect() method which is a very close translation of the previous pure Perl program.

The syntax of the interface files is a little unusual but this is small enough to not be too confusing. We declare that we are creating a module called Perfect. According to the swig docs The %{ %} block provides a location for inserting additional code, such as C header files or additional C declarations, into the generated C wrapper code. So that is why we need to include our header file perfect.h there, so it gets included in the generated perfect_wrap.cxx file. The seeming duplicate mentioning of perfect.h in the line %include "perfect.h" is not actually redundant. The %{%} block includes that header in the wrapper but what comes later is what is parsed by swig to actually generate the wrappers. In this smaller example they appear redundant but in larger more complex examples the different sections of the interface file become more distinct.

Here we actually get to put all our work to use! Notice how we use Perfect; and declare and use a Perfect object.

The contents of this file are small, just state your module name and the object files output by the compilation step and you should be good to go.

This is "optional" because it is possible execute the commands to build and link everything yourself, either in a Makefile you write or just executing the right commands in sequence. By using MakeMaker you get all the extra packaging files such as the blob directory and meta files with no extra effort. You also get the ability to easily install the new module on your system. By using MakeMaker you are probably 80% of the way to having created a distributable module, suitable for inclusion on CPAN.

For the sake of completeness and a slightly fuller understanding here is what doing the build manually looks like ...

Building (the harder way)

$ swig -c++ -perl perfect.i

$ g++ -c perfect.cxx perfect_wrap.cxx `/Users/adamcrussell/perl5/perlbrew/perls/perl-5.28.0/bin/perl -MExtUtils::Embed -e ccopts`

$ g++ -dynamiclib -single_module -flat_namespace -undefined suppress -o perfect.dylib perfect.o perfect_wrap.o -L /Users/adamcrussell/perl5/perlbrew/perls/perl-5.28.0/lib/5.28.0/darwin-thread-multi-2level/CORE/ -lperl

Sample Run 

$ perl -I. -MPerfect -e 'my $p=new Perfect::Perfect();print $p->isPerfect(6);'
1

Just what we hoped for, a true value returned. We can now confidently run             ch-1x.pl as we did before.

We use the -I. to make sure that the interpreter sees perfect.pm and perfect.dylib in the current directory. If they occur in other locations than be sure to specify that with -I. In earlier examples you may have seen -Iblib/arch/auto/Perfect for this same reason.

Clearly this was done on a Mac OS X system and is very specific to my particular configuration with perlbrew. You'll need to adjust accordingly. In particular on Linux systems, not using perlbrew, for example your last line might look more like

g++ -shared perfect.o perfect_wrap.o -L /usr/lib/ -lperl -o perfect.so

Summary

This article has shown the construction of a Perl module with portions written in C++ and integrated with SWIG. The example module is small and straightforward but demonstrates the main capabilities of SWIG. More complexity comes into play when using advanced C++ features such as templates and nested namespaces, however SWIG is able to handle these well and the integration of more advanced C++ code and Perl may be the subject of a future article.


posted at: 19:36 | path: /perl | permanent link to this entry

2019-05-14

Perl Weekly Challenge 008

As with the previous weekly challenges the problem statements are short and are included in the first comment of the code. The code blocks  shown  link to GitHub Gists.

Part 1

This is straightforward enough of an implementation, without much code that requires a lot of explanation. For each candidate number we compute the factors and then add them up. If the sum of the factors is equal to the number itself it's perfect, print it and continue the search. I am re-using the pack/unpack trick to do the summing, like I have done before.  So what could go wrong!?!?

Look at the sample run below though. I executed it use the time command. time is a quick way to see just how long a command takes to execute and here we see that to find the first five the above code takes nearly four and half hours!

Sample Run

$ time perl perl5/ch-1.pl    
6 28 496 8128 33550336 

real 257m33.621s
user 255m52.169s
sys 0m24.603s

The reason for this long run time is that Perfect Numbers become extremely sparse very quickly. The first four are typically found within seconds but the fifth takes the rest of the time because 33,542,208 numbers must be checked before it is found. Even a fast interpreted language like Perl running on modern hardware will take a long time to examine so many numbers. So how to make things faster? There are three ways:

  1. Algorithmic improvement
  2. Parallelization
  3. Write the computationally intensive parts in a compiled language such as C++. 

The answer for this problem is definitely (1). Useful properties of Perfect Numbers are well documented. A good web page to reference would be http://mathworld.wolfram.com/PerfectNumber.html. There you would find that you can compute Perfect Numbers in several direct ways due to properties related to Triangular Numbers, Mersenne Primes, and Hexagonal Numbers. The compute time to find the first five would be at most a few seconds even when written in an interpreted language such as Perl when running on contemporary hardware. 

Even though (1) is the best answer for this and most any performance problem you may ever encounter (2) and (3) are worth discussing because in many practical situations they offer the best solution (usually in terms of time and money!).

(3) is discussed in detail here. The results are that we can find a ~14x speedup, from 257m33.621s to 19m31.171s.

A combination of (2) and (3) is discussed here and results in a ~45x speedup, from 257m33.621s to 5m55.348s.

Part 2

Sample Run

Details

The logic used here is that we must first find the longest line. Once we determine the longest line find the midpoint of that line. The midpoint of the longest line will be where we want to center all the other lines. The padding needed is computed as the difference between the middle of the longest line and the middle of any other line. In the case of the longest line itself no special handling is needed, its padding length is zero. I use the same eval() and tr() method to find line lengths as I have done before.

posted at: 19:19 | path: /perl | permanent link to this entry

2019-05-09

Perl Weekly Challenge 007

As with the previous weekly challenges the problem statements are short  and are included in the first comment of the code. The code blocks shown  link to GitHub Gists.

Part 1

I think this is straightforward enough, with the exception of the use of pack and unpack.

Sample Run

$ perl ch-1.pl
1 2 3 4 5 6 7 8 9 10 12 18 20 21 24 27 30 36 40 42 45 48 50 54 60 63 70 72 80 81 84 90 100 102 108 110 111 112 114 117 120 126 132 133 135 140 144 150 152 153 

pack/unpack

The line unpack("%32C*", pack("C*", @digits)) is a bit mysterious looking at first so let's unpack it (pun intended!) .

We are first, in the inner pack() call, creating a binary representation of the @digits array according to a template.

So then we use unpack()'s checksum function to sum the digits, which is really all we are after here anyway.

The two functions share the same template syntax which is documented on the pack manpage.

More on binary formats

If you are curious just what the raw binary created by pack() looks like you can use the xxd command with the -b option. For example, suppose I ran a command like the following:

perl -e '$i="2019";@d=split(//,$i); print pack("C*", @d);' > /tmp/pack

can you guess what A is based on this output from xxd?

$ xxd -b /tmp/pack
00000000: 00000010 00000000 00000001 00001001                    ....

If you find this figuring out of the meaning behind binary strings interesting than perhaps you have a future as a reverse engineer! If not, than you are a normal well adjusted person. :D

Part 2

Sample Run

$ perl ch-2.pl
cold -> cord -> card -> ward -> warm

Overview

To compute a word ladder we can calculate the shortest path between the two words, as represented by vertices, on a graph. To create the graph                              sub build_graph() creates a vertex in the graph for each word and does a pairwise comparison between all words. If they differ only by one letter then an edge is drawn between the two vertices. Once the graph is created we perform a Dijkstra Single Source Shortest Path computation which will compute all shortest paths from a given source vertex to all other vertices. This may seem wasteful, we are only concerned with one such path after all, right? Interestingly, the worst case time complexity is the same whether we abort the algorithm after finding the path we want or keep going and compute them all. There is some wasted calculation of course but, asymptotically speaking, we don't need to be too anxious about it.

The code shown above gets a little dense in place although I think at a high level the approach is not too hard to comprehend.

  1. Create the graph. I use the Graph module from CPAN. 
  2. Select a source word (vertex). Compute all shortest paths from this source.
  3. return the path we are interested in for our word ladder.

Details

As mentioned before some parts of the code get a bit dense. This is not out of some intent to be purposefully obfuscated! As always with Perl There Is More Than One Way to Do It and the way I chose fits a balance between readability and still not being overly verbose. I think.

In sub build_graph these lines

 my $length_w0 = do{    
   $w0 =~ tr/[a-z]//;           
};

compute the string length of $w0 by using the return value from tr which is the number of matches made. The same technique is used a few lines down with

my $differences = eval "\$w1 =~ tr/$w0//";

but here we need to wrap the tr in an eval because tr does not perform any variable interpolation since its transliteration table is built at compile time and not run time.

In sub dijkstra_sssp the lines 

local *by_total_edges = sub {
   return 1 if $total_edges{$a} eq OO;
   return -1 if $total_edges{$b} eq OO;
   return $total_edges{$a} <=> $total_edges{$b};
};

define a nested subroutine. The need for the use of local is to avoid creating a closure. The perlref manpage describes the situation: ... named subroutines do not nest properly and should only be declared in the main package scope. This is because named subroutines are created at compile time so their lexical variables get assigned to the parent lexicals from the first execution of the parent block. If a parent scope is entered a second time, its lexicals are created again, while the nested subs still reference the old ones.

Anonymous subroutines get to capture each time you execute the sub operator, as they are created on the fly. If you are accustomed to using nested subroutines in other programming languages with their own private variables, you'll have to work at it a bit in Perl.  The intuitive coding of this type of thing incurs mysterious warnings about "will not stay shared" due to the reasons explained above.

So by using local what we are doing is creating a temporary (i.e. re-created for each call to the enclosing subroutine) assignment of the anonymous subroutine. This has normal access to the lexical variables from the enclosing scope at the time it is is invoked.

This has the interesting effect of creating a function local to another function, something not normally supported in Perl. I'll freely admit this is definitely not idiomatic Perl. I'd argue that stylistically, in any language, this should be the preferred way of organizing this code. The small function used for sorting vertices has no use outside of sub dijkstra_sssp. Indeed I could imagine many programmers just inlining the code in the call to sort! This use of local is essentially the same thing, but cleaner looking than inlining. The only complication is that perl will complain with

Name "main::by_total_edges" used only once: possible typo at perl5/ch-2.pl line 41.

unless we add the line no warnings "once"; I am still investigating this. At face value the warning makes no sense: by_total_edges is called from within a loop. Is this a slight defect in the interpreter to not recognize this? I'll write up what I learn about this another time.

Notes


posted at: 20:05 | path: /perl | permanent link to this entry

2019-05-05

Perl Weekly Challenge 006

Perl is an extremely expressive language which gives a tremendous amount of flexibility to the programmer. Combined with a deep module ecosystem (i.e. CPAN) a small amount of Perl code can be extremely powerful!

This week I was not feeling well and allowed myself some shortcuts. Rather than take the time to implement necessary routines myself in some interesting (to me) way I instead opted to make use of a couple of modules and be done with it. All the sooner to get back to feeling sick and tired.

As with the previous weekly challenges the problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists.

Part 1

Data::Dump is one of my favorite debugging tools. The pp() function "pretty prints" the contents of a data structure and, in doing so, performs the sort of compaction required in the first part of this week's challenge. All that is necessary is to call pp() and then do a bit of polishing on the final output by removing unnecessary parentheses and white space.

Sample Run

$ perl ch-1.pl 1,2,3,4,9,10,14,15,16

1-4,9,10,14,15,16

Ok, maybe it's somewhat off the requirements since pp() wants to compact 4 or more continuous numbers and not 3 or more. Did I mention I was feeling sick? Close enough!

Part 2

I had dreams of writing my own arbitrary precision code for this. Fever dreams that is! Math::BigFloat helped make quick work of this.

Sample Run

$ perl ch-2.pl

262537412640768743.9999999999992500725947

posted at: 20:56 | path: /perl | permanent link to this entry

2019-04-26

Perl Weekly Challenge 005

Anagrams!

This weeks challenge involves anagrams. 

The problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists.

The Approach

For both parts I used the same general approach. 

By the Fundamental Theorem of Arithmetic every integer greater than 1 is either a prime number itself or can be represented as the unique product of prime numbers.

Using this information the idea used in both Part 1 and Part 2 is to have each letter mapped to a prime number and each word represented by the product of each its letter's corresponding numbers. For example let's consider the word "live":

    live = l * i * v * e = 31 * 11 * 73 * 2 = 49786.

Any word also spelled with an 'l' an 'i' a 'v' and an 'e' will have the same product of 49786 even if the letters are arranged differently. In this way we have a straightforward path to determining anagrams.

Assumptions:

  1. Only single word anagrams are considered
  2. Words shall contain only the 26 letters of the English alphabet
  3. Only lower case letters are used.
  4. The values assigned to each letter are inverse to their overall frequency. 
  5. No bad actors: all inputs are safe and clean and need not be checked.

That last point is a minor detail. I thought that to minimize the overall size of the numbers generated I would assign smaller numbers to the most frequently used letters. Letter frequency uses the Lewand ordering described in the wikipedia article. It seems that this was unnecessary for the test corpus used (the standard Unix dictionary file) but I decided to leave it in the final code anyway as it was an interesting design point nonetheless.

Part 1

The code for Part 1 takes a  test word from the command line and computes it's product. The product for every word in the standard Unix dictionary file is also computed and stored in a hash. That hash is then searched for matches against the test word and the results printed. The line 

    delete($word_product{$test_word});

is just a simple way to make sure that the test word itself is omitted from the printed results. Interestingly there are exactly two words in the dictionary file which contain a hyphen! Those two entries are "Jean-Christophe" and "Jean-Pierre". Because of these I make sure to remove hyphens.

Sample Run

$ perl ch-1.pl live
live: vlei levi vile veil evil

Part 2

Again using the same approach Part 2 words are read in from stdin and the word/product hash created. The values in the hash (the products) are counted and the sequence of letters corresponding to the product with the highest number of entries is printed.

Sample Run

$ perl ch-2.pl < /usr/share/dict/words
e a s l p

Note: In the case there is a tie the first one in the sorted list is taken. Co-incidentally in the dictionary file there is a tie! In addition to the one shown above is a o n r g which also has ten anagrams.

$ perl ch-1.pl easlp

easlp: salep speal pales spale saple slape lapse elaps sepal lepas

$ perl ch-1.pl aonrg

aonrg: ronga orang angor grano goran argon nagor groan rogan organ

posted at: 19:37 | path: /perl | permanent link to this entry

2019-04-16

Perl Weekly Challenge 004

Another week, another chance to get creative with some Perl!

Part 1

This code is done to solve the stated problem as generally as possible! That means that instead of golf-ing the code down to such a small size that I could have just used sprintf("%1.Xf", ...) where X is the size of the, presumably, one-liner I instead put the code in a file and use stat() to get the size of that file. The digits of π  are then computed as needed using a so-called spigot algorithm. This name comes form the fact that the algorithm delivers an endless supply of digits, like water flowing from a spigot.

This computation of the digits of π uses continued fractions. The method, the mathematics behind the algorithm, and some Ruby code implementing the method (which I drew heavily from!) can be found here.

For the record the one-liner solution would look like this:

    perl -e 'print(sprintf("%1.37f",4*atan2(1,1)))'

Part 2

This code is inspired greatly by the solution to Challenge 003's first part. This time instead of creating a few individual closures I create an array of them, one for each of the given letters. These are then used in a recursive loop to remove letters from each word being examined. If we use up all the closures (letters) and the word still i snot empty then the word does not pass. If the word is reduced to the empty string (whether we have closures left or not) indicates the word passes.

Note: I did not follow the exact wording of the challenge in that the letters and words are not coming from separate files. Changing the code to get the words and letters from a file would not change the logic of the main parts of the code at all, however. 

"

posted at: 17:03 | path: /perl | permanent link to this entry

2019-04-14

Perl Weekly Challenge 003

Both parts of this week's challenge seemed to have the same theme of performing a calculation that inspires recursion. I wrote a bit about recursion in Perl last week so there isn't too much new to add on that subject in describing the code.

The problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists.

Part 1

This a pretty straightforward solution to the stated problem. To jazz things up a little I used currying to generate the three is_divisible_by_X functions. This is neat to see in practice, but is never required. In fact, I've never seen currying used seriously in Perl. Just these sorts of toy problems and examples. If anyone is interested in why someone would want to do this I refer you to a nice Perlmonks post on why currying may have some practical usage in other programming environments.

I generally avoid using any CPAN modules for these problem sets but I made a small exception in the code for Part 1 and use boolean; this module simply allows the use of true and false as boolean values. Completely unnecessary, of course, but I have always preferred the aesthetics of boolean keywords.

Part 2

The computation for Pascal's Triangle (binomial co-efficients) is, similar to the Hamming Numbers of Part 1, straightforward. This week perhaps my creativite energy was a bit low. The solution for Part 2 is about as vanilla as you can get!


posted at: 11:09 | path: /perl | permanent link to this entry

2019-04-04

Perl Weekly Challenge 002

Below is my take on each of the two parts to challenge 002. The problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists.

Part 1

As described this can be done using sprintf(). I was slightly conflicted at first to simply use sprintf() but it's a core function of the language! Probably it's not in the spirit of these challenges to simply import a CPAN module but a core function should be fair game.


Part 2

Introduction

If I needed to do a base conversion for real project work I probably wouldn't write my own function, several modules already exist that provide this functionality, and I definitely wouldn't use recursion! This being a programming challenge and an opportunity to try out things for fun I took the approach of (1) using recursion to loop over the number being converted and (2) use chr() and ord().   chr() and ord() are convenient little functions that are the opposites of each other. ord() will return the numeric value of  a character and chr() will return the character represented by a number. What these characters and numbers represent is determined by the "character set". W3.org has a comprehensive article on the subject for anyone that wants to dive deeper into the details of character sets. For our purposes we really just need to know that ord('a') is 97 and it follows then that chr(97) is 'a'. [Note: this information has only ever been useful to me in computer science class homework, programming challenges, and job interviews! I believe these sorts of tricks were more commonly used in practice back in the days when statically typed languages such as C made frequent use of char primitives.]

The Code

I've always had a strong distaste for string literals and numeric literals appearing in code out without any context. Although this code is short enough to not really need them I stick to having some use constants to define some sample values for testing purposes as well as the base in the challenge. Also, to make practical use of the knowledge that ord('a') is 97 we define an offset of 87. Why is that? Well, we want 'a' to represent 10 and so ord('a') — 87 is 10, ord('b') — 87 is 11, ..., ord('y') is 34. In this way we are able to use the letters a — y as needed in the challenge statement.

The two conversion functions use recursion and are described next.

sub base10tobase35

    given a number return a string representing that number

In this function we check to see if we've reached our base case of the number being 0. In this case we return an empty string. Otherwise we calculate the number mod 35, the base we are converting from. If the result of that calculation is a digit 0-9 then we are done, otherwise we use chr() to find the right representation a-y. The recursive step is to reduce the number by doing an integer division by 35 (in Perl we need to use the int() function to achieve this otherwise we would get a floating point result) and calling the function itself again. This repeats, with each successive digit added to the accumulating result until the division result is 0. This procedure is working from right to left and that is why the line return $r .= $value; builds by adding the return string on the left and not the right, to show the resulting digits in the correct left to right order. The uc() function is used to return everything in upper case, even though we work in lower case, by way of lc(). I just thought upper case looks better when printed in the terminal window. :)

sub base35tobase10

Here the number is split into an array of individual digits. We shift off and inspect each digit. This means that we are now working left to right. If the digit is a numerical digit 0-9 then we can leave it alone but if it is a letter then we use ord() to convert it to a number. We then take this and build up our return value via recursion. Since we used shift() the array of digits is shorted at each step, the recursion terminates when we only have a single number left. The line

return $value * (BASE ** ($length-1)) + base35_to_base10($digits);

takes the value and multiples it by the appropriate power of 35 and adds up the result recursively. $length is the length of the array of remaining digits and since we are working left to right it corresponds to the correct power of 35 to use with the shifted digit.

Recursion In Perl

Using recursion like this in Perl is kind of fun but in practice it is generally frowned upon. Why is that? Mostly because for each recursive call the perl interpreter needs to keep track of the previous state of the program and that can become very memory intensive. By default Perl will warn about recursive loops that repeat more than a fixed limit. That limit is, the last I checked, 100 which is pretty conservative. On modern hardware it should take much more than than to exhaust Perl's memory and crash. If you want to play around with recursion in Perl and not get warned about it you can set the following:

no warnings 'recursion';

I found out a while back that I saw Perl start to fall apart around 5000 recursive calls when running this code: https://github.com/adamcrussell/n-link .

Finally, there are ways to compile a perl interpreter with a higher fixed limit but I've never heard of anyone needing or wanting to do that.

posted at: 13:03 | path: /perl | permanent link to this entry

2019-04-02

Perl Weekly Challenge 001

I just learned about https://perlweeklychallenge.org/ which posts a small coding challenge each week. They seem to have started up 25 March 2019 so, as of this writing, there have been two challenges posted. Below is my take on each of the two parts to challenge 001. The problem statements are short and are included in the first comment of the code. The code blocks shown link to GitHub Gists.

Part 1

The problem statement seems like a clear use for tr. tr will perform the substitution on the string and return the number of substitutions made. This can be done in one line, of course, and would look like: 

    my $number=$challenge_string=~tr/e/E/;

It's easy to argue that is not the most easily read line of code. I opted to perform the substitution in a do block. The return value from the block is the value returned from the last (in this case only) statement and so it is more clear just what is being performed and what is being assigned to $number.


Part 2

This is a classic problem and since I had blocks on my mind from part one I decided to use the less common redo to perform the looping. redo will cause the block it is enclosed in to be performed again, provided a given condition is true. The use of until is a bit gratuitous but using it here in this way, I'd argue, makes the code read, as an English expression, very clearly: the block will repeat until $i exceeds 20. Finally, I decided to use a Perl's switch statements which have a given/when syntax. Switch statements in Perl have a complicated history. The original implementation had bugs and it's use has been deemed experimental. This seems mostly due to its use of the so-called smart match operator ~~. This code doesn't rely on the ~~ operator's sometimes confusing behavior so it's use here doesn't relate to any of the controversy.



posted at: 16:04 | path: /perl | permanent link to this entry