May 2020 – Page 2 – Programming Excursions in Perl and Raku

THE WEEKLY CHALLENGE – PERL & RAKU #60

TASK #1 › Excel Column

Challenge by Ryan Thompson

Write a script that accepts a number and returns the Excel Column Name it represents and vice-versa.

Excel columns start at A and increase lexicographically using the 26 letters of the English alphabet, A..Z. After Z, the columns pick up an extra “digit”, going from AA, AB, etc., which could (in theory) continue to an arbitrary number of digits. In practice, Excel sheets are limited to 16,384 columns.

Example

Input Number: 28
Output: AB

Input Column Name: AD
Output: 30

Method

So it’s base 26, using capital letters for the symbols. Cool.

Ok, well, sort of. Turns out it’s a little different, a little more complicated than that.

“How so?” you say. Glad you asked. It seems we have a little problem with Z.

“Z?” you say. Yes, Z.

As usual, the first order of business is to unpack the challenge and nail down a few outstanding questions. Then the problem will make a little more sense.

All we are told in the text about Excel is that the column identifiers start with A and continue through Z, then recommence labelling with AA to AZ, etcetera. In order for the example mapping 28 → AB to occur, 27 maps then to AA, and what’s left is A-Z mapping 1 to 26. So the indexing starts at 1. Fair enough.

The cycle is undoubtably 26; converting to base 26 seems a reasonable start. We can then substitute the letters A-Z for the normal representation 0-9 followed by A-P. The problems start immediately, with the number 10₂₆, or if you look at it another way, 0.

To make visualization a little easier it might make sense to relate to base 10, our standard counting system. To count in base 10, we use the 10 digits 0 through 9, then we start to positionally combine these digits to make the numbers 10, 11, and so on. So if we choose to use letters for our digits, do we map the letters to 0-9 or 1-10? In either case we are confounded by 10, being the last member of the mapping, but being composed of two digits in the number system.

But back to base 26 and Z, if we use the tried and true method of dividing out the base, we’ll never get a remainder of 26, but rather 0. So do we make 0 → Z ? Then 26 → AZ, which is wrong. Z has to be 26, because the next number, 27 is AA (think 11 here if you will, the association is correct) Similarly, fudging our numbers by 1 in other ways, either altering the input or the remainder to match 0 indexing, fails as well, either starting on the wrong letter or erring at the transition to two digits. Sometimes we start at B, or later we get BA instead of AA. There’s a lot of ways down this mountain, I assure you. We are being confounded by the fact that our counting is indexed to 1 and our base is indexed to 0.

The solution is to shift the input number to a 0-based lookup within the dividing out itself, by subtracting 1 from the number during the computation of the remainder. Then 26 becomes 25 and the arithmetic all works out; 27 follows 26 as it should be and AA follows Z. This gives us the correct translation, finally. Mixing indexing systems is always a red flag for danger, and adding in modular arithmetic to the mix just adds to the confusion.

To do the conversion from Excel to decimal is a little less complicated, fortunately for us. To run the dividing out in reverse we first need to establish a lookup mapping the letters A-Z to 1-26, and then we progress through the string taking off letters from the right and summing the product of the lookup value and 26 to the power of the (0-based) position for each place. In base-10, this multiplier is 1, 10, 100 and so on. In base-26 we use 26⁰, 26¹, 26²… To implement this, we dice the string into an array of chars, reverse() the array and shift() from the left. Then we can keep track of the loops, counting from 0, to get the appropriate power.

Perl Solution

use warnings;
use strict;
use feature ":5.26";

## ## ## ## ## MAIN:

my ($input, $EXAMPLE) = @ARGV;

if    ($input =~ /^\d*$/) {
    say dec2excel($input);
}
elsif ($input =~ /^[A-Z]*$/) {
    say excel2dec($input);
}
else {
    say "input: decimal number or capital alphabetic sequence of characters A-Z";
}

$EXAMPLE && printf "%-2d  %-2s  %-2d  %-2s\n", $_, dec2excel($_), excel2dec(dec2excel($_)), b26($_) for (1..90);


## ## ## ## ## SUBS:

sub dec2excel {
    my $num = shift;
    my @alpha = ( "A".."Z" );            
    my $out = "";
    my $rem = 0;
    while ( $num > 0  ) {
        ## magic here, note we do the math on num - 1
        ($num, $rem) = (int( ($num-1)/26 ), ($num-1) % 26);   
        $out = $alpha[$rem] . $out;
    }
    return $out;  

}

sub excel2dec {
    my $excel = shift;
    my @alpha = ( "A".."Z" );
    my %alpha = map { $alpha[$_] => $_+1 } (0..25);
    
    my @rev_26 = reverse( map { $alpha{$_} } split //, $excel );
   
    my $out;
    for (0..@rev_26 - 1) {
        my $val = shift @rev_26;
        $out += $val * (26 ** $_);
    }
    return $out;
}

Raku Solution

The input for the script will either be a number to be converted to an alphabetic column index, or letters to be converted into a number. This gives us an opportunity to parse the input using Raku’s given/when syntax, with a default output for malformed input.

The algorithms are what they are, so remain unchanged. The chaining of routines in the column to number conversion flows nicely.

I’ve included an optional verbose tabulation, where adding a second argument that evaluates to true will produce a translation table of all columns up to 16,384

sub MAIN ( Str:D $input, $EXAMPLE = 0) {

    my $usage = qq:to/__END__/;
    Usage: 
        argument is decimal number or capital alphabetic sequence of characters A-Z
        optional second argument to produce list of all possible excel columns 1 to 16,384
    __END__
    
    given $input {
        when        /^\d*$/             { dec2excel($input).say }
        when        /^<[A..Za..z]>*$/   { excel2dec($input.uc).say }
        default                         { $usage.say }
    }   
       
    $EXAMPLE && printf "%-2d  %-2s  %-2d\n", $_, dec2excel( $_ ), 
                                excel2dec(dec2excel($_)) for (1..16_384);

}

sub dec2excel ($num is copy) {
## Int number --> Str excel column
    my @alpha = ("A" .. "Z");             
    my $out = '';
    my $rem;

    while $num > 0  {
        ($num, $rem) = (($num-1)/26 ).floor, ($num-1) % 26; 
        $out = @alpha[$rem] ~ $out;
    }
    return $out;  
}

sub excel2dec ($excel) {
## Str excel column --> Int number
    my %alpha;
    %alpha{"A".."Z"} = 1..26;
    my @reverse_26 = $excel.comb.map({ %alpha{$_} }).reverse;
    my $out;

    for (0..@reverse_26.end) {
        my $val = @reverse_26.shift;
        $out += $val * (26 ** $_);
    }
    return $out;
}

TASK #2 › Find Numbers

Challenge by Ryan Thompson

Write a script that accepts list of positive numbers (@L) and two positive numbers $X and $Y.

The script should print all possible numbers made by concatenating the numbers from @L, whose length is exactly $X but value is less than $Y.

Example

Input:

@L = (0, 1, 2, 5);
$X = 2;
$Y = 21;

Output:

10, 11, 12, 15, 20

Method

This is another example of what I call concrete number theory, where we conflate the ideas of numbers and the symbols used to represent them. We’re going to assemble digits positionally rather than mathematically, and then evaluate those resultant numbers according to some criteria. In this case the number of positions is equal to an input x, and the resultant number is less than an input y.

To manufacture the number, we’re going to combine elements from an input array. Considering the choices as sets, we are looking for permutations of combinations with repetitions of length k elements from set n. This is known in Set Theory as

k-element variations of n-elements with repetition

as distinguished from variations without repetition. For a list of n elements we have

VR_n,k = n^k

variations; because every position can contain every element at any time, so the product is

n × n × n …

for as many positions as required. There is one small caveat to this, however, and that is the number 0.

Despite the given specification given a list of positive numbers, zero is neither positive nor negative, so does not belong according to that rule. However the example list explicitly includes it, so we will follow suit and allow it, despite the specification. Sometimes we must figure out that what the customer wants is not exactly what they asked for.

To create our master list of variations, we’ll create a list, and then iterate through the elements one at a time, adding an new value and creating a new set of variations for every option. Bu carefully keeping track of indexes we can just keep just one list as a queue, shifting elements off the front and adding new variations on the end. As we allow repetition, it makes no sense to duplicate elements in the input; it’s a set, not an ordered list. Thus if the number 2 is an element, if 2 is repeated later in the input it still can be present in any position, so it makes no difference. We’ll allow it, because you can’t control people, but roll our eyes a bit and filter out duplicates. Also, it we sort our list before we start, the results will come out sorted, which is nice.

We will use our now sorted and filtered list to start our list of variations. At this point, the first digit of the final number, we will need to address 0 again, because earlier we specifically allowed it. The problem is if we have a leading zero on a number, it will not, by convention, be considered in the final construction, and as such will shorten the positional count of the result, violating that criterium. As far as convention goes, in case there was any question remaining whether leading zeros are allowed (because I can certainly see a case for it) we refer again to the text. The example given in the challenge does not include the results 00, 01, 02, or 05. Ok, so that’s a no-no, and we must for the initial value exclude 0 if it is present. After this it’s fine, as it won’t affect position.

So we will need to selectively filter it out only for the initialization. To do this, we can either check the value at index 0, because the list is positive and sorted, then remove it if present, or alternately we can grep the sorted array on the values themselves, which will fail on 0. Kind of a high-pass filter. This is nice, so we’ll go with that.

But before we’re quite finished there also exists a singular pathological case to handle where x = 1 and the list includes 0. If we are not allowing leading 0s, does a solitary 0 count as a valid answer? Sure, it’s not a leading zero if it’s the only value present, even if that value is the absence of value. Really this relates back to the acceptance of 0 as a valid number at all, which is a much more recent, non-obvious and complicated development than many people understand, and entirely worthy of a discussion unto itself¹. In another way it reminds me of pluralizing cats: no cats, 1 cat, 2 cats. The progression is unexpectedly complex. So we eliminate the leading 0 of the first result set, but need to add it back in one case, when x = 1.

¹Kaplan, Robert (2000) The Nothing That Is: A Natural History of Zero, Oxford: Oxford University Press.

[colincrain:~/PWC]$  perl 60_2_gathering_digits_down_under.pl 2 21 0 1 2 5
10
11
12
15
20

use warnings;
use strict;
use feature ":5.26";

## ## ## ## ## MAIN:

my ($x, $y, @array) = @ARGV;

## removes leading 0 unless x = 1
my @result = $x == 1 ? @array
                     : grep { $_ } @array;  

for (2..$x) {
    my $end = @result-1;
    for ( 0..$end ) {
        my $num = shift @result;
        for my $next ( @array ) {
            push @result, "$num$next";
        }
    }
}

say "$_" for grep { $_ < $y } @result;

Raku Solution

As the algorithm logic, once figured, is straightforward, the Raku version mirrors the Perl, and vice-versa. I actually wrote this one first, so it’s proper to call the Perl version the port.

sub MAIN ( $x, $y, *@array) {
## @array   is array to be rearranged
## $x       is new number length
## $y       is larger than new number
 
    @array = @array.sort({$^a <=> $^b});

    ## removes leading 0, unless x = 1
    my @result = $x == 1 ?? @array
                         !! @array.grep: { $_ };  

    for 2..$x {
        ## note the length of the queue at this moment
        my $end = @result.end;      
        for 0..$end {
            my $num = @result.shift;
            for @array -> $next {
                @result.push: 10 * $num + $next;
            }
        }
    }

    .say for @result.grep: { $_ < $y };
}

Month: May 2020

An Excellent Gathering

THE WEEKLY CHALLENGE – PERL & RAKU #60

TASK #1 › Excel Column

Example

Method

Perl Solution

Raku Solution

TASK #2 › Find Numbers

Example

Method

Raku Solution