THE WEEKLY CHALLENGE – PERL & RAKU #54
TASK #1
kth Permutation Sequence
Write a script to accept two integers n (>=1) and k (>=1). It should print the kth permutation of n integers. For more information, please follow the wiki page.
For example, n=3 and k=4, the possible permutation sequences are listed below:
123
132
213
231
312
321
The script should print the 4th permutation sequence 231.
notes on the question
So what exactly are we looking for here?
Despite the wiki link given, which sends us to the subheading “#k-permutations_of_n”, we do not appear to be looking to calculate here what is known as nPk, or the k-permutations of n, and so we will neither quantify, enumerate nor ruminate on any possible ordered groupings of k items selected from a set of n elements. That’s a really interesting puzzle in its own right, but that does not jibe with the rest of the challenge description. We can only infer that the specific subheading portion of the link given is what as known as a red herring, or false lead, or perhaps wild goose chase.
The rest of the page is informative reading, though, and holds clues as to the intent of the challenge. The first is a permutation is an expression of a particular rearrangement, or remapping, of a sequence; the verb rather than the noun.
Thus the actual items being permuted are irrelevant, and as such when speaking of permutations it is common to use an ascending sequence of natural numbers, 1 2 3 4 5… as tokens representing the elements to be rearranged. So n here is the number of elements, which is what we expect, and the ordered start set will be of the form
( 1 2 3 … n-2 n-1 n)
The task seems to ask us to establish a list of all permutations of this ordered set and then locate and output the kth member of that list. The problem arises when one takes into account there is no one single way to sequence the order of possible permutations of a starting set. There are commonly accepted ways to label premutations with unique identifiers for later reference, but no one way to enumerate them. Mathematically it is the relationships between individual permutations that are interesting, rather than their positional information in a list.
The list of n=3 in the task outline is sorted ascending as though the individual permutations were strings; “1 2 3” is followed by “1 3 2”. This is known as lexicographic order, and is what we will assume is requested.
The description also notably starts the list of permutaions with the identity permutation, the mapping of each element to its original place. So we need to make sure we start counting from there, rather than the first alteration. Remember, to choose to do nothing is still a choice.
Method
Now before we continue, let it be known that these days I’m not shooting for the fastest, most sensible nor efficiant methodology to go about solving these challenges. For instance, there exist some quite good modules in CPAN to make permutations simple, painless and fast. Of these I prefer and recommend Algorithm::Permute, it’s written in XS and very good, albeit with some odd quirks. I even used it here previously for the wordgame challenge. No, these days I view these excursions as thought experiments and just prefer, if I have time, to explore the problem space and see whats there. Modules are an integral part of the Perl ecosystem but today I’m just going to have at it and roll my own. I have my own idiosyncratic rules for proceeding, and these rules are by no means fixed nor consistent.
I suppose I’m drawing on an old artist technique in my toolbox, to artificially add constraints to a challenge. Combined with my apparent fascination with parsing the minutiae of the challenge questions, and you get a little slice of my brain. Welcome to my world, we have snacks.
To obtain the kth lexicographically sorted permutation sequence the first approach was to generate a list of permutations and select the kth element of that list. Seems reasonable. So we use a recursive routine, constructing the patterns from left to right, and at each element iterate through the remaining numbers in the starting pattern in a well ordered way. The resulting list of patterns will come out lexicographically sorted. We can accomplish this by passing a set of remaining numbers, a working permutaion under construction, and a results set to hold the permutaions once finished.
This is all well and good, and works fine for reletively small values of n, but as-is it needs to construct all of the patterns first and then select the kth member. Furthermore, it will obviously blow up with all that recursion and looping; even though the sets shrink by one element each time we proceed the time will still be n! to finish. We can improve this dramatically for smaller numbers of k by coupling in some reference slots to store the number requested and the the permutation produced, and short-circuiting to collapse the remaining recursion if we have our answer, but this is by no means a universal safeguard. The final routine is quite workable within reason, and is located in
permute_recursive( \@set, \@working, $permutations, $data)
There is of course a better way to do this, which is to rearrange the sequences in place, using an algorithm that only requires the current permutation to generate the next. One such algorithm is Knuth’s Algoritm L (Lexicographic permutation generation), which he notes is only describing a method that had been invented some 600 years previous. This is implimented as
compute_next_permutation( \@set )
with a wrapper to apply it the correct number of times.
Perl Solution
[colincrain:~/PWC]$ perl 54_1_kth_me_kate.pl 3 4
recursion: the 4th permutation sequence is 2 3 1
in place : the 4th permutation sequence is 2 3 1
use warnings;
use strict;
use feature ":5.26";
## ## ## ## ## MAIN:
my ( $number_elements, $sequence_requested ) = @ARGV ;
my $result = permute_with_recursion( $number_elements, $sequence_requested );
say "recursion: the ".$sequence_requested."th permutation sequence is $result->@*";
my $result2 = permute_in_place( $number_elements, $sequence_requested );
say "in place : the " . $sequence_requested . "th permutation sequence is ",
join ' ', $result2->@*;
## ## ## ## ## SUBS:
sub permute_with_recursion {
my ( $end, $selected_sequence ) = @_;
my @set = (1..$end);
my @working;
my $permutations = [];
my $data = { seq_number => $selected_sequence,
result => undef };
permute_recursive( \@set, \@working, $permutations, $data);
return $data->{result};
}
sub permute_recursive {
## given a starting set, a working list and a permutations set
## computes complete permutations as arrays and places the arrays on the permutations array
## which is maintained throughout by reference
my ($setref, $workref, $permutations, $data) = @_;
my @set = $setref->@*;
## if there is only one element left, push it on the working list,
## push that array reference onto the permutations array and return.
## This unique permutation list is complete.
if ( scalar @set == 1 ) {
my @working = $workref->@*;
push @working, $set[0];
if (scalar $permutations->@* == $data->{seq_number} - 1) {
$data->{result} = \@working;
}
else {
push $permutations->@*, \@working;
}
return;
}
## iterate through the remaining elements of the set,
## creating new copy of the working list, moving the element
## from the set to the working list and recursing with these
## new lists. The permutations list reference is passed through unchanged.
for my $element ( @set ) {
## collapse the recursion if we have our result
last if defined $data->{result};
my @working = $workref->@*;
push @working, $element;
my @subset = grep { $_ != $element } @set;
permute_recursive( \@subset, \@working, $permutations, $data );
}
}
sub permute_in_place {
my ( $end, $selected_sequence ) = @_;
my @set = (1..$end);
## the unrearranged sequence, the identity permutation,
## counts as sequence #1 as per the task
for (1..$selected_sequence-1) {
compute_next_permutation( \@set );
}
return \@set;
}
sub compute_next_permutation {
## in place algorithm (from Knuth Algorithm L, The Art of Computer Programming)
#
# «before we start we assume a sorted sequence a[0] <= a[1] <= ... <= a[n]»
# L1. «Visit» Take the given arrangement
# L2. «Find j» Find the largest index j such that a[j] < a[j + 1]. If no such index
# exists, terminate the algorithm and we are done
# L3. «Increase a[j]» Find the largest index k greater than j such that a[j] < a[k],
# L3a. then swap the values of a[j] and a[k].
# L4. «Reverse a[j+1]..a[n]» Reverse the subsequence starting at a[j + 1]
# through the end of the permutation, a[n].
# Do nothing if j+1 >= n. Return to L1.
## L1
my $set = shift;
my $end = scalar $set->@* - 1;
## L2
my @one = grep { $set->[$_] < $set->[$_+1] } (0..$end-1);
my $j = $one[-1];
return if ! defined $j;
## L3
my @two = grep { $_ > $j and $set->[$_] > $set->[$j] } (0..$end);
my $k = $two[-1];
## L3a
($set->[$j], $set->[$k]) = ($set->[$k], $set->[$j]);
## L4
return unless ( $j+1 < $end ); # {
my @reversed = reverse($set->@[ ($j+1)..$end ]);
splice $set->@*, $j+1, $end-$j, @reversed;
}
Raku Solution
method: In the Perl 5 version of this challenge, I took a little excursion into routines for constructing permutations, eschewing modules like Algorithm::Permute
for the task and perfering to roll my own routines, finally landing on an implimentation of Knuth’s Algorithm L.
In the Raku language, the operation of permutation is built in for positional types, albeit not exactly well defined in the documentation as to how it’s implemented. However calling
@array.permutations
to construct a list of lists of purmutations is trivially straightforward. A close examination of the lists produced appears to show the algorithm used produces lexicographically sorted permutations. Dumping these lists into an array and indexing on the $sequence_requested variable ( -1, to allow for 0-indexing ) produces the result.
Because the algorithm used is not specified, it seemed reasonable to double check the ordering against a known lexicographically sorted algorithm, Knuth’s Algorithm L. This is implimented in
apply-algorithm-L ( @set )
below. Cross-referencing results for various inputs suggests there is no variance in the outputs of the two functions.
One glaring difference dows arise, though, and that is time of execution for larger permutation lists. By the time we get to n=10, the built in permutations method takes about 10 seconds to produce a result. Examination shows that we are producing the entire permutaion set for the given list before selecting the permutation requested, and changing this behavior is not well documented. As it turns out, after some digging I was able to determine the output of .permutaions is a Seq, and can be coerced by the .lazy method. Applying this wrapper transfers to the Array assignment, so we are only evaluating the sequences we need and the execution time is reduced proportionally.
It’s worth noting that for large values of n and k, the built in routine, with or without lazy evaluation, so easily outstrips the raku implimentation of Algorithm L for speed I can’t see any optimizing to close that gap.
sub MAIN ( $number_elements, $sequence_requested where {$sequence_requested > 0}) {
## built in routine
my @array = 1..$number_elements;
my @perm = @array.permutations.lazy;
## the challenge is indexing from 1, the first sequence is the identity permutation
say "using permutations routine:" ;
say @perm[$sequence_requested-1].join(' ');
## algorithm L
my @result = permute_in_place( $number_elements, $sequence_requested );
say "using Algorithm L";
say @result.join(' ');
}
sub permute_in_place ( $number_elements, $sequence_requested is copy) {
my @set = 1 .. $number_elements;
## the challenge is indexing from 1, the first sequence is the identity permutation
## if requested == 1, range is 0..0 (one less) which goes to -1 and the
## algorithm is not executed
apply-algorithm-L( @set ) while --$sequence_requested;
return @set;
}
sub apply-algorithm-L ( @set ) {
## in place algorithm (from Knuth Algorithm L, The Art of Computer Programming)
# «before we start we assume a sorted sequence a[0] <= a[1] <= ... <= a[n]»
# L1. «Visit» Take the given arrangement
# L2. «Find j» Find the largest index j such that a[j] < a[j + 1]. If no such index
# exists, terminate the algorithm and we are done
# L3. «Increase a[j]» Find the largest index k greater than j such that a[j] < a[k],
# L3a. then swap the values of a[j] and a[k].
# L4. «Reverse a[j+1]..a[n]» Reverse the subsequence starting at a[j + 1]
# through the end of the permutation, a[n]. Do nothing if j+1 >= n.
# Return to L1.
## L1
# @set is the standing iteration to be modified
## L2
my @indexes = (^@set.end).grep( { @set[$_] < @set[$_+1] } );
my $j = @indexes.tail;
return if ! defined $j;
## L3
@indexes = (^@set.end+1).grep( { $_ > $j and @set[$_] > @set[$j] } );
my $k = @indexes.tail;
## L3a
(@set[$j], @set[$k]) = (@set[$k], @set[$j]);
## L4
return unless $j+1 < @set.end;
@set[$j+1..@set.end] = @set[$j+1..@set.end].reverse;
return;
}
TASK #2
Collatz Conjecture
Challenge by Ryan Thompson
It is thought that the following sequence will always reach 1:
$n = $n / 2
when$n
is even$n = 3*$n + 1
when$n
is odd
For example, if we start at 23, we get the following sequence:
23 → 70 → 35 → 106 → 53 → 160 → 80 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
Write a function that finds the Collatz sequence for any positive integer. Notice how the sequence itself may go far above the original starting number.
Extra Credit
Have your script calculate the sequence length for all starting numbers up to 1000000 (1e6), and output the starting number and sequence length for the longest 20 sequences.
Method
Since the conjecture is that all such sequences converge, it seems safe to say that the it’s been checked for a lot of numbers. So we’ll make a loop that finds the next number, and finishes when that number reaches 1. It may run a while, but it should always eventually finish, right?
Right? <crickets>
Ok, well certainly for n<1,000,000.
For the sake of clarity, I’ve removed the logic of the conjecture into its own separate routine.
For the metaanalysis (hey look, two ‘a’s in a row, see last week’s post), keeping all the sequences for a million numbers gets big quickly, so for that task we throw out the numbers and only keep a total count of the length indexed on the starting number.
To ensure reproducability, we will slightly modify the challenge to find the lowest numbers of the 20 longest sequences. The request wasn’t for the numbers to create sequences of the longest twenty lengths, so we need a qualifier to determine which numbers to include if multiple sequences of equal length are completing for the last places in the list. By sorting first to find the highet values and secondarily to find the lowest number running the script on different systems running different hashing algorithms.
Since we were in there looking at a couple of million numbers, it seemed reasonable to wonder what the largest number we found along the way actually was. So I added a little ratcheting variable that updates the high number and the number of the sequence that spawned it whenever the previous value is exceeded. Its 56,991,483,520, in the sequence for the number 704,511. Huh.
To enable the metaanalysis, pass any positive second argument on the command line.
Perl Solution
[colincrain:~/PWC]$ perl 54_2_call_me_collatz.pl 22 1
22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1
-----------------------------------
count number sequence length
-------+--------+------------------
1 837799 525
2 626331 509
3 939497 507
4 704623 504
5 910107 476
6 927003 476
7 511935 470
8 767903 468
9 796095 468
10 970599 458
11 546681 452
12 818943 450
13 820022 450
14 820023 450
15 410011 449
16 615017 447
17 886953 445
18 906175 445
19 922524 445
20 922525 445
-----------------------------------
largest value found was 56991483520
for number 704511
use warnings;
use strict;
use feature ":5.26";
## ## ## ## ## MAIN:
my $start = shift @ARGV // 23;
my $METAANALYSIS = shift @ARGV // 0;
## get the sequence and print it
say join ', ', make_collatz_sequence($start)->@*;
## metaanalysis section from here down in MAIN
exit unless $METAANALYSIS;
my $data = { seq_lengths => {},
highest_number => 0,
highest_value => 0 };
get_collatz_metadata($data);
## display length totals
my $count;
my @sorted = sort { $data->{seq_lengths}->{$b} <=> $data->{seq_lengths}->{$a}
||
$a <=> $b } keys $data->{seq_lengths}->%*;
say '-' x 35;
say ' count number sequence length';
say '-------+--------+------------------';
printf " %2d %6d %6d\n", ++$count, $_, $data->{seq_lengths}->{$_} for @sorted[0..19];
## display max number found
say '-' x 35;
say "largest value found was ", $data->{highest_value};
say "for number ", $data->{highest_number};
## ## ## ## ## SUBS:
sub make_collatz_sequence {
## given a positive integer, returns a reference to the array
## of the Collatz sequence associated with it
my $prev = shift;
my @seq = ($prev);
my $next;
while ($prev != 1) {
$next = next_collatz($prev);
push @seq, $next;
$prev = $next;
}
return \@seq;
}
sub next_collatz {
## get the next number according to the rules of the Collatz sequence
$_[0] % 2 == 0 ? $_[0] / 2
: 3 * $_[0] + 1;
}
sub get_collatz_metadata {
## run metaanalysis on the first 1,000,000 Collatz sequences
my $data = shift;
for my $number (1..1000000) {
my $prev = $number;
my $len = 1;
my $next;
while ($prev != 1) {
$next = $prev % 2 == 0 ? $prev / 2
: 3 * $prev + 1;
$prev = $next;
if ($next > $data->{highest_value}) {
$data->{highest_number} = $number;
$data->{highest_value} = $next;
}
$len++;
}
$data->{seq_lengths}->{$number} = $len;
}
}
Raku Solution
The Raku solution is a port of the Perl version.
sub MAIN (Int:D $start where {$start > 0} = 23, $do_meta = 0) {
say join ', ', make_collatz_sequence($start);
exit unless $do_meta;
my %sequences;
my %data = ( highest_number => 0,
highest_value => 0 );
get_collatz_metadata(%data, %sequences);
## display length totals
my $count;
my @sorted = (keys %sequences).sort({ %sequences{$^b} <=> %sequences{$^a}
||
$^a <=> $^b });
say '-' x 35;
say ' count number sequence length';
say '-------+--------+------------------';
for ^20 -> $idx {
printf " %2d %6d %-6d\n", $idx+1, @sorted[$idx], %sequences{@sorted[$idx]} ;
}
## display max number found
say '-' x 35;
say "largest value found was ", %data<highest_value>;
say "for number ", %data<highest_number>;
}
sub make_collatz_sequence ( $start ) {
my $current = $start;
my @seq = ($current);
my $next;
while ($current != 1) {
$next = next_collatz($current);
@seq.push: $next;
$current = $next;
}
return @seq;
}
sub next_collatz (Int:D $n ) {
## this confuses the compiler so we need to explicitly recast n/2 as an Int
## rather than a Rat which, by the way, is fine, because it always will remain
## an integer as n is even
$n %% 2 ?? ($n / 2).Int
!! 3 * $n + 1;
}
sub get_collatz_metadata ( %data, %sequences ) {
## run metaanalysis on the first 1,000,000 Collatz sequences
my $then = now;
for (1..1000000) -> $number {
my $prev = $number;
my $len = 1;
my $next;
while ($prev != 1) {
$next = $prev % 2 == 0 ?? $prev / 2
!! 3 * $prev + 1;
$prev = $next;
if ($next > %data<highest_value>) {
%data<highest_number> = $number;
%data<highest_value> = $next;
}
$len++;
}
%sequences{$number} = $len;
}
say "time: ", now - $then;
}