THE WEEKLY CHALLENGE – PERL & RAKU #62
TASK #1 › Sort Email Addresses
Submitted by: Neil Bowers
Reviewed by: Ryan Thompson
Write a script that takes a list of email addresses (one per line) and sorts them first by the domain part of the email address, and then by the part to the left of the @ (known as the mailbox).
Note that the domain is case-insensitive, while the mailbox part is case sensitive. (Some email providers choose to ignore case, but that’s another matter entirely.)
If your script is invoked with arguments, it should treat them as file names and read them in order, otherwise your script should read email addresses from standard input.
Bonus
Add a -u
option which only includes unique email addresses in the output, just like sort -u
.
Example
If given the following list:
name@example.org
rjt@cpan.org
Name@example.org
rjt@CPAN.org
user@alpha.example.org
Your script (without -u
) would return:
user@alpha.example.org
rjt@cpan.org
rjt@CPAN.org
Name@example.org
name@example.org
With -u
, the script would return:
user@alpha.example.org
rjt@CPAN.org
Name@example.org
name@example.org
Method
In Perl, having cascading sort criteria makes sorts like this pretty straightforward, once we’ve figured out what those criteria are and how to reach the relevant data. In this case we need to separate out the domain portion of the address and lowercase it (or uppercase, if you’re the shouty type) and first sort by that, once that is finished we need to separate out the mailbox as well and secondarily sort indeterminate items on that.
The unique filter is a little interesting because we need to do a transformation of the addresses first before we can determine what “unique” means under these circumstances. In this we need to lowercase (again: or upper if you like to shout where no one can possibly hear you) only the domain portion of the address so as to make that portion case-insensitive.
Because the sorting requires us to selectively look at only portions of the address at any one time, we will need to dissect the address into its mailbox and domain components. For the Perl solution, we will split on the ‘@’ symbol required in every email address and examine the resulting array elements.
For the unique filter we will modify the idiom of populating a %seen hash and checking for key existence. We will need to compare whole addresses as viewed through a certain lens, so we will modify the strings using non-destructive substitution and use these as hash keys.
One noteworthy difference in this change is the quite specific user interface; rather than a logical proof of concept we are being asked to implement a full-blown command line tool. Fortunately for us Perl grew out of an intimate relationship with the UNIX command line, so it contains many little tweaks that have evolved out of that growth. One of these is the behavior of the Diamond Operator <>
, a specially wrought filehandle-like thing which reads lines either preferentially from a list of files in @ARGV
, opening them sequentially until there are no more, or in the absence of files, from STDIN
. This is not coincidentally the way many, many command line tools work and so conveniently exactly the behavior specified in the challenge. In list context it’s slurpy and so @lines = <>
will deliver us our list of email addresses no matter how they are presented.
To bring in the -u switch we will use the core module Getopt:Std
. A few lines of code
my %options=();
getopts("u", \%options);
will then allow us to access $options{u}
as a boolean. To actually implement the behavior once we’ve turned it on we simply grep the email list using the %seen variant described above, to prune it before we waste time sorting.
Perl Solution
use warnings;
use strict;
use feature ":5.26";
use Getopt::Std;
# declare the perl command line flag '-u'
my %options=();
getopts("u", \%options);
## ## ## ## ## MAIN:
my @list = <>;
chomp @list;
my $sort = \&complex_sorter;
my @sorted = sort $sort @list;
## optionally apply -u switch
@sorted = unique(@sorted) if $options{u};
say $_ for @sorted;
## ## ## ## ## SUBS:
sub complex_sorter {
## sorts case insensitive on domain
## case sensitive on mailbox
## returns -1, 0, 1®
my @a_list = split /(@)/, $a;
my @b_list = split /(@)/, $b;
lc($a_list[2]) cmp lc($b_list[2])
||
$a_list[0] cmp $b_list[0]
}
sub unique {
## filters a list for unique email, with first occurence preserved
## domain segment is case-insensitive
my %seen;
return grep { ! $seen{ s/(\@.*$)/lc($1)/er }++ } @_;
}
Raku Solution
In Raku, we don’t have the convenience of the diamond operator, so we do need to manage our input a little more. This is made up for in other ways, as we can more easily specify the transformations required before determining uniqueness and sort ordering. Being able to specify the :as(&{BLOCK})
adjective to the built-in unique routine is a particularly useful addition, which allows us to attach an arbitrary which will determine the specifications for our uniqueness test. The sort specifications are streamlined as well; providing a routine that takes one parameter will be applied as an {$a cmp $b}
sort automatically, and if given a list of such functions will chain them to resolve equivalencies like using || in a Perl 5 sort.
Command line flags are handled in the signature directly by the inclusion of an optional Boolean type, here as Bool :$u?=False. If a -u switch is included, it will come up as $u = True, allowing us to test for it within the MAIN block directly.
#!/usr/bin/env perl6
sub MAIN(Bool :$u?=False, *@files) {
my @addr;
## if no files on the commandline, look for lines coming in from STDIN
if @files.elems {
@addr.append( .IO.lines ) for @files;
}
else {
@addr = $*IN.lines;
}
## filter unique items after applying lowercase to domain
@addr .= unique(:as(&{ S:i/(\@.*$)/{lc($/)}/ })) if $u;
## sort by lowercased domain then by mailbox
my @sorted = @addr.sort(&{.substr(.index("@")+1).lc, .substr(0,(.index("@")))});
.say for @sorted;
}
TASK #2 › N Queens
Submitted by: Ryan Thompson
A standard 8×8 chessboard has 64 squares. The Queen is a chesspiece that can attack in 8 directions, as shown by the green shaded squares below:
It is possible to place 8 queens on to a single chessboard such that none of the queens can attack each other (i.e., their shaded squares would not overlap). In fact, there are multiple ways to do so, and this is a favourite undergraduate assignment in computer science.
But here at PWC, we’re going to take it into the next dimension!
Your job is to write a script to work with a three dimensional chess cube, M×M×M in size, and find a solution that maximizes the number of queens that can be placed in that cube without attacking each other. Output one possible solution.
Example
A trivial 2×2×2 solution might look like this (1 = queen, 0 = empty):
[
[[1,0], # Layer 1
[0,0]],
[[0,0], # Layer 2
[0,0]],
]
Method
The 8 Queens and its related n-queen analogues for generalized chessboards is a well known and documented problem. Expanding it into the 3rd, and n-th, dimension is substantially less so. So here now are, out in fairly new ground.
Usually I can get an understanding and from that a methodology for these challenges in a fairly short time, then once a prototype is sketched out I can spend my time exploring the problem space further, working up more elegant solutions and carefully writing out my results for posterity. On a good week maybe translate some Old Norse or hunt down a reference for a Zen koan to craft a side commentary. Not so this week. No, that did not happen.
Let’s start by stating that I’ve never thought much about 3D chess at all. To be honest I’ve never been much for chess in general, for no particular reason. I know how to play, of course, and certainly have played at times throughout my life. I enjoy all manner of games, so it’s not that. I do enjoy it. I suppose I have come to understand the game just enough to see just how much I don’t truly understand the game, and never have invested the time required to change that fact, and ultimately there’s only so many things one can do with oneself. There’s only so many hours of the day.
Consequently I never was much more than merely aware of the existence of 3D chess, so after mulling over a strategy for the email sorting problem for a short while I dove right down the rabbit hole into that weird place.
There are many chess variants out there, but for 3D chess as actually played, Raumschach, or space chess, is apparently the most common. It is played on a 5x5x5 board, with a starting field of 20 pieces of 7 basic types: King, Queen, Bishop, Knight, Rook, Pawn and Unicorn. “Wait what?” You ask. Ahh, you noticed that I see. The obvious next question is “Why?”
The answer to all this lies in the various symmetries of a cube versus those of a square. On a square chess board, we have two orthogonal symmetries and two diagonal symmetries. The Rook piece is allowed unlimited travel along the line of any one orthogonal symmetry, the Bishop the same but along a diagonal line. The Queen combines these two freedoms to allow unlimited travel along any one of the four axes.
In a cube things get rather more complex. For orthogonal lines, we can align along our original x and y axes, but now an additional z. For the diagonals, we have two lines in any given x-y plane, and in addition now we have similar lines in the x-z and y-z planes as well, for a total of 6 axes. On top of these 9, in a cube we have a new form, a triagonal symmetry, marking rotation about pairs of opposing corners, of which we have 4. This direction through a cube has no analogue within the 2-dimensional plane. The Rook and Bishop maintain their movements in their dimensional transformation, with now 3 and 6 degrees of freedom respectively, but to move in these 4 new directions we need a new piece, and that piece is the Unicorn.
The Queen, thus, in this world adopts the movement of the Rook, the Bishop and now the Unicorn, to maintain her unrestrained freedoms. The Spacequeen has arrived.
The challenge, as stated, is to maximize the number of queens that can be placed in an MxMxM cube without attacking each other. In the 8-queens problem, it’s readily apparent that there can be no more than 8 queens placed, as there can be no more than 1 per row and column. In three dimensions nothing is so obvious. All we are given is the request to maximize the number fitted, suggesting we try fitting more and more until we can fit no more, never quite sure we are right unless we have tried all possible solutions. And the the spatial possibilities are immense, with (M3)! ways to fill a cube to start. If we are to compute this before the heat death of the universe we must do our best to rein in the progression.
A spacequeen is an immensely powerful piece, dominating along 13 axes, or radially in 26 directions if you prefer to look at it this way. In any method of populating the board we will need to be able to identify conflicts with existing pieces, so the first course of action is to locate the squares along those axes that lie within the confines of the board. Due to the sheer number of degrees of freedom available this task was daunting in itself.
I had the distinct feeling when writing the attack_vectors()
routine that things might have gone better for me if I had ever built a 3-D game, or perhaps rolled my own renderer. I suspect there may be some simple rules of thumb out there that I could be privy to, but wasn’t. As it was, we work with what we know, so armed with a certain knowledge of geometry, I just had at it. Visualizing the Unicorn was certainly the hardest part. Discovering the relationship between the Bishop equations and adding the z-dimension was key.
Here’s the equations for the Bishop I came up with:
BISHOP:
Queen Position: (Qx,Qy,Qz)
y ∝ x
x ∈ {0..max} x ≠ Qx
⎡ y = x + (Qy – Qx) | 0 ≤ y ≤ max
⎣ y = –x + (Qy + Qx) | 0 ≤ y ≤ max
z ∝ x
x ∈ {0..max} x ≠ Qx
⎡ z = x + (Qz – Qx) | 0 ≤ y ≤ max
⎣ z = –x + (Qz + Qx) | 0 ≤ y ≤ max
z ∝ y
y ∈ {0..max} y ≠ Qy
⎡ z = y + (Qz – Qy) | 0 ≤ y ≤ max
⎣ z = –y + (Qz + Qy) | 0 ≤ y ≤ max
As they say, any problem is either impossible or trivial. Or, it seems so easy once you know the answer.
The derived method is broken down into sections, solving individually to find the Rook, Bishop and Unicorn attacks. As all three sections hinge off a common n iterator serving to represent various components of the functions at different times, these sections could be combined into one big loop, but I have chosen not to refactor to keep things that much clearer. It would probably be a little faster that way, but not much. It would be nice to not compute squares out of our range, but in order to determine whether or not we’re in range we need to compute some squares anyway. The best we can do is control one dimension with the iterator, and after the fact filter the results, to make sure all elements in a given array are between 0 and the largest index, inclusive. The point arrays are stringified into a list of hash keys and returned.
By making the chessboard a hash, with stringified points associated to their respective arrays, once we have placed a queen we can easily remove that square and all of the squares along the lines of attack from play by deleting keys. We can also easily create a list of all remaining squares by calling keys()
on the board, knowing that all remaining squares are unconflicted. We shift this list to choose from the remaining squares for the next placement, likewise further constraining our field of play, and continue in this fashion until there are no more available moves. When this happens we count the queens on the board, note the board layout if it exceeds the current maximum, backtrack as required to find a different move and continue looking with a different configuration. Backtrack until we’ve exhausted all initial placements. Whew. Yea, and as expected factorial time blows up quite quickly, even with every queen attacking all of those axis lines. The terrible truth is the monarch’s relative power decreases as her kingdom grows. As we increase size of the board, the attack space of the queen grows roughly linearly, while the square count is cubic.
There is one more way to cull the possible moves, and that is to limit the initial placement of the first queen. Because our cube has so many symmetries and reflective planes, on the blank slate many squares are equivalent. For instance, merely turning the cube in space can move any of the eight vertexes to represent the [0,0,0]
square. Similarly the squares [1,2,0] and [2,1,0] are reflected along the line x = y and thus for any construct built from one of these two there exists a mirrored version from the other, so only one need be considered. The points in the trapezoid described by get_initial_section()
will, through transformation, serve for any initial location within the cube. After one queen placement, however, chirality is established and we need to return to considering the entirety of the unoccluded spaces. So it’s only applicable to the first move. But this does reduce the initial placement options by a factor of up to 48 times, so that’s nice.
The algorithm works, but the constraints provided ultimately cannot stem the cubic factorial growth of the problem space. I have exhaustively solved it for n = 3, (4 queens, instantaneous)
000, 120, 201, 012
and n = 4 (7 queens, 92 seconds). After 2.5 hours I have a solution to n = 5 (13 queens), which I believe from the literature to be the maximum known. I have a solution of 17 queens for n = 6
000, 034, 201, 015, 324, 332, 540, 141, 452, 053, 535, 103, 405, 410, 523, 244, 120
and understand there is a known solution with 21, but have no idea how long it would take to find that. Perhaps I will let it run all night and see how far we get.
[colincrain:~/PWC]$ ./queen3.pl
solutions: 1 000, 221, 103, 313, 031
solutions: 2 000, 221, 103, 032, 233, 313
solutions: 15 000, 221, 102, 301, 313, 032, 233
347950 solutions checked
max solution:
7 queens
[000]
[221]
[102]
[301]
[313]
[032]
[233]
use warnings;
use strict;
use feature ":5.26";
use POSIX qw( ceil );
use List::Util qw( all );
## ## ## ## ## MAIN:
my $then = time;
my $size = shift // 6;
my $base_board = build_gameboard( $size );
my $solution = [];
my $count;
my $starting_positions = get_initial_section( $size );
for my $start ( $starting_positions->@* ) {
my $this_board = remove_conflicts( $base_board, $start );
my $queenlist = [$start];
place_next_queen($this_board, $queenlist, $solution);
}
say "$count solutions checked";
say "max solution:";
say +(scalar $solution->@*), " queens";
for my $square ($solution->@*) {
say "[", $square, "]";
}
say "elapsed: ", time - $then;
## ## ## ## ## SUBS:
sub place_next_queen {
## select an open position on the board and continue the recursion
## after removing conflicts from the new queen
my ($board, $queenlist) = @_;
my @open_positions = keys $board->{squares}->%*;
for my $queen ( @open_positions ) {
my $new_list = [$queenlist->@*, $queen];
my $new_board = remove_conflicts( $board, $queen );
if (scalar keys $new_board->{squares}->%* == 0) {
$count++;
if (scalar @$new_list > scalar @$solution) {
$solution = $new_list;
say "solutions searched so far: ", $count, "\n", (join ", ", $new_list->@*);
}
return;
}
place_next_queen( $new_board, $new_list);
}
}
sub remove_conflicts {
## given a board and a queen placement hashkey,
## calculates attack vectors for the queen and removes conflicting
## squares from the board
## returns new board
my ($prev_board, $queen) = @_;
my %squares = $prev_board->{squares}->%*;
my $board = { "size" => $prev_board->{size},
"squares" => \%squares };
my $conflicts = attack_vectors( $board->{squares}->{$queen}, $board->{size} );
for my $squarekey ( $conflicts->@* ) {
delete $board->{squares}->{$squarekey}
}
delete $board->{squares}->{$queen};
return $board;
}
sub build_gameboard {
## given a cardinal size of one edge
## a hash of every square in the cube
## keys are stringified points
## values are array refs to those points as [$x, $y, $z]
## returns reference
my $size = shift;
my $board = {};
$board->{size} = $size;
for my $x (0..$size-1) {
for my $y (0..$size-1) {
for my $z (0..$size-1) {
my $key = hashkey([$x, $y, $z], $size);
$board->{squares}->{$key} = [$x, $y, $z];
}
}
}
return $board;
}
sub get_initial_section {
## under the many symmetries in the cube, most initial positions are
## equivalent under rotation or mirroring to many others. This function
## returns a minimal set of starting points on the gameboard that taken
## together under transformation represent the entirety.
## On a 5x5x5 board the squares
## (0,0,0)(1,0,0)(2,0,0)
## (1,1,0)(2,1,0)
## (2,2,0)
## (1,1,1)(2,1,1)
## (2,2,1)
## (2,2,2)
## can be mapped to all other
## possible starts. This tetrahedron can be as little as 1/48* of the initial
## possibility space, but because the squares divided along the lines of symmetry
## are included whole, the actual count will be higher, converging on 1/48.
#
## * "I have a truly marvelous demonstration of this proposition
## which this margin is too narrow to contain."
my $size = shift;
my $end = ceil(($size)/2)-1;
my @output;
my ($x, $y, $z);
for $z (0..$end) {
for $y ( $z..$end) {
for $x ($y..$end) {
push @output, [$x,$y,$z];
}
}
}
my @keys = map {hashkey($_, $size)} @output;
return \@keys;
}
sub attack_vectors {
## given a queen placement as [x,y,z]
## calculates spacequeen attack vectors within a cube of a given size
## returns a list of attack squares as hashkeys
my ($queen, $size) = @_;
# say "($queen, $size)";
my ($x, $y, $z) = $queen->@*;
my @attacks;
## Rook attacks
for my $n (0..$size-1) {
push @attacks, ([$n,$y,$z]) unless $n == $x;
push @attacks, ([$x,$n,$z]) unless $n == $y;
push @attacks, ([$x,$y,$n]) unless $n == $z;
}
## Bishop attacks
for my $n (0..$size-1) {
## y∝x n->x' y=(1|-1)x + b
push @attacks, ([$n,$n+($y-$x),$z],[$n,-$n+($y+$x),$z]) unless $n == $x;
## z∝x n->x' z=(1|-1)x + b
push @attacks, ([$n,$y,$n+($z-$x)],[$n,$y,-$n+($z+$x)]) unless $n == $x;
## z∝y n->y' z=(1|-1)y + b
push @attacks, ([$x,$n,$n+($z-$y)],[$x,$n,-$n+($z+$y)]) unless $n == $y;
}
## Unicorn attacks
for my $n (0..$size-1) {
## (0,0,0)->(n,n,n) axis
push @attacks, ([$n,$n+($y-$x),$n+($z-$x)]) unless $n == $x;
## (n,0,0)->(0,n,n) axis
push @attacks, ([$n,-$n+($y+$x),-$n+($z+$x)]) unless $n == $x;
## (0,n,0)->(n,0,n) axis
push @attacks, ([$n,-$n+($y+$x),$n+($z-$x)]) unless $n == $x;
## (n,n,0)->(0,0,n) axis
push @attacks, ([$n,$n+($y-$x),-$n+($z+$x)]) unless $n == $x;
}
## filter out out-of-range attacks
my @output = grep {
(all {$_ >= 0} $_->@*) && (all {$_ < $size} $_->@*)
} @attacks;
## convert to hashkeys
my @keys = map {hashkey($_, $size)} @output;
return \@keys;
}
sub hashkey {
my ($point, $size) = @_;
my $digits = scalar( split //, $size-1 ) ;
my $key_format = join '', ('%0'.$digits.'d') x 3;
my $key = sprintf $key_format, $point->@*;
return $key;
}
2 thoughts on “The Spacequeen’s Sordid Emails”