Wherein we explore the expressive nuances of getting the last word in, and mangle strings over and over until until we end up just where we started.
THE WEEKLY CHALLENGE – PERL & RAKU #63
Let me start off by saying things were thankfully a lot less theoretical and a lot better defined this week, allowing us some relaxing time to play around with techniques rather than just spending hours, then days looking for ways to first just understand and then try and solve a rather difficult problem, with somewhat unsatisfactory results to show for my work. I for one would rather not consider Spacechess again for some time, thank you very much. Not to say it wasn’t without its pleasures, but it was exhausting, and by the end I simply ran out of time to really figure the best way to go. I mean, it’s not like there’s really a correct answer, per se.
With that out of the way, let’s get to it.
TASK #1 › Last Word
Submitted by: Mohammad S Anwar
Lovingly Crafted by: Ryan Thompson
Define sub last_word($string, $regexp)
that returns the last word matching $regexp
found in the given string, or undef
if the string does not contain a word matching $regexp
.
For this challenge, a “word” is defined as any character sequence consisting of non-whitespace characters (\S
) only. That means punctuation and other symbols are part of the word.
The $regexp
is a regular expression. Take care that the regexp can only match individual words! See the Examples for one way this can break if you are not careful.
Examples
last_word(' hello world', qr/[ea]l/); # 'hello'
last_word("Don't match too much, Chet!", qr/ch.t/i); # 'Chet!'
last_word("spaces in regexp won't match", qr/in re/); # undef
last_word( join(' ', 1..1e6), qr/^(3.*?){3}/); # '399933'
Method
It seems the main idea behind this challenge is in the mechanics of passing a regular expression into a subroutine, and in handling a variety of edge-cases that can arise whilst still conforming to the required behavior.
In general, looking for the last occurrence of a thing requires a sense of state to be updated, but finding one solution and moving on is much simpler. So with that in mind, we will tokenize our string and look for a match starting from the end rather than the beginning. Returning after our first match will find that last match in the original string. Easy peasy.
Another detail is that the challenge explicitly asks for a subroutine, so that is what we will create, along with a wrapper to call it.
Raku Solution
For the challenge description, we are given Perl 5 regexes, and the request to return undef
. In Raku, we will need to update this a little, as PCRE requires a little expansion to allow for new abilities. The examples have been altered accordingly. Also, undef
has been replaced by Nil
, another trivial change in keeping with the spirit of the specification. After that, accomplishing the task is short and clear:
[colincrain:~/PWC]$ raku drop-the_mic.p6
hello
Chet!
«no match found»
399933
Note the subroutine asked for does return Nil
here, and the script when given Nil
to report chooses to say «no match found»
instead to avoid confusion with a match. Seems as good a way to handle this as any.
sub MAIN() {
my @pairs =
[' hello world', rx/<[ea]>l/],
["Don't match too much, Chet!", rx:i/ch.t/],
["spaces in regexp won't match", rx:s/in re/],
[(1 .. 1e6).join( ' ' ), rx/^(3.*?) ** 3/];
for @pairs -> ($string, $regex) {
(last_word( $string, $regex ) || '«no match found»').say
}
}
## ## ## ## ## SUBS:
sub last_word ($string, $regex) {
my @words = $string.split( /\s/, :skip-empty );
for @words.reverse {
return $_ if $_.match: /$regex/
}
return Nil;
}
Perl Solution
For the Perl 5 solution we did things a slight bit differently, choosing to pop items directly off the back of our word list, rather than reversing it. Presumably a little quicker in pathological cases, like our million element array made from a six million character string. Who does that, anyway? Well, we do. That is the answer.
use warnings;
use strict;
use feature ":5.26";
## ## ## ## ## MAIN:
my @pairs = ( [' hello world', qr/[ea]l/],
["Don't match too much, Chet!", qr/ch.t/i],
["spaces in regexp won't match", qr/in re/],
[ join(' ', 1..1e6), qr/^(3.*?){3}/] );
for my $parameters ( @pairs ) {
my $word = last_word($parameters->@*);
if (defined $word) {
say $word
}
else {
say "\t«no match found»";
}
}
## ## ## ## ## SUBS:
sub last_word {
my ($string, $regex) = @_;
my @words = split /\s/, $string;
my $word;
while (@words > 0) {
$word = pop @words;
return $word if ($word =~ m/$regex/);
}
return undef;
}
TASK #2 › Rotate String
Submitted by: Mohammad S Anwar
Intricately Crafted by: RYAN THOMPSON
Given a word made up of an arbitrary number of x
and y
characters, that word can be rotated as follows: For the ith rotation (starting at i = 1), i % length(word) characters are moved from the front of the string to the end. Thus, for the string xyxx
, the initial (i = 1) % 4 = 1 character (x
) is moved to the end, forming yxxx
. On the second rotation, (i = 2) % 4 = 2 characters (yx
) are moved to the end, forming xxyx
, and so on. See below for a complete example.
Your task is to write a function that takes a string of x
s and y
s and returns the minimum non-zero number of rotations required to obtain the original string. You may show the individual rotations if you wish, but that is not required.
Example
Input: $word = 'xyxx';
- Rotation 1: you get
yxxx
by movingx
to the end. - Rotation 2: you get
xxyx
by movingyx
to the end. - Rotation 3: you get
xxxy
by movingxxy
to the end. - Rotation 4: you get
xxxy
by moving nothing as 4 % length(xyxx
) == 0. - Rotation 5: you get
xxyx
by movingx
to the end. - Rotation 6: you get
yxxx
by movingxx
to the end. - Rotation 7: you get
xyxx
by movingyxx
to the end which is same as the given word.
Output: 7
Analysis and Method
Back in string manipulation world, in this case moving bits from the front to the back according to a progressive set of rules dependent on the iteration. This can be done in various ways, and I have come up with two for a demonstration, one for Raku, the other Perl. Either one works nearly the same in both languages.
One obvious way to proceed is to build a loop, do the transform, and jump out if the transformed string is the same as the initial. This is all well and good, but does follow through to the next question, which is whether it will always eventually find a solution, or perhaps we might perchance get stuck in an infinite stable oscillation of some sort.
This got me thinking more about what we’re actually doing here. And the big reveal is that looked at the right way, the sequence of characters never changes. When we move the first character of the string to the end, the last character is now followed by the first. No matter how many characters are moved over at a time, this relationship will always remain true, and each character within the string will at all times be followed (and proceeded by) the same characters as in the original untransformed string. All we have done is create a new rule, that the end of the string is immediately followed by the beginning. In topology this is known as a loop, a function over an interval where the state at the end of the interval is exactly the same as the beginning. Think of a donut, being a circle rotated in space to form a torus. We have made something quite like that, but with a string.
We have no natural data type for a loop of string, but using modular arithmetic we can pretend we do, leaving the string alone and just changing the index of the start point, or, in topology what is known as the base point. As long as when moving forward we treat the current position as mod the length of the string, we will remain on the loop. This is akin to pointing to a point on a circle and declaring that the circle starts here: no matter where you start, one can always trace a complete circle from that point. As a matter of fact, once we have the logic to move the starting point in our loop according to the rules of the given progression, we can then determine that we have worked our way through a complete cycle by observing when our start point returns to 0. At this level of abstraction, we have completely abandoned our actual text; we don’t even need the string anymore to do our math, and can just count the iterations. Neat!
It is worth noting before we get too excited that what we’ve solved here this isn’t exactly our problem. It does however give a lot of insight into what we were tasked with. In our thought experiment, it doesn’t matter what the characters of the string are, because given enough rotations the start point will work its way back to 0. But for our challenge we do care: in a random assemblage of xs and ys, an objectively differently ordered but identical pattern might arise by happenstance, so we will need to construct and examine the intermediate steps to check for this. But before that, to follow through in our analysis, what we can do is construct a little program:
for my $len (1..500) {
my $idx = 0;
my $i;
while (++$i) {
$idx = ($idx + ($i % $len)) % $len;
last if $idx == 0;
}
say "$len $i"; ## string length and number of rotations to cycle
}
which, for any arbitrary length of string, will provide an upper bound of the maximum number of steps required to come full circle.1
Now for an aside: every week, after reading through the weekly challenge, my first course of action is to fire up a new BBEdit script template and paste in the text from the website in a comment up at the top for easy reference, and perhaps a play on words. Because of this, I’m aware of a ninja-edit in the pasted copy from later in the week, as reproduced above, with the repair of a slight typo. Originally, and I must say I didn’t even notice and automatically assumed the correct understanding, the line said: “…returns the maximum non-zero number of rotations…”. As this on the face of it doesn’t make a whole lot of sense, I fixed it in my mind, substituted minimum and moved on. Out for a walk earlier it occurred to me I might not have solved the right problem and had to check and make sure. All was well, but sure enough I hadn’t made it up.
But, if you really want to know, with our little side script we have indeed answered the original question as asked, finding the maximum number of rotations required for any given string length.2
1 The sequence of maximum values is really interesting. All values for any length finish, with local outlier maxima for powers of 2. Those values finish in 2n-1 rotations. The next most common recurrence is n – 1, which occurs for odd numbers, with exceptions. All other cycles are shorter than this, and expressible with a rational fraction of string length to rotations, such as ½.
2 For the love of all things complete, the progressions are indeed cyclic, so once we wrap around we can repeat the process as many times as we wish. So technically the answer is always positive infinity for any length, but that’s not a very interesting interpretation.
Raku Solution
Notable in the Raku solution is how elegant it is to create a string of random xs and ys between 4 and 12 characters long. These constraints are rather arbitrary, but demonstrate the function well. To move the string parts we use a substitution routine.
raku chopped_and_screwed.p6 -v
starting string: xxxyxy
move 1 shifting 1 chars: xxyxyx
move 2 shifting 2 chars: yxyxxx
move 3 shifting 3 chars: xxxyxy
rotations required: 3
sub MAIN (Bool :$v = False) {
say "\nrotations required: ", churn( xys, $v );
}
sub xys {
# creates a string of xs and ys of random length 4..12
return [~] <x y>.roll( (4 .. 12).pick );
}
sub churn ($base, $v) {
my $shifted = $base;
my $i = 1;
say "starting string: ", $base, "\n" if $v;
while (1) {
my $moves = $i % $base.chars;
$shifted ~~ s/(\S ** { $moves })(.*)/$1$0/;
printf "move %-2d shifting %2d chars: %s\n", $i, $moves, $shifted if $v;
return $i if $shifted eq $base;
$i++;
}
}
Perl Solution
Just for kicks, we do the transformation here using substr
, lifting out the desired prefix and modifying the original string at the same time, which we then graft on to the end of the remaining tail. It’s a succinct solution I dare say, doing a lot with very few words. Versus the simplicity of the substitution version, I can’t decide which I like best.
use warnings;
use strict;
use feature ":5.26";
## ## ## ## ## MAIN:
my $string = make_xys();
my $rotations = churn($string);
say "\nrotations required: $rotations";
## ## ## ## ## SUBS:
sub churn {
my $base = shift;
my $shifted = $base;
my $i;
say "starting string: ", $base, "\n";
while (++$i) {
my $moves = $i % length($base);
$shifted .= substr( $shifted, 0, $moves, '');
printf "move %-2d shifting %2d chars: %s\n", $i, $moves, $shifted;
return $i if $shifted eq $base;
}
}
sub make_xys {
my $len = shift // int( rand(12) ) + 4;
my @xy = ('x','y');
my $out;
for ( 1..$len) {
$out .= $xy[int(rand 2)] ;
}
return $out;
}
One thought on ““Drop the Mic” – Chopped and Screwed Remix”