Directed Acyclic Graph – Page 2 – Programming Excursions in Perl and Raku

Wherein we walk the lonely city streets at night, listening to Lucinda Williams and St. Vincent, looking for the right words, staring at strings of strange characters lurking in the shadows.

THE WEEKLY CHALLENGE – PERL & RAKU #64

First an anecdote: years ago, doing document transformations in a Perl shop, I was instructed, when inquiring about speed, to not worry about it too much. Every job was a different spec, with different criteria, for different data –– so being adaptable, nailing down the logic, tweaking and tuning the output was the paramount thing, and manipulating text was what Perl was made to do. We had some perky SGI boxen; if something was bogging down it could probably be reworked, and if everything went south we could always rewrite anything in C. We rarely needed to do any of that.

My point here not being to piss off the gods of algorithmic time complexity, but rather to stress how great Perl is to quickly prototype new logic, something it undeniably excels at. As such when I approach these problems every week, once I’ve come up with a basic plan, I just open up a template in BBEdit and start coding. Then, when I have it figured out and working in Perl, I’ll use that to write a Raku version, much as the hypothetical rewrite in C that I spoke of earlier. I will also have taken notes by that point, made choices; sometimes a completely different method comes to mind, so I do that instead. So the Raku scripts are in theory more refined and thought out, even if I’m not as familiar with the language itself. So why on Earth bury the lede? Let’s put our best foot forward, I say.

TASK #1 › Minimum Sum Path

Submitted by: Mohammad S Anwar
Remelted and Refined by: RYAN THOMPSON

Given an m × n matrix with non-negative integers, write a script to find a path from top left to bottom right which minimizes the sum of all numbers along its path. You can only move either down or right at any point in time.

Example

Input:

[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]

The minimum sum path looks like this:

Thus, your script could output: 21 ( 1 → 2 → 3 → 6 → 9 )

Method

What we have here is a matrix of values, located at the points of a multidimensional array. We are connecting adjacent points with potential pathways, but restricting travel on those pathways to only the left-to-right and top-to-bottom directions. We need a method to follow routes through this grid from point to point, tallying the values as we go. From this we can determine the correct answer.

The structure we have made is known as a Directed Acyclic Graph, and is useful to model many things with a series of choices towards a goal. We will start by looking at the underlying structure of a simple 3×4 array, with the points labeled, rather than the values stored there:

        0,0     0,1     0,2     0,3

        1,0     1,1     1,2     1,3

        2,0     2,1     2,2     2,3

rotating the whole thing clockwise 45° makes the underlying graph easier to see.

#                  (0,0)      <-- START
#                  ⬋   ⬊
#              (1,0)   (0,1)   
#              ⬋   ⬊   ⬋   ⬊
#          (2,0)   (1,1)   (0,2) 
#              ⬊   ⬋   ⬊   ⬋   ⬊
#              (2,1)   (1,2)   (0,3) 
#                  ⬊   ⬋   ⬊   ⬋
#                  (2,2)   (1,3) 
#                      ⬊   ⬋
#                      (2,3)  <-- END

It’s like a tree that links back into itself, and we progress from top to bottom, traveling inexorably downward, as in a pachinko machine with only one pocket. There’s a juicy metaphor in there somewhere. In any case as can easily be seen there are many ways to proceed, but if we remain bound to the restrictions we will always end up at the same endpoint.

When situated at any given point, on the other hand, we are only allowed at maximum two choices in direction. If we build a recursive function that will follow each open pathway available at the current node, by the time we get to the endpoint we will have logged every possible route. Then we can take those routes, as lists of points, and do a lookup to the original values at each point to do the sums. The smallest of these is the solution. Because we are asked to find “a path” with the minimal sum, in the case of multiple equal answers any one will do.

Raku Solution

For the Raku solution I wanted to build some classes to compartmentalize the ideas of the graph and its vertices. It seemed a good way to separate the underlying structure from the data. I thought about serializing the input grid, with an x and y value followed by the values in a long list, but decided it distracted from the logic while providing little gain, so I hardwired in a crafted example array. There is a -v verbose command line switch though, which adds a pretty printing of the original grid and a list of the point coordinates of the vertices in the final route.

To start we need a simple Vertex, which holds an x and y coordinate and a gist method for display, as (x,y).

In the Grid class for data we have just the input grid. We have added methods to:

find the endpoint and return it (as a Vertex)
find the theoretical following two Vertex points, which are composed by adding 1 to either the x value or the y value
determine whether a given Vertex is within the Grid or not
sum the values referenced along a given route, by mapping to the $.grid data and using the sum [+] metaoperator.

In the MAIN block we have input, output and two logic sections, to trace the paths and determine the minimum sum.

The find_nodes() routine first checks whether the given Vertex is the endpoint, logging the completed chain and returning if so. It then posits two potential new Vertexes, treating their creator methods as proper first-order functions in a for loop. For each of these we check to see whether its remains within the Grid, and if so we clone the route so far, extending it with that new Vertex, and recurse with the new parameters. Eventually all routes lead to the endpoint and terminate.

Summing a given list of vertices from outside is using the data from the Grid object to compute and so it seems a method of that object is a good place to put that logic. On the other hand, using those sums to find the minimum sum of a variety of routes isn’t particularly intrinsic to the data structure so we leave that in the MAIN block. This task is easily dispatched in Raku:

    my $minpath = @paths.min( { $graph.sum_route( $_ ) } );

In this method we are providing a code block to apply to the data; it returns the minimum value of the array, as determined by those transformed values returned by the block. This design pattern is available for all of the min/max list functions, as if you ask me is very, very cool. Think of it as being analogous to handing sort() a block to determine the ordering. One thing though: because we’ve used this first-class function option the results of the selection function aren’t saved, and after we’ve found our minimum route we will need to recompute the sum for that minimal route we’ve found. Yes this is not perfect efficient, and we could use a map and a hash and save the value, but this way is both ridiculously easy and rather succinct so we’re going to go with that.

[colincrain:~/PWC]$  raku six-blocks-away.p6 -v
grid:
  1   16   12   43   48   19
 13    7    9   16   26    8
 23   18    6   11   15   17
 22   33   28    5   36   32
 38   43    9   46    3   42
 56    4   66   76   25    2
 27   10   58   14   68   52
minimum sum: 170
route:
[(0,0) (0,1) (1,1) (2,1) (2,2) (3,2) (3,3) (4,3) (4,4) (4,5) (5,5) (5,6)]
1 ➔ 13 ➔ 7 ➔ 9 ➔ 6 ➔ 11 ➔ 5 ➔ 36 ➔ 3 ➔ 25 ➔ 2 ➔ 52

#!/usr/bin/env perl6

class Vertex {
    has Int $.x;
    has Int $.y;

    method gist {
        return "($.x,$.y)";
    }
}

class Grid {
    ## a special case of a rectangular grid DAG where we can
    ## only progress rightwards or downwards between vertices
    has $.grid;

    method endpoint () {
        ## lower right corner of the grid
        return Vertex.new: :x(self.grid[0].end), :y(self.grid.end)
    }

    method down_edge ($vertex) {
        return Vertex.new: :x($vertex.x), :y($vertex.y + 1)
    }

    method right_edge ($vertex) {
        return Vertex.new: :x($vertex.x + 1), :y($vertex.y)
    }

    method out_of_bounds ($vertex) {
        ## Bool is vertex outside the grid?
        return ($vertex.x > self.endpoint.x || $vertex.y > self.endpoint.y)
                ?? True
                !! False
    }

    method sum_route ($route) {
        ## given a path of vertices, return the sum of the values
        my $sum = [+] $route.map( { $.grid[.y][.x] } );
        return $sum;
    }
}

sub MAIN (Bool :$v = False) {

    ## input
    my @grid = [1,  16, 12, 43, 48, 19],
               [13,  7,  9, 16, 26,  8],
               [23, 18,  6, 11, 15, 17],
               [22, 33, 28,  5, 36, 32],
               [38, 43,  9, 46,  3, 42],
               [56,  4, 66, 76, 25,  2],
               [27, 10, 58, 14, 68, 52];

    my $graph = Grid.new: :grid(@grid);
    my @paths;
    my $startpoint = Vertex.new: :x(0), :y(0);
    my $route = [$startpoint];

    find_nodes( $route, $startpoint, $graph, @paths );

    my $minpath = @paths.min( { $graph.sum_route( $_ ) } );
    my $minsum = $graph.sum_route( $minpath );

    ## output
    say "grid:"                                 if $v;
    (.fmt( '%3d', '  ' ).say for $graph.grid)   if $v;
    say "minimum sum: $minsum";
    say "route:";
    say $minpath                                if $v;
    $minpath.map( { $graph.grid[.y][.x] } ).join( ' ➔ ' ).say;
}

sub find_nodes ($route, $vertex, $graph, @paths) {

    if $vertex eqv $graph.endpoint( ) {
        @paths.push: $route;
        return;
    }
    for ($graph.down_edge( $vertex ), $graph.right_edge( $vertex )) -> $next_vertex {
        next if $graph.out_of_bounds( $next_vertex );
        my $new_path = [|$route, $next_vertex];
        find_nodes( $new_path, $next_vertex, $graph, @paths );
    }
}

Perl Solution

In the Perl 5 version, as promised, things are quite straightforward and direct. Walking down the script we have an input section, where we also determine the endpoint. We then find our routes, using a find_node() routine similar to that in the Raku in logic, but in this case bifurcating into two independent forks for downward pointing edges and rightward. There is a little code duplication, but then again the flow is quite clear. I actually refactored this to more resemble the Raku version, which saved a total of 4 lines. This works just fine, but at the cost of quite a lot of clarity. So I left it as is, in it’s simplicity. One might also not that as is, the script here is already quite a bit shorter without the classes, at 58 lines versus the Raku’s 84. I do like the way the Raku classes worked out though, so I regret nothing.

Words to live by.

use warnings;
use strict;
use feature ":5.26";

## ## ## ## ## MAIN:

my $graph = [   [  1,  16,  12,  43,  48,  19 ], 
                [ 13,   7,   9,  16,  26,  8  ],
                [ 23,  18,   6,  11,  15,  17 ],
                [ 22,  33,  28,   5,  36,  32 ],
                [ 38,  43,   9,  46,   3,  42 ],
                [ 56,   4,  66,  76,  25,   2 ],
                [ 27,  10,  58,  14,  68,  52 ]   ];
                
my $endpoint = [$graph->@* - 1, $graph->[0]->@* - 1];

## determine the paths through the grid
my @paths;
my $startpoint = [0,0];
my $path = [$startpoint];
find_nodes( $path, $startpoint );

## sum totals to find the smallest
my $minsum = "+Inf";
my $minpath;
for $path (@paths) {
    my $sum = 0;
    $sum += $graph->[@$_[0]][@$_[1]] for @$path;
    if ($sum < $minsum) {
        $minsum  = $sum;
        $minpath = $path;
    }
}

## output
say "minimum sum path:";
print join ' -> ', map { $graph->[@$_[0]][@$_[1]] } @$minpath;
say "\nsum is $minsum";

## ## ## ## ## SUBS:

sub find_nodes {
    my ( $path, $point ) = @_;
    if ( $point->[0] == $endpoint->[0] &&
         $point->[1] == $endpoint->[1]    ) {
        push @paths, $path;
        return;
    }        
    unless ($point->[0] + 1 > $endpoint->[0]) {
        my $next_point = [$point->[0] + 1, $point->[1]];
        my $new_path = [$path->@*, $next_point];
        find_nodes( $new_path, $next_point)
    }
    unless ($point->[1] + 1 > $endpoint->[1]) {
        my $next_point = [$point->[0], $point->[1] + 1];
        my $new_path = [$path->@*, $next_point];
        find_nodes( $new_path, $next_point)
    }
}

TASK #2 › Word Break

Submitted by: Mohammad S Anwar

You are given a string $S and an array of words @W.

Write a script to find out if $S can be split into sequence of one or more words as in the given @W.

Print the all the words if found otherwise print 0.

Example 1:

Input:

$S = "perlweeklychallenge"
@W = ("weekly", "challenge", "perl")

Output:

"perl", "weekly", "challenge"

Example 2:

Input:

$S = "perlandraku"
@W = ("python", "ruby", "haskell")

Output:

0 as none matching word found.

Method

I have been called… things… for my love of regular expressions. That it wasn’t natural. Suggestions that there was something… off maybe, somewhere deep inside me. Not to discount the possibility that those people were on to something, I have persisted in the face of the critics. Refusing to be shamed, I announce it to the world. It has always been perhaps my favorite feature of the language, which is no small praise in a language with so many nice thing to say about it.

One cannot overstate the immense power contained in the DSL that is Perl Regular Expressions. The added features of the Raku RE engine only serve to augment that power, and every time I have an opportunity to learn about something new they’ve come up with I find myself giggling with glee. Oh you can do that now? Sweet… Larry’s vision of RE really knocked it out of the park when Perl grew to rule the web, and the PCRE library spawned from that effort still holds a very promenant position today. With Raku, they have in a sense applied a metaoperator to the the very idea of regexes, expanding the initial DSL into a complete object ecosystem known as Grammers which we can in turn use to write new DSLs.¹ It does take a little getting used to coming from pure Perl, but it well worth the effort.

This challenge, as I understand it, seems to me to be a straightforward application of regular expressions.

¹ Andrew Shitov Creating a Complier With Raku
https://andrewshitov.com/creating-a-compiler-with-raku/

Raku Solution

One of the biggest changes to RE under Raku is that a matching is saved completely to a new match object, which will hold all of the information we would normally associate with that match for later reference: the initial string, the match, positions, captures, the whole works. There is a certain amount of syntax change, enough that most regexes won’t slot straight over, but those changes are minor. In this case the pointy brackets explicitly tell Raku to interpolate the $group variable we have constructed to hold the regex specifier before evaluating. The /g global switch is now known as an adverb, modifying the action of matching, and is attached to the verb rather than the regex specifier. The match object itself is saved to the $match variable. When we look at this object as a list, it returns a list of the matches made. This list can be is a list like any other, and a ternary operator is required to return 0 for the case of no matches.

One change is we now have two options available for alternation, both || and |. The first chooses the first option that matches, but the second, | chooses the longest match that still fits. Because of the edge case where one word in the word array might match within another, we have chosen the latter behavior. By including ‘week’ in the word list, you can see we still see we are matching ‘weekly’ instead of short-circuiting out.

[colincrain:~/PWC]$  raku c-c-combo-breaker.p6 coolweeklyperlchallengeclub week weekly perl challenge
weekly
perl
challenge

sub MAIN(Str:D $string, *@words) {
    my $group = @words.join(' | ');
    my $matches = $string ~~ m:g/ <$group> /;
    $matches.list.elems ?? (.Str.say for $matches.list)
                        !!   say '0';

Perl Solution

[colincrain:~/PWC]$  perl 64_2_c-c-combo_breaker\!.pl coolperlweeklychallengeclub perl weekly challenge

perl weekly challenge

use warnings;
use strict;
use feature ":5.26";

## ## ## ## ## MAIN:


my $S = "coolperlweeklychallengeclub";
my @W = ("weekly", "challenge", "perl");

my $group = join '|', @W;
my @matched = $S =~ m/$group/g;

say @matched ? "@matched" : 0;

Tag: Directed Acyclic Graph

Six blocks away? C-c-c-combo breaker!

THE WEEKLY CHALLENGE – PERL & RAKU #64

TASK #1 › Minimum Sum Path

Example

Method

Raku Solution

Perl Solution

TASK #2 › Word Break

Example 1:

Example 2:

Method

Raku Solution

Perl Solution