perlguts – Programming Excursions in Perl and Raku

Wherein we walk these endless marble corridors divided in time: Empty salons. Corridors. Salons. Doors. Doors. Salons. Empty chairs, deep armchairs, thick carpets. Heavy hangings. Stairs, steps. Steps, one after the other. Glass objects, objects still intact, empty glasses. A glass that falls, three, two, one, zero. Glass partition, letters.

THE WEEKLY CHALLENGE – PERL & RAKU #108

episode one:
“Where In the World Am I?”

Task 1

Locate Memory

Submitted by: Mohammad S Anwar

Write a script to declare a variable or constant and print it’s location in the memory.

Method

I have been known to wax metaphysical at times. When is comes down to it, I don’t seem to look at the world the same way as mort of the people I talk to. To me they seem obsessed with the surface, and I find myself looking rather at the reasons why things are the way they are, with the results being a more superficial, often obvious conclusion. Sometimes the thing isn’t the art, but rather why the thing is is the art. Or why the thing is art is the art. And don’t get me started on Korzybski unless you want things to get weird.

So when you as ask me for for a variable’s location, I ask, which one? Because in Perl, depending on what it is there may be several answers available.

There are commonly said to be three data types in Perl, corresponding to the three sigils: a scalar that holds a single thing, an array that hold a list of things, an a hash that holds a table of things. No dimension, one dimension, two dimensions you might say. The metaphor doesn’t hold up in the end, but it’s not bad, either. It’s also not strictly true, as there are in fact a bunch of complex data structures in Perl, just under the surface, hidden from sight: scalars can hold integers, or sometimes floats, or strings, which are pointers to C arrays of characters. Arrays are C-arrays of pointers to scalars that hold all this extra information about their length, their allocated memory and the start position in that memory block. There are sometimes structures holding pointers to other structures containing pointers to other structures that have pointers that keep going further.

So are we asking about the name, or the thing? The variable, or the data? And if the data, which data specifically, as there may be quite a bit of it when talking about a blessed hash object, for instance.

The easy answer, and I think the one that’s being asked for here, is the memory location of the first, smaller struct known as the head. This is a small gatekeeper of sorts that has the reference count, a host of flags to tell us things about the variable and the data inside it, and a pointer to another struct that actually holds the data. Or perhaps holds a pointer to the data. Or one of a small crowd of other, increasingly complex options. In any case we always have a relatively simple head element pointing to a body which holds further information.

The only form of variable without a body is undef, which doesn’t need to point to anything, so doesn’t. But even that is only if it has never contained anything – setting it to undef later in its life is as simple as unsetting the flags in the head that tell us what it is. Perl won’t go through the trouble to deallocate the memory for the more complex variable body if it doesn’t have to — or more precisely , considers it inefficient to do so.

So how do we get at the memory location of the head of our variable? I mean, I have no idea why you’d just want that, but that part’s easy — it’s in the string when we print a reference. If we say

my $foo = 42;
my $ref = \$foo;
say $ref;

We might get a result like

SCALAR(0x7f88ae0291b8)

where SCALAR tells us it’s a scalar value in the body and 0x7f88ae0291b8 is a hexadecimal location in memory for the head. If you wanted to you could use a regex to clip out the memory location, which solves our particular task here. But let’s have a look further.

The stringified value of the reference is only really useful to determine whether two references are the same. You can’t meaningfully alter the reference by altering the string, and doing that ends up assigning a string with some value like “SCALAR(0x7f88ae0291b8)” to the variable — dressing like one doesn’t automatically turn it into a reference, and it’s no longer a reference at that point, only a string that superficially looks like one.

A more useful tool for looking inside variables is the Devel::Peek module, which in its Dump() routine tells you everything you might want to know about a variable: what type of data is in it, what kind of structure is holding that data, and what does it report itself to be, which, as we noted in certain undefined variables, might not be the same thing as what’s going on under the hood.

For example we’ll make a scalar. We’ll give it an integer value, then a string, then undefine it, and peek in on it three times along the way. If we took a reference to this variable and printed it out, stringifying it, we’d get

SCALAR(0x7fce61029a70)

So what does Devel::Peek tell us? Well,

First as an integer — when we say $foo = 42;

SV = IV(0x7fce61029a60) at 0x7fce61029a70
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 42

Then as a string — now set $foo = "bar";

SV = PVIV(0x7fce61027e20) at 0x7fce61029a70
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  IV = 42
  PV = 0x7fce60c241e0 "bar"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 1

And finally once we set it to undef — $foo = undef;


SV = PVIV(0x7fce61027e20) at 0x7fce61029a70
  REFCNT = 1
  FLAGS = ()
  IV = 42
  PV = 0

In the first example it says we have an integer value, IV, followed by the location of the body struct in parentheses, and it continues to say the head is “at 0x7fce61029a70”. Notice this is the same number as we got when printing the reference. Its internal reference count, REFCNT is at 1, it has the integer-okay, IOK flag set to say the value in the Integer slot is good-to-go, and the integer value itself, IV, is 42. Don’t worry about the other flags for now, they’re private to the distribution and more exotic.

After we assign a sting to $foo, our body type has changed to PVIV, for pointer value / integer value, and this body is in a different location. This struct allows that dual nature being both of a string and a number we’ve come to expect from a scalar variable. It’s not one or the other but both simultaneously, and making this behavior just work requires some complex bookkeeping. I’d say magic but that word actually refers to something else.

The flags have changed to POK for pointer-okay, and now the pointer value is declared good. If we look we can see the IV value of 42 its still there, but the IOK flag that says it’s good is gone, so we can’t use it. It is now a string and only a string. If we’d only wanted to use the integer value as string, say to interpolate it somewhere, Perl would have created a string version of the integer and made a pointer to it, then added the POK flag, leaving the IOK flag up. In this case Perl created the string, so it knows the two values are both correct and either can be used.

After we undefine the value we still keep the same PVIV struct, in the same location, only now all the xOK flags are gone. Perl has also decided to free up the memory allocated to hold the string, which was at 0x7fce60c241e0 with length LEN of 10, currently holding, CUR, 3 characters. Perl always grabs a little more than needed, to avoid reallocating if it can. But here it has returned that on assigning undef. The IV is part of the PVIV struct and we keep it untouched because why do extra work until we know we need to? There seems to be no easy way to simply reset the IOK flag to resurrect the value, which is probably a good thing. Devel::Peek is just for looking, thank you very much.

All of this is just scratching the surface a bit on the Perl internals. Every time I take that plunge there’s more to find and understand. For this trip I found Perlguts Illustrated to be the most helpful. Just read it a couple or four dozen times and you really start to get the hang of it. And Devel::Peek of course, to help explain what you’re looking at.

I’ll leave you with last line of the documentation for that amazingly helpful module:

SEE ALSO
perlguts, and perlguts, again.

True words.

PERL 5 SOLUTION

We’ll just make a basic example, demonstrating both parsing out the stringified reference and the same when viewed with Devel::Peek.

use warnings;
use strict;
use feature ":5.26";
use feature qw(signatures);
no  warnings 'experimental::signatures';
use Devel::Peek qw(Dump);


my $foo = "this is a string";
my $foo_ref = \$foo;
$foo_ref =~ m/(SCALAR|ARRAY|HASH)\((.*)\)/;

say "The variable head is stored at $2";
my $det = $1 eq 'ARRAY' ? 'an' : 'a';
say "The variable is $det \L$1";
say "-" x 25;
say '';

Dump $foo;

The output:

The variable head is stored at 0x7febe2029a70
The variable is a scalar
-------------------------

SV = PV(0x7febe200ca80) at 0x7febe2029a70
  REFCNT = 2
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x7febe1c10ab0 "this is a string"\0
  CUR = 16
  LEN = 18
  COW_REFCNT = 1

raku solution

In Raku, as it is so often, there’s already a built-in function for that. However, it comes with two caveats: one, that this position isn’t necessarily fixed for a given object, and is subject to change within the course of a program:

Please note that in the Rakudo implementation of Raku, and possibly other implementations, the memory location of an object is NOT fixed for the lifetime of the object. So it has limited use for applications, and is intended as a debugging tool only.
docs.raku.org

So all-in-all that’s not a terribly useful number to know. It would, perhaps, be useful to determine whether two things are in fact pointing to the same thing in memory, that they are the same reference to the same object, at the same time. It seems unlikely an object would get moved at completely unpredictable times, but to be clear I’m unsure on all the parameters here. The second oddity is that this number is for whatever reason delivered in decimal. Even if it’s a semi-arbitrary value subject to change, this doesn’t seem right. So I convert it to hex. Because I have a deep sense of propriety I suppose.

unit sub MAIN () ;

my $foo = 42;

say $foo;
say $foo.WHERE.base(16);

my $bar = $foo;
say "they're the same of course" if $bar.WHERE == $foo.WHERE;

Python Solution

In Python we have the id() function. As in Raku, we’ll convert to hex because, well, because memory locations are written in hexadecimal notation. There I said it. It is the way. In Python everything is an object and ofttimes they are the same object, and comparing these objects directly can be done with the is operator, or indirectly using the location values returned by id().

foo = 42
loc_foo = hex(id(foo))

print( "the variable 'foo' is located at:", loc_foo )

bar = 42
loc_bar = hex(id(bar))

print( "the variable 'bar' is located at:", loc_bar )
print( "bar equals foo                  :", bar == foo )
print( "bar is the same object as foo   :", bar is foo)

if loc_foo == loc_bar:
    print( "the value", foo, "is interned")
else:
    print( "the value", foo, "is not interned")

episode two:
“Belling the Cat in Myriad Ways”

task 2

Bell Numbers

Submitted by: Mohammad S Anwar

Write a script to display top 10 Bell Numbers. Please refer to wikipedia page for more informations.

Example:

B₀: 1 as you can only have one partition of zero element set

B₁: 1 as you can only have one partition of one element set {a}.

B₂: 2

   {a}{b}
   {a,b}

B₃: 5

   {a}{b}{c}
   {a,b}{c}
   {a}{b,c}
   {a,c}{b}
   {a,b,c}

B₄: 15

   {a}{b}{c}{d}
   {a,b,c,d}
   {a,b}{c,d}
   {a,c}{b,d}
   {a,d}{b,c}
   {a,b}{c}{d}
   {a,c}{b}{d}
   {a,d}{b}{c}
   {b,c}{a}{d}
   {b,d}{a}{c}
   {c,d}{a}{b}
   {a}{b,c,d}
   {b}{a,c,d}
   {c}{a,b,d}
   {d}{a,b,c}

Method

As shown in the above examples, the Bell numbers, B_n, represent the number of all possible partitions of a set with n elements. Although those examples demonstrate those partitions for the numbers 2, 3, and 4, we’re going to have to assume the request is for the first 10 numbers, rather than the notational breakdown as shown, notwithstanding the wording being a bit ambiguous. Not because this is difficult, mind you, which it is, but because the 10th Bell number is 115975, and that’s a lot of breaking down for quite questionable benefit.

How do we create the Bell numbers? Mathematically it can get rather obscure, but there also exists a relatively straightforward graphical algorithm. This involves constructing a triangular matrix structure and using that to compute successive elements of the sequence.

The algorithm has a few moving parts. The goal is to construct a triangular matrix, where each row contains one more element than the row immediately above it. If we number the rows starting at 0, at the top, and incrementing as we go down, each row will have a number of elements equal to its order, or row number, plus 1.

For the first row, 0, we start with the number 1, in a row with 1 element. This is the only fixed base structure required.

In each additional row, the first element is the last element from the previous row. In this case the last element from the previous row is 1, so we start the row with that value. Now, for the next element in the row, we take the value of the previous element from the row and add it to the value of the the element immediately above that element.

In other words, to create the value at a given index, add the two values from the previous index and the previous index on the previous row. If we continue this way until we run out of values in the previous row to add, we will have added one additional element to the current row. Write this value in, and copy it to the following row as the first element. By applying these two steps as many rows can be computed as required.

The Bell numbers, B_n are found as the first element in every row, indexed from 0.

PERL 5 SOLUTION

The algorithm follows the text explanation: within a loop constructing each row in turn we take the last element from the previous row as index 0. Then for as many times as required to complete the row we take the value from the index – 1 and add it to the value from the index – 1 in the row index – 1.

The Bell numbers are the 0 index from each row in the resultant matrix.

As we noted earlier, after the first few entries the numbers get large at a steady pace, adding 1 or 2 digits with each successive value. By using the bigint pragma we can compute arbitrarily large values, such as

B₅₀₀₌

16060726010399914537437328604655077862919245466450
01249221458647036609031692388742264533068377381547
52608395670137495550103762064426599148582399756042
39194722536731542877114522434826259434253245235878
07683222016163482602127637462832110631754715883059
00499988767247495691030562568738615933054947931678
91236081525416343738220590486226851969446456073386
72012856321864174391445622227559253116940246721792
83672281903035512240651033569323506092293426051672
55200156770306847016242592054728359001944534021890
46985409254748392090704758491594261624209127147911
85467698394373398473494156034113380195089364116722
93535432986991680881631979332663613611717495678062
52210577980695569672801349293276671456649407517188
00283570310764911916148945975987515393829548296018
96350096235455285746800958124227773807905768259318
23862858389709617386930741651345394229457772

use warnings;
use strict;
use feature ":5.26";
use feature qw(signatures);
no  warnings 'experimental::signatures';
use bigint;


my $triangle;
my $values     = 100;
$triangle->[0] = [1];

for my $row ( 1..$values ) {
    ## the first value
    $triangle->[$row][0] = $triangle->[$row-1][-1];
    
    ## the rest of the row
    for my $i (1..$row) {
        $triangle->[$row][$i] = 
            $triangle->[$row-1][$i-1] + $triangle->[$row][$i-1];
    }
}

for my $n (0..$values) {
    printf "%10s %-s\n", "B($n)", $triangle->[$n][0];
}

The output:

      B(0) 1
      B(1) 1
      B(2) 2
      B(3) 5
      B(4) 15
      B(5) 52
      B(6) 203
      B(7) 877
      B(8) 4140
      B(9) 21147
     B(10) 115975
     B(11) 678570
     B(12) 4213597
     B(13) 27644437
     B(14) 190899322
     B(15) 1382958545
                      ...

Raku Solution

The Raku is a straight port from the Perl.

unit sub MAIN ($values = 100) ;

my @triangle;
@triangle[0] = [1];

for 1..$values -> $row {
    @triangle[$row][0] = @triangle[$row-1][*-1];
    for 1..$row -> $i {
        @triangle[$row][$i] = @triangle[$row][$i-1] + @triangle[$row-1][$i-1];
    }
}

say $_[0] for @triangle;

Python Solution

In Python, to be different, we can use a nested list comprehension to make a triangular array of 0s, then fill in the blanks. You say “tomato“, I say “delicious“.

values = 10
tri    = [[0 for x in range(x+1)] for x in range(values)]

tri[0][0] = 1

for row in range(1,len(tri)):
    tri[row][0] = tri[row-1][-1]    
    for i in range(1,row+1):
        tri[row][i] = tri[row][i-1] + tri[row-1][i-1] 
    
for row in tri:
    for elem in row:
        print(elem, end=" ")
    print()

The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://perlweeklychallenge.org

Tag: perlguts

Partitions in Memories, A Triangular Palace

THE WEEKLY CHALLENGE – PERL & RAKU #108

Task 1

Locate Memory

Method

PERL 5 SOLUTION

raku solution

Python Solution

task 2

Bell Numbers

Example:

Method

PERL 5 SOLUTION

Raku Solution

Python Solution