Flip Me Your Digits, Baby

Wherein we take a call and flip it on its side, looking at the world from another point of view…

THE WEEKLY CHALLENGE – PERL & RAKU #110


episode one:
“Hanging on the Telephone”


Task 1

Valid Phone Numbers

Submitted by: Mohammad S Anwar

You are given a text file.

Write a script to display all valid phone numbers in the given text file.

Acceptable Phone Number Formats
+nn  nnnnnnnnnn
(nn) nnnnnnnnnn
nnnn nnnnnnnnnn
Input File
0044 1148820341
 +44 1148820341
  44-11-4882-0341
(44) 1148820341
  00 1148820341
Output
0044 1148820341
 +44 1148820341
(44) 1148820341

Method

Telephone networks are, by their essential nature, big sprawling things. They are required to remain operational continuously from their origin onward, forced to compatibly adapt through changes in scale and technology, rarely allowed the luxury of having an ideal architecture designed expressly for their current needs. Starting with a simple short number routed by an operator through a physical patch bay, the need for more connectivity has brought us first mechanical, then analog electronic, then digital switching of numbers; first with a code for an exchange, then a wider area, then a country appended. And at any given time on the line, the old had to still work alongside the new, never allowing a complete rewrite of the rules from the ground up.

Here we are tasked to extract valid ten-digit UK telephone numbers, with a country code prefix, from a file containing malformed data. As the rules for validation are well defined, we won’t second-guess them and try to decide whether a given phone number actually points to a legitimate subscriber endpoint, saving that for another part of the program, whatever that may be.

That said, the record entry with the “00” prefix looks pretty shady. I don’t know how these things work in the UK, but that throws up all kinds of red flags. On the other hand, it’s not preceded by a “+”, or surrounded by parentheses, so it gets tossed for not wearing the proper attire before we ever check to see whether its papers are in order; it’s not removed for its insensibility but rather its sartorial malfeasance.

The rules given seem pretty strict, judging on formatting alone rather than numerical meaning. Over here, in the United States, we tolerate phone numbers in a somewhat wider variety of related formats:


(212) 555-1212
(212)555-1212
212-555-1212
212.555.1212


Or even with spaces: 212 555 1212

The parentheses, if present, are exclusively around the area code, but other combinations of dashes, dots and spaces can be found in the wild and all are considered perfectly acceptable. The wisest computer forms have you input the 10 digits and then go on to format the number for you, which makes the whole question of delimiters irrelevant.
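As an aside, here is a minimal sketch of what such a form might do behind the scenes: throw away everything that isn't a digit, insist on exactly ten of them, and then apply one house format. The function name `format_us_number` and the choice of output style are my own assumptions for illustration, not anything the challenge asks for.

```python
import re

def format_us_number(raw):
    """Return raw as '(AAA) EEE-NNNN', or None unless it holds exactly 10 digits."""
    digits = re.sub(r"\D", "", raw)    # drop parentheses, dashes, dots and spaces
    if len(digits) != 10:
        return None
    return "({}) {}-{}".format(digits[:3], digits[3:6], digits[6:])

# the formats listed above, plus a seven-digit number with no area code
for sample in ["(212) 555-1212", "(212)555-1212", "212-555-1212",
               "212.555.1212", "212 555 1212", "555-1212"]:
    print(sample, "->", format_us_number(sample))
```

Every ten-digit variant collapses to the same string, and the number without an area code is rejected rather than guessed at.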

The most important part in presentation is the breaking of the number into 3 digits, 3 digits and 4 digits. The delimiters are varied; the single dot being the most recent addition, but still decades old by now. Rarely you might still wander across a number without an area code, but I live in a big city and haven’t seen this for years. I have, in my life, lived in communities where it was acceptable to go with the last four digits if the others could be assumed. Yeah, those days are long gone. I’m not sure when I was last able to dial a number without an area code.

To me the ten-digit segment looks brutal for a human eye to scan. Just sayin’.

For once, and I think this may be a one-time offer, I’m going to try not to overly complicate things, and just implement the rules as given.

PERL 5 SOLUTION

For the Perl solution we just need to construct the proper regular expression to pass one of the acceptable prefixes, followed by ten digits without delimiters. Although it’s not well-defined, we’ll assume that an 11-digit number would be invalid, even if the first 10 digits were parsed out correctly, so I’ve added a zero-width negative lookahead assertion to the end, asserting that the final digit is not followed by another digit. Anything else, including a newline, and we’re cool.

use warnings;
use strict;
use feature ":5.26";
use feature qw(signatures);
no warnings 'experimental::signatures';

@ARGV = qw( phone-numbers.txt );
my @numbers;

my $regex = qr/((?: \d{4} | \(\d\d\) | \+\d\d ) \s \d{10}) (?!\d)/x;

while (<>) {
    push @numbers, $_ for /$regex/;
}

say $_ for @numbers;

Raku Solution

unit sub MAIN ( Str $file = "phone-numbers.txt" ) ;

# <!before \d> mirrors the Perl lookahead: an eleventh digit rejects the match
my $regex   = rx/ [ \d\d\d\d || \(\d\d\) || \+\d\d ] \s \d ** 10 <!before \d> /;

my @numbers  = $file.IO.lines
                       .map({$_~~$regex})
                       .grep({.defined}) ;

say $_.Str.fmt("%16s") for @numbers;

Python Solution

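import re

# prefix (nnnn, (nn) or +nn), one whitespace character, exactly ten digits
phone = re.compile(r"((?:\d{4}|\(\d\d\)|\+\d\d)\s\d{10}(?!\d))")

# the with-block closes the file for us, even if something goes wrong
with open("phone-numbers.txt", "r") as f:
    for line in f:
        pn = phone.search(line)
        if pn is not None:
            print('{0:>16s}'.format(pn.group()))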
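To convince myself the negative lookahead earns its keep, here is a small self-contained check that runs the same pattern as the Python solution above against the task's input lines, held in a list instead of a file. The eleven-digit entry at the end is a hypothetical I've added, and the helper name `valid_numbers` is likewise my own.

```python
import re

# prefix, one whitespace character, ten digits, and no eleventh digit
PHONE = re.compile(r"(?:\d{4}|\(\d\d\)|\+\d\d)\s\d{10}(?!\d)")

samples = [
    "0044 1148820341",
    " +44 1148820341",
    "  44-11-4882-0341",
    "(44) 1148820341",
    "  00 1148820341",
    "0044 11488203419",    # hypothetical eleven-digit entry, not in the task input
]

def valid_numbers(lines):
    """Collect the first valid phone number found on each line, if any."""
    found = []
    for line in lines:
        m = PHONE.search(line)
        if m is not None:
            found.append(m.group())
    return found

for number in valid_numbers(samples):
    print(number)
```

Only the three numbers from the expected output survive; without the `(?!\d)` the eleven-digit line would slip through with its last digit silently lopped off.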

episode two:
“Not That Way – Play the Other Record”


Task 2

Transpose File

Submitted by: Mohammad S Anwar

You are given a text file.

Write a script to transpose the contents of the given file.

Input File
name,age,sex
Mohammad,45,m
Joe,20,m
Julie,35,f
Cristina,10,f
Output:
name,Mohammad,Joe,Julie,Cristina
age,45,20,35,10
sex,m,m,f,f

Method

Well, for the second task we are presented with another mystery file out-of-context. Are we going to assume this is proper CSV data as distinguished from a bunch of words and numbers separated by commas? The difference being one has edge-cases covered and the other will presumably break sooner or later. Comma Separated Values, the format, has been well confirmed to be a nightmare of edge-cases, and practitioners are advised not to attempt to implement it on their own. It will always easily and simply work until it doesn’t. But this challenge doesn’t seem to be about that, rather it’s about rejiggering data. So again, let’s stay focused and keep the clutter down. It would be simple enough to pull in a proper CSV library to handle I/O, being the only sensible thing to do, but that would only serve as a distraction to the demonstration.

Instead we’ll split the lines into fields on the commas and call it a day on that. It’s a toy, not production code.
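For the curious, here is a quick sketch of the sort of edge case that separates real CSV from words with commas in between. The record is a made-up one of my own, with a quoted field that contains a comma; Python's standard `csv` module stands in for the proper library we are choosing not to use.

```python
import csv
import io

# hypothetical record: the second field is quoted and contains a comma
line = 'name,"Anwar, Mohammad",45\n'

naive  = line.rstrip("\n").split(",")            # splits inside the quotes too
proper = next(csv.reader(io.StringIO(line)))     # honors the quoting rules

print(len(naive), naive)
print(len(proper), proper)
```

The naive split reports four fields where there are three, which is exactly the works-until-it-doesn't behavior described above.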

Another question that comes to mind is what do we do if the data doesn’t align? I think we will need to fill our records with NULL data to keep the transposed relationships intact. It’s the decent thing to do, so:

T⁻¹(T(N)) = N

If we transpose our matrix and then transpose it back we should end up with identical data to when we started.

So the dimensions of the transposition will be based on the largest dimensions of the rows and columns. This doesn’t affect us with the input as given but should a datum be missing the record would look like:

Joe,,m

I guess I just can’t not look for edge-cases. Can’t do it. It’s a good quality to have so I’m not going to fight it.
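Here is a minimal sketch of that padding idea, assuming an empty string is an acceptable stand-in for NULL. It leans on `itertools.zip_longest` from the standard library; the function name `transpose` and the ragged sample data are my own inventions for the demonstration.

```python
from itertools import zip_longest

def transpose(rows, fill=""):
    """Transpose a list of rows, padding short rows with `fill`."""
    return [list(col) for col in zip_longest(*rows, fillvalue=fill)]

# hypothetical ragged input: the last record has lost its trailing field
ragged = [
    ["name", "age", "sex"],
    ["Joe", "", "m"],
    ["Julie", "35"],
]

once  = transpose(ragged)    # columns become rows, gaps filled in
twice = transpose(once)      # and back again

for row in once:
    print(",".join(row))
```

After the first pass the data is rectangular, so transposing twice more lands right back on the same matrix; the only thing the round trip changes from the original input is that the missing field has become an explicit empty one.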

PERL 5 SOLUTION

name,Mohammad,Joe,Julie,Cristina
age,45,20,35,10
sex,m,m,f,f

In Perl we keep track of the largest row we find, and that will define our column dimension, which gets transposed to rows. The data set as given is rectangular, with every row the same length, so this doesn’t come into play, but we’ll implement it anyway.

use warnings;
use strict;
use feature ":5.26";
use feature qw(signatures);
no warnings 'experimental::signatures';

@ARGV = qw( transpose-data.txt );

my @mat;
my @trans;

my $max = 0;

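while (<>) {
    chomp;
    push @mat, [ split /,/, $_, -1 ];    # limit of -1 keeps trailing empty fields
    $mat[-1]->@* > $max and $max = $mat[-1]->@*;    # remember the widest row
}

for my $i (0..@mat-1) {
    for  my $j (0..$max-1) {
        $trans[$j][$i] = $mat[$i][$j] // '';    # pad short rows with empty fields
    }
}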

local $" = ',';
say "@$_" for @trans;

Raku Solution

unit sub MAIN ( Str $input = 'transpose-data.txt' ) ;


my @mat.push( $_.split(',') ) for $input.IO.lines;
my $max = @mat.map( *.elems )
              .max;
my @trans;

for ^@mat.elems -> $i {
    for ^$max -> $j {
        @trans[$j][$i] = @mat[$i][$j];
    }
}

say $_.list for @mat;
say '';
say $_.list for @trans;

Python Solution

mat  = []
cols = 0

# read the rows, keeping track of the widest one
with open("transpose-data.txt", "r") as f:
    for line in f:
        row = line.rstrip("\n").split(",")
        mat.append(row)
        cols = max(cols, len(row))

# pad short rows with empty fields so the transposed matrix stays rectangular
trans = [[row[i] if i < len(row) else "" for row in mat] for i in range(cols)]

for row in mat:
    print(*row)

print()

for row in trans:
    print(*row)


The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://perlweeklychallenge.org
