Fair Play with Your Nose to the Grindstone

Wherein we mess things up pretty spectacularly, but not so bad we can’t mostly put everything back where it was. With maybe a little extra stuff left over that we’re not sure how it got there…

THE WEEKLY CHALLENGE – PERL & RAKU #162 Task 2


“The world rolls round forever like a mill; it grinds out death and life and good and ill; it has no purpose, heart or mind or will.”

— James Thomson


Wheatstone-Playfair

Submitted by: Roger Bell_West

Implement encryption and decryption using the Wheatstone-Playfair cipher.

Examples:
(These combine I and J, and use X as padding.)

encrypt("playfair example", "hide the gold in the tree stump") = "bmodzbxdnabekudmuixmmouvif"

decrypt("perl and raku", "siderwrdulfipaarkcrw") = "thewexeklychallengex"

Commentary

The Playfair cypher is a positional substitution cypher built around paired groups of letters arranged in a two-dimensional 5×5 grid. The encoding of a single letter also uses positional data from its paired complement, which is arranged by the rules to always differ. As such each of the 25 positions can be grouped with each of 24 optional differing positions, yielding 600 possible bigram pairs. These bigrams, once encoded, can be attacked through frequency analysis, however with a much larger pool of options this will be considerably harder to do.

Because a 5×5 grid only allows 25 positions, some scheme must be selected to compres the alphabet of 26 English-language characters into this limited space before any encoding can be started. This is usually done either by combining the letters I and J into a single glyph, or removing any instances of the letter Q entirely. This scheme must be prearranged and in not generally considered part of the cypher proper. The lossy nature of the compression means that information is destroyed, and as such this alteration of the plaintext cannot be reversed. There is no single defined method for this compression, only the advice to either remove or combine some uncommon letter in the expected corpus being encoded.

Once a 25-letter alphabet had been selected a prearranged secret passphrase in used to initiate the encryption pad. The grid is filled sequentially, one row at a time, using unique letters from the passphrase. One that is accomplished, the remaining unplaced letters are used in alphabetical order to complete the square. Use of a fairly long and complex passphrase will thus install a fair bit of randomization into the arrangement of the letters in the pad.

As the workings of the cypher requires pairs of letters from the source without duplication, an additional prearranged letter must be selected to be introduced should a letter repeat in the plaintext and end up selected for pairing. This value can again be any letter, with the guidance that as a introduced error that cannot be removed by decoding it should be uncommon, and easy to spot so it can be later ignored. The letter X is commonly chosen for this purpose. This character is also appended to the plaintext should the message have an odd number of characters, to allow complete pairing.

After preprocessing of the message, the text the letters are finally grouped by twos into digram combinations.

To encode the digrams, each is found in the key pad. This can result in three possible cases:

  1. The letters share the same row. Should this happen the encoded letters are selected as the letters each one to the right in the row, wrapping the fifth position around to the first.
  2. The letters share the same column. In this event the encoded values are the letters one down from each original, again wrapping the fifth row position around to the first.
  3. If neither of these cases are true, then the two letters will form the opposing corners of a rectangle within the grid. In this case the letter substituted for each is the character in the same row in the opposite horizontal corner of the rectangle. The row is kept, but the column position of the complement character is used, mixing the information between the two.

Each character in the grid can wind up mapped to one of five encoded letters: to one of four alternate positions in its keypad row or the position in the same column one row down, depending on its letter pairing. Although considerably more complex than a simple substitution cypher, this does not approach the maximal mixing potential of mapping each character to any of the other 24 grid coordinates. On the other hand because multiple pairings will also result in the same encoded value for a letter this serves to further obscure the original. The rectangular symmetry also carries through when pairs of letters are reversed; the encoded pattern is also reversed and this can be detected. Because of these patterns the encryption is subject to attack using both frequency and pattern analysis of the resulting digram codes.

METHOD

The encryption and decryption of the cypher is highly symmetrical: for the shifted rows and columne the decryption step is to shift in the opposing direction to undo the transformation. For the rectangular transformation we there is no difference at all, as the transform is its own inverse; reapplication of the procedure reverts the system to its original state. As such a single core function can both en- and decrypt messages, differing between the two by means of a directional coefficient indicating whether to advance or decrease the row or column positions.

The creation of the keypad table is common to both encryption and decryption and is delegated to a helper subroutine. As letters from the passphrase are examined one-by-one, they are checked against a hash of the alphabet being used, and if found in the hash are added to the table and removed from the hash. Once the passphrase is evaluated the loop continues over a sorted list of the remaining hash keys to fill in the rest of the letters.

Lastly, encoded cyphertexts are traditionally broken into homogeneously-sized groupings of symbols, usually 4 or 5 characters in length. To accommodate this a mod 5 parser is brought in to add spaces, with the last quintet padded with random valid characters to fill it out to the proper size, obscuring the original message endpoint.

PERL 5 SOLUTION

use warnings;
use strict;
use utf8;
use feature ":5.26";
use feature qw(signatures);
no warnings 'experimental::signatures';
use constant { PAD => 'Q' };

my $phrase    = shift @ARGV // 'playfair example';
my $text      = shift @ARGV // 'hide the gold in the tree stump';
my $cypher    = encode( $phrase, $text );
my $plaintext = decode( $phrase, $cypher );

say<<~"END";

    input:      $text
    passphrase: $phrase
---------------------------------

    cyphertext: 
    
        $cypher
    
    decoded plaintext:  
    
        $plaintext
    
END




## subroutines

sub encode( $phrase, $text ) {
    return _quints( _code($phrase, $text, 1) );
}

sub decode( $phrase, $text ) {
    return _code( $phrase, $text, -1)
}




sub _code( $phrase, $text, $shift ) {
    my ($kh, $ka) = _key_table( $phrase );
    $text =~ s/\W//g;                  ## strip spaces and nonword chars
    $text =~ tr/a-z/A-Z/;              ## uppercase
    my @text = split //, $text; 
    
    my @dgs;
    while (@text) {                    ## setup digram pairs
        my $digram = [shift @text];
        $digram->[1] = ($text[0] eq $digram->[0]) 
            ? PAD                      ## pad if duplicate
            : shift @text || PAD;      ## shift next char or backfill w/pad
        push @dgs, $digram;
    }
    
    my @out;
    for my $dg (@dgs) {
        ## same row, shift right and wrap
        if ($kh->{$dg->[0]}->[0] eq $kh->{$dg->[1]}->[0]) {         
            push @out, $ka->[$kh->{$dg->[0]}->[0]]
                            [($kh->{$dg->[0]}->[1]+$shift)%5], 
                       $ka->[$kh->{$dg->[1]}->[0]]
                            [($kh->{$dg->[1]}->[1]+$shift)%5] ;
        }
        ## same col, shift down and wrap
        elsif ($kh->{$dg->[0]}->[1] eq $kh->{$dg->[1]}->[1]) {      
            push @out, $ka->[($kh->{$dg->[0]}->[0]+$shift)%5]
                            [$kh->{$dg->[0]}->[1]], 
                       $ka->[($kh->{$dg->[1]}->[0]+$shift)%5]
                            [$kh->{$dg->[1]}->[1]] ;
       
        }
        ## rectangle, swap in same row with char at pair column position
        else {                                                      
            push @out, $ka->[$kh->{$dg->[0]}->[0]]
                            [$kh->{$dg->[1]}->[1]], 
                       $ka->[$kh->{$dg->[1]}->[0]]
                            [$kh->{$dg->[0]}->[1]] ;
        }
    }
    
    return join '', @out;
}

sub _key_table ($phrase) {
    $phrase =~ s/\W//g;                     ## strip spaces and nonword
    $phrase =~ tr/a-z/A-Z/;                 ## uppercase
    my @phrase = split //, $phrase;    
    
    my %alpha = map { $_ => 1 }             ## 25-letter alpha hash
                ("A".."I", "K".."Z");   
    my ($i, $j) = (0, 0);
    my $khash;
    my $karr;
    
    ## first assign unique phrase letters, then remaining alphabet
    for (@phrase, sort keys %alpha ) {                 
        next if not exists $alpha{$_};
        $khash->{$_} = [$i, $j] and delete $alpha{$_};
        $karr->[$i][$j] = $_;
        ++$j > 4 and ($i, $j) = (++$i, 0);
    }
    
    return $khash, $karr;
}


sub _quints( $text ) {
## arrange cypher as 5-char groupings with random backfill for last group
    my @alpha = ("A".."I", "K".."Z");  ## 25 chars
    my $fill  = 5 - length($text) % 5;
       $text .= $alpha[rand(25)] while $fill--;
       $text  =~ s/(.{5})/$1 /g;
       
    return $text;
}



The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://theweeklychallenge.org