That Big Ol’ Square’ll be Just Perfect

Wherein we search for perfection. A whole bunch of it.And no cheap copies.

THE WEEKLY CHALLENGE – PERL & RAKU #149 Task 2

“The Perfect Square Has No Corners”

— Lao Tzu, Tao Te Ching

Largest Square

Given a number base, derive the largest perfect square with no repeated digits and return it as a string. (For base>10, use ‘A’..‘Z’.)

Example:

    f(2)="1"
    f(4)="3201"
    f(10)="9814072356"
    f(12)="B8750A649321"

Method

The numbers we’re looking at here share two qualities: they’re perfect squares, which is to say some integer times itself yields the number, and they have no repeated digits in their representation. We want a generalized solution to work in any base, and we want the largest number that can be made in that base that fits the bill.

From these selectors, given a base there can obviously be only one largest value. What would that look like?

For a given base, we need one type of digit for each possible value, so the maximum number of distinct digits we can pack in to our theoretical solution would be one for each digit in the base. So base-10 would get a 10-digit number. In fact in the examples that particular number is listed:

9814072356

This number has 10 digits as expected and is quite high in the range. The largest 10-digit number that does not repeat a digit in base-10 would be

9876543210

and this is pretty close to that. Perhaps we could count down from the maximum and check numbers for squareness along the way. Let’s say we take the square root and see if it’s an integer, by truncating the square root to an integer and seeing whether it’s the same as the original, for instance…

To be honest that sounds both terribly inefficient and more than a little bit boring. Most numbers won’t be perfect squares. In fact very, very few will. We’re doing a lot of looping against long odds. And it doesn’t solve the problem of having distinct digits (but we will revisit that notion later), so adding on that check as well makes a hit doubly unlikely. I’m not going to even think about the odds.¹

The quickest way, it seems, to shorten those odds would be to start with squares and then see whether they have distinct digits. That sounds more promising. How can we start with squares, then?

Well, we could start with the largest integer that when squared still yields a 10-digit number and check the result. Then decrement to the next lower value and try squaring that. We’re still hit-or-miss on whether the digits check out, but at least we’re starting from a much better position. The remaining part of the puzzle is to make it work in any base.

At first I was concerned about how to do multiplication and square roots in different bases, but then I realized these fears turned out to be unfounded. Numbers are numbers and don’t care how we represent them. Five times five equals twenty-five just the same in binary as it does in decimal. I mean, that’s the way it’s done behind the scenes in computing anyways, right? We don’t need to worry how it works, practically. All we really need is a way to convert back and forth between decimal and arbitrary bases and we can keep our math basic and easy to understand.

This should work.

But before we go, what about the third way? Take all the digits of a base and start constructing numbers out of them?

If we take a fast permutations routine we can rearrange the digits and test to see whether the square root is whole. There must be far fewer 10-digit permutations than 10-digit numbers, right?

This is a viable scheme, but suffers from the fact that whereas multiplying is cheap and easy, taking the square root is an expensive operation, that only gets worse as numbers get larger. And the number of possibilities grows with the factorial, which is better than the super-exponential growth of base-digit numbers in ever larger bases ( nⁿ ), but isn’t great, either.

In other words, the number of permutations for base-10 will be

P_base = base ! = 10 ! = 3,628,800

which will be considerably less than

Max_base = base^base = 10¹⁰ = 10,000,000,000

but considerably more than

Root_max-base = √base^base = √10,000,000,000 = 100,000

Once above base-12 the factorial growth begins to take off² and this technique starts to really become unusable.

¹ A ridiculous assertion to anyone who knows me: I can’t really do that. But I will limit myself to a back-of-the-envelope start on the problem. To start we have 10 ! = 3,628,800 arrangements of 10 digits for base 10, against 10¹⁰ – 10⁹ = 9,000,000,000 values that have 10 digits without a leading 0. Which means about 0.04% of the values have all 10 digits. In the range from 1,000,000,000 to 10,000,000,000 (numbers with 10 digits that do not start with 0) we will have √10,000,000,000 – √1,000,000,000 values that can be squared to fall within the range, so hence 68,377 perfect squares. That’s 0.0008% or so. Multiplying the two we get odds of about 1 in 3 million that a given value will have both qualities. This doesn’t take into account the possibility that the largest number that fits will not contain every digit in the base. That wasn’t the criterium — was only that the digits do not repeat.

² factorial growth seems to exceed the super-exponential at around 3.3, but it takes a while to get really out-of hand.

we’re looking at where the red and blue lines cross

PERL 5 SOLUTION

In Perl we implement the solution two ways, but end up only using one. We’ll do decrementing first.

Using a little array slicing we’ll construct the largest number that can be written in the given base using every digit. This means putting the highest-value digits in the highest values places first until we run out of digits, making numbers like 9876543210. It gets a little trickier after base-10.

The solution is to make an array of all digits like (0,1,2,3,4,5,6,7,8,9,A,B,C,D, … ,Z) and take a slice the size of the base from the front of it. Then we reverse it and join the elements into a single string without spaces. This works quite well. Above base-36 there’s no universal agreement on what symbols to use so we just aren’t going to deal with those.

(eyes narrow)

I’m looking at you, Base64. Troublemaker. Just go home.

Anyway, using a loop we square the number and convert it into our desired base, then hash the digits, falling out at the first duplicate found. We’re going to be doing this a lot, so we need to be efficient. The first attempt that successfully runs the gauntlet is then the largest squarable value, and that number squared will be the largest perfect square for the given base.

Things get complicated after base-15 because we to run out of space in a 64-bit integer to store the square and Perl shifts us to a float, which won’t do. The algorithm of note still continues to find viable results, but not necessarily the optimal, highest value. I found with all the intensive number-crunching adding the overhead of the bigint pragma slowed things down sufficiently as to make the technique unusable at larges bases anyway, so there was no obvious help on the way there.

base 15
decrementing from 660045254
found 659854601 squared is: 435408094460869201
in base 15: 3CDE271B squared is EDAC93B24658701

## decrementing solution:
## from the square root of the largest base-digit number
## square and check for repeating digits pass/fail
sub decrements  ( $base ) {
    my $maxbase =  1 . 0 x $base ;
    my $max10   = base2dec( $maxbase, $base ) ;
    my $max     = int sqrt $max10;
    say "decrementing from $max";

    my %h;
    MAX: while ($max--) {
        %h = ();
        ## inlined convert square to base and detect repeats code
        my $num = $max * $max;
        while ( $num > 0  ) {
            ++$h{$num % $base} > 1 and next MAX ;
            $num = int( $num/$base ); 
        }
        
        ## print result found
        say "found $max squared is: ", $max * $max;
        my $bout   = dec2base( $max, $base );
        my $bsqout = dec2base( $max * $max, $base );
        say "in base $base: $bout squared is $bsqout";
        last;
    }
}

sub base2dec( $num, $base ) {
## converts from base-n to base-10 : n = 2..36
    my %alpha;
    my $n ;
    $alpha{$_} = $n++ for ( 0..9, "A".."Z" );
    my $out;
    my $pos = 0;
    for ( reverse split //, $num ) {
        $out += $alpha{$_} * $base ** $pos++;
    }
    return $out;
}

sub dec2base ( $num, $base ) {
## converts from base-10 to base-n : n = 2..36
    my @alpha = ( 0..9, "A".."Z" );
    my $rem; 
    my $val = '';
    while ( $num > 0  ) {
        ($num, $rem) = (int( $num/$base ), $num % $base);   
        $val = $alpha[$rem] . $val;
    }
    return $val;
}

The alternate solution we discussed before is to take all the digits that make up a given base and start permuting. If we first list them descending, then a lexicographic permutation routine will generate the permutations in descending order, so the first match wins with this method as well.

We use Algorithm::Combinatorics to provide the routine, because this will be slow enough already as the bases get large, without implementing a pure Perl implementation of Knuth’s Algorithm L or such. The permutation is an array of digits (using the letters A – Z for digits above 10, as is the way) so the conversion routine is altered to accept and work from that, which is easy enough. This gave acceptable performance at first, but really started to bog down at base-12 so I stopped there. Decrementing is definitely faster and will stay that way, as the exponential growth of the permutation sets will remain greater than the super-exponential growth of the square roots ( n^√n) at that point.

### constructing complete solutions from all digits in base 
### and testing for squareness

sub constructive ( $base ) {
    use Algorithm::Combinatorics qw( permutations );

    my @arr = reverse (0..$base-1);
    my $iter = permutations( \@arr );
    my $c;
    my $dec;
    say "permuting @arr";

    while (my $p = $iter->next) {
        $dec = perm2dec( $p, $base );
        my $sq = int sqrt $dec;
        last if $sq * $sq == $dec;    
    }
    
    say "constructive:";
    say "found $dec" ;
    my $bout = dec2base( $dec, $base );
    say "in base $base: $bout";
}

sub perm2dec ( $arr, $base ) {
    my $out;
    my $pos = 0;
    for ( reverse $arr->@* ) {
        $out += $_ * $base ** $pos++;
    }
    return $out;
}

raku solution

Raku has base-conversion built-in, so with the ease of chaining commands we get a very compact, super-pretty piece of code. With support for Unicode exponentiation I think it looks amazing. On the down-side, we do lose the short-circuit out of the hashing for the repeating digit check by doing it this way, but I went for clean this time. We could always implement the algorithm inline, as we did in Perl, if we wanted to improve the speed with larger bases.

unit sub MAIN ( $base = 6 ) ;


my $max = (1 ~ 0 x $base).parse-base($base)
                         .sqrt
                         .Int ;
            
repeat { $max-- } while repeats-in-base( $max², $base );
    
say qq:to/END/;     
    found $max squared is {$max²}
    in base $base: {$max.base($base)} squared is {$max².base($base)}
    END
    

sub repeats-in-base ( $num, $base ) {

    return ($num.base($base)
                .comb
                .Bag 
                .values
                .sort)[*-1] > 1
        ?? 1
        !! 0 ;
}

The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://theweeklychallenge.org