Add a Little Wiggle Room

Wherein as we learn to wiggle it. Just a little bit…

THE WEEKLY CHALLENGE – PERL & RAKU #197 Task 2

“First things first: wiggle your big toe… Hard part’s over… Now, let’s get these other piggies wiggling.“

— Beatrix Kiddo, Kill Bill

Wiggle Sort

Submitted by: Mohammad S Anwar

You are given a list of integers, @list.

Write a script to perform Wiggle Sort on the given list.

Wiggle sort would be such as list[0] < list[1] > list[2] < list[3]….

Example 1

Input: @list = (1,5,1,1,6,4)
Output: (1,6,1,5,1,4)

Example 2

Input: @list = (1,3,2,2,3,1)
Output: (2,3,1,3,1,2)

METHOD

Ok, stay with me here because it’s going to be a wild ride today. You see, this “wiggle sort” isn’t really a thing. It’s not number theory, or much of any sort of mathematics, really. It’s a puzzle someone made up that uses lists of numbers. Thus, there isn’t a lot of history to draw from that might explain why we would possibly want to do this. So going in, I got nothing. We’re going to have to start from scratch.

What, then, is a wiggle sort? Beyond the immediate series of conditionals that need apply, the actual final sequencing seems pretty undefined. Even sketchy. Some lists are sortable, others not. Still others have many valid sorts available. Mathematically, the whole thing is a bit of a mess.

For example, there are a number of valid solutions for the first example in addition to the one given, such as (1,4,1,5,1,6) and (1,5,1,4,1,6). In fact there are six, and no method is given to select out a single solution.

To further complicate matters, from the same example multiple occurrences of a value are obviously allowed, but no provision is carved out for when no valid wiggle ordering can be made — a direct consequence of allowing such multiple values.

Consider, for instance, the list (2,2,2,2,2). We can plainly see that, as no value is greater than any other, no solution can exist that will satisfy the very first conditional we come across: that the second element must be greater than the first.

I believe that if the values were defined unique we could determine there would exist at minimum one solution for every array. But this is not the case, and determining whether an array is even solvable at all is not obvious.

So what can we say about whether a list is even solvable? Can we whittle away at this and find an easier challenge at the core?

Let’s see: if more than half the elements are equal we will physically run out of unequal elements to alternate and be forced to place adjacent equal values, so that won’t work. It might also be true that any array satisfying this condition can be made to work. Let’s look at some examples going up to the half-way point, using a list of six values. In each there is a group of three multiples and three others, som larger and some smaller:

    1,2,2,2,3,4 ->  2,3,2,4,1,2     two above, one below
    1,1,2,2,2,4 ->  2,4,1,2,1,2     one above, two below
    4,4,4,2,2,1 ->  1,4,2,4,2,4     all below

This all seems on-point. Things get a little more complicated, though, when an odd number of elements is considered. Then we cannot evenly divide the list into halves. And what’s happening around the half-way point becomes more complex.

As expected, (1,1,1,0,0) cannot work because the ordering (0,1,0,1,1) is the only progression possible going from left to right and it fails at the last element. Let’s look at some others, working through varying fractions of multiple items in varying relative placements:

    1,1,1,0,2   ->  no solution, when starting from either 0 or 1
    1,1,1,2,2   ->  1,2,1,2,1 ok this is interesting 
    1,1,2,0,0   ->  1,2,0,1,0 or 0,1,0,2,1 less than half is fine
    1,1,2,2,0   ->  1,2,0,2,1 or 0,2,1,2,0 multiple multiples are ok

Did you notice the second example? It seems the multiple count of a single value can in fact be be larger than half if the multiple is the smallest value. As we start any wiggle sort low, this makes sense. But there’s also a limit, as the base of the pyramid is only one element larger than the larger numbers above it. So we are limited to the larger half of the split of an odd-numbered list of elements, or ceiling(length/2). Three ones and two twos will work, but three twos and two ones will not.

This reasoning is built on odd and even numbers, and not any particular count. Further the trivial cases of 1, 2 and 3 elements all hold as well. So it’s a general rule.

I believe, then, we can now make a conjecture: if we have no element counted to over one-half the total elements, or if the count of the most numerous element is ceiling(length/2) and the value in question is also the smallest value, then there exists a valid wiggle sort to be found.

Note the ceiling function will only matter when there are an odd number of elements, and in this case the solution will start with the low value, the multiple, and alternate the other, higher values in any ordering we want, to create a valid wiggle sort. With this edge-case now defined I think we’re venturing onto a more solid footing.

In a normal list some elements may be duplicated, but none dominate. Some will be lower and higher than others. We can sort the list numerically and find the smaller half and the larger half of the numbers. There may be some overlap, but again nothing too severe. If we alternately pick from the smaller-value sublist and the larger, we have the basis of an algorithm. This sequential action of alternately taking the same indexed elements from two arrays is sometimes called zipping.

But before we get too excited, we still have hairy edges. It’s those multiples again.

You see, we can still get into trouble with an overlapping multiple spanning the middle of our numerically sorted array.

I think this will determine whether we zip from the head or the tail.

Consider the input

(1,2,3,4,2,2,5,2)

Sorted we get

(1,2,2,2,2,3,4,5)

and then split it follows we get

(1,2,2,2) and (2,3,4,5)

for our low and high values. When we zip from the two lists, we can start from either end. Usually this choice doesn’t matter, but with this list it does:

    head -> (1,2), (2,3), (2,4), (2,5)  does not work
    tail -> (2,5), (2,4), (2,3), (1,2)  works

So if the multiple in the lower half is more than half that individual subarray length, we will need to start in reverse from the tails, otherwise we can use the heads.

Finally though, in hindsight, why even do this last step? There’s nothing inherent in the idea of a wiggle sort that the result is lexicographically ascending, or even coherent for that matter. There’s no reason to select any particular pattern that fits the required inequalities. The problem occurring in the corner case I constructed is that a run of multiples in the middle can overlap if we start from the heads of the subarray halves. Selecting from the tail separates this possible overlap, moving the interfering values outward towards the ends, away from each other.

So why not do this always and just be done with it? Again, no particular wiggle sort is superior to any other. It’s an egalitarian chaos. So lets make our lives simple and do that. Out with any need for a no_duplicates() validation sub and any attempt to sort from the heads.

This tightens things up rather well. And our separate general zip() routine might as well be rolled into its wigglezip() wrapper. At this point the routine that determined whether we can construct a wiggle sort at all is the more complicated of the two remaining functions.

I get the feeling some important information was left out of this description. But I feel this should give us a satisfactory algorithm for finding one valid solution for every array that can have one.

Followup Observations

I never, as a rule, consult the web for solutions to these puzzles. That’s boring, and I’d rather be doing something else. Research, on the other hand, is a different matter. Sometimes I will do research to learn more about the terms within a problem, and other times I will look into a specific technique that I want to use to spice things up. I mean, it makes no sense to restrict my flights of fancy, now, does it? The whole point is to learn something new. And the way to do that is to dive in and see what you find.

That said my explorations into the background on this “wiggle sort” — I suspected something with little if any practical use — went as far as a quick Google search. With no Wikipedia article or anything scholarly whatsoever it became clear this was just a made-up coding puzzle. With that understanding I stopped looking and started thinking about the problem with a blank slate.

Once I had my solution, though, I did another thing this time that I rarely do: I went back to the web for a more thorough look. I was curious that I had somehow missed something obvious. The solution I came up with seems robust but also not as rigorously proven as I would like. The problem as stated is not, as I said, well defined.

As it turns out, I seem to have solved the problem on hard mode, as it appears the critical part left off the common task description is that the array given is assumed from the start to be solvable. This removes quite a bit of headache, as all the edge-cases that could go wrong quietly go away. Another simplifying condition I found several times is to allow greater/less than or equal to. This too also sidesteps the trouble of accommodating runs of equal values.

The basic idea, it turns out, of alternating picks from the high half of the sorted values and the lower, is a solid way to go. Good to know.

PERL 5 SOLUTION

use warnings;
use strict;
use utf8;
use feature ":5.26";
use feature qw(signatures);
no warnings 'experimental::signatures';

my @input = @ARGV
    ? @ARGV
    : (6,4,4,7,4,4,1,8,5,4);
say "input:  @input";
    
@input    = sort {$a<=>$b} @input;
my @small = @input[0..(@input+1)/2-1];              ## up to ceiling
my @large = @input[(@input+1)/2..@input-1];         ## rest of array

unless ( has_solution(\@input) ) {
    say "no solution possible" and exit;
}
say "sorted: ", join ' ', wigglezip( \@small, \@large )->@*;

sub has_solution ( $arr ) {
## checks to see no element is duplicated more than one-half the total
## length, or the ceiling under the specific circumstance that it is the 
## smallest value in the set
    my %freq;
    $freq{$_}++ for $arr->@*;
    my @range = sort {$b<=>$a} values %freq;      
    return 1 if $range[0] <= scalar($arr->@* )/ 2;   ## highest freq value
    if ($range[0] == int( ($arr->@* + 1)/2 )) {     ## ceiling
        my ($multiple) = grep { $freq{$_} == $range[0] } keys %freq;
        if ( min( keys %freq ) == $multiple ) {
            return 1;
        }
    }   
    
    return 0;
}

sub wigglezip ($aref1, $aref2) {
## zip two arrays from head if this makes a valid wigglesort, else reverse
## the arrays and zip, resulting with a zip from tails backwards    
    my @low  = reverse $aref1->@*;
    my @high = reverse $aref2->@*;
    my @zipped;
    
    for (0..$#low) {
        push @zipped, $low[$_];
        push @zipped, $high[$_] unless $_ > $#high; ;
    }
    
    return \@zipped;  
}

Raku Solution

The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://theweeklychallenge.org