August 2022 – Programming Excursions in Perl and Raku

Wherein we follow the lines wherever they lead us: up, down or off the page…

THE WEEKLY CHALLENGE – PERL & RAKU #179 Task 2

“I like the sparkle of the vibraphone.”

— Evelyn Glennie

Unicode Sparkline

Submitted by: Mohammad S Anwar

You are given a list of positive numbers, @n.

Write a script to print a sparkline in Unicode for the given list of numbers.

Background

Edward Tufte coined the term sparklines in 2006 for “data-intense, design-simple, word-sized graphics”, embedded seamlessly within a line of text. The idea is to convey simple graphical information, such as data trending, quickly and without breaking the reader’s spatial attention on the page. Without specific axis labeling one can, for instance, plainly see the basic shaping of a value as it varies.

“Sparklines”, Tufte says, “can be placed anywhere that words or numbers or graphics can be placed: in sentences, maps, graphics, tables.”

Individual words each contain small portions of information. Generally the information content of a single word is incomplete, but by grouping words together in a linear way more complex ideas can be developed. Sparklines similarly contain a small kernel of useful information that, incomplete on its own, can be combined inline into a written context to add meaning to the phrase. Rather than forcing the reader to jump back and forth between a sideboxed graphic and the text that gives it context, the sparkline can provide a small, significant part of a larger data analysis immediately when required, with the context self-evident.

The widespread adoption of Unicode has greatly expanded the character sets available to typographers, including graphical primitives. This has opened up the possibilities of directly creating character-based “words” constructed from these elements, allowing simple graphing based on dimensions and toning density. Sparklines, then, can be constructed from characters and can textually exist the same as any other, more normal word, allowing them to be seamlessly integrated directly into a line of text.

METHOD

Typographers have been exploring the Unicode character set for various character collections suitable for conveying graphical information, but one such grouping has stood out from the others and received the lion’s share of attention.

The sequence of 8 Unicode characters at the code points U+2581 through U+2588 provides a set of box graphics of increasing vertical sizing that can be used to construct simple bar-graphs, and these in turn can be used to create sparklines. Thus when given a data set of numerical values, the data-processing challenge is to divide the range of the data across these 8 levels and assign each element to one of them. From these a sparkline can then be constructed as a string of characters. Without vertical axis markings the graph will present itself accurately but with limited precision. Even so, we can clearly show general upward or downward trending in the data, or concentrations of values, for instance, without relying on specific axis registration marks.
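As a minimal sketch of the character set just described, the eight glyphs can be generated from their code points rather than typed in by hand. The variable name @blocks is only illustrative and is not used in the solution further down.

```perl
use strict;
use warnings;
use feature 'say';
use open ':std', ':encoding(UTF-8)';

# Build the eight block characters from their code points,
# U+2581 (lowest bar) through U+2588 (full block).
my @blocks = map { chr($_) } 0x2581 .. 0x2588;

say scalar @blocks;      # how many glyphs we have
say join ' ', @blocks;   # the glyphs, shortest to tallest
```

Generating the list this way makes the ordering explicit: the array index grows with the height of the bar, which is what the index calculation relies on.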

The problem then becomes one of normalizing the data. We need an overall range for the values provided, and from that we can then derive a conversion factor to map this range onto the range 0 through 7. We then apply this conversion factor (and offset) to the data and take the integer part, yielding an integer index 0-7. We then use this index to find the appropriate character.

The conversion is not unlike converting from Celsius to Fahrenheit. A temperature in °F is the measurement in °C × 9/5, as 100 units in °C map to 180 units in °F. There is also an offset, however, in that the freezing point of water, 0°C, sits at 32°F. This is because Fahrenheit decided to use a salted ice slurry for his lower bound, for the curious. So after we convert our scale from 0-100 to 0-180, we then need to add that 32 to get the proper result.

So in the process, generalized, we end up with:

input_range  = input_max  - input_min
output_range = output_max - output_min = 7 - 0 = 7
input_offset = input - input_min

output = (output_range/input_range) * input_offset + output_min;

From there we take the integer component to get an index in the range 0-7. Because output_min is 0 here, no further adjustment is needed.
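The generalized process can be sketched as a small function and checked against the Celsius to Fahrenheit example. The name rescale and its parameter order are assumptions made for illustration, and the sketch assumes the input range is not zero.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: linearly map $x from [$in_min, $in_max]
# onto [$out_min, $out_max]. Assumes $in_max differs from $in_min.
sub rescale {
    my ($x, $in_min, $in_max, $out_min, $out_max) = @_;
    my $input_range  = $in_max  - $in_min;
    my $output_range = $out_max - $out_min;
    my $input_offset = $x - $in_min;
    # scale the offset into the output range, then shift by the output minimum;
    # multiplying before dividing keeps whole-number results exact
    return $output_range * $input_offset / $input_range + $out_min;
}

say rescale( 100, 0, 100, 32, 212 );          # boiling point, in Fahrenheit
say rescale(   0, 0, 100, 32, 212 );          # freezing point, in Fahrenheit
say int( rescale( 2500, -4500, 8500, 0, 7 ) ); # a sparkline index
```

For the sparkline the output minimum is 0, so the final addition drops out and only the scaling and the integer truncation remain.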

PERL 5 SOLUTION

Note that 8 equal zones have 9 division points, including the extremes. If we scale by 8, the largest value belongs in the highest octant, but since it sits on the edge the algorithm wants to place it one above, at index 8. The array allows for this by duplicating the 8th shape into the 9th position for this one value. The code below scales by 7, matching the output range of 0-7 given above, so the largest value lands on index 7 and the duplicate serves only as a guard. Ultimately all of the border values will need to be rounded one way or the other, and doing it this way simplifies matters.

use warnings;
use strict;
use utf8;
use feature ":5.26";
use feature qw(signatures);
no warnings 'experimental::signatures';
use open ':std', ':encoding(UTF-8)';

use List::MoreUtils qw( minmax );

my @input = (15,2500,35,-4500,55,65,75,8500);

say "@input";
say sparkle( \@input );


sub sparkle( $arr ) {
    my @sparks = qw( ▁ ▂ ▃ ▄ ▅ ▆ ▇ █ █);

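    my ($min, $max) = minmax( $arr->@* );
    my $range = $max - $min;
    my $out;

    for ($arr->@*) {
        ## output = output_range * input_offset / input_range
        ## multiply before dividing so the maximum lands exactly on index 7;
        ## a constant list has zero range, so every value gets the lowest bar
        my $idx = $range ? int( 7 * ($_ - $min) / $range ) : 0;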
        $out .= $sparks[ $idx ];
    }
    return $out;
}
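For comparison, here is a sketch of the eight-equal-zones variant mentioned in the note before the code, where the scale factor is 8 and the duplicated 9th glyph absorbs the maximum. This is not the solution above: the name sparkle_bands is hypothetical, and core List::Util is swapped in for List::MoreUtils purely to keep the sketch dependency-free.

```perl
use strict;
use warnings;
use utf8;
use feature 'say';
use open ':std', ':encoding(UTF-8)';
use List::Util qw( min max );

# Hypothetical variant: split the data range into eight equal bands.
sub sparkle_bands {
    my @values = @_;
    return '' unless @values;
    # the 9th glyph repeats the 8th, so the maximum (index 8) stays in range
    my @sparks = qw( ▁ ▂ ▃ ▄ ▅ ▆ ▇ █ █ );
    my ($min, $max) = ( min(@values), max(@values) );
    my $range = $max - $min;
    # a constant list has no range; draw it as a flat line of lowest bars
    return $sparks[0] x @values unless $range;
    return join '', map { $sparks[ int( 8 * ($_ - $min) / $range ) ] } @values;
}

say sparkle_bands( 1 .. 9 );
say sparkle_bands( 15, 2500, 35, -4500, 55, 65, 75, 8500 );
```

With this scaling only the maximum reaches index 8, and values in the top eighth of the range share the full block with it. Which scaling reads better is a matter of taste; the difference shows only near the band edges.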

The Perl Weekly Challenge, that idyllic glade wherein we stumble upon the holes for these sweet descents, is now known as

The Weekly Challenge – Perl and Raku

It is the creation of the lovely Mohammad Sajid Anwar and a veritable swarm of contributors from all over the world, who gather, as might be expected, weekly online to solve puzzles. Everyone is encouraged to visit, learn and contribute at

https://theweeklychallenge.org

Glitter Bombs and Sparkling Lines