A few days ago, I posted Python’s Itertools in Pure Raku and I got quite a few responses asking me to elaborate on these examples. This page will then act as a useful addendum to the Python to Raku nutshell page in the Raku docs.

Python’s itertools package is the gold standard for working with iterable streams of data.

However, Raku treats lazy lists as first class objects, so that made me start to wonder: how well does the base Raku language stack up?

To answer this question, I’m just going to go through every function in itertools and provide a one-liner Raku equivalent. These examples all work with normal iterables as well as infinite lists and sequences.

All of these examples will be presented as plain old subroutines – Raku’s equivalent of Python functions. Throughout this post, feel free to open a Raku interpreter by typing raku into your terminal and follow along. Let’s begin:

Count

count() docs.

Python’s count() takes a start and a step; the version here takes a start and an optional end instead. These are both numeric arguments, which are represented in Raku as variables with the $ sigil because they only contain one value.

To define this function, let’s start by defining a sub count:

sub count($start, $end = *) { ??? }

(??? is more or less Python’s pass, used to stub out code.)

First off, we give $end a default value of *. Default arguments work much the same as they do in Python; however, they are evaluated every time the function is called instead of being computed once, at definition time, as they are in Python. (You can see the documentation for that on the Signature page.)

The default value of $end here is where things get exciting. A bare * creates a Whatever object, which is a special value that many operators choose to interpret differently than other values. Let’s finish the function and we’ll see how it comes into play.

sub count($start, $end = *) {
    $start ... $end 
}

Raku, like Ruby if you’re familiar, implicitly returns the value of the final statement in the subroutine.

The bulk of the logic happens in the infix ... operator. This won’t be the last time we see ..., so it might be nice to let those docs sink in for a minute.

In this situation, we use ... to do two things: in the case that $end is provided, it simply creates a list from $start to $end. In the case that $end is not provided, it creates an infinite sequence from $start to whatever (*).
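For readers coming from Python, here is a rough sketch of the same start-and-optional-end behaviour. The helper name count_to is made up for this illustration, and it only handles the ascending case (end at or above start).

```python
from itertools import count, islice, takewhile

def count_to(start, end=None):
    """Hypothetical helper: count up from start, stopping at end if one is given."""
    if end is None:
        # No end supplied: an endless counter, like `$start ... *`
        return count(start)
    # End supplied: stop after reaching it (inclusive), like `$start ... $end`
    return takewhile(lambda n: n <= end, count(start))

bounded = list(count_to(3, 6))
unbounded_head = list(islice(count_to(10), 4))
print(bounded)         # [3, 4, 5, 6]
print(unbounded_head)  # [10, 11, 12, 13]
```

The unbounded case has to be sliced before printing, since it never ends on its own.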

Cycle

cycle() docs.

While count() took scalar arguments, cycle() takes a list, which must therefore be declared with the positional sigil @. It repeats this list infinitely. So, our subroutine definition will look like this:

sub cycle(@p) { ??? }

I’ve got two different ways of writing this subroutine: an explicit method and an implicit method.

The explicit method

We’re going to use Raku’s gather / take control flow structures to do this explicitly. gather tells Raku that the following block is going to generate a sequence, and take yields a value in the sequence – just like Python’s yield.

sub cycle(@p) { 
    gather loop { 
        for @p -> $p {
            take $p
        }
    }
}

This definition should be rather readable to a Python programmer. A few things to pay attention to: the loop control flow structure is just an infinite loop, and the for control flow structure used here is just like Python’s for ... in construct.
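For comparison, the same shape written as a Python generator might look like the sketch below. It is only an illustration of the gather / take structure, not how itertools implements cycle().

```python
from itertools import islice

def cycle(items):
    # Assumes `items` is non-empty and can be iterated more than once.
    while True:              # plays the role of Raku's `loop`
        for item in items:   # plays the role of `for @p -> $p`
            yield item       # plays the role of `take $p`

first_seven = list(islice(cycle("ABC"), 7))
print(first_seven)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```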

The implicit method

(Thank you to CIAvash on the Raku IRC!)

Here’s where things get fun:

sub cycle(@p) { 
    |@p xx * 
}

There’s a lot going on here, but in English, this reads like “concatenate (slip) infinite copies of @p”.

The prefix | turns our @p into a Slip, which is a class that automatically flattens the list that it’s inserted into. The Raku-ism for concatenating two lists is to coerce them both into slips and create a new list out of those, which looks like: |@a, |@b for some lists @a and @b.

Once we’ve created the Slip from @p, we concatenate an infinite number of them by passing whatever (*) to the list repetition operator infix xx.
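A loose Python analogue of “flatten infinite copies of the list” can be put together from chain.from_iterable() and repeat(). The name cycle_implicit is made up for this sketch.

```python
from itertools import chain, islice, repeat

def cycle_implicit(items):
    # repeat(items) yields the same list forever (like `@p xx *`);
    # chain.from_iterable flattens those copies lazily (like the `|` slip).
    return chain.from_iterable(repeat(items))

head = list(islice(cycle_implicit([1, 2, 3]), 7))
print(head)  # [1, 2, 3, 1, 2, 3, 1]
```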

Repeat

repeat() docs.

We’ve actually already seen everything we need to create repeat(), so let’s just do it!

sub repeat($elem, $n = *) {
    $elem xx $n
}

Here are some links if you need a refresher: Whatever object (*), and infix xx.
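For reference, this is the Python behaviour being matched: repeat() with a count stops after that many items, and without one it goes on forever.

```python
from itertools import islice, repeat

three_times = list(repeat("x", 3))
forever_head = list(islice(repeat("x"), 5))  # slice it, since it never ends

print(three_times)   # ['x', 'x', 'x']
print(forever_head)  # ['x', 'x', 'x', 'x', 'x']
```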

Accumulate

accumulate() docs.

Accumulate is the first function that we’ve seen that takes a function as an argument instead of only scalar or list-like values. To pass in a function, we can use the callable sigil & to tell Raku that the argument we’re passing in can be executed. Our subroutine signature’s now going to look like this:

sub accumulate(@p, &func = * + *) { ??? }

This time, we’re setting the default argument of &func to * + *. If you look at Python’s default argument for accumulate, you’ll see that they’re using a default argument of operator.add: a function which adds two values which are passed to it.

If you’ve made the jump and guessed that somehow * + * is a function which takes two arguments and adds them together, you’d be 100% correct. Using whatever (*) as an operand actually turns the entire expression into a WhateverCode object and allows it to act as a function in its own right. If all these stars are making you see stars, the Raku Advent Calendar blog has a good post disambiguating all of them.

Now that we understand accumulate’s signature, let’s move on to the body of the function:

sub accumulate(@p, &func = * + *) {
    [\[&func]] @p
}

If you’re an APL programmer, using \ in an accumulator should be ringing a bell. This is a little bit simpler than it seems: [ ] is the reduction metaoperator. In order to use a non-operator callable inside of it, we must surround that callable with an extra pair of brackets, and in order to accumulate intermediate results we use a \ inside of the metaoperator itself. That’s all that’s going on here.
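To make “accumulate intermediate results” concrete, here is what Python’s accumulate() produces with its default addition and with two other binary functions. The input lists are arbitrary examples.

```python
from itertools import accumulate
import operator

running_sum = list(accumulate([1, 2, 3, 4]))                    # default: addition
running_product = list(accumulate([1, 2, 3, 4], operator.mul))  # custom binary function
running_max = list(accumulate([3, 1, 4, 1, 5], max))            # any two-argument callable

print(running_sum)      # [1, 3, 6, 10]
print(running_product)  # [1, 2, 6, 24]
print(running_max)      # [3, 3, 4, 4, 5]
```

Each output element is the reduction of everything up to that point, which is the same idea as the triangular reduce described above.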

Chain

chain() / chain.from_iterable() docs.

This is the default behavior of slurpy arguments.

sub chain(*@p) {
    @p
}
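For reference, the Python side looks like this: chain() joins its arguments end to end, and chain.from_iterable() does the same for a single iterable of iterables.

```python
from itertools import chain

joined = list(chain([1, 2], (3, 4), "ab"))
from_nested = list(chain.from_iterable([[1, 2], [3, 4]]))

print(joined)       # [1, 2, 3, 4, 'a', 'b']
print(from_nested)  # [1, 2, 3, 4]
```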

Compress

compress() docs.

This one might take a little bit to build up to, so let’s take it step by step until we’ve built the whole function. The final product looks like this:

sub compress(@d, @s) {
    flat @d Zxx (+<<?<<@s)
}

Its operation is easy enough to explain in English. For every element in @d, we return it if its corresponding value in @s is truthy. Let’s start with a much easier question. How do we tell which values in @s are truthy?

Raku has the prefix ? operator which coerces its argument to a boolean. The only problem is that it coerces the whole argument, meaning it coerces lists to a single value:

> (0,1,2,3).WHAT
(List)
> ?(0,1,2,3)
True

In other words, we want to be able to coerce every element individually to a bool, not the whole thing. There are a couple ways to do this. We could use a classic for loop, we could use map, or we could use hyper operators. Just like the reduction metaoperator [ ] from before, you can make any operator into a hyper operator by using << and >>. Let’s see how this changes things:

> (0,1,2,3).WHAT
(List)
> ?<<(0,1,2,3)
(False True True True)

Aha! It’s exactly what we want. Let’s use the same trick to coerce them back to numbers, using the numeric context operator prefix +:

> +<<?<<(0,1,2,3)
(0 1 1 1)

Again, an APL programmer will see exactly where I’m going with this. Using the list we’ve created to replicate elements in @d will give us exactly what we want from compress. To do this, we can use the zip metaoperator Z to pair off corresponding elements in each list automatically. Combining this with the list repetition operator infix xx that we learned about earlier gets us very close to what we need:

> (0,1,2,3) Zxx (0,2,4,6)
(() (1 1) (2 2 2 2) (3 3 3 3 3 3))

Now we just have to flatten the final list with flat:

> flat (0,1,2,3) Zxx (0,2,4,6)
(1 1 2 2 2 2 3 3 3 3 3 3)

And once we put the rest of the pieces together, we’re done!

> flat (0,1,2,3) Zxx +<<?<<(0,2,4,6)
(1 2 3)

(Note: functional programmers may notice that we could have instead used a single call to flatmap. If you give this a try, let me know 😉)
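The same mask-and-repeat idea can be sketched in Python and checked against the real compress(). The name compress_by_repetition is made up for this illustration; it is not how itertools implements compress().

```python
from itertools import chain, compress, repeat

def compress_by_repetition(data, selectors):
    # Turn each selector into 0 or 1 (the role of `+<<?<<@s`)
    counts = [int(bool(s)) for s in selectors]
    # Repeat each element 0 or 1 times, then flatten (the role of `flat @d Zxx ...`)
    return chain.from_iterable(repeat(d, n) for d, n in zip(data, counts))

sketch_result = list(compress_by_repetition([0, 1, 2, 3], [0, 2, 4, 6]))
builtin_result = list(compress([0, 1, 2, 3], [0, 2, 4, 6]))

print(sketch_result)   # [1, 2, 3]
print(builtin_result)  # [1, 2, 3]
```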

Drop while

dropwhile() docs.

sub dropwhile(&pred, @seq) {
    gather for @seq {
        take $_ if (none &pred) ff *
    }
}

(Thanks to Rogue from the Raku discord server for a correction in this section.)

We’ve seen a lot of this before! Here are some refreshers if you need them: the callable sigil &, the gather / take control flow structures, the Whatever object (*).

That for loop looks a little bit different than the one we’ve already seen. It doesn’t have a current iteration variable! That’s like writing a Python loop for value in list as for list… which makes no sense in Python, but it makes perfect sense in Raku! Raku has a special variable called $_ which is called the “topic variable”. $_ gets set to whatever you’re currently talking about in your code – in for loops it is the current loop variable, in given blocks it’s the given variable, in smartmatches it’s the left hand side, etc, etc, etc.

So dropwhile simply says “take (and yield, if you will) the $_ variable if (none &pred) ff * holds”.

(none &pred) ff * uses two things you haven’t seen before: the none junction and the flip-flop operator infix ff.

Junctions are another ball game entirely, and if you’d like to learn more about them, I wrote another blog post here: GADTs and Superpositions in Raku. The important gist here is that the none junction is only true if all of its constituents are false. Its one constituent is, in this case, the &pred callable.

Once none &pred returns True (meaning once &pred returns false), the flip-flop operator, well, flip-flops. By default, the flip-flop operator always returns False until its left side returns True, at which point it’ll return True until its right side returns True. It bounces back and forth between these two conditions forever.

We can override ff’s default functionality, however, by passing whatever (*) to the right side of it. This makes ff only flip-flop once and never again, returning True for the rest of time once the left side returns True.
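Python has no flip-flop operator, but the one-way latch described here can be spelled out by hand. The sketch below (dropwhile_latch is a made-up name) tracks the latch in a plain boolean and is compared against the real dropwhile().

```python
from itertools import dropwhile

def dropwhile_latch(pred, seq):
    """Hypothetical helper: a one-way latch, in the spirit of `(none &pred) ff *`."""
    latched = False                      # the flip-flop starts out False
    for item in seq:
        if not latched and not pred(item):
            latched = True               # pred failed once; it never flips back
        if latched:
            yield item

data = [1, 4, 6, 4, 1]
latched_result = list(dropwhile_latch(lambda x: x < 5, data))
builtin_result = list(dropwhile(lambda x: x < 5, data))

print(latched_result)  # [6, 4, 1]
print(builtin_result)  # [6, 4, 1]
```

Note that the trailing 4 and 1 are kept even though they satisfy the predicate, because the latch never resets.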

Filter false

filterfalse() docs.

This is a builtin: the grep method, using a none Junction.
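For reference, this is the Python behaviour in question: filterfalse() keeps only the items for which the predicate is false.

```python
from itertools import filterfalse

evens = list(filterfalse(lambda x: x % 2, range(10)))  # drop items where x % 2 is truthy
print(evens)  # [0, 2, 4, 6, 8]
```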

Group by

groupby() docs.

This is a builtin: the categorize method or the classify method.
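One difference worth keeping in mind: Python’s groupby() only groups consecutive items with the same key, whereas classify collects every item under its key regardless of position. The sketch below shows both behaviours on the Python side; the dictionary-based grouping is only an illustration of the classify-style result.

```python
from collections import defaultdict
from itertools import groupby

word = "AAABBA"

# groupby: a new group starts every time the key changes
runs = [(key, "".join(group)) for key, group in groupby(word)]
print(runs)  # [('A', 'AAA'), ('B', 'BB'), ('A', 'A')]

# classify-style grouping: one bucket per key, here holding positions
by_key = defaultdict(list)
for position, letter in enumerate(word):
    by_key[letter].append(position)
classified = dict(by_key)
print(classified)  # {'A': [0, 1, 2, 5], 'B': [3, 4]}
```

Sorting the input by key first makes the two approaches line up.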

Islice

islice() docs.

This is a builtin: basic positional list slices are capable of this.
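For reference, islice() on the Python side takes start, stop and step much like a regular slice, but works lazily on any iterable, including infinite ones.

```python
from itertools import count, islice

letters = list(islice("ABCDEFG", 2, 5))        # items at positions 2, 3, 4
every_third = list(islice(count(), 2, 10, 3))  # positions 2, 5, 8 of an endless counter

print(letters)      # ['C', 'D', 'E']
print(every_third)  # [2, 5, 8]
```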

Starmap

starmap() docs.

sub starmap(&func, @seq) {
    @seq>>.&{ func(|$_) }
}

You’ve seen almost everything here except for the methodop .& operator, allowing us to call our { func(|$_) } block as a method.
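For reference, starmap() unpacks each item of the sequence into positional arguments, which is the job func(|$_) does in the Raku version. A small Python check, with arbitrary example data:

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]

powers = list(starmap(pow, pairs))
by_hand = [pow(*args) for args in pairs]  # the same thing written as explicit unpacking

print(powers)   # [32, 9, 1000]
print(by_hand)  # [32, 9, 1000]
```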

Take while

takewhile() docs.

sub takewhile(&pred, @seq) {
    |@seq ...^ { !pred($_) }
}

Some refreshers if you need them: the infix ...^ operator, the prefix | operator, and the Block object.
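For comparison, here is the Python behaviour alongside a hand-written generator that stops at the first failing item. The name takewhile_sketch is made up for this illustration.

```python
from itertools import takewhile

def takewhile_sketch(pred, seq):
    for item in seq:
        if not pred(item):
            return        # stop at the first failure and exclude it, like `...^`
        yield item

data = [1, 4, 6, 4, 1]
taken = list(takewhile(lambda x: x < 5, data))
sketched = list(takewhile_sketch(lambda x: x < 5, data))

print(taken)     # [1, 4]
print(sketched)  # [1, 4]
```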

Tee

tee() docs.

Not really sure that this one makes sense to implement, as we’re technically working with lazy lists for the most part here and not generated sequences.

For that matter, Seq does provide a builtin, the cache method, that may be used effectively the same way in practice.
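For context, this is what tee() is for on the Python side: a generator can only be consumed once, and tee() splits it into independent iterators that can each be read in full.

```python
from itertools import tee

source = (n * n for n in range(4))  # a one-shot generator
first, second = tee(source)         # two independent views of the same stream

first_items = list(first)
second_items = list(second)

print(first_items)   # [0, 1, 4, 9]
print(second_items)  # [0, 1, 4, 9]
```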

Zip longest

zip_longest() docs.

(todo)

Product

product() docs.

sub product(+p) {
    [X] p
}

There are a few new things to introduce here. Before this, we used single star (*@) slurpy arguments. I opted to use a different kind of slurpy argument, the + slurpy, just to show that it exists. We then reduce our list p using the cross product operator infix X to create all of our cross products.
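For reference, this is the Python output being matched: product() yields every combination of one item from each input, with the rightmost input varying fastest.

```python
from itertools import product

pairs = list(product("AB", "xy"))
triples = list(product([0, 1], repeat=3))  # the same input crossed with itself 3 times

print(pairs)         # [('A', 'x'), ('A', 'y'), ('B', 'x'), ('B', 'y')]
print(len(triples))  # 8
```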

Permutations

permutations() docs.

This is a builtin: the permutations method.

Combinations

combinations() docs.

This is a builtin: the combinations method.

Combinations with replacements

combinations_with_replacement() docs.

sub combinations_with_replacement(@p, $r) {
    |@p.combinations($r), |([Z] @p xx $r)
}

Left as an exercise to the reader 😊.
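As a starting point for that exercise, here is a rough Python mirror of the one-liner, as it reads for r = 2: the plain combinations plus each item zipped with itself. This is only checked for r = 2 here, and the order differs from Python’s own output.

```python
from itertools import combinations, combinations_with_replacement

items = "ABC"

reference = list(combinations_with_replacement(items, 2))
# Mirror of `|@p.combinations(2), |([Z] @p xx 2)`: distinct pairs, then (x, x) pairs
mirrored = list(combinations(items, 2)) + list(zip(items, items))

print(reference)  # [('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')]
print(sorted(mirrored) == sorted(reference))  # True
```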


Well, that’s about all of them. Every itertools function written on one page in pure Raku. Hope you enjoyed and maybe learned something!

Comments


/u/raiph from the Raku subreddit mentioned the Inline::Python library:

use itertools:from<Python> ;
say count(10) ; # 10 11 12 13 14 ...
say cycle('ABCD') ; # A B C D A B C D ...