Explaining Raku using Python's Itertools
A few days ago, I posted Python’s Itertools in Pure Raku and I got quite a few responses asking me to elaborate on these examples. This page will then act as a useful addendum to the Python to Raku nutshell page in the Raku docs.
Python’s itertools package is the gold standard for working with iterable streams of data.
However, Raku treats lazy lists as first-class objects, which made me wonder: how well does the base Raku language stack up?
To answer this question, I’m going to go through every function in itertools and provide a one-liner Raku equivalent. These examples all work with normal iterables as well as infinite lists and sequences.
All of these examples will be presented as plain old subroutines – Raku’s equivalent of Python functions. Throughout this post, feel free to open a Raku interpreter by typing raku into your terminal and follow along. Let’s begin:
Count
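Our count() takes two arguments: $start and an optional $end. (Python’s own count() takes start and step; this version trades the step for an optional upper bound.) These are both numeric arguments, which are represented in Raku as variables with the $ sigil because they only contain one value.
To define this function, let’s start by defining a sub count: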
sub count($start, $end = *) { ??? }
(??? is more or less Python’s pass, used to stub out code.)
First off, we give $end a default value of *. Default arguments work much as they do in Python, except that they are evaluated every time the function is called instead of being computed once at definition time and reused, as they are in Python. (You can see the documentation for that on the Signature page.)
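To make that difference concrete, here is a minimal sketch. The sub name append-to is made up for illustration and has nothing to do with itertools. In Python, a mutable default such as acc=[] is created once and shared between calls; in the Raku version below, the default is rebuilt on each call.

```raku
# Hypothetical helper: the default for @acc is re-evaluated on every
# call, so separate calls do not share state.
sub append-to($x, @acc = []) {
    @acc.push($x);
    @acc
}

say append-to(1);   # [1]
say append-to(2);   # [2]
```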
The default value of $end here is where things get exciting. A bare * creates a Whatever object, which is a special value that many operators choose to interpret differently than other values. Let’s finish the function and we’ll see how it comes into play.
sub count($start, $end = *) {
    $start ... $end
}
Raku, like Ruby if you’re familiar, implicitly returns the value of the final statement in the subroutine.
The bulk of the logic happens in the infix ... operator. This won’t be the last time we see ..., so it might be nice to let those docs sink in for a minute.
In this situation, we use ... to do two things: if $end is provided, it simply creates a sequence from $start to $end. If $end is not provided, it creates an infinite sequence from $start to whatever (*).
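Here is a quick usage sketch of the sub above. The second sub, count-by, is not part of the original post: it is a hypothetical variant showing how one might mirror Python’s count(start, step) by seeding the sequence with its first two terms.

```raku
sub count($start, $end = *) {
    $start ... $end
}

say count(1, 5);       # (1 2 3 4 5)
say count(10)[^5];     # (10 11 12 13 14)

# Hypothetical variant mirroring Python's count(start, step): giving the
# sequence operator two terms lets it deduce the step.
sub count-by($start, $step = 1) {
    $start, $start + $step ... *
}

say count-by(0, 5)[^4];   # (0 5 10 15)
```

Indexing with [^5] only asks for the first five values, so the infinite sequence is never fully evaluated.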
Cycle
While count() took scalar arguments, cycle() takes a list, which must therefore be declared with the positional sigil @. It repeats this list infinitely. So, our subroutine definition will look like this:
sub cycle(@p) { ??? }
I’ve got two different ways of writing this subroutine: an explicit method and an implicit method.
The explicit method
We’re going to use Raku’s gather / take control flow structures to do this explicitly. gather tells Raku that the following block is going to generate a sequence, and take yields a value in the sequence – just like Python’s yield.
sub cycle(@p) {
    gather loop {
        for @p -> $p {
            take $p
        }
    }
}
This definition should be rather readable to a Python programmer. A few things to pay attention to: the loop control flow structure is just an infinite loop, and the for control flow structure used here is just like Python’s for ... in construct.
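As a small check that the explicit version is lazy, this sketch repeats the sub and slices the first few values out of its endless result:

```raku
sub cycle(@p) {
    gather loop {
        for @p -> $p {
            take $p
        }
    }
}

# gather is lazy, so indexing pulls only as many values as requested.
say cycle(<A B C>)[^7];   # (A B C A B C A)
```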
The implicit method
(Thank you to CIAvash on the Raku IRC!)
Here’s where things get fun:
sub cycle(@p) {
    |@p xx *
}
There’s a lot going on here, but in English, this reads like “concatenate (slip) infinite copies of @p”.
The prefix | turns our @p into a Slip, which is a class that automatically flattens into the list that it’s inserted into. The Raku-ism for concatenating two lists is to coerce them both into slips and create a new list out of those, which looks like: |@a, |@b for some lists @a and @b.
Once we’ve created the Slip from @p, we concatenate an infinite number of them by passing whatever (*) to the list repetition operator infix xx.
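The following sketch shows the two pieces separately: first what slipping does when building a list, then the slip combined with xx * as in the sub above. The variable names are only for illustration.

```raku
my @a = 1, 2;
my @b = 3, 4;

say (@a, @b);         # ([1 2] [3 4])   two nested arrays
say (|@a, |@b);       # (1 2 3 4)       slips flatten into the outer list
say (|@a xx *)[^5];   # (1 2 1 2 1)     endless copies, evaluated lazily
```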
Repeat
We’ve actually already seen everything we need to create repeat(), so let’s just do it!
sub repeat($elem, $n = *) {
    $elem xx $n
}
Here are some links if you need a refresher: Whatever object (*), and infix xx.
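A short usage sketch, with and without the repetition count:

```raku
sub repeat($elem, $n = *) {
    $elem xx $n
}

say repeat('x', 3);    # (x x x)
say repeat(10)[^4];    # (10 10 10 10)
```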
Accumulate
Accumulate is the first function that we’ve seen that takes a function as an argument instead of a scalar or list-like value. To pass in a function, we can use the callable sigil & to tell Raku that the argument we’re passing in can be executed. Our subroutine signature’s now going to look like this:
sub accumulate(@p, &func = * + *) { ??? }
This time, we’re setting the default argument of &func to * + *. If you look at Python’s default argument for accumulate, you’ll see that they’re using a default argument of operator.add: a function which adds two values which are passed to it.
If you’ve made the jump and guessed that somehow * + * is a function which takes two arguments and adds them together, you’d be 100% correct. Using whatever (*) as an operand in an expression actually turns the entire expression into a WhateverCode object and allows it to act as a function in its own right. If all these stars are making you see stars, the Raku Advent Calendar blog has a good post disambiguating all of them.
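A few small expressions may help the idea land. The names add and double are just for illustration: each * stands in for one argument, so the number of stars sets the number of parameters.

```raku
my &add    = * + *;    # two stars: a two-argument function
my &double = * * 2;    # one star:  a one-argument function

say add(2, 3);               # 5
say double(21);              # 42
say (* + *).^name;           # WhateverCode
say (1, 2, 3).map(* ** 2);   # (1 4 9)
```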
Now that we understand accumulate’s signature, let’s move on to the body of the function:
sub accumulate(@p, &func = * + *) {
    [\[&func]] @p
}
If you’re an APL programmer, using \ in an accumulator should be ringing a bell for you. This is a little bit simpler than it seems: [ ] is the reduction metaoperator. In order to use a non-operator callable inside of it, we must surround that callable with an extra pair of brackets, and in order to accumulate intermediate results, we use a \ inside of the metaoperator itself. That’s all that’s going on here.
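To see the difference between a plain reduction and the accumulating form, here is a short sketch that also calls the sub with a non-default function:

```raku
sub accumulate(@p, &func = * + *) {
    [\[&func]] @p
}

say [+] 1, 2, 3, 4;      # 10           plain reduction: one final value
say [\+] 1, 2, 3, 4;     # (1 3 6 10)   with \: every intermediate value

say accumulate([1, 2, 3, 4]);                # (1 3 6 10)
say accumulate([1, 2, 3, 4], &infix:<*>);    # (1 2 6 24)
say accumulate([3, 1, 4, 1, 5], &[max]);     # (3 3 4 4 5)
```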
Chain
chain() / chain.from_iterable() docs.
Flattening all of the arguments into a single list is the default behavior of slurpy arguments, so there is nothing left for the body to do.
sub chain(*@p) {
    @p
}
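A usage sketch: the slurpy flattens the list arguments into one array, which say prints with square brackets.

```raku
sub chain(*@p) {
    @p
}

say chain((1, 2), (3, 4), 5);   # [1 2 3 4 5]
say chain(<A B C>, <D E F>);    # [A B C D E F]
```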
Compress
This one might take a little bit to build up to, so let’s take it step by step until we’ve built the whole function. The final product looks like this:
sub compress(@d, @s) {
    flat @d Zxx (+<<?<<@s)
}
Its operation is easy enough to explain in English. For every element in @d, we return it if its corresponding value in @s is truthy. Let’s start with a much easier question. How do we tell which values in @s are truthy?
Raku has the prefix ? operator which coerces its argument to a boolean. The only problem is that it coerces the whole argument, meaning it coerces lists to a single value:
> (0,1,2,3).WHAT
(List)
> ?(0,1,2,3)
True
In other words, we want to be able to coerce every element individually to a bool, not the whole thing. There are a couple of ways to do this. We could use a classic for loop, we could use map, or we could use hyper operators. Just like the reduction metaoperator [ ] from before, you can make any operator into a hyper operator by using << and >>. Let’s see how this changes things:
> (0,1,2,3).WHAT
(List)
> ?<<(0,1,2,3)
(False True True True)
Aha! It’s exactly what we want. Let’s use the same trick to coerce them back to numbers, using the numeric context operator prefix +:
> +<<?<<(0,1,2,3)
(0 1 1 1)
Again, an APL programmer will see exactly where I’m going with this. Using the list we’ve created to replicate elements in @d will give us exactly what we want from compress. To do this, we can use the zip metaoperator Z to pair off corresponding elements in each list automatically. Combining this with the list repetition operator infix xx that we learned about earlier gets us very close to what we need:
> (0,1,2,3) Zxx (0,2,4,6)
(() (1 1) (2 2 2 2) (3 3 3 3 3 3))
Now we just have to flatten the final list with flat:
> flat (0,1,2,3) Zxx (0,2,4,6)
(1 1 2 2 2 2 3 3 3 3 3 3)
And once we put the rest of the pieces together, we’re done!
> flat (0,1,2,3) Zxx +<<?<<(0,2,4,6)
(1 2 3)
(Note: functional programmers may notice that we could have instead used a single call to flatmap. If you give this a try, let me know 😉)
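Here is the finished sub applied to the example from Python’s compress() docs, followed by one possible take on the flatmap idea. The second sub, compress-map, is a hypothetical name and only a sketch: it zips the two lists and maps each pair either to its data item or to Empty, which map drops.

```raku
sub compress(@d, @s) {
    flat @d Zxx (+<<?<<@s)
}

say compress(<A B C D E F>, (1, 0, 1, 0, 1, 1));       # (A C E F)

# Hypothetical alternative: pair the lists up with Z, then keep the
# item when its selector is truthy and emit nothing otherwise.
sub compress-map(@d, @s) {
    (@d Z @s).map(-> ($item, $keep) { $keep ?? $item !! Empty })
}

say compress-map(<A B C D E F>, (1, 0, 1, 0, 1, 1));   # (A C E F)
```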
Drop while
sub dropwhile(&pred, @seq) {
    gather for @seq {
        take $_ if (none &pred) ff *
    }
}
(Thanks to Rogue from the Raku discord server for a correction in this section.)
We’ve seen a lot of this before! Here are some refreshers if you need them: the callable sigil &, the gather / take control flow structures, the Whatever object (*).
That for loop looks a little bit different than the one we’ve already seen. It doesn’t have a current iteration variable! That’s like writing a Python loop for value in list as for list … which makes no sense in Python, but it makes perfect sense in Raku! Raku has a special variable called $_, known as the “topic variable”. $_ gets set to whatever you’re currently talking about in your code – in for loops it is the current loop variable, in given blocks it’s the given variable, in smartmatches it’s the left hand side, and so on.
So dropwhile simply says “take (and yield, if you will) the $_ variable if (none &pred) ff * holds”.
(none &pred) ff * uses two things you haven’t seen before: the none junction and the flip-flop operator infix ff.
Junctions are another ball game entirely, and if you’d like to learn more about them, I wrote another blog post here: GADTs and Superpositions in Raku. The important gist here is that the none junction is only true if all of its constituents are false. Its one constituent is, in this case, the &pred callable.
Once none &pred returns True (meaning once &pred returns False), the flip-flop operator, well, flip-flops. By default, the flip-flop operator returns False until its left side returns True, at which point it returns True until its right side returns True. It bounces back and forth between these two conditions forever.
We can override ff’s default functionality, however, by passing whatever (*) to the right side of it. This makes ff flip only once and never again, returning True for the rest of time once the left side returns True.
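To try this out, the sketch below feeds the sub a small list with a predicate that holds for the first two items. In Python, dropwhile(lambda x: x < 5, [1, 4, 6, 4, 1]) yields 6, 4, 1, and the Raku sub is meant to agree: everything from the first failing element onward is kept, even later elements that would satisfy the predicate again.

```raku
sub dropwhile(&pred, @seq) {
    gather for @seq {
        take $_ if (none &pred) ff *
    }
}

# Drop leading values below 5, then keep the rest of the list.
my @rest = dropwhile(* < 5, [1, 4, 6, 4, 1]);
say @rest;
```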
Filter false
This is a builtin: the grep method, using a none Junction.
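A small sketch of what that looks like in practice. The predicate name even is just for illustration. The explicit negation is shown alongside the junction spelling, which is meant to select the same elements.

```raku
my &even = * %% 2;

say (0..9).grep(&even);            # (0 2 4 6 8)   like Python's filter()
say (0..9).grep({ !even($_) });    # (1 3 5 7 9)   explicit negation
say (0..9).grep(none &even);       # the junction spelling of the line above
```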
Group by
This is a builtin: the categorize method or the classify method.
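For instance, here is a minimal classify sketch with made-up data. One difference worth keeping in mind: Python’s groupby() only groups consecutive runs of equal keys, whereas classify gathers every matching element into one bucket regardless of position.

```raku
# Group words by their first letter; the result is a Hash of Arrays.
my %by-initial = <apple avocado banana blueberry cherry>.classify(*.substr(0, 1));

for %by-initial.sort(*.key) -> $group {
    say $group.key, ': ', $group.value;
}
# a: [apple avocado]
# b: [banana blueberry]
# c: [cherry]
```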
Islice
This is a builtin: basic positional list slices are capable of this.
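A few slices, each annotated with the Python islice() call it roughly corresponds to. The variable name is only for illustration.

```raku
my @letters = <A B C D E F G>;

say @letters[^2];            # (A B)         islice('ABCDEFG', 2)
say @letters[2..3];          # (C D)         islice('ABCDEFG', 2, 4)
say @letters[2..*];          # (C D E F G)   islice('ABCDEFG', 2, None)
say @letters[0, 2 ... 6];    # (A C E G)     islice('ABCDEFG', 0, None, 2)
say (1..*)[^5];              # (1 2 3 4 5)   slicing an infinite range
```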
Starmap
sub starmap(&func, @seq) {
    @seq>>.&{ func(|$_) }
}
You’ve seen almost everything here except for the methodop .& operator, which lets us call our { func(|$_) } block as if it were a method.
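If you’d rather not reason about how the hyper operator treats nested lists, the same idea can be sketched with a plain map. The name starmap-map is hypothetical, and the input is the example from Python’s starmap() docs.

```raku
# Hypothetical map-based variant: slip each inner list into the
# argument list of &func.
sub starmap-map(&func, @seq) {
    @seq.map({ func(|$_) })
}

say starmap-map(&infix:<**>, ((2, 5), (3, 2), (10, 3)));   # (32 9 1000)
```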
Take while
sub takewhile(&pred, @seq) {
    |@seq ...^ { !pred($_) }
}
Some refreshers if you need them: the infix ...^ operator, the prefix | operator, and the Block object.
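For comparison, here is a more explicit sketch in the gather / take style used earlier. The name takewhile-gather is hypothetical. It stops the loop at the first element that fails the predicate.

```raku
sub takewhile-gather(&pred, @seq) {
    gather for @seq {
        last unless pred($_);   # stop at the first failing element
        take $_;
    }
}

say takewhile-gather(* < 5, [1, 4, 6, 4, 1]);   # (1 4)
say takewhile-gather(* < 5, 1..*)[^3];          # (1 2 3)   lazy input works too
```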
Tee
Not really sure that this one makes sense to implement, as we’re technically working with lazy lists for the most part here and not one-shot generated sequences.
For that matter, Seq does provide a builtin, the cache method, that may be used in effectively the same way in practice.
Zip longest
(todo)
Product
sub product(+p) {
    [X] p
}
There are a few new things to introduce here. Before this, we used single-star (*@) slurpy arguments. I opted to use a different kind of slurpy argument, the + slurpy, just to show that it exists. We then reduce our list p using the cross product operator infix X to create all of our cross products.
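To see what X does on its own, here is a short sketch that uses the operator directly, first as an infix and then under the reduction metaoperator:

```raku
say (1, 2) X (3, 4);                       # ((1 3) (1 4) (2 3) (2 4))
say [X] (1, 2), (3, 4);                    # ((1 3) (1 4) (2 3) (2 4))

# Three two-element lists give 2 * 2 * 2 combinations.
say ([X] (0, 1), (0, 1), (0, 1)).elems;    # 8
```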
Permutations
This is a builtin: the permutations method.
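For example:

```raku
say (1, 2, 3).permutations;
# ((1 2 3) (1 3 2) (2 1 3) (2 3 1) (3 1 2) (3 2 1))

say <A B C D>.permutations.elems;   # 24
```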
Combinations
This is a builtin: the combinations method.
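For example, with a fixed size and with a range of sizes:

```raku
say (1, 2, 3, 4).combinations(2);
# ((1 2) (1 3) (1 4) (2 3) (2 4) (3 4))

say (1, 2, 3).combinations(1..2);
# ((1) (2) (3) (1 2) (1 3) (2 3))
```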
Combinations with replacements
combinations_with_replacement() docs.
sub combinations_with_replacement(@p, $r) {
    |@p.combinations($r), |([Z] @p xx $r)
}
Left as an exercise to the reader 😊.
Well, that’s about all of them. Every itertools function written on one page in pure Raku. Hope you enjoyed and maybe learned something!
Comments
/u/raiph from the Raku subreddit mentioned the Inline::Python library:
use itertools:from<Python> ;
say count(10) ; # 10 11 12 13 14 ...
say cycle('ABCD') ; # A B C D A B C D ...