Implementing an iterator in Rust

Iterator is a thing in Rust that allows you to… well, iterate over things. It can be very useful to write compact and optimal code.

For example

let a: Vec<u32> = (1..1024).collect();
let b = a.iter()
  .map(|&x| x*x)
  .take(5)
  .collect();
// b == vec![1, 4, 9, 16, 25]);

this code guarantees you it does only two exact allocations:

one allocation of 1023 u32s for a
one allocation of 5 u32s for b

In contrast in this code

let mut a = Vec::with_capacity(1023);
for i in 1..1024 {
  a.push(i);
}

let mut b = Vec::with_capacity(5);
for i in 0..5 {
  let x = a[i] * a[i];
  b.push(x);
}

even though push() should not re-allocate the Vecs, it always checks if there is some space left
a[i] always checks if the given index is not out of bounds

Plus it is longer and a bit less readable (at least for me)

So, what does it take to implement an iterator?

Not so much I would say - the only required method is next() A very simple iterator, that returns values of fibonacci sequence would look like this

struct FibIter {
  x: [u32; 2],
}

impl FibIter {
  fn new() -> Self {
    Self{ x: [1, 1] }
  }
}

impl Iterator for FibIter {
  type Item = u32;

  fn next(&mut self) -> Option<Self::Item> {
    let ret = self.x[0];
    self.x[0] = self.x[1];
    self.x[1] = ret.wrapping_add(self.x[0]);
    return Some(ret);
  }
}

And thats it! Now it can be used like this

for x in FibIter::new().take(5) {
  println!("{}", x);
}

let x = FibIter::new()
  .skip(5)
  .take(6)
  .collect();

But there is one caveat: even though next() is so implemented that it always returns Some(), computers are ~~stupid~~ of limited intelligence and rely on size_hint functions when pre-allocating memory inside collect() method

So, this should fix it, right?

impl Iterator for FibIter {
  // ...
  fn size_hint(&self) -> (usize, Option<usize>) {
    (usize::MAX, None)
  }
}

FibIter now reports that user should expect usize::MAX or more elements, so in the case .take(6).collect(), Vec should be pre-allocated to fit six elements and then the collection proccess should happen without reserve()ing additional space. This would help unrolling/vectorizing the loop, because reserve() checks if it can expand memory (if it can’t, it just panic!()s).

But godbolt says that the reserve() is still there in loop (in form of __rust_realloc and __rust_alloc)

.LBB5_14:
        mov     rax, r12
        add     rax, r13
        jb      .LBB5_29
        lea     rcx, [r12 + r12]
        cmp     rcx, rax
        cmova   rax, rcx
        cmp     rax, 4
        mov     ecx, 4
        cmovbe  rax, rcx
        xor     esi, esi
        mul     rcx
        setno   cl
        jo      .LBB5_29
        mov     rbx, rax
        mov     rax, qword ptr [rsp + 8]
        test    rax, rax
        je      .LBB5_22
        shl     r12, 2
        cmp     r12, rbx
        je      .LBB5_27
        test    r12, r12
        je      .LBB5_19
        mov     edx, 4
        mov     rdi, rax
        mov     rsi, r12
        mov     rcx, rbx
        call    qword ptr [rip + __rust_realloc@GOTPCREL]
        jmp     .LBB5_26
.LBB5_11:
        mov     rax, qword ptr [rsp + 8]
        mov     dword ptr [rax + 4*r14 - 4], ebp
        mov     qword ptr [rsp + 24], r14
        cmp     r8, r14
        jne     .LBB5_13
        jmp     .LBB5_35
.LBB5_22:
        mov     sil, cl
        shl     rsi, 2
        test    rbx, rbx
        jne     .LBB5_25
        mov     rax, rsi
        jmp     .LBB5_27
.LBB5_19:
        test    rbx, rbx
        je      .LBB5_20
        mov     esi, 4
.LBB5_25:
        mov     rdi, rbx
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
.LBB5_26:
        mov     r8, qword ptr [rsp]
.LBB5_27:
        test    rax, rax
        je      .LBB5_30
.LBB5_28:
        mov     qword ptr [rsp + 8], rax
        shr     rbx, 2
        mov     qword ptr [rsp + 16], rbx
        mov     dword ptr [rax + 4*r14 - 4], ebp
        mov     qword ptr [rsp + 24], r14
        cmp     r8, r14
        je      .LBB5_35
.LBB5_13:
        add     r13, -1
        add     ebp, r15d
        mov     r12, qword ptr [rsp + 16]
        add     r14, 1
        mov     eax, ebp
        mov     ebp, r15d
        mov     r15d, eax
        lea     rax, [r14 - 1]
        cmp     rax, r12
        jne     .LBB5_11
        jmp     .LBB5_14

Is ExactSizeIterator the answer here?

No.

ummm.. maybe it will work with FusedIterator?

No.

Turns out that value from size_hint can’t be trusted. For example a programmer could leave a default implementation of it

fn size_hint(&self) -> (usize, Option<usize>) {
  (0, None)
}

which is correct for every iterator, but is not precise. More so, because implementation of it could return random value and because of “safe code must be safe” rule, it can’t be used to build a vector safely from it without additional checks.

However, there is still hope - TrustedLen

TrustedLen is an unsafe trait, but unsafe doesn’t mean it will erase hard drive (although it can), unsafe often means that the compiler can’t prove if the code is correct, so programmer has to do it instead.

I am sure that implementation is correct,

#![feature(trusted_len)]

struct FibIter {
  x: [u32; 2],
}

impl FibIter {
  fn new() -> Self {
    Self{ x: [1u32; 2] }
  }
}

impl Iterator for FibIter {
  type Item = u32;
  
  fn next(&mut self) -> Option<Self::Item> {
    let ret = self.x[0];
    self.x[0] = self.x[1];
    self.x[1] = ret.wrapping_add(self.x[0]);
    return Some(ret);
  }
  
  fn size_hint(&self) -> (usize, Option<usize>) {
    (usize::MAX, None)
  }
}

use std::iter::{
    FusedIterator,
    ExactSizeIterator,
    TrustedLen,
};

impl FusedIterator for FibIter {}
impl ExactSizeIterator for FibIter {}
unsafe impl TrustedLen for FibIter {}

however, the proof of correctness of this code is trivial and left for reader as an exercise.

Now the loop of this function

pub fn collect_fib(n: usize) -> Vec<u32> {
  FibIter::new().take(n).collect()
}

looks like this

.LBB3_11:
        mov     dword ptr [rsi], edi
        add     edi, ebx
        mov     dword ptr [rsi + 4], ebx
        add     ebx, edi
        mov     dword ptr [rsi + 8], edi
        add     edi, ebx
        mov     dword ptr [rsi + 12], ebx
        add     ebx, edi
        mov     dword ptr [rsi + 16], edi
        add     edi, ebx
        mov     dword ptr [rsi + 20], ebx
        add     ebx, edi
        mov     dword ptr [rsi + 24], edi
        add     edi, ebx
        mov     dword ptr [rsi + 28], ebx
        add     ebx, edi
        add     rsi, 32
        add     rax, 8
        jne     .LBB3_11

No reserve(), no additional checks, loop unrolling, pure beauty! It has only one drawback - it compiles only on nightly compiler because of trusted_len feature

Thanks to kind and helpful people on unofficial rust discord channel that helped me to solve this issue. If you are interested in Rust and you are not on this server, you should definetely check out.

And if you want to dive deeper into Rust iterators check out this youtube video