for loop extension

✍️ Written on 2024-04-24 in 1460 words.
Part of reflection cs software-development programming-languages

Motivation

A for loop is a generic concept in programming to iterate over a set of values. The concept of iteration with a while loop or for loop is common to all imperative programming languages. However, I often feel restricted by the capabilities of these loops and propose an extension.

Original idea

The original approach is to declare a loop counter with an initial value such as 0 and then repeat three steps: check, execute, and iterate. Here I am going to present the concept in C syntax. Check verifies that the loop counter has not yet reached a certain value (if it has, the loop terminates). Execute means that the “loop body” (any sequence of instructions) is run. Iterate means that the loop counter is modified (commonly: incremented):

for (int i = 0; i < 5; i++)
{
    printf("%d\n", i);
}

int i = 0 is the initial declaration of the loop counter i. Iterate is represented by i++ and check by i < 5. The loop body is given within curly braces by printf("%d\n", i);. It does not matter what it does. The total execution in this example is:

  1. i = 0

  2. check i < 5 (true)

  3. execute printf("%d\n", i);

  4. iterate i++ (1)

  5. check i < 5 (true)

  6. execute printf("%d\n", i);

  7. iterate i++ (2)

  8. check i < 5 (true)

  9. execute printf("%d\n", i);

  10. iterate i++ (3)

  11. check i < 5 (true)

  12. execute printf("%d\n", i);

  13. iterate i++ (4)

  14. check i < 5 (true)

  15. execute printf("%d\n", i);

  16. iterate i++ (5)

  17. check i < 5 (false)

For beginners, I always point out that the exact order of the steps matters. You need to be aware of when the check happens. Does it happen before the loop body runs for the first time? (Yes, it does.) Does it happen before or after the iterate step? (After.) Knowing this is necessary to reason about the behavior of programs.
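To make the order of the steps tangible, here is a minimal sketch that spells out the same kind of loop by hand and records every step. I write it in Go rather than C so that all later sketches in this post share one language. The function name traceLoop and the wording of the step labels are my own, not part of any library.

```go
package main

import "fmt"

// traceLoop spells out the loop "for i := 0; i < n; i++ { body }" by hand
// and returns every step in the order in which it happens.
// (traceLoop is a hypothetical helper for illustration.)
func traceLoop(n int) []string {
	var steps []string
	i := 0 // initial declaration of the loop counter
	steps = append(steps, fmt.Sprintf("init i = %d", i))
	for {
		ok := i < n // check: happens before every run of the body
		steps = append(steps, fmt.Sprintf("check %d < %d (%t)", i, n, ok))
		if !ok {
			break // the loop terminates once the check fails
		}
		steps = append(steps, fmt.Sprintf("execute body with i = %d", i))
		i++ // iterate: happens after the body
		steps = append(steps, fmt.Sprintf("iterate i = %d", i))
	}
	return steps
}

func main() {
	for _, step := range traceLoop(5) {
		fmt.Println(step)
	}
}
```

Calling traceLoop(5) yields 17 steps, the same count as the list above. Calling traceLoop(0) yields only the initialization and one failed check, which shows that the body may never run at all.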

Golang

As an alternative for loop design, I want to present Go:

values := []string{"hello", "world"}
for v, value := range values {
    fmt.Println(v)
    fmt.Println(value)
}

Here, we don’t have an explicit loop counter anymore. Instead, we have an index called v and an item called value. values is initialized as an ordered list of the two strings “hello” and “world”. In the first iteration, v is 0 and value is hello. In the next iteration, v is 1 and value is world.

With the range keyword, we don’t need a loop variable that we check and iterate ourselves. We can just iterate over the values in a collection. (Go also supports the three-part form known from C.) But just like in our previous example, we have a for keyword as well as curly braces to denote the loop body.
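To relate the two designs, the following sketch runs the same iteration twice: once with range and once with an explicit loop counter. The function names pairsWithRange and pairsWithCounter are hypothetical, and I call the index i here instead of v.

```go
package main

import "fmt"

// pairsWithRange collects "index:value" strings using range.
func pairsWithRange(values []string) []string {
	var out []string
	for i, value := range values {
		out = append(out, fmt.Sprintf("%d:%s", i, value))
	}
	return out
}

// pairsWithCounter does the same with an explicit loop counter
// that is checked and incremented by hand.
func pairsWithCounter(values []string) []string {
	var out []string
	for i := 0; i < len(values); i++ {
		out = append(out, fmt.Sprintf("%d:%s", i, values[i]))
	}
	return out
}

func main() {
	values := []string{"hello", "world"}
	fmt.Println(pairsWithRange(values))   // prints [0:hello 1:world]
	fmt.Println(pairsWithCounter(values)) // prints [0:hello 1:world]
}
```

Both functions return the same result for a slice. The range form simply hides the check and iterate steps.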

Jinja2

When I was studying Python deeply, I read the source code of the Jinja2 template engine. It has a feature for for loops that I enjoyed. It made me think more deeply about iteration back then, and I think the feature is very appropriate for template languages:

<table>
  <tr>
{% for column in table_row %}
  {%- if loop.first %}
    <th>{{ column }}</th>
  {%- else %}
    <td class="item-{{ loop.index }}-of-{{ loop.length }}">{{ column }}</td>
  {%- endif %}
{% endfor %}
  </tr>
</table>

You may not be familiar with the template syntax, but the point is that Jinja2’s for loop introduces a special variable loop which carries properties you can access during the iteration. For example, loop.index gives you a loop counter (like the one above, except that it starts at 1; loop.index0 starts at 0) and loop.length gives you the total number of iterations.

Overall, the line for column in table_row is short, and if you need more details about the current iteration, the loop variable is available.
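To show what such a loop variable carries, here is a rough Go sketch that mimics the template above. The type LoopInfo and the function renderRow are hypothetical names of mine, and the sketch skips whitespace handling and HTML escaping, which a real template engine would take care of.

```go
package main

import (
	"fmt"
	"strings"
)

// LoopInfo is a hypothetical stand-in for Jinja2's loop variable.
type LoopInfo struct {
	Index  int  // 1-based counter, like loop.index
	Length int  // total number of iterations, like loop.length
	First  bool // true in the first iteration, like loop.first
}

// renderRow mirrors the template: the first column becomes a header cell,
// every other column becomes a data cell with a position-dependent class.
func renderRow(columns []string) string {
	var sb strings.Builder
	for i, column := range columns {
		loop := LoopInfo{Index: i + 1, Length: len(columns), First: i == 0}
		if loop.First {
			fmt.Fprintf(&sb, "<th>%s</th>", column)
		} else {
			fmt.Fprintf(&sb, `<td class="item-%d-of-%d">%s</td>`, loop.Index, loop.Length, column)
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(renderRow([]string{"name", "Ada", "Grace"}))
	// prints <th>name</th><td class="item-2-of-3">Ada</td><td class="item-3-of-3">Grace</td>
}
```

Note that the loop information has to be assembled by hand in every iteration. Jinja2 provides it for free, which is exactly what makes the feature pleasant.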

A proposal

My proposal is to extend for loops in C-like languages for use cases similar to the ones found in Jinja2. It should be easy to introduce an additional keyword after the for loop body in a backwards-compatible way. Here is the design I have in mind, written in Go syntax:

values := []string{"hello", "world"}
for _, value := range values {
    fmt.Printf("%s", value)
}
inter {
    fmt.Printf(",")
}

With inter, I want to execute some block of code between iterations. Assume L denotes the loop body and I denotes the inter body. Then a loop with three iterations executes as LILIL: every I sits between two Ls. The example above therefore prints hello,world without a trailing comma.
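For comparison, this is roughly how the inter behavior has to be emulated in today's Go. It is a sketch under my own naming (joinWithInter is hypothetical), and it collects the output in a string instead of printing it so that the result is easy to check.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWithInter emulates the proposed inter block with an index check.
// (joinWithInter is a hypothetical helper for illustration.)
func joinWithInter(values []string) string {
	var sb strings.Builder
	for i, value := range values {
		if i > 0 {
			sb.WriteString(",") // inter body: runs only between two iterations
		}
		sb.WriteString(value) // loop body
	}
	return sb.String()
}

func main() {
	fmt.Println(joinWithInter([]string{"hello", "world"})) // prints hello,world
}
```

The check i > 0 is small, but it mixes the separator logic into the loop body. The inter block would move it out again.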

We can take this further similar to Jinja2:

values := []string{"hello", "world"}
for _, value := range values {
    fmt.Printf("%s", value)
}
inter {
    fmt.Printf(",")
}
before {
    fmt.Printf("[")
}
after {
    fmt.Println("]")
}

I think the property loop.first is fine for Jinja2, but before is more appropriate in my design. The point here is that the before body runs once before the first loop iteration and the after body runs once after the final loop iteration. Why not just put this code before and after the regular for loop? The idea is that the before and after blocks are skipped if no iteration actually happens. This makes it easier to handle these special cases. Even more so, I wish for one more case:

values := []string{"hello", "world"}
for _, value := range values {
    fmt.Printf("%s", value)
}
inter {
    fmt.Printf(",")
}
before {
    fmt.Printf("[")
}
after {
    fmt.Println("]")
}
empty {
    fmt.Println("(nothing found)")
}

An empty body would be executed only if no iteration occurs.
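To pin down the intended semantics of all four blocks, here is how I would write the same behavior by hand in today's Go. This is only a sketch of one possible desugaring, not a specification. The function name render and the flag ran are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// render emulates a loop with before, inter, after, and empty blocks.
// (render is a hypothetical helper for illustration.)
func render(values []string) string {
	var sb strings.Builder
	ran := false // records whether at least one iteration has started
	for _, value := range values {
		if !ran {
			sb.WriteString("[") // before body: once, ahead of the first iteration
		} else {
			sb.WriteString(",") // inter body: between two iterations
		}
		ran = true
		sb.WriteString(value) // loop body
	}
	if ran {
		sb.WriteString("]\n") // after body: once, if any iteration happened
	} else {
		sb.WriteString("(nothing found)\n") // empty body: only without iterations
	}
	return sb.String()
}

func main() {
	fmt.Print(render([]string{"hello", "world"})) // prints [hello,world]
	fmt.Print(render(nil))                        // prints (nothing found)
}
```

I use a flag instead of comparing len(values) with 0 on purpose: the same pattern also works for sources whose length is not known in advance, such as channels. The amount of bookkeeping in this sketch is what the additional loop bodies would remove.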

In summary, we could introduce additional for-loop bodies whose semantics depend on the keyword that precedes each body. I also tried to convey a common use case for these loop bodies: generating a representation of a collection that includes wrapping symbols and separator characters between the elements. My recent use case was opstr. This proposal could be implemented across many programming languages with C-based syntax in a backwards-compatible way.

Conclusion

I proposed to extend the concept of for loops in order to address common iteration use cases better. The details would have to be worked out for every programming language individually.