Incrementing Iterators in C++
C++ inherits from C two increment operators: suffix (i++) and prefix (++i). The difference between the two is whether the value of the expression is that of the incremented variable before the increment (suffix version) or after (prefix version). Practically speaking, though, it's fairly rare (at least in my experience) to see either operator used within a larger expression; instead they are frequently used by themselves as the counting expression in a for loop:
for(int i=0; i!=n; i++)
foo(i);
A common case in C++ is that the loop induction variable is actually an iterator object:
std::vector<int> v;
//somehow populate v
for(std::vector<int>::iterator it=v.begin(), end=v.end();
it!=end; it++)
//do something with *it
In this case there is still no semantic difference between using it++ and ++it, but a common fear is that there will be a detectable difference in runtime performance, since the suffix increment operator now involves constructing a new iterator object which is a copy of the existing one, doing the increment (whatever that may entail), and the returning the copy. This potentially introduces a lot of extra code, so it's reasonable to be unsure whether one's compiler will be capable of figuring out that only the operations for the actual increment are needed and eliminate the rest. Therefore, the conventional wisdom is that when using iterators one should always favor the prefix increment form1.
Uncertainty doesn't seem, however, a particularly good basis for a rule. It would be better, surely, to actually measure whether there is a meaningful difference in the code generated from the two increment operators. Let's try it out with a program like this (iterate.cpp):
#include <iostream>
#include <deque>
#include <sys/time.h>
volatile int dummy=0;
int main(){
typedef std::deque<int> container_type;
unsigned int nIter=2000;
container_type v;
timeval start,end;
for(int i=0; i<10000; i++)
v.push_back(i);
//suffix version
gettimeofday (&start, NULL);
for(unsigned int i=0; i<nIter; i++){
container_type::const_iterator it=v.begin(), end=v.end();
for(;it!=end;it++)
dummy+=*it;
}
gettimeofday (&end, NULL);
uint64_t microseconds = (end.tv_sec*1e6 + end.tv_usec)-(start.tv_sec*1e6 + start.tv_usec);
std::cout << "suffix: " << microseconds << " µs" << std::endl;
//prefix version
gettimeofday (&start, NULL);
for(unsigned int i=0; i<nIter; i++){
container_type::const_iterator it=v.begin(), end=v.end();
for(;it!=end;++it)
dummy+=*it;
}
gettimeofday (&end, NULL);
microseconds = (end.tv_sec*1e6 + end.tv_usec)-(start.tv_sec*1e6 + start.tv_usec);
std::cout << "prefix: " << microseconds << " µs" << std::endl;
//subscript with suffix
gettimeofday (&start, NULL);
for(unsigned int i=0; i<nIter; i++){
unsigned int j=0, max=v.size();
for(;j!=max;j++)
dummy+=v[j];
}
gettimeofday (&end, NULL);
microseconds = (end.tv_sec*1e6 + end.tv_usec)-(start.tv_sec*1e6 + start.tv_usec);
std::cout << "[] with suffix: " << microseconds << " µs" << std::endl;
//subscript with prefix
gettimeofday (&start, NULL);
for(unsigned int i=0; i<nIter; ++i){
unsigned int j=0, max=v.size();
for(;j!=max;j++)
dummy+=v[j];
}
gettimeofday (&end, NULL);
microseconds = (end.tv_sec*1e6 + end.tv_usec)-(start.tv_sec*1e6 + start.tv_usec);
std::cout << "[] with prefix: " << microseconds << " µs" << std::endl;
return(0);
}
Sorry about the length, but I wanted to also include using the subscript operator as well, for a total of four methods tested: iterators with prefix increment, iterators with suffix increment, subscripting with a prefix incremented integer, and subscripting with a suffix incremented integer. Obviously the compiler should have no trouble getting the latter two cases to perform the same, also it's interesting to compare the iterator methods with subscripts. In this case I deliberately chose std::deque as the container over which to iterate since it is a more complicated data structure (than, say, std::vector) and so its iterators probably can't just be typedef'ed pointers and entail some real work to construct and increment. I ran this program compiled with different compilers and different optimization levels using the following script:
#!/bin/sh
for compiler in g++ clang++
do
$compiler --version
for opt in 0 1 2 3 s
do
echo O$opt:
$compiler -O$opt -o iterate iterate.cpp
./iterate
done
echo
done
and I got the following results:
i686-apple-darwin10-g++-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
O0:
suffix: 643554 µs
prefix: 292411 µs
[] with suffix: 2314753 µs
[] with prefix: 2315917 µs
O1:
suffix: 49213 µs
prefix: 78978 µs
[] with suffix: 105684 µs
[] with prefix: 95365 µs
O2:
suffix: 49733 µs
prefix: 50519 µs
[] with suffix: 70027 µs
[] with prefix: 67773 µs
O3:
suffix: 49607 µs
prefix: 49721 µs
[] with suffix: 78863 µs
[] with prefix: 77730 µs
Os:
suffix: 49711 µs
prefix: 52426 µs
[] with suffix: 235492 µs
[] with prefix: 224849 µs
clang version 3.0 (trunk 132379)
Target: x86_64-apple-darwin10.6.0
Thread model: posix
O0:
suffix: 480877 µs
prefix: 329351 µs
[] with suffix: 2642234 µs
[] with prefix: 2623845 µs
O1:
suffix: 342572 µs
prefix: 245188 µs
[] with suffix: 1890256 µs
[] with prefix: 1888221 µs
O2:
suffix: 49051 µs
prefix: 50030 µs
[] with suffix: 69217 µs
[] with prefix: 68275 µs
O3:
suffix: 49109 µs
prefix: 49505 µs
[] with suffix: 69709 µs
[] with prefix: 68014 µs
Os:
suffix: 50275 µs
prefix: 49196 µs
[] with suffix: 156027 µs
[] with prefix: 155124 µs
There are some fluctuations here since the sample size isn't terribly large and I didn't go to great lengths to stop other processes using the CPU while I was measuring, but the major question is clearly answered: at reasonable optimization levels (-O2, -Os) both (my relatively old version of) gcc and clang/llvm are smart enough to make suffix incremented operators just as efficient as prefix incremented. Other observations include the facts that the form of the increment is irrelevant for integer counters (as expected), and that for a somewhat complicated data structure like std::deque2 using iterators really is more efficient than the overloaded subscript operator. (A side observation is that there seems to be more of a gradation in clang's optimization levels, since it made some but not all of the possible improvement at -O1, whereas gcc gave essentially the same results for all of its optimization levels.)
It seems well justified, then, to stop worrying about performance differences from the two styles, and just use the suffix increment3.
-
This leads to pesky jokes about C++ being a language in which the operator appearing in its name can't actually be used. ↩
-
I did this same test using std::vector, and as expected there was no meaningful difference in speed (with optimization) between iterators and subscripting. (There was a minor exception; at -O3 gcc made some sort of poor choice which made the subscript versions slightly slower, both than the terator versions and than themselves at -O1 and -O2.) ↩
-
Because it looks nicer. And while you're at it, make sure that you're indenting using tabs. ↩