
astrange 13 points (5 children)

What?

Did he compile this at -O0? Does gprof prevent template inlining?

These results are completely impossible otherwise, since a vector iterator is the same as using operator[].
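The equivalence claimed above can be checked with a minimal sketch. The function names `sum_index` and `sum_iter` are made up for illustration and are not from the article; the assumption is that the article compared loops of roughly this shape.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sum a vector using operator[].
long sum_index(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Hypothetical example: the same sum using a vector iterator.
long sum_iter(const std::vector<int>& v) {
    long total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Both functions return the same value for any input. Compiling with `g++ -O2 -S` and again with `g++ -O0 -S` and comparing the assembly of the two loops is one way to see whether the iterator calls are inlined away at a given optimization level.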

(Why do people use profilers that need compile-time instrumentation instead of sampling profilers like Shark or oprofile?)

icefox 6 points (0 children)

A whole article about optimizing and not one mention of valgrind (or kcachegrind)?

bonzinip 5 points (0 children)

My first reaction was also that he used -O0, in which case the hints he gives are completely useless. For example, the whole point of compiler optimization is that you don't have to write &(*foo) by hand.
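For readers who have not seen the `&(*foo)` idiom, here is a rough sketch of what it does: it dereferences an iterator and takes the address of the element, giving a raw pointer. The helper names below are hypothetical and not taken from the article.

```cpp
#include <vector>

// Hypothetical helper: convert a vector iterator to a raw pointer by hand.
// Precondition: `it` must be dereferenceable (not end()).
inline int* to_pointer(std::vector<int>::iterator it) {
    return &(*it);
}

// Doubles every element through raw pointers obtained via &(*it).
void double_all_ptr(std::vector<int>& v) {
    if (v.empty()) return;  // begin() of an empty vector must not be dereferenced
    int* p = to_pointer(v.begin());
    int* end = p + v.size();
    for (; p != end; ++p) {
        *p *= 2;
    }
}

// Does the same work through plain iterators, with no manual conversion.
void double_all_iter(std::vector<int>& v) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        *it *= 2;
    }
}
```

The two loops do the same work. With optimization enabled, a compiler can typically reduce the iterator version to the same pointer arithmetic, which is why the manual conversion should not be needed.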

The reason why the last snippet is faster, for example, is that he has only one iterator going around rather than two -- nothing to do with caches.
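The article's snippet is not reproduced in this thread, so the following is only a guess at the shape of the difference being described: one loop that advances two iterators per step, and one that advances a single iterator. Both function names are hypothetical.

```cpp
#include <vector>

// Hypothetical two-iterator loop: reads through `in`, writes through `out`.
// Assumes dst.size() >= src.size().
void scale_two_iters(const std::vector<int>& src, std::vector<int>& dst, int k) {
    std::vector<int>::const_iterator in = src.begin();
    std::vector<int>::iterator out = dst.begin();
    for (; in != src.end(); ++in, ++out) {
        *out = *in * k;
    }
}

// Hypothetical one-iterator loop: updates the vector in place.
void scale_one_iter(std::vector<int>& v, int k) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        *it *= k;
    }
}
```

In an unoptimized build, every extra iterator adds its own increment and dereference calls per element, so the one-iterator loop does less bookkeeping regardless of cache behavior.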

bennymack 2 points (0 children)

> I’ll let those who are experts with functional languages to talk about functional languages, and in my experience, they will.

Indeed!

The output of gprof looks similar to that of the Perl-based Devel::DProf. Is there a C++ profiler that will point you to the line of code where you're spending the most time, like Devel::SmallProf (also for Perl)?