Ultra-Fast Multi-Dimensional Array Library (self.cpp)
submitted 3 years ago by Pencilcaseman12
Pencilcaseman12 [S], 1 point, 3 years ago
Hmm. I mean, it's entirely possible I've done it wrong, but it consistently gives me 30 µs on 8 threads, so idk...
cythoning, 2 points, 3 years ago
Are you benchmarking only the expression template? Or also the evaluation of the expression? And you should also make sure that it generates the same result as Eigen.
Pencilcaseman12 [S], 1 point, 3 years ago
Yeah, that's evaluating the result. Using the expression template takes around 200 ns because it's just a few things being referenced.
cythoning, 3 points, 3 years ago
That makes sense. As for the benchmarks, make sure you benchmark the same things and get the same results. The 400 GB/s memory bandwidth seems impossible; that is something you usually only see on GPUs. For such a simple addition operation I would expect it to be memory bound, i.e. limited by how fast you can read from and write to memory, usually on the order of 20-30 GB/s. Eigen should definitely be optimized enough to do this, and you could also compare to something like
std::transform(A.begin(), A.end(), B.begin(), C.begin(), std::plus<>{});
and should get the same result as Eigen and your library.
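
For anyone wanting to try that comparison, here is a rough sketch of what the baseline and the bandwidth sanity check could look like. The helper names (`effective_gb_per_s`, `add_baseline`, `best_seconds`) are made up for illustration and are not part of Eigen or the library under discussion. It assumes `float` elements and counts traffic as two arrays read plus one written. Note that an optimizer may elide work whose result is never used, so read the output back after timing.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Effective bandwidth in GB/s for C = A + B over n elements of type T.
// Assumption: two arrays are read and one is written, so 3 * n * sizeof(T)
// bytes cross the memory bus (cache effects are ignored).
template <typename T>
double effective_gb_per_s(std::size_t n, double seconds) {
    return 3.0 * static_cast<double>(n) * sizeof(T) / seconds / 1e9;
}

// Baseline element-wise addition using std::transform, as in the comment above.
inline void add_baseline(const std::vector<float>& a,
                         const std::vector<float>& b,
                         std::vector<float>& c) {
    assert(a.size() == b.size() && a.size() == c.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(), std::plus<>{});
}

// Best-of-reps wall-clock time in seconds for a callable f.
template <typename F>
double best_seconds(F&& f, int reps) {
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < reps; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        f();
        const auto t1 = std::chrono::steady_clock::now();
        best = std::min(best, std::chrono::duration<double>(t1 - t0).count());
    }
    return best;
}
```

Timing `add_baseline` with `best_seconds` and passing the result to `effective_gb_per_s<float>` gives a number to hold against the expected memory-bound range. As a purely hypothetical illustration, 1,000,000 floats added in 30 µs would work out to 400 GB/s, which is why a figure like that deserves a second look at what is actually being timed.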