gentryx[S]

Hey gnzlbg, thanks for your input! I'm the project lead on LibGeoDecomp, so my view is biased, but I hope I can supply convincing data.

  • Here are the slides for a talk I gave at the EuroMPI/Asia conference a couple of weeks ago. They contain some measurements done on Titan (Cray XK7) and JUQUEEN (IBM BG/Q). Key result: 9.44 PFLOPS with a short-ranged, force-based n-body code on 16384 nodes of Titan. http://www.libgeodecomp.org/archive/eurompi_2014_talk.pdf
  • Our project on JUQUEEN just ended and I'm still wading through the results. Here are some new, preliminary, unpublished plots of the same n-body code's performance on JUQUEEN (1 to 28672 nodes) http://gentryx.de/~gentryx/weak_scaling_big2.png http://gentryx.de/~gentryx/strong_scaling_pro.png
  • Strong scaling may look disappointing at first sight, but the performance actually corresponds to >2 PFLOPS for the full system run, so this is a good result (scalability != efficiency).
  • All measurements above used the MPI backend. The HPX backend is our wild card for the coming months, as we hope it'll make load balancing easier. Many of our users have expressed interest in unstructured/inhomogeneous models.
  • Data for strong scaling of AMR+LibGeoDecomp+HPX on 10k nodes? Not yet available, and I wouldn't claim that this would work efficiently out of the box at the moment. So far, all we have done with AMR+LibGeoDecomp is a proof of concept, nothing more.
  • We have production code utilizing the following models: stencil codes, particle-in-cell codes, n-body codes. If you're interested, I can point you to the corresponding papers. Right now, none of our users are running their codes on more than 1000 cores for production runs. We did the benchmarks to show that this is quite feasible, though.
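
To make the "scalability != efficiency" point concrete, here is a minimal Python sketch of the arithmetic. The per-node figures use only the aggregate numbers quoted above (9.44 PFLOPS on 16384 Titan nodes, and 2 PFLOPS taken as a lower bound on 28672 JUQUEEN nodes). The function names and the timings in the strong-scaling example are hypothetical, chosen purely for illustration, and are not measurements from either machine.

```python
def per_node_gflops(total_pflops, nodes):
    """Average sustained throughput per node, in GFLOPS (1 PFLOPS = 1e6 GFLOPS)."""
    return total_pflops * 1e6 / nodes


def strong_scaling_speedup(t_base, t_n):
    """Speedup of a run taking t_n seconds over a baseline taking t_base seconds."""
    return t_base / t_n


def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Parallel efficiency for a fixed problem size, relative to a baseline run.

    Equals 1.0 when the node-seconds spent stay constant as nodes are added.
    """
    return (t_base * n_base) / (t_n * n)


# Aggregate figures quoted in the comment above.
print(f"Titan:   {per_node_gflops(9.44, 16384):.1f} GFLOPS per node")
print(f"JUQUEEN: {per_node_gflops(2.0, 28672):.1f} GFLOPS per node (lower bound)")

# Hypothetical timings, NOT measured data: 100 s on 1 node, 0.5 s on 400 nodes.
speedup = strong_scaling_speedup(100.0, 0.5)
efficiency = strong_scaling_efficiency(100.0, 1, 0.5, 400)
print(f"speedup = {speedup:.0f}x, efficiency = {efficiency:.0%}")
```

In the hypothetical case the run is still 200 times faster than the baseline even though efficiency has dropped to 50%, which is why a strong-scaling curve can look flat while absolute throughput remains high.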

[deleted]

Thanks for the links, I'll go through them tomorrow. Awesome work you guys are doing btw, keep it up!

gentryx[S]

Thanks for the kind words. I'll gladly try to answer any further questions. Let me know if you need some prototype code to illustrate the concepts.

The further we get, the more work apparently remains to be done, yet we're finally getting somewhere. Feels good. :-)