This is the reference for my current simulations: hasenbusch2011.pdf
According to Appendix A.2, about 2.5e8 measurements on a 64^3 lattice took less than 4 CPU years on a single core of a Quad-Core AMD Opteron Processor 2378 running at 2.4 GHz. Let's say it was one year for this lattice size:
nemcs = 2.5e8, hours = 365*24 = 8760 -> nemcs/hour ≈ 28,500
Our timing results on the new part of the ITPA cluster (without hyper-threading) are:
nemcs = 10,000, hours = 9.1 -> nemcs/hour ≈ 1,100
That is about a factor of 26 slower ...
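
To make the comparison easy to redo with other numbers, here is a minimal Python sketch of the same arithmetic. The function names (rate, slowdown) are made up for illustration. The share of the "less than 4 CPU years" spent on the 64^3 lattice is a guess, so it is left as a parameter.

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def rate(nemcs, hours):
    """Throughput in measurements (nemcs) per hour."""
    return nemcs / hours


def slowdown(ref_cpu_years, ref_nemcs=2.5e8, our_nemcs=10_000, our_hours=9.1):
    """Reference throughput divided by ours.

    ref_cpu_years is an ASSUMPTION: Appendix A.2 only gives an upper
    bound of 4 CPU years, not the time spent on the 64^3 lattice alone.
    """
    ref_rate = rate(ref_nemcs, ref_cpu_years * HOURS_PER_YEAR)
    our_rate = rate(our_nemcs, our_hours)
    return ref_rate / our_rate


if __name__ == "__main__":
    print(f"reference: {rate(2.5e8, HOURS_PER_YEAR):.0f} nemcs/hour")
    print(f"ours:      {rate(10_000, 9.1):.0f} nemcs/hour")
    for years in (1.0, 4.0):
        print(f"assumed {years:.0f} CPU year(s): {slowdown(years):.1f}x slower")
```

With the one-year guess the factor comes out at about 26. If the reference run used the full 4 CPU years, the factor would be only about 6.5, so the real gap depends strongly on that assumption.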