Make it faster
As is often the case, it would be desirable to make this code faster. There is not much to be gained, since even for the small test cases around 30%-50% of the run time is already spent in LAPACK routines. The overall performance of the code is therefore largely determined by the underlying LAPACK implementation. Still, for the smaller system sizes some benefit can be obtained. Here are some results obtained on my laptop using the (awful...) reference LAPACK implementation:
         old                 new
Ising:   3.586 +- 0.10 s     3.2333 +- 0.0078 s
Hub:     339.5 s             340.9 s
Kondo:   38.18 +- 0.13 s     34.377 +- 0.093 s
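
To put the timings above into perspective, the following is a minimal sketch that turns them into speedup factors (old time divided by new time). It is not part of the code base. The helper name `speedup` is made up for this illustration, and the error propagation assumes that the quoted "+-" values are independent one-sigma uncertainties, which the table itself does not state.

```python
import math

# Timings in seconds, copied from the table above: (old, old_err, new, new_err).
# None marks a run for which no uncertainty was reported.
timings = {
    "Ising": (3.586, 0.10, 3.2333, 0.0078),
    "Hub": (339.5, None, 340.9, None),
    "Kondo": (38.18, 0.13, 34.377, 0.093),
}


def speedup(old, old_err, new, new_err):
    """Return (old / new, propagated uncertainty or None).

    Assumption: the two uncertainties are independent, so the relative
    errors are added in quadrature.
    """
    ratio = old / new
    if old_err is None or new_err is None:
        return ratio, None
    relative_error = math.hypot(old_err / old, new_err / new)
    return ratio, ratio * relative_error


results = {name: speedup(*values) for name, values in timings.items()}

for name, (ratio, err) in results.items():
    if err is None:
        print(f"{name:6s} speedup = {ratio:.3f}")
    else:
        print(f"{name:6s} speedup = {ratio:.3f} +- {err:.3f}")
```

With these numbers the Ising and Kondo runs come out at a speedup of roughly 1.11, while the Hub run is unchanged to within about half a percent, consistent with the statement that only the smaller systems benefit.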