Improve SVD
The SVD routine currently uses the plain general SVD (ZGESVD) from LAPACK. A faster alternative is available in ZGESDD, which uses the divide-and-conquer algorithm. Sadly, the accuracy of this algorithm is limited for small singular values. Denoting the largest singular value by S and the machine epsilon by eps, singular values below S*eps are just set to S*eps. Here is an example comparing the two algorithms for a matrix occurring in the Kondo_Honey test (gfortran + OpenBLAS):
SVD: 29358741009046640. 4371337284967.0049 153386145832.22366 140027571558.44540 27442244878.419041 10174293407.414543 4964951867.1745338 251384.78330092729 146527.34515982482 23223.133927646668 12441.956067787451 3155.7699166069851 778.15633911220743 154.66496181881612 37.982212181611324 11.530438290939198 7.7547070117216936 2.2351274524290323 1.0948194029243250 1.0042513253612206 0.99704132205978235 0.99435210821385356 0.98869665168721543 0.98182901762230468 0.96243719159489505 0.94558520505886201 0.92263647632343304 0.89813001016344507 0.83939399652998081 0.79299370940335534 0.73791996973963436 0.66886534876659343 0.53457362404518893 0.33111705654936113 0.22525847213823211 0.16682721525439592
SDD: 29358741009046640. 4371337284967.0059 153386145832.22369 140027571558.44540 27442244878.419018 10174293407.414539 4964951867.1745367 251384.78330092729 146527.34515982488 23223.133927646679 12441.956067787452 3155.7699166069842 778.15633911220732 154.66496181969327 37.994077800173180 11.534388356890515 7.5592855613432492 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.9335275108774077 2.6568125749348002 2.3547466794264533
Note that the large singular values are reproduced correctly, but as soon as the singular values approach ~3 ≈ 29358741009046640 * 10^(-16), SDD essentially returns the same value over and over. This is not necessarily by design of the algorithm, but seems to be how the implementations behave: https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/520683

So for now nothing can be improved, except maybe the calculation of the workspace (tracked in #19 (closed)).

For the future, it remains to be seen whether an adaptation of the very successful MR^3 algorithm to the SVD can be found (the theory looks promising: http://etna.mcs.kent.edu/vol.39.2012/pp1-21.dir/pp1-21.pdf) and whether an implementation using that algorithm will become available.
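
The floor can be checked directly against the numbers posted above. The following is a minimal sketch in plain Python (not part of the Fortran code base). It takes the largest singular value, the value ZGESDD repeats, and a few (ZGESVD, ZGESDD) pairs copied from the tails of the two lists. The variable names are made up for illustration, and the ZGESVD column is assumed to be the more accurate one. It prints the ratio of the repeated value to S next to the IEEE double epsilon, and shows that the absolute deviation stays below the repeated value while the relative deviation grows as the singular values shrink.

```python
import sys

# Largest singular value; ZGESVD and ZGESDD agree on it.
S = 29358741009046640.0
# Value that the ZGESDD output repeats in its tail.
plateau = 2.9335275108774077

# (ZGESVD, ZGESDD) pairs copied from the two lists above.
# Assumption: the ZGESVD value is treated as the reference.
pairs = [
    (778.15633911220743, 778.15633911220732),
    (154.66496181881612, 154.66496181969327),
    (37.982212181611324, 37.994077800173180),
    (11.530438290939198, 11.534388356890515),
    (7.7547070117216936, 7.5592855613432492),
    (2.2351274524290323, 2.9335275108774077),
    (1.0948194029243250, 2.9335275108774077),
]

eps = sys.float_info.epsilon  # 2**-52 for IEEE double precision
ratio = plateau / S

# Absolute and relative deviation of ZGESDD from ZGESVD for each pair.
abs_errs = [abs(sdd - svd) for svd, sdd in pairs]
rel_errs = [err / svd for err, (svd, _) in zip(abs_errs, pairs)]

print(f"plateau / S = {ratio:.3e}  (double eps = {eps:.3e})")
for (svd, sdd), a, r in zip(pairs, abs_errs, rel_errs):
    print(f"svd={svd:.6e}  sdd={sdd:.6e}  abs={a:.2e}  rel={r:.2e}")
```

In this sample the absolute error never exceeds the repeated value, which fits the picture of an accuracy bound that scales with S rather than with each singular value.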