ALF issues
https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues
2021-11-22T13:45:01Z

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/195
Automatic computation of Hopping_Matrix_Type%Multiplicity
2021-11-22T13:45:01Z, Francesco Parisen Toldin

The field Hopping_Matrix_Type%Multiplicity contains the number of times a given orbital appears in the list of bonds. Therefore it could be deduced from Hopping_Matrix_Type%List, which contains the bonds emanating from a unit cell.
It seems a good idea to have it computed automatically instead of relying on the user to enter it, in particular if the user defines a new set of hoppings with checkerboard decomposition. This would ensure consistency and avoid bugs that are potentially difficult to detect.
The algorithm to compute it is simple: a single loop over Hopping_Matrix_Type%List.
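As a rough illustration of that loop, here is a minimal sketch. It assumes that each row of the list describes one bond and that columns 1 and 2 hold the two orbitals the bond connects. The plain integer array `list`, the sizes, and the sample bonds are hypothetical stand-ins, and the actual layout of Hopping_Matrix_Type%List may differ.

```fortran
! Sketch only: `list` stands in for Hopping_Matrix_Type%List. The layout
! (one row per bond, columns 1 and 2 = the two orbitals) is an assumption.
Program multiplicity_sketch
  Implicit none
  Integer, parameter :: n_orbitals = 2, n_bonds = 3
  Integer :: list(n_bonds, 2)
  Integer :: multiplicity(n_orbitals)

  ! Hypothetical bonds emanating from one unit cell with two orbitals
  list(1, :) = [1, 2]
  list(2, :) = [2, 1]
  list(3, :) = [1, 2]

  Call Compute_Multiplicity(list, multiplicity)
  Print *, multiplicity

Contains

  ! Count how often each orbital occurs as an end point of a bond
  Subroutine Compute_Multiplicity(list, multiplicity)
    Integer, intent(IN)  :: list(:, :)
    Integer, intent(OUT) :: multiplicity(:)
    Integer :: nb, col, no

    multiplicity = 0
    Do nb = 1, size(list, 1)
      Do col = 1, 2
        no = list(nb, col)
        multiplicity(no) = multiplicity(no) + 1
      End do
    End do
  End Subroutine Compute_Multiplicity

End Program multiplicity_sketch
```

The cost is linear in the number of bonds, so recomputing it is cheap unless it ends up inside a hot loop.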
Concerning the design, I see some possibilities:
1. Define a procedure like this
```
Subroutine Set_Multiplicity(this)
  Implicit none
  Type(Hopping_Matrix_Type), intent(INOUT) :: this
End Subroutine Set_Multiplicity
```
Pros: simplicity
Cons: again, we rely on the user to call it, and we have to document it properly.
2. Considering that, as far as I can see, Hopping_Matrix_Type%Multiplicity is only used in Predefined_Hoppings_set_OPT, we could remove altogether the field Hopping_Matrix_Type%Multiplicity and define a function that computes it, something like
```
Subroutine Compute_Multiplicity(hop_matrix, multiplicity)
  Implicit none
  Type(Hopping_Matrix_Type), intent(IN) :: hop_matrix
  Integer, intent(OUT) :: multiplicity(:)
End Subroutine Compute_Multiplicity
```
Pros: we remove completely the possibility of errors in Multiplicity. No need to document it.
Cons: at the moment Multiplicity is used in only one subroutine, presumably called only once. Should we use it in more routines, we may lose efficiency by recomputing it every time.
3. Let the object compute Multiplicity the first time it is needed, and cache the result. This can be achieved with a method of Hopping_Matrix_Type like this
```
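Function Multiplicity(this, n)
  Implicit none
  Class(Hopping_Matrix_Type), intent(INOUT) :: this
  Integer, intent(IN) :: n
  Real (Kind=Kind(0.d0)) :: Multiplicity
  ! Compute the multiplicities on first use, then serve them from the cache
  If (.not. this%is_cache_valid) then
    Call Set_Multiplicity(this)
    this%is_cache_valid = .true.
  End if
  Multiplicity = this%m_Multiplicity(n)
End Function Multiplicity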
```
Here, is_cache_valid is a boolean that signals if multiplicity has been computed. m_Multiplicity is where the data are stored. Both should be, I think, private variables.
Pros: This can be regarded as an "internal" method that we do not necessarily need to document. Good efficiency. We hide the actual implementation of Multiplicity without changing the syntax: Multiplicity(n) still gives the multiplicity of the n-th orbital.
Cons: Perhaps slightly more complex in the logic.
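For what it is worth, Fortran 2003 supports this pattern through derived types with private components and type-bound procedures. The following is a minimal sketch, not the ALF implementation. The type `Hopping_Sketch_Type`, the setter `Set_List`, the layout of `List`, and the integer return type are all assumptions made for illustration.

```fortran
! Sketch only: a stripped-down stand-in for Hopping_Matrix_Type. The
! component names, the layout of `List`, and the setter are assumptions.
Module Hopping_Sketch_mod
  Implicit none
  Private
  Public :: Hopping_Sketch_Type

  Type :: Hopping_Sketch_Type
    Private
    Integer, allocatable :: List(:, :)          ! row = bond, columns 1:2 = orbitals
    Integer, allocatable :: m_Multiplicity(:)   ! cached result
    Logical :: is_cache_valid = .false.
  Contains
    Procedure :: Set_List
    Procedure :: Multiplicity
  End Type Hopping_Sketch_Type

Contains

  ! Store a new bond list; any cached multiplicities are now stale.
  Subroutine Set_List(this, list, n_orbitals)
    Class(Hopping_Sketch_Type), intent(INOUT) :: this
    Integer, intent(IN) :: list(:, :), n_orbitals
    this%List = list
    If (allocated(this%m_Multiplicity)) deallocate(this%m_Multiplicity)
    Allocate(this%m_Multiplicity(n_orbitals))
    this%is_cache_valid = .false.
  End Subroutine Set_List

  ! Return the multiplicity of orbital n, computing all of them on first use.
  Function Multiplicity(this, n) result(m)
    Class(Hopping_Sketch_Type), intent(INOUT) :: this
    Integer, intent(IN) :: n
    Integer :: m, nb, col
    If (.not. this%is_cache_valid) then
      this%m_Multiplicity = 0
      Do nb = 1, size(this%List, 1)
        Do col = 1, 2
          this%m_Multiplicity(this%List(nb, col)) = &
            this%m_Multiplicity(this%List(nb, col)) + 1
        End do
      End do
      this%is_cache_valid = .true.
    End if
    m = this%m_Multiplicity(n)
  End Function Multiplicity

End Module Hopping_Sketch_mod

Program cache_sketch
  Use Hopping_Sketch_mod
  Implicit none
  Type(Hopping_Sketch_Type) :: hop
  Integer :: bonds(3, 2), m1, m2

  ! Hypothetical bonds between two orbitals
  bonds(1, :) = [1, 2]
  bonds(2, :) = [2, 1]
  bonds(3, :) = [1, 2]
  Call hop%Set_List(bonds, 2)

  ! Same call syntax as an array element; the first call fills the cache.
  m1 = hop%Multiplicity(1)
  m2 = hop%Multiplicity(2)
  Print *, m1, m2
End Program cache_sketch
```

Because a component and a type-bound procedure cannot share a name, the public field would have to be removed or renamed, which is what hiding the data in m_Multiplicity amounts to. Note also that the function modifies its passed object, so it cannot be declared pure.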
I personally favor solution 3, but I am not entirely sure: does Fortran allow this kind of C++-like design?
Francesco Parisen Toldin

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/181
Reduce memory consumption on a node by sharing data between MPI tasks
2021-05-31T11:05:37Z, Johannes Hofmann

Some memory objects are independent of the rank of the MPI task within a tempering group that shares all parameters. Those can be shared between different MPI jobs on the same node and thus reduce the memory footprint of ALF.
Johannes Hofmann

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/82
Introduce Type 3 for continuous field operators
2019-09-25T12:45:19Z, Johannes Hofmann

First, we will upgrade the integer nsigma storage of the field (Ising and auxiliary) to double so that it can also accommodate a continuous-field implementation. This will also require an update of the operator module to handle the fields accordingly.
Second, the updates of the continuous fields using HMC have to be implemented.
Johannes Hofmann

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/59
work better with unitary matrices
2017-12-01T10:58:17Z, Florian Goth

Exploit the properties of some unitary matrices.
Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/75
Use a logscale for the QMC intrinsic scales
2017-11-01T18:40:54Z, Johannes Hofmann

I ran into double-precision overflow and underflow issues, which severely limited the range of beta*V for my model. This issue solves the problem as long as beta*V (or beta*bandwidth for weakly interacting Hubbard models) is smaller than ~10^300.
Johannes Hofmann

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/73
introduce generic function to multiply a small matrix onto a larger one
2017-10-30T18:43:33Z, Johannes Hofmann

During a discussion with Florian, we saw some potential benefit in introducing the following function:
ZSLMM(char side, char op, int N, int M, complex A(:,:), int P(:), complex B(:,:))
This function should multiply the (small) matrix op(A) [A; A^T; A^C] onto the subblock of the large matrix B specified by the indices in P(:), from the left (side='L') or from the right (side='R').
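A naive reference version of such a routine might look as follows. This is a sketch under assumptions the description above does not fix: op takes the BLAS-style values 'N', 'T' and 'C', A is N x N, P holds N distinct indices, and M is the extent of B along the untouched dimension. It uses vector subscripts and matmul for clarity, not speed.

```fortran
! Sketch only: argument meanings beyond the issue text are assumptions.
Program zslmm_sketch
  Implicit none
  Integer, parameter :: dp = kind(0.d0)
  Complex(dp) :: A(2, 2), B(3, 3)
  Integer :: P(2), i

  ! A exchanges the two selected rows; B starts as the 3x3 identity
  A = (0._dp, 0._dp)
  A(1, 2) = (1._dp, 0._dp)
  A(2, 1) = (1._dp, 0._dp)
  B = (0._dp, 0._dp)
  Do i = 1, 3
    B(i, i) = (1._dp, 0._dp)
  End do
  P = [1, 3]

  ! Acts on rows 1 and 3 of B only; row 2 is left untouched
  Call ZSLMM('L', 'N', 2, 3, A, P, B)
  Do i = 1, 3
    Print *, real(B(i, :))
  End do

Contains

  Subroutine ZSLMM(side, op, N, M, A, P, B)
    Character, intent(IN) :: side, op
    Integer, intent(IN) :: N, M          ! N = size of A, M = other extent of B (assumed)
    Complex(dp), intent(IN) :: A(:, :)
    Integer, intent(IN) :: P(:)          ! indices must be distinct
    Complex(dp), intent(INOUT) :: B(:, :)
    Complex(dp) :: opA(N, N)

    Select case (op)
    Case ('T')
      opA = transpose(A(1:N, 1:N))
    Case ('C')
      opA = conjg(transpose(A(1:N, 1:N)))
    Case default                         ! 'N': use A as is
      opA = A(1:N, 1:N)
    End select

    If (side == 'L') then
      ! Rows P(1:N) of B are replaced by op(A) times those rows
      B(P(1:N), 1:M) = matmul(opA, B(P(1:N), 1:M))
    Else
      ! Columns P(1:N) of B are replaced by those columns times op(A)
      B(1:M, P(1:N)) = matmul(B(1:M, P(1:N)), opA)
    End if
  End Subroutine ZSLMM

End Program zslmm_sketch
```

A single entry point like this would leave room to special-case small N or particular structures of A behind the same interface.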
This definition is general enough to treat (almost) every operator-matrix multiplication and also provides a well-defined area for optimizations/specializations.
Johannes Hofmann

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/65
Filter Branch 59
2017-10-19T09:56:34Z, Florian Goth

Branch 59 contains stuff that works and stuff that doesn't work.
The simple optimizations for the Hubbard model are among the good things, and the fiddling with the QR-related routines is among the things that don't make things go faster. We want to keep branch 59 for reference.
Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/58
Exploit the property that hopping matrices are hermitian
2017-07-04T14:29:18Z, Florian Goth

Better exploit the fact that hopping matrices and their exponentials are hermitian in hop_mod.f90.
Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/44
Lapack 3.7 released
2017-06-22T10:16:55Z, Florian Goth

LAPACK 3.7 got released: http://www.netlib.org/lapack/lapack-3.7.0.html
So at the very least we should update libqrref at some point.
Amongst the more notable changes are:
- changes to the interface of the QR factorizations
- improvements to the SVD
- improvements to the SEP.

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/60
Adding makefiles and job scripts for an efficient OpenMP+MPI usage on SuperMUC and JURECA
2017-06-13T10:09:32Z, Johannes Hofmann

I will add the proper makefile and job scripts which pin the threads for an efficient hybrid version.
Johannes Hofmann

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/38
Remove explicit constructions of the Q Matrix from the QR decomposition
2017-04-08T17:36:26Z, Florian Goth

Currently we explicitly construct the Q matrix to form the matrix product Q*A.
The explicit construction can be avoided by using LAPACK's ZUNMRQ. I also hope that we gain some stability.
Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/19
Make it faster
2017-03-29T16:17:55Z, Florian Goth

As is often the case, it would be desirable to make this code faster. There is not much to be gained, since already for the small test cases around 30%-50% of the time is spent in LAPACK routines. Therefore the performance of the code is a direct measure of the underlying LAPACK implementation.
Still, for the smaller system sizes some benefits can be obtained.
Some results obtained on my laptop using the (awful....) reference LAPACK implementation:
| Model | old | new |
|-------|-----|-----|
| Ising | 3.586 ± 0.10 s | 3.2333 ± 0.0078 s |
| Hub | 339.5 s | 340.9 s |
| Kondo | 38.18 ± 0.13 s | 34.377 ± 0.093 s |

Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/29
cgr2_2 contains the last Conjugate Transpose and the last explicit matrix inversion
2017-03-29T16:17:55Z, Florian Goth

In the file cgr2_2 we find the last conjugate transpose and the last explicit matrix inversion. Additionally we find some block structure that is not exploited.
Fakher says that this block structure is important for the stability but let's ignore him....
Florian Goth

https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/39
Some optimizations concerning lapack and copying
2017-03-29T16:17:55Z, Johannes Hofmann

In the current version, it is possible to replace zaxpy by zgemv in the code called when an Ising spin that couples to more than one DoF is flipped.
Also, at least the Intel compiler issues some unnecessary memory copies when it fails to analyse the consistency of the memory layout of a matrix, mostly for matrices passed as e.g. A(:,:,b,c) to zgemm. If the arrays were allocated (as they were), the slice is contiguous in memory, so this can be replaced by A(1,1,b,c) to avoid the additional temporary copies introduced by the compiler.
I think this should be fixed before the release, and I will provide the optimizations I have already implemented for the SPT_optimized version.
Johannes Hofmann
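To make the A(:,:,b,c) versus A(1,1,b,c) point concrete, here is a small self-contained sketch. The routine `scale_block` is a hypothetical stand-in for a BLAS-style routine such as zgemm, using the same kind of explicit-shape/assumed-size dummy argument. Whether a given compiler actually creates a temporary for the first call depends on the compiler and its options.

```fortran
! Sketch only: scale_block stands in for a BLAS-style routine such as zgemm;
! it is a hypothetical helper, not part of ALF or BLAS.
Program slice_sketch
  Implicit none
  Integer, parameter :: dp = kind(0.d0)
  Complex(dp), allocatable :: A(:, :, :, :)
  Integer :: n, b, c

  n = 2; b = 2; c = 3
  Allocate(A(n, n, 2, 3))
  A = (1._dp, 0._dp)

  ! Array section: the compiler may create a temporary copy if it cannot
  ! prove that the section is contiguous.
  Call scale_block(n, A(:, :, b, c), n)

  ! First element of the same block: only an address is passed
  ! (sequence association), so no temporary is needed.
  Call scale_block(n, A(1, 1, b, c), n)

  Print *, real(A(1, 1, b, c)), real(A(1, 1, 1, 1))

Contains

  ! Explicit-shape/assumed-size dummy, as in the BLAS interfaces
  Subroutine scale_block(n, X, ldx)
    Integer, intent(IN) :: n, ldx
    Complex(dp), intent(INOUT) :: X(ldx, *)
    X(1:n, 1:n) = 2 * X(1:n, 1:n)
  End Subroutine scale_block

End Program slice_sketch
```

Both calls act on the same n x n block and leave all other blocks untouched. The second form relies on the block being stored contiguously, which holds for a slice over the leading two dimensions of an array allocated as a whole.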