ALF issues
Issue tracker: https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues (feed last updated 2019-09-25)

Issue #12: Print stabilization information after each bin (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/12)
Author: Fakher F. Assaad. Updated: 2019-09-25. Assignee: Fakher F. Assaad.
(No description.)

Issue #16: Fortran 2003 compatibility of Analysis tools (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/16)
Author: Florian Goth. Updated: 2019-09-25. Assignee: Jefferson Stafusa E. Portela.
As it stands, Max_SAC.f90 is not Fortran 2003 compliant, since it employs the system call. Possible solutions are to delete the respective statements or to use the respective compiler switches; or we simply ignore it, since the analysis tools are currently not built against the F2003 standard. The respective compiler warning is:
```
Warning: The intrinsic 'system' at (1) is not included in the selected standard but a GNU Fortran extension and 'system' will be treated as if declared EXTERNAL. Use an appropriate -std=* option or define -fall-intrinsics to allow this intrinsic.
```
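For reference: the standard-conforming replacement is the intrinsic execute_command_line, although it entered the language only with Fortran 2008, so it helps only if the analysis tools move to a newer standard. A minimal sketch (the command string is a placeholder):
```
program exec_sketch
  implicit none
  integer :: estat, cstat
  ! GNU extension, triggers the warning above under -std=f2003:
  !   call system("echo hello")
  ! Standard intrinsic since Fortran 2008:
  call execute_command_line("echo hello", exitstat=estat, cmdstat=cstat)
  if (cstat /= 0) stop "command could not be run"
  print *, "exit status of command:", estat
end program exec_sketch
```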
Issue #22: Save Stack space (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/22)
Author: Florian Goth. Updated: 2019-09-25. Assignee: Jefferson Stafusa E. Portela.
Arrays that are declared with their size in the declaration (i.e., without the allocatable attribute) are by default allocated on the stack. On systems that set the available stack space to unlimited (most supercomputing centres do so for compatibility reasons) this poses no issue. To improve compatibility with systems that limit the stack space, we should use allocatable arrays instead.
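A minimal sketch of the change (names hypothetical, not taken from the ALF sources):
```
subroutine stack_vs_heap(ndim)
  implicit none
  integer, intent(in) :: ndim
  ! Automatic array: sized on entry, lives on the stack and can
  ! overflow a limited stack for large ndim:
  !   complex (kind=kind(0.d0)) :: tmp(ndim, ndim)
  ! Allocatable array: lives on the heap, limited only by available memory:
  complex (kind=kind(0.d0)), allocatable :: tmp(:, :)
  allocate(tmp(ndim, ndim))
  tmp = cmplx(0.d0, 0.d0, kind=kind(0.d0))
  ! ... work with tmp ...
  deallocate(tmp)
end subroutine stack_vs_heap
```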
Issue #24: Ising Hamiltonian mostly only works for power of two square lattices (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/24)
Author: Florian Goth. Updated: 2021-12-06. Assignee: Fakher F. Assaad.
I have checked out the master branch and compiled the Ising model without optimizations with gfortran. For square lattices I get the following results:
2x2 runs
3x3 crash
4x4 runs
5x5 crash
6x6 runs
7x7 crash
8x8 works
9x9 crash
10x10 runs
Some debugging output follows:
For 3x3 I see this:
```
Program received signal SIGSEGV, Segmentation fault.
0x000055555557791e in wrapul (ntau1=100, ntau=90, ul=..., dl=..., vl=...) at wrapul.f90:37
37 X = Phi(nsigma(n,nt),Op_V(n,nf)%type)
(gdb) p n
$1 = 16
(gdb) p nt
$2 = 100
```
For 5x5 I see this:
```
in confin () at inconfc.f90:28
28 Allocate (Nsigma(Size(Op_V,1),Ltrot))
```
For 7x7 the debugger points to wrapul, but I think this trace is useless.
For 9x9 the debugger points here:
```
in hamiltonian::ham_v () at Hamiltonian_Ising.f90:303
303 Ising_nnlist(n,1) = n1
```
With rectangular lattices I see the following:
3x7 crash
5x7 crash
9x7 crash
6x8 crash
4x8 crash
8x16 runs
14x6 crash
28x12 seems to run
20x28 seems to run
36x28 seems to run
For the 3x7 case I see this:
```
#0 0x00007ffff65c17a6 in ?? () from /lib64/libc.so.6
#1 0x00007ffff65c2c10 in ?? () from /lib64/libc.so.6
#2 0x00007ffff65c4fd2 in malloc () from /lib64/libc.so.6
#3 0x0000555555581dc7 in confin () at inconfc.f90:28
#4 0x00005555555701ac in MAIN__ () at main.f90:107
```
For the 5x7 case I see this:
```
#0 0x00007ffff657870b in raise () from /lib64/libc.so.6
#1 0x00007ffff6579cd1 in abort () from /lib64/libc.so.6
#2 0x00007ffff65bb6ec in ?? () from /lib64/libc.so.6
#3 0x00007ffff65c1567 in ?? () from /lib64/libc.so.6
#4 0x00007ffff65c1dcb in ?? () from /lib64/libc.so.6
#5 0x00005555555785a5 in wrapul (ntau1=100, ntau=90, ul=<error reading variable: frame address is not available.>,
dl=<error reading variable: frame address is not available.>,
vl=<error reading variable: frame address is not available.>) at wrapul.f90:23
#6 0x0000555555572d6b in MAIN__ () at main.f90:207
```
For the 9x7 case I see this:
```
(gdb) bt
#0 0x00005555555656c4 in hamiltonian::ham_v () at Hamiltonian_Ising.f90:303
#1 0x00005555555678df in hamiltonian::ham_set () at Hamiltonian_Ising.f90:134
#2 0x00005555555701a2 in MAIN__ () at main.f90:105
(gdb) frame 0
#0 0x00005555555656c4 in hamiltonian::ham_v () at Hamiltonian_Ising.f90:303
303 Ising_nnlist(n,1) = n1
(gdb) p n
$1 = -255588285
(gdb) p n1
$2 = 26
```
For the 6x8 and similarly for the 4x8 case I see this:
```
#0 0x0000555555563f02 in hamiltonian::s0 (n=49, nt=1) at Hamiltonian_Ising.f90:339
#1 0x000055555557e897 in upgrade (gr=<error reading variable: Cannot access memory at address 0x55555587d000>, n_op=49,
nt=1, phase=(1,0), op_dim=2) at upgrade.f90:65
#2 0x000055555557be8a in wrapgrup (gr=..., ntau=0, phase=(1,0)) at wrapgrup.f90:46
#3 0x0000555555573b8a in MAIN__ () at main.f90:290
(gdb) frame 0
#0 0x0000555555563f02 in hamiltonian::s0 (n=49, nt=1) at Hamiltonian_Ising.f90:339
339 S0 = S0*DW_Ising_space(nsigma(n,nt)*nsigma(Ising_nnlist(n,i),nt))
(gdb) p n
$1 = 49
(gdb) p nt
$2 = 1
(gdb) p i
$3 = 1
```
Most of the backtraces seem to be corrupted, but maybe ham_V() and S0() are good places to start. Somewhere an assumption seems to be built in that the length in every dimension is divisible by 4.
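Whoever picks this up may want to recompile with gfortran's -g -fbacktrace -fcheck=all, which usually turns corrupted traces like the above into a precise out-of-bounds report. For comparison, a periodic nearest-neighbour list that makes no divisibility assumption at all can be built with plain modulo arithmetic; the following sketch is not the ALF routine, only the shape the indexing could take:
```
! Sketch: neighbour list valid for any l1, l2 >= 2; sites numbered 1..l1*l2.
subroutine make_nnlist(l1, l2, nnlist)
  implicit none
  integer, intent(in)  :: l1, l2
  integer, intent(out) :: nnlist(l1*l2, 4)
  integer :: x, y, n
  do y = 0, l2 - 1
     do x = 0, l1 - 1
        n = 1 + x + y*l1
        nnlist(n, 1) = 1 + modulo(x + 1, l1) + y*l1      ! +x neighbour
        nnlist(n, 2) = 1 + modulo(x - 1, l1) + y*l1      ! -x neighbour
        nnlist(n, 3) = 1 + x + modulo(y + 1, l2)*l1      ! +y neighbour
        nnlist(n, 4) = 1 + x + modulo(y - 1, l2)*l1      ! -y neighbour
     end do
  end do
end subroutine make_nnlist
```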
Issue #25: Test Prog/13 fails for unknown reasons (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/25)
Author: Florian Goth. Updated: 2017-09-24.
Test 13 fails for reasons I do not understand; the test suite currently marks it as an expected failure. So, whoever has an idea...

Issue #27: Adapted timing for OpenMP multithreading (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/27)
Author: Martin Bercx. Updated: 2017-09-24.
@all I suggest changing the current timing, which is based on calls to the subroutine cpu_time, to a timing based on calls to the subroutine system_clock. At the moment, the measured time is in general not the elapsed wall-clock time (which we want for accounting) but the (potentially accumulated) time during which the CPU did some work. This difference matters in the case of OpenMP multithreading.
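A minimal sketch of the difference between the two clocks (a toy program, not ALF code):
```
program timing_sketch
  implicit none
  integer :: t0, t1, rate
  real    :: c0, c1

  call cpu_time(c0)            ! CPU time, accumulated over all threads
  call system_clock(t0, rate)  ! wall-clock tick count and tick rate
  ! ... OpenMP-parallel work ...
  call system_clock(t1)
  call cpu_time(c1)

  print *, "wall-clock time [s]:", real(t1 - t0) / real(rate)
  print *, "CPU time        [s]:", c1 - c0
end program timing_sketch
```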
Issue #35: Improved stabilization (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/35)
Author: Fakher F. Assaad. Updated: 2017-09-24.
@all Dear all, in commit 8cfecdeca2a510f6afa5c60babf88375c7b1ee02 of the master branch (the latest commit) you will find yet another attempt to make the stabilization more robust. It would be of great help if you could try this out for your project and get back to me; positive or negative feedback is really useful.
In the directory Library/Modules you will find two modules, mat_mod_nag.f90 and mat_mod_nonag.f90. As the name suggests, in mat_mod_nonag.f90 Florian has replaced all NAG calls with LAPACK calls. By copying mat_mod_nonag.f90 onto mat_mod.f90 you will also be able to check how the stabilization is affected, for better or worse, when the outdated NAG routines are replaced with LAPACK. Again, your feedback is important. Have a nice evening, Fakher.
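For readers unfamiliar with the substitution, the flavour of the change is to call LAPACK directly where a NAG routine was used before, e.g. a singular value decomposition of the kind such stabilization schemes rely on. A minimal sketch (not copied from mat_mod_nonag.f90):
```
! Sketch: SVD A = U * S * V^H of a square complex matrix via LAPACK's ZGESVD,
! the kind of call that replaces the corresponding NAG routine.
subroutine svd_sketch(a, u, s, vt, ndim)
  implicit none
  integer, intent(in) :: ndim
  complex (kind=kind(0.d0)), intent(inout) :: a(ndim, ndim)   ! destroyed on exit
  complex (kind=kind(0.d0)), intent(out)   :: u(ndim, ndim), vt(ndim, ndim)
  real (kind=kind(0.d0)), intent(out)      :: s(ndim)
  complex (kind=kind(0.d0)), allocatable :: work(:)
  real (kind=kind(0.d0)) :: rwork(5*ndim)
  integer :: lwork, info
  lwork = 4*ndim
  allocate(work(lwork))
  call zgesvd('A', 'A', ndim, ndim, a, ndim, s, u, ndim, vt, ndim, &
       &      work, lwork, rwork, info)
  if (info /= 0) stop "zgesvd failed"
  deallocate(work)
end subroutine svd_sketch
```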
Issue #41: Problems of some Intel Compilers and libqrref (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/41)
Author: Florian Goth. Updated: 2017-09-24.
@fassaad @Hofmann @mbercx
Some Intel compilers (notably the one on our cluster) have problems compiling libqrref. It does not seem to be related to Fortran standards, since fixing the compilation flags to -O0 -c does not solve it on our cluster either. To collect from some e-mails, the QRREF flag:
**works** with the ifort implementation on Jureca,
**fails** with the ifort implementation on mat23 (our local cluster),
**fails** with the gfortran implementation on my laptop (my laptop updates automatically, so things that used to work can suddenly stop functioning; admittedly pretty nerve-racking...).

Issue #42: ALF-05 (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/42)
Author: Fakher F. Assaad. Updated: 2017-09-24.
@all Dear all, in our git you will find a new branch named ALF-05. It contains the first full version with documentation etc. and is derived from commit https://git.physik.uni-wuerzburg.de/fassaad/General_QMCT_code/commit/59c9c40787eb70f7ebace5a1576bc043917e8aff on the master branch. Florian has created a new website, http://alf.physik.uni-wuerzburg.de, where we will place this version. Please take a look at the ALF-05 release so as to avoid bugs and inconsistencies. All changes and discussion should be carried out in this issue, so that we can track things. Cheers, Fakher

Issue #50: Collect ideas for the website (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/50)
Author: Florian Goth. Updated: 2017-09-24. Due date: 2017-04-04.
@fassaad, @Hofmann, @mbercx Let's try to create a list with the content for the website.

Issue #55: Get rid of plain complex division (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/55)
Author: Florian Goth. Updated: 2019-09-25. Assignee: Jefferson Stafusa E. Portela.
Considering that complex division still seems to be an active field of research:
https://arxiv.org/pdf/1210.4539.pdf
we should maybe go through the code and replace plain / with zladiv, which incorporates these improvements starting from LAPACK ~3.5.
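zladiv is the LAPACK auxiliary function for scaled complex division, which avoids the spurious intermediate over- and underflows of the textbook formula. The substitution would look like this sketch:
```
program zladiv_sketch
  implicit none
  complex (kind=kind(0.d0)) :: x, y
  complex (kind=kind(0.d0)), external :: zladiv

  x = cmplx(1.d-300, 1.d-300, kind=kind(0.d0))
  y = cmplx(2.d-300, 1.d-300, kind=kind(0.d0))

  print *, "plain division: ", x / y         ! intermediate products may underflow
  print *, "zladiv:         ", zladiv(x, y)  ! LAPACK's scaled algorithm
end program zladiv_sketch
```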
Issue #57: Predict Global Moves (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/57)
Author: Florian Goth. Updated: 2017-06-18. Assignee: Florian Goth.
This branch serves to test strategies for predicting various global moves. In the end it seems to boil down to finding an approximate model for the transition probability T(s, s'). Given the current state s', we can either try to invert the equation for s, or just throw random configurations until we find one for which T is sufficiently large. For interpolating the function T and throwing random configurations s against it, we can use a feed-forward network. If we want to invert the function for s, we can use a deep belief network or any other (and hopefully simpler) generative network.

Issue #62: clALF (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/62)
Author: Florian Goth. Updated: 2017-06-22. Assignee: Florian Goth.
So we'd like to use ALF on certain accelerators. The simplest solution would be to use AMD's ACML:
http://developer.amd.com/tools-and-sdks/archive/compute/amd-core-math-library-acml/acml-downloads-resources/
This library provides a full LAPACK and BLAS implementation with a working Fortran interface that can automatically use external accelerators via OpenCL. Sadly, AMD ships profiles for only two of their GPUs from around 2014, and I could not get it to work satisfactorily.
The next idea is to use clMAGMA from the MAGMA initiative:
http://icl.cs.utk.edu/magma/
The most recent version has a working Fortran interface (I suppose), supports sparse vector operations, and is maintained by a similar set of people as the reference LAPACK. But it essentially only supports CUDA. There is magmaMIC, which can utilize Xeon Phis, and there is clMAGMA, which uses OpenCL as a backend. Sadly, when the authors tried to add a Fortran interface to the latter, they found out that there would be some work involved. So this is also out...
There is ViennaCL:
http://viennacl.sourceforge.net/
But it is C++ only, although it looks very powerful, especially for sparse operations.
Since ALF spends its time mostly in low-level BLAS3 routines (ZHEMMs in my branch for the Hubbard model), we can get away with just plugging in a library that emulates the BLAS interface. To my knowledge there is no library that provides a full Fortran interface.
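The upside of this plug-in approach is that the call sites need not change at all: a standard BLAS call such as the ZHEMM in the sketch below is the whole interface a replacement library has to provide, so offloading means relinking the symbol, not rewriting the code.
```
! Sketch: C := alpha*A*B + beta*C with Hermitian A, the BLAS3 workhorse
! mentioned above (matrix names hypothetical).
subroutine hemm_sketch(a, b, c, ndim)
  implicit none
  integer, intent(in) :: ndim
  complex (kind=kind(0.d0)), intent(in)    :: a(ndim, ndim), b(ndim, ndim)
  complex (kind=kind(0.d0)), intent(inout) :: c(ndim, ndim)
  complex (kind=kind(0.d0)) :: alpha, beta
  alpha = cmplx(1.d0, 0.d0, kind=kind(0.d0))
  beta  = cmplx(0.d0, 0.d0, kind=kind(0.d0))
  call zhemm('L', 'U', ndim, ndim, alpha, a, ndim, b, ndim, beta, c, ndim)
end subroutine hemm_sketch
```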
If we go on to write our own wrappers, there are two contenders:
clBLAS: https://github.com/clMathLibraries/clBLAS
This was part of AMD's ACML, is now open-sourced, and seems to be somewhat maintained.
TomTom recently released CLBlast:
https://arxiv.org/abs/1705.05249 , https://cnugteren.github.io/clblast/clblast.html
It is very new and, coming from outside HPC, puts a lot of effort into portability; it also has a Netlib LAPACK interface that can almost be linked directly against Fortran.
So for now I will try to see whether clBLAS works and I can offload ZHEMM calls...
First experiences with clBLAS:
Adding the ZHEMM call is now finished. It works and gives correct results. For now I could only test execution on a CPU (i7-2600). The multiplication is automatically parallelized but oversubscribes my CPU with ~8 threads. This would be OK, but the runtime is 5 times longer than plain single-threaded execution...
Some numbers:
(core-i7 920, 8x8 lattice): master 13 s; clalf 97 s (up to 4 threads...); CLBlast 97 s (around 1.5 threads effectively used)
(core-i7 920, 12x12 lattice): master 136 s; clalf 415 s (up to 4 threads...); CLBlast 171 s (~2.5 threads)
(core-i7 920, 16x16 lattice): master 776 s (single thread); CLBlast 545 s (~4 threads used well)
(core-i7 2600, 20x20 lattice): master 357 s; clBLAS 880 s
For now I conclude that clBLAS is an AMD-GPU-only solution.
The numbers did not change much when using CLBlast's built-in auto-tuner for my CPU.

Issue #64: Wannier90 interface (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/64)
Author: Florian Goth. Updated: 2017-06-22.
In the DFT community there is the Wannier90 program, which can output hopping matrices from DFT calculations and thereby defines a certain file format. It might be useful if ALF could be interfaced with this type of program at the input-file level.
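For orientation, the hopping files Wannier90 writes (seedname_hr.dat) are plain text: a comment line, the number of Wannier functions, the number of real-space grid points, a list of degeneracies, and then one matrix element per line. A minimal reader along the documented format, untested and not part of ALF:
```
! Sketch of a reader for Wannier90's seedname_hr.dat hopping files.
program read_hr_sketch
  implicit none
  integer :: num_wann, nrpts, i, j, m, n, r1, r2, r3
  integer, allocatable :: ndegen(:)
  real (kind=kind(0.d0)) :: re, im
  complex (kind=kind(0.d0)), allocatable :: hr(:, :, :)

  open(10, file="seedname_hr.dat", status="old", action="read")
  read(10, *)            ! comment line (creation date)
  read(10, *) num_wann   ! number of Wannier functions
  read(10, *) nrpts      ! number of Wigner-Seitz grid points
  allocate(ndegen(nrpts), hr(num_wann, num_wann, nrpts))
  read(10, *) (ndegen(i), i = 1, nrpts)
  do i = 1, nrpts
     do j = 1, num_wann*num_wann
        ! R vector (r1, r2, r3), orbital indices (m, n), Re/Im of the hopping
        read(10, *) r1, r2, r3, m, n, re, im
        hr(m, n, i) = cmplx(re, im, kind=kind(0.d0))
     end do
  end do
  close(10)
end program read_hr_sketch
```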
Issue #66: openbc branch (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/66)
Author: Francesco Parisen Toldin. Updated: 2017-07-15.
Hi,
I opened a branch "openbc", where I am adding the option of open boundary conditions to the example Hamiltonian file Hamiltonian_Examples.f90.

Issue #74: Optimization for hopping operators (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/74)
Author: Johannes Hofmann. Updated: 2017-09-12.
It might be useful to specialize the hopping operator, as many of us simulate Ising fields coupled to bond hopping, or hopping operators squared in Kondo-type couplings.

Issue #76: Automatically set values for certain parameters (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/76)
Author: Martin Hohenadler. Updated: 2017-10-02.
Would it be possible to enable automatically setting variables such as LOBS_EN for given beta and dtau? In my codes, I usually allow negative values of such flags to signal automatic values. This avoids accidentally using incorrect values. (A sketch of this pattern appears at the end of this section.)

Issue #77: Examples/Hubbard_SU2_Ising_Square2/parameters (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/77)
Author: Martin Hohenadler. Updated: 2017-10-20.
Replace &VAR_Hubbard by &VAR_Hub_Ising.

Issue #79: ./analysis.sh: line 20: 28106 CPU time limit exceeded $ANNAL/cov_eq.out (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/79)
Author: Martin Hohenadler. Updated: 2017-12-04.
Since pulling the latest version yesterday and compiling with the standard flags (O3), the analysis has become extremely slow on JURECA. The QMC code itself runs normally. The number of bins was only 120, and we are talking about equal-time analysis.

Issue #86: ALF-for-Summit (https://git.physik.uni-wuerzburg.de/ALF/ALF/-/issues/86)
Author: Florian Goth. Updated: 2018-05-02. Assignee: Johannes Hofmann.
We intend to port ALF to the Summit HPC system with the help of the MAGMA library.
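Regarding issue #76, the negative-sentinel convention could look like the following sketch; the name LOBS_EN is taken from the issue, LOBS_ST is its assumed counterpart, and the defaulting formulas are purely illustrative:
```
! Sketch of the "negative value means automatic" convention proposed in #76.
subroutine default_parameters(lobs_st, lobs_en, ltrot)
  implicit none
  integer, intent(in)    :: ltrot
  integer, intent(inout) :: lobs_st, lobs_en
  ! A negative value in the parameter file requests an automatic choice
  ! (the formulas below are placeholders, not ALF's actual defaults).
  if (lobs_st < 0) lobs_st = ltrot/4 + 1
  if (lobs_en < 0) lobs_en = 3*ltrot/4
end subroutine default_parameters
```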