Stuck in infinite loop using multiprocessing
When using multiprocessing for calculations, the program can get stuck in an infinite loop in the very rare cases where the eigensolver does not converge.
The following stderr output is produced with multiprocessing, while the program is not terminated:
Exception in thread Thread-3:
Traceback (most recent call last):
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\threading.py", line 980, in _bootstrap_inner
self.run()
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\multiprocessing\pool.py", line 576, in _handle_results
task = get()
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\multiprocessing\connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 2 required positional arguments: 'eigenvalues' and 'eigenvectors'
When using a single process, the error message changes to
Traceback (most recent call last):
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\kdotpy-ll.py", line 327, in <module>
data.diagonalize(ModelLL(modelopts_bdep), solver, list_kwds)
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\diagdata.py", line 2657, in diagonalize
tm.do_all()
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 233, in do_all
task.run() # run
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 90, in run
self.callback(self.worker_func())
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 24, in run
return self.func(*self.args, **self.kwds)
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\models.py", line 357, in _solve_ham
eival1, eivec1 = solver.solve(ham)
File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\diagsolver.py", line 236, in solve
eival, eivec = eigsh(mat, self.neig, sigma=self.targetval, v0=self.lasteivec if self.reuse_eivec else None)
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 1574, in eigsh
ret = eigs(A, k, M=M, sigma=sigma, which=which, v0=v0,
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 1352, in eigs
params.iterate()
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 757, in iterate
self._raise_no_convergence()
File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 377, in _raise_no_convergence
raise ArpackNoConvergence(msg % (num_iter, k_ok, self.k), ev, vec)
scipy.sparse.linalg._eigen.arpack.arpack.ArpackNoConvergence: ARPACK error -1: No convergence (10801 iterations, 17/20 eigenvectors converged)
and code execution is terminated. This is the intended behaviour.
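The most probable reason is that the ArpackNoConvergence class is not picklable: the TypeError in the first traceback is raised while the pool's result-handler thread unpickles the exception sent back by the worker. (For comparison, see the ArpackNoConvergence documentation and "multiprocessing breaks when payload fails to unpickle".)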
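To illustrate the suspected mechanism without needing scipy or a worker pool, here is a minimal sketch. The classes FakeArpackError and FakeNoConvergence are hypothetical stand-ins; I am assuming they mirror the constructor signature seen in the traceback above (a message plus 'eigenvalues' and 'eigenvectors'), not copying scipy's code. The helper to_picklable is also hypothetical and only shows one possible workaround, not something kdotpy currently does.

```python
import pickle


class FakeArpackError(RuntimeError):
    """Hypothetical stand-in: formats the message and passes only that to RuntimeError."""

    def __init__(self, info, infodict):
        msg = infodict.get(info, "Unknown error")
        super().__init__("ARPACK error %d: %s" % (info, msg))


class FakeNoConvergence(FakeArpackError):
    """Hypothetical stand-in with three required constructor arguments."""

    def __init__(self, msg, eigenvalues, eigenvectors):
        super().__init__(-1, {-1: msg})
        self.eigenvalues = eigenvalues
        self.eigenvectors = eigenvectors


exc = FakeNoConvergence("No convergence", [1.0], [[0.0]])

# pickle serializes an exception via __reduce__ as (class, exc.args, exc.__dict__)
# and later rebuilds it by calling class(*exc.args).
cls, args, state = exc.__reduce__()

# exc.args holds only the formatted message, so the rebuild call lacks two arguments.
try:
    cls(*args)
    rebuild_error = None
except TypeError as err:
    rebuild_error = err  # "... missing 2 required positional arguments ..."


def to_picklable(error):
    """Possible workaround (assumption): convert to a plain RuntimeError in the worker."""
    return RuntimeError("%s: %s" % (type(error).__name__, error))


# A plain RuntimeError survives the pickle round trip that the pool performs.
safe = pickle.loads(pickle.dumps(to_picklable(exc)))
```

If this is indeed the cause, catching the exception inside the worker function and re-raising something like to_picklable(exc) should let the parent process see the error and terminate instead of hanging. The partial eigenvalues and eigenvectors would be lost in that conversion unless they are passed along separately.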
This behaviour can be reproduced with the following command line (tested on commit 34062e0d)
8o msubst CdZnTe 4% mlayer HgCdTe 68% HgTe HgCdTe 68% llayer 10 7 10 zres 0.1 b 0 2 // 10 nll 10 neig 200 split 0.01 ax targetenergy 10 config "luttinger_gamma1=0;luttinger_gamma2=0;luttinger_gamma3=0;luttinger_kappa=0;luttinger_F=0;luttinger_q=0" cpu 7
and applying Luttinger_as_cmd_parameters.patch.
I could only reproduce this with the combination of zres 0.1, targetenergy 10 and cpu 7, together with the given config values for the Luttinger parameters set via the patch. Changing a single value, or not setting the Luttinger parameters via the patch, does not result in an infinite loop.
Since this behaviour is highly situational, I think this issue is not of high priority.