Stuck in infinite loop using multiprocessing

When multiprocessing is used for calculations, the program can, in very rare cases, get stuck in an infinite loop if the eigensolver does not converge.
With multiprocessing, the following stderr output is produced, and the program does not terminate:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\multiprocessing\pool.py", line 576, in _handle_results
    task = get()
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\multiprocessing\connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 2 required positional arguments: 'eigenvalues' and 'eigenvectors'

When using a single process, the error message changes:

Traceback (most recent call last):
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\kdotpy-ll.py", line 327, in <module>
    data.diagonalize(ModelLL(modelopts_bdep), solver, list_kwds)
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\diagdata.py", line 2657, in diagonalize
    tm.do_all()
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 233, in do_all
    task.run()  # run
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 90, in run
    self.callback(self.worker_func())
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\tasks.py", line 24, in run
    return self.func(*self.args, **self.kwds)
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\models.py", line 357, in _solve_ham
    eival1, eivec1 = solver.solve(ham)
  File "D:\Studium\4Promotion\GitLab_Repos\kdotpy\diagsolver.py", line 236, in solve
    eival, eivec = eigsh(mat, self.neig, sigma=self.targetval, v0=self.lasteivec if self.reuse_eivec else None)
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 1574, in eigsh
    ret = eigs(A, k, M=M, sigma=sigma, which=which, v0=v0,
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 1352, in eigs
    params.iterate()
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 757, in iterate
    self._raise_no_convergence()
  File "C:\Users\Christian\miniconda3\envs\kdotpy\lib\site-packages\scipy\sparse\linalg\_eigen\arpack\arpack.py", line 377, in _raise_no_convergence
    raise ArpackNoConvergence(msg % (num_iter, k_ok, self.k), ev, vec)
scipy.sparse.linalg._eigen.arpack.arpack.ArpackNoConvergence: ARPACK error -1: No convergence (10801 iterations, 17/20 eigenvectors converged)

and code execution is terminated. This is the intended behaviour.

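The most probable reason is that ArpackNoConvergence instances cannot be unpickled. In the multiprocessing traceback above, the failure occurs in the pool's result-handler thread (_handle_results), where _ForkingPickler.loads calls __init__() without the required arguments 'eigenvalues' and 'eigenvectors'. The exception therefore never reaches the main process, which presumably keeps waiting for a result. (For comparison, see the ArpackNoConvergence documentation and "multiprocessing breaks when payload fails to unpickle".)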
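The unpickling failure can be illustrated without scipy or kdotpy. The following is a minimal sketch, assuming that the problem comes from an exception whose __init__ takes extra required arguments but forwards only the message to its base class. The class NoConvergence is a hypothetical stand-in, not the actual scipy class.

```python
import pickle


class NoConvergence(RuntimeError):
    """Hypothetical stand-in for an exception with extra required arguments."""

    def __init__(self, msg, eigenvalues, eigenvectors):
        # Only msg is forwarded to the base class, so self.args == (msg,).
        super().__init__(msg)
        self.eigenvalues = eigenvalues
        self.eigenvectors = eigenvectors


exc = NoConvergence("No convergence", [1.0], [[1.0, 0.0]])
payload = pickle.dumps(exc)  # pickling itself succeeds

try:
    # Unpickling re-creates the object as NoConvergence(*exc.args),
    # i.e. with the message only, so __init__ raises TypeError.
    pickle.loads(payload)
    unpickle_error = None
except TypeError as err:
    unpickle_error = err

print(repr(unpickle_error))
```

Pickling on the worker side succeeds, so the error only shows up when the parent process tries to load the result. This is consistent with the TypeError being raised in the result-handler thread rather than in the worker.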

This behaviour can be reproduced (tested on commit 34062e0d) by applying Luttinger_as_cmd_parameters.patch and using the following command line:
8o msubst CdZnTe 4% mlayer HgCdTe 68% HgTe HgCdTe 68% llayer 10 7 10 zres 0.1 b 0 2 // 10 nll 10 neig 200 split 0.01 ax targetenergy 10 config "luttinger_gamma1=0;luttinger_gamma2=0;luttinger_gamma3=0;luttinger_kappa=0;luttinger_F=0;luttinger_q=0" cpu 7

I could only reproduce this with the combination of zres 0.1, targetenergy 10, cpu 7, and the given config values for the Luttinger parameters, set via the patch. Changing a single value, or not setting the Luttinger parameters via the patch, does not result in an infinite loop.
Because this behaviour is highly situational, I do not think this issue is of high priority.
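
Should this be addressed at some point, one possible mitigation is to convert exceptions that do not survive a pickle round trip into a plain RuntimeError inside the worker, before they are sent back to the parent process. The following is only a sketch under that assumption. The names with_picklable_errors, solve_stub and NoConvergence are hypothetical and do not exist in kdotpy or scipy.

```python
import functools
import pickle


class NoConvergence(RuntimeError):
    """Hypothetical stand-in for an exception that cannot be unpickled."""

    def __init__(self, msg, eigenvalues, eigenvectors):
        super().__init__(msg)
        self.eigenvalues = eigenvalues
        self.eigenvectors = eigenvectors


def with_picklable_errors(func):
    """Re-raise exceptions that fail a pickle round trip as RuntimeError."""

    @functools.wraps(func)
    def wrapper(*args, **kwds):
        try:
            return func(*args, **kwds)
        except Exception as exc:
            try:
                pickle.loads(pickle.dumps(exc))
            except Exception:
                # Keep the exception type name and message as plain text.
                raise RuntimeError(f"{type(exc).__name__}: {exc}") from None
            raise  # the exception is picklable, pass it on unchanged

    return wrapper


@with_picklable_errors
def solve_stub():
    # Stands in for a worker function whose eigensolver does not converge.
    raise NoConvergence("No convergence", [1.0], [[1.0, 0.0]])


try:
    solve_stub()
    caught = None
except RuntimeError as err:
    caught = err

# The replacement exception survives the round trip that the pool performs.
restored = pickle.loads(pickle.dumps(caught))
print(type(restored).__name__, restored)
```

The converted exception loses the partial eigenvalues and eigenvectors, which seems acceptable here because the single-process run terminates on this error anyway.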