DMTCP¶
DMTCP (Distributed MultiThreaded Checkpointing) transparently checkpoints a single-host or distributed computation in user-space with no modifications to user code or to the O/S. It works on most Linux applications, including Python, Matlab, R, etc. [1]
Warning
The current Apolo implementation only offers checkpointing support to serial and parallel programs. It isn’t compatible with distributed programs like those based on MPI.
[1] | DMTCP: Distributed MultiThreaded CheckPointing. (n.d.). Retrieved from http://dmtcp.sourceforge.net/ |