We run a significant amount of stuff on spot instances (AKS nodes) and use the service to detect, monitor and gracefully handle the imminent shutdown on the Kubernetes side.
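For anyone curious what the detection side can look like, here is a minimal sketch. It assumes the service in question is Azure's Scheduled Events metadata endpoint, which the comment does not actually confirm, and that spot evictions show up as events with `EventType` set to `Preempt`. The function names and the API version string are illustrative choices, not anything from the setup described above.

```python
import json
import urllib.request

# Assumption: the instance metadata endpoint for Scheduled Events.
# It is only reachable from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(url: str = SCHEDULED_EVENTS_URL) -> dict:
    """Fetch the current scheduled-events document (hypothetical helper)."""
    req = urllib.request.Request(url, headers={"Metadata": "true"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def preempt_events(doc: dict) -> list:
    """Return only the events that announce a spot eviction."""
    return [e for e in doc.get("Events", []) if e.get("EventType") == "Preempt"]


# Typical use on a node: poll in a loop and start draining once this is non-empty.
#   if preempt_events(fetch_scheduled_events()):
#       ...cordon and drain the Kubernetes node...
```

The drain step is left out on purpose, since how you cordon and evict pods depends on your cluster tooling.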
Could you elaborate on why mnesia is useless? I've only used Erlang and Elixir in hobbyist projects, but in theory mnesia felt like a great fit for the Erlang ecosystem.
i'd encourage everyone thinking about using it to go read the code (it's fairly approachable) but in short it just connects each node in the cluster to each other node in the cluster via a tcp connection and sends packets between them. there's almost nothing in the way of protocol, queue management, congestion control or traffic management. its discovery mechanism is just shipping lists of peers around. pretty much any competent programmer with networking experience could put together something comparable
it works okay in small clusters on reliable low latency networks but it's not a replacement for an actual network mesh
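A quick back-of-the-envelope sketch of why the "every node connects to every other node" design stops scaling: the number of TCP links in a full mesh grows quadratically with cluster size. This is just arithmetic, not code from the Erlang distribution layer, and the function name is made up for illustration.

```python
def full_mesh_links(nodes: int) -> int:
    """Number of pairwise connections when every node links to every other node."""
    return nodes * (nodes - 1) // 2


for n in (5, 50, 500):
    # Each node holds n - 1 connections; the cluster as a whole holds n(n-1)/2.
    print(f"{n} nodes: {n - 1} links per node, {full_mesh_links(n)} links total")
# 5 nodes: 4 links per node, 10 links total
# 50 nodes: 49 links per node, 1225 links total
# 500 nodes: 499 links per node, 124750 links total
```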
there have been attempts to fix it but none have really found adoption. i think this is the most notable: https://lasp-lang.readme.io/docs
thanks for your answer. so basically it relies on plain TCP and its reliability features, and if you need something more sophisticated you have to build it yourself in application code (or use libs).
Elixir tends to have sophisticated libraries around what you really want that are plug and play with distributed erlang. (Phoenix pubsub/tracker/presence). Horde/swarm, etc.
https://learn.microsoft.com/en-us/azure/virtual-machines/lin...