Ian Clarke has some interesting results with a new distributed programming language called Swarm. He founded Freenet and has built other large-scale computing systems, so his opinions on the subject are not idle thoughts.
He does a fine job of explaining Swarm with preliminary results, so I’ll limit myself here to a pile of off-the-cuff reactions:
- If each key in the store is an MD5 or SHA-1 hash of the content it stores, caching data across nodes becomes trivial. If stack state is stored like this as well (compute the hash only on send?) and chained (one stack frame per hashed data element, each with a “pointer” to the hash of the next stack frame), much of the cost of transmitting continuations can also be saved, because frames the receiver already holds need not be sent again.
- The graphics are the compelling thing. They are what separates this from other projects. Emotional? Yes, but adherence to a programming language is as much emotion as anything else. I say build visualization into the language itself. Make it trivial to produce. This is what everyone’s going to share and talk about.
- Use the JVM model; don’t invent a new VM. But you need full control. So, start with a JVM interpreter written in Java (see joeq or jikes or write your own using ASM) which you could then modify. Then you inherit all existing Java libraries to your new language. You have to inherit a bunch of libraries, at least at first!
This also opens the door for interesting optimizations, like identifying short stretches of bytecode where break-for-continuation is not permitted, breaking those out into dynamically-written subroutines, and allowing the underlying JVM to JIT it.
- Scala might be fun to learn, but if this project gets going it will be hard enough to root out the bugs without the underlying language also being riddled with bugs! Not to mention the extra barrier to entry (“Wait, I have to learn Scala first?”).
- How does user interaction work when the execution is moved? Even something as simple as a command line raises the question, never mind a GUI. Doesn’t this imply that at some point in the execution stack you have to return to the original machine?
(One more reason to use Java directly: it bridges distributed mode and local mode for the non-distributed part of the work.)
- The same question applies to external resources. The file system is easy, but what about a TCP connection or a database connection? How would those be shared across machines? Or do you need a way to say “Send the execution to this specific node, the one that houses this resource”? Maybe pair that with an instruction that says “When this routine completes, redistribute this execution.” That instruction could carry a back-pointer to the original executing node: not a requirement to return there (what if that node is now overloaded?) but a hint, since that node already has all the necessary data cached.
- In Java some critical, tight, high-performance routines are in C; in Swarm perhaps tight routines can be in Java! Java Annotations might be a way to specify “don’t distribute” on a method.
- If you build on the JVM and use annotations, perhaps existing code could be ported with no alteration! Or you could mix Swarm and plain Java with one line of code. This “easy to revert” attribute is critical for adoption because people don’t like lock-in.
- How does synchronization work? Held locks need to be part of the continuation. But are there other subtle issues?
- You’ll need your own synchronization of course. Please please please use deadlock-detection, throwing an exception instead of just locking up. It’s not hard to implement.
- I’d suggest implementing MapReduce next: it’s the hot thing in distributed computation, and folks are convinced that many useful things can be expressed that way. Demoing efficiency (and pretty pictures) here would be compelling to many people.
- Fault tolerance. You probably don’t need this at first, but you do need at least a thought-experiment-level concept of how to handle it.
- Computational caching. With a SHA hash of the input and full knowledge of the bytecode, you could perhaps automatically cache the results of computation! Think of algorithms that are naturally written as pure functions, where the same input always yields the same output. Or even just dynamic webpages where the home page doesn’t change that often.
- Consider JavaSpaces for object transfer? It might solve some issues with fault tolerance.
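To make the first bullet concrete, here is a rough sketch of hash-keyed storage with chained stack frames. None of this is Swarm's actual design: `ContentStore`, `FrameChain`, and the `|next=` string encoding are names and formats I made up for illustration, and the sketch assumes that a node holding a frame also holds every frame below it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical content-addressed store: each key is the SHA-1 of its value. */
class ContentStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stores the bytes (if not already present) and returns their key. */
    String put(byte[] data) {
        String key = sha1Hex(data);
        blobs.putIfAbsent(key, data.clone());
        return key;
    }

    byte[] get(String key) { return blobs.get(key); }
    boolean contains(String key) { return blobs.containsKey(key); }
    int size() { return blobs.size(); }
}

/** Hypothetical frame chain: each frame embeds the hash of the frame below it. */
class FrameChain {
    private static final String NEXT = "|next=";

    /** Stores one frame; parentKey is null for the bottom of the stack. */
    static String pushFrame(ContentStore store, String locals, String parentKey) {
        String encoded = locals + NEXT + (parentKey == null ? "" : parentKey);
        return store.put(encoded.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Walks down from the top frame and lists the keys the receiver lacks.
     * Assumes the sender holds the whole chain, and stops at the first frame
     * the receiver already has (assuming it also has everything below it).
     */
    static List<String> missingFrames(ContentStore sender, ContentStore receiver, String topKey) {
        List<String> missing = new ArrayList<>();
        String key = topKey;
        while (!key.isEmpty() && !receiver.contains(key)) {
            missing.add(key);
            String frame = new String(sender.get(key), StandardCharsets.UTF_8);
            key = frame.substring(frame.lastIndexOf(NEXT) + NEXT.length());
        }
        return missing;
    }
}
```

The point of the chain is that a continuation whose lower frames have already visited a node only needs its newest frames shipped there.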
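The “don’t distribute” annotation idea might look something like the following. The annotation name `DoNotDistribute`, the `DistributionPolicy` helper, and the `ImageFilters` example class are all hypothetical; a real runtime would presumably check this at the bytecode level rather than through reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical marker: the runtime must not migrate execution inside this method. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DoNotDistribute {}

/** Example user code mixing a tight local routine with an ordinary one. */
class ImageFilters {
    @DoNotDistribute
    static long sumPixels(int[] pixels) {
        long total = 0;
        for (int p : pixels) {
            total += p;
        }
        return total;
    }

    static int[] brighten(int[] pixels, int amount) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] + amount;
        }
        return out;
    }
}

/** Hypothetical scheduler hook that consults the annotation via reflection. */
class DistributionPolicy {
    static boolean mayMigrateDuring(Class<?> owner, String methodName, Class<?>... paramTypes) {
        try {
            Method m = owner.getDeclaredMethod(methodName, paramTypes);
            return !m.isAnnotationPresent(DoNotDistribute.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Since the annotation is inert to a stock JVM, the same class file would still run as plain Java, which is the “easy to revert” property.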
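On deadlock detection, here is a minimal sketch of the bookkeeping I have in mind, using a wait-for graph. It only models who holds and who waits (it does not actually block anyone), locks are owned by named tasks rather than threads so that ownership could travel with a continuation, and `WaitForGraph` and `DeadlockException` are invented names.

```java
import java.util.HashMap;
import java.util.Map;

/** Thrown instead of letting a task wait forever. */
class DeadlockException extends RuntimeException {
    DeadlockException(String message) {
        super(message);
    }
}

/** Hypothetical lock bookkeeping that refuses any wait that would close a cycle. */
class WaitForGraph {
    private final Map<String, String> ownerOfLock = new HashMap<>(); // lock -> task holding it
    private final Map<String, String> waitingFor = new HashMap<>();  // task -> lock it waits on

    /**
     * Returns true if the lock was granted, false if the task must wait.
     * Throws DeadlockException if waiting would create a cycle.
     */
    synchronized boolean request(String task, String lock) {
        String owner = ownerOfLock.get(lock);
        if (owner == null || owner.equals(task)) {
            ownerOfLock.put(lock, task);
            waitingFor.remove(task);
            return true;
        }
        // Follow the chain: owner -> lock it waits on -> that lock's owner -> ...
        String current = owner;
        while (current != null) {
            if (current.equals(task)) {
                throw new DeadlockException(task + " would deadlock waiting for " + lock);
            }
            String blockedOn = waitingFor.get(current);
            current = (blockedOn == null) ? null : ownerOfLock.get(blockedOn);
        }
        waitingFor.put(task, lock);
        return false;
    }

    /** Releases the lock if the given task holds it. */
    synchronized void release(String task, String lock) {
        if (task.equals(ownerOfLock.get(lock))) {
            ownerOfLock.remove(lock);
        }
    }
}
```

Because a wait is only recorded after the cycle check passes, the graph never contains a cycle, so the walk always terminates.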
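And for computational caching, a sketch of memoizing a routine by the hash of its input. The `routineId` string stands in for a hash of the routine's bytecode, `ResultCache` is a made-up name, and the whole thing assumes the routine is pure (same input, same output, no side effects).

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache keyed by routine identity plus the SHA-256 of the input. */
class ResultCache {
    private final Map<String, byte[]> results = new HashMap<>();
    private int executions = 0;

    /** Runs the routine only if no result is cached for this routine and input. */
    byte[] run(String routineId, byte[] input, Function<byte[], byte[]> pureRoutine) {
        String key = routineId + ":" + sha256Hex(input);
        byte[] cached = results.get(key);
        if (cached == null) {
            executions++;
            cached = pureRoutine.apply(input);
            results.put(key, cached);
        }
        return cached;
    }

    /** Number of times a routine actually had to be executed. */
    int executions() {
        return executions;
    }

    private static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            return new BigInteger(1, digest).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With content-addressed storage already in place, the cached results could live in the same store and be shared across nodes.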
Giving advice and asking questions is easy. Hopefully some brave souls will do the real work of getting Swarm up and running. Good luck, Ian!