Monday, November 27, 2006

Threadless Concurrency on the JVM - aka Scala Actors

Billy has written an interesting blog post on the impact of multicore processors on Java. He concludes that the Java EE platform will have to be reworked to some extent to address the new threading patterns that applications will use, and the consequences of the reduced clock speeds needed to accommodate the extra cores on the die. He makes some very thoughtful observations on the evolution of future Java EE platforms and the JVM. Definitely worth a couple of reads.

Concurrency Concurrency

One of the commonly mentioned fallouts of the new processor architectures is the new face of applications written on the JVM. To get a performance advantage from multiple cores, applications need to be more concurrent, and programmers need to find more parallelism within the application domain. Herb Sutter sums it up nicely in this landmark article:
But if you want your application to benefit from the continued exponential throughput advances in new processors, it will need to be a well-written concurrent (usually multithreaded) application. And that’s easier said than done, because not all problems are inherently parallelizable and because concurrent programming is hard.


Look Maa ! No Threads !

Writing multi-threaded code is hard, and, as the experts say, the best way to deal with multi-threading is to avoid it. The two dominant paradigms of concurrency available in modern day languages are :

  • Shared State with Monitors, where concurrency is achieved through multiple threads of execution synchronized using locks, barriers, latches etc.

  • Message Passing, which is a shared-nothing model using asynchronous messaging across lightweight processes or threads.


The second form of concurrent programming offers a higher level of abstraction, where the user does not have to interact directly with the lower-level primitives of the thread model. Erlang supports this model of programming and has been used extensively in the telecommunications domain to achieve a great degree of parallelism. Java supports the first model, much to the horror of many experts in the domain, and unless you are Brian Goetz or Doug Lea, designing concurrent applications in Java is hard.
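To make the contrast between the two paradigms concrete, here is a minimal sketch in Scala that uses only JVM standard library classes. It is an illustration under stated assumptions, not Erlang's or Scala's actor machinery: the names ParadigmSketch, LockedCounter, Msg, startCounter and hammer are all made up for this example, and the "lightweight process" is approximated by an ordinary thread that owns its state and is reachable only through a queue.

```scala
import java.util.concurrent.LinkedBlockingQueue

object ParadigmSketch {
  // Paradigm 1: shared state guarded by a monitor (the object's intrinsic lock).
  final class LockedCounter {
    private var value = 0
    def increment(): Unit = synchronized { value += 1 }
    def get: Int = synchronized { value }
  }

  // Paradigm 2: messages sent to a single owner; the count itself is never shared.
  sealed trait Msg
  case object Incr extends Msg
  final case class Get(replyTo: LinkedBlockingQueue[Int]) extends Msg
  case object Stop extends Msg

  // Hypothetical helper: starts the owning thread and returns its mailbox.
  def startCounter(): LinkedBlockingQueue[Msg] = {
    val mailbox = new LinkedBlockingQueue[Msg]()
    val owner = new Thread(new Runnable {
      def run(): Unit = {
        var value = 0 // confined to this thread, so no lock is needed
        var running = true
        while (running) {
          mailbox.take() match {
            case Incr       => value += 1
            case Get(reply) => reply.put(value)
            case Stop       => running = false
          }
        }
      }
    })
    owner.setDaemon(true)
    owner.start()
    mailbox
  }

  // Hypothetical helper: runs `body` `times` times on each of `workers` threads
  // and waits for all of them to finish.
  def hammer(workers: Int, times: Int)(body: => Unit): Unit = {
    val threads = (1 to workers).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to times).foreach(_ => body)
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }

  def sharedStateTotal(): Int = {
    val counter = new LockedCounter
    hammer(4, 1000)(counter.increment())
    counter.get
  }

  def messagePassingTotal(): Int = {
    val mailbox = startCounter()
    hammer(4, 1000)(mailbox.put(Incr))
    // The queue is FIFO and all senders have finished, so Get is handled last.
    val reply = new LinkedBlockingQueue[Int]()
    mailbox.put(Get(reply))
    val total = reply.take()
    mailbox.put(Stop)
    total
  }
}
```

Both ParadigmSketch.sharedStateTotal() and ParadigmSketch.messagePassingTotal() return 4000. The difference is in where the care goes: in the first version every access to the count must remember to take the lock, while in the second the count lives inside one thread and the only thing callers can do is send it a message.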

Actors on the JVM

Actor-based concurrency in Erlang is highly scalable and offers developers a coarser-grained programming model. Have a look at this presentation by Joe Armstrong, which illustrates how the share-nothing model, lightweight processes and asynchronous messaging make Erlang a truly Concurrency Oriented Programming Language. The presentation also gives some interesting figures: an Erlang-based Web server supported more than 80,000 sessions, while Apache crashed at around 4,000.

The new kid on the block, Scala, brings Erlang-style actor-based concurrency to the JVM. Developers can now design scalable concurrent applications on the JVM using Scala's actor model, which will automatically take advantage of multicore processors, without programming to the complicated thread model of Java. In applications that demand a large number of concurrent processes in a limited amount of memory, JVM threads carry a significant footprint because of stack maintenance overhead and lock contention. Scala actors provide an ideal model for programming in the non-cooperative virtual machine environment. Coupled with the pattern matching capabilities of the Scala language, we can have the full power of Erlang-style concurrency on the JVM. The following example is from this recent paper by Philipp Haller and Martin Odersky:


class Counter extends Actor {
  override def run(): unit = loop(0)

  def loop(value: int): unit = {
    Console.println("Value: " + value)
    receive {
      case Incr() => loop(value + 1)
      case Value(a) => a ! value; loop(value)
      case Lock(a) => a ! value
        receive { case UnLock(v) => loop(v) }
      case _ => loop(value)
    }
  }
}



and its typical usage also from the same paper :


val counter = new Counter // create a counter actor
counter.start() // start the actor
counter ! Incr() // increment the value by sending the Incr() message
counter ! Value(this) // ask for the value

// and get it printed by waiting on receive
receive { case cvalue => Console.println(cvalue) }



Scala Actors

In Scala, actors come in two flavors -

  1. Thread based actors, that offer a higher-level abstraction of threads, which replace error-prone shared memory accesses and locks by asynchronous message passing and

  2. Event based actors, which are threadless and hence offer the enormous scalability that we get in Erlang based actors.


As the paper indicates, event-based actors offer phenomenal scalability when benchmarked against thread-based actors and thread-based concurrency implementations in Java. The paper also demonstrates the appeal of a library-based design for concurrency abstractions: Scala contains no language support for concurrency beyond the standard thread model offered by the host environment.
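To give a feel for why a threadless actor can scale, here is a rough sketch of the general idea in plain JVM primitives. It is not the Scala actors library and not the technique from the paper; it is a simple mailbox-plus-thread-pool arrangement, and EventActorSketch, EventActor and run are hypothetical names invented for this illustration. Each actor owns only a queue and a flag, and borrows a pool thread just long enough to drain its pending messages.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, ExecutorService, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object EventActorSketch {
  // Hypothetical minimal event-based actor: it owns a mailbox but no thread.
  final class EventActor[A <: AnyRef](pool: ExecutorService)(handler: A => Unit) {
    private val mailbox = new ConcurrentLinkedQueue[A]()
    private val scheduled = new AtomicBoolean(false)

    // Asynchronous send: enqueue, then make sure a drain task is scheduled.
    def !(msg: A): Unit = {
      mailbox.add(msg)
      trySchedule()
    }

    private def trySchedule(): Unit =
      if (scheduled.compareAndSet(false, true))
        pool.execute(new Runnable { def run(): Unit = drain() })

    // At most one drain runs per actor at a time, so the handler never
    // runs concurrently with itself.
    private def drain(): Unit = {
      try {
        var msg = mailbox.poll()
        while (msg != null) {
          handler(msg)
          msg = mailbox.poll()
        }
      } finally {
        scheduled.set(false)
        // A message may have arrived after the last poll; reschedule if so.
        if (!mailbox.isEmpty) trySchedule()
      }
    }
  }

  // Creates `actors` counting actors on a pool of `threads` threads, sends each
  // one `perActor` messages, and returns the total handled (-1 on timeout).
  def run(actors: Int, perActor: Int, threads: Int): Int = {
    val pool = Executors.newFixedThreadPool(threads)
    val done = new CountDownLatch(actors * perActor)
    val counts = new Array[Int](actors) // slot i is written only by actor i
    try {
      val all = Vector.tabulate(actors) { i =>
        new EventActor[String](pool)(_ => { counts(i) += 1; done.countDown() })
      }
      for (a <- all; _ <- 1 to perActor) a ! "tick"
      val finished = done.await(30, TimeUnit.SECONDS)
      if (finished) counts.sum else -1
    } finally pool.shutdown()
  }
}
```

For example, EventActorSketch.run(10000, 3, 4) multiplexes ten thousand actors over four threads and returns 30000. An idle actor in this sketch costs a queue and a flag rather than a thread stack, which is the footprint argument made above; a thread-per-actor design would need ten thousand stacks for the same workload.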

I have been playing around with Scala for quite some time and have been thoroughly enjoying the innovations that the language offers. The actor-based concurrency model is a definite addition to this list, all the more so since it promises to be a feature that programmers would love to have in their toolbox while implementing on the JVM. The JVM is where the future is, and event-based actors in Scala will definitely be one of the things to watch out for.

8 comments:

Anonymous said...

How about dataflow concurrency, very simplistic and error-free way of dealing with many threads.
See Mozart/Oz -language.

zproxy said...

This is exactly what games are trying to do - cut back threading :)

Francisco A. Lozano said...

Async concurrency paradigm was used in many environments *MANY* years ago, it's nothing new. Delphi's ICS components worked that way, for example.

Chris Double said...

Scala has dataflow concurrency in the form of a library that does futures and promises.

Dan Nugent said...

One of my Professors was working on an Actor Language extension to Java called SALSA: http://wcl.cs.rpi.edu/salsa/

I found that dealing with actor location was the largest annoyance... aside from the multi-step compiling.

Unknown said...

[for Lozano :]
The async concurrency paradigm is nothing new. But the points of interest with Scala actors are twofold:
a) Scala brings Erlang-style concurrency to the JVM, and today we are approaching a situation where the JVM is being portrayed as the default platform for your application.
b) Threaded actors in Scala provide a higher level of abstraction than raw threads and locks, while event-based actors provide huge scalability.

So it's a win-win situation.

Anonymous said...

Interesting but: I know the Erlang/Yaws comparison with Apache and I know Erlang can map all its threads to one OS-thread to avoid overhead. But in order to exploit multiple CPUs you need to start multiple processes, not just Erlang threads. Which is possible in Erlang, too, but how much overhead can we still save then? Does the message passing approach really make the scalability gains or isn't it more the saved overhead by mapping multiple Erlang threads to one OS-thread? Note that we have seen big scalability gains with JVMs that did "Green Threads" - i.e. an N to one mapping - and there are also JVMs that can do N to M in order to exploit multiple CPUs while still keeping the number of native threads low. So, to me, the question is: How much scalability do we get from the thread mapping and how much is really due to the radical "message passing share nothing" approach? Is explicit threading really that bad or did we just fall victim to "suboptimal schedulers" in existing operating systems?

Anonymous said...

it the rumination of a programmer is very nice

regards
aegan stills, songs