Ruminations of a Programmer: May 2008

Sunday, May 25, 2008

Designing Internal DSLs in Scala

In an earlier post I had talked about building external DSLs using parser combinators in Scala. External DSLs involve parsing of syntax foreign to the native language - hence the ease of developing external DSLs depends a lot on parsing and parse tree manipulation capabilities available in existing libraries and frameworks. Internal DSLs are a completely different beast altogether, and singularly depends on the syntax and meta programming abilities that the language offers. If you have a language that is syntactically rigid and does not offer any meta (or higher level) programming features, then there is no way you can design usable internal DSLs. You can approximate to the extent of designing fluent interfaces.

Scala is a statically typed language that offers lots of features for designing concise, user friendly internal DSLs. I have been working on designing a DSL in Scala for interacting with a financial trading and settlement system, implemented primarily in Java. Since Scala interoperates with Java quite well, a DSL designed in Scala can be a potent tool to provide your users with nicely concise, malleable APIs even over existing Java domain model.

Here is a snippet of the DSL in action, for placing client orders for buy/sell of securities to the exchange ..

val orders = List[Order](

  // use premium pricing strategy for order
  new Order to buy(100 sharesOf "IBM")
            maxUnitPrice 300
            using premiumPricing,

  // use the default pricing strategy
  new Order to buy(200 sharesOf "GOOGLE")
            maxUnitPrice 300
            using defaultPricing,

  // use a custom pricing strategy
  new Order to sell(200 bondsOf "Sun")
            maxUnitPrice 300
            using {
              (qty, unit) => qty * unit - 500
            }
)

The DSL looks meaningful enough for the business analysts as well, since it uses the domain language and does not contain much of the accidental complexities that we get in languages like Java. The language provides easy options to plug in default strategies (e.g. for pricing orders, as shown above). Also it offers power users the ability to define custom pricing policies inline when instantiating the Order.

Here are some of the niceties in the syntax of Scala that makes it a DSL friendly language ..

Implicits

Higher order functions

Optional dots, semi-colons and parentheses

Operators like methods

Currying

Implicits are perhaps the biggest powerhouse towards designing user friendly DSLs. They do away with the static cling of Java and yet offer a safe way to extend existing abstractions. Implicits in Scala is perhaps one of the best examples of a meaningful compromise in language design between uncontrolled open classes of Ruby and the sealed abstractions of Java. In the above example DSL, I have opened up the Int class and added methods that convert a raw number to a quantity of shares. Here is the snippet that allows me to write code like sell(200 bondsOf "Sun") as valid Scala code.

class PimpedInt(qty: Int) {
  def sharesOf(name: String) = {
    (qty, Stock(name))
  }

  def bondsOf(name: String) = {
    (qty, Bond(name))
  }
}

implicit def pimpInt(i: Int) = new PimpedInt(i)

And the best part is that the entire extension of the class Int is lexically scoped and will only be available within the scope of the implicit definition function pimpInt.

Scala, having strong functional capabilities, offer higher order functions, where, first class methods can be passed around like objects in the OO world. This helps define custom control abstractions that look like the natural syntax of the programming language. This is another great feature that helps design DSLs in Scala. Add to that, optional parentheses and dots, and you can have syntax like ..

to buy(200 sharesOf "GOOGLE")
maxUnitPrice 300
using defaultPricing

where defaultPricing and premiumPricing are functions that have been passed on as arguments to methods. The method using in class Order takes a function as input. And you can define the function inline as well, instead of passing a predefined one. This is illustrated in the last Order created in the above example.

Another small subtlety that Scala offers is the convenience to sugarize methods of a class that takes one parameter. In the above example, 100 sharesOf "IBM" is actually desugared as 100.sharesOf("IBM"). Though the former looks more English like, without the unnecessary dot and parenthesis. Nice!

Here is the complete listing of an abbreviated version of the DSL and a sample usage ..

object TradeDSL {

  abstract class Instrument(name: String) { def stype: String }
  case class Stock(name: String) extends Instrument(name) {
    override val stype = "equity"
  }
  case class Bond(name: String) extends Instrument(name) {
    override val stype = "bond"
  }

  abstract class TransactionType { def value: String }
  case class buyT extends TransactionType {
    override val value = "bought"
  }
  case class sellT extends TransactionType {
    override val value = "sold"
  }

  class PimpedInt(qty: Int) {
    def sharesOf(name: String) = {
      (qty, Stock(name))
    }

    def bondsOf(name: String) = {
      (qty, Bond(name))
    }
  }

  implicit def pimpInt(i: Int) = new PimpedInt(i)

  class Order {
    var price = 0
    var ins: Instrument = null
    var qty = 0;
    var totalValue = 0
    var trn: TransactionType = null
    var account: String = null

    def to(i: Tuple3[Instrument, Int, TransactionType]) = {
      ins = i._1
      qty = i._2
      trn = i._3
      this
    }
    def maxUnitPrice(p: Int) = { price = p; this }

    def using(pricing: (Int, Int) => Int) = {
      totalValue = pricing(qty, price)
      this
    }

    def forAccount(a: String)(implicit pricing: (Int, Int) => Int) = {
      account = a
      totalValue = pricing(qty, price)
      this
    }
  }

  def buy(qi: Tuple2[Int, Instrument]) = (qi._2, qi._1, buyT())
  def sell(qi: Tuple2[Int, Instrument]) = (qi._2, qi._1, sellT())

  def main(args: Array[String]) = {

    def premiumPricing(qty: Int, price: Int) = qty match {
      case q if q > 100 => q * price - 100
      case _ => qty * price
    }

    def defaultPricing(qty: Int, price: Int): Int = qty * price

    val orders = List[Order](

      new Order to buy(100 sharesOf "IBM")
                maxUnitPrice 300
                using premiumPricing,

      new Order to buy(200 sharesOf "CISCO")
                maxUnitPrice 300
                using premiumPricing,

      new Order to buy(200 sharesOf "GOOGLE")
                maxUnitPrice 300
                using defaultPricing,

      new Order to sell(200 bondsOf "Sun")
                maxUnitPrice 300
                using {
                  (qty, unit) => qty * unit - 500
                }
    )
    println((0 /: orders)(_ + _.totalValue))
  }
}

Sunday, May 18, 2008

Choosing between Abstract Type Members and Existentials in Scala

Scala offers abstract type members as a means to abstract over concrete types of components. The internal type of an abstraction can be hidden from the user who can invoke methods on objects without knowing the actual type of the member. Recently I was implementing a Strategy design pattern based abstraction for computing salaries of all types of workers in an organization. I did not design a common base class of all the types of workers in the model and was trying to build up my abstraction hierarchy based on composition of components. Here is what I came up with (simplified for brevity) ..

object SalaryComputation {

  trait Source {
    type E
    val salaried: List[E]
    val salCalculator: (E => Double)
  }

  class Context {
    val payees = new ArrayBuffer[Source]

    def addSource[T](p: List[T], salCalc: (T => Double)) = {
      payees += new Source {
        type E = T
        val salaried = p
        val salCalculator = salCalc
      }
    }
  }

  def compute = {
    val emps = List[Employee](
      //..
    )

    val cemps = List[Contractor](
      //..
    )

    val c = new Context()
    c.addSource(emps, (e: Employee) => (e.monthlyBasic + e.allowance - e.monthlyBasic * 0.3))
    c.addSource(cemps, (e: Contractor) => (e.hourlyRate * 8 * 20) * 0.7)

    c.payees.map(p => p.salaried.map(p.salCalculator(_))).foreach(println)
  }
}

and the workers are modeled as case classes ..

case class Employee(id: Int, name: String, monthlyBasic: Int, allowance: Int)
case class Contractor(id: Int, name: String, hourlyRate: Int)
case class DailyWorker(id: Int, name: String, dailyRate: Int, overtimeRate: Int)

The salary computation part has a separate Source component which abstracts the data source and salary computation strategy for one particular class of worker. I can have multiple classes of workers being fed into the salary computation component through multiple instances of the Source component. And this aggregation is managed as the Context of the computation. Have a look at the compute method that adds up Sources into the context and builds up the collection for total computation of salaries.

In the above implementation, the trait Source abstracts the type of the worker over the collection for which salary will be computed and the salary computation strategy. And both of them are injected when new instances of Source are being created in method addSource(). The client interface to the abstraction (in method compute) looks nice and DRY and it works as per expectation.

But what about Existential Types ?

In one of the recent releases, Scala has introduced existential types, which offers hiding (pack) of concrete types from the user and allows them to manipulate objects using a given name (unpack) and bounds. Keeping aside all type theoretic foundations behind existential types, I found it equally convenient to implement the above model using existential types instead of abstract type members. Here it goes ..

object SalaryComputation {

  case class Source[E](salaried: List[E], salCalculator: (E => Double))

  class Context {
    val payees = new ArrayBuffer[Source[T] forSome { type T }]

    def addSource[T](p: List[T], salCalc: (T => Double)) = {
      payees += new Source[T](p, salCalc)
    }
  }

  def compute = {
    val emps = List[Employee](
      //..
    )

    val cnts = List[Contractor](
      //..
    )

    val c = new Context()
    c.addSource(emps, (e: Employee) => (e.monthlyBasic + e.allowance - e.monthlyBasic * 0.3))
    c.addSource(cnts, (e: Contractor) => (e.hourlyRate * 8 * 20) * 0.7)

    def calc[T](c: Source[T]) = {
      c.salaried.map(c.salCalculator(_))
    }

    c.payees.map(calc(_)).foreach(println(_))
  }
}

The Scala book by Artima mentions that existential types have been introduced in Scala mainly for interoperability between Scala and Java's wild card types and raw types. But as I found above, they offer all benefits of type abstraction as abstract type members. Ignoring the stylistic issues, is there any reason to choose one approach over the other in the above implementations ?

Monday, May 12, 2008

Thinking in JVM languages

When I find a language expressive enough to implement the programming idioms succinctly, I like to use the language in my projects. But I constantly have to thunder on myself the fact that a project development is a team game and the language has to be usable by all members of the development team. Another fact that stares at me is the deadline for delivery that has long been committed to the client by a different team, completely oblivious of all constraints of software development lifecycle that appear to rock the project right from inception. Hence choosing the programming language is also a function of the number of available frameworks, libraries, refactoring-friendly IDEs and community support that can perceivably add to the velocity of program development. Considering all such forces, it is a no brainer that both us and the client happily choose the Java programming language for most of our development projects.

Is there any value in learning new languages according to the epigrams of Alan Perlis ? Mychael Nygard recently wrote a very thoughtful essay on this, keeping in mind the recent trends of development planned for Java, the language and Java, the platform. Here are some of our thoughts from the trenches of the development team ..

Of late, a couple of new developments on the JVM platform have added a new dimension to our project infrastructure. One of them is the scripting support that comes bundled with Java 6 and the other is the emergence of languages like Scala, Groovy and JRuby that can coexist happily in a single project under the caring patronage of the Java virtual machine. As a result, things become a little more interesting nowadays, and we can enjoy some amount of polyglotism by sneaking in a few of these spices into the mass of Java objects. I had earlier blogged on some such (not all successful) attempts -

trying to use Scheme as an executable XML through SISC, an attempt that got shot down the moment I uttered the word Lisp

using Javascript through Rhino engine to externalize some of the customizable business rules in a big Java project

making Java objects smarter with nicely scaffolded APIs using Scala's lexically scoped open classes

Is there any value to such attempts in projects where the bulk of the code is churned out by new inexperienced developers and managed by delivery managers encouraging maintainable programming paradigms ?

Isn't Java enough ?

Java has been the most dominant programming language for the enterprise. Given the proper set of tools, I love programming in Java, however unhip may it sound today. Then, why do I try to sneak in those weird features of Groovy, Scala or Rhino, when the ecosystem of the project thrives mainly on a Java diet ? Because I believe syntax matters, succinctness makes abstractions more resilient and powerful idioms always reduce the solution space noise, making the code base a more true representation of the business problem that I am trying to solve. Design patterns encourage best practices in application design, but their implementations in Java often tend to generate lots of boilerplate codes, which, in other higher level languages can be addressed through richer levels of abstractions.

I understand that implementing a whole new enterprise project in an evolving language like Scala or Groovy is not a feasible option today (yet). But I can certainly use some of their goodness to make my APIs more like the domain that I am modeling. And the best thing is that I can use these power on top of my existing Java objects. The key part is to identify the use cases that makes this exercise non-invasive, risk free and maintainable by the profile of programmers on the job.

In a typical multi-tiered application stack, there are layers which do not need the safety net of static typing e.g. the modules which interact with the external data sources and receive amorphous data streams that my application has to consume and forward to the domain layer underneath. Groovy is a dynamically typed language on the JVM and provides strong support for XML consumption and generation without the verbosity of Java. Hamlet D'Arcy demonstrates how effectively we can use Groovy in the service layer acting as the perfect glue to my Java based domain layer. This makes the code base smaller through programming at a higher level of abstraction, and at the same time keeps dynamic modules decoupled from the core static domain layer. External interfaces are usually required to be kept malleable enough, so that changes to them do not impact your core logic - and Groovy or JRuby provides ample scope for such decoupling.

In one of our Java EE projects, I had used Rhino scripting to keep business rules externalized from the core model. The requirement was to have the rules configurable by the users without a full build of the application code and hot deployment of those rules within the running application containers. Scripting engines, bundled with Java 6 is a great option where I can provide dynamic loading capabilities for all such scripting tasks. With OSGi becoming mainstream, I am sure, we will have better options for application packaging, versioning and deployment very soon.

And for the static typing afficionados .. (believe me, I am also one !)

Not only with dynamically typed languages, you can get the benefits of static typing, along with nice concise syntax on the JVM today. Scala is a multi-paradigm language for the JVM offering all the goodness of statically checked duck typing, type inferencing, rich functional features and some great library support for threadless concurrency and parser combinators. Scala supports XML literals as part of the language, which can very well be used to implement elegant XML crunching modules, much concise compared to the DOM APIs or JAXB frameworks that Java offers.

Recently in one of my programming assignments, I had to design a few adapters to existing Java classes, not related througn common parentage. The requirement was to define a set of uniform operations over a collection of the adapted classes based on some common structural classifiers. Initially I came up with a Java solution. It was standard idiomatic Java which would pass any careful review if it were a couple of years ago. I tried the same problem in Scala and could come up with a far more elegant and succinct solution. The three features of Scala that made the solution more precise are the supports for structural typing, implicit adapters and of course, functional programming. And since the entire development was additive and belonged to the service layer of the application, the core domain model was not impacted. The client was never bothered as long as his investments and commitments on the JVM were protected. As David Pollak has recently stated in one of his posts, it is only an additional jar. So true.

Is the infrastructure ready ?

All the JVM languages are evolving - even Java is slated to undergo lots of evolutions in the coming days (closures, JSR-308, modularity, ..). The most important thing, as I mentioned above is to follow the evolutionary path and carefully choose the use cases to plugin the goodness of these languages. To me, lots of risks are mitigated once you start using them as additive glue, rather than invasive procedures. These languages are becoming performant by the day, and innovations on hosting languages on a common runtime are now a reality. Groovy 1.6 has seen significant performance improvements in method dispatch by shortening the call path between the caller and the receiver through using method handles and call site cache, a technique previously applied in optimizing JRuby performance, and documented very well by Charles Nutter in one of his recent posts. This is one big JVM community force in action towards improving the daily life of all languages hosted there.

The best part of "polyglotism under a common runtime" is that I can very well use a uniform toolset for all the languages that I use. Unit testing frameworks like JUnit, TestNG are available to all developers working on multiple languages like Java, Groovy, Scala etc. Maven and Ant with all their eyesore XMLs are still available for any of them. And of course I can use my favorite IDE polymorphically over all languages, albeit with varying degrees of refactoring abilities. And if I am adventurous enough, I can also use additional power of JTestR, ScalaCheck and specs for doing all sorts of BDD and PDD stuff. Real fun huh!

Are you planning to use any of the non-Java, JVM friendly languages in your Java project ? What are the typical use cases that you think fits the bill for another JVM language ?

Sunday, May 04, 2008

More Coin Changers in Scala - Recursive, Memoized and Dynamic

In my last post I implemented the greedy algorithm for coin changer in Scala. While greedy algorithms provide optimal solutions for some combinations of coin denominations, it does not offer correct optimal solution for all combinations. In fact, coin changing problem is a well discussed example of a dynamic programming algorithm that exhibits the optimal substructure property, where the optimal solution for the problem can be implemented in terms of optimal solutions of smaller subproblems. Just as a refresher course in basic algorithms and keeping up the spirit of my learning Scala, I thought it might be a good idea to try to implement this in Scala.

One of the approaches to solving the problem is by defining the subproblems through a recurrence relation and implementing a top down recursive algorithm like the following :

object CoinChanger {
  def changeRecursive(denominations: List[Int], amount: Int): (Int, List[Int]) = amount match {
    case 0 => (0, List())
    case a =>
      val ch = ((java.lang.Integer.MAX_VALUE, List[Int](), 0) /: denominations)
        {
          (x, y) =>
            if (amount >= y) {
              lazy val ret = changeRecursive(denominations, amount - y)
              if (ret._1 + 1 < x._1) {
                (ret._1 + 1, ret._2, y)
              } else x
            } else x
        }
      (ch._1, ch._3 :: ch._2)
  }
}

It is a top down approach of solving the problem but has the obvious drawback of naive recursive implementations. Same subproblems are being solved repeatedly, leading to an exponential complexity of O(M exp d) for computing the optimal combination for M amount with d denominations.

Using memoization, we can speed up the implementation significantly, while keeping the current top down approach. The following implementation uses the memo functions from Scalaz :

object CoinChanger {
  def changeRecursiveMemoized(denominations: List[Int], amount: Int): (Int, List[Int]) = {
    val m = arrayMemo[(Int, List[Int])](amount)
    def change(a: Int): (Int, List[Int]) = a match {
      case 0 => (0, List())
      case x =>
        val ch = ((java.lang.Integer.MAX_VALUE, List[Int](), 0) /: denominations)
          {
            (x, y) =>
              if (a >= y) {
                lazy val ret = m(change)(a - y);
                if (ret._1 + 1 < x._1) {
                  (ret._1 + 1, ret._2, y)
                } else x
              } else x
          }
        (ch._1, ch._3 :: ch._2)
    }
    change(amount)
  }
}

Problems that exhibit optimal substructure property have efficient dynamic programming solutions. However the subproblems have to be solved before, so as to use the precomputed results in solving the bigger problem. This implies a bottom up approach instead of the top dopwn approach that the above recursive algorithm implements. The following algorithm attempts to solve an apparently harder problem of finding out the optimal combination of changes for all amounts from 1 to the specified input amount. In the following dynamic programming implementation, we get the same result, but by changing the problem definition to one which uses the precomputed values from smaller subproblems more deterministically and efficiently. The algorithm has a complexity of O(Md) for computing the optimal combination for M amount with d denominations :

object DPCoinChanger {
  def dpChange(denominations: List[Int], amount: Int) = {
    var bestNumCoins = new Array[(Int, List[Int])](amount + 1)
    bestNumCoins(0) = (0, List[Int]())
    List.range(1, amount + 1).foreach{a =>
      bestNumCoins(a) = (java.lang.Integer.MAX_VALUE, List())
      denominations.foreach{d =>
        if (a >= d) {
          if (bestNumCoins(a - d)._1 + 1 < bestNumCoins(a)._1) {
            bestNumCoins(a) = (bestNumCoins(a - d)._1, d :: bestNumCoins(a - d)._2)
          }
        }
      }
    }
    bestNumCoins(amount)
  }
}

Implementing well known algorithms is also a nice way to learn a programming language. And in course of this learning process, be sure to stick to the idioms that the language shines in. My initial implementations of the recursive algorithms had some mutable variables, which is, kind of expected, coming from someone thriving on diets of Java (not that there is anything wrong with it). Thinking functionally is one of the newer paradigms that I am learning these days.