Scala Shell

Scala Shell (scalash) is a shell for programming in Scala. Scalash is run from the command line and allows the programmer to experiment with code in real time: you enter Scala commands at the prompt and the interpreter responds immediately.
A quick summary of the features present:
  • colourized output (highlighting)
  • auto-completion, aka Tab-completion
  • start script support - when an interactive shell is started, Scala Shell reads and executes commands from ~/.scalarc, if that file exists.
  • persistent history (~/.scala_history)
  • command history
  • command load

You can install the current release directly with Scala Bazaar:
sbaz update
sbaz install scalashell-scala
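
An illustrative session (the prompt and output shown here follow the standard Scala interpreter, which scalash drives):

$ scalash
scala> val toys = List("Buzz", "Woody")
toys: List[java.lang.String] = List(Buzz, Woody)

scala> toys map (_.length)
res0: List[Int] = List(4, 5)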

Escape from Zurg

A long time ago, I got fascinated with the Zurg riddle. Also known as Escape from Zurg, it is a puzzle (featuring characters from the movie Toy Story) which has been used to teach students logic programming. Recently, Sam Halliday from the Javablog wrote a blog post with a Java implementation, and Paul Butcher from the Texperts wrote another blog post with a Ruby implementation of Escape from Zurg. Other implementations can be found here. However, it wasn’t long before I got the craving to write a Scala version. The idea is not to solve this specific problem, but to use it to show some Scala features.

Here’s the puzzle:

Buzz, Woody, Rex, and Hamm have to escape from Zurg. They merely have to cross one last bridge before they are free. However, the bridge is fragile and can hold at most two of them at the same time. Moreover, to cross the bridge a flashlight is needed to avoid traps and broken parts. The problem is that our friends have only one flashlight with one battery that lasts for only 60 minutes. The toys need different times to cross the bridge (in either direction):

Buzz: 5 minutes
Woody: 10 minutes
Rex: 20 minutes
Hamm: 25 minutes

Since there can be only two toys on the bridge at the same time, they cannot cross the bridge all at once. Since they need the flashlight to cross the bridge, whenever two have crossed the bridge, somebody has to go back and bring the flashlight to those toys on the other side that still have to cross the bridge.

The problem is: In which order can the four toys cross the bridge in time (that is, in 60 minutes) to be saved from Zurg?

Let’s start by defining Toy as a case class with name and time fields.

case class Toy(name: String, time: Int)

Like Toy, the sides of the bridge are case classes, or more precisely, case objects.

abstract class Direction
case object Left extends Direction
case object Right extends Direction

What are case classes/objects?

Case classes and case objects are defined like normal classes or objects, except that the definition is prefixed with the modifier case. The case modifier in front of a class or object definition has the following effects:

  1. Case classes implicitly come with a constructor function, with the same name as the class.
  2. Case classes and case objects implicitly come with implementations of methods toString, equals and hashCode.
  3. Case classes implicitly come with nullary accessor methods which retrieve the constructor arguments.
  4. Case classes allow the construction of patterns which refer to the case class constructor (pattern matching).
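
A quick illustration of these effects, using the Toy class above (interpreter-style, output abbreviated):

val buzz = Toy("Buzz", 5)        // 1. constructor function: no `new` required
println(buzz)                    // 2. prints Toy(Buzz,5), thanks to the generated toString
println(buzz == Toy("Buzz", 5))  // 2. structural equals: true
println(buzz.name)               // 3. nullary accessor for a constructor argument: Buzz

buzz match {                     // 4. pattern matching on the constructor
  case Toy(name, time) => println(name + " needs " + time + " minutes")
}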

For more details about case classes/objects see here or in Scala Documentation.

Next, we define Move, which we will use to represent transitions between states. Each move consists of a direction (right or left) and a group of toys. Move provides two methods: the cost method, which returns the time taken to complete the move, and the overridden toString method, which provides a readable print version of the move.

class Move(direction: Direction, toys: List[Toy]) {

  // The move takes as long as the slowest toy in the group.
  def cost = toys.map(_.time).max

  override def toString =
    "Move: " + direction + " " + toys.map(_.name).mkString("[", ",", "]")
}

A State comprises two fields: direction, which represents the current flashlight position, and group, which represents the toys remaining on the left-hand side of the bridge.

class State(direction: Direction, group: List[Toy]) {

  def done = group.isEmpty

  def next(f: (Move, State) => Unit) = direction match {

    // Flashlight on the left: send every possible pair of toys across.
    case Left => for { tuple <- group.zipWithIndex
                       toy <- group drop (tuple._2 + 1)
                       toys = List(toy, tuple._1) }
                   f(new Move(Right, toys), new State(Right, group diff toys))

    // Flashlight on the right: one toy already across (the full toy list,
    // defined in ToyStory below, minus the left-hand group) brings it back.
    case Right => for (toy <- ToyStory.toys diff group)
                    f(new Move(Left, List(toy)), new State(Left, toy :: group))
  }
}

What are the Scala features used in the State class?

  1. First-Class Functions
  2. Pattern Matching
  3. For-Comprehensions

First-Class Functions

In Scala each function is a “first-class value”. Like any other value, it may be passed as a parameter or returned as a result. Functions which take other functions as parameters or return them as results are called higher-order functions. The next method takes a function of type (Move, State) => Unit as a parameter named f.
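
For example (a small sketch using the classes defined above), a function literal can be bound to a value and passed to next:

val report: (Move, State) => Unit = (move, state) => println(move)

val start = new State(Left, List(Toy("Buzz", 5), Toy("Woody", 10)))
start next report   // the function value is passed like any other argument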

For more details about functions see here or in Scala Documentation.

Pattern Matching

Scala has a built-in general pattern matching mechanism. Pattern matching is a generalization of C or Java’s switch statement to class hierarchies. Instead of a switch statement, there is a standard method match, which is defined in Scala’s root class Any, and therefore is available for all objects. The match method takes as argument a number of cases. Scala’s pattern matching statement is most useful for matching on algebraic types expressed via case classes. Scala also allows the definition of patterns independently of case classes, using unapply methods in extractor objects. The next method starts with the statement: direction match..., a pattern matching expression with two options: Left and Right.
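
As a small illustration (not part of the puzzle solution), a Direction can be matched and flipped like this:

def flip(d: Direction): Direction = d match {
  case Left  => Right
  case Right => Left
}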

For more details about pattern matching and extractor objects see here or in Scala Documentation.


For-Comprehensions

Scala offers special syntax to express combinations of certain higher-order functions more naturally. For-comprehensions are a generalization of the list comprehensions found in languages like Haskell and Python. They are mapped to combinations of methods such as foreach, map and filter. For instance, the for loop for (path <- problem) ... in the ToyStory object is mapped to problem foreach (path => ...), defined in the SearchProblem class.
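
Concretely, the two forms below are equivalent; the compiler rewrites the first into the second:

for (path <- problem)
  println(path)

problem foreach (path => println(path))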

class SearchProblem(initial: State) {

  def foreach(f: List[Move] => Unit) {

    def solve(path: List[Move], state: State) {
      if (state.done) f(path.reverse)  // everybody crossed: report the complete path
      else state next { (move, state) => solve(move :: path, state) }
    }

    solve(Nil, initial)
  }
}

object ToyStory extends Application {

  val toys = Toy("Buzz", 5) :: Toy("Woody", 10) :: Toy("Rex", 20) :: Toy("Hamm", 25) :: Nil

  val problem = new SearchProblem(new State(Left, toys))

  for (path <- problem)
    if ((0 /: path) {(cost, move) => cost + move.cost} <= 60)
      println("Solution: " + path)
}

The complete source code can be downloaded here.

Guide to Scala Bazaar auto completion using BASH

The Scala Bazaar system, “sbaz” for short, is a system used by Scala enthusiasts to share computer files with each other. In particular, it makes it easy to share libraries and applications. In this post, I’ll show how easy it is to use one of the nicest facilities of the modern shell, the built-in “completion” support, to make sbaz more convenient to use on the command line.

First you must install the Bash programmable auto-completion setup if your distro doesn’t have it by default. I don’t think many do, so you’ll need to get it from the Programmable Completion Website.

Once you’ve set up your system for auto-completion, you need to create the following completion function:


_sbaz_complete()
{
  local cur commands

  commands='available compact help install installed keycreate keyforget keyknown
  keyremember keyremoteknown keyrevoke pack remove retract setuniverse setup share show
  showuniverse update packages upgrade'

  # the word currently being completed
  cur=${COMP_WORDS[COMP_CWORD]}
  cur=`echo $cur | sed 's/\\\\//g'`

  COMPREPLY=( $(compgen -W "${commands}" -- ${cur} | sed 's/\\\\//g') )
}

complete -F _sbaz_complete -o filenames sbaz

And place it in /etc/bash_completion.d/sbaz. Once you’ve done that, the next time you start up your Bash shell you will have sbaz auto-completion!
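
For example, typing a partial command and hitting Tab should now complete it (illustrative):

$ sbaz ins<TAB>
install    installed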

Download the source code here.


What are Scala and Hadoop?

Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, and type-safe way. It smoothly integrates features of object-oriented and functional languages including mixins, algebraic datatypes with pattern matching, genericity, and more.
Hadoop is a free Java software platform that supports running applications that process vast amounts of data. It was developed under the Apache Lucene project, originally to support distribution for Nutch, an effort to build an open source search engine, providing its search and index component. Hadoop consists of an open source implementation of Google’s published computing infrastructure, specifically MapReduce and the Google File System.

Scala + Hadoop!?

The Hadoop Map-Reduce framework operates on key/value pairs, where the keys and values are serializable objects implementing a simple serialization protocol. This serialization protocol is defined by the Writable interface. In addition, Hadoop provides Writable implementations for the basic types (Int, Long, Float, String, …). These implementations wrap a value of the basic type in a Writable object. Take this example, and consider how int values are used in Hadoop:
private final static IntWritable one = new IntWritable(1);
Sounds like primitive wrappers before Java 5 boxing!

Why Scala?

Read this and this. Moreover, Scala offers a bag of other features:
  • Implicit conversion methods
Implicit methods, most often used for converting types, essentially give you dynamic scoping with static guarantees. You know the way a number gets implicitly converted to another number type in Java, e.g. int to long, short to int, …? Scala does that too, but it’s all programmer definable; they’re just library functions in scala.Predef (see the sketch after this list).
  • Type inference
The Scala compiler can often infer the type of an object, so there is no need to specify it explicitly. It is, for instance, often not necessary in Scala to specify the type of a variable, since the compiler can deduce the type from the variable’s initialization expression. Return types of methods can also often be omitted, since they correspond to the type of the body, which the compiler infers.
In short, Scala provides a clear, concise and stylized syntax.
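
A minimal sketch of a programmer-defined implicit conversion (the names here are hypothetical, purely for illustration):

object Conversions {
  // Hypothetical conversion: wherever a String is expected but an Int is
  // given, the compiler inserts a call to this method.
  implicit def intToString(n: Int): String = n.toString
}

import Conversions._
val label: String = 42   // the compiler rewrites this to intToString(42)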

SHadoop = Scala + Hadoop

What we would like is something like this:
val one = 1
Or like this:
def map(key: LongWritable, value: Text, output: OutputCollector[Text, IntWritable], reporter: Reporter) =
  (value split " ") foreach (output collect (_, one))
The interesting point is that with Scala, this is quite simple to implement. SHadoop is the proof!!! SHadoop consists of a single source file containing a Scala object with implicit methods for converting primitive Java types (including String) to Writable instances. Furthermore, the SHadoop object provides implicit methods for converting Java iterators of Writables to Scala iterators of primitive types; Scala iterators provide a lot of useful methods, like foreach, map, filter and others.
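
A rough sketch of what such an object can look like (the method names and the subset of conversions shown here are illustrative, not necessarily SHadoop’s actual ones):

import org.apache.hadoop.io._

object SHadoop {
  // Wrap primitives in Writables and unwrap them again.
  implicit def intToWritable(n: Int) = new IntWritable(n)
  implicit def writableToInt(w: IntWritable) = w.get
  implicit def stringToText(s: String) = new Text(s)
  implicit def textToString(t: Text) = t.toString

  // View a java.util.Iterator[IntWritable] as a Scala Iterator[Int],
  // gaining foreach, map, filter, reduceLeft, ...
  implicit def toScalaIterator(it: java.util.Iterator[IntWritable]): Iterator[Int] =
    new Iterator[Int] {
      def hasNext = it.hasNext
      def next() = it.next().get
    }
}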


The Hadoop Map-Reduce Tutorial shows a very simple Map-Reduce application that counts the number of occurrences of each word in a given input set.
Source Code - WordCount.scala
package shadoop

import SHadoop._
import java.util.Iterator
import org.apache.hadoop.fs._
import org.apache.hadoop.io._
import org.apache.hadoop.mapred._

object WordCount {

  class Map extends MapReduceBase with Mapper[LongWritable, Text, Text, IntWritable] {

    val one = 1

    def map(key: LongWritable, value: Text, output: OutputCollector[Text, IntWritable], reporter: Reporter) =
      (value split " ") foreach (output collect (_, one))
  }

  class Reduce extends MapReduceBase with Reducer[Text, IntWritable, Text, IntWritable] {

    def reduce(key: Text, values: Iterator[IntWritable],
      output: OutputCollector[Text, IntWritable], reporter: Reporter) = {

      val sum = values reduceLeft ((a: Int, b: Int) => a + b)
      output collect (key, sum)
    }
  }

  def main(args: Array[String]) = {
    val conf = new JobConf(classOf[Map])
    conf setJobName "wordCount"

    conf setOutputKeyClass classOf[Text]
    conf setOutputValueClass classOf[IntWritable]

    conf setMapperClass classOf[Map]
    conf setCombinerClass classOf[Reduce]

    conf setReducerClass classOf[Reduce]

    conf setInputFormat classOf[TextInputFormat]

    conf setOutputFormat classOf[TextOutputFormat[_ <: WritableComparable, _ <: Writable]]

    conf setInputPath(args(0))
    conf setOutputPath(args(1))

    JobClient runJob conf
  }
}

Source code explained: Java vs. Scala

1. The one field from the Map class

private final static IntWritable one = new IntWritable(1);
val one = 1
Scala infers the type of the field, so there is no need to specify it. Very cool!!!

2. The map method from the Map class

public void map(LongWritable key, Text value,
                 OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {

  String line = value.toString();
  StringTokenizer tokenizer = new StringTokenizer(line);

  while (tokenizer.hasMoreTokens()) {
    word.set(tokenizer.nextToken());   // word is a Text field of the Map class
    output.collect(word, one);
  }
}
def map(key: LongWritable, value: Text,
         output: OutputCollector[Text, IntWritable], reporter: Reporter) =
  (value split " ") foreach (output collect (_, one))
Wow!!! Scala implicitly converts value to a String and applies String’s split method, which returns an array of Strings. We then iterate over this array, collecting each String as a key (implicitly converted to Text) in the output object, with one as its value. Note: Scala doesn’t require semicolons at the end of each statement; they are optional.

3. The reduce method from the Reduce class

public void reduce(Text key, Iterator<IntWritable> values,
                    OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
  int sum = 0;
  while (values.hasNext()) {
    sum += values.next().get();
  }
  output.collect(key, new IntWritable(sum));
}
def reduce(key: Text, values: Iterator[IntWritable],
            output: OutputCollector[Text, IntWritable], reporter: Reporter) = {
  val sum = values reduceLeft ((a: Int, b: Int) => a + b)
  output collect (key, sum)
Again, wow!!! On the first line, Scala calculates the sum of the values using the reduceLeft method on an Int iterator, implicitly converted from the Java iterator of IntWritables. Then the output object collects the resulting sum.


Assuming HADOOP_HOME is the root of the Hadoop installation:
  • Copy the scala-library.jar to ${HADOOP_HOME}/lib directory
  • Run the application:
    ${HADOOP_HOME}/bin/hadoop jar shadoop-0.0.1-alpha.jar shadoop.WordCount input/ output/
input/ - a directory containing the text files of the input set
output/ - the output directory
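After the job completes, the counts can be inspected with something like the following (part-00000 is the usual default file name for a single reducer):
  ${HADOOP_HOME}/bin/hadoop dfs -cat output/part-00000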
Download the jar with sources here.