
Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 30, 2013

Scala as a platform…

Filed under: Data Science,Programming,Scala — Patrick Durusau @ 3:19 pm

Scala as a platform for statistical computing and data science by Darren Wilkinson

From the post:

There has been a lot of discussion on-line recently about languages for data analysis, statistical computing, and data science more generally. I don’t really want to go into the detail of why I believe that all of the common choices are fundamentally and unfixably flawed – language wars are so unseemly. Instead I want to explain why I’ve been using the Scala programming language recently and why, despite being far from perfect, I personally consider it to be a good language to form a platform for efficient and scalable statistical computing. Obviously, language choice is to some extent a personal preference, implicitly taking into account subjective trade-offs between features different individuals consider to be important. So I’ll start by listing some language/library/ecosystem features that I think are important, and then explain why.

A feature wish list

It should:

be a general purpose language with a sizable user community and an array of general purpose libraries, including good GUI libraries, networking and web frameworks

be free, open-source and platform independent

be fast and efficient

have a good, well-designed library for scientific computing, including non-uniform random number generation and linear algebra

have a strong type system, and be statically typed with good compile-time type checking and type safety

have reasonable type inference

have a REPL for interactive use

have good tool support (including build tools, doc tools, testing tools, and an intelligent IDE)

have excellent support for functional programming, including support for immutability and immutable data structures and “monadic” design

allow imperative programming for those (rare) occasions where it makes sense

be designed with concurrency and parallelism in mind, having excellent language and library support for building really scalable concurrent and parallel applications

The not-very-surprising punch-line is that Scala ticks all of those boxes and that I don’t know of any other languages that do. But before expanding on the above, it is worth noting a couple of (perhaps surprising) omissions. For example:

have excellent data viz capability built-in

have vast numbers of statistical routines in the standard library

Darren reviews Scala on each of these points.
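As a rough illustration of a few items on the wish list (immutable data, type inference, "monadic" design, and library support for concurrency), here is a minimal sketch that uses only the Scala standard library. It is not code from Darren's post; the names WishListSketch, Sample, mean, variance and meansInParallel are made up for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Illustrative sketch only: every name below is hypothetical.
object WishListSketch {

  // Immutable data: a case class wrapping an immutable Vector.
  final case class Sample(values: Vector[Double])

  // Static typing: an empty sample yields None instead of NaN or an exception.
  def mean(s: Sample): Option[Double] =
    if (s.values.isEmpty) None else Some(s.values.sum / s.values.size)

  // "Monadic" design: a for-comprehension over Option short-circuits on None.
  // The types of m and the yielded value are inferred by the compiler.
  def variance(s: Sample): Option[Double] =
    for {
      m <- mean(s)
      if s.values.size > 1
    } yield s.values.map(x => (x - m) * (x - m)).sum / (s.values.size - 1)

  // Concurrency from the standard library: one Future per sample,
  // combined with Future.sequence and awaited with a timeout.
  def meansInParallel(samples: Vector[Sample]): Vector[Option[Double]] = {
    val futures = samples.map(s => Future(mean(s)))
    Await.result(Future.sequence(futures), 10.seconds)
  }

  def main(args: Array[String]): Unit = {
    val data = Sample(Vector(1.0, 2.0, 3.0))
    println(mean(data))                      // Some(2.0)
    println(variance(data))                  // Some(1.0)
    println(variance(Sample(Vector.empty)))  // None
    println(meansInParallel(Vector(data, Sample(Vector.empty))))
  }
}
```

The same definitions can be pasted into the Scala REPL for interactive use, which touches another item on the list. Non-uniform random number generation and linear algebra are not in the standard library, so this sketch leaves them out.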

Although he still uses R and Python, Darren hopes that Scala will develop into a full-featured data mining platform.

Perhaps his checklist will supply the requirements needed to make that one of Scala's futures.
