Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 12, 2011

Scaling Scala at Twitter by Marius Eriksen

Filed under: Geographic Information Retrieval,Scala — Patrick Durusau @ 7:10 pm

Scaling Scala at Twitter by Marius Eriksen

From the description:

Rockdove is the backend service that powers the geospatial features on Twitter.com and the Twitter API (“Twitter Places”). It provides a datastore for places and a geospatial search engine to find them. To throw out some buzzwords, it is:

  • a distributed system
  • realtime (immediately indexes updates and changes)
  • horizontally scalable
  • fault tolerant

Rockdove is written entirely in Scala and was developed by 2 engineers with no prior Scala experience (nor with Java or the JVM). We think the geospatial search engine provides an interesting case study as it presents a mix of algorithm problems and “classic” scaling and optimization issues. We will report on our experience using Scala, focusing especially on:

  • “functional” systems design
  • concurrency and parallelism
  • using a “research language” in practice
  • when, where and why we turned the “functional dial”
  • avoiding mutable state

Not to mention being a well-done presentation!

June 27, 2011

Spark – Lightning-Fast Cluster Computing

Filed under: Clustering (servers),Data Analysis,Scala,Spark — Patrick Durusau @ 6:39 pm

Spark – Lightning-Fast Cluster Computing

From the webpage:

What is Spark?

Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write.

To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop MapReduce.

To make programming faster, Spark integrates into the Scala language, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala interpreter.

What can it do?

Spark was initially developed for two applications where keeping data in memory helps: iterative algorithms, which are common in machine learning, and interactive data mining. In both cases, Spark can outperform Hadoop by 30x. However, you can use Spark’s convenient API for general data processing too. Check out our example jobs.

Spark runs on the Mesos cluster manager, so it can coexist with Hadoop and other systems. It can read any data source supported by Hadoop.

Who uses it?

Spark was developed in the UC Berkeley AMP Lab. It’s used by several groups of researchers at Berkeley to run large-scale applications such as spam filtering, natural language processing and road traffic prediction. It’s also used to accelerate data analytics at Conviva. Spark is open source under a BSD license, so download it to check it out!

Hadoop must be doing something right to be treated as the solution to beat.

Still, depending on your requirements, Spark definitely merits your consideration.
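The "like local collections" point is easiest to appreciate by looking at ordinary local collection code. Here is a minimal sketch that uses only the Scala standard library and makes no Spark calls at all; `LocalCollectionsSketch` and `wordCount` are my own illustrative names, not anything from the Spark webpage.

```scala
object LocalCollectionsSketch {
  // Word count over in-memory lines, written as a chain of collection operations.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.toLowerCase.split("\\s+").toSeq) // one entry per word
      .filter(_.nonEmpty)                                     // drop empty tokens
      .groupBy(identity)                                      // word -> all its occurrences
      .map { case (word, occurrences) => word -> occurrences.size }
}
```

The webpage's claim is that a distributed dataset can be manipulated in much the same style. See the Spark example jobs for the actual API.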

TinySearchEngine

Filed under: Scala,Search Engines — Patrick Durusau @ 6:36 pm

TinySearchEngine

A search engine written in 30 lines of Scala.

Features:

  • in-memory index
  • norms and IDF calculated online
  • default OR operator between query terms
  • index a document per line from a single file
  • read stopwords from a file
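To make the feature list concrete, here is a rough sketch of the same ideas: an in-memory inverted index, one document per line, stopword removal, OR semantics between query terms, and IDF computed at query time. This is not the TinySearchEngine code; the names (`TinyIndexSketch`, `tokenize`, `buildIndex`, `search`) and the scoring (a plain sum of log(N / df) over matched terms, with no norms) are my own assumptions.

```scala
object TinyIndexSketch {
  // Lower-case the text, split on non-alphanumeric characters, drop stopwords.
  def tokenize(text: String, stopwords: Set[String]): Seq[String] =
    text.toLowerCase.split("[^a-z0-9]+").toSeq
      .filter(t => t.nonEmpty && !stopwords.contains(t))

  // Inverted index: term -> ids of the documents (lines) that contain it.
  def buildIndex(docs: Seq[String], stopwords: Set[String]): Map[String, Set[Int]] =
    docs.zipWithIndex
      .flatMap { case (doc, id) => tokenize(doc, stopwords).map(term => term -> id) }
      .groupBy(_._1)
      .map { case (term, pairs) => term -> pairs.map(_._2).toSet }

  // OR query: a document matches if it contains any query term.
  // Score = sum over matched terms of log(numDocs / documentFrequency).
  def search(index: Map[String, Set[Int]], numDocs: Int,
             query: String, stopwords: Set[String]): Seq[(Int, Double)] = {
    val hits = for {
      term <- tokenize(query, stopwords).distinct
      ids = index.getOrElse(term, Set.empty[Int])
      id <- ids.toSeq
    } yield id -> math.log(numDocs.toDouble / ids.size)

    hits.groupBy(_._1)
      .map { case (id, scored) => id -> scored.map(_._2).sum }
      .toSeq
      // Highest score first; ties broken by document id.
      .sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
  }
}
```

Reading the documents and the stopwords from files, as the original does, would only add a couple of `scala.io.Source` calls on top of this.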

June 24, 2011

How to use Scala and Lucene to create a basic search application

Filed under: Lucene,Scala,Search Engines — Patrick Durusau @ 10:45 am

How to use Scala and Lucene to create a basic search application

From the post:

How to use Scala and Lucene to create a basic search application. One of the powerful benefits of Scala is that it has full access to any Java libraries, giving you a tremendous number of available resources and technology. This example doesn’t tap into the full power of Lucene, but highlights how easy it is to incorporate Java libraries into a Scala project.

This example is based on a Twitter analysis app I’ve been noodling on, in which I am using Lucene. The code below takes a list of tweets from a text file and creates an index that you can search and extract info from.

Nice way to become familiar with both Scala and Lucene.
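The Java interoperability point can be illustrated without Lucene. The sketch below is not the post's code: it calls a Java standard library class (`java.util.TreeMap`) directly from Scala to count hashtags in a few tweets. I picked the JDK instead of Lucene so the example does not depend on any particular Lucene version; `JavaInteropSketch` and `hashtagCounts` are hypothetical names.

```scala
import java.util.{TreeMap => JTreeMap}

object JavaInteropSketch {
  // Count hashtags across tweets, storing the counts in a Java TreeMap.
  def hashtagCounts(tweets: Seq[String]): JTreeMap[String, Integer] = {
    val counts = new JTreeMap[String, Integer]() // plain Java constructor call
    for {
      tweet <- tweets
      word <- tweet.split("\\s+")
      if word.startsWith("#")
    } {
      val tag = word.toLowerCase
      val current = if (counts.containsKey(tag)) counts.get(tag).intValue else 0
      counts.put(tag, Integer.valueOf(current + 1))
    }
    counts
  }
}
```

Java classes are imported, instantiated and called exactly like Scala ones, which is what makes a library such as Lucene usable from a Scala project.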

May 4, 2011

5 Books/Tutorials on Scala

Filed under: Scala — Patrick Durusau @ 12:08 pm

5 Free E-Books and Tutorials on Scala

From ReadWriteHack, a listing of books and tutorials for Scala.

One correction.

Programming in Scala, first edition, by Martin Odersky, Lex Spoon, and Bill Venners, is freely available.

There is a second edition out, which isn’t free.

April 20, 2011

Video: How Twitter Scales with Scala

Filed under: Functional Programming,Scala — Patrick Durusau @ 2:18 pm

Video: How Twitter Scales with Scala

From the post:

Last week we told you about how Twitter is migrating its search stack from Ruby to Java. But Twitter is also known for being an early adopter of Scala. This presentation by Marius Eriksen at the Commercial Users of Functional Programming 2010 conference explains how Twitter uses Scala to scale.

April 19, 2011

Functional Scala – Mario Gleichmann

Filed under: Functional Programming,Scala — Patrick Durusau @ 9:36 am

Functional Scala – Mario Gleichmann

I ran across this delightful series of posts by Mario on Functional Scala.

To facilitate finding posts of interest, I have created this listing:

  1. Functional Scala: Introduction
  2. Functional Scala: Functions
  3. Functional Scala: Functions as Objects as Functions
  4. Functional Scala: Closures
  5. Functional Scala: Comprehending Comprehensions
  6. Functional Scala: High, Higher, Higher Order Functions
  7. Functional Scala: Lambdas and other shortcuts
  8. Functional Scala: Turning Methods into Functions (or WTF is eta expansion?)
  9. Functional Scala: Polymorphic Functions ?!?
  10. Functional Scala: Algebraic Datatypes – Enumerated Types
  11. Functional Scala: Algebraic Datatypes – Sum and Product Types
  12. Functional Scala: Algebraic Datatypes – ‘Sum of Products’ Types
  13. Functional Scala: Pattern Matching – the basics
  14. Functional Scala: Combinatoric Pattern Matching
  15. Functional Scala: Pattern Matching on product types
  16. Functional Scala: a little expression language with algebraic datatypes and pattern matching
  17. Functional Scala: Expressions, Extensions and Extractors
  18. Functional Scala: Tinkerbell, Frogs and Lists
  19. Functional Scala: List sugarization
  20. Functional Scala: Essential list functions
  21. Functional Scala: Quiz with Lists – common list functions, handcraftet

I will be updating this list as new posts appear.

A couple of general Scala posts you may find of interest:

Scala in practice: Composing Traits – Lego style

Scala Introduction – Slides available

April 17, 2011

Programming in Scala, First Edition

Filed under: Scala — Patrick Durusau @ 5:26 pm

Programming in Scala, First Edition

by Martin Odersky, Lex Spoon, and Bill Venners.

Entire text of the first edition (2008).

Concludes with writing a spreadsheet application in Scala.

I can’t imagine even Scala making spreadsheets interesting but we’ll see. 😉

A good starting point for an introduction to Scala.

April 10, 2011

Parallelizing Machine Learning – Functionally

Filed under: Graphs,Machine Learning,Scala — Patrick Durusau @ 2:49 pm

Parallelizing Machine Learning – Functionally

A Framework and Abstractions for Parallel Graph Processing

Abstract:

Implementing machine learning algorithms for large data, such as the Web graph and social networks, is challenging. Even though much research has focused on making sequential algorithms more scalable, their running times continue to be prohibitively long. Meanwhile, parallelization remains a formidable challenge for this class of problems, despite frameworks like MapReduce which hide much of the associated complexity. We present a framework for implementing parallel and distributed machine learning algorithms on large graphs, flexibly, through the use of functional programming abstractions. Our aim is a system that allows researchers and practitioners to quickly and easily implement (and experiment with) their algorithms in a parallel or distributed setting. We introduce functional combinators for the flexible composition of parallel, aggregation, and sequential steps. To the best of our knowledge, our system is the first to avoid inversion of control in a (bulk) synchronous parallel model.

I am particularly interested in the authors’ claim that:

While also based on graphs, Pregel is a closed system that was designed to solve large-scale “graph processing” problems, which are usually simpler in nature than typical real-world ML problems. In an effort to capitalize on Pregel’s strengths while focusing on a framework more aptly-suited to ML problems, we introduce a more flexible programming model, based on high-level functional abstractions.

Mostly because it is important to distinguish the areas we research because our algorithms already work there from the areas where algorithms still await discovery.

But also, in part, so that we know where it is appropriate to apply our usual algorithms and where those are likely to break down.

March 26, 2011

Scala Quick Reference

Filed under: Scala — Patrick Durusau @ 5:18 pm

Scala Quick Reference

If you are exploring Scala, this will be handy.

It isn’t often that I wish for a color printer but this is one of those times.

March 15, 2011

Hammurabi

Filed under: Domain-Specific Languages,Scala — Patrick Durusau @ 5:14 am

Hammurabi

From the website:

Hammurabi is a rule engine written in Scala that tries to leverage the features of this language, making it particularly suitable for implementing extremely readable internal Domain Specific Languages. Indeed, what actually makes Hammurabi different from all other rule engines is that it is possible to write and compile its rules directly in the host language. Hammurabi’s rules also have the important property of being readable even by non-technical people. As usual, a practical example is worth more than a thousand words.

I have to admit that my heart leaped at seeing a name from Ancient Near Eastern studies!

Then to discover it was for a rule engine written in Scala.

Well, it still looks quite interesting, even if it is not a ready-for-prime-time project.

Not for any time soon, but it would be interesting to write a set of rules in Akkadian for use in constructing a topic map of Akkadian grammar.

That would be way cool.

And a nice way to brush up on my Akkadian.

Which I must admit has gotten rusty as I have worked on technical standards far afield from ancient language studies.

March 7, 2011

Scala Language Tour

Filed under: Scala,Software — Patrick Durusau @ 7:07 am

Scala Language Tour

A more recent (2010) Scala language tour.

Take the time, it will be time well spent.

March 6, 2011

An Introduction To The Scala Programming Language by Bill Venners – Webinar

Filed under: Scala — Patrick Durusau @ 3:33 pm

An Introduction To The Scala Programming Language by Bill Venners

From the post:

As those who know me will most definitely know, I have been dabbling with functional programming again. At first with F# and now with Scala. Just thought I’d share this webinar about it. Now I am starting to get it; back in university I never quite got what the deal was and found it very hard to comprehend.

Somewhat dated (2008) but interesting background and idea material. Does have some examples.

Scala + Processing – an entertaining way to learn a new language – Post

Filed under: Processing,Scala,Visualization — Patrick Durusau @ 3:32 pm

Scala + Processing – an entertaining way to learn a new language

From the post:

If you’ve read a book about some new technology it doesn’t necessarily mean that you learned or even understood it. Without practice your newly acquired knowledge will vanish soon. That’s why doing exercises from the book you are reading is important.

But all those examples are usually boring. Of course you can start your own pet project to master your skills. Several months ago to learn Scala I started my little command line tool which semi-worked at the end and I gave up on it. So, in a month or so I had to google syntax of “for loop”…

That’s where I decided that I should start writing simple examples for different Scala features that must be fun. Here’s where Processing comes into play. Using it, every novice like me can turn dull exercises into visual installations. And later you can try advanced stuff like fractals, particle systems or data visualisation.

You might be wondering what the hell is Scala. It’s a relatively new and extremely cool programming language. You can read more about it on Wikipedia or on official web site.

Processing, in case you haven’t heard, is a graphics language/environment. It has a great deal of potential for topic maps and their representations.

February 16, 2011

Playing with Scala’s pattern matching – Post

Filed under: Pattern Matching,Scala — Patrick Durusau @ 1:28 pm

Playing with Scala’s pattern matching

François Sarradin writes:

How many times have you been stuck in your frustration because you were unable to use strings as entries in switch-case statements? Such an ability would be really useful, for example, to analyze the arguments of your application or to parse a file, or any content of a string. Meanwhile, you have to write a series of if-else-if statements (and this is annoying). Another solution is to use a hash map, where the keys are those strings and the values are the associated reified processes, for example a Runnable or a Callable in Java (but this is not really natural, long to develop, and boring too).

If a switch-case statement that accepts strings as entries would be a revolution for you, Scala’s pattern matching says that this is not enough! Indeed, there are other cases where a series of if-else-if statements would be generously transformed into a look-alike switch-case statement. For example, it would be really nice to simplify a series of instanceof checks and casts included in if-else-if to execute the right process according to the type of a parameter.

In this post, we see the power of Scala’s pattern matching in different use cases.

What language you choose for topic map development is going to depend upon its pattern matching abilities.

Here’s a chance to evaluate Scala in that regard.
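For a quick taste of the two cases François describes, here is a small sketch of my own (not taken from his post): one `match` that switches on strings, and one that replaces an instanceof-and-cast chain with type patterns. The names `PatternMatchSketch`, `parseArg` and `describe` are made up for illustration.

```scala
object PatternMatchSketch {
  // Strings as cases, with alternatives, a guard, and a catch-all binding.
  def parseArg(arg: String): String = arg match {
    case "-h" | "--help"         => "show help"
    case "-v" | "--version"      => "show version"
    case s if s.startsWith("--") => "unknown option: " + s.drop(2)
    case file                    => "input file: " + file
  }

  // Type and structure patterns instead of instanceof checks followed by casts.
  def describe(value: Any): String = value match {
    case n: Int    => "int " + n
    case s: String => "string of length " + s.length
    case head :: _ => "non-empty list starting with " + head
    case Nil       => "empty list"
    case _         => "something else"
  }
}
```

Cases are tried top to bottom, so the more specific patterns have to come before the catch-all ones.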

Wordnik – 10 million API Requests a Day on MongoDB and Scala – Post

Filed under: MongoDB,Scala — Patrick Durusau @ 1:08 pm

Wordnik – 10 million API Requests a Day on MongoDB and Scala

From the website:

Wordnik is an online dictionary and language resource that has both a website and an API component. Their goal is to show you as much information as possible, as fast as we can find it, for every word in English, and to give you a place where you can make your own opinions about words known. As cool as that is, what is really cool is the information they share in their blog about their experiences building a web service. They’ve written an excellent series of articles and presentations you may find useful: (see the post)

Of course, what I find fascinating is the “…make your own opinions about words known” aspect of the system.

Even so, from a scaling standpoint, this sounds like an impressive bit of work.

Definitely worth a look.

February 15, 2011

Scala: Introduction to Scala for Java Programmers

Filed under: Java,Scala — Patrick Durusau @ 11:27 am

Scala: Introduction to Scala for Java Programmers by Adam Rabung.

Useful for Java programmers looking at Scala for topic map development.

February 13, 2011

Programming Scala

Filed under: Merging,Scala — Patrick Durusau @ 6:51 am

Programming Scala by Dean Wampler and Alex Payne.

Experimental book at O’Reilly Labs.

Seems to be the day for not-strictly-topic-map posts, but I think Scala is going to be important both for topic maps and for scalable programming in general.

I suspect that reflects my personal view that functional approaches to merging are more likely to be successful with topic maps than approaches that rely upon mutable objects.

Comments about your experience with Scala, particularly with regard to topic maps and with this book, are most welcome!
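To give a flavor of what I mean by a functional approach to merging, here is a toy sketch. It is an illustration only, not code from the book or from any topic map engine: the `Topic` case class, with just subject identifiers and names, is a hypothetical simplification, and "merge when two topics share an identifier" is the only rule applied.

```scala
object MergeSketch {
  // Hypothetical, stripped-down topic: identifiers and names only.
  final case class Topic(identifiers: Set[String], names: Set[String])

  // Merging builds a new Topic; neither input is modified.
  def merge(a: Topic, b: Topic): Topic =
    Topic(a.identifiers ++ b.identifiers, a.names ++ b.names)

  // Fold each topic into the accumulated result, merging it with every
  // already-seen topic that shares at least one identifier with it.
  def mergeAll(topics: Seq[Topic]): List[Topic] =
    topics.foldLeft(List.empty[Topic]) { (acc, topic) =>
      val (same, different) =
        acc.partition(t => (t.identifiers & topic.identifiers).nonEmpty)
      same.foldLeft(topic)(merge) :: different
    }
}
```

Because every step returns new values, the unmerged topics remain available afterwards, which is much harder to guarantee when merging mutates objects in place.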

February 3, 2011

Scala Update with Martin Odersky

Filed under: Scala — Patrick Durusau @ 7:56 pm

Scala Update with Martin Odersky

From the website:

This episode is an update on the developments around the Scala language. We covered the new features in 2.7 and 2.8, as well as what’s planned for 2.9. We then discussed briefly the different “proficiency levels” of Scala programmers. The main part of the episode centered around Martin’s new research project: the polymorphic embedding of DSLs for expressing concurrency into Scala.

Scala is important for a number of uses, not the least of which is noted in: Introduction to Category Theory in Scala

At this Scala Update, you will find a link to the research project. Follow it. It takes you to a notice about a five-year European Research Grant that was won by the Scala Research Group. It looks very important and is quite possibly an area where topic maps might want to play.

January 31, 2011

Applicatives are generalized functors

Filed under: Category Theory,Scala — Patrick Durusau @ 9:50 am

Applicatives are generalized functors

A continuation of Heiko Seeberger’s coverage of Scala and category theory.

Highly recommended.

January 26, 2011

A Quick WebApp with Scala, MongoDB, Scalatra and Casbah – Practice for TMs

Filed under: MongoDB,Scala,Software,Topic Maps — Patrick Durusau @ 8:41 am

A Quick WebApp with Scala, MongoDB, Scalatra and Casbah

However clever, topic maps aren’t of much interest unless they are delivered to users.

In the general case that means a web based application.

This post is a short introduction to several tools you may find handy for building and/or delivering topic maps.

*****
PS: We will know topic maps have arrived when the technology keeps changing but management of subject identity is inherent in both programming languages and application design. Ways to go yet.

January 17, 2011

Rogue

Filed under: MongoDB,Query Language,Scala — Patrick Durusau @ 8:38 pm

Rogue

From the website:

Rogue is a type-safe internal Scala DSL for constructing and executing find and modify commands against MongoDB in the Lift web framework. It is fully expressive with respect to the basic options provided by MongoDB’s native query language, but in a type-safe manner, building on the record types specified in your Lift models.

Seen on MyNoSQL

*****
PS: To learn more about Lift, see: http://liftweb.net/

December 10, 2010

Scala in Depth

Filed under: Scala,Software — Patrick Durusau @ 7:18 am

Scala in Depth. Author: Josh Suereth

Abstract:

Scala is a unique and powerful new programming language for the JVM. Blending the strengths of the Functional and Imperative programming models, Scala is a great tool for building highly concurrent applications without sacrificing the benefits of an OO approach. While information about the Scala language is abundant, skilled practitioners, great examples, and insight into the best practices of the community are harder to find. Scala in Depth bridges that gap, preparing you to adopt Scala successfully for real world projects. Scala in Depth is a unique new book designed to help you integrate Scala effectively into your development process. By presenting the emerging best practices and designs from the Scala community, it guides you through dozens of powerful techniques example by example. There’s no heavy-handed theory here, just lots of crisp, practical guides for coding in Scala.

For example:

  • Discover the “sweet spots” where object-oriented and functional programming intersect.
  • Master advanced OO features of Scala, including type member inheritance, multiple inheritance and composition.
  • Employ functional programming concepts like tail recursion, immutability, and monadic operations.
  • Learn good Scala style to keep your code concise, expressive and readable.

As you dig into the book, you’ll start to appreciate what makes Scala really shine. For instance, the Scala type system is very, very powerful; this book provides use case approaches to manipulating the type system and covers how to use type constraints to enforce design constraints. Java developers love Scala’s deep integration with Java and the JVM Ecosystem, and this book shows you how to leverage it effectively and work around the rough spots.

There is little doubt that concurrent programming is a dawning reality. Which languages will be the best for concurrent programming in general (if there is such a case) or for topic maps in particular isn’t as clear.

Only time and usage can answer those questions.

November 27, 2010

Introduction to Category Theory in Scala

Filed under: Category Theory,Scala — Patrick Durusau @ 9:58 pm

Introduction to Category Theory in Scala.

Jack Park and I have been bouncing posts about category theory resources off of each other for years.

This one looks like a keeper.

It may be the sort of series that acts as a bridge between an abstraction (category theory) and the real world of programming.

They are related, you know.
