Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 14, 2015

Hash Table Performance in R: Part I + Part 2

Filed under: Hashing,R — Patrick Durusau @ 10:53 am

Hash Table Performance in R: Part I + Part 2 by Jeffrey Horner.

From part 1:

A hash table, or associative array, is a well known key-value data structure. In R there is no equivalent, but you do have some options. You can use a vector of any type, a list, or an environment.

But as you’ll see with all of these options their performance is compromised in some way. In the average case a lookup for a key should perform in constant time, or O(1), while in the worst case it will perform in O(n) time, n being the number of elements in the hash table.

For the tests below, we’ll implement a hash table with a few R data structures and make some comparisons. We’ll create hash tables with only unique keys and then perform a search for every key in the table.
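To make the quoted setup concrete, here is a minimal sketch of that kind of test in R. It is not Horner's benchmark code: the key format, the table size `n`, and the helper names `lookup_vec` and `lookup_env` are assumptions made for illustration. It builds a table of unique keys twice, once as a named vector and once as an environment, and then searches for every key in each.

```r
# Assumed table size and key format, chosen only for illustration.
n <- 5000L
keys <- sprintf("key%05d", seq_len(n))

# Option 1: a named integer vector; each lookup matches the key against the names.
vec <- setNames(seq_len(n), keys)

# Option 2: an environment, which R backs with a hash table when hash = TRUE.
env <- new.env(hash = TRUE, size = n)
for (i in seq_len(n)) assign(keys[i], i, envir = env)

# Search for every key in the table, as the quoted passage describes.
lookup_vec <- function() for (k in keys) vec[[k]]
lookup_env <- function() for (k in keys) get(k, envir = env, inherits = FALSE)

# Elapsed seconds for each structure; the numbers depend on the machine.
vec_time <- system.time(lookup_vec())[["elapsed"]]
env_time <- system.time(lookup_env())[["elapsed"]]
```

Comparing `vec_time` and `env_time` at a few values of `n` should show how each structure scales with table size.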

This rocks! Talk about performance increases!

My current Twitter client doesn’t dedupe my home feed and certainly doesn’t dedupe it against search-based feeds. I’m not so concerned with retweets as with authors who repeat the same tweet several times in a row. What I don’t know is what period of uniqueness would be best. I will have to experiment with that.
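As a rough sketch of that dedupe idea, an environment can serve as the hash table, keyed on author plus tweet text. Everything here is hypothetical: the `tweets` data frame, its column names, the `dedupe_tweets` function, and the one-hour default `window` are placeholders, since the right period of uniqueness is still an open question. The sketch assumes the rows are already sorted by time.

```r
# Hypothetical tweet records: author, text, and a timestamp in seconds.
tweets <- data.frame(
  author = c("alice", "alice", "bob", "alice"),
  text   = c("Read this!", "Read this!", "Read this!", "Read this!"),
  time   = c(0, 60, 90, 7200),
  stringsAsFactors = FALSE
)

# Keep a tweet only if the same author has not been shown the same text
# within the last `window` seconds (the default is an assumed placeholder).
dedupe_tweets <- function(tweets, window = 3600) {
  seen <- new.env(hash = TRUE)  # key -> time the tweet was last shown
  keep <- logical(nrow(tweets))
  for (i in seq_len(nrow(tweets))) {
    key <- paste(tweets$author[i], tweets$text[i], sep = "\n")
    last <- if (exists(key, envir = seen, inherits = FALSE)) {
      get(key, envir = seen, inherits = FALSE)
    } else {
      -Inf
    }
    keep[i] <- (tweets$time[i] - last) > window
    if (keep[i]) assign(key, tweets$time[i], envir = seen)
  }
  tweets[keep, , drop = FALSE]
}

result <- dedupe_tweets(tweets)
```

Because the key includes the author, the same text from a different author is kept, so ordinary retweet-style repeats are left alone. Varying `window` is one way to experiment with the period of uniqueness.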

I originally saw this series at Hash Table Performance in R: Part II In Part I of this series, I explained how R hashed… on R-Bloggers, the source of so much excellent R-related content.
