Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

October 8, 2013

Splunk Enterprise 6

Filed under: Intelligence,Machine Learning,Operations,Splunk — Patrick Durusau @ 3:27 pm

Splunk Enterprise 6

The latest version of Splunk is described as:

Operational Intelligence for Everyone

Splunk Enterprise is the leading platform for real-time operational intelligence. It’s the easy, fast and secure way to search, analyze and visualize the massive streams of machine data generated by your IT systems and technology infrastructure—physical, virtual and in the cloud.

Splunk Enterprise 6 is our latest release and delivers:

  • Powerful analytics for everyone—at amazing speeds
  • Completely redesigned user experience
  • Richer developer environment to easily extend the platform

The current download page offers the Enterprise version free for 60 days. At the end of that period you can convert to a Free license or purchase an Enterprise license.

June 26, 2013

Hunk: Splunk Analytics for Hadoop Beta

Filed under: Analytics,Splunk — Patrick Durusau @ 1:28 pm

Hunk: Splunk Analytics for Hadoop Beta

From the post:

Hunk is a new software product to explore, analyze and visualize data in Hadoop. Building upon Splunk’s years of experience with big data analytics technology deployed at thousands of customers, it drives dramatic improvements in the speed and simplicity of interacting with and analyzing data in Hadoop without programming, costly integrations or forced data migrations.

  • Splunk Virtual Indexing (patent pending) – Explore, analyze and visualize data across multiple Hadoop distributions as if it were stored in a Splunk index
  • Easy to Deploy and Drive Fast Value – Simply point Hunk at your Hadoop cluster and start exploring data immediately
  • Interactive Analysis of Data in Hadoop – Drive deep analysis, pattern detection and find anomalies across terabytes and petabytes of data. Correlate data to spot trends and identify patterns of interest

I think this is the line that will catch most readers:

Hunk is compatible with virtually every leading Hadoop distribution. Simply point it at your Hadoop cluster and start exploring and analyzing your data within minutes.

Professional results may take longer, but results within minutes will please most users.
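The virtual index is what makes the minutes claim plausible: once one is configured, data in Hadoop is searched with the same SPL you would aim at a native index. A minimal sketch, assuming a configured virtual index named hadoop_vix holding web access logs (both names are mine, not Splunk's):

index=hadoop_vix sourcetype=access_combined
| timechart span=1h count by status

As I understand it, Hunk translates such searches into MapReduce jobs behind the scenes, which is where the "no forced data migrations" claim comes from.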

December 21, 2012

Connecting Splunk and Hadoop

Filed under: Hadoop,Splunk — Patrick Durusau @ 6:23 am

Connecting Splunk and Hadoop by Ledion Bitincka.

From the post:

Finally I am getting some time to write about some cool features of one of the projects that I’ve been working on – Splunk Hadoop Connect. This app is our first step in integrating Splunk and Hadoop. In this post I will cover three tips on how this app can help you, all of them based on the new search command included in the app: hdfs. Before diving into the tips I would encourage you to download, install and configure the app first. I’ve also put together two screencast videos to walk you through the installation process:

Installation and Configuration for Hadoop Connect
Kerberos Configuration

You can also find the full documentation for the app here.
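All three tips turn on that hdfs command. I have not verified the exact syntax, so treat the following as a sketch pieced together from the post and the app documentation; the namenode URI and paths are hypothetical:

| hdfs ls hdfs://namenode:8020/user/splunk/

| hdfs read hdfs://namenode:8020/user/splunk/weblogs/part-00000 | head 10

The first lists an HDFS directory from the Splunk search bar; the second pulls an HDFS file into a search pipeline.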

Cool!

Is it just me or is sharing data across applications becoming more common?

I'm thinking the greater the sharing, the greater the need for mapping data semantics for integration.

December 20, 2012

Splunk SDKs for Java and Python

Filed under: BigData,Splunk — Patrick Durusau @ 7:07 pm

Splunk Announces New Development Tools to Extend the Power of Splunk Software

From the post:

Splunk Inc. (NASDAQ: SPLK), the leading software platform for real-time operational intelligence, today announced the general availability (GA) of new software development kits (SDKs) for Java and Python. SDKs make it easier for developers to customize and extend the power of Splunk® Enterprise, enabling real-time big data insights across the organization. Splunk previously released the GA version of the Splunk SDK for JavaScript for Splunk Enterprise 5. The Splunk SDK for PHP is in public preview.

“Our mission at Splunk is to lower the barriers for organizations to gain operational intelligence from machine data,” said Paul Sanford, general manager of developer platform, Splunk. “We want to empower developers to build big data applications on the Splunk platform and to understand that you don’t need large-scale development efforts to get big value. That’s a key driver behind the development of these SDKs, helping developers quickly get started with Splunk software, leveraging their existing language skills and driving rapid time to value.”

“Building a developer community around a software platform requires a strong commitment to a low barrier to entry. This applies to every step of the adoption process, from download to documentation to development. Splunk’s focus on SDKs for some of the most popular programming languages, with underlying REST-based APIs, supports its commitment to enabling software developers to easily build applications,” said Donnie Berkholz, Ph.D., IT industry analyst, RedMonk.

Just in time for the holidays!

Downloads:

Splunk Enterprise

Splunk SDK for Java

Splunk SDK for JavaScript

Splunk SDK for PHP

Splunk SDK for Python

I first saw this in a tweet by David Fauth.

November 30, 2012

Campaign Finance Data in Splunk [Cui bono?]

Filed under: Government,Government Data,Splunk — Patrick Durusau @ 5:29 pm

Two posts you may find interesting:

SPLUNK’D: Federal Election Commission Campaign Finance Data

and,

Splunk4Good Announces public data project highlighting FEC Campaign Finance Data

Project link.

The project reveals answers to our burning questions:

  • What state gives the most?
  • Which state gives the most per capita? (Bet you won’t guess this one!)
  • What does aggregate giving look like visualized over the election cycle?
  • Is your city more Red or more Blue?
  • What does a map viz with drilldown reveal about giving by zip codes or cities?
  • What occupation gives the most?
  • Are geologists more Red or more Blue? (Hint: think about where geologists live and who they work for! See the sketch just below.)
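For a sense of how little SPL an answer takes, here is a hedged sketch of the geologist question, assuming the indexed FEC events carry hypothetical fields named occupation, party and amount:

index=fec occupation="GEOLOGIST"
| stats sum(amount) as total by party
| sort - total

One stats command does the arithmetic; deciding what the fields mean is the hard part.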

Impressive performance but some of my burning questions would be:

  • Closing which tax loopholes would impact particular taxpayers who contributed to X political campaign?
  • Which legislative provisions benefit particular taxpayers or their investments?
  • Which regulations by federal agencies benefit particular taxpayers or their businesses?

The FEC data isn’t all you would need to answer those questions. But the answers are known.

Someone asked for the benefits in all three cases. Someone wrote the laws, regulations or loopholes with the intent to grant those benefits.

Not all of those are dishonest. Consider the charitable contributions that sustain fine art, music, libraries and research that benefits all of us.

There are other benefits that are less benign.

Identifying the givers, recipients, legislation/regulation and the benefit would require collocating data from disparate domains and vocabularies.

Interested?

October 15, 2012

Exploring Splunk: Search Processing Language (SPL) Primer and Cookbook

Filed under: Data Analysis,Searching,Splunk — Patrick Durusau @ 2:35 pm

Exploring Splunk: Search Processing Language (SPL) Primer and Cookbook by David Carasso.

From the webpage:

Splunk is probably the single most powerful tool for searching and exploring data you will ever encounter. Exploring Splunk provides an introduction to Splunk — a basic understanding of Splunk’s most important parts, combined with solutions to real-world problems.

Part I: Exploring Splunk

  • Chapter 1 tells you what Splunk is and how it can help you.
  • Chapter 2 discusses how to download Splunk and get started.
  • Chapter 3 discusses the search user interface and searching with Splunk.
  • Chapter 4 covers the most commonly used search commands.
  • Chapter 5 explains how to visualize and enrich your data with knowledge.

Part II: Solution Recipes

  • Chapter 6 covers the most common monitoring and alerting solutions.
  • Chapter 7 covers the most common transaction solutions.
  • Chapter 8 covers the most common lookup table solutions.

My Transaction Searching: Unifying Field Names post is based on an excerpt from this book.

You can download the book in ePub, PDF or Kindle versions, or order a hardcopy.

Documentation that captures the interest of a reader.

Not documentation that warns them the software is going to be painful, even if beneficial in the long term.

Most projects could benefit from using “Exploring Splunk” as a model for introductory documentation.

Transaction Searching: Unifying Field Names

Filed under: Merging,Splunk — Patrick Durusau @ 2:14 pm

Transaction Searching: Unifying Field Names posted by David Carasso.

From the post:

Problem

You need to build transactions from multiple data sources that use different field names for the same identifier.

Solution

Typically, you can join transactions with common fields like:

... | transaction username

But when the username identifier is called different names (login, name, user, owner, and so on) in different data sources, you need to normalize the field names.

If sourcetype A only contains field_A and sourcetype B only contains field_B, create a new field called field_Z which is either field_A or field_B, depending on which is present in an event. You can then build the transaction based on the value of field_Z.

sourcetype=A OR sourcetype=B
| eval field_Z = coalesce(field_A, field_B)
| transaction field_Z

Looks a lot like a topic map merging operation doesn’t it?

But “looks a lot like” doesn’t mean it is “the same as” a topic map merging operation.

How would you say it is different?

While the outcome may be the same as a merging operation (merging operations being legend-defined), I would say that I don't know how we got from A or B to Z.

That is, next month, six months from now, or even two years down the road, I have C and I want to modify this transaction.

Question: Can I safely modify this transaction to add C?

I suspect the answer is:

“We don’t know. Have to go back to confirm what A and B (as well as C) mean and get back to you on that question.”
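To be clear, the mechanical edit is trivial. A sketch, assuming C's identifier lives in a hypothetical field_C:

sourcetype=A OR sourcetype=B OR sourcetype=C
| eval field_Z = coalesce(field_A, field_B, field_C)
| transaction field_Z

Nothing in coalesce() records why those three fields are treated as the same identifier. That is precisely what has to be researched before the edit is safe.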

For a toy example that seems like overkill, but what if you have thousands of columns spread over hundreds of instances of active data systems?

Still feel confident about offering an answer without researching it?

Topic map based merging could give you that confidence.

Even if, like Scotty, you say two weeks, have the answer later that day, and let it age a bit before delivering it ahead of schedule. 😉

October 9, 2012

Code for America: open data and hacking the government

Filed under: Government,Government Data,Open Data,Open Government,Splunk — Patrick Durusau @ 12:50 pm

Code for America: open data and hacking the government by Rachel Perkins.

From the post:

Last week, I attended the Code for America Summit here in San Francisco. I attended as a representative of Splunk>4Good (we sponsored the event via a nice outdoor patio lounge area and gave away some of our (in)famous t-shirts and a few ponies). Since this wasn’t your typical “conference”, and I’m not so great at schmoozing, I was a little nervous–what would Christy Wilson, Clint Sharp, and I do there? As it turned out, there were so many amazing takeaways and so much potential for awesomeness that my nervousness was totally unfounded.

So what is Code for America?

Code for America is a program that sends technologists (who take a year off and apply to their Fellowship program) to cities throughout the US to work with advocates in city government. When they arrive, they spend a few weeks touring the city and its outskirts, meeting residents, getting to know the area and its issues, and brainstorming about how the city can harness its public data to improve things. Then they begin to hack.

Some of these partnerships have come up with amazing tools–for example,

  • OpenCounter Santa Cruz mashes up several public datasets to provide tactical and strategic information for persons looking to start a small business: what forms and permits you’ll need, zoning maps with overlays of information about other businesses in the area, and then partners with http://codeforamerica.github.com/sitemybiz/ to help you find commercial space for rent that matches your zoning requirements.
  • Another Code for America Fellow created blightstatus.org, which uses public data in New Orleans to inform residents about the status and plans for blighted properties in their area.
  • Other apps from other cities do cool things like help city maintenance workers prioritize repairs of broken streetlights based on other public data like crime reports in the area, time of day the light was broken, and number of other broken lights in the vicinity, or get the citizenry involved with civic data, government, and each other by setting up a Stack Exchange type of site to ask and answer common questions.

Whatever your view of data sharing by the government (too little, too much, just right), Rachel points to good things that can come from open data.

Splunk has a “corporate responsibility” program: Splunk>4Good.

Check it out!

BTW, do you have a topic maps “corporate responsibility” program?

September 26, 2012

Splunk’s Software Architecture and GUI for Analyzing Twitter Data

Filed under: CS Lectures,Splunk,Tweets — Patrick Durusau @ 1:24 pm

Splunk’s Software Architecture and GUI for Analyzing Twitter Data by Marti Hearst.

From the post:

Today we learned about an alternative software architecture for processing large data, getting the technical details from Splunk’s VP of Engineering, Stephen Sorkin. Splunk also has a really amazing GUI for analyzing Twitter and other data sources in real time; be sure to watch the last 15 minutes of the video to see the demo.

Someone needs to organize a “big data tool of the month” club!

Or at the rate of current development, would that be a “big data tool of the week” club?

September 5, 2012

Exploring Twitter Data

Filed under: Splunk,Tweets — Patrick Durusau @ 4:32 pm

Exploring Twitter Data

From the post:

Want to explore popular content on Twitter with Splunk queries? The new Twitter App for Splunk 4.3 provides a scripted input that automatically extracts data from Twitter’s public 1% sample stream.
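Once the scripted input is indexing the sample stream, ordinary SPL applies. A hedged sketch, assuming the app writes to an index named twitter and leaves the tweet JSON intact (the field path is my guess at Twitter's entity structure):

index=twitter
| spath output=hashtag path=entities.hashtags{}.text
| top limit=10 hashtag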

What could be better? Watching a twitter stream and calling it work. 😉

April 19, 2012

7 top tools for taming big data

Filed under: BigData,Jaspersoft,Karmasphere,Pentaho,Skytree,Splunk,Tableau,Talend — Patrick Durusau @ 7:20 pm

7 top tools for taming big data by Peter Wayner.

Peter covers:

  • Jaspersoft BI Suite
  • Pentaho Business Analytics
  • Karmasphere Studio and Analyst
  • Talend Open Studio
  • Skytree Server
  • Tableau Desktop and Server
  • Splunk

Not as close to the metal as Lucene/Solr, Hadoop, HBase, Neo4j, and many other packages but not bad starting places.

Do be mindful of Peter’s closing paragraph:

At a recent O’Reilly Strata conference on big data, one of the best panels debated whether it was better to hire an expert on the subject being measured or an expert on using algorithms to find outliers. I’m not sure I can choose, but I think it’s important to hire a person with a mandate to think deeply about the data. It’s not enough to just buy some software and push a button.

December 21, 2011

Three New Splunk Developer Platform Offerings

Filed under: Java,Javascript,Python,Splunk — Patrick Durusau @ 7:24 pm

Three New Splunk Developer Platform Offerings

From the post:

Last week was a busy week for the Splunk developer platform team. We pushed live 2 SDKs within one hour! We are excited to announce the release of:

  • Java SDK Preview on GitHub. The Java SDK enables our growing base of customers to share and harness the core Splunk platform and the valuable data stored in Splunk across the enterprise. The SDK ships with a number of examples including an explorer utility that provides the ability to explore the components and configuration settings of a Splunk installation. Learn more about the Java SDK.
  • JavaScript SDK Preview on GitHub. The JavaScript SDK takes big data to the web by providing developers with the ability to easily integrate visualizations into custom applications. Now developers can take the timeline view and charting capabilities of Splunk’s out-of-the-box web interface and include them in their custom applications. Additionally, with node.js support on the server side, developers can build end-to-end applications completely in JavaScript. Learn more about the JavaScript SDK.
  • Splunk Developer AMI. A developer-focused publicly available Linux Amazon Machine Image (AMI) that includes all the Splunk SDKs and Splunk 4.2.5. The Splunk Developer AMI will make it easier for developers to try the Splunk platform. To enhance the usability of the image, developers can sign up for a free developer license trial, which can be used with the AMI. Read our blog post to learn more about the developer AMI.

The delivery of the Java and JavaScript SDKs, coupled with our existing Python SDK (GitHub), reinforces our commitment to developer enablement by providing more language choice for application development, and putting the SDKs on the Splunk Linux AMI expedites the getting-started experience.

We are seeing tremendous interest in our developer community and customer base for Splunk to play a central role facilitating the ability to build innovative applications on top of a variety of data stores that span on-premises, cloud and mobile.

We are enabling developers to build complex Big Data applications for a variety of scenarios including:

  • Custom built visualizations
  • Reporting tool integrations
  • Big Data and relational database integrations
  • Complex event processing

Not to mention being just in time for the holidays! 😉

Seriously, tools to do useful work with “big data” are coming online. The question is going to be the skill with which they are applied.

December 6, 2011

Introducing Shep

Filed under: Hadoop,Shep,Splunk — Patrick Durusau @ 7:57 pm

Introducing Shep

From the post:

These are exciting times at Splunk, and for Big Data. During the 2011 Hadoop World, we announced our initiative to combine Splunk and Hadoop in a new offering. The heart of this new offering is an open source component called Shep. Shep is what will enable seamless two-way data-flow across the two systems, as well as opening up two-way compute operations across data residing in both systems.

Use Cases

The thing that intrigues us most is the synergy between Splunk and Hadoop. The ways to integrate are numerous, and as the field evolves and the project progresses, we can see more and more opportunities to provide powerful solutions to common problems.

Many of our customers are indexing terabytes per day, and have also spun up Hadoop initiatives in other parts of the business. Splunk integration with Hadoop is part of a broader goal at Splunk to break down barriers to data-silos, and open them up to availability across the enterprise, no matter what the source. To itemize some categories we’re focused on, listed here are some key use cases:

  • Query both Splunk and Hadoop data, using Splunk as a “single-pane-of-glass”
  • Data transformation utilizing Splunk search commands
  • Real-time analytics of data streams going to multiple destinations
  • Splunk as data warehouse/marts for targeted exploration of HDFS data
  • Data acquisition from logs and APIs via Splunk Universal Forwarder

Read the post to learn which features Shep supports now and which are coming soon.

Now in private beta but it sounds worthy of a “heads up!”
