Archive for the ‘Audio’ Category

Audiogram (New York Public Radio)

Monday, August 1st, 2016

Audiogram from New York Public Radio.

My interest in Audiogram was sparked by the need to convert an audio file into video, so the captioning service at YouTube would provide a rough cut at transcribing the audio.
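
One common way to do that conversion (not necessarily the exact steps I used) is to pair the audio with a static image using ffmpeg. The sketch below just builds the command; the file names are placeholders, and the flags are the commonly documented ones for this task.

```python
# Sketch: turning an audio file into a YouTube-uploadable video by looping
# a still image over the audio track. File names are hypothetical.

def audio_to_video_cmd(audio_path, image_path, out_path):
    """Build an ffmpeg command that loops a single image for the audio's duration."""
    return [
        "ffmpeg",
        "-loop", "1",        # repeat the single image indefinitely
        "-i", image_path,    # video source: a static image
        "-i", audio_path,    # audio source
        "-c:v", "libx264",   # H.264 video, widely accepted by YouTube
        "-c:a", "aac",       # AAC audio
        "-shortest",         # stop encoding when the audio ends
        out_path,
    ]

cmd = audio_to_video_cmd("interview.mp3", "cover.png", "interview.mp4")
print(" ".join(cmd))
```

Once uploaded, YouTube's automatic captioning takes it from there.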

From the post:

Audiogram is a library for generating shareable videos from audio clips.

Here are some examples of the audiograms it creates:

Why does this exist?

Unlike audio, video is a first-class citizen of social media. It’s easy to embed, share, autoplay, or play in a feed, and the major services are likely to improve their video experiences further over time.

Our solution to this problem at WNYC was this library. Given a piece of audio we want to share on social media, we can generate a video with that audio and some basic accompanying visuals: a waveform of the audio, a theme for the show it comes from, and a caption.

For more on the backstory behind audiograms, read this post.

I hope to finish the transcript I obtained from YouTube later this week and will post it, along with all the steps I took to produce it.

Hiding either the process or the result would be poor repayment to all those who have shared so much, like New York Public Radio.

Virtual Kalimba

Saturday, December 26th, 2015

Virtual Kalimba


Visit the site for keyboard shortcuts, tips & tricks, and interactive production of sound!

The website is an experiment in Web Audio by Middle Ear Media.

The Web Audio Tutorials page at Middle Ear Media has eight (8) tutorials on Web Audio.

Demo apps:

Apps are everywhere. While native mobile apps get a lot of attention, web apps have become much more powerful in recent years. Hopefully you can find something here that will stimulate you or improve the quality of your life in some way.

Web Audio Loop Mixer

Web Audio Loop Mixer is a Web Audio experiment created with HTML5, CSS3, JavaScript, and the Web Audio API. This web app is a standalone loop mixer with effects. It allows up to four audio loops to be boosted, attenuated, equalized, panned, muted, and processed with delay or distortion effects in the browser.
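
The core math of a mixer like that is simple enough to sketch. This is a toy model (not Middle Ear Media's actual code): each loop gets a gain, a pan position, and a mute flag, and the stereo mix is the sum of the processed loops.

```python
# Toy stereo loop mixer: sum up to N mono loops into a left/right pair.
# A real Web Audio implementation would use GainNode and StereoPannerNode.

def mix_loops(loops):
    """loops: list of dicts with 'samples' (mono), 'gain', 'pan' (-1..1), 'mute'."""
    n = max(len(l["samples"]) for l in loops)
    left = [0.0] * n
    right = [0.0] * n
    for l in loops:
        if l["mute"]:
            continue
        # Linear pan law: pan=-1 is hard left, 0 is centre, 1 is hard right.
        lg = l["gain"] * (1.0 - l["pan"]) / 2.0
        rg = l["gain"] * (1.0 + l["pan"]) / 2.0
        for i, s in enumerate(l["samples"]):
            left[i] += s * lg
            right[i] += s * rg
    return left, right

left, right = mix_loops([
    {"samples": [1.0, 1.0], "gain": 1.0, "pan": -1.0, "mute": False},  # hard left
    {"samples": [1.0, 1.0], "gain": 0.5, "pan": 0.0, "mute": False},   # centre
])
```

A production mixer would use a constant-power pan law instead of the linear one above, but the signal flow is the same.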

Virtual Kalimba

Virtual Kalimba is a Web Audio experiment created with HTML5, CSS3, and JavaScript. It uses the Web Audio API to recreate a Kalimba, also known as an Mbira or Thumb Piano. This is a traditional African instrument that belongs to the Lamellophone family of musical instruments.

Virtual Hang

Virtual Hang is a Web Audio experiment created with HTML5, CSS3, and JavaScript. It uses the Web Audio API to recreate a Hang, a steel hand pan instrument. The Hang is an amazing musical instrument developed by Felix Rohner and Sabina Schärer in Bern, Switzerland.

War Machine

War Machine is a Web Audio experiment created with HTML5, CSS3, and JavaScript. The App uses the Web Audio API to create a sample pad interface reminiscent of an Akai MPC. The purpose of War Machine is not to promote violence, but rather to create a safe (victimless) environment for the release of excess aggression.

Channel Strip

Channel Strip is a Web Audio experiment created with HTML5, CSS3, JavaScript, and the Web Audio API. This web app is a standalone audio channel strip that allows an audio signal to be boosted, attenuated, equalized, panned, compressed, and muted in the browser. The audio source is derived from user media via a file select input.
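
Of those stages, compression is the least obvious, so here is an illustrative sketch of one: a hard-knee compressor applied sample-by-sample. Real compressors (including Web Audio's DynamicsCompressorNode) use envelope followers with attack and release smoothing; this deliberately omits both.

```python
import math

# Hard-knee compressor sketch: levels above the threshold are reduced
# by the given ratio; levels below it pass through unchanged.

def compress(samples, threshold_db=-10.0, ratio=4.0):
    out = []
    for s in samples:
        level_db = 20 * math.log10(abs(s)) if s != 0 else -120.0
        if level_db > threshold_db:
            # Only the portion above the threshold is divided by the ratio.
            level_db = threshold_db + (level_db - threshold_db) / ratio
        out.append(math.copysign(10 ** (level_db / 20), s))
    return out

loud, quiet = compress([1.0, 0.1])  # 0 dBFS gets squashed; -20 dBFS passes through
```

With a threshold of -10 dB and a 4:1 ratio, a full-scale sample (0 dB) comes out at -7.5 dB, while a -20 dB sample is untouched.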

Task Management

A fast loading Web App for managing tasks online. This App offers functions such as editable list items, removable list items, and it uses localStorage to save your information in your own browser.

On War Machine, the third pad from the left in the top row comes closest to an actual gunshot sound.

It works really well with the chorus from Anders Osborne’s Five Bullets:

Boom, boom, boom, that American sound
Teenage kids on a naked ground
Boom, boom, boom, that American sound
Five bullets in Pigeon Town

For more details on Anders Osborne, including lyrics and tour dates, see: Ya Ya Nation.

I first saw this in a tweet by Chris Ford.

Paradise Lost (John MILTON, 1608 – 1674) Audio Version

Thursday, December 10th, 2015

Paradise Lost (John MILTON, 1608 – 1674) Audio Version.

As you know, John Milton was blind when he wrote Paradise Lost. His only “interface” for writing, editing and correcting was aural.

Shoppers and worshipers need to attend very closely to the rhetoric of the season. Listening to Paradise Lost, even as Milton did, may sharpen your ear for rhetorical devices and words that would otherwise pass unnoticed.

For example, what are the “good tidings” of Christmas hymns? Are they about the “…new born king…” or are they anticipating the sacrifice of that “…new born king…” instead of ourselves?

The first seems traditional and fairly benign; the second seems more self-centered and selfish than the usual Christmas holiday theme.

If you think that is an aberrant view of the holiday, consider that in A Christmas Carol by Charles Dickens, Scrooge (spoiler alert) ends the tale by keeping Christmas in his heart all year round.

One of the morals is that we should treat others kindly and with consideration every day of the year. Not as some modern Christians do, half-listening at an hour-long service once a week and spending the waking portion of the other 167 hours not being Christians.

Paradise Lost is a complex and nuanced text. Learning to spot its rhetorical moves and devices will make you a more discerning observer of modern discourse.


NPR’s new podcast concierge…

Thursday, November 5th, 2015

NPR’s new podcast concierge recommends shows from inside — and outside — public radio by Shan Wang.

From the post:

Seventeen percent of Americans age 12 and older have listened to at least one podcast in a given month, and awareness of the podcasting medium has grown to nearly 50 percent of that population, according to recent data from Edison Research. But if you’re a creator of podcasts, you might see this number as a mere 17 percent, and one that represents a relatively affluent, smartphone-toting slice of society.

With its new tool, billed as “your friendly guide to great podcasts,” NPR is hoping to expand that slice of society — and lower the barrier of entry for people who want to listen to a podcast for the first time but are paralyzed by thousands of options.

“Podcasting is expanding in a way that makes it a competitor with books, music, movies, [and] TV shows,” Michael Oreskes, NPR’s senior VP of news and editorial director, told me.

I pass this along as a good source for podcasts if you want to include them in the mix of data covered by your topic map.

If you create podcasts, include transcripts whenever possible.

800,000 NPR Audio Files!

Wednesday, April 29th, 2015

There Are Now 800,000 Reasons To Share NPR Audio On Your Site by Patrick Cooper.

From the post:

From NPR stories to shows to songs, today we’re making more than 800,000 pieces of our audio available for you to share around the Web. We’re throwing open the doors to embedding, putting our audio on your site.

Complete with simple instructions for embedding!

I often think of topic maps when listening to NPR so don’t be surprised if you start seeing embedded NPR audio in the very near future!


Mapping Your Music Collection [Seeing What You Expect To See]

Saturday, March 14th, 2015

Mapping Your Music Collection by Christian Peccei.

From the post:

In this article we’ll explore a neat way of visualizing your MP3 music collection. The end result will be a hexagonal map of all your songs, with similar sounding tracks located next to each other. The color of different regions corresponds to different genres of music (e.g. classical, hip hop, hard rock). As an example, here’s a map of three albums from my music collection: Paganini’s Violin Caprices, Eminem’s The Eminem Show, and Coldplay’s X&Y.


To make things more interesting (and in some cases simpler), I imposed some constraints. First, the solution should not rely on any pre-existing ID3 tags (e.g. Artist, Genre) in the MP3 files—only the statistical properties of the sound should be used to calculate the similarity of songs. A lot of my MP3 files are poorly tagged anyways, and I wanted to keep the solution applicable to any music collection no matter how bad its metadata. Second, no other external information should be used to create the visualization—the only required inputs are the user’s set of MP3 files. It is possible to improve the quality of the solution by leveraging a large database of songs which have already been tagged with a specific genre, but for simplicity I wanted to keep this solution completely standalone. And lastly, although digital music comes in many formats (MP3, WMA, M4A, OGG, etc.) to keep things simple I just focused on MP3 files. The algorithm developed here should work fine for any other format as long as it can be extracted into a WAV file.

Creating the music map is an interesting exercise. It involves audio processing, machine learning, and visualization techniques.
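
In miniature, that pipeline might look like the sketch below: describe each track only by statistics of its samples (no ID3 tags), then cluster similar tracks. The two toy features here (RMS energy and zero-crossing rate) stand in for the richer spectral features the article computes.

```python
import math
import random

# Toy version of the music-mapping pipeline: per-track feature vectors
# from raw samples, followed by k-means clustering into "genre" groups.

def features(samples):
    """Two simple statistical features of a mono sample list."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / len(samples)
    return (rms, zcr)

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; a real solution might use a self-organizing map instead."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[j].append(p)
        centroids = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[j]
                     for j, cl in enumerate(clusters)]
    return centroids, clusters

# Two well-separated toy "tracks" per group:
centroids, clusters = kmeans([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], k=2)
```

The article goes much further (MFCC-style features and a hexagonal map layout), but the shape of the computation is the same: samples → feature vectors → clusters → picture.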

It would take longer than a weekend to complete this project with a sizable music collection but it would be a great deal of fun!

Great way to become familiar with several Python libraries.

BTW, when I saw Coldplay, I thought of Coal Chamber by mistake. Not exactly the same subject. 😉

I first saw this in a tweet by Kirk Borne.

American Institute of Physics: Oral Histories

Monday, December 15th, 2014

American Institute of Physics: Oral Histories

From the webpage:

The Niels Bohr Library & Archives holds a collection of over 1,500 oral history interviews. These range in date from the early 1960s to the present and cover the major areas and discoveries of physics from the past 100 years. The interviews are conducted by members of the staff of the AIP Center for History of Physics as well as other historians and offer unique insights into the lives, work, and personalities of modern physicists.

Read digitized oral history transcripts online

I don’t have a large audio data-set (see: Shining a light into the BBC Radio archives) but there are lots of other people who do.

If you are teaching or researching physics for the last 100 years, this is a resource you should not miss.

Integrating audio resources such as this one, at less than the full recording level (think of it as audio transclusion), into teaching materials would be a great step forward. To say nothing of being able to incorporate such granular resources into a library catalog.

I did not find an interview with Edward Teller but a search of the transcripts turned up three hundred and five (305) “hits” where he is mentioned in interviews. A search for J. Robert Oppenheimer netted four hundred and thirty-six (436) results.

If you know your atomic bomb history, you can guess between Teller and Oppenheimer which one would support the “necessity” defense for the use of torture. It would be an interesting study to see how the interviewees saw these two very different men.

Shining a light into the BBC Radio archives

Monday, December 15th, 2014

Shining a light into the BBC Radio archives by Yves Raimond, Matt Hynes, and Rob Cooper.

From the post:


One of the biggest challenges for the BBC Archive is how to open up our enormous collection of radio programmes. As we’ve been broadcasting since 1922 we’ve got an archive of almost 100 years of audio recordings, representing a unique cultural and historical resource.

But the big problem is how to make it searchable. Many of the programmes have little or no meta-data, and the whole collection is far too large to process through human efforts alone.

Help is at hand. Over the last five years or so, technologies such as automated speech recognition, speaker identification and automated tagging have reached a level of accuracy where we can start to get impressive results for the right type of audio. By automatically analysing sound files and making informed decisions about the content and speakers, these tools can effectively help to fill in the missing gaps in our archive’s meta-data.

The Kiwi set of speech processing algorithms

COMMA is built on a set of speech processing algorithms called Kiwi. Back in 2011, BBC R&D were given access to a very large speech radio archive, the BBC World Service archive, which at the time had very little meta-data. In order to build our prototype around this archive we developed a number of speech processing algorithms, reusing open-source building blocks where possible. We then built the following workflow out of these algorithms:

  • Speaker segmentation, identification and gender detection (using the LIUM diarization toolkit, diarize-jruby and ruby-lsh). This process is also known as diarisation. Essentially an audio file is automatically divided into segments according to the identity of the speaker. The algorithm can show us who is speaking and at what point in the sound clip.
  • Speech-to-text for the detected speech segments (using CMU Sphinx). At this point the spoken audio is translated as accurately as possible into readable text. This algorithm uses models built from a wide range of BBC data.
  • Automated tagging with DBpedia identifiers. DBpedia is a large database holding structured data extracted from Wikipedia. The automatic tagging process creates the searchable meta-data that ultimately allows us to access the archives much more easily. This process uses a tool we developed called ‘Mango’.
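
The data flow of that three-stage workflow can be sketched as a simple pipeline. The stage implementations below are stand-ins (the post's actual tools are the LIUM toolkit, CMU Sphinx, and the BBC's 'Mango' tagger, none of which are shown in the post); only the shape of the processing is illustrated.

```python
# Schematic of the diarise -> transcribe -> tag workflow described above.
# All three stages are stubs; a real system would call external tools/models.

def diarise(audio):
    """Split audio into (speaker, clip) segments. Stub: fixed 50/50 split."""
    return [("speaker_1", audio[: len(audio) // 2]),
            ("speaker_2", audio[len(audio) // 2:])]

def transcribe(clip):
    """Speech-to-text stub; a real system would run an ASR model here."""
    return f"<transcript of {len(clip)} samples>"

def tag(text):
    """Entity-tagging stub; a real system would link phrases to DBpedia URIs."""
    return ["dbpedia:Placeholder"]

def process(audio):
    """Run each diarised segment through transcription and tagging."""
    records = []
    for speaker, clip in diarise(audio):
        text = transcribe(clip)
        records.append({"speaker": speaker, "text": text, "tags": tag(text)})
    return records

records = process([0.0] * 100)  # a dummy "audio file" of 100 samples
```

The per-segment records are exactly the searchable meta-data the archive is missing: who spoke, what they (probably) said, and which entities were mentioned.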


COMMA is due to launch some time in April 2015. If you’d like to be kept informed of our progress you can sign up for occasional email updates here. We’re also looking for early adopters to test the platform, so please contact us if you’re a cultural institution, media company or business that has a large audio data-set you want to make searchable.

This article was written by Yves Raimond (lead engineer, BBC R&D), Matt Hynes (senior software engineer, BBC R&D) and Rob Cooper (development producer, BBC R&D)

I don’t have a large audio data-set but I am certainly going to be following this project. The results should be useful in and of themselves, to say nothing of being a good starting point for further tagging. I wonder if the BBC Sanskrit broadcasts are going to be available? I will have to check on that.

Without diminishing the achievements of other institutions, the efforts of the BBC, the British Library, and the British Museum are truly remarkable.

I first saw this in a tweet by Mike Jones.