Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 15, 2016

BBC World Service – In 40 Languages [Non-U.S. Centric Topic Mappers Take Note]

Filed under: BBC,Language,News — Patrick Durusau @ 9:56 pm

BBC World Service announces biggest expansion ‘since the 1940s’

From the post:

The BBC World Service will launch 11 new language services as part of its biggest expansion “since the 1940s”, the corporation has announced.

The expansion is a result of the funding boost announced by the UK government last year.

The new languages will be Afaan Oromo, Amharic, Gujarati, Igbo, Korean, Marathi, Pidgin, Punjabi, Telugu, Tigrinya, and Yoruba.

The first new services are expected to launch in 2017.

“This is a historic day for the BBC, as we announce the biggest expansion of the World Service since the 1940s,” said BBC director general Tony Hall.

“The BBC World Service is a jewel in the crown – for the BBC and for Britain.

“As we move towards our centenary, my vision is of a confident, outward-looking BBC which brings the best of our independent, impartial journalism and world-class entertainment to half a billion people around the world.

Excellent!

The BBC World Service is the starting place to broaden your horizons.

In English, “all shows” lists 1,831 shows.

I prefer reading over listening but have resolved to start exploring the world of the BBC.

November 4, 2016

BBC News Could Do Better: Scottish witchcraft book published online

Filed under: BBC,Books — Patrick Durusau @ 7:41 pm

Scottish witchcraft book published online

From the post:

The Names of Witches in Scotland, 1658 collection, was drawn up during a time when the persecution of supposed witches was rife.

The book also lists the towns where the accused lived and notes of confession.

It is believed many were healers, practicing traditional folk medicine.

Some of the notes give small insights into the lives of those accused.

It is recorded that the spouse of Agnes Watsone, from Dumbarton, is “umquhile” (deceased).

A majority of those accused of witchcraft were women although the records reveal that some men were also persecuted.

Jon Gilchreist and Robert Semple, from Dumbarton, are recorded as sailors. A James Lerile of Alloway, Ayr, is noted as “clenged”, in other words cleaned or made clean.

While Mr Lerile’s fate is unclear, the term probably meant banishment or death.

I’m glad BBC News drew attention to this volume but the only links in the post go to a very annoying commercial site that has transcribed the work.

🙁

With very little effort, I can send you to images of the original:

Names of the witches (in Scotland) 1658.

Some readers (cough) may find the commercial service useful. OK, but BBC News should include links to originals, especially when those are sans annoying subscription requests.

October 21, 2015

Who’s talking about what [BBC News Labs]

Filed under: BBC,News,Searching — Patrick Durusau @ 10:21 am

Who’s talking about what – See who’s talking about what across hundreds of news sources.

Imagine comparing the coverage of news feeds from approximately 350 sources (you choose), with granular date ranges (instead of last 24 hours, last week, last month, last year) plus, “…AND, OR, NOT and parenthesis in queries.” The interface shows co-occurring topics as well.

BBC News Labs did more than +1! a great idea: they implemented it and posted their code.

From the webpage:

Inspired by a concept created by Adam Ramsay, Zoe Blackler & Iain Collins at the Center for Investigative Reporting design sprint on Climate Change

Implementation by Iain Collins and Sylvia Tippmann, using data from the BBC News Labs Juicer | View Source

What conclusions would you draw from reports starting September 1, 2015 to date, for “violence AND Israel”?

[Images: bbc-news-wat4, bbc-news-wat2]

One story only illustrates the power of this tool to create comparisons between news sources. Drawing conclusions about news sources requires systematic study of sources across a range of stories. The ability to do precisely that has fallen into your lap.
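To make the boolean querying concrete, here is a minimal Python sketch of evaluating AND, OR, NOT and parentheses against a pile of articles and counting matches per source. It is not the BBC News Labs code. The source names, headlines, and function names (`tokenize`, `parse`, `matches`, `count_by_source`) are all made up for illustration, and matching is a naive whole-word lookup.

```python
import re
from collections import Counter

OPERATORS = {"AND", "OR", "NOT"}

def tokenize(query):
    # Parentheses become their own tokens; everything else splits on whitespace.
    return re.findall(r"\(|\)|[^\s()]+", query)

def parse(tokens):
    """Recursive-descent parser. Precedence (an assumption): NOT > AND > OR."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():
        if peek() == "NOT":
            take()
            return ("NOT", parse_not())
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok == ")" or tok in OPERATORS:
            raise ValueError("unexpected token: " + tok)
        return ("TERM", tok.lower())

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree

def matches(tree, words):
    # Evaluate the parsed query against the set of words in one article.
    kind = tree[0]
    if kind == "TERM":
        return tree[1] in words
    if kind == "NOT":
        return not matches(tree[1], words)
    left, right = matches(tree[1], words), matches(tree[2], words)
    return (left and right) if kind == "AND" else (left or right)

def count_by_source(query, articles):
    # articles: iterable of (source_name, text) pairs.
    tree = parse(tokenize(query))
    counts = Counter()
    for source, text in articles:
        words = set(re.findall(r"\w+", text.lower()))
        if matches(tree, words):
            counts[source] += 1
    return counts

# Invented sample data, purely for illustration.
ARTICLES = [
    ("Source A", "Flood warnings issued as climate talks open"),
    ("Source A", "Drought hits harvest"),
    ("Source B", "Climate summit ends; drought relief agreed"),
    ("Source B", "Local football results"),
]

if __name__ == "__main__":
    print(count_by_source("climate AND (flood OR drought)", ARTICLES))
    print(count_by_source("drought AND NOT climate", ARTICLES))
```

Running the same queries over many stories and date ranges, rather than one, is what would turn per-source counts like these into the systematic comparison described above.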

I first saw this in a tweet by Nick Diakopoulos.

December 15, 2014

Shining a light into the BBC Radio archives

Filed under: Archives,Audio,Auto Tagging,BBC,British Library,British Museum,Radio — Patrick Durusau @ 9:23 am

Shining a light into the BBC Radio archives by Yves Raimond, Matt Hynes, and Rob Cooper.

From the post:

[Image: comma]

One of the biggest challenges for the BBC Archive is how to open up our enormous collection of radio programmes. As we’ve been broadcasting since 1922 we’ve got an archive of almost 100 years of audio recordings, representing a unique cultural and historical resource.

But the big problem is how to make it searchable. Many of the programmes have little or no meta-data, and the whole collection is far too large to process through human efforts alone.

Help is at hand. Over the last five years or so, technologies such as automated speech recognition, speaker identification and automated tagging have reached a level of accuracy where we can start to get impressive results for the right type of audio. By automatically analysing sound files and making informed decisions about the content and speakers, these tools can effectively help to fill in the missing gaps in our archive’s meta-data.

The Kiwi set of speech processing algorithms

COMMA is built on a set of speech processing algorithms called Kiwi. Back in 2011, BBC R&D were given access to a very large speech radio archive, the BBC World Service archive, which at the time had very little meta-data. In order to build our prototype around this archive we developed a number of speech processing algorithms, reusing open-source building blocks where possible. We then built the following workflow out of these algorithms:

  • Speaker segmentation, identification and gender detection (using LIUM diarization toolkit, diarize-jruby and ruby-lsh). This process is also known as diarisation. Essentially an audio file is automatically divided into segments according to the identity of the speaker. The algorithm can show us who is speaking and at what point in the sound clip.
  • Speech-to-text for the detected speech segments (using CMU Sphinx). At this point the spoken audio is translated as accurately as possible into readable text. This algorithm uses models built from a wide range of BBC data.
  • Automated tagging with DBpedia identifiers. DBpedia is a large database holding structured data extracted from Wikipedia. The automatic tagging process creates the searchable meta-data that ultimately allows us to access the archives much more easily. This process uses a tool we developed called ‘Mango’.

…

COMMA is due to launch some time in April 2015. If you’d like to be kept informed of our progress you can sign up for occasional email updates here. We’re also looking for early adopters to test the platform, so please contact us if you’re a cultural institution, media company or business that has large audio data-set you want to make searchable.

This article was written by Yves Raimond (lead engineer, BBC R&D), Matt Hynes (senior software engineer, BBC R&D) and Rob Cooper (development producer, BBC R&D)

I don’t have a large audio data-set but I am certainly going to be following this project. The results should be useful in and of themselves, to say nothing of being a good starting point for further tagging. I wonder if the BBC Sanskrit broadcasts are going to be available? I will have to check on that.
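As a rough picture of how the three quoted stages (diarisation, speech-to-text, tagging) hand data to each other, here is a small Python sketch. It does not call LIUM, CMU Sphinx, or Mango. Every stage is a hypothetical stand-in with canned output, and the `Segment` class, the function names, and the tiny keyword-to-DBpedia table are my own assumptions, there only to show the data flow.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    speaker: str                 # label assigned by diarisation, e.g. "S0"
    start: float                 # seconds from the start of the recording
    end: float
    text: str = ""               # filled in by speech-to-text
    tags: list = field(default_factory=list)  # DBpedia URIs from tagging

# Illustrative lookup standing in for a real tagger.
DBPEDIA = {
    "radio": "http://dbpedia.org/resource/Radio",
    "archive": "http://dbpedia.org/resource/Archive",
}

def fake_diarise(audio):
    # Stand-in: a real diariser would derive these boundaries from the audio.
    return [Segment("S0", 0.0, 4.0), Segment("S1", 4.0, 9.5)]

def fake_transcribe(audio, segment):
    # Stand-in: a real recogniser would decode the audio between start and end.
    canned = {
        "S0": "welcome to the radio programme",
        "S1": "today we open the archive",
    }
    return canned[segment.speaker]

def keyword_tag(text):
    # Naive tagging: exact keyword match against the lookup table.
    words = text.split()
    return [uri for word, uri in DBPEDIA.items() if word in words]

def run_pipeline(audio, diarise, transcribe, tag):
    # Stage 1: who speaks when. Stage 2: what they say. Stage 3: what it is about.
    segments = diarise(audio)
    for seg in segments:
        seg.text = transcribe(audio, seg)
        seg.tags = tag(seg.text)
    return segments

def build_index(segments):
    # Invert the tags so a search by URI finds (speaker, start time) pairs.
    index = {}
    for seg in segments:
        for uri in seg.tags:
            index.setdefault(uri, []).append((seg.speaker, seg.start))
    return index

if __name__ == "__main__":
    segments = run_pipeline(b"", fake_diarise, fake_transcribe, keyword_tag)
    for uri, hits in build_index(segments).items():
        print(uri, hits)
```

Passing the stages in as functions keeps the sketch honest about what it does not know: swap any stand-in for a real tool and the rest of the flow stays the same.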

Without diminishing the achievements of other institutions, the efforts of the BBC, the British Library, and the British Museum are truly remarkable.

I first saw this in a tweet by Mike Jones.

October 17, 2014

BBC Genome Project

Filed under: BBC,News — Patrick Durusau @ 4:50 pm

BBC Genome Project

From the post:

This site contains the BBC listings information which the BBC printed in Radio Times between 1923 and 2009. You can search the site for BBC programmes, people, dates and Radio Times editions.

We hope it helps you find that long forgotten BBC programme, research a particular person or browse your own involvement with the BBC.

This is a historical record of both the planned output and the BBC services of any given time. It should be viewed in this context and with the understanding that it reflects the attitudes and standards of its time – not those of today.

Join in

You can join in and become part of the community that is improving this resource. As a result of the scanning process there are lots of spelling mistakes and punctuation errors and you can edit the entries to accurately reflect the magazine entry. You can also tell us when the schedule changed and we will hold on to that information for the next stage of this project.

What a delightful resource to find on a Friday!

True, no links to the original programs but perhaps someday?

Enjoy!

I first saw this in a tweet by Tom Loosemore.


Update: Genome: behind the scenes by Andy Armstrong.

From the post:

In October 2011 Helen Papadopoulos wrote about the Genome project – a mammoth effort to digitise an issue of the Radio Times from every week between 1923 and 2009 and make searchable programme listings available online.

Helen expected there to be between 3 and 3.5 million programme entries. Since then the number has grown to 4,423,653 programmes from 4,469 issues. You can now browse and search all of them at http://genome.ch.bbc.co.uk/

Back in 2011 the process of digitising the scanned magazines was well advanced and our thoughts were turning to how to present the archive online. It’s taken three years and a few prototypes to get us to our first public release.

Andy gives you the backend view of the BBC Genome Project.
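For a sense of scale, the quoted figures can be turned into a few quick ratios. This is only back-of-the-envelope Python on the numbers in Andy's post. Treating 1923 to 2009 as 87 full years of 52 weekly issues is my simplifying assumption, not something the post states.

```python
# Figures quoted in the post.
programmes = 4_423_653
issues = 4_469
expected_low, expected_high = 3_000_000, 3_500_000

# Assumption: 87 full years (1923 through 2009) at 52 weekly issues each.
years = 2009 - 1923 + 1
assumed_weeks = years * 52

per_issue = programmes / issues                # average listings per issue
coverage = issues / assumed_weeks              # share of assumed weeks digitised
over_estimate = programmes / expected_high - 1 # overshoot of the upper estimate

if __name__ == "__main__":
    print(f"programmes per issue: {per_issue:.1f}")
    print(f"issues vs assumed weeks: {issues} of {assumed_weeks} ({coverage:.1%})")
    print(f"above the upper 2011 estimate by {over_estimate:.1%}")
```

That works out to roughly 990 programme entries per issue, and a final count about a quarter above the top of the original estimate.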

I first saw this in a tweet by Jem Stone.

May 30, 2014

BBC Radio Explorer:…

Filed under: BBC,News,Reporting — Patrick Durusau @ 1:56 pm

BBC Radio Explorer: a new way to listen to radio by James Cridland.

From the post:

The BBC has quietly released a prototype service called BBC Radio Explorer.

The service is the result of “10% time”, a loose concept that allows the BBC’s software engineers time to develop and play about with things. Unusually, this one is visible to the public, if you know where to look. But, with a quiet announcement on Twitter and no press release, you’ll be forgiven to not know it exists. That’s by design: since it’s not finished: every page tells us it’s “work-in-progress”.

BBC Radio Explorer is a relatively simple idea. Type something that you’re interested in, and the service plays you clips and programmes that it thinks you’ll like: one after the other. It’s a different way to listen to the BBC’s speech radio output, and it should unearth a lot of interesting programming from the BBC.

Technically, it’s nicely done: type a topic, and it instantly starts playing some audio. The BBC’s invested some time in clipping some of their programmes into small chunks, and typically you’ll get a little bit of the Today programme, or BBC Radio 5 live’s breakfast show, as well as longer-form programmes. You can skip forward and back to different clips, and a quite clever progress bar shows you images of what’s coming up, while the current programme slowly disappears. It’s a responsive site, and apparently works well on iOS devices too, though Android support is lacking.
….

James compares similar services and discusses a number of shortcomings of the service.

An old and familiar one is the inadequacy of BBC Radio Explorer search capabilities. Not unique to the BBC but common across search engines everywhere.

But on the whole, James takes this to be a worthwhile venture and I would have to agree.

Unless and until users become more vocal about what is lacking in current search capabilities, business as usual will prevail as search engines tweak their results to sell more ads.
