Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 9, 2012

Books, Bookstores, Catalogs [30% Digital by end of 2012, Books as Islands/Silos]

Filed under: Books,Publishing — Patrick Durusau @ 2:12 pm

Books, Bookstores, Catalogs by Kevin Hillstrom.

From the post:

The parallels between books, bookstores, and catalogs are significant.

So take fifty minutes this weekend, and watch this session that was recently broadcast on BookTV, titled “The Future of the Book and Bookstore”.

This is fifty minutes of absolutely riveting television, seriously! Boring setting, riveting topic.

Jim Milliot (Publishers Weekly) tossed out an early tidbit: 30% of book sales will be digital by the end of 2012.

Lissa Muscatine, Politics & Prose bookstore owner: When books are a smaller part of the revenue stream, have to diversify the revenue stream. Including print on demand from a catalog of 7 million books.

Sam Dorrance, Potomac Books (publisher): Hard copy sales will likely decrease by ten percent (10%) per year for the next several years.

Recurrent theme: Independent booksellers can provide guidance to readers. Not the same thing as “recommendation” because it is more nuanced.

Rafe Sagalyn, Sagalyn Literary Agency: Now a buyer’s market. Almost parity between hard copy and ebook sales.

Great panel but misses the point that books, hard copy or digital, remain isolated islands/silos.

Want to have a value-add that is revolutionary?

Create links across Kindle and other electronic formats, so that licensed users are not isolated within single works.

Did I hear someone say topic maps?

August 9, 2012

The Bookless Library

Filed under: Books,Library — Patrick Durusau @ 3:45 pm

The Bookless Library by David A. Bell. (New Republic, July 12, 2012)

Although Bell is quick to dismiss the notion of libraries without physical books, the confusion of libraries with physical books is one that has hurt the cause of libraries.

He remarks:

Libraries are also sources of crucial expertise. Librarians do not just maintain physical collections of books. Among other things, they guide readers, maintain catalogues, develop access portals for electronic sources, organize special programs and exhibitions, oversee special collections, and make acquisition decisions. The fact that more and more acquisition decisions now involve a question of which databases to subscribe to, rather than which physical books and journals to buy, does not make these functions any less important. To the contrary: the digital landscape is wild and wooly, and it is crucial to have well-trained, well-informed librarians on hand to figure out which content to spend scarce subscription dollars on, and how to guide readers through it.

Digital resources and collections have already out-stripped the physical collections possible in even major research libraries. Digitization efforts promise that more and more of the written record will become readily accessible to more readers.

Accessible, that is, in the sense that readers can “read” the text. Whether it is understood or not is a different issue.

Without librarians to act as intelligent filters, digital content will be a sea of information that washes over all but the most intrepid scholars.

Increases in digital resources require increases in the number of librarians performing the creative aspects of their professions.

Acting as teachers, guides and fellow travellers in the exploration of cultural riches past and present, and preparing for those yet to come.

May 4, 2012

Titles from Springer collection cover wide range of disciplines on Apple’s iBookstore

Filed under: Books,Data,Springer — Patrick Durusau @ 3:44 pm

Titles from Springer collection cover wide range of disciplines on Apple’s iBookstore

From the post:

Springer Science+Business Media now offers one of the largest scientific, technical and medical (STM) book collections on the iBookstore with more than 20,000 individual Springer titles. Cornerstone works in disciplines like mathematics, medicine and engineering are now available, along with selections in other fields such as business and economics. Titles include the Springer Handbook of Nanotechnology, Pattern Recognition and Machine Learning, Bergey’s Manual of Systematic Bacteriology and the highly regarded book series Graduate Texts in Mathematics.

Springer is currently undertaking an exhaustive effort to digitize all of its books dating back to the mid-nineteenth century. By making most of its entire collection – both new and archived titles – available through its SpringerLink platform, Springer offers STM researchers far more opportunities than ever to obtain and apply content.

Gee, do you think the nomenclature has changed since the mid-nineteenth century until now? Just a bit? To say nothing of changes across languages.

Prime topic map territory, both for traditional build and sell versions as well as topic trails through literature.

Will have to check to see how far back the current Springer API goes.

April 27, 2012

Harvard Library releases big data for its books

Filed under: Books,Library — Patrick Durusau @ 6:11 pm

Harvard Library releases big data for its books

Audrey Watters writes in part:

Harvard University announced this week that it would make more than 12 million catalog records from its 73 libraries publicly available. These records contain bibliographic information about books, manuscripts, maps, videos, and audio recordings. The Harvard Library is making these records available under a Creative Commons 0 license, in accordance with its Open Metadata Policy.

In MARC21 format, these records should lend themselves to a number of interesting uses.

I have always been curious about semantic drift across generations of librarians for subject headings.

Did we as users “learn” the cataloging of particular collections?

How can we recapture that “learning” in a topic map?
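One hedged way to start looking for that drift: assuming the MARC21 records have already been parsed (by whatever MARC tooling you prefer) into simple dictionaries, tally which subject headings appear in each decade. The keys `year` and `subjects`, the function name and the sample records are made up for illustration and are not drawn from the Harvard data.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed catalog records; the keys "year" and
# "subjects" are illustrative assumptions, not MARC21 field names.
records = [
    {"year": 1895, "subjects": ["Natural philosophy"]},
    {"year": 1899, "subjects": ["Natural philosophy", "Electricity"]},
    {"year": 1962, "subjects": ["Physics"]},
    {"year": 1968, "subjects": ["Physics", "Electricity"]},
]

def headings_by_decade(records):
    """Count how often each subject heading is used in each decade."""
    counts = defaultdict(Counter)
    for record in records:
        decade = (record["year"] // 10) * 10
        counts[decade].update(record["subjects"])
    return counts

drift = headings_by_decade(records)
for decade in sorted(drift):
    print(decade, dict(drift[decade]))
```

Headings that fade out in one decade while another takes over for the same works are candidates for a single topic with several names.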

April 25, 2012

OAG Launches Mapper, a New Network Analysis Mapping Tool

Filed under: Aviation,Books,Marketing,Travel — Patrick Durusau @ 6:27 pm

OAG Launches Mapper, a New Network Analysis Mapping Tool

From the post:

OAG, a UBM Aviation brand, today unveiled its new aviation analysis mapping tool, OAG Mapper. This latest innovation, from the global leader in aviation intelligence, combines a powerful global flight schedule query with advanced mapping software technology to quickly plot route network maps, based on data drawn from OAG’s market leading schedules database of 1,000 airlines and over 3,500 airports. It is ideal for those in commercial, marketing and strategic planning roles across the airlines, airports, tourism, consulting and route network related industry sectors.

A web-based tool that eliminates the need to hand-draw network routes onto maps, OAG Mapper allows users to either import IATA Airport codes, or to enter a carrier, airport, equipment type or a combination of these and generate custom network maps in seconds. The user can then highlight key routes by changing the thickness and colour of the lines and label them for easy reference, save the map to their profile and export to jpeg for use in network planning, forecasting, strategy and executive presentations.

This has aviation professional written all over it.

And what does aviation bring to mind? That’s right! Coin of the realm! Lots of coins from lots of realms.

Two thoughts:

First and the most obvious, use this service in tandem with other information for aviation professionals to create enhanced services for their use. Ask aviation professionals what they would like to see and how they would like to see it. (Novel software theory: Give users what they want, how they want it. Easier sell than educating them.)

Second, we have all seen the travel sites that plot schedules, fees, destinations, hotels and car rentals.

But when was the last time you flew to an airport, rented a car and stayed in a hotel? That was the sum total of your trip?

Every location in the world has more to offer than that, well, not the South Pole but they don’t have a car rental agency. Or any beach. So why go there?

Sorry, got distracted. Every location in the world (with one exception, see above) has more than airports, hotels and car rentals. Suggestion: Use topic maps (non-obviously) to create information/reservation rich information environments.

The Frankfurt Book Fair is an example of an event with literally thousands of connections to be made in addition to airport, hotel and car rental. Your application could be the one that crosses all the information systems (or lack thereof) to provide that unique experience.

Could hard code it but I assume you are brighter than that.

March 19, 2012

53 Books APIs: Google Books, Goodreads and SharedBook

Filed under: Books,Library — Patrick Durusau @ 6:54 pm

53 Books APIs: Google Books, Goodreads and SharedBook

Wendell Santos has posted on behalf of ProgrammableWeb a list of fifty-three (53) book APIs!

Fairly good listing but it could be better.

For example, it is missing the Springer API, http://dev.springer.com/, and although they don’t list Elsevier and say http://www.programmableweb.com/api/elsevier-article is historical only, you should be aware that Elsevier does offer an extensive API at: http://www.developers.elsevier.com/cms/index (called SciVerse).

I am sure there are others. Any you would like to mention in particular?

Now that I think about it, guess who doesn’t have a public API?

Would you believe the ACM? Check out the ACM Digital Library and tell me if I am wrong.

Or for that matter, the IEEE. See CS Digital Library.

Maybe they don’t have anyone to build an API for them? Please write the ACM and/or IEEE offering your services at your usual rates.

March 15, 2012

Data and Reality

Data and Reality: A Timeless Perspective on Data Management by Steve Hoberman.

I remember William Kent, the original author of “Data and Reality” from a presentation he made in 2003, entitled: “The unsolvable identity problem.”

His abstract there read:

The identity problem is intractable. To shed light on the problem, which currently is a swirl of interlocking problems that tend to get tumbled together in any discussion, we separate out the various issues so they can be rationally addressed one at a time as much as possible. We explore various aspects of the problem, pick one aspect to focus on, pose an idealized theoretical solution, and then explore the factors rendering this solution impractical. The success of this endeavor depends on our agreement that the selected aspect is a good one to focus on, and that the idealized solution represents a desirable target to try to approximate as well as we can. If we achieve consensus here, then we at least have a unifying framework for coordinating the various partial solutions to fragments of the problem.

I haven’t read the “new” version of “Data and Reality” (just ordered a copy) but I don’t recall the original needing much in the way of changes.

The original carried much the same message, that all of our solutions are partial even within a domain, temporary, chronologically speaking, and at best “useful” for some particular purpose. I rather doubt you will find that degree of uncertainty being confessed by the purveyors of any current semantic solution.

I did pull my second edition off the shelf and with free shipping (5-8 days), I should have time to go over my notes and highlights before the “new” version appears.

More to follow.

March 14, 2012

Keyword Indexing for Books vs. Webpages

Filed under: Books,Indexing,Keywords,Search Engines — Patrick Durusau @ 7:35 pm

I was watching a lecture on keyword indexing that started off with a demonstration of an index to a book, which was being compared to indexing web pages. The statement was made that the keyword pointed the reader to a page where that keyword could be found, much like a search engine does for a web page.

Leaving aside the more complex roles that indexes for books play, such as giving alternative terms, classifying the nature of the occurrence of the term (definition, mentioned, footnote, etc.), cross-references, etc., I wondered if there is a difference between a page reference in a book index vs. a web page reference by a search engine?

In some 19th century indexes I have used, the page references are followed by a letter of the alphabet, to indicate that the page is divided into sections, sometimes as many as a – h or even higher. Mostly those are complex reference works, dictionaries, lexicons, works of that type, where the information is fairly dense. (Do you know of any modern examples of indexes where pages are divided? A note would be appreciated.)

I have the sense that an index of a book, without sub-dividing a page, is different from an index pointing to a web page. It may be a difference that has never been made explicit but I think it is important.

Some facts about word length on a “page:”

With a short amount of content, average book page length, the user has little difficulty finding an index term on a page. But the longer the web page, the less useful our instinctive (trained?) scan of the page becomes.

In part because part of the page scrolls out of view. As you may know, that doesn’t happen with a print book.

Scanning of a print book is different from scanning of a webpage. How to account for that difference I don’t know.

Before you suggest Ctrl-F, see Do You Ctrl-F?. What was it you were saying about Ctrl-F?

Web pages (or other electronic media) that don’t replicate the fixed display of book pages result in a different indexing experience for the reader.

If a search engine index could point into a page, it would still be different from a traditional index but would come closer to a traditional index.
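To make the idea of pointing into a page concrete, here is a minimal sketch. It assumes a page is just a string cut into fixed-size sections labelled a, b, c and so on, in the manner of those 19th century indexes. The function name, the section size and the sample text are all hypothetical.

```python
import re
from collections import defaultdict
from string import ascii_lowercase

def build_subpage_index(pages, section_chars=40):
    """Map each keyword to (page number, section letter) locators.

    `pages` is a dict of page number -> text. Each page is cut into
    sections of `section_chars` characters, labelled a, b, c, ...
    Anything past the 26th section is lumped into section "z".
    """
    index = defaultdict(set)
    for page_no, text in pages.items():
        for match in re.finditer(r"[A-Za-z]+", text):
            section_no = min(match.start() // section_chars, 25)
            locator = (page_no, ascii_lowercase[section_no])
            index[match.group().lower()].add(locator)
    return index

# Made-up page text, long enough to span three sections.
pages = {
    12: "Topic maps merge subjects. " * 3 + "An index points to subjects.",
}
index = build_subpage_index(pages)
print(sorted(index["index"]))
print(sorted(index["subjects"]))
```

The locator carries the reader to a region of the page rather than to the page as a whole, which is the part a search engine result pointing at a long web page leaves to scrolling.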

(The W3C has steadfastly resisted any effective subpage pointing. See the sad history of XLink/XPointer. You will probably have to ask insiders but it is a well known story.)

BTW, in case you are interested in blog length, see: Bloggers: This Is How Long Your Posts Should Be. Informative and amusing.

March 1, 2012

Paper vs. Electronic Brick, What’s the Difference?

Filed under: Books,eBooks,Law,Law - Sources — Patrick Durusau @ 9:01 pm

I think the comparison that Elmer Masters is looking for in The Future of The (Case)Book Is The Web, is paper vs. electronic brick, what’s the difference?

He writes:

Recently there has been an explosion of advances in the ebook arena. New tools, new standards and formats, and new platforms seem to be coming out every day. The rush to get books into an “e” format is on, but does it make a real difference?

The “e” versions of books offer little in the way of improvement over the print version of the same book. Sure, these new formats provide a certain increase in accessibility over print by running on devices that are lighter than print books and allow for things like increasing font size, but there is little else. It is, after all, just a matter of reading the same text on some sort of screen instead of paper.

Marketers will tell you that the Kindle, Nook, iPad, and various software readers are the future of the book, an evolutionary, if not revolutionary, step in reading and learning. But that does not ring true. These platforms are really just another form for print. So now beside hard cover and paperback, you can get the same content on any number of electronic platforms. Is that so revolutionary? Things like highlighting and note taking are just replications of the analog versions. Like their analog counterparts, notes and highlights on these platforms are typically locked to the hardware or software reader, no better than the highlights and margin notes of print books. These are just closed platforms, “e” or print, just silos of information.

Unlocking the potential of a book that is locked to a specific platform requires moving the book to an open platform with no real limits like the web. On the web the book is suddenly expansive. Anything that you can do on the web, you can do with a book. As an author, reader, student, teacher, scholar; anything is possible with a book that is on the open web. The potential for linking, including external material, use of media, note taking, editing, markup, remixing are opened without the bounds of a specific reader platform. A book as a website provides the potential for unlimited customization that will work across any hardware platform.

If you have ever seen a print version of a law school casebook, you know what I mean by “paper brick.”

If you have a Kindle, Nook, etc., with a law school casebook, you know what I mean by “electronic brick.”

The latter is smaller, lighter, can carry more content, but it is still a brick, albeit an electronic one.

Elmer’s moniker “website” covers an HTML engine that serves out topic map augmented content.

We have all seen topic map engines that export to HTML output.

What about specifying HTML authoring that is by default the equivalent of the export of a topic map?

And tools that automatically capture such website content and “merge” it with other specified content? A “point and click” interface for authors.

All from the FWB (Friendly Web Browser). 😉

January 12, 2012

Complexity and Computation

Filed under: Books,Complexity,Computation — Patrick Durusau @ 7:27 pm

Complexity and Computation by Allen B. Downey.

Another free (you can order hard copy) book from Allen B. Downey. See my post: Think Stats: Probability and Statistics for Programmers or jump to Green Tea Press to see these and other titles for free download.

Description:

This book is about complexity science, data structures and algorithms, intermediate programming in Python, and the philosophy of science:

  • Data structures and algorithms: A data structure is a collection that contains data elements organized in a way that supports particular operations. For example, a dictionary organizes key-value pairs in a way that provides fast mapping from keys to values, but mapping from values to keys is generally slower.

    An algorithm is a mechanical process for performing a computation. Designing efficient programs often involves the co-evolution of data structures and the algorithms that use them. For example, the first few chapters are about graphs, a data structure that is a good implementation of a graph—nested dictionaries—and several graph algorithms that use this data structure.

  • Python programming: This book picks up where Think Python leaves off. I assume that you have read that book or have equivalent knowledge of Python. As always, I will try to emphasize fundamental ideas that apply to programming in many languages, but along the way you will learn some useful features that are specific to Python.
  • Computational modeling: A model is a simplified description of a system that is useful for simulation or analysis. Computational models are designed to take advantage of cheap, fast computation.
  • Philosophy of science: The models and results in this book raise a number of questions relevant to the philosophy of science, including the nature of scientific laws, theory choice, realism and instrumentalism, holism and reductionism, and Bayesian epistemology.
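The two data-structure points in the description above can be illustrated with a short sketch. It is not taken from the book: the graph, the function names and the edge labels are invented here, assuming nothing beyond standard Python dictionaries.

```python
from collections import deque

# A graph as nested dictionaries: vertex -> {neighbour: edge label}.
graph = {
    "a": {"b": "a-b", "c": "a-c"},
    "b": {"a": "a-b", "d": "b-d"},
    "c": {"a": "a-c"},
    "d": {"b": "b-d"},
}

def reachable(graph, start):
    """Breadth-first search: return vertices in the order visited."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order

def keys_for_value(mapping, value):
    """Value -> key lookup has to scan every item, unlike key -> value."""
    return [key for key, val in mapping.items() if val == value]

print(reachable(graph, "a"))
print(keys_for_value(graph["a"], "a-c"))
```

Looking up `graph["a"]["c"]` is a pair of hash lookups, while `keys_for_value` walks the whole dictionary, which is the asymmetry the description mentions.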

This book focuses on discrete models, which include graphs, cellular automata, and agent-based models. They are often characterized by structure, rules and transitions rather than by equations. They tend to be more abstract than continuous models; in some cases there is no direct correspondence between the model and a physical system.

Complexity science is an interdisciplinary field—at the intersection of mathematics, computer science and physics—that focuses on these kinds of models. That’s what this book is about.

January 4, 2012

To Know, but Not Understand: David Weinberger on Science and Big Data

Filed under: Books,Epistemology,Knowledge,Philosophy of Science — Patrick Durusau @ 2:21 pm

To Know, but Not Understand: David Weinberger on Science and Big Data

From the introduction:

In an edited excerpt from his new book, Too Big to Know, David Weinberger explains how the massive amounts of data necessary to deal with complex phenomena exceed any single brain’s ability to grasp, yet networked science rolls on.

Well, it is a highly entertaining excerpt, with passages like:

For example, the biological system of an organism is complex beyond imagining. Even the simplest element of life, a cell, is itself a system. A new science called systems biology studies the ways in which external stimuli send signals across the cell membrane. Some stimuli provoke relatively simple responses, but others cause cascades of reactions. These signals cannot be understood in isolation from one another. The overall picture of interactions even of a single cell is more than a human being made out of those cells can understand. In 2002, when Hiroaki Kitano wrote a cover story on systems biology for Science magazine — a formal recognition of the growing importance of this young field — he said: “The major reason it is gaining renewed interest today is that progress in molecular biology … enables us to collect comprehensive datasets on system performance and gain information on the underlying molecules.” Of course, the only reason we’re able to collect comprehensive datasets is that computers have gotten so big and powerful. Systems biology simply was not possible in the Age of Books.

Weinberger slips twix and tween philosophy of science, epistemology, various aspects of biology and computational science. Not to mention the odd bald-faced assertion such as: “…the biological system of an organism is complex beyond imagining.” At one time that could have been said about the atom. I think some progress has been made on understanding that last item, or so physicists claim.

Don’t get me wrong, I have a copy on order and look forward to reading it.

But, no single reader will be able to discover all the factual errors and leaps of logic in Too Big to Know. Perhaps a website or wiki, Too Big to Correct?

Top Holiday Gifts For Data Scientists

Filed under: Books,Data Science — Patrick Durusau @ 10:28 am

Top Holiday Gifts For Data Scientists by Jeff Hammerbacher.

Hammerbacher is the chief scientist for Cloudera. Need I say more?

Missed the holidays but I do have a birthday coming up. 😉

Enjoy!

December 31, 2011

FreeBookCentre.Net

Filed under: Books,Computer Science,Mathematics — Patrick Durusau @ 7:20 pm

FreeBookCentre.Net

Books and online materials on:

  • Computer Science
  • Physics
  • Mathematics
  • Electronics

I just scanned a few of the categories and the coverage isn’t systematic. Still, if you need a text for quick study, the price is right.

June 23, 2011

British Library Makes Available 250,000 Digitized Books

Filed under: Books,Library — Patrick Durusau @ 1:58 pm

British Library Makes Available 250,000 Digitized Books

From the post:

The British Library is making available 250,000 texts through Google’s Books system.

As a legal deposit library, the British Library gets copies of all books produced in the U.K. and Ireland, as well as overseas books published in Britain.

The texts, some published in the 18th century, are in the public domain. Included in the collection will be books on mathematics, science, and engineering, which would serve as invaluable resources for historians of science and today’s scientists and researchers. It’s plausible that there is plenty of original thinking that’s been overlooked and forgotten–and which will soon be only a Google search away.

This comes on top of the British Library’s effort at bringing 60,000 digital copies of historic books to the general public as a free iPad app. At 150 million texts, the size of Library’s collection is second only to that of the Library of Congress.

See also the iPad, 60,000 Text Story.

The amount of available semantically diverse data grows every day.

March 1, 2011

InTech – Open Access Publisher

Filed under: Books,Data Mining,Self-Organizing — Patrick Durusau @ 10:18 am

I scan lightly before I clean out my spam filter for the blog and saw:

Hello. Yesterday I found two new books about Data mining. These series of books entitled by ‘Data Mining’ address the need by presenting in-depth description of novel mining algorithms and many useful applications. In addition to understanding each section deeply, the two books present useful hints and strategies to solving problems in the following chapters.The progress of data mining technology and large public popularity establish a need for a comprehensive text on the subject. Books are: “New Fundamental Technologies in Data Mining” here http://www.intechopen.com/books/show/title/new-fundamental-technologies-in-data-mining & “Knowledge-Oriented Applications in Data Mining” here http://www.intechopen.com/books/show/title/knowledge-oriented-applications-in-data-mining These are open access books so you can download it for free or just read on online reading platform like I do. Cheers!

I was curious enough to follow the links and was glad I did.

InTech – Open Access Publisher has a number of volumes for downloading that may interest topic mappers. For free!

At first I thought these were article collections, made up of conference and other papers. I have only spot checked Self Organizing Maps – Applications and Novel Algorithm Design, edited by Josphat Igadwa Mwasiagi, but none of the paper titles appear in web searches, other than at Intechweb.org.

Apologies for appearing suspicious but there is so much re-cycled content on the WWW these days. That does not appear to be the case here, which is welcome news!

Would appreciate hearing of the experience of others with volumes from this site.

February 13, 2011

O’Reilly Book Sale

Filed under: Books,MongoDB,NoSQL,Topic Maps — Patrick Durusau @ 7:14 am

OK, ok, one more not-strictly topic map item and I promise no more for today!

Buy 2, Get 1 Free at O’Reilly caught my eye this morning.

I think it is justified to appear here for two reasons:

1) It has a lot of books, such as those on databases, that are relevant to implementing topic map systems.

But, just as importantly:

2) The O’Reilly online catalog illustrates the need for topic maps.

Look at the catalog listings under Other Databases for MongoDB (you may have heard about it). Now look under Database Design and Analysis. Oops! There you will find: MongoDB: The Definitive Guide. (at least as of 13 February 2011, 7:01 AM Eastern time)

One way (not the only way) to implement a topic map here would result in a single source of updates across the catalog. And the catalog could also act as a resource pointer to other materials. The Other Resources for the MongoDB book isn’t terribly inspiring.
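As a rough sketch of that single source of updates, assume each catalog listing carries an ISBN that can serve as the identifier on which entries merge. The listings, the placeholder ISBN and the function name below are invented for illustration; they are not O’Reilly’s data or any topic map engine’s API.

```python
def merge_listings(listings):
    """Merge catalog listings that share an identifier into one topic.

    Each topic keeps one title and the set of every category the
    book was filed under, so a correction is made in one place.
    """
    topics = {}
    for listing in listings:
        topic = topics.setdefault(
            listing["isbn"],
            {"title": listing["title"], "categories": set()},
        )
        topic["categories"].add(listing["category"])
    return topics

# Hypothetical listings; "isbn-0001" is a placeholder, not a real ISBN.
listings = [
    {"isbn": "isbn-0001", "title": "MongoDB: The Definitive Guide",
     "category": "Database Design and Analysis"},
    {"isbn": "isbn-0001", "title": "MongoDB: The Definitive Guide",
     "category": "Other Databases"},
]

topics = merge_listings(listings)
for isbn, topic in topics.items():
    print(isbn, topic["title"], sorted(topic["categories"]))
```

With the book represented once, every category page can be generated from the same topic instead of being maintained by hand.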

*****
PS: I am hopeful the interest in NoSQL will drive greater exploration of MySQL, PostgreSQL and Oracle databases in general and as part of topic maps systems in particular.

December 13, 2010

USA Today Best-Selling Books API

Filed under: Books,Data Source,Dataset — Patrick Durusau @ 8:45 am

USA Today Best-Selling Books API

From the website:

USA Today’s Best-Selling Books API provides a method for developers to retrieve USA Today’s weekly compiled list of the nation’s best-selling books, which is published each Thursday. In addition, developers can also retrieve archived lists since the book list’s launch on Thursday, Oct. 28, 1993. The Best-Selling Books API can also be used to retrieve a title’s history on the list and metadata about each title.

Available metadata:

  • Author. Contains one or more names of the authors, illustrators, editors or other creators of the book.
  • BookListAppearances. The number of weeks a book has appeared in the Top 150, regardless of ISBN.
  • BriefDescription. A summary of the book. Contains indicators of the book’s class (fiction or non-fiction) and format (hardcover, paperback, e-book). If a title is available in multiple formats, the format noted is the one selling the most copies that week.
  • CategoryID. Code for book category type.
  • CategoryName. Text of book category type.
  • Class. Specifies whether the book is fiction or non-fiction.
  • FirstBookListAppearance. The date of the list when the particular ISBN first appeared in the top 150.
  • FormatName. Specifies whether the ISBN is assigned to a hardcover, paperback or e-book edition.
  • HighestRank. The highest position on the list achieved by this book, regardless of ISBN.
  • ISBN. The book’s 13- or 10-digit ISBN. The ISBN for a title in a given week is the ISBN of the version (hardcover, paperback or e-book) that sold the most copies that week.
  • MostRecentBooksListAppearance. The date of the list when the particular ISBN last appeared in the top 150.
  • Rank. The book’s rank on the list.
  • RankHistories. The weekly history of the ISBN fetched.
  • RankLastWeek. The book’s rank on the prior week’s list if it appeared. Books not on the previous week’s list are designated with a “0”.
  • Title. The book title. Titles are generally reported as specified by publishers and as they appear on the book’s cover.
  • TitleAPIUrl. URL to retrieve the list history for that ISBN. Note that the ISBN refers to the version of the title that sold the most copies that week if multiple formats were available for sale. Sales from other ISBNs assigned to that title may be included; we do not provide the other ISBNs each week.
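For anyone planning to map these fields, a minimal sketch of how a response might be modelled follows. The field names come from the list above, but the JSON shape, the sample values, the class name and the function name are assumptions made for illustration. The actual response format should be checked against USA Today’s documentation.

```python
import json
from dataclasses import dataclass

@dataclass
class BookEntry:
    """A subset of the metadata fields listed above."""
    title: str
    author: str
    isbn: str
    rank: int
    rank_last_week: int  # 0 means "not on last week's list"

    @property
    def is_new_this_week(self):
        return self.rank_last_week == 0

def parse_entries(payload):
    """Parse a JSON array of objects keyed by the field names above."""
    return [
        BookEntry(
            title=item["Title"],
            author=item["Author"],
            isbn=item["ISBN"],
            rank=int(item["Rank"]),
            rank_last_week=int(item["RankLastWeek"]),
        )
        for item in json.loads(payload)
    ]

# Made-up sample payload; not real list data.
sample = json.dumps([
    {"Title": "Example Title", "Author": "A. Author",
     "ISBN": "0000000000000", "Rank": 1, "RankLastWeek": 0},
    {"Title": "Another Example", "Author": "B. Writer",
     "ISBN": "0000000000001", "Rank": 2, "RankLastWeek": 1},
])

entries = parse_entries(sample)
print([entry.title for entry in entries if entry.is_new_this_week])
```

Because the ISBN reported for a title can change from week to week with the best-selling format, the title rather than the ISBN is probably the safer key when merging this data with other sources.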

Questions:

  1. Would you use a topic map to dynamically display this information to library patrons? If so, which parts? (2-3 pages, no citations)
  2. What information would you want to use to supplement this information? How would you map it to this information? (2-3 pages, no citations)
  3. What information would you include for library staff and not patrons? (if any) (2-3 pages, no citations)