Archive for the ‘Funding’ Category

Another Patriarchy Triumph – Crowd Funding Shadow Brokers Fails

Thursday, June 1st, 2017

Hackers shelve crowdfunding drive for Shadow Brokers exploits by Bill Brenner.

From the post:

To some, it was a terrible idea akin to paying bad people to do harm. To others, it was a chance to build more powerful defenses against the next WannaCry.

It’s now a moot point.

Forty-eight hours after they started a crowdsourcing effort on Patreon to raise $25,000 a month for a monthly Shadow Brokers subscription service, security researchers Matthew Hickey – perhaps better known as Hacker Fantastic – and x0rz announced the fund’s cancellation. By Thursday morning, the page was empty:

Brenner covers alleged reasons for the cancellation and concludes with poor advice:

Better to not go there.

As I pointed out yesterday, if 250 people each contributed $100, the goal of raising $25,000 would be met without significant risk to anyone. Cable bills, to say nothing of mobile phone charges, easily exceed $100 a month.

If a one-month subscription were purchased and the Shadow Brokers either released no new malware or released material cobbled together from standard malware sites, don’t buy a second one. At $100 each, isn’t that a risk you would take?
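The arithmetic is worth making explicit. A few lines of Python, using the figures from this post:

```python
# Crowdfunding math from the post: how many $100 contributions
# does it take to cover a $25,000 monthly subscription goal?
import math

GOAL_PER_MONTH = 25_000   # Shadow Brokers monthly subscription price (USD)
CONTRIBUTION = 100        # per-person stake, roughly one cable bill

contributors_needed = math.ceil(GOAL_PER_MONTH / CONTRIBUTION)
print(contributors_needed)  # 250
```

A few hundred backers at the price of a cable bill; hardly an exotic fundraising target.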

Assuming the Shadow Brokers are serious about their malware-by-the-month club, a crowdfunded subscription, premised on the immediate and public release of each installment, damages existing patriarchal systems among:

  • Blackhat hackers
  • Governments (all levels)
  • Software vendors
  • Spy agencies (worldwide)
  • Whitehat advisors/hackers

Whitehat-only distribution follows that old saw of patriarchy, “we know what is best for you to know, etc.”

Some innocent people will be hurt by future malware releases. That’s a fact. But it’s an odd complaint for governments, spy agencies and their whitehat and vendor allies to raise.

Governments, spy agencies, whitehats and vendors have jointly participated in the slaughter of millions of people and the oppression of millions more.

Now facing some sharing of their cyberpower, they are concerned about potential injuries?

Looking forward to a deeply concealed entity stepping forward to purchase or crowdfund a release-on-delivery copy of the first Shadow Brokers malware-by-the-month installment.

Take a chance on damaging those patriarchies? Sure, that’s worth $100.

You?

The Koch Brothers are Attacking Libraries – FYI – Funding Appeal

Saturday, December 10th, 2016

EveryLibrary has a funding appeal you need to seriously consider.

The Koch Brothers are Attacking Libraries

From the post:

We are continuing to see the Koch Brothers Super PAC, Americans for Prosperity go after libraries. This last election cycle was the fifth clear example of their involvement in the agenda to defund libraries. We need your help to fight back. When the Koch Brothers and AFP puts out an anti-tax and anti-library attack, they do it with direct mail and robocalls – and they always do it late in the campaign. We need the resources to confront these anti-tax forces before they can start in the next town. Help us stop them with a one time donation today or a $5-10 monthly donation.
… (emphasis in original)

I won’t repeat the crimes committed against libraries by the Koch Brothers and their Super PAC, Americans for Prosperity, here; they are too sickening. The EveryLibrary post describes a subset of their offenses.

Be sure to check out the EveryLibrary site and their journal, The Political Librarian.

From their What We Do page:

EveryLibrary is the first and only national organization dedicated to building voter support for libraries. We are chartered “to promote public, school, and college libraries, including by advocating in support of public funding for libraries and building public awareness of public funding initiatives”. Our primary work is to support local public libraries when they have a referendum or measure on the ballot. We do this in three ways: by training library staff, trustees, and volunteers to plan and run effective Information Only campaigns; by assisting local Vote Yes committees on planning and executing Get Out the Vote work for their library’s measure; and by speaking directly to the public about the value and relevance of libraries and librarians. Our focus on activating voters on Election Day is unique in the library advocacy ecosystem. This is reflected in the training and coaching we do for campaigns.

If you have ever fantasized about saving the Library at Alexandria or opposing the sack of Rome by the Vandals and the Visigoths, now is your chance to do more than fantasize.

Libraries are islands of knowledge under siege by the modern analogues of the barbarians that plunged the world into centuries of darkness.

Will you piss and moan on Facebook, Twitter, etc. about the crumbling defenses of libraries or will you take your place on the ramparts?

Yes?

How-To Track Projects Like A Defense Contractor

Sunday, July 31st, 2016

Transparency Tip: How to Track Government Projects Like a Defense Contractor by Dave Maass.

From the post:

Over the last year, thousands of pages of sensitive documents outlining the government’s intelligence practices have landed on our desktops.

One set of documents describes the Director of National Intelligence’s goal of funding “dramatic improvements in unconstrained face recognition.” A presentation from the Navy uses examples from Star Trek to explain its electronic warfare program. Other records show the FBI was purchasing mobile phone extraction devices, malware and fiber network-tapping systems. A sign-in list shows the names and contact details of hundreds of cybersecurity contractors who turned up a Department of Homeland Security “Industry Day.” Yet another document, a heavily redacted contract, provides details of U.S. assistance with drone surveillance programs in Burundi, Kenya and Uganda.

But these aren’t top-secret records carefully leaked to journalists. They aren’t classified dossiers pasted haphazardly on the Internet by hacktivists. They weren’t even liberated through the Freedom of Information Act. No, these public documents are available to anyone who looks at the U.S. government’s contracting website, FBO.gov. In this case “anyone,” is usually just contractors looking to sell goods, services, or research to the government. But, because the government often makes itself more accessible to businesses than the general public, it’s also a useful tool for watchdogs. Every government program costs money, and whenever money is involved, there’s a paper trail.

Searching FBO.gov is difficult enough that there are firms that offer search services to assist contractors with locating business opportunities.

Collating FBO.gov data with topic maps (read: adding non-FBO.gov data) will be a value-add for watchdogs, potential contractors (including yourself), or watchers watching watchers.

Dave’s post will get you started on your way.
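For the do-it-yourself watchdog, even a crude keyword filter over an FBO.gov export goes a long way. A minimal sketch in Python; the file name and column names (“title”, “description”, “agency”, “posted_date”) are assumptions about your export, not an actual FBO.gov schema:

```python
# Sketch: keyword-watchdogging a CSV export of contract notices.
# Adjust WATCH_TERMS and the column names to your actual data.
import csv
import os

WATCH_TERMS = ("face recognition", "malware", "cell-site simulator")

def matching_notices(path):
    """Yield (posted_date, agency, title) for notices whose title or
    description mentions any watch term (case-insensitive)."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            text = (row.get("title", "") + " " + row.get("description", "")).lower()
            if any(term in text for term in WATCH_TERMS):
                yield row["posted_date"], row["agency"], row["title"]

if os.path.exists("fbo_notices.csv"):  # hypothetical export file
    for posted, agency, title in matching_notices("fbo_notices.csv"):
        print(posted, agency, title)
```

Run it weekly against a fresh export and the paper trail Dave describes starts coming to you.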

Tapping Into The Terror Money Stream

Tuesday, June 21st, 2016

Can ISIS Take Down D.C.? by Jeff Stein.

From the post:


If the federal government is good at anything, however, it’s throwing money at threats. Since 2003, taxpayers have contributed $1.3 billion to the feds’ BioWatch program, a network of pathogen detectors deployed in D.C. and 33 other cities (plus at so-called national security events like the Super Bowl), despite persistent questions about its need and reliability. In 2013, Republican Representative Tim Murphy of Pennsylvania, chairman of the House Energy and Commerce Committee’s Oversight and Investigations subcommittee, called it a “boondoggle.” Jeh Johnson, who took over the reins of the Department of Homeland Security (DHS) in late 2013, evidently agreed. One of his first acts was to cancel a planned third generation of the program, but the rest of it is still running.

“The BioWatch program was a mistake from the start,” a former top federal emergency medicine official tells Newsweek on condition of anonymity, saying he fears retaliation from the government for speaking out. The well-known problems with the detectors, he says, are both highly technical and practical. “Any sort of thing can blow into its filter papers, and then you are wrapping yourself around an axle,” trying to figure out if it’s real. Of the 149 suspected pathogen samples collected by BioWatch detectors nationwide, he reports, “none were a threat to public health.” A 2003 tularemia alarm in Texas was traced to a dead rabbit.

Michael Sheehan, a former top Pentagon, State Department and New York Police Department counterterrorism official, echoes such assessments. “The technology didn’t work, and I had no confidence that it ever would,” he tells Newsweek. The immense amounts of time and money devoted to it, he adds, could’ve been better spent “protecting dangerous pathogens stored in city hospitals from falling into the wrong hands.” When he sought to explore that angle at the NYPD, the Centers for Disease Control and Prevention “initially would not tell us where they were until I sent two detectives to Atlanta to find out,” he says. “And they did, and we helped the hospitals with their security—and they were happy for the assistance.”

Even if BioWatch performed as touted, Sheehan and others say, a virus would be virtually out of control and sending scores of people to emergency rooms by the time air samples were gathered, analyzed and the horrific results distributed to first responders. BioWatch, Sheehan suggests, is a billion-dollar hammer looking for a nail, since “weaponizing biological agents is incredibly hard to do,” and even ISIS, which theoretically has the scientific assets to pursue such weapons, has shown little sustained interest in them. Plus, extremists of all denominations have demonstrated over the decades that they like things that go boom (or tat-tat-tat, the sound of an assault rifle). So the $1.1 billion spent on BioWatch is way out of proportion to the risk, critics argue. What’s really driving programs like BioWatch, Sheehan says—beside fears of leaving any potential threat uncovered, no matter how small—is the opportunity it gives members of Congress to lard out pork to research universities and contractors back home.

Considering that two people with one rifle terrorized the D.C. area for 23 days (The Beltway Snipers, Part 1; The Beltway Snipers, Part 2), I would have to say yes, ISIS can take down D.C.

Even if they limit themselves to “…things that go boom (or tat-tat-tat, the sound of an assault rifle).” (You have to wonder about the quality of their “terrorist” training.)

But in order to get funding, you have to discover a scenario that isn’t fully occupied by contractors.

Quite recently I read of an effort to detect the possible onset of terror attacks based on social media traffic. Except there is no evidence that random social media group traffic picks up before a terrorist attack. Yeah, well, there is that, but it won’t come up for years.

Here’s a new terror vector. Using Washington, D.C. as an example, how would you weaponize open data found at: District of Columbia Open Data?

Data.gov reports there are forty states (US), forty-eight counties and cities (US), fifty-two international countries (what else would they be?), and one-hundred and sixty-four international regions with open data portals.

That’s a considerable amount of open data. Data that could be combined together to further ends not intended to improve public health and well-being.
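To make “combined together” concrete, here is a minimal Python sketch that cross-links two open-data extracts on a shared address field. The file and column names are illustrative, not actual District of Columbia Open Data schemas:

```python
# Sketch: cross-linking two (hypothetical) open-data extracts on a
# shared, normalized address column.
import csv
from collections import defaultdict

def index_by(path, key):
    """Index CSV rows by a normalized (trimmed, upper-cased) column value."""
    idx = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            idx[row[key].strip().upper()].append(row)
    return idx

def shared_addresses(csv_a, csv_b, key="address"):
    """Addresses appearing in both datasets."""
    return sorted(set(index_by(csv_a, key)) & set(index_by(csv_b, key)))
```

Each join like this narrows a profile a little further; the question for defenders is which joins an adversary can make first.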

Don’t allow the techno-jingoism of posts like How big data can terrorize global terrorism to lull you into a false sense of security.

Anyone who can think beyond being a not-so-smart bomb or tat-tat-tat can access and use open data with free tools. Are you aware of the danger that poses?

Improv at DARPA (No, Not Comedy)

Saturday, March 12th, 2016

Improv Proposers Day Webcast Special Notice March 29 and March 30, 2016 (DARPA SN-16-26)

From the notice:

PROGRAM OBJECTIVE AND DESCRIPTION

The DARPA/DSO Improv program is seeking prototype products and systems that have the potential to threaten current military operations, equipment, or personnel and are assembled primarily from commercially available technology. The technology scope of Improv is broad, and the program is structured to encourage participation by a wide range of technical specialists, researchers, developers, and skilled hobbyists. Performers may reconfigure, repurpose, program, reprogram, modify, combine, or recombine commercially available technology in any way within the bounds of local, state, and federal laws and regulations. Use of components, products, and systems from non-military technical specialties (e.g., transportation, construction, maritime, and communications) is of particular interest.

Seven-hour webcast on each of two days; recording is prohibited:

Tuesday, March 29, 2016 at 10:00 a.m. – 5:00 p.m., and Wednesday, March 30, 2016 at 10:00 a.m. – 5:00 p.m.

No cost but pre-registration is required:

http://www.sa-meetings.com/ImprovProposersDay

This looks like fun!

Your effort won’t be wasted in any event. If your idea isn’t funded here, you can still market it to others.

PS: I tried to register on 12 March 2016 and the website was down. 🙁 Will try again next week.

Innovation Down Under!

Sunday, December 6th, 2015

Twenty-nine “Welcome to the Ideas Boom” one-pagers from innovation.gov.au.

I saw this in a tweet by Leanne O’Donnell thanking @stilgherrian for putting these in one PDF file.

Hard to say what the results will be but certainly more successful than fattening the usual suspects. (NSF: BD Spokes (pronounced “hoax”) initiative)

Watch for the success factors so you can build upon the experience Australia has with its new approaches.

Big Data to Knowledge (Biomedical)

Tuesday, July 28th, 2015

Big Data to Knowledge (BD2K) Development of Software Tools and Methods for Biomedical Big Data in Targeted Areas of High Need (U01).

Dates:

Open Date (Earliest Submission Date) September 6, 2015

Letter of Intent Due Date(s) September 6, 2015

Application Due Date(s) October 6, 2015

Scientific Merit Review February 2016

Advisory Council Review May 2016

Earliest Start Date July 2016

From the webpage:

The purpose of this BD2K Funding Opportunity Announcement (FOA) is to solicit development of software tools and methods in the three topic areas of Data Privacy, Data Repurposing, and Applying Metadata, all as part of the overall BD2K initiative. While this FOA is intended to foster new development, submissions consisting of significant adaptations of existing methods and software are also invited.

The instructions say to submit early so that corrections to your application can be suggested. (Take the advice.)

Topic maps, particularly with customized subject identity rules, are a nice fit for the detailed requirements you will find at the grant site.

Ping me if you are interested in discussing why you should include topic maps in your application.
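A “customized subject identity rule” need not be exotic. Here is a minimal Python sketch that merges biomedical records whenever any shared identifier matches; the identifier names (gene_symbol, ensembl_id) are illustrative, not taken from the FOA:

```python
# Sketch of topic-map-style merging: two records denote the same
# subject if any of their identifying keys carry the same value.
def same_subject(a, b, keys=("gene_symbol", "ensembl_id")):
    """True if records a and b share a value for any identity key."""
    return any(a.get(k) and a.get(k) == b.get(k) for k in keys)

def merge(records):
    """Fold records into topics, unioning properties on a match."""
    merged = []
    for rec in records:
        for topic in merged:
            if same_subject(rec, topic):
                for k, v in rec.items():
                    topic.setdefault(k, v)  # first value wins on conflict
                break
        else:
            merged.append(dict(rec))
    return merged
```

Swap in whatever identity keys (and normalization) your data demands; that swap is exactly where the customization lives.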

Humanities Open Book: Unlocking Great Books

Friday, January 16th, 2015

Humanities Open Book: Unlocking Great Books

Deadline: June 10, 2015

A new joint grant program by the National Endowment for the Humanities (NEH) and the Andrew W. Mellon Foundation seeks to give a second life to outstanding out-of-print books in the humanities by turning them into freely accessible e-books.

Over the past 100 years, tens of thousands of academic books have been published in the humanities, including many remarkable works on history, literature, philosophy, art, music, law, and the history and philosophy of science. But the majority of these books are currently out of print and largely out of reach for teachers, students, and the public. The Humanities Open Book pilot grant program aims to “unlock” these books by republishing them as high-quality electronic books that anyone in the world can download and read on computers, tablets, or mobile phones at no charge.

The National Endowment for the Humanities (NEH) and the Andrew W. Mellon Foundation are the two largest funders of humanities research in the United States. Working together, NEH and Mellon will give grants to publishers to identify great humanities books, secure all appropriate rights, and make them available for free, forever, under a Creative Commons license.

The new Humanities Open Book grant program is part of the National Endowment for the Humanities’ agency-wide initiative The Common Good: The Humanities in the Public Square, which seeks to demonstrate and enhance the role and significance of the humanities and humanities scholarship in public life.

“The large number of valuable scholarly books in the humanities that have fallen out of print in recent decades represents a huge untapped resource,” said NEH Chairman William Adams. “By placing these works into the hands of the public we hope that the Humanities Open Book program will widen access to the important ideas and information they contain and inspire readers, teachers and students to use these books in exciting new ways.”

“Scholars in the humanities are making increasing use of digital media to access evidence, produce new scholarship, and reach audiences that increasingly rely on such media for information to understand and interpret the world in which they live,” said Earl Lewis, President of the Andrew W. Mellon Foundation. “The Andrew W. Mellon Foundation is delighted to join NEH in helping university presses give new digital life to enduring works of scholarship that are presently unavailable to new generations of students, scholars, and general readers.”

The National Endowment for the Humanities and the Andrew W. Mellon Foundation will jointly provide $1 million to convert out-of-print books into EPUB e-books with a Creative Commons (CC) license, ensuring that the books are freely downloadable with searchable texts and in formats that are compatible with any e-reading device. Books proposed under the Humanities Open Book program must be of demonstrable intellectual significance and broad interest to current readers.

Application guidelines and a list of FAQs for the Humanities Open Book program are available online at www.NEH.gov. The application deadline for the first cycle of Humanities Open Book grants is June 10, 2015.

What great news to start a weekend!

If you decide to apply, remember that topic maps can support indexes for a book, across books, or across books and other material. You could make a classic work in the humanities into a portal that opens onto work prior to its publication, at the time of its publication, or since. Something to set you apart from simply making the text available.
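As a sketch of the idea, assuming a small alias table standing in for the topic map’s subject identity layer, a merged cross-book index in Python might look like:

```python
# Sketch: a cross-book index that merges entries when different books
# name the same subject differently. The alias table is the "subject
# identity" layer a topic map would supply; names here are illustrative.
from collections import defaultdict

ALIASES = {
    "Tully": "Cicero",
    "M. Tullius Cicero": "Cicero",
}

def build_index(book_indexes):
    """book_indexes: {book_title: {term: [pages]}} -> merged index
    mapping each canonical subject to (book, page) locators."""
    merged = defaultdict(list)
    for book, entries in book_indexes.items():
        for term, pages in entries.items():
            subject = ALIASES.get(term, term)
            merged[subject].extend((book, p) for p in pages)
    return dict(merged)
```

One entry for “Cicero” now points into every book, however each book happened to spell him.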

Gates Foundation champions open access

Sunday, November 30th, 2014

Gates Foundation champions open access by Rebecca Trager.

From the post:

The Bill & Melinda Gates Foundation, based in Washington, US, has adopted a new policy that requires free, unrestricted access and reuse of all peer-reviewed published research that the foundation funds, including any underlying data sets.

The policy, announced last week, applies to all of the research that the Gates Foundation funds entirely or partly, and will come into effect on 1 January, 2015. Specifically, the new rule dictates that published research be made available under a ‘Creative Commons’ generic license, which means that it can be copied, redistributed, amended and commercialised. During a two-year transition period, the foundation will allow publishers a 12 month embargo period on access to their research papers and data sets.

If other science and humanities sponsors follow Gates, nearly universal open access will be an accomplished fact by the end of the decade.

There will be wailing and gnashing of teeth by those who expected protectionism to further their careers at the expense of the public. I can bear their discomfort with a great deal of equanimity. Can’t you?

Open Access and the Humanities…

Friday, November 28th, 2014

Open Access and the Humanities: Contexts, Controversies and the Future by Martin Paul Eve.

From the description:

If you work in a university, you are almost certain to have heard the term ‘open access’ in the past couple of years. You may also have heard either that it is the utopian answer to all the problems of research dissemination or perhaps that it marks the beginning of an apocalyptic new era of ‘pay-to-say’ publishing. In this book, Martin Paul Eve sets out the histories, contexts and controversies for open access, specifically in the humanities. Broaching practical elements alongside economic histories, open licensing, monographs and funder policies, this book is a must-read for both those new to ideas about open-access scholarly communications and those with an already keen interest in the latest developments for the humanities.

Open access to a book on open access!

I was very amused by Gary F. Daught’s comment on the title:

“Open access for scholarly communication in the Humanities faces some longstanding cultural/social and economic challenges. Deep traditions of scholarly authority, reputation and vetting, relationships with publishers, etc. coupled with relatively shallow pockets in terms of funding (at least compared to the Sciences) and perceptions that the costs associated with traditional modes of scholarly communication are reasonable (at least compared to the Sciences) can make open access a hard sell. Still, there are new opportunities and definite signs of change. Among those at the forefront confronting these challenges while exploring open access opportunities for the Humanities is Martin Paul Eve.”

In part because Gary worded his description of the humanities as: “Deep traditions of scholarly authority, reputation and vetting, relationships with publishers,…” which is true, but is a nice way of saying:

Controlling access to the Dead Sea Scrolls was a great way to attract graduate students to professors and certain universities.

Controlling access to the Dead Sea Scrolls was a great way to avoid criticism of work by denying others access to the primary materials.

Substitute current access issues to data, both in the humanities and sciences for “Dead Sea Scrolls” and you have a similar situation.

I mention the Dead Sea Scroll case because after retarding scholarship for decades, the materials are more or less accessible now. The sky hasn’t fallen, newspapers aren’t filled with bad translations, salvation hasn’t been denied (so far as we know), to anyone holding incorrect theological positions due to bad work on the Dead Sea Scrolls.

A good read but I have to differ with Martin on his proposed solution to the objection that open access has no peer review.

Unfortunately Martin treats concerns about peer review as though they were rooted in empirical experience such that contrary experimental results will lead to a different conclusion.

I fear that Martin overlooks that peer review is a religious belief and can no more be diminished by contrary evidence than transubstantiation. Consider all the peer review scandals you have read or heard about in the past year. Has that diminished anyone’s faith in peer review? What about the fact that in the humanities, up to 98% of all monographs remain uncited after a decade?

Assuming peer review is supposed to assure the quality of publishing, a reasonable person would conclude that the 98% that remains uncited either wasn’t worth writing about and/or that peer review is no guarantor of quality.

The key to open access is for publishing and funding organizations to mandate open access to the data used in research and/or publication. No exceptions and no “available on request”; require deposits in open access archives.

Scholars who have self-assessed themselves as needing the advantages of non-open access data will be unhappy but I can’t say that matters all that much to me.

You?

I first saw this in a tweet by Martin Haspelmath.

Big Data Driving Data Integration at the NIH

Saturday, November 8th, 2014

Big Data Driving Data Integration at the NIH by David Linthicum.

From the post:

The National Institutes of Health announced new grants to develop big data technologies and strategies.

“The NIH multi-institute awards constitute an initial investment of nearly $32 million in fiscal year 2014 by NIH’s Big Data to Knowledge (BD2K) initiative and will support development of new software, tools and training to improve access to these data and the ability to make new discoveries using them, NIH said in its announcement of the funding.”

The grants will address issues around Big Data adoption, including:

  • Locating data and the appropriate software tools to access and analyze the information.
  • Lack of data standards, or low adoption of standards across the research community.
  • Insufficient polices to facilitate data sharing while protecting privacy.
  • Unwillingness to collaborate that limits the data’s usefulness in the research community.

Among the tasks funded is the creation of a “Perturbation Data Coordination and Integration Center.” The center will provide support for data science research that focuses on interpreting and integrating data from different data types and databases. In other words, it will make sure the data moves to where it should move, in order to provide access to information that’s needed by the research scientist. Fundamentally, it’s data integration practices and technologies.

This is very interesting from the standpoint that the movement into big data systems often drives the reevaluation, or even new interest in data integration. As the data becomes strategically important, the need to provide core integration services becomes even more important.

The NIH announcement, NIH invests almost $32 million to increase utility of biomedical research data, reads in part:

Wide-ranging National Institutes of Health grants announced today will develop new strategies to analyze and leverage the explosion of increasingly complex biomedical data sets, often referred to as Big Data. These NIH multi-institute awards constitute an initial investment of nearly $32 million in fiscal year 2014 by NIH’s Big Data to Knowledge (BD2K) initiative, which is projected to have a total investment of nearly $656 million through 2020, pending available funds.

With the advent of transformative technologies for biomedical research, such as DNA sequencing and imaging, biomedical data generation is exceeding researchers’ ability to capitalize on the data. The BD2K awards will support the development of new approaches, software, tools, and training programs to improve access to these data and the ability to make new discoveries using them. Investigators hope to explore novel analytics to mine large amounts of data, while protecting privacy, for eventual application to improving human health. Examples include an improved ability to predict who is at increased risk for breast cancer, heart attack and other diseases and condition, and better ways to treat and prevent them.

And of particular interest:

BD2K Data Discovery Index Coordination Consortium (DDICC). This program will create a consortium to begin a community-based development of a biomedical data discovery index that will enable discovery, access and citation of biomedical research data sets.

Big data driving data integration. Who knew? 😉

The more big data the greater the pressure for robust data integration.

Sounds like they are playing the topic maps tune.

Hewlett Foundation extends CC BY policy to all grantees

Wednesday, September 24th, 2014

Hewlett Foundation extends CC BY policy to all grantees by Timothy Vollmer.

From the post:

Last week the William and Flora Hewlett Foundation announced that it is extending its open licensing policy to require that all content (such as reports, videos, white papers) resulting from project grant funds be licensed under the most recent Creative Commons Attribution (CC BY) license. From the Foundation’s blog post: “We’re making this change because we believe that this kind of broad, open, and free sharing of ideas benefits not just the Hewlett Foundation, but also our grantees, and most important, the people their work is intended to help.” The change is explained in more detail on the foundation’s website.

The foundation had a long-standing policy requiring that recipients of its Open Educational Resources grants license the outputs of those grants; this was instrumental in the creation and growth of the OER field, which continues to flourish and spread. Earlier this year, the license requirement was extended to all Education Program grants, and as restated, the policy will now be rolled out to all project-based grants under any foundation program. The policy is straightforward: it requires that content produced pursuant to a grant be made easily available to the public, on the grantee’s website or otherwise, under the CC BY 4.0 license — unless there is some good reason to use a different license.

For a long time Creative Commons has been interested in promoting open licensing policies within philanthropic grantmaking. We received a grant from the Hewlett Foundation to survey the licensing policies of private foundations, and to work toward increasing the free availability of foundation-supported works. We wrote about the progress of the project in March, and we’ve been maintaining a spreadsheet of foundation IP policies, and a model IP policy.

We urge other foundations and funding bodies to emulate the outstanding leadership demonstrated by the William and Flora Hewlett Foundation and commit to making open licensing an essential component of their grantmaking strategy.

Not only is a wave of big data approaching, but it will be more available than data has been at any time in history.

As funders require open access to funded content, arguments for restricted access will simply disappear from even the humanities.

If you want to change behavior, principled arguments won’t get you as far as changing the reward system.

A $23 million venture fund for the government tech set

Tuesday, September 16th, 2014

A $23 million venture fund for the government tech set by Nancy Scola.

Nancy tells a compelling story of a new VC firm, GovTech, which is looking for startups focused on providing governments with better technology infrastructure.

Three facts from the story stand out:

“The U.S. government buys 10 eBays’ worth of stuff just to operate,” from software to heavy-duty trucking equipment.

…working with government might be a tortuous slog, but Bouganim says that he saw that behind that red tape lay a market that could be worth in the neighborhood of $500 billion a year.

What most people don’t realize is government spends nearly $74 billion on technology annually. As a point of comparison, the video game market is a $15 billion annual market.

See Nancy’s post for the full flavor of the story but it sounds like there is gold buried in government IT.

Another way to look at it is that the government is already spending $74 billion a year on technology that is largely an object of mockery and mirth. Effective software may be sufficiently novel and threatening to attract either business or a buy-out.

While you are pondering possible opportunities, existing systems, their structures and data are “subjects” in topic map terminology. Which means topic maps can protect existing contracts and relationships, while delivering improved capabilities and data.

Promote topic maps as “in addition to” existing IT systems and you will encounter less resistance both from within and without the government.

Don’t be squeamish about associating with governments, of whatever side. Their money spends just like everyone else’s. You can ask AT&T and IBM about supporting both sides in a conflict.

I first saw this in a tweet by Mike Bracken.

Ten habits of highly effective data:…

Sunday, July 27th, 2014

Ten habits of highly effective data: Helping your dataset achieve its full potential by Anita de Waard.

Anita gives all the high-minded and very legitimate reasons for creating highly effective data, with examples.

Read her slides to pick up the rhetoric you need and leads on how to create highly effective data.

Let me add one concern to drive your interest in creating highly effective data:

Funders want researchers to create highly effective data.

Enough said?

Answers to creating highly effective data continue to evolve, but not attempting to create highly effective data is a losing proposition.

Tools and Resources Development Fund [bioscience ODF UK]

Friday, July 25th, 2014

Tools and Resources Development Fund

Application deadline: 17 September 2014, 4pm

From the summary:

Our Tools and Resources Development Fund (TRDF) aims to pump prime the next generation of tools, technologies and resources that will be required by bioscience researchers in scientific areas within our remit. It is anticipated that successful grants will not exceed £150k (£187k FEC) (ref 1) and a fast-track, light touch peer review process will operate to enable researchers to respond rapidly to emerging challenges and opportunities.

Projects are expected to have a maximum value of £150k (ref 1). The duration of projects should be between 6 and 18 months, although community networks to develop standards could be supported for up to 3 years.

A number of different types of proposal are eligible for consideration.

  • New approaches to the analysis, modelling and interpretation of research data in the biological sciences, including development of software tools and algorithms. Of particular interest will be proposals that address challenges arising from emerging new types of data and proposals that address known problems associated with data handling (e.g. next generation sequencing, high-throughput phenotyping, the extraction of data from challenging biological images, metagenomics).
  • New frameworks for the curation, sharing, and re-use/re-purposing of research data in the biological sciences, including embedding data citation mechanisms (e.g. persistent identifiers for datasets within research workflows) and novel data management planning (DMP) implementations (e.g. integration of DMP tools within research workflows)
  • Community approaches to the sharing of research data including the development of standards (this could include coordinating UK input into international standards development activities).
  • Approaches designed to exploit the latest computational technology to further biological research; for example, to facilitate the use of cloud computing approaches or high performance computing architectures.

Projects may extend existing software resources; however, the call is designed to support novel tools and methods. Incremental improvement and maintenance of existing software that does not provide new functionality or significant performance improvements (e.g. by migration to an advanced computing environment) does not fall within the scope of the call.

Very timely, since the UK has announced that OpenDocument Format (ODF) is among its approved open standards:

The standards set out the document file formats that are expected to be used across all government bodies. Government will begin using open formats that will ensure that citizens and people working in government can use the applications that best meet their needs when they are viewing or working on documents together. (Open document formats selected to meet user needs)

ODF as a format supports RDFa as metadata but lacks an implementation that makes full use of that capability.
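Getting at ODF metadata programmatically is straightforward, since an ODF document is a ZIP archive with its metadata stored in meta.xml. A minimal sketch, using an in-memory stand-in for a real .odt file (the title and file contents are invented for illustration):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used in ODF meta.xml.
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
DC = "http://purl.org/dc/elements/1.1/"

# Build a minimal stand-in for an .odt file: ODF documents are ZIP
# archives, with document metadata stored in meta.xml.
meta_xml = (
    f'<office:document-meta xmlns:office="{OFFICE}" xmlns:dc="{DC}">'
    "<office:meta><dc:title>Wnt signaling notes</dc:title></office:meta>"
    "</office:document-meta>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as odt:
    odt.writestr("meta.xml", meta_xml)

# Reading the metadata back out: open the ZIP, parse meta.xml.
with zipfile.ZipFile(buf) as odt:
    root = ET.fromstring(odt.read("meta.xml"))
title = root.find(f".//{{{DC}}}title").text
print(title)  # Wnt signaling notes
```

Any richer biocuration metadata, RDFa included, would travel in the same package, which is what makes word processing software a plausible curation front end.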

Imagine biocuration that:

  • Starts with authors writing a text and is delivered to
  • Publishers, who can proof or augment the author’s biocuration
  • Results are curated on publication (not months or years later)
  • Results are immediately available for collation with other results.

The only way to match the explosive growth of bioscience publications with equally explosive growth of bioscience curation is to use tools users already know, like word processing software.

Please pass this along and let me know of other grants or funding opportunities where adaptation of office standards or software could change the fundamentals of workflow.

The EC Brain

Tuesday, July 8th, 2014

Scientists threaten to boycott €1.2bn Human Brain Project by Ian Sample.

From the post:

The world’s largest project to unravel the mysteries of the human brain has been thrown into crisis with more than 100 leading researchers threatening to boycott the effort amid accusations of mismanagement and fears that it is doomed to failure.

The European commission launched the €1.2bn (£950m) Human Brain Project (HBP) last year with the ambitious goal of turning the latest knowledge in neuroscience into a supercomputer simulation of the human brain. More than 80 European and international research institutions signed up to the 10-year project.

But it proved controversial from the start. Many researchers refused to join on the grounds that it was far too premature to attempt a simulation of the entire human brain in a computer. Now some claim the project is taking the wrong approach, wastes money and risks a backlash against neuroscience if it fails to deliver.

In an open letter to the European commission on Monday, more than 130 leaders of scientific groups around the world, including researchers at Oxford, Cambridge, Edinburgh and UCL, warn they will boycott the project and urge others to join them unless major changes are made to the initiative.

If you read Ian’s post and the background material he cites, I think you will come away with the impression that all the concerns and charges are valid.

However, the question remains whether a successful project was ever the goal of the EC. Or was there some other goal, such as funding particular people and groups, for which the project was a convenient vehicle?

Had the project succeeded, all well and good. But ten years from now there will have been a decade of other grants with as little chance of success, so who would remember this one in particular?

I don’t mean to single out EC projects or even governmental projects for that criticism.

If you remember Moral Mazes: The World of Corporate Managers by Robert Jackall, one of its lessons was that the goal of projects in a corporation isn’t improving the bottom line or the success of the project, but the allocation of project resources among competing groups.

As more evidence of that mentality, consider the laundry list of failed IT projects undertaken by the U.S. government, from the FBI’s Virtual Case File system to the now famous, Greenpeace-monitored “secret” melting NSA data storage facility in Utah.

Greenpeace airship

The purpose of the melting NSA data center wasn’t to store data (important steps were skipped in the design stage) but to transfer funds to NSA contractors for building and then repairing a data center that may or may not ever see actual use.

If there was actual intent to use the data center, where are the complaints about failure to follow the design? Use of sub-standard materials?

Both the EC Brain and the US Government need a new project strategy: Success isn’t defined by the appropriation and spending of funds. Success is defined by the end results of the project when compared to its original goals.

Imagine having a topic map that traced EC and US funded projects and compared results to original goals.

Anyone interested in a funding investigation that specifies who was paid, who approved, etc.?

Unlike Google, the voters should never forget who obtained the benefit of their tax dollars with no appreciable return.

…lotteries to pick NIH research-grant recipients

Thursday, April 17th, 2014

Wall Street Journal op-ed advocates lotteries to pick NIH research-grant recipients by Steven T. Corneliussen

From the post:

The subhead for the Wall Street Journal op-ed “Taking the Powerball approach to funding medical research” summarizes its coauthors’ argument about research funding at the National Institutes of Health (NIH): “Winning a government grant is already a crapshoot. Making it official by running a lottery would be an improvement.”

The coauthors, Ferric C. Fang and Arturo Casadevall, serve respectively as a professor of laboratory medicine and microbiology at the University of Washington School of Medicine and as professor and chairman of microbiology and immunology at the Albert Einstein College of Medicine of Yeshiva University.

At a time when funding levels are historically low, they note, grant peer review remains expensive. The NIH Center for Scientific Review has a $110 million annual budget. Grant-submission and grant-review processes extract an additional high toll from participants. Within this context, the coauthors summarize criticisms of NIH peer review. They mention a 2012 Nature commentary that argued, they say, that the system’s structure “encourages conformity.” In particular, after mentioning a study in the journal Circulation Research, they propose that concerning projects judged good enough for funding, “NIH peer reviewers fare no better than random chance when it comes to predicting how well grant recipients will perform.”

Nature should run a “mock” lottery to judge the acceptance of papers alongside its normal peer review process, then publish the results after a year of peer review “competing” with the lottery.

Care to speculate on the results as evaluated by Nature readers?
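The op-ed’s proposal is easy to mock up. A sketch of a “fund by lottery above a quality threshold” draw, with invented scores and an arbitrary threshold:

```python
import random

random.seed(42)  # reproducible draw

# Invented peer-review scores for 100 proposals, on a 0-100 scale.
proposals = {f"R01-{i:03d}": random.uniform(0, 100) for i in range(100)}

# Step 1: screen out proposals not judged "good enough" (threshold invented).
fundable = [pid for pid, score in proposals.items() if score >= 50]

# Step 2: award the available slots by lottery rather than by rank order.
slots = 10
awards = random.sample(fundable, slots)

print(f"{len(fundable)} fundable proposals, {len(awards)} funded by lottery")
```

Peer review still gates entry; the lottery only replaces the fine-grained ranking that the cited study suggests is no better than chance.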

The Theoretical Astrophysical Observatory:…

Sunday, March 30th, 2014

The Theoretical Astrophysical Observatory: Cloud-Based Mock Galaxy Catalogues by Maksym Bernyk, et al.

Abstract:

We introduce the Theoretical Astrophysical Observatory (TAO), an online virtual laboratory that houses mock observations of galaxy survey data. Such mocks have become an integral part of the modern analysis pipeline. However, building them requires an expert knowledge of galaxy modelling and simulation techniques, significant investment in software development, and access to high performance computing. These requirements make it difficult for a small research team or individual to quickly build a mock catalogue suited to their needs. To address this TAO offers access to multiple cosmological simulations and semi-analytic galaxy formation models from an intuitive and clean web interface. Results can be funnelled through science modules and sent to a dedicated supercomputer for further processing and manipulation. These modules include the ability to (1) construct custom observer light-cones from the simulation data cubes; (2) generate the stellar emission from star formation histories, apply dust extinction, and compute absolute and/or apparent magnitudes; and (3) produce mock images of the sky. All of TAO’s features can be accessed without any programming requirements. The modular nature of TAO opens it up for further expansion in the future.

The website: Theoretical Astrophysical Observatory.

While disciplines in the sciences and the humanities play access games with data and publications, the astronomy community continues to shame both of them.

Funders, both government and private, should take a common approach: open and unfettered access to data or no funding.

It’s just that simple.

If grantees object, they can try to function without funding.

Office of Incisive Analysis

Wednesday, March 12th, 2014

Office of Incisive Analysis Office Wide – Broad Agency Announcement (BAA) IARPA-BAA-14-02
BAA Release Date: March 10, 2014

FedBizOpps Reference

IARPA-BAA-14-02 with all Supporting Documents

From the webpage:

Synopsis

IARPA invests in high-risk, high-payoff research that has the potential to provide our nation with an overwhelming intelligence advantage over future adversaries. This BAA solicits abstracts/proposals for Incisive Analysis.

IA focuses on maximizing insights from the massive, disparate, unreliable and dynamic data that are – or could be – available to analysts, in a timely manner. We are pursuing new sources of information from existing and novel data, and developing innovative techniques that can be utilized in the processes of analysis. IA programs are in diverse technical disciplines, but have common features: (a) Create technologies that can earn the trust of the analyst user by providing the reasoning for results; (b) Address data uncertainty and provenance explicitly.

The following topics (in no particular order) are of interest to IA:

  • Methods for estimation and communication of uncertainty and risk;
  • Methods for understanding the process of analysis and potential impacts of technology;
  • Methods for measuring and improving human judgment and human reasoning;
  • Multidisciplinary approaches to processing noisy audio and speech;
  • Methods and approaches to quantifiable representations of uncertainty simultaneously accounting for multiple types of uncertainty;
  • Discovering, tracking and sorting emerging events and participating entities found in reports;
  • Accelerated system development via machine learning;
  • Testable methods for identifying individuals’ intentions;
  • Methods for developing understanding of how knowledge and ideas are transmitted and change within groups, organizations, and cultures;
  • Methods for analysis of social, cultural, and linguistic data;
  • Methods to construct and evaluate speech recognition systems in languages without a formalized orthography;
  • Multidisciplinary approaches to assessing linguistic data sets;
  • Mechanisms for detecting intentionally falsified representations of events and/or personas;
  • Methods for understanding and managing massive, dynamic data in images, video, and speech;
  • Analysis of massive, unreliable, and diverse data;
  • Methods to make machine learning more useful and automatic;
  • 4D geospatial/temporal representations to facilitate change detection and analysis;
  • Novel approaches for mobile augmented reality applied to analysis and collection;
  • Methods for assessments of relevancy and reliability of new data;
  • Novel approaches to data and knowledge management facilitating discovery, retrieval and manipulation of large volumes of information to provide greater access to interim analytic and processing products.

This announcement seeks research ideas for topics that are not addressed by emerging or ongoing IARPA programs or other published IARPA solicitations. It is primarily, but not solely, intended for early stage research that may lead to larger, focused programs through a separate BAA in the future, so periods of performance generally will not exceed 12 months.

Offerors should demonstrate that their proposed effort has the potential to make revolutionary, rather than incremental, improvements to intelligence capabilities. Research that primarily results in evolutionary improvement to the existing state of practice is specifically excluded.

Contracting Office Address:
Office of Incisive Analysis
Intelligence Advanced Research Projects Activity
Office of the Director of National Intelligence
ATTN: IARPA-BAA-14-02
Washington, DC 20511
Fax: 301-851-7673

Primary Point of Contact:
dni-iarpa-baa-14-02@iarpa.gov

The “topics … of interest” that caught my eye for topic maps are:

  • Methods for measuring and improving human judgment and human reasoning;
  • Discovering, tracking and sorting emerging events and participating entities found in reports;
  • Methods for developing understanding of how knowledge and ideas are transmitted and change within groups, organizations, and cultures;
  • Methods for analysis of social, cultural, and linguistic data;
  • Novel approaches to data and knowledge management facilitating discovery, retrieval and manipulation of large volumes of information to provide greater access to interim analytic and processing products.

Think of capturing the insights of users, as they use and add content to a topic map, as more than “evolutionary change.”

Others?

CORDIS – EU research projects under FP7 (2007-2013)

Monday, March 10th, 2014

CORDIS – EU research projects under FP7 (2007-2013)

Description:

This dataset contains projects funded by the European Union under the seventh framework programme for research and technological development (FP7) from 2007 to 2013. Grant information is provided for each project, including reference, acronym, dates, funding, programmes, participant countries, subjects and objectives. A smaller file is also provided without the texts for objectives.

The column separator is the “;” character.

The “Achievements” column is blank for all 22,653 projects/rows.
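Verifying a claim like that takes only a few lines; here a tiny in-memory sample stands in for the real CORDIS export (same “;” separator, simplified columns):

```python
import csv
import io

# Stand-in for a few rows of the CORDIS FP7 export; the real file uses
# ";" as the column separator, just like this sample.
sample = io.StringIO(
    "Reference;Acronym;Achievements\n"
    "200001;GRAPHENE;\n"
    "200002;QUANTUM;\n"
)

rows = list(csv.DictReader(sample, delimiter=";"))

# Count rows whose "Achievements" column is empty.
blank = sum(1 for row in rows if not row["Achievements"].strip())
print(f"{blank} of {len(rows)} rows have a blank Achievements column")
```

Run against the full download, the same count comes back 22,653 of 22,653.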

Can you suggest other sources with machine-readable data on the results of EU research projects under FP7 (2007-2013)?

Thanks!

I first saw this in a tweet by Stefano Bertolo.

Fostering Innovation?

Thursday, March 6th, 2014

How Academia and Publishing are Destroying Scientific Innovation: A Conversation with Sydney Brenner by Elizabeth Dzeng.

From the post:

I recently had the privilege of speaking with Professor Sydney Brenner, a professor of Genetic medicine at the University of Cambridge and Nobel Laureate in Physiology or Medicine in 2002. My original intention was to ask him about Professor Frederick Sanger, the two-time Nobel Prize winner famous for his discovery of the structure of proteins and his development of DNA sequencing methods, who passed away in November. I wanted to do the classic tribute by exploring his scientific contributions and getting a first hand account of what it was like to work with him at Cambridge’s Medical Research Council’s (MRC) Laboratory for Molecular Biology (LMB) and at King’s College where they were both fellows. What transpired instead was a fascinating account of the LMB’s quest to unlock the genetic code and a critical commentary on why our current scientific research environment makes this kind of breakthrough unlikely today.

If you or any funders you know are interested in fostering innovation, that is actually enabling innovation to happen, this is a must read interview for you.

If you or any funders you know are interested in boasting about “fostering innovation” and creating “new breakthroughs” while funding the usual suspects, just pass this one by.

One can only hope that observations of proven innovators like Sydney Brenner will carry more weight than political ideologies in the research funding process.

I first saw this in a tweet by Ivan Herman.

Knight News Challenge

Sunday, March 2nd, 2014

Knight News Challenge

Phases of the Challenge:

Submissions (February 27 – March 18)
Feedback (March 18 – April 18)
Refinement (April 18 – 28)
Evaluation (Begins April 28)

From the webpage:

How can we strengthen the Internet for free expression and innovation?

This is an open call for ideas. We want to discover projects that make the Internet better. We believe that access to information is key to vibrant and successful communities, and we want the Internet to remain an open, equitable platform for free expression, commerce and learning. We want an Internet that fuels innovation through the creation and sharing of ideas.

We don’t have specific projects that we’re hoping to see in response to our question. Instead, we want this challenge to attract a range of approaches. In addition to technologies, we’re open to ideas focused on journalism, policy, research, education– any innovative project that results in a stronger Internet.

So we want to know what you think– what captures your imagination when you think about the Internet as a place for free expression and innovation? In June we will award $2.75 million, including $250,000 from the Ford Foundation, to support the most compelling ideas.

Breaking the stranglehold of PageRank is at the top of my short list. There is a great deal to be said for the “wisdom” of crowds, but it doesn’t respond well to the passage of time. Old material keeps racking up credibility long past its “ignore by” date.

More granular date sorting would be a strong second on my list.
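One way to stop old material from accumulating credibility indefinitely is to decay scores with age. A toy scoring function, with an arbitrary one-year half-life (all numbers invented):

```python
def decayed_score(relevance: float, age_days: float,
                  half_life_days: float = 365.0) -> float:
    """Exponentially discount a relevance score by document age."""
    return relevance * 0.5 ** (age_days / half_life_days)

# A fresh, moderately relevant page can outrank a stale, highly linked one.
fresh = decayed_score(relevance=0.6, age_days=30)
stale = decayed_score(relevance=0.9, age_days=2000)
print(round(fresh, 3), round(stale, 3))
```

The half-life is the knob: per-query or per-topic values would give the more granular date sensitivity argued for above.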

What’s on your short list?

Data-Driven Discovery Initiative

Saturday, February 15th, 2014

Data-Driven Discovery Initiative

Pre-Applications Due February 24, 2014 by 5 pm Pacific Time.

15 Awards at $1,500,000 each, at $200K-$300K/year for five years.

From the post:

Our Data-Driven Discovery Initiative seeks to advance the people and practices of data-intensive science, to take advantage of the increasing volume, velocity, and variety of scientific data to make new discoveries. Within this initiative, we’re supporting data-driven discovery investigators – individuals who exemplify multidisciplinary, data-driven science, coalescing natural sciences with methods from statistics and computer science.

These innovators are striking out in new directions and are willing to take risks with the potential of huge payoffs in some aspect of data-intensive science. Successful applicants must make a strong case for developments in the natural sciences (biology, physics, astronomy, etc.) or science enabling methodologies (statistics, machine learning, scalable algorithms, etc.), and applicants that credibly combine the two are especially encouraged. Note that the Science Program does not fund disease targeted research.

It is anticipated that the DDD initiative will make about 15 awards at ~$1,500,000 each, at $200K-$300K/year for five years.

Pre-applications are due Monday, February 24, 2014 by 5 pm Pacific Time. To begin the pre-application process, click the “Apply Here” button above. We expect to extend invitations for full applications in April 2014. Full applications will be due five weeks after the invitation is sent, currently anticipated for mid-May 2014.

Apply Here

If you are interested in leveraging topic maps in your application, give me a call!

As far as I know, topic maps remain the only technology that documents the basis for merging distinct representations of the same subject.

Mappings, such as those you find in Talend and other enterprise data management technologies, are great, so long as you don’t care why a particular mapping was done.

And in many cases, it may not matter, as when you are exporting a one-time mailing list for a media campaign. It’s going to be discarded after use, so who cares?

In other cases, where labor intensive work is required to discover the “why” of a prior mapping, documenting that “why” would be useful.

Topic maps can document as much or as little of the semantics of your data and data processing stack as you desire. Topic maps can’t make legacy data and data semantic issues go away, but they can become manageable.
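The documentation can be lightweight. A toy merge that refuses to combine representations without a shared subject identifier and records why the merge happened (all names and URIs invented):

```python
# Two representations of the same subject from different systems,
# each carrying subject identifiers (invented URIs).
rec_a = {"name": "W3C", "ids": {"http://example.org/id/w3c"}, "founded": 1994}
rec_b = {"name": "World Wide Web Consortium",
         "ids": {"http://example.org/id/w3c"}, "hq": "Cambridge, MA"}

def merge(a: dict, b: dict) -> dict:
    """Merge two subject representations, documenting the basis."""
    shared = a["ids"] & b["ids"]
    if not shared:
        raise ValueError("no shared subject identifier; refusing to merge")
    merged = {**a, **b, "ids": a["ids"] | b["ids"]}
    # The "why" travels with the result instead of being lost.
    merged["merge_basis"] = f"shared subject identifier(s): {sorted(shared)}"
    return merged

result = merge(rec_a, rec_b)
print(result["merge_basis"])
```

The point isn’t the four lines of merging; it’s the `merge_basis` record, which is what a later investigator would otherwise have to rediscover by hand.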

Big Mechanism (DARPA)

Saturday, February 8th, 2014

Big Mechanism, Solicitation Number: DARPA-BAA-14-14

Response Date: Mar 18, 2014 12:00 pm Eastern

From the solicitation:

Here is one way to factor the technologies in the Big Mechanism program:

  1. Read abstracts and papers to extract fragments of causal mechanisms;
  2. Assemble fragments into more complete Big Mechanisms;
  3. Explain and reason with Big Mechanisms.

Here is a sample from Reading:

As with all natural language processing, Reading is bedeviled by ambiguity [5]. The mapping of named entities to biological entities is many-to-many. Context matters, but is often missing; for example, the organism in which a pathway is studied might be mentioned once at the beginning of a document and ignored thereafter. Although the target semantics involves processes, these can be described at different levels of detail and precision. For example, “β-catenin is a critical component of Wnt-mediated transcriptional activation” tells us only that β-catenin is involved in a process; whereas, “ARF6 activation promotes the intracellular accumulation of β-catenin” tells us that ARF6 promotes a process; and “L-cells treated with the GSK3 β inhibitor LiCl (50 mM) . . . showed a marked increase in β-catenin fluorescence within 30 – 60 min” describes the kinetics of a process. Processes also can be described as modular abstractions, as in “. . . the endocytosis of growth factor receptors and robust activation of extracellular signal-regulated kinase”. It might be possible to extract causal skeletons of complicated processes (i.e., the entities and how they causally influence each other) by reading abstracts, but it seems likely that extracting the kinetics of processes will require reading full papers. It is unclear whether this program will be able to provide useful explanations of processes if it doesn’t extract the kinetics of these processes.
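Even the easiest case, pattern-matching “X promotes Y” assertions, shows how fragile naive reading is. A toy extractor (the pattern and sentences are illustrations, not from the solicitation):

```python
import re

# A deliberately naive pattern for one family of causal assertions.
# It grabs only the single word after "promotes (the)".
PATTERN = re.compile(r"(\w[\w-]*) (?:activation )?promotes (?:the )?([\w-]+)")

sentences = [
    "ARF6 activation promotes the intracellular accumulation of beta-catenin.",
    "beta-catenin is a critical component of Wnt-mediated transcriptional activation.",
]

fragments = []
for s in sentences:
    m = PATTERN.search(s)
    if m:
        fragments.append((m.group(1), "promotes", m.group(2)))

print(fragments)
```

The first sentence yields a fragment, but a mangled one (the object is really a whole process, not the word “intracellular”), and the second sentence defeats the pattern entirely, which is the ambiguity problem the solicitation describes.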

An interesting solicitation. Yes?

I thought it was odd though that the solicitation starts out with:

DARPA is soliciting innovative research proposals in the area of reading research papers and abstracts to construct and reason over explanatory, causal models of complicated systems. Proposed research should investigate innovative approaches that enable revolutionary advances in science, devices, or systems. Specifically excluded is research that primarily results in evolutionary improvements to the existing state of practice. (emphasis added)

But then gives the goals for 18 months in part as:

  • Development of a formal representation language for biological processes;
  • Extraction of fragments of known signaling networks from a relatively small and carefully selected corpus of texts and encoding of these fragments in a formal language;

Aren’t people developing formal languages to track signaling networks right now? I am puzzled how that squares with the criterion:

Specifically excluded is research that primarily results in evolutionary improvements to the existing state of practice.

It does look like an exciting project, assuming it isn’t limited to current approaches.

Open Educational Resources for Biomedical Big Data

Friday, January 17th, 2014

Open Educational Resources for Biomedical Big Data (R25)

Deadline for submission: April 1, 2014

Additional information: bd2k_training@mail.nih.gov

As part of the NIH Big Data to Knowledge (BD2K) project, BD2K R25 FOA will support:

Curriculum or Methods Development of innovative open educational resources that enhance the ability of the workforce to use and analyze biomedical Big Data.

The challenges:

The major challenges to using biomedical Big Data include the following:

Locating data and software tools: Investigators need straightforward means of knowing what datasets and software tools are available and where to obtain them, along with descriptions of each dataset or tool. Ideally, investigators should be able to easily locate all published and resource datasets and software tools, both basic and clinical, and, to the extent possible, unpublished or proprietary data and software.

Gaining access to data and software tools: Investigators need straightforward means of 1) releasing datasets and metadata in standard formats; 2) obtaining access to specific datasets or portions of datasets; 3) studying datasets with the appropriate software tools in suitable environments; and 4) obtaining analyzed datasets.

Standardizing data and metadata: Investigators need data to be in standard formats to facilitate interoperability, data sharing, and the use of tools to manage and analyze the data. The datasets need to be described by standard metadata to allow novel uses as well as reuse and integration.

Sharing data and software: While significant progress has been made in broad and rapid sharing of data and software, it is not yet the norm in all areas of biomedical research. More effective data- and software-sharing would be facilitated by changes in the research culture, recognition of the contributions made by data and software generators, and technical innovations. Validation of software to ensure quality, reproducibility, provenance, and interoperability is a notable goal.

Organizing, managing, and processing biomedical Big Data: Investigators need biomedical data to be organized and managed in a robust way that allows them to be fully used; currently, most data are not sufficiently well organized. Barriers exist to releasing, transferring, storing, and retrieving large amounts of data. Research is needed to design innovative approaches and effective software tools for organizing biomedical Big Data for data integration and sharing while protecting human subject privacy.

Developing new methods for analyzing biomedical Big Data: The size, complexity, and multidimensional nature of many datasets make data analysis extremely challenging. Substantial research is needed to develop new methods and software tools for analyzing such large, complex, and multidimensional datasets. User-friendly data workflow platforms and visualization tools are also needed to facilitate the analysis of Big Data.

Training researchers for analyzing biomedical Big Data: Advances in biomedical sciences using Big Data will require more scientists with the appropriate data science expertise and skills to develop methods and design tools, including those in many quantitative science areas such as computational biology, biomedical informatics, biostatistics, and related areas. In addition, users of Big Data software tools and resources must be trained to utilize them well.

Another big data biomedical data integration funding opportunity!

I do wonder about the suggestion:

The datasets need to be described by standard metadata to allow novel uses as well as reuse and integration.

Do they mean:

“Standard” metadata for a particular academic lab?

“Standard” metadata for a particular industry lab?

“Standard” metadata for either one five (5) years ago?

“Standard” metadata for either one five (5) years from now?

The problem is the familiar one: knowledge that isn’t moving forward is outdated.

It’s hard to do good research with outdated information.

Making metadata dynamic, so that it reflects yesterday’s terminology, today’s and someday tomorrow’s, would be far more useful.

The metadata displayed to any user would be their choice of metadata and not the complexities that make the metadata dynamic.
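One lightweight way to make metadata dynamic in that sense is to keep canonical field names separate from the vocabulary each user sees. A sketch with invented terms:

```python
# Canonical field names, with per-vintage display vocabularies.
# All terms are invented for illustration.
VOCABULARIES = {
    "2009": {"organism": "source organism", "assay": "experiment type"},
    "2014": {"organism": "taxon", "assay": "assay type"},
}

record = {"organism": "M. musculus", "assay": "RNA-seq"}

def display(record: dict, vintage: str) -> dict:
    """Render a record in the metadata vocabulary a user prefers."""
    terms = VOCABULARIES[vintage]
    return {terms[field]: value for field, value in record.items()}

print(display(record, "2009"))
print(display(record, "2014"))
```

The stored record never changes; only the vocabulary layer does, so tomorrow’s terminology is one more entry in the table rather than a migration.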

Interested?

Courses for Skills Development in Biomedical Big Data Science

Thursday, January 16th, 2014

Courses for Skills Development in Biomedical Big Data Science

Deadline for submission: April 1, 2014

Additional information: bd2k_training@mail.nih.gov

As part of the NIH Big Data to Knowledge (BD2K) project, the BD2K R25 FOA will support:

Courses for Skills Development in topics necessary for the utilization of Big Data, including the computational and statistical sciences in a biomedical context. Courses will equip individuals with additional skills and knowledge to utilize biomedical Big Data.

Challenges in biomedical Big Data?

The major challenges to using biomedical Big Data include the following:

Locating data and software tools: Investigators need straightforward means of knowing what datasets and software tools are available and where to obtain them, along with descriptions of each dataset or tool. Ideally, investigators should be able to easily locate all published and resource datasets and software tools, both basic and clinical, and, to the extent possible, unpublished or proprietary data and software.

Gaining access to data and software tools: Investigators need straightforward means of 1) releasing datasets and metadata in standard formats; 2) obtaining access to specific datasets or portions of datasets; 3) studying datasets with the appropriate software tools in suitable environments; and 4) obtaining analyzed datasets.

Standardizing data and metadata: Investigators need data to be in standard formats to facilitate interoperability, data sharing, and the use of tools to manage and analyze the data. The datasets need to be described by standard metadata to allow novel uses as well as reuse and integration.

Sharing data and software: While significant progress has been made in broad and rapid sharing of data and software, it is not yet the norm in all areas of biomedical research. More effective data- and software-sharing would be facilitated by changes in the research culture, recognition of the contributions made by data and software generators, and technical innovations. Validation of software to ensure quality, reproducibility, provenance, and interoperability is a notable goal.

Organizing, managing, and processing biomedical Big Data: Investigators need biomedical data to be organized and managed in a robust way that allows them to be fully used; currently, most data are not sufficiently well organized. Barriers exist to releasing, transferring, storing, and retrieving large amounts of data. Research is needed to design innovative approaches and effective software tools for organizing biomedical Big Data for data integration and sharing while protecting human subject privacy.

Developing new methods for analyzing biomedical Big Data: The size, complexity, and multidimensional nature of many datasets make data analysis extremely challenging. Substantial research is needed to develop new methods and software tools for analyzing such large, complex, and multidimensional datasets. User-friendly data workflow platforms and visualization tools are also needed to facilitate the analysis of Big Data.

Training researchers for analyzing biomedical Big Data: Advances in biomedical sciences using Big Data will require more scientists with the appropriate data science expertise and skills to develop methods and design tools, including those in many quantitative science areas such as computational biology, biomedical informatics, biostatistics, and related areas. In addition, users of Big Data software tools and resources must be trained to utilize them well.

It’s hard for me to read that list and not see subject identity playing a role in meeting every one of those challenges. Not a complete solution, since each challenge bundles a variety of problems, but if access to datasets, issues and approaches is to be preserved over time, subject identity is a necessary component of any solution.
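To make the “locating data” challenge concrete, here is a rough sketch (dataset names, subject identifiers and synonym lists are all invented): a catalog that indexes datasets by subject rather than by string, so a search in one vocabulary finds data tagged in another.

```python
# Hypothetical sketch: index datasets by subject, not by string.
# "myocardial infarction" and "heart attack" name the same subject,
# so a search with either term locates the same dataset.

SUBJECTS = {
    "subj:mi": {"myocardial infarction", "heart attack", "MI"},
    "subj:t2d": {"type 2 diabetes", "adult-onset diabetes"},
}

DATASETS = [
    {"name": "cardiac-outcomes-2013", "tags": ["myocardial infarction"]},
    {"name": "glucose-cohort", "tags": ["adult-onset diabetes"]},
]

def subject_of(term):
    """Resolve a term to its subject identifier, if any."""
    for subj, names in SUBJECTS.items():
        if term.lower() in {n.lower() for n in names}:
            return subj
    return None

def find_datasets(query):
    """Return datasets about the same subject as the query term."""
    subj = subject_of(query)
    return [d["name"] for d in DATASETS
            if subj is not None
            and any(subject_of(t) == subj for t in d["tags"])]

print(find_datasets("heart attack"))  # ['cardiac-outcomes-2013']
```

A clinician searching for “heart attack” and a researcher searching for “MI” reach the same dataset, which is the point of treating identity as a mapping rather than a string match.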

Applicants have to be institutions of higher education but I assume they can hire expertise as required.

On Self-Licking Ice Cream Cones

Thursday, December 5th, 2013

On Self-Licking Ice Cream Cones by Peter Worden. 1992

Ben Brody in The definitive glossary of modern US military slang quotes the following definition for a Self-Licking Ice Cream Cone:

A military doctrine or political process that appears to exist in order to justify its own existence, often producing irrelevant indicators of its own success. For example, continually releasing figures on the amount of Taliban weapons seized, as if there were a finite supply of such weapons. While seizing the weapons, soldiers raid Afghan villages, enraging the residents and legitimizing the Taliban’s cause.

Wikipedia at (Self-licking ice cream cone) reports the phrase was first used by Pete Worden in “On Self-Licking Ice Cream Cones” in 1992 to describe the NASA bureaucracy.

The keywords for the document are: Ice Cream Cones; Pork; NASA; Mafia; Congress.

Birds of a feather I would say.

Worden isolates several problems:

Problems, National, The Budget Process


This unfortunate train of events has resulted in a NASA which, more than any other agency, believes it works only for the appropriations committees. The senior staff of those committees, who have little interest in science or space, effectively run NASA. NASA senior officials’ noses are usually found at waist level near those committee staffers.

Problems, Closer to Home, NASA

“The Self-Licking Ice Cream Cone”

Since NASA effectively works for the most porkish part of Congress, it is not surprising that their programs are designed to maximize and perpetuate jobs programs in key Congressional districts. The Space Shuttle-Space Station is an outrageous example. Almost two-thirds of NASA’s budget is tied up in this self-licking program. The Shuttle is an unbelievably costly way to get to space at $1 billion a pop. The Space Station is a silly design. Yet, this Station is designed so it can only be built by the Shuttle and the Shuttle is the only way to construct the Station….

“Inmates Running the Asylum”

NASA’s vaunted “peer review” process is not a positive factor, but an example of the “pork” mentality within the scientific community. It results in needlessly complex programs whose primary objective is not putting instruments in orbit, but maximizing the number of constituencies and investigators, thereby maximizing the political invulnerability of the program….

“Mafia Tactics”

…The EOS is a case in point. About a year ago, encouraged by criticism from some quarters of Congress and in the press, some scientists and satellite contractors began proposing small, cheap, near-term alternatives to the EOS “battlestars.” Senior NASA officials conducted, with impunity, an unbelievable campaign of threats against these critics. Members of the White House advisory committees were told they would not get NASA funding if they continued to probe the program….

“Shoot the Sick Horses, and their Trainers”

It is outrageous that the Hubble disaster resulted in no repercussions. All we hear is that some un-named technician, no longer working for the contractor, made a mistake in the early 1980s. Even in the Defense Department, current officials would lose their jobs over allowing such an untested and expensive system to be launched.

Compare Worden’s complaints to the security apparatus represented by the NSA and its kin.

Have you heard of any repercussions for any of the security failures and/or outrages?

Is there any doubt that the security apparatus exists solely to perpetuate the security apparatus?

By definition the NSA is a Self-Licking Ice Cream Cone.

Time to find a trash can.


EOS: Earth Observing System

Hubble: The Hubble Space Telescope Optical Systems Failure Report (pdf) Long before all the dazzling images from Hubble, it was virtually orbiting space junk for several years.

Pitch Advice For Entrepreneurs

Saturday, October 26th, 2013

Pitch Advice For Entrepreneurs: LinkedIn’s Series B Pitch to Greylock by Reid Hoffman.

From the post:

At Greylock, my partners and I are driven by one guiding mission: always help entrepreneurs. It doesn’t matter whether an entrepreneur is in our portfolio, whether we’re considering an investment, or whether we’re casually meeting for the first time.

Entrepreneurs often ask me for help with their pitch decks. Because we value integrity and confidentiality at Greylock, we never share an entrepreneur’s pitch deck with others. What I’ve honorably been able to do, however, is share the deck I used to pitch LinkedIn to Greylock for a Series B investment back in 2004.

This past May was the 10th anniversary of LinkedIn, and while reflecting on my entrepreneurial journey, I realized that no one gets to see the presentation decks for successful companies. This gave me an idea: I could help many more entrepreneurs by making the deck available not just to the Greylock network of entrepreneurs, but to everyone.

Today, I share the Series B deck with you, too. It has many stylistic errors — and a few substantive ones, too — that I would now change having learned more, but I realized that it still provides useful insights for entrepreneurs and startup participants outside of the Greylock network, particularly across three areas of interest:

  • how entrepreneurs should approach the pitch process
  • the evolution of LinkedIn as a company
  • the consumer internet landscape in 2004 vs. today

Read, digest, and then read again.

I first saw this in a tweet by Tim O’Reilly.

NIH Big Data to Knowledge (BD2K) Initiative [TM Opportunity?]

Sunday, July 28th, 2013

NIH Big Data to Knowledge (BD2K) Initiative by Shar Steed.

From the post:

The National Institutes of Health (NIH) has announced the Centers of Excellence for Big Data Computing in the Biomedical Sciences (U54) funding opportunity announcement, the first in its Big Data to Knowledge (BD2K) Initiative.

The purpose of the BD2K initiative is to help biomedical scientists fully utilize Big Data being generated by research communities. As technology advances, scientists are generating and using large, complex, and diverse datasets, which is making the biomedical research enterprise more data-intensive and data-driven. According to the BD2K website:

[further down in the post]

Data integration: An applicant may propose a Center that will develop efficient and meaningful ways to create connections across data types (i.e., unimodal or multimodal data integration).

That sounds like topic maps doesn’t it?

At least if we get away from black/white merging practices, where topics merge only if they match one of a set of IRIs and not otherwise.

For more details:

A webinar for applicants is scheduled for Thursday, September 12, 2013, from 3 – 4:30 pm EDT. Click here for more information.

Be aware of this workshop:

August 21, 2013 – August 22, 2013
NIH Data Catalogue
Chair:
Francine Berman, Ph.D.

This workshop seeks to identify the least duplicative and burdensome, and most sustainable and scalable method to create and maintain an NIH Data Catalog. An NIH Data Catalog would make biomedical data findable and citable, as PubMed does for scientific publications, and would link data to relevant grants, publications, software, or other relevant resources. The Data Catalog would be integrated with other BD2K initiatives as part of the broad NIH response to the challenges and opportunities of Big Data and seek to create an ongoing dialog with stakeholders and users from the biomedical community.

Contact: BD2Kworkshops@mail.nih.gov

Let’s see: “…least duplicative and burdensome, and most sustainable and scalable method to create and maintain an NIH Data Catalog.”

Recast existing data as RDF with a suitable OWL ontology: duplicative, burdensome, neither sustainable nor scalable.

Accept all existing data as it exists and write subject identity and merging rules: non-duplicative; existing systems persist, so less burdensome; re-use of existing data makes it sustainable; the only open question is scalability.
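A rough sketch of that second approach (the field names, accession numbers and the identity rule are all hypothetical): leave each source record in its native shape and declare a rule stating when two records describe the same dataset, merging only what the rule matches.

```python
# Hypothetical sketch: merge catalog entries from two sources
# without recasting either. An identity rule says when two
# records describe the same dataset; merged entries keep
# every source field as-is.

source_a = [{"accession": "GSE1234", "title": "Liver expression"}]
source_b = [{"geo_id": "GSE1234", "contact": "lab@example.org"}]

def same_dataset(a: dict, b: dict) -> bool:
    """Identity rule: match on the shared accession identifier,
    even though the two sources name that field differently."""
    key_a, key_b = a.get("accession"), b.get("geo_id")
    return key_a is not None and key_a == key_b

def merge_catalogs(xs, ys):
    """Produce one catalog entry per record in xs, folding in
    fields from any matching record in ys."""
    merged = []
    for a in xs:
        entry = dict(a)
        for b in ys:
            if same_dataset(a, b):
                entry.update(b)  # keep both records' fields
        merged.append(entry)
    return merged

print(merge_catalogs(source_a, source_b))
```

Neither source system changes, and adding a third source is a matter of extending the identity rule, not re-modeling the data.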

Sounds like a topic map opportunity to me.

You?

Microsoft kicks off its own bug bounty programme [seed money?]

Sunday, June 23rd, 2013

Microsoft kicks off its own bug bounty programme

From the post:

Microsoft has announced a three-pronged bug bounty programme for its upcoming Windows and Internet Explorer versions. The company will start paying security researchers for disclosing security vulnerabilities to it in a responsible manner, similar to Google’s bug bounty programme for Chrome and Chrome OS that has been ongoing since 2010. Under Microsoft’s new initiative, researchers can report vulnerabilities in the under-development Windows 8.1 and the preview of its Internet Explorer 11 browser. If submissions are accompanied by ideas about how to defend against the attack, the submitting researcher will earn a substantial monetary bonus.

Under the Mitigation Bypass Bounty category, Microsoft will pay researchers up to $100,000 for “truly novel exploitation techniques” against the protections of the latest version of Windows, with up to an additional $50,000 BlueHat Bonus for Defense for ideas how to defend against them. These two categories are open indefinitely. Until 26 July, researchers can also earn up to $11,000 for reporting critical vulnerabilities that affect the Internet Explorer 11 Preview on Windows 8.1 Preview. The company’s bug bounty programme will open for submissions on 26 June, the same day that the company plans to release the Windows 8.1 preview to the wider public.

I mention this as a source of funding for startups, particularly those interested in topic maps.