Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

March 18, 2015

Open Source Tensor Libraries For Data Science

Filed under: Data Science,Mathematics,Open Source,Programming — Patrick Durusau @ 5:20 pm

Let’s build open source tensor libraries for data science by Ben Lorica.

From the post:

Data scientists frequently find themselves dealing with high-dimensional feature spaces. As an example, text mining usually involves vocabularies comprised of 10,000+ different words. Many analytic problems involve linear algebra, particularly 2D matrix factorization techniques, for which several open source implementations are available. Anyone working on implementing machine learning algorithms ends up needing a good library for matrix analysis and operations.

But why stop at 2D representations? In a recent Strata + Hadoop World San Jose presentation, UC Irvine professor Anima Anandkumar described how techniques developed for higher-dimensional arrays can be applied to machine learning. Tensors are generalizations of matrices that let you look beyond pairwise relationships to higher-dimensional models (a matrix is a second-order tensor). For instance, one can examine patterns between any three (or more) dimensions in data sets. In a text mining application, this leads to models that incorporate the co-occurrence of three or more words, and in social networks, you can use tensors to encode arbitrary degrees of influence (e.g., “friend of friend of friend” of a user).
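To make the matrix-to-tensor jump concrete, here is a minimal sketch (Python with NumPy, my own illustration, not Lorica’s) of the three-word co-occurrence idea: a third-order tensor whose entry [i, j, k] counts how often words i, j, and k appear together.

```python
import numpy as np

vocab = ["data", "tensor", "model", "graph"]
index = {w: i for i, w in enumerate(vocab)}

# Toy "documents": windows of three co-occurring words.
windows = [
    ("data", "tensor", "model"),
    ("tensor", "model", "graph"),
    ("data", "model", "graph"),
]

# Third-order tensor: T[i, j, k] counts co-occurrences of words i, j, k.
T = np.zeros((len(vocab),) * 3)
for a, b, c in windows:
    T[index[a], index[b], index[c]] += 1

print(T.shape)  # (4, 4, 4) -- a matrix would stop at (4, 4)
print(T[index["data"], index["tensor"], index["model"]])  # 1.0
```

Factor a tensor like that (the libraries below do the heavy lifting) and you recover latent structure that pairwise counts miss.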

In case you are interested, Wikipedia has a list of software packages for tensor analysis.

Not mentioned by Wikipedia: Facebook open sourced TH++, a library for tensor analysis, last year, along with fblualib, which includes a bridge between Python and Lua (for running tensor analysis).

Uni10 wasn’t mentioned by Wikipedia either.

Good starting place: Big Tensor Mining, Carnegie Mellon Database Group.

Suggest you join an existing effort before you start duplicating existing work.

February 14, 2015

Thank Snowden: Internet Industry Now Considers The Intelligence Community An Adversary, Not A Partner

Filed under: Cybersecurity,NSA,Open Source,Security — Patrick Durusau @ 2:31 pm

Thank Snowden: Internet Industry Now Considers The Intelligence Community An Adversary, Not A Partner by Mike Masnick

From the post:

We already wrote about the information sharing efforts coming out of the White House cybersecurity summit at Stanford today. That’s supposedly the focus of the event. However, there’s a much bigger issue happening as well: the growing distrust between the tech industry and the intelligence community. As Bloomberg notes, the CEOs of Google, Yahoo and Facebook were all invited to join President Obama at the summit and all three declined. Apple’s CEO Tim Cook will be there, but he appears to be delivering a message to the intelligence and law enforcement communities, if they think they’re going to get him to drop the plan to encrypt iOS devices by default:


In an interview last month, Timothy D. Cook, Apple’s chief executive, said the N.S.A. “would have to cart us out in a box” before the company would provide the government a back door to its products. Apple recently began encrypting phones and tablets using a scheme that would force the government to go directly to the user for their information. And intelligence agencies are bracing for another wave of encryption.

Disclosure: I have been guilty of what I am about to criticize Mike Masnick about and will almost certainly be guilty of it in the future. That, however, does not make it right.

What would you say is being assumed in Mike’s title?

Guesses anyone?

What if it read: U.S. Internet Industry Now Considers The U.S. Intelligence Community An Adversary, Not A Partner?

Does that help?

The trivial point is that the “Internet Industry” isn’t limited to the U.S. and Mike’s readership isn’t either.

More disturbing, though, is that the “U.S. (meant here descriptively) Internet Industry” at one point did consider the “U.S. (again descriptively) Intelligence Community” a partner.

That being the case and seeing how Mike duplicates that assumption in his title, how should countries besides the U.S. view the reliability (in terms of government access) of U.S. produced software?

That’s a simple enough question.

What is your answer?

The assumption of partnership between the “U.S. Internet Industry” and the “U.S. Intelligence Community” would have me running to back an alternative to China’s recent proposal that source code be delivered to the government (in that case, China’s).

Rather than every country having different import requirements for software sales, why not require the public posting of commercial software source for software sales anywhere?

Posting of source code doesn’t lessen your rights to the code (see copyright statutes) and it makes detection of software piracy trivially easy since all commercial software has to post its source code.

Oh, some teenager might compile a copy but do you really think major corporations in any country are going to take that sort of risk? It just makes no sense.

As far as the “U.S. Intelligence Community” is concerned, remember: “The treacherous are ever distrustful…” The ill intent they see in the world is a reflection of their own malice towards others. Or, after years of systematic abuse, the smoldering anger of the abused.

January 8, 2015

WorldWide Telescope (MS) Goes Open Source!

Filed under: Astroinformatics,Open Source — Patrick Durusau @ 10:15 am

Microsoft is Open‐Sourcing WorldWide Telescope in 2015

From the post:

Why is this great news?

Millions of people rely on WorldWide Telescope (WWT) as their unified astronomical image and data environment for exploratory research, teaching, and public outreach. With OpenWWT, any individual or organization will be able to adapt and extend the functionality of WorldWide Telescope to meet any research or educational need. Extensions to the software will continuously enhance astronomical research, formal and informal learning, and public outreach.

What is WWT, and where did it come from?

WorldWide Telescope began in 2007 as a research project, led from within Microsoft Research. Early partners included astronomers and educators from Caltech, Harvard, Johns Hopkins, Northwestern, the University of Chicago, and several NASA facilities. Thanks to these collaborations and Microsoft’s leadership, WWT has reached its goal of creating a free unified contextual visualization of the Universe with global reach that lets users explore multispectral imagery, all of which is deeply connected to scholarly publications and online research databases.

The WWT software was designed with rich interactivity in mind. Guided tours, which can be created within the program, offer scripted paths through the 3D environment, allowing media-rich interactive stories to be told, about anything from star formation to the discovery of the large scale structure of the Universe. On the web, WWT is used both as a standalone program and as an API, in teaching and in research—where it offers unparalleled options for sharing and contextualizing data sets, on the “2D” multispectral sky and/or within the “3D” Universe.

How can you help?

Open-sourcing WWT will allow the people who can best imagine how WWT should evolve to meet the expanding research and teaching challenges in astronomy to guide and foster future development. The OpenWWT Consortium’s members are institutions who will guide WWT’s transition from Microsoft Research to a new host organization. The Consortium and hosting organization will work with the broader astronomical community on a three-part mission of: 1) advancing astronomical research, 2) improving formal and informal astronomy education; and 3) enhancing public outreach.

Join us. If you and your institution want to help shape the future of WWT to support your needs, and the future of open-source software development in Astronomy, then ask us about joining the OpenWWT Consortium.

To contact the WWT team, or inquire about joining the OpenWWT Consortium, contact Doug Roberts at doug-roberts@northwestern.edu.

What a nice way to start the day!

I’m Twitter follower #30 for OpenWWT. What Twitter follower are you going to be?

If you are interested in astronomy, teaching, interfaces, coding great interfaces, etc., there is something of interest for you here.

Enjoy!

December 26, 2014

Seldon

Filed under: Data Mining,Open Source,Predictive Analytics — Patrick Durusau @ 1:59 pm

Seldon wants to make life easier for data scientists, with a new open-source platform by Martin Bryant.

From the post:

It feels that these days we live our whole digital lives according to mysterious algorithms that predict what we’ll want from apps and websites. A new open-source product could help those building the products we use worry less about writing those algorithms in the first place.

As increasing numbers of companies hire in-house data science teams, there’s a growing need for tools they can work with so they don’t need to build new software from scratch. That’s the gambit behind the launch of Seldon, a new open-source predictions API launching early in the new year.

Seldon is designed to make it easy to plug in the algorithms needed for predictions that can recommend content to customers, offer app personalization features and the like. Aimed primarily at media and e-commerce companies, it will be available both as a free-to-use self-hosted product and a fully hosted, cloud-based version.

If you think Inadvertent Algorithmic Cruelty is a problem, just wait until people who don’t understand the data or the algorithms start using them in prepackaged form.

Packaged predictive analytics are about as safe as arming school crossing guards with .600 Nitro Express rifles to ward off speeders. As attractive as that suggestion sounds, there would be numerous safety concerns.

Different but no less pressing safety concerns abound with packaged predictive analytics. Being disconnected from the actual algorithms, can enterprises claim immunity for race-, gender- or sexual-orientation-based discrimination? Hard to prove “intent” when the answers in question were generated in complete ignorance of the algorithmic choices that drove the results.

At least Seldon is open source and so the algorithms can be examined, should you be interested in how results are calculated. But open source algorithms are but one aspect of the problem. What of the data? Blind application of algorithms, even neutral ones, can lead to any number of results. If you let me supply the data, I can give you a guarantee of the results from any known algorithm. “Untouched by human hands” as they say.
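A minimal sketch of that guarantee (Python with NumPy; the data sets are invented for illustration): ordinary least squares, as “neutral” an algorithm as you could ask for, reports whatever trend the supplied data dictates.

```python
import numpy as np

def fitted_slope(x, y):
    # Ordinary least squares fit of y = a*x + b; return the slope a.
    a, b = np.polyfit(x, y, deg=1)
    return a

x = np.array([1.0, 2.0, 3.0, 4.0])
print(fitted_slope(x, 2 * x))   #  2.0 -- "the trend is up"
print(fitted_slope(x, -2 * x))  # -2.0 -- "the trend is down"
# Same algorithm, opposite conclusions: the supplied data chose the answer.
```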

When you are given recommendations based on predictive analytics do you ask for the data and/or algorithms? Who in your enterprise can do due diligence to verify the results? Who is on the line for bad decisions based on poor predictive analytics?

I first saw this in a tweet by Gregory Piatetsky.

December 17, 2014

Orleans Goes Open Source

Filed under: .Net,Actor-Based,Cloud Computing,HyTime,Microsoft,Open Source — Patrick Durusau @ 7:03 pm

Orleans Goes Open Source

From the post:

Since the release of the Project “Orleans” Public Preview at //build/ 2014 we have received a lot of positive feedback from the community. We took your suggestions and fixed a number of issues that you reported in the Refresh release in September.

Now we decided to take the next logical step, and do the thing many of you have been asking for – to open-source “Orleans”. The preparation work has already commenced, and we expect to be ready in early 2015. The code will be released by Microsoft Research under an MIT license and published on GitHub. We hope this will enable direct contribution by the community to the project. We thought we would share the decision to open-source “Orleans” ahead of the actual availability of the code, so that you can plan accordingly.

The real excitement for me comes from a post just below this announcement, A Framework for Cloud Computing:


To avoid these complexities, we built the Orleans programming model and runtime, which raises the level of the actor abstraction. Orleans targets developers who are not distributed system experts, although our expert customers have found it attractive too. It is actor-based, but differs from existing actor-based platforms by treating actors as virtual entities, not as physical ones. First, an Orleans actor always exists, virtually. It cannot be explicitly created or destroyed. Its existence transcends the lifetime of any of its in-memory instantiations, and thus transcends the lifetime of any particular server. Second, Orleans actors are automatically instantiated: if there is no in-memory instance of an actor, a message sent to the actor causes a new instance to be created on an available server. An unused actor instance is automatically reclaimed as part of runtime resource management. An actor never fails: if a server S crashes, the next message sent to an actor A that was running on S causes Orleans to automatically re-instantiate A on another server, eliminating the need for applications to supervise and explicitly re-create failed actors. Third, the location of the actor instance is transparent to the application code, which greatly simplifies programming. And fourth, Orleans can automatically create multiple instances of the same stateless actor, seamlessly scaling out hot actors.

Overall, Orleans gives developers a virtual “actor space” that, analogous to virtual memory, allows them to invoke any actor in the system, whether or not it is present in memory. Virtualization relies on indirection that maps from virtual actors to their physical instantiations that are currently running. This level of indirection provides the runtime with the opportunity to solve many hard distributed systems problems that must otherwise be addressed by the developer, such as actor placement and load balancing, deactivation of unused actors, and actor recovery after server failures, which are notoriously difficult for them to get right. Thus, the virtual actor approach significantly simplifies the programming model while allowing the runtime to balance load and recover from failures transparently. (emphasis added)
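If you want the flavor of the virtual actor idea without the .NET tooling, here is a minimal Python sketch of my own (not Orleans’ actual API): actors are never explicitly created or destroyed, and a “crashed” activation is transparently re-created by the next message.

```python
class Greeter:
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.greetings = 0

    def receive(self, message):
        self.greetings += 1
        return f"{self.actor_id} got {message!r} (#{self.greetings})"

class VirtualActorRuntime:
    def __init__(self, actor_class):
        self.actor_class = actor_class
        self.instances = {}  # in-memory activations only

    def send(self, actor_id, message):
        # Activation on demand: no explicit create or destroy calls.
        if actor_id not in self.instances:
            self.instances[actor_id] = self.actor_class(actor_id)
        return self.instances[actor_id].receive(message)

    def crash(self, actor_id):
        # Simulate a server failure; the next send re-activates the actor.
        self.instances.pop(actor_id, None)

runtime = VirtualActorRuntime(Greeter)
print(runtime.send("alice", "hello"))  # activated on first message
runtime.crash("alice")
print(runtime.send("alice", "hi"))     # transparently re-activated
```

(In Orleans the re-activated actor can also reload persisted state; this sketch keeps everything in memory.)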

Not in a distributed computing context, but the “look and it’s there” model is something I recall from HyTime. So nice to see good ideas resurface!

Just imagine doing that with topic maps, including having properties of a topic, should you choose to look for them. If you don’t need a topic, why carry the overhead around? Wait for someone to ask for it.

This week alone, Microsoft continued its fight for users and announced an open source project that will make me at least read about .Net. ;-) I think Microsoft merits a lot of kudos and good wishes for the holiday season!

I first saw this at: Microsoft open sources cloud framework that powers Halo by Jonathan Vanian.

December 14, 2014

Instant Hosting of Open Source Projects with GitHub-style Ribbons

Filed under: Open Source,OpenShift — Patrick Durusau @ 5:15 pm

Instant Hosting of Open Source Projects with GitHub-style Ribbons by Ryan Jarvinen.

From the post:

In this post I’ll show you how to create your own GitHub-style ribbons for launching open source projects on OpenShift.

The popular “Fork me on GitHub” ribbons provide a great way to raise awareness for your favorite open source projects. Now, the same technique can be used to instantly launch clones of your application, helping to rapidly grow your community!

Take advantage of [the following link is broken as of 12/14/2014] OpenShift’s web-based app creation workflow – streamlining installation, hosting, and management of instances – by crafting a workflow URL that contains information about your project.
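The gist, as I read it: encode your project’s details as query parameters on OpenShift’s app-creation URL and point a ribbon at it. A minimal Python sketch, with the caveat that the base URL and parameter names here are my illustrative assumptions, not OpenShift’s documented API; see Ryan’s post for the real workflow URL format.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names, for illustration only.
base = "https://openshift.redhat.com/app/console/application_types/custom"
params = {
    "name": "myproject",
    "initial_git_url": "https://github.com/example/myproject.git",
    "cartridges[]": "python-2.7",
}
print(base + "?" + urlencode(params))  # target for your GitHub-style ribbon
```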

I thought this could be useful in the not too distant future.

Better to blog about it here than to search for it in the nightmare of my bookmarks. 😉

December 10, 2014

Yet More “Hive” Confusion

Filed under: Crowd Sourcing,Open Source,Semantic Diversity — Patrick Durusau @ 4:28 pm

The New York Times R&D Lab releases Hive, an open-source crowdsourcing tool by Justin Ellis.

From the post:

A few months ago we told you about a new tool from The New York Times that allowed readers to help identify ads inside the paper’s massive archive. Madison, as it was called, was the first iteration on a new crowdsourcing tool from The New York Times R&D Lab that would make it easier to break down specific tasks and get users to help an organization get at the data they need.

Today the R&D Lab is opening up the platform that powers the whole thing. Hive is an open-source framework that lets anyone build their own crowdsourcing project. The code responsible for Hive is now available on GitHub. With Hive, a developer can create assignments for users, define what they need to do, and keep track of their progress in helping to solve problems.

Not all that long ago, in mid-to-late October 2014, I penned Avoiding “Hive” Confusion, which pointed out the possible confusion between Apache Hive and the High-performance Integrated Virtual Environment (HIVE). Now, barely two months later, we have another “Hive” in the information technology field.

I have no idea how many “hives” there are inside or outside of IT but as of today, I can name at least three (3).

Have you ever thought that semantic confusion is part and parcel of the human condition? Can be allowed for, can be compensated for, but can never be eliminated.

November 25, 2014

NSA partners with Apache to release open-source data traffic program

Filed under: NSA,Open Source — Patrick Durusau @ 9:01 pm

NSA partners with Apache to release open-source data traffic program by Steven J. Vaughan-Nichols.

From the post:

Many of you probably think that the National Security Agency (NSA) and open-source software get along like a house on fire. That's to say, flaming destruction. You would be wrong.

[image and link omitted]

In partnership with the Apache Software Foundation, the NSA announced on Tuesday that it is releasing the source code for Niagarafiles (Nifi). The spy agency said that Nifi "automates data flows among multiple computer networks, even when data formats and protocols differ".

Details on how Nifi does this are scant at this point, while the ASF continues to set up the site where Nifi's code will reside.

In a statement, Nifi's lead developer Joseph L Witt said the software "provides a way to prioritize data flows more effectively and get rid of artificial delays in identifying and transmitting critical information".

I don’t doubt the NSA’s efforts at open source software. That isn’t saying anything about how closely the code would need to be proofed.

Perhaps encouraging more open source projects from the NSA will eat into the time they have to spend writing malware. 😉

Something to look forward to!

November 13, 2014

Wintel and Open Source

Filed under: C/C++,Julia,Open Source — Patrick Durusau @ 6:44 pm

The software world is reverberating with the news that Microsoft is in the process of making .NET completely open source.

On the same day, Intel announced that it had released “Julia2C, a source-to-source translator from Julia to C.”

Hmmm, is this evidence that open source is a viable path for commercial vendors? 😉

Next Question: How long before non-open source code becomes a liability? As in a nesting place for government surveillance/malware.

Speculation: Not as long as it took Wintel to move towards open source.

Consumers should demand open source code as a condition for purchase. All software, all the time.

The Battleship Moves

Filed under: Microsoft,Open Source — Patrick Durusau @ 2:45 pm

A milestone moment for Microsoft: .NET is now an open-source project by Jonathan Vanian.

During the acrimonious debate about OOXML, a friend said that Microsoft was like a very large battleship: it could turn, but movement wasn’t ever sudden.

From what I read in Jonathan’s post, MS is in the process of making yet another turn, this time to make .NET an open source project.

A move that gives credence to the proposition that being open source isn’t inconsistent with being a commercial enterprise and a profitable one.

Just as important, commercial open source software is a bulwark against government surveillance. Consumers will have the choice of buying binary, possibly surveillance-infected software, or using open source and the services of traditional vendors such as MS, IBM, HP, etc. to compile specific software packages for their use.

Opening up such a large package isn’t an overnight lark so I encourage everyone to be patient as MS eases .NET into the waters of open source. Continued good experiences with an open source .NET will further the open source agenda at Microsoft.

The more open source software in use, the fewer dark places for government surveillance to hide.

“Fewer dark places for government surveillance to hide.” Yet another benefit of open source software!

October 6, 2014

Bossies 2014: The Best of Open Source Software Awards

Filed under: Open Source,Software — Patrick Durusau @ 4:30 pm

Bossies 2014: The Best of Open Source Software Awards by Doug Dineley.

From the post:

If you hadn’t noticed, we’re in the midst of an incredible boom in enterprise technology development — and open source is leading it. You’re unlikely to find better proof of that dynamism than this year’s Best of Open Source Software Awards, affectionately known as the Bossies.

Have a look for yourself. The result of months of exploration and evaluation, plus the recommendations of many expert contributors, the 2014 Bossies cover more than 130 award winners in six categories:

(emphasis added)

Hard to judge the count because winners are presented one page at a time in each category. Not to mention that at least one winner appears in two separate categories.

Put into lists and sorted for review we find:

Open source applications (16)

Open source application development tools (42)

Open source big data tools (20)

Open source desktop and mobile software (14)

Open source data center and cloud software (19)

Open source networking and security software (9)

Creating the list presentation allows us to discover that the actual count, allowing for entries that mention more than one software package, is 122 software packages.

BTW, Docker appears under application development tools and under data center and cloud software. Which should make the final count 121 different software packages. (You will have to check the entries at InfoWorld to verify that number.)
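For the curious, the arithmetic in miniature (Python; counts from the lists above):

```python
counts = [16, 42, 20, 14, 19, 9]  # per-category totals listed above
print(sum(counts))  # 120 list entries; 122 packages once entries naming
                    # more than one package are expanded, and 121 distinct
                    # after removing Docker's repeat appearance
```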

PS: The original presentation was in no discernible order. I put the lists into alphabetical order for ease of finding.

September 20, 2014

ApacheCon EU 2014

Filed under: Conferences,Open Source — Patrick Durusau @ 7:36 pm

ApacheCon EU 2014

ApacheCon Europe 2014 – November 17-21 in Budapest, Hungary.

November is going to be here sooner than you think. You need to register now and start making travel arrangements.

A quick scroll down the schedule page will give you an idea of the breadth of the Apache Foundation activities.

September 13, 2014

Open AI Resources

Filed under: Artificial Intelligence,Open Source — Patrick Durusau @ 10:54 am

Open AI Resources

From the about page:

We all go further when we all work together. That’s the promise of Open AIR, an open source collaboration hub for AI researchers. With the decline of university- and government-sponsored research and the rise of large search and social media companies’ insistence on proprietary software, the field is quickly privatizing. Open AIR is the antidote: it’s important for leading scientists and researchers to keep our AI research out in the open, shareable, and extensible by the community. Join us in our goal to keep the field moving forward, together, openly.

An impressive collection of open source AI software and data.

The site organizes the resources into categories.

A number of the major players in AI research are part of this project, which bodes well for it being maintained into the future.

If you create or encounter any open AI resources not listed at Open AI Resources, please Submit a Resource.

I first saw this in a tweet by Ana-Maria Popescu.

August 15, 2014

XPERT (Xerte Public E-learning ReposiTory)

Filed under: Education,Open Source — Patrick Durusau @ 12:43 pm

XPERT (Xerte Public E-learning ReposiTory)

From the about page:

XPERT (Xerte Public E-learning ReposiTory) project is a JISC funded rapid innovation project (summer 2009) to explore the potential of delivering and supporting a distributed repository of e-learning resources created and seamlessly published through the open source e-learning development tool called Xerte Online Toolkits. The aim of XPERT is to progress the vision of a distributed architecture of e-learning resources for sharing and re-use.

Learners and educators can use XPERT to search a growing database of open learning resources suitable for students at all levels of study in a wide range of different subjects.

Creators of learning resources can also contribute to XPERT via RSS feeds created seamlessly through local installations of Xerte Online Toolkits. Xpert has been fully integrated into Xerte Online Toolkits, an open source content authoring tool from The University of Nottingham.

Other useful links:

Xerte Project Toolkits

Xerte Community.

You may want to start with the browse option because the main interface is rather stark.

The Google interface is “stark” in the same sense, but Google has indexed a substantial portion of all online content, so I’m not very likely to draw a blank. With XPERT’s base of 364,979 resources, the odds of my drawing a blank are far higher.

The keywords appear in three distinct alphabetical segments: each begins with a digit or “a,” runs to its end, and then another segment follows, one after the other. Hebrew and what appears to be Chinese appear at the end of the keyword list, in no particular order. I don’t know if that is an artifact of the software or of its use.

The same repeated alphabetical segments occur under Author. Under Type there are some true types, such as “color print,” but the majority of the listing is file sizes in bytes. Not sure why file size would be a “type.” Institution has similar issues.

If you are looking for a volunteer opportunity, helping XPERT with alphabetization would enhance the browsing experience for the resources it has collected.

I first saw this in a tweet by Graham Steel.

May 29, 2014

Open-Source Intelligence

Filed under: Intelligence,Open Source — Patrick Durusau @ 7:10 pm

Big data brings new power to open-source intelligence by Matthew Moran.

From the post:

In November 2013, the New Yorker published a profile of Eliot Higgins – or Brown Moses as he is known to almost 17,000 Twitter followers. An unemployed finance and admin worker at the time, Higgins was held up as an example of what can happen when we take advantage of the enormous amount of information being spread across the internet every day. The New Yorker’s eight-page spread described Higgins as “perhaps the foremost expert on the munitions used in the [Syrian] war”, a remarkable description for someone with no formal training in munitions or intelligence.

Higgins does not speak Arabic and has never been to the Middle East. He operates from his home in Leicester and, until recently, conducted his online investigations as an unpaid hobby. Yet the description was well-founded. Since starting his blog in 2012, Higgins has uncovered evidence of the Syrian army’s use of cluster bombs and exposed the transfer of weapons from Iran to Syria. And he has done it armed with nothing more than a laptop and an eye for detail.

This type of work is a form of open-source intelligence. Higgins exploits publicly accessible material such as online photos, video and social media updates to piece together information about the Syrian conflict. His analyses have formed the basis of reports in The Guardian and a blog for The New York Times, while his research has been cited by Human Rights Watch.

Matthew makes a compelling case for open-source intelligence, using Eliot Higgins as an example.

No guarantees of riches or fame but data is out there to be mined and curated.

All that is required is for you to find it, package it and find the right audience and/or buyer.

No small order but what else are you doing this coming weekend? 😉

PS: Where would you place requests for intelligence or offer intelligence for sale? Just curious.

May 21, 2014

Govcode

Filed under: Open Source,Programming — Patrick Durusau @ 8:11 pm

Govcode: Government Open Source Projects

This is a handy collection of government projects from GitHub.

Weren’t we just talking about a corpus of software earlier today? Are you thinking a corpus of government open source projects would give you insight into their non-open source code?

As with handwriting, don’t programmers code the same way for open source as for closed source software?

Interesting.

Tie people to projects, code, agencies, and magic happens.

March 23, 2014

XDATA – DARPA

Filed under: DARPA,Government Data,Open Source — Patrick Durusau @ 7:00 pm

XDATA – DARPA

From the about page:

The DARPA Open Catalog is a list of DARPA-sponsored open source software products and related publications. Each resource link shown on this site links back to information about each project, including links to the code repository and software license information.

This site reorganizes the resources of the Open Catalog (specifically the XDATA program) in a way that is easily sortable based on language, project or team. More information about XDATA’s open source software toolkits and peer-reviewed publications can be found on the DARPA Open Catalog, located at http://www.darpa.mil/OpenCatalog/.

For more information about this site, e-mail us at piim@newschool.edu.

A great public service for anyone interested in DARPA XDATA projects.

You could view this as encouragement to donate time to government hackathons.

I disagree.

Donating services to an organization that pays for IT and then accepts crap results, encourages poor IT management.

March 10, 2014

Open Source: Option of the Security Conscious

Filed under: Cybersecurity,Linux OS,Open Source,Security — Patrick Durusau @ 10:00 am

International Space Station attacked by ‘virus epidemics’ by Samuel Gibbs.

From the post:

Malware made its way aboard the International Space Station (ISS) causing “virus epidemics” in space, according to security expert Eugene Kaspersky.

Kaspersky, head of security firm Kaspersky labs, revealed at the Canberra Press Club 2013 in Australia that before the ISS switched from Windows XP to Linux computers, Russian cosmonauts managed to carry infected USB storage devices aboard the station spreading computer viruses to the connected computers.

…..

In May, the United Space Alliance, which oversees the running of the ISS in orbit, migrated all the computer systems related to the ISS over to Linux for security, stability and reliability reasons.

If you or your company are at all concerned with security issues, open source software is the only realistic option.

Not because open source software in fact has fewer bugs on release, but because there is the potential for a large community of users to seek those bugs out and fix them.

The recent Apple “goto fail” farce would not have happened in an open source product. Some tester, intentionally or accidentally, would have used invalid credentials and the problem would have surfaced.
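To see why a single curious tester would have caught it, here is a hedged Python stand-in for that class of bug (the real flaw was a duplicated goto fail; line in Apple’s C code, not this function):

```python
# The duplicated "goto fail;" made the final signature check unreachable.
# This stand-in has the analogous flaw: the signature result is computed
# but never consulted.
def verify(hash_ok, signature_ok):
    err = 0
    if not hash_ok:
        err = 1
    sig_err = 0 if signature_ok else 1  # computed...
    return err == 0                     # ...but never checked

print(verify(True, True))   # True  -- looks fine on the happy path
print(verify(True, False))  # True! -- a bad signature verifies; one run
                            # with invalid credentials exposes the bug
```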

If we are lucky, Apple had one tester who was also tasked with other duties and so we got what Apple chose to pay for.

This is not a knock against software companies that sell software for a profit. Rather it is a challenge to the current marketing of software for a profit.

Imagine that MS SQL Server were open source but commercial software. That is, the source code is freely available but the licensing prohibits its use for commercial resale.

Do you really think that banks, insurance companies, enterprises are going to be grabbing source code and compiling it to avoid license fees?

I admit to having a low opinion of the morality of banks, insurance companies, etc., but they also have finely tuned senses of risk. Might save a few bucks in the short run, but the consequences of getting caught are quite severe.

So there would be lots of hobbyists hacking on, trying to improve, etc. MS SQL Server source code.

You know that hackers can no more keep a secret than a member of Congress, albeit hackers don’t usually blurt out secrets on the evening news. Every bug, improvement, etc. would become public knowledge fairly quickly.

MS could even make contributing bug reports and fixes a condition of the open source download.

MS could continue to sell MS SQL Server as commercial software, just as it did before making it open source.

The difference would be instead of N programmers working to find and fix bugs, there would be N + Internet community working to find and fix bugs.

The other difference being that the security conscious in military, national security, and government organizations would not have to be planning migrations away from closed source software.

Post-Snowden, open source software is the only viable security option.

PS: Yes, I have seen the “we are not betraying you now” and/or “we betray you only when required by law to do so” statements from various vendors.

I much prefer to not be betrayed at all.

You?

PPS: There is another advantage to vendors from an all open source policy on software. Vendors worry about others copying their code, etc. With open source that should be easy enough to monitor and prove.

February 4, 2014

DARPA Open Catalog

Filed under: Open Source,Programming — Patrick Durusau @ 2:22 pm

DARPA Open Catalog

From the webpage:

Welcome to the DARPA Open Catalog, which contains a curated list of DARPA-sponsored software and peer-reviewed publications. DARPA funds fundamental and applied research in a variety of areas including data science, cyber, anomaly detection, etc., which may lead to experimental results and reusable technology designed to benefit multiple government domains.

The DARPA Open Catalog organizes publicly releasable material from DARPA programs, beginning with the XDATA program in the Information Innovation Office (I2O). XDATA is developing an open source software library for big data. DARPA has an open source strategy through XDATA and other I2O programs to help increase the impact of government investments.

DARPA is interested in building communities around government-funded software and research. If the R&D community shows sufficient interest, DARPA will continue to make available information generated by DARPA programs, including software, publications, data and experimental results. Future updates are scheduled to include components from other I2O programs such as Broad Operational Language Translation (BOLT) and Visual Media Reasoning (VMR).

I don’t know if I would use binaries from DARPA but with open source you get to choose your own comfort level. 😉

Maybe I should ask:

How does it feel for DARPA to be more open source than your favorite vendor?

What do you think? More backdoors in open source* or binary software?

I first saw this in a tweet by Tim O’Reilly.

* Remember that open source doesn’t mean non-commercial. You can always copyright open source code. Copyright has protected Mickey Mouse longer than binary has protected COBOL programs.

Besides, open source with copyright makes it easier for you to search for infringing code, doesn’t it? It enables you to ask what your competitors must be hiding. Yes?

January 20, 2014

OpenAIRE Legal Study has been published

Filed under: Law,Licensing,Open Access,Open Data,Open Source — Patrick Durusau @ 2:14 pm

OpenAIRE Legal Study has been published

From the post:

Guibault, Lucie; Wiebe, Andreas (Eds) (2013) Safe to be Open: Study on the protection of research data and recommendation for access and usage. The full text of the book is available (PDF, ca. 2 MB) under the CC BY 4.0 license. Published by University of Göttingen Press (copies can be ordered from the publisher’s website).

Any e-infrastructure which primarily relies on harvesting external data sources (e.g. repositories) needs to be fully aware of any legal implications for re-use of this knowledge, and further application by 3rd parties. OpenAIRE’s legal study will put forward recommendations as to applicable licenses that appropriately address scientific data in the context of OpenAIRE.

CAUTION: Safe to be Open is an EU-centric publication and, while very useful in copyright discussions elsewhere, should not be relied upon as legal advice. (That’s not an opinion about relying on it in the EU. Ask local counsel for that advice.)

I say that having witnessed too many licensing discussions that were uninformed by legal counsel. Entertaining to be sure but if I have a copyright question, I will be posing it to counsel who is being paid to be correct.

At least until ignorance of the law becomes an affirmative shield against liability for copyright infringement. 😉

To be sure, I recommend reading Safe to be Open as a means to become informed about the contours of access and usage of research data in the EU. And possibly as a model for solutions in legal systems that lag behind the EU in that regard.

Personally I favor Attribution CC BY because the other CC licenses presume the licensed material was created without unacknowledged/uncompensated contributions from others.

Think of all the people who taught you to read, write, program and all the people whose work you have read, been influenced by, etc. Hopefully you can add to the sum of communal knowledge but it is unfair to claim ownership of the whole of communal knowledge simply because you contributed a small part. (That’s not legal advice either, just my personal opinion.)

Without all the instrument makers, composers, singers, organists, etc. that came before him, Mozart would not be the same Mozart that we remember. Just as gifted, but without a context to display his gifts.

Patent and copyright need to be recognized as “thumbs on the scale” against development of services and knowledge. That’s where I would start a discussion of copyright and patents.

December 6, 2013

Glitch is Dead, Long Live Glitch!

Filed under: Graphics,Open Source — Patrick Durusau @ 6:58 pm

Glitch is Dead, Long Live Glitch!: Art & Code from the Game Released into Public Domain by Tiny Speck.

From the website:

The collaborative, web-based, massively multiplayer game Glitch began its initial private testing in 2009, opened to the public in 2010, and was shut down in 2012. It was played by more than 150,000 people and was widely hailed for its original and highly creative visual style.

The entire library of art assets from the game has been made freely available, dedicated to the public domain. Code from the game client is included to help developers work with the assets. All of it can be downloaded and used by anyone, for any purpose. (But: use it for good.)

Tiny Speck, Inc., the game’s developer, has relinquished its ownership of copyright over these 10,000+ assets in the hopes that they help others in their creative endeavours and build on Glitch’s legacy of simple fun, creativity and an appreciation for the preposterous. Go and make beautiful things.

I never played Glitch but the art could be useful.

Or perhaps even the online game code if you are looking to create a topic map gaming site.

Read the release for the details of the licensing.

I first saw this in Nat Torkington’s Four short links: 22 November 2013.

November 19, 2013

Mortar’s Open Source Community

Filed under: BigData,Ethics,Mortar,Open Source — Patrick Durusau @ 8:28 pm

Building Mortar’s Open Source Community: Announcing Public Plans by K. Young.

From the post:

We’re big fans of GitHub. There are a lot of things to like about the company and the fantastic service they’ve built. However, one of the things we’ve come to admire most about GitHub is their pricing model.

If you’re giving back to the community by making your work public, you can use GitHub for free. It’s a great approach that drives tremendous benefits to the GitHub community.

Starting today, Mortar is following GitHub’s lead in supporting those who contribute to the data science community.

If you’re improving the data science community by allowing your Mortar projects to be seen and forked by the public, we will support you by providing free access to our complete platform (including unlimited development time, up to 25 public projects, and email support). In short, you’ll pay nothing beyond Amazon Web Services’ standard Elastic MapReduce fees if you decide to run a job.

A good illustration of the difference between talking about ethics (Ethics of Big Data?) and acting ethically.

Acting ethically benefits the community.

Government grants to discuss ethics, well, you know who benefits from that.

September 22, 2013

ZFS disciples form one true open source database

Filed under: Files,Open Source,Storage — Patrick Durusau @ 10:46 am

ZFS disciples form one true open source database by Lucy Carey.

From the post:

The acronym ‘ZFS’ may no longer actually stand for anything, but the “world’s most advanced file system” is in no way redundant. Yesterday, it emerged that corporate advocates of the Sun Microsystems file system and logical volume manager have joined together to offer a new “truly open source” incarnation of the file system, called, fittingly enough, OpenZFS.

Along with the launch of the open-zfs.org website – which is, incidentally, a domain owned by ZFS co-founder Matt Ahrens – the group of ZFS lovers, which includes developers from the illumos, FreeBSD, Linux, and OS X platforms, as well as an assortment of other parties who are building products on top of OpenZFS, have set out a clear set of objectives.

Speaking of scaling, Wikipedia reports:

A ZFS file system can store up to 256 quadrillion Zebibytes (ZiB).

Just in case anyone mentions scalable storage as an issue. 😉
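If you want to sanity-check the scale yourself (my arithmetic, not the article’s): ZFS is a 128-bit file system, so the theoretical ceiling is 2^128 bytes.

```python
ZiB = 2 ** 70           # one zebibyte, in bytes
ceiling = 2 ** 128      # a 128-bit file system's addressing limit, in bytes
print(ceiling // ZiB)   # 288230376151711744 -- a few hundred quadrillion
                        # ZiB, the same order as Wikipedia's figure
```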

September 6, 2013

Open Source = See Where It Keeps Its Brain

Filed under: Cybersecurity,NSA,Open Source,Security — Patrick Durusau @ 5:52 pm

A recent article by James Ball, Julian Borger and Glenn Greenwald, How US and UK spy agencies defeat internet privacy and security, confirms that non-open source software is dangerous to your privacy, business, military and government (if you are not the U.S.).

One brief quote from an article you need to digest in full:

Funding for the program – $254.9m for this year – dwarfs that of the Prism program, which operates at a cost of $20m a year, according to previous NSA documents. Since 2011, the total spending on Sigint enabling has topped $800m. The program “actively engages US and foreign IT industries to covertly influence and/or overtly leverage their commercial products’ designs”, the document states. None of the companies involved in such partnerships are named; these details are guarded by still higher levels of classification.

Among other things, the program is designed to “insert vulnerabilities into commercial encryption systems”. These would be known to the NSA, but to no one else, including ordinary customers, who are tellingly referred to in the document as “adversaries”.

No names, but it isn’t hard to guess whose software products have backdoors.

How to know if your system is vulnerable to the U.S. government?

Find the Gartner Report that includes your current office suite or other software.

Compare the names in the Gartner report to your non-open source software. Read ’em and weep.

How to stop being vulnerable to the U.S. government?

A bit harder but doable.

Support the Apache Software Foundation and other open source software projects.

As Ginny Weasley finds in the Harry Potter series, it’s important to know where magical objects keep their brains.

The same is true for software. Just because you can’t see into it doesn’t mean it can’t see you. It may be spying on you.

Open software is far less likely to spy on you. Why? Because the backdoor or security compromise would be visible to anyone. Including people who would blow the whistle.

OpenOffice or other open source software not meeting your needs?

For OpenDocument Format (ODF) (used by numerous open source software projects), post your needs: office-comment-subscribe@lists.oasis-open.org (subscription link).

Support the open source project of your choice.

Or not, if you like being spied on by software you paid for.

August 13, 2013

Source Code Search Engines [DIY Drones]

Filed under: Open Source,Programming,Software — Patrick Durusau @ 4:04 pm

Open Source Matters: 6 Source Code Search Engines You Can Use For Programming Projects by Saikat Basu.

From the post:

The Open source movement is playing a remarkable role in pushing technology and making it available to all. The success of Linux is also an example of how open source can translate into a successful business model. Open source is pretty much mainstream now and in the coming years, it could have a major footprint across cutting edge educational technology and aerospace (think DIY drones).

Open source projects need all the help they can get. If not with funding, then with volunteers contributing to open source programming and free tools they can brandish. Search engines tuned with algorithms to find source code for programming projects are among the tools for the kit bag. While reusing code is a much debated topic in higher circles, they could be of help to beginner programmers and those trying to work their way through a coding logjam by cross-referencing their code. Here are six:

I don’t think any of these search engines will show up in comScore results. 😉

But they are search engines for a particular niche. And so free to optimize for their expected content, rather than trying to search everything. (Is there a lesson there?)

Which ones do you like best?

PS: On DIY drones, see: DIY DRONES – The Leading Community for Personal UAVs.

You may recall Abbie Hoffman saying in Steal this Book:

If you are around a military base, you will find it relatively easy to get your hands on an M-79 grenade launcher, which is like a giant shotgun and is probably the best self-defense weapon of all time. Just inquire discreetly among some long-haired soldiers.

Will DIY drones replace the M-79 grenade launcher as the “best” self-defense weapon?

June 22, 2013

13 Things People Hate about Your Open Source Docs [+ One More]

Filed under: Documentation,Open Source — Patrick Durusau @ 4:28 pm

13 Things People Hate about Your Open Source Docs by Andy Lester.

From the post:

Most open source developers like to think about the quality of the software they build, but the quality of the documentation is often forgotten. Nobody talks about how great a project’s docs are, and yet documentation has a direct impact on your project’s success. Without good documentation, people either do not use your project, or they do not enjoy using it. Happy users are the ones who spread the news about your project – which they do only after they understand how it works, which they learn from the software’s documentation.

Yet, too many open source projects have disappointing documentation. And it can be disappointing in several ways.

The examples I give below are hardly authoritative, and I don’t mean to pick on any particular project. They’re only those that I’ve used recently, and not meant to be exemplars of awfulness. Every project has committed at least a few of these sins. See how many your favorite software is guilty of (whether you are user or developer), and how many you personally can help fix.

Andy’s list:

  1. Lacking a good README or introduction
  2. Docs not available online
  3. Docs only available online
  4. Docs not installed with the package
  5. Lack of screenshots
  6. Lack of realistic examples
  7. Inadequate links and references
  8. Forgetting the new user
  9. Not listening to the users
  10. Not accepting user input
  11. No way to see what the software does without installing it
  12. Relying on technology to do your writing
  13. Arrogance and hostility toward the user

See Andy’s post for the details on his points and the comments that follow.

I do think Andy missed one point:

14. Commercial entity open sources a product, machine generates documentation, expects users to contribute patches to the documentation for free.

What seems odd about that to you?

Developers get paid to produce poor documentation, and their response to user comments on that documentation is that the “community” should fix it for free.

At least in a true open source project, everyone is contributing and can use the (hopefully) great results equally.

Not so with a “well…, for that you would need commercial license X” type of project.

I first saw this in a tweet by Alexandre.

June 8, 2013

Big Data Open Source Tools

Filed under: BigData,Open Source — Patrick Durusau @ 9:33 am

Open Source Tools

If you are looking for a big data open source project or just want to illustrate the depth of open source software for big data, this is the graphic for you!

Categories:

  • Big Data Search
  • Business Intelligence
  • Data aggregation
  • Data Analysis & Platforms
  • Databases / Data warehousing
  • Data Mining
  • Document Store
  • Graphs
  • Grid Solutions
  • KeyValue
  • Multidimensional
  • Multimodel
  • Multivalue database
  • Object databases
  • Operational
  • Social
  • XML Databases

The basis for a trivia game at a conference? Moderator pulls name of a software project out of the hat and you have ten seconds to name three technical facts about the software?

Could be really amusing. Not quite the Newlywed Game, but still amusing.

You can download it as a PDF.

April 16, 2013

Anniversary! Microsoft Open Technologies, Inc. (MS Open Tech)

Filed under: Microsoft,Open Source — Patrick Durusau @ 6:58 pm

You’re invited to help us celebrate an unlikely pairing in open source by Gianugo Rabellino.

From the post:

We are just days away from reaching a significant milestone for our team and the open source and open standards communities: the first anniversary of Microsoft Open Technologies, Inc. (MS Open Tech) — a wholly owned subsidiary of Microsoft.

We can’t think of anyone better to celebrate with than YOU, the members of the open source and open standards community and technology industry who have helped us along on our adventure over the past year.

We’d like to extend an open (pun intended!) invitation to celebrate with us on April 25, and share your burning questions on the future of the subsidiary, open source at-large and how MS Open Tech can better connect with the developer community to present even more choice and freedom.

I’ll be proud to share the stage with our amazing MS Open Tech leadership team: Jean Paoli, President; Kamaljit Bath, Engineering team leader; and Paul Cotton, Standards team leader and Co-Chair of the W3C HTML Working Group.

You have three choices:

  1. You can be a hard ass and stay home to “punish” MS for real and imagined slights and sins over the years. (You won’t be missed.)
  2. You can be obnoxious and attend, doing your best to not have a good time and trying to keep others from having a good time. (Better to stay home.)
  3. You can attend, have a good time, ask good questions, encourage more innovation and support by Microsoft for the open source and open standards communities.

Microsoft is going to be a major player in whatever solution to semantic interoperability catches on.

If that is topic maps, then Microsoft will be into topic maps.

I would prefer that be under the open source/open standards banner.

Distance prevents me from attending but I will be there in spirit!

Happy Anniversary to Microsoft Open Technologies, Inc.!

March 7, 2013

Open Source for Cybersecurity?

Filed under: Cybersecurity,Open Source,Security — Patrick Durusau @ 5:15 pm

A couple of weeks ago I posted: Crowdsourcing Cybersecurity: A Proposal (Part 1) and Crowdsourcing Cybersecurity: A Proposal (Part 2), concluding that publicity (not secrecy) about security flaws would enhance cybersecurity.

Then this week I read:

A classic open source koan is that “with many eyes, all bugs become shallow.” In IT security, is it that with many eyes, all worms become shallow?

Burton: What the Department of Defense said was if someone has malicious intent and the code isn’t available, they’ll have some way of getting the code. But if it is available and everyone has access to it, then any vulnerabilities that are there are much more likely to be corrected than before they’re exploited.

(From Alex Howard’s interview of CFPB (Consumer Financial Protection Bureau) CIO Chris Willey (@ChrisWilleyDC) and acting deputy CIO Matthew Burton (@MatthewBurton), reported in Open source is interoperable with smarter government at the CFPB.)

If the “white hats” aren’t going to recognize the benefits of crowdsourcing cybersecurity, perhaps it is time for the “black hats” to take up the mantle of crowdsourcing.

Perhaps that will force the “white hats” to adopt better security measures than “security by secrecy.”

Public mappings of security flaws anyone?


Update: DARPA to Turn Off Funding for Hackers Pursuing Cybersecurity Research

The Pentagon is scuttling a program that awards grants to reformed hackers and security professionals for short-term research with game-changing potential, according to cybersecurity firm Kaspersky Lab.

That’s the ticket. If we don’t know it, it must not be known.

March 3, 2013

Liferay / Marketplace

Filed under: Enterprise Integration,Open Source,Software — Patrick Durusau @ 2:14 pm

Liferay. Enterprise. Open Source. For Life.

Enterprise.

Liferay, Inc. was founded in 2004 in response to growing demand for Liferay Portal, the market’s leading independent portal product that was garnering industry acclaim and adoption across the world. Today, Liferay, Inc. houses a professional services group that provides training, consulting and enterprise support services to our clientele in the Americas, EMEA, and Asia Pacific. It also houses a core development team that steers product development.

Open Source.

Liferay Portal was, in fact, created in 2000 and boasts a rich open source heritage that offers organizations a level of innovation and flexibility unrivaled in the industry. Thanks to a decade of ongoing collaboration with its active and mature open source community, Liferay’s product development is the result of direct input from users with representation from all industries and organizational roles. It is for this reason that organizations turn to Liferay technology for exceptional user experience, UI, and both technological and business flexibility.

For Life.

Liferay, Inc. was founded for a purpose greater than revenue and profit growth. Each quarter we donate to a number of worthy causes decided upon by our own employees. In the past we have made financial contributions toward AIDS relief and the Sudan refugee crisis through well-respected organizations such as Samaritan’s Purse and World Vision. This desire to impact the world community is the heart of our company, and ultimately the reason why we exist.

The Liferay Marketplace may be of interest for open source topic map projects.

There are only a few mentions of topic maps in the mailing list archives and none of those are recent.

Could be time to rekindle that conversation.

I first saw this at: Beyond Search.
