Data Fusion « Another Word For It

Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 27, 2016

Reboot Your $100+ Million F-35 Stealth Jet Every 10 Hours Instead of 4 (TM Fusion)

Filed under: Cybersecurity,Data Fusion,Data Integration,Data Science,Topic Maps — Patrick Durusau @ 10:07 am

Pentagon identifies cause of F-35 radar software issue

From the post:

The Pentagon has found the root cause of stability issues with the radar software being tested for the F-35 stealth fighter jet made by Lockheed Martin Corp, U.S. Defense Acquisition Chief Frank Kendall told a congressional hearing on Tuesday.

Last month the Pentagon said the software instability issue meant the sensors had to be restarted once every four hours of flying.

Kendall and Air Force Lieutenant General Christopher Bogdan, the program executive officer for the F-35, told a Senate Armed Service Committee hearing in written testimony that the cause of the problem was the timing of “software messages from the sensors to the main F-35” computer. They added that stability issues had improved to where the sensors only needed to be restarted after more than 10 hours.

“We are cautiously optimistic that these fixes will resolve the current stability problems, but are waiting to see how the software performs in an operational test environment,” the officials said in a written statement.
… (emphasis added)

At $100+ Million plane that requires rebooting every ten hours? I’m not a pilot but that sounds like a real weakness.

The precise nature of the software glitch isn’t described but you can guess one of the problems from Lockheed Martin’s, Software You Wish You Had: Inside the F-35 Supercomputer:

…
The human brain relies on five senses—sight, smell, taste, touch and hearing—to provide the information it needs to analyze and understand the surrounding environment.

Similarly, the F-35 relies on five types of sensors: Electronic Warfare (EW), Radar, Communication, Navigation and Identification (CNI), Electro-Optical Targeting System (EOTS) and the Distributed Aperture System (DAS). The F-35 “brain”—the process that combines this stellar amount of information into an integrated picture of the environment—is known as sensor fusion.

At any given moment, fusion processes large amounts of data from sensors around the aircraft—plus additional information from datalinks with other in-air F-35s—and combines them into a centralized view of activity in the jet’s environment, displayed to the pilot.

In everyday life, you can imagine how useful this software might be—like going out for a jog in your neighborhood and picking up on real-time information about obstacles that lie ahead, changes in traffic patterns that may affect your route, and whether or not you are likely to pass by a friend near the local park.

F-35 fusion not only combines data, but figures out what additional information is needed and automatically tasks sensors to gather it—without the pilot ever having to ask.
… (emphasis added)

The fusion of data from other in-air F-35s is a classic topic map merging of data problem.

You have one subject, say an anti-aircraft missile site, seen from up to four (in the F-35 specs) F-35s. As is the habit of most physical objects, it has only one geographic location but the fusion computer for the F-35 doesn’t come up with than answer.

Kris Osborn writes in Software Glitch Causes F-35 to Incorrectly Detect Targets in Formation:

…
“When you have two, three or four F-35s looking at the same threat, they don’t all see it exactly the same because of the angles that they are looking at and what their sensors pick up,” Bogdan told reporters Tuesday. “When there is a slight difference in what those four airplanes might be seeing, the fusion model can’t decide if it’s one threat or more than one threat. If two airplanes are looking at the same thing, they see it slightly differently because of the physics of it.”

For example, if a group of F-35s detect a single ground threat such as anti-aircraft weaponry, the sensors on the planes may have trouble distinguishing whether it was an isolated threat or several objects, Bogdan explained.

As a result, F-35 engineers are working with Navy experts and academics from John’s Hopkins Applied Physics Laboratory to adjust the sensitivity of the fusion algorithms for the JSF’s 2B software package so that groups of planes can correctly identify or discern threats.

“What we want to have happen is no matter which airplane is picking up the threat – whatever the angles or the sensors – they correctly identify a single threat and then pass that information to all four airplanes so that all four airplanes are looking at the same threat at the same place,” Bogdan said.
…

Unless Bogdan is using “sensitivity” in a very unusual sense, that doesn’t sound like the issue with the fusion computer of the F-35.

Rather the problem is the fusion computer has no explicit doctrine of subject identity to use when it is merging data from different F-35s, whether it be two, three, four or even more F-35s. The display of tactical information should be seamless to the pilot and without human intervention.

I’m sure members of Congress were impressed with General Bogdan using words like “angles” and “physics,” but the underlying subject identity issue isn’t hard to address.

At issue is the location of a potential target on the ground. Within some pre-defined metric, anything located within a given area is the “same target.”

The Air Force has already paid for this type of analysis and the mathematics of what is called Circular Error Probability (CEP) has been published in Use of Circular Error Probability in Target Detection by William Nelson (1988).

You need to use the “current” location of the detecting aircraft, allowances for inaccuracy in estimating the location of the target, etc., but once you call out the subject identity as an issue, its a matter of making choices of how accurate you want the subject identification to be.

Before you forward this to Gen. Bogdan as a way forward on the fusion computer, realize that CEP is only one aspect of target identification. But, calling the subject identity of targets out explicitly, enables reliable presentation of single/multiple targets to pilots.

Your call, confusing displays or a reliable, useful display.

PS: I assume military subject identity systems would not be running XTM software. Same principles apply even if the syntax is different.

Comments Off

April 7, 2015

Federal Data Integration: Dengue Fever

Filed under: Data Aggregation,Data Fusion,Data Integration,Integration,Prediction,Predictive Analytics — Patrick Durusau @ 4:26 pm

The White House issued a press release today (April 7, 2015) titled: FACT SHEET: Administration Announces Actions To Protect Communities From The Impacts Of Climate Change.

That press release reads in part:

…
Unleashing Data: As part of the Administration’s Predict the Next Pandemic Initiative, in May 2015, an interagency working group co-chaired by OSTP, the CDC, and the Department of Defense will launch a pilot project to simulate efforts to forecast epidemics of dengue – a mosquito-transmitted viral disease affecting millions of people every year, including U.S. travelers and residents of the tropical regions of the U.S. such as Puerto Rico. The pilot project will consolidate data sets from across the federal government and academia on the environment, disease incidence, and weather, and challenge the research and modeling community to develop predictive models for dengue and other infectious diseases based on those datasets. In August 2015, OSTP plans to convene a meeting to evaluate resulting models and showcase this effort as a “proof-of-concept” for similar forecasting efforts for other infectious diseases.
…

I tried finding more details on earlier workshops in this effort but limiting the search to “Predict the Next Pandemic Initiative” and the domain to “.gov,” I got two “hits.” One of which was the press release I cite above.

I sent a message (webform) to the White House Office of Science and Technology Policy office and will update you with any additional information that arrives.

Of course my curiosity is about the means used to integrate the data sets. Once integrated, such data sets can be re-used, at least until it is time to integrate additional data sets. Bearing in mind that dirty data can lead to poor decision making, I would rather not duplicate the cleaning of data time after time.

Comments Off

June 29, 2012

Fusion and inference from multiple data sources in a commensurate space

Filed under: Data Fusion,Feature Vectors,Procrustes Transformation — Patrick Durusau @ 3:15 pm

Fusion and inference from multiple data sources in a commensurate space by Zhiliang Ma, David J. Marchette and Carey E. Priebe. (Ma, Z., Marchette, D. J. and Priebe, C. E. (2012), Fusion and inference from multiple data sources in a commensurate space. Statistical Analy Data Mining, 5: 187–193. doi: 10.1002/sam.11142)

Abstract:

Given objects measured under multiple conditions—for example, indoor lighting versus outdoor lighting for face recognition, multiple language translation for document matching, etc.—the challenging task is to perform data fusion and utilize all the available information for inferential purposes. We consider two exploitation tasks: (i) how to determine whether a set of feature vectors represent a single object measured under different conditions; and (ii) how to create a classifier based on training data from one condition in order to classify objects measured under other conditions. The key to both problems is to transform data from multiple conditions into one commensurate space, where the (transformed) feature vectors are comparable and would be treated as if they were collected under the same condition. Toward this end, we studied Procrustes analysis and developed a new approach, which uses the interpoint dissimilarities for each condition. We impute the dissimilarities between measurements of different conditions to create one omnibus dissimilarity matrix, which is then embedded into Euclidean space. We illustrate our methodology on English and French documents collected from Wikipedia, demonstrating superior performance compared to that obtained via standard Procrustes transformation.

An early example of identity issues in topic maps from Steve Newcomb made this paper resonate for me. Steve used the example that his home has a set of geographic coordinates, a street address and a set of directions to arrive at his home, all of which identify the same subjects. All the things that can be said using one identifier can be gathered up with statements using the other identifiers.

While I still have reservations about the use of Euclidean space when dealing with non-Euclidean semantics, one has to admit that it is possible to derive some value from it.

I had to file an ILL for a print copy of the article. More to follow when it arrives.

Comments Off

April 16, 2012

Working with your Data: Easier and More Fun

Filed under: Data Fusion,Data Integration,Fusion Tables — Patrick Durusau @ 7:15 pm

Working with your Data: Easier and More Fun by Rebecca Shapley.

From the post:

The Fusion Tables team has been a little quiet lately, but that’s just because we’ve been working hard on a whole bunch of new stuff that makes it easier to discover, manage and visualize data.

New features from Fusion Tables include:

Faceted search
Multiple tabs
Line charts
Graph visualizations
New API that returns JSON
and more features on the way!

The ability of tools to ease users into data mining, visualization and exploration continues to increase.

Question: How do you counter mis-application of a tool with a sophisticated looking result?

Comments Off

May 12, 2011

Information Heterogeneity and Fusion

Filed under: Data Fusion,Heterogeneous Data,Information Integration,Mapping — Patrick Durusau @ 7:54 am

2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011)

Important Dates:

Paper submission deadline: 25th July 2011
Notification of acceptance: 19th August 2011
Camera-ready version due: 12th September 2011
Workshop: 23rd or 27th October 2011

Datasets are also being made available. Just in case you can’t find any heterogeneous data lying around.

Looks like a perfect venue for topic map papers. (Not to mention that a re-usable mapping between recommender systems looks like a commercial opportunity.)

From the website:

In recent years, increasing attention has been given to finding ways for combining, integrating and mediating heterogeneous sources of information for the purpose of providing better personalized services in many information seeking and e-commerce applications. Information heterogeneity can indeed be identified in any of the pillars of a recommender system: the modeling of user preferences, the description of resource contents, the modeling and exploitation of the context in which recommendations are made, and the characteristics of the suggested resource lists.

Almost all current recommender systems are designed for specific domains and applications, and thus usually try to make best use of a local user model, using a single kind of personal data, and without explicitly addressing the heterogeneity of the existing personal information that may be freely available (on social networks, homepages, etc.). Recognizing this limitation, among other issues: a) user models could be based on different types of explicit and implicit personal preferences, such as ratings, tags, textual reviews, records of views, queries, and purchases; b) recommended resources may belong to several domains and media, and may be described with multilingual metadata; c) context could be modeled and exploited in multi-dimensional feature spaces; d) and ranked recommendation lists could be diverse according to particular user preferences and resource attributes, oriented to groups of users, and driven by multiple user evaluation criteria.

The aim of HetRec workshop is to bring together students, faculty, researchers and professionals from both academia and industry who are interested in addressing any of the above forms of information heterogeneity and fusion in recommender systems. We would like to raise awareness of the potential of using multiple sources of information, and look for sharing expertise and suitable models and techniques.

Another dire need is for strong datasets, and one of our aims is to establish benchmarks and standard datasets on which the problems could be investigated. In this edition, we make available on-line datasets with heterogeneous information from several social systems. These datasets can be used by participants to experiment and evaluate their recommendation approaches, and be enriched with additional data, which may be published at the workshop website for future use.

Comments Off

November 26, 2010

Ensemble Based Systems in Decision Making

Filed under: Classifier Fusion,Data Fusion — Patrick Durusau @ 11:11 am

Ensemble Based Systems in Decision Making Authors: Robi Polikar Keywords: Multiple classifier systems, classifier combination, classifier fusion, classifier selection, classifier diversity, incremental learning, data fusion

Abstract:

In matters of great importance that have financial, medical, social, or other implications, we often seek a second opinion before making a decision, sometimes a third, and sometimes many more. In doing so, we weigh the individual opinions, and combine them through some thought process to reach a final decision that is presumably the most informed one. The process of consulting “several experts” before making a final decision is perhaps second nature to us; yet, the extensive benefits of such a process in automated decision making applications have only recently been discovered by computational intelligence community.

Also known under various other names, such as multiple classifier systems, committee of classifiers, or mixture of experts, ensemble based systems have shown to produce favorable results compared to those of single-expert systems for a broad range of applications and under a variety of scenarios. Design, implementation and application of such systems are the main topics of this article. Specifically, this paper reviews conditions under which ensemble based systems may be more beneficial than their single classifier counterparts, algorithms for generating individual components of the ensemble systems, and various procedures through which the individual classifiers can be combined. We discuss popular ensemble based algorithms, such as bagging, boosting, AdaBoost, stacked generalization, and hierarchical mixture of experts; as well as commonly used combination rules, including algebraic combination of outputs, voting based techniques, behavior knowledge space, and decision templates. Finally, we look at current and future research directions for novel applications of ensemble systems. Such applications include incremental learning, data fusion, feature selection, learning with missing features, confidence estimation, and error correcting output codes; all areas in which ensemble systems have shown great promise

Ironic that the second paragraph of the abstract starts off with the very semantic diversity that bedevils effective information retrieval and navigation.

Excellent survey article on ensemble systems.

Questions:

Read and summarize this article. (1-2 pages)
Choose a data set (list to be posted for class). Outline the choices or evaluations you would make in assembling an ensemble system. (3-5 pages, no citations)
Build an ensemble system to assist with building a topic map for a specific data set (Project)

Comments Off