Archive for the ‘Factor Analysis’ Category

Factor Analysis: A Short Introduction, Part 1 [Reducing Dimensionality]

Saturday, September 15th, 2012

Factor Analysis: A Short Introduction, Part 1 by Maike Rahn.

From the post:

Why use factor analysis?

Factor analysis is a useful tool for investigating variable relationships for complex concepts such as socioeconomic status, dietary patterns, or psychological scales.

It allows researchers to investigate concepts that are not easily measured directly by collapsing a large number of variables into a few interpretable underlying factors.

What is a factor?

The key concept of factor analysis is that multiple observed variables have similar patterns of responses because of their association with an underlying latent variable, the factor, which cannot easily be measured.

For example, people may respond similarly to questions about income, education, and occupation, which are all associated with the latent variable socioeconomic status.

I mention factor analysis as an example of

  • reducing dimensionality
  • substituting measurable variables for a latent variable that is not easily measured
  • attributing a relationship between that latent variable and the measurable ones
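
The three points above can be sketched with the socioeconomic status example from the post. What follows is a minimal Python sketch, not anything taken from Rahn's article: the loadings (0.8, 0.7, 0.6), the sample size, and the function names are my own assumptions. It also recovers the loadings with the classic one-factor identity (the correlation between two indicators equals the product of their loadings) rather than a full factor analysis routine.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, loadings, seed=0):
    """Generate one column per loading, all driven by one hidden factor.

    Each observation is loading * factor + noise, scaled so that every
    observed variable has unit variance. The factor itself is never returned.
    """
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n):
        factor = rng.gauss(0, 1)  # the latent variable (e.g. SES)
        for column, lam in zip(columns, loadings):
            noise = rng.gauss(0, 1)
            column.append(lam * factor + math.sqrt(1 - lam ** 2) * noise)
    return columns


def one_factor_loadings(x1, x2, x3):
    """Estimate loadings for three indicators of a single factor.

    Under a one-factor model r_ij = lam_i * lam_j, so
    lam_1 = sqrt(r12 * r13 / r23), and likewise for the others.
    """
    r12, r13, r23 = pearson(x1, x2), pearson(x1, x3), pearson(x2, x3)
    return (
        math.sqrt(r12 * r13 / r23),
        math.sqrt(r12 * r23 / r13),
        math.sqrt(r13 * r23 / r12),
    )


# Hypothetical loadings for income, education, and occupation.
true_loadings = (0.8, 0.7, 0.6)
income, education, occupation = simulate(20000, true_loadings)
estimated = one_factor_loadings(income, education, occupation)
print("estimated loadings:", [round(v, 2) for v in estimated])
```

Three measured columns go in and one set of loadings on a single factor comes out, which is the dimensionality reduction. Note the assumption the sketch quietly depends on: that exactly one factor drives all three variables. If that is false, the formula still returns numbers, which is one reason to probe the assumptions behind any factor analysis.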

Factor analysis has been successfully used in a number of fields.

However, to reliably integrate information based on factor analysis, you will need to probe the (often) unstated assumptions of such an analysis.

PS: You may find the pointers in Wikipedia useful: Factor Analysis.

Factor Analysis at 100:… [Two Subjects – One Name]

Friday, August 10th, 2012

Factor Analysis at 100: Historical Developments And Future Directions (Cudeck and MacCallum, Lawrence Erlbaum Associates Inc, 2007, 384 pp.) was mentioned by Christophe Lalanne in Some Random Notes as one of his recent book acquisitions.

While searching for that volume, I encountered a conference with the same name: Factor Analysis at 100: Historical Developments And Future Directions [Conference, 2004].

At the conference site you will find links to materials from thirteen speakers, plus a “Factor Analysis Genealogy” and “Factor Analysis Timeline.”

The presentations from the conference became papers that appear in the volume Christophe recently purchased.

Charles Spearman’s 1904 paper, “General Intelligence, Objectively Determined and Measured,” from the American Journal of Psychology [PDF version] [HTML version], was posted to the conference homepage.


The relevant subject identifiers are obvious. What else would you add to topics representing these subjects? Why?

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

Wednesday, September 7th, 2011

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis by Abhishek Taneja and R.K.Chauhan.

Abstract:

The growing volume of data usually creates an interesting challenge for the need of data analysis tools that discover regularities in these data. Data mining has emerged as disciplines that contribute tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques viz., factor analysis and multiple linear regression for different sample sizes on three unique sets of data. The performance of the two data mining techniques is compared on following parameters like mean square error (MSE), R-square, R-Square adjusted, condition number, root mean square error(RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools like SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given dataset, factor analysis outperform multiple linear regression. But the absolute value of prediction accuracy varied between the three datasets indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.

I had to do a double-take when I saw “factor analysis” in the title of this article. I remember factor analysis from Schubert’s The Judicial Mind Revisited: Psychometric Analysis of Supreme Court Ideology, where Schubert used factor analysis to model the relative positions of the Supreme Court Justices. Schubert taught himself factor analysis on a Friden rotary calculator. (I had one of those too but that’s a different story.)

The real lesson of this article comes at the end of the abstract: the data distribution and data characteristics play a major role in choosing the correct prediction technique.
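
For readers who want to see what a few of the abstract's comparison measures actually compute, here is a minimal Python sketch. The five data points are toy numbers I made up for illustration, not data from the paper, and the function names are hypothetical. I also assume MSE means the plain average of squared residuals; some packages divide by the residual degrees of freedom instead, so check which convention a tool uses before comparing figures.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_metrics(y, y_hat, n_predictors):
    """Return MSE, RMSE, R-square, and adjusted R-square for predictions."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    sst = sum((a - mean_y) ** 2 for a in y)            # total sum of squares
    mse = sse / n                                      # assumed convention
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}


# Toy data, invented for this example only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = simple_ols(x, y)
y_hat = [intercept + slope * value for value in x]
metrics = fit_metrics(y, y_hat, n_predictors=1)
print(metrics)
```

Because `fit_metrics` only needs observed and predicted values, the same function could score predictions from any technique, which is what makes a side-by-side comparison like the one in the paper possible.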