Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 24, 2019

What The Hell Happened (2016) – Data Questions

Filed under: Data,Politics,Survey — Patrick Durusau @ 9:12 pm

What The Hell Happened (WTHH)

From the homepage:

Every progressive remembers waking up on November 9th, 2016. The question on everyone’s mind was… “What the hell happened?”

Pundits were quick to blame “identity politics” for Clinton’s loss. Recent research suggests this framing may have led voters to be less supportive of women candidates and candidates of color.

That’s why we’re introducing the What The Hell Happened Project, where we will work with academics, practitioners and advocates to explain the 2018 election from beginning to end.

Let’s cut to the data:

This survey is based on 3,215 interviews of registered voters conducted by YouGov. The sample was weighted according to age, sex, race, education, urban/rural status, partisanship, marital status, and Census region to be nationally representative of 2018 voters according to Catalist, and to a post-election correction consisting of the national two-party vote share. Respondents were selected from YouGov and other opt-in panels to be representative of registered voters. The weights range from 0.28 to 4.6, with a mean of 1 and a standard deviation of 0.53.

The survey dataset includes measures of political participation such as activism, group consciousness, and vote choice. It also includes measures of interest including items from a hostile sexism battery, racial resentment, fear of demographic change, fear of cultural change, and a variety of policy positions. It includes a rich demographic battery of items like age, race, ethnicity, sex, party identification, income, education, and US state. Please see the attached codebook for a full description and coding of the variables in this survey, as well as the toplines for breakdowns of some of the key variables.

The dataset also includes recodes to scale the hostile sexism items to a 0-1 scale of hostile sexism, the racial animus items to a 0-1 scale of racial animus, and the demographic change items to a 0-1 scale of fear of demographic change. See the codebook for more details. We created a two-way vote choice variable to capture Democrat/Republican voting by imputing the vote choice of undecided respondents based on a Catalist partisanship model for those respondents, who comprised about 5% of the sample.

To explore the data we have embedded a Crunchbox, which you can use to easily make crosstabs and charts of the data. Here, you can click around many of the political and demographic items and look around for interesting trends to explore.

If you want a winning candidate in 2020, repeat every morning: Focus on 2020, Focus on 2020.

Your candidate is not running in 2016 or even 2018.

And, your candidate needs better voter data than WTHH offers here.

First, how was the data gathered?

Respondents were selected from YouGov and other opt-in panels to be representative of registered voters.

Yikes! That’s not how professional pollsters do surveys. It may be ok for learning analysis tools but not for serious political forecasting.

Second, what manipulation, if any, of the data, has been performed?

The sample was weighted according to age, sex, race, education, urban/rural status, partisanship, marital status, and Census region to be nationally representative of 2018 voters according to Catalist, and to a post-election correction consisting of the national two-party vote share.

Oh. So we don’t know what biases or faults the weighting process may have introduced to the data. Great.
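The weight summary quoted above does allow one rough check. What follows is a minimal sketch, assuming Kish's approximation for the design effect from unequal weights (1 + CV², where CV is the standard deviation of the weights divided by their mean) is a fair fit here. The function name is my own and the inputs are the figures quoted from the WTHH description. It says nothing about bias, only about how much precision the weighting costs.

```python
def kish_effective_n(n, weight_mean, weight_sd):
    """Approximate effective sample size under Kish's design effect.

    deff = 1 + CV^2, where CV = sd(weights) / mean(weights).
    """
    cv = weight_sd / weight_mean
    deff = 1.0 + cv ** 2
    return n / deff


# Figures quoted from the WTHH description: 3,215 interviews,
# weights with mean 1 and standard deviation 0.53.
n_eff = kish_effective_n(n=3215, weight_mean=1.0, weight_sd=0.53)
print(f"design effect: {3215 / n_eff:.2f}")
print(f"effective sample size: {n_eff:.0f}")
```

Under that assumption the 3,215 interviews behave like roughly 2,510 equally weighted ones. Whether the weights corrected or introduced bias is a separate question that the summary statistics cannot answer.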

How were the questions constructed and tested?

Don’t know. (Without this step we don’t know what the questions may or may not be measuring.)

How many questions were asked? (56)

Fifty-six questions. Really?

In the 1960 presidential campaign, John F. Kennedy’s staff had a matrix of 480 voter types and 52 issue clusters.

Do you see such a matrix coming out of 56 questions? Neither do I.
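To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes, generously, that the WTHH respondents spread evenly across the voter types, which real samples never do. The constant names are mine and the figures are the ones cited above.

```python
VOTER_TYPES = 480      # voter types in the Kennedy campaign matrix cited above
ISSUE_CLUSTERS = 52    # issue clusters in the same matrix
INTERVIEWS = 3215      # WTHH sample size

# Total cells in a voter-type by issue-cluster matrix.
cells = VOTER_TYPES * ISSUE_CLUSTERS

# Best case: respondents divided evenly across voter types (an assumption).
per_type = INTERVIEWS / VOTER_TYPES

print(f"cells in the matrix: {cells}")
print(f"average respondents per voter type: {per_type:.1f}")
```

That is 24,960 cells and, even in the best case, fewer than seven respondents per voter type.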

The WTHH data is interesting in an amateurish sort of way but winning in 2020 requires the latest data gathering and modeling techniques. Not to mention getting voters to the polling places (modeling a solution for registered voters who do not vote would be a real plus). Your Secretary of State should have prior voting behavior records.

April 30, 2015

New Survey Technique! Ask Village Idiots

Filed under: Artificial Intelligence,News,Survey — Patrick Durusau @ 1:38 pm

I was deeply disappointed to see Scientific Computing with the headline: ‘Avengers’ Stars Wary of Artificial Intelligence by Ryan Pearson.

The respondents are all talented movie stars but acting talent and even celebrity doesn’t give them insight into issues such as artificial intelligence. You might as well ask football coaches about the radiation hazards of a possible mission to Mars. Football coaches, the winning ones anyway, are bright and intelligent folks, but as a class, aren’t the usual suspects to ask about inter-planetary radiation hazards.

President Reagan was known to confuse movies with reality but that was under extenuating circumstances. Confusing people acting in movies with people who are actually informed on a subject doesn’t make for useful news reporting.

Asking Chris Hemsworth who plays Thor in Avengers: Age of Ultron what the residents of Asgard think about relief efforts for victims of the recent earthquake in Nepal would be as meaningful.

They still publish the National Enquirer. A much better venue for “surveys” of the uninformed.

March 12, 2015

Speaking of Numbers and Big Data Disruption

Filed under: BigData,Statistics,Survey — Patrick Durusau @ 6:49 pm

Survey: Big Data is Disrupting Business as Usual by George Leopold.

From the post:

Sixty-four percent of the enterprises surveyed said big data is beginning to change the traditional boundaries of their businesses, allowing more agile providers to grab market share. More than half of those surveyed said they are facing greater competition from “data-enabled startups” while 27 percent reported competition from new players from other industries.

Hence, enterprises slow to embrace data analytics are now fretting over their very survival, EMC and the consulting firm argued.

Those fears are expected to drive investment in big data over the next three years, with 54 percent of respondents saying they plan to increase investment in big data tools. Among those who have already made big data investments, 61 percent said data analytics are already driving company revenues. The fruits of these big data efforts are proving as valuable as existing products and services, the survey found.

That sounds important, except they never say how business is being disrupted. Seems like that would be an important point to make. Yes?

And note the 61% who “…said data analytics are already driving company revenues…” are “…among those who have already made big data investments….” Was that ten people? Twenty? And who after making a major investment is going to say that it sucks?
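Why the missing subgroup size matters can be sketched with the textbook margin of error for a proportion. This assumes a simple random sample and the normal approximation, neither of which an online survey guarantees (and the approximation is shaky at very small n). The subgroup sizes below are hypothetical, since the post reports none.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of a normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)


p = 0.61  # share who said analytics are already driving revenue
for n in (10, 20, 100, 1000):  # hypothetical subgroup sizes; the post gives none
    moe = margin_of_error(p, n)
    print(f"n={n:>4}: 61% +/- {moe * 100:.0f} points")
```

With ten respondents the 61% carries a margin of roughly 30 points either way; with a thousand it shrinks to about 3. Without the subgroup size, the figure cannot be placed anywhere on that range.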

The survey itself sounds suspect if you read the end of the post:

Capgemini said its big data report is based on an online survey conducted in August 2014 of more than 1,000 senior executives across nine industries in ten global markets. Survey author FreshMinds also conducted follow-up interviews with some respondents.

I think there is a reason that Gallup and that sort of folks don’t do online surveys. It has something to do with accuracy if I recall correctly. 😉

October 24, 2014

analyze survey data for free

Filed under: Public Data,R,Survey — Patrick Durusau @ 10:08 am

Anthony Damico has “unlocked” a number of public survey data sets with blog posts that detail how to analyze those sets with R.

Forty-six (46) data sets are covered so far:

unlocked public-use data sets

An impressive donation of value to R and public data and an example that merits emulation! Pass this along.

I first saw this in a tweet by Sharon Machlis.

August 23, 2014

Data + Design

Filed under: Data,Design,Survey,Visualization — Patrick Durusau @ 2:17 pm

Data + Design: A simple introduction to preparing and visualizing information by Trina Chiasson, Dyanna Gregory and others.

From the webpage:

ABOUT

Information design is about understanding data.

Whether you’re writing an article for your newspaper, showing the results of a campaign, introducing your academic research, illustrating your team’s performance metrics, or shedding light on civic issues, you need to know how to present your data so that other people can understand it.

Regardless of what tools you use to collect data and build visualizations, as an author you need to make decisions around your subjects and datasets in order to tell a good story. And for that, you need to understand key topics in collecting, cleaning, and visualizing data.

This free, Creative Commons-licensed e-book explains important data concepts in simple language. Think of it as an in-depth data FAQ for graphic designers, content producers, and less-technical folks who want some extra help knowing where to begin, and what to watch out for when visualizing information.

As of today, Data + Design is the product of fifty (50) volunteers from fourteen (14) countries. At eighteen (18) chapters and just shy of three hundred (300) pages, this is a solid introduction to data and its visualization.

The source code is on GitHub, along with information on how you can contribute to this project.

A great starting place but my social science background is responsible for my caution concerning chapters 3 and 4 on survey design and questions.

All of the information and advice in those chapters is good, but it leaves the impression that you (the reader) can design an effective survey instrument. There is a big difference between an “effective” survey instrument and a series of questions pretending to be a survey instrument. Both will measure “something” but the question is whether a survey instrument provides you with actionable intelligence.

For a survey on anything remotely mission critical, like user feedback on an interface or service, get as much professional help as you can afford.

When was the last time you heard of a candidate for political office or serious vendor using Survey Monkey? There’s a reason for that lack of reports. Can you guess that reason?

I first saw this in a tweet by Meta Brown.

August 22, 2014

Computer Science – Know Thyself!

Filed under: Computer Science,Social Sciences,Survey — Patrick Durusau @ 10:34 am

Putting the science in computer science by Felienne Hermans.

From the description:

Programmers love science! At least, so they say. Because when it comes to the ‘science’ of developing code, the most used tool is brutal debate. Vim versus emacs, static versus dynamic typing, Java versus C#, this can go on for hours at end. In this session, software engineering professor Felienne Hermans will present the latest research in software engineering that tries to understand and explain what programming methods, languages and tools are best suited for different types of development.

Great slides from Felienne’s keynote at ALE 2014.

I mention this to emphasize the need for social science research techniques and methodologies for application development. Investigation of computer science debates with such methods may lead to less resistance to them for user facing issues.

Perhaps a recognition that we are all “users,” bringing common human experiences to different interfaces with computers, will result in better interfaces for all.

January 18, 2014

A course in sample surveys for political science

Filed under: Politics,Statistics,Survey — Patrick Durusau @ 8:11 pm

A course in sample surveys for political science by Andrew Gelman.

From the post:

A colleague asked if I had any material for a course in sample surveys. And indeed I do. See here.

It’s all the slides for a 14-week course, also the syllabus (“surveyscourse.pdf”), the final exam (“final2012.pdf”) and various misc files. Also more discussion of final exam questions here (keep scrolling thru the “previous entries” until you get to Question 1).

Enjoy! This is in no way a self-contained teach-it-yourself course, but I do think it could be helpful for anyone who is trying to teach a class on this material.

An impressive bundle of survey material!

I mention it because you may be collecting survey data or at least asked to process survey data.

Hopefully it won’t originate from Survey Monkey.

If I had $1 for every survey composed by a statistical or survey illiterate on Survey Monkey, I could make a substantial down payment on the national debt.

That’s not the fault of Survey Monkey but there is more to survey work than asking questions.

If you don’t know how to write a survey, do us all a favor, make up the numbers and say that in a footnote. You will be in good company with the piracy estimators.
