Archive for the ‘Twitter’ Category

Busting Fake Tweeters

Tuesday, October 10th, 2017

The ultimate guide to bust fake tweeters: A video toolkit in 10 steps by Henk van Ess.

From the post:

Twitter is full of false information. Even Twitter co-founder Ev Williams recognizes that there is a “junk information epidemic going on,” as “[ad-driven platforms] are benefiting from people generating attention at pretty much any cost.”

This video toolkit is intended to help you debunk dubious tweets. It was first developed in research by the Institute for Strategic Dialogue and the Arena Program at the London School of Economics to detect Russian social media influence during the German elections. It was also the basis for a related BuzzFeed article on a Russian bot farm and tweets about the AfD  — the far-right party that will enter the German parliament for the first time.

This is an excellent resource for teaching users skepticism about Twitter accounts.

For your use in creating a personal cheatsheet (read van Ess for the links):

  1. Find the exact minute of birth
  2. Find the first words
  3. Check the followers
  4. Find Twitter users in Facebook
  5. Find suspicious words in tweets
  6. Searching in big data
  7. Connect a made up Twitter handle to a real social media account
  8. Find a social score
  9. How alive is the bot?
  10. When (and how) is your bot tweeting?
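Step 10 lends itself to a small sketch. The code below is not van Ess's method, only an illustration of the idea: given tweet timestamps you have collected yourself (the sample data and the function names `hourly_profile` and `active_hours` are hypothetical), count tweets per hour of the day. An account that posts in nearly every hour, day after day, deserves a closer look.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Count tweets per UTC hour (0-23) from ISO 8601 strings with an offset."""
    counts = Counter()
    for ts in timestamps:
        # Assumes each string carries a UTC offset, e.g. "+00:00".
        dt = datetime.fromisoformat(ts).astimezone(timezone.utc)
        counts[dt.hour] += 1
    return [counts.get(hour, 0) for hour in range(24)]


def active_hours(profile):
    """Number of distinct hours of the day in which the account tweeted."""
    return sum(1 for n in profile if n > 0)


# Hypothetical sample: timestamps gathered for a single account.
sample = [
    "2017-10-01T03:15:00+00:00",
    "2017-10-01T03:45:00+00:00",
    "2017-10-01T14:05:00+00:00",
    "2017-10-02T23:59:00+00:00",
]
profile = hourly_profile(sample)
print(active_hours(profile), "distinct hours with activity")
```

A human account usually shows a gap of several hours for sleep. That is a rule of thumb, not proof, so treat the profile as one signal among the ten steps.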

Deciding that a Twitter account may be legitimate is only the first step in evaluating tweeted content.

The @WSJ account belongs to the Wall Street Journal, but it doesn’t follow that their tweets are accurate or even true. Witness their repetition of government rumors about Kaspersky Lab, for example. Not one shred of evidence, but WSJ repeats it.

Be skeptical of all Tweets, not just ones attributed to the “enemy of the day.”

Salvation for the Left Behind on Twitter’s 280 Character Limit

Wednesday, September 27th, 2017

If you are one of the “left behind” on Twitter’s expansion to a 280 character limit, don’t despair!

Robert Graham (@ErrataRob) rides to your rescue with: Browser hacking for 280 character tweets.

Truth is, Bob covers more than reaching the new 280 character limit for the left behind: he walks through HTTP requests, introduces Chrome’s DevTools, and shows command line use of cURL.

Take a few minutes to walk through Bob’s post.

A little knowledge of browsers and tools will put you far ahead of your management.

Women in Data Science (~1200) – Potential Speaker List

Sunday, September 24th, 2017

When I last posted about Data Science Renee‘s Twitter list of women in data science, it had ~632 members.

That was in April of 2016.

As of today, the list has 1,203 members! By the time you look, that number will be different again.

I call this a “potential speaker list” because not every member may be interested in your conference or have the time to attend.

Have you made a serious effort to recruit women speakers if you have not consulted this list and others like it?

Serious question.

Do you have a serious answer?

Twitter – Government Censor’s Friend

Saturday, July 15th, 2017

Governments (democratic, non-democratic, kingships, etc.) that keep secrets from the public share a common enemy in WikiLeaks.

WikiLeaks self-describes in part as:

WikiLeaks is a multi-national media organization and associated library. It was founded by its publisher Julian Assange in 2006.

WikiLeaks specializes in the analysis and publication of large datasets of censored or otherwise restricted official materials involving war, spying and corruption. It has so far published more than 10 million documents and associated analyses.

“WikiLeaks is a giant library of the world’s most persecuted documents. We give asylum to these documents, we analyze them, we promote them and we obtain more.” – Julian Assange, Der Spiegel Interview.

WikiLeaks has contractual relationships and secure communications paths to more than 100 major media organizations from around the world. This gives WikiLeaks sources negotiating power, impact and technical protections that would otherwise be difficult or impossible to achieve.

Although no organization can hope to have a perfect record forever, thus far WikiLeaks has a perfect in document authentication and resistance to all censorship attempts.

Those same governments share a common ally in Twitter, which has engaged in systematic actions to diminish the presence/influence of Julian Assange on Twitter.

Caitlin Johnstone documents Twitter’s intentional campaign against Assange in Twitter Is Using Account Verification To Stifle Leaks And Promote War Propaganda.

Catch Johnstone’s post for the details but then:

  1. Follow @JulianAssange on Twitter (watch for minor variations that are not this account).
  2. Tweet to your followers, at least once a week, urging them to follow @JulianAssange
  3. Investigate and support non-censoring alternatives to Twitter.

You can verify Twitter’s dilution of Julian Assange for yourself.

Type “JulianAssange_” in the Twitter search box (my results):

Twitter was a remarkably good idea, but has long since poisoned itself with censorship and pettiness.

Your suggested alternative?

Thank You, Scott – SNL

Friday, May 26th, 2017

I posted this to Facebook; search for “Thanks Scott SNL” to find my post or that of others.

Included this note (with edits):

Appropriate social media warriors (myself included). From sexism and racism to fracking and pipelines, push back in the real world if you [want] change. Push back on social media for a warm but meaningless feeling of solidarity.

For me, the “real world” includes cyberspace, where pushing can have consequences.

You?

Mastodon (Tor Access Recommended)

Wednesday, April 5th, 2017

Mastodon

From the homepage:

Mastodon is a free, open-source social network. A decentralized alternative to commercial platforms, it avoids the risks of a single company monopolizing your communication. Pick a server that you trust — whichever you choose, you can interact with everyone else. Anyone can run their own Mastodon instance and participate in the social network seamlessly.

What sets Mastodon apart:

  • Timelines are chronological
  • Public timelines
  • 500 characters per post
  • GIFV sets and short videos
  • Granular, per-post privacy settings
  • Rich block and muting tools
  • Ethical design: no ads, no tracking
  • Open API for apps and services

… (emphasis in original)

No regex for filtering posts but it does have:

  • Block notifications from non-followers
  • Block notifications from people you don’t follow

One or both should cover most of the harassment cases.

I was surprised by the “Pick a server that you trust…” suggestion.

Really? A remote server being run by someone unknown to me? Bad enough that I have to “trust” my ISP, to a degree, but an unknown?

You really need a Tor-based email account and should use Tor to access Mastodon. Seriously.

Politics For Your Twitter Feed

Sunday, March 26th, 2017

Hungry for more political tweets?

GovTrack created the Members of Congress Twitter list.

Barometer of congressional mood?

Enjoy!

Creating A Social Media ‘Botnet’ To Skew A Debate

Friday, March 10th, 2017

New Research Shows How Common Core Critics Built Social Media ‘Botnets’ to Skew the Education Debate by Kevin Mahnken.

From the post:

Anyone following education news on Twitter between 2013 and 2016 would have been hard-pressed to ignore the gradual curdling of Americans’ attitudes toward the Common Core State Standards. Once seen as an innocuous effort to lift performance in classrooms, they slowly came to be denounced as “Dirty Commie agenda trash” and a “Liberal/Islam indoctrination curriculum.”

After years of social media attacks, the damage is impressive to behold: In 2013, 83 percent of respondents in Education Next’s annual poll of Americans’ education attitudes felt favorably about the Common Core, including 82 percent of Republicans. But by the summer of 2016, support had eroded, with those numbers measuring only 50 percent and 39 percent, respectively. The uproar reached such heights, and so quickly, that it seemed to reflect a spontaneous populist rebellion against the most visible education reform in a decade.

Not so, say researchers with the University of Pennsylvania’s Consortium for Policy Research in Education. Last week, they released the #commoncore project, a study that suggests that public animosity toward Common Core was manipulated — and exaggerated — by organized online communities using cutting-edge social media strategies.

As the project’s authors write, the effect of these strategies was “the illusion of a vociferous Twitter conversation waged by a spontaneous mass of disconnected peers, whereas in actuality the peers are the unified proxy voice of a single viewpoint.”

Translation: A small circle of Common Core critics were able to create and then conduct their own echo chambers, skewing the Twitter debate in the process.

The most successful of these coordinated campaigns originated with the Patriot Journalist Network, a for-profit group that can be tied to almost one-quarter of all Twitter activity around the issue; on certain days, its PJNET hashtag has appeared in 69 percent of Common Core–related tweets.

The team of authors tracked nearly a million tweets sent during four half-year spans between September 2013 and April 2016, studying both how the online conversation about the standards grew (more than 50 percent between the first phase, September 2013 through February 2014, and the third, May 2015 through October 2015) and how its interlocutors changed over time.
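The 69 percent figure quoted above is a per-day share: tweets carrying a given hashtag divided by all tweets on the topic that day. A minimal sketch of that arithmetic follows. It is not the #commoncore project's code, and the sample tweets and the name `daily_hashtag_share` are hypothetical.

```python
from collections import defaultdict


def daily_hashtag_share(tweets, hashtag):
    """Return {day: fraction of that day's tweets containing the hashtag}.

    `tweets` is an iterable of (day, text) pairs.
    """
    tag = hashtag.lower()
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in tweets:
        totals[day] += 1
        # Compare whole words, ignoring case and trailing punctuation.
        words = {w.strip(".,!?:;").lower() for w in text.split()}
        if tag in words:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


# Hypothetical sample of topic tweets as (day, text) pairs.
sample = [
    ("2014-03-01", "Stop it now #PJNET #commoncore"),
    ("2014-03-01", "Thoughts on #commoncore math"),
    ("2014-03-02", "More #commoncore news #pjnet."),
]
shares = daily_hashtag_share(sample, "#pjnet")
for day, share in sorted(shares.items()):
    print(day, f"{share:.0%}")
```

A single hashtag dominating a day's traffic is consistent with coordination, but it does not by itself show how many distinct people were tweeting.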

Mahnken talks as though creating a ‘botnet’ to defeat adoption of the Common Core State Standards is a bad thing.

I never cared for #commoncore because testing makes money for large and small testing vendors. It has no other demonstrated impact on the educational process.

Let’s assume you want to build a championship high school baseball team. To do that, various officious intermeddlers, who have no experience with baseball, fund creation of the Common Core Baseball Standards.

Every three years, every child is tested against the Common Core Baseball Standards and their performance recorded. No funds are allocated for additional training for gifted performers, equipment, baseball fields, etc.

By the time these students reach high school, will you have the basis for a championship team? Perhaps, but if you do, it is due to random chance and not the Common Core Baseball Standards.

If you want a championship high school baseball team, you fund training, equipment and baseball fields, in addition to spending money on the best facilities for your hoped for championship high school team. Consistently and over time you spend money.

The key to better education results isn’t testing, but funding based on the education results you hope to achieve.

I do commend the #commoncore project website for being an impressive presentation of Twitter data, even though it is clearly a propaganda machine for pro Common Core advocates.

The challenge here is to work backwards from what was observed by the project to both the principles and tactics that made #stopcommoncore so successful. That is, we know it succeeded, at least to some degree, but how do we replicate that success on other issues?

Replication is how science demonstrates the reliability of a technique.

Looking forward to hearing your thoughts, suggestions, etc.

Enjoy!

Continuing Management Fail At Twitter

Monday, March 6th, 2017

Twitter management continues to fail.

Consider its censoring of the account of Lauri Love (a rumored hacker).

Competent management at Twitter would be licensing the rights to create shareable mutes/filters for all posts from Lauri Love.

The FBI, Breitbart, US State Department, and others would vie for users of their filters, which block “dangerous and/or seditious content.”

Filters licensed in increments, depending on how many shares you want to enable.

Twitter with no censorship at all would drive the market for such filters.

Licensing filters by number of shares provides a steady revenue stream and Twitter could shed its censorship-prone barnacles. More profit, reduced costs, what’s not to like?

PS: I ask nothing for this suggestion. Getting Twitter out of the censorship game on behalf of governments is benefit enough for me.

Trump Tweets Strategically – You Respond (fill in the blank)

Saturday, March 4th, 2017

George Lakoff tweeted:

Here’s an example of a “strategic” tweet by Trump.

Donald J. Trump tweets:

Terrible! Just found out that Obama had my “wires tapped” in Trump Tower just before the victory. Nothing found. This is McCarthyism!

For testing purposes, how would you characterize this sample of tweets, a small part of the 35K replies to Trump’s tweet?


pourmecoffee (verified account) @pourmecoffee
@realDonaldTrump Correct. Making allegations without evidence is the literal definition of McCarthyism.

FFT-Obama for Prison @FemalesForTrump
.@pourmecoffee
when will the liars learn. Trump ALWAYS does his homework! The truth will support his tweet in 3, 2, 1 …
#saturdaymorning

Ignatz @ignatzz
@FemalesForTrump @pourmecoffee Yes, I remember that proof that Obama was born in Kenya. And the Bowling Green Massacre.

FFT-Obama for Prison @FemalesForTrump
@ignatzz @pourmecoffee he WAS born in Kenya. Hawaii b/c is a fake. #fact
He didn’t make the bowling green statement. Now go away

Lisa Armstrong (verified account) @LisaArmstrong
@FemalesForTrump You people are still stuck on the lie that Obama was born in Kenya? Why? Where is the proof? #alternativefacts

Jet Black @jetd69
@LisaArmstrong @FemalesForTrump There’s little point in arguing with her. She’s as off her chops as he is. Females for Trump indeed!

Lisa Armstrong (verified account) @LisaArmstrong
@jetd69 @FemalesForTrump I know you’re right. It’s just that the willingness of #Trump supporters to believe flat out lies astounds me.

AngieStrader @AngieStrader
@LisaArmstrong @jetd69 @FemalesForTrump this goes both ways. Dems want Trump on treason. Based on what facts? What verifiable sources?

Lisa Armstrong (verified account) @LisaArmstrong
@AngieStrader The difference is there’s a long list of shady things Trump has actually done. These are facts. Obama being Kenyan is a lie.

Do you see any strategic tweets in that list or in the other 37K responses (as of Saturday afternoon, 4 March 2017)?

If the point of Trump’s tweet was diversion, I would have to say it succeeded beautifully.

You?

The strategic response to a Trump tweet is to ignore it in favor of propagating your theme.

Twitter reduces reach of users it believes are abusive [More Opaque Censorship]

Friday, February 17th, 2017

Twitter reduces reach of users it believes are abusive

More opaque censorship from Twitter:

Twitter has begun temporarily decreasing the reach of tweets from users it believes are engaging in abusive behaviour.

The new action prevents tweets from users Twitter has identified as being abusive from being displayed to people who do not follow them for 12 hours, thus reducing the user’s reach.

If the user were to mention someone who does not follow them on the social media site, that person would not see the tweet in their notifications. Again, this would last for 12 hours.

If the user who had posted abusive tweets was retweeted by someone else, this tweet would not be able to be seen by people who do not follow them, again reducing their Twitter reach.
… (emphasis in original)

I’m assuming this is one of the changes Ed Ho alluded to in An Update on Safety (February 7, 2017) when he said:

Collapsing potentially abusive or low-quality Tweets:

Our team has also been working on identifying and collapsing potentially abusive and low-quality replies so the most relevant conversations are brought forward. These Tweet replies will still be accessible to those who seek them out. You can expect to see this change rolling out in the coming weeks.
… (emphasis in original)

No announcements for:

  • Grounds for being deemed “abusive.”
  • Process for contesting designation as “abusive.”

Twitter is practicing censorship, the basis for which is opaque and the censored have no impartial public forum for contesting that censorship.

In the interest of space, I forgo the obvious historical comparisons.

All of which could have been avoided by granting Twitter users:

The ability to create and share filters for tweets.

Even a crude filtering mechanism should enable me to filter tweets that contain my Twitter handle, but that don’t originate from anyone I follow.
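To show how crude such a filter can be, here is a minimal sketch. Nothing in it is a Twitter feature or API: the tweet dictionaries, the handles, and the names `make_mention_filter` and `is_hidden` are all hypothetical. The filter is plain data, so it can be serialized and handed to another user, which is the "share" half of the suggestion.

```python
import json


def make_mention_filter(my_handle, following):
    """Build a declarative filter: plain data that can be shared as JSON."""
    return {
        "hide_if_mentions": my_handle.lower(),
        "unless_author_in": sorted(h.lower() for h in following),
    }


def is_hidden(tweet, flt):
    """True if the tweet mentions the handle and its author is not followed."""
    # Compare whole words, ignoring case and surrounding punctuation.
    words = {w.strip(".,!?:;").lower() for w in tweet["text"].split()}
    author = tweet["author"].lower()
    return flt["hide_if_mentions"] in words and author not in flt["unless_author_in"]


flt = make_mention_filter("@me", ["@friend"])
shared = json.dumps(flt)       # text that could be passed to another user
restored = json.loads(shared)  # what the recipient would load

# Hypothetical stream of tweets.
stream = [
    {"author": "@friend", "text": "hello @me"},
    {"author": "@stranger", "text": "@me you are wrong"},
    {"author": "@stranger", "text": "nice weather today"},
]
visible = [t for t in stream if not is_hidden(t, restored)]
print(len(visible), "of", len(stream), "tweets shown")
```

The point of the design is that the user, not the platform, decides what is hidden, and the rule is readable by anyone who receives the filter.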

So Ed Ho, why aren’t users being empowered to filter their own streams?

Republican Regime Creates New Cyber Market – Burner Twitter/Facebook Accounts

Thursday, February 9th, 2017

The current Republican regime has embarked upon creating a new cyber market, less than a month after taking office.

Samatha Dean (Tech Times) reports:

Planning a visit to the U.S.? Your passport is not the only thing you may have to turn in at the immigration counter, be prepared to relinquish your social media account passwords as well to the border security agents.

That’s right! According to a new protocol from the Homeland Security that is under consideration, visitors to the U.S. may have to give their Twitter and Facebook passwords to the border security agents.

The news comes close on the heels of the Trump administration issuing the immigration ban, which resulted in a massive state of confusion at airports, where several people were debarred from entering the country.

John F. Kelly, the Homeland Security Secretary, shared with the Congress on Feb. 7 that the Trump administration was considering this option. The measure was being weighed as a means to sieve visa applications and sift through refugees from the Muslim majority countries that are under the 90-day immigration ban.

I say burner Twitter/Facebook accounts because, if you plan on making a second trip to the US, you will need burner accounts that have been maintained over the years.

The need for burner Twitter/Facebook accounts, ones you can freely disclose to border security agents, presents a wide range of data science issues.

In no particular order:

  • Defeating Twitter/Facebook security on a large scale. Not trivial but not the hard part either
  • Creating accounts with the most common names
  • Automated posting to accounts in their native language
  • Posts must be indistinguishable from human user postings, i.e., no auto-retweets of Sean Spicer
  • Profile of tweets/posts shows consistent usage

I haven’t thought about burner bank account details before but that certainly should be doable. Especially if you have a set of banks on the Net that don’t have much overhead but exist to keep records one to the other.

Burner bank accounts could be useful to more than just travelers to the United States.

Kudos to the new Republican regime and their market creation efforts!

Twistance – “Rogue” Twitter Accounts – US Federal Science Agencies

Thursday, January 26th, 2017

Alice Stollmeyer has put together Twistance:

Twitter + resistance = #Twistance. “Rogue” Twitter accounts from US federal science agencies.

As of 26 January 2017, 44 members and 5,133 subscribers.

A long overdue step towards free speech for government employees and voters making decisions on what is known inside the federal government.

Caution:

A claim to be an “alternative” account may or may not be true. As with the official accounts, evaluate factual claims for yourself. Use good security practices when communicating with unknown accounts. (Some of the account names are very close in spelling but are separate accounts.)

  • Alt Hi Volcanoes NP The Unofficial “Resistance” team of Hawaii Volcanoes National Park. Not taxpayer funded.
  • Alt HHS Unofficial and unaffiliated resistance account by concerned scientists for humanity.
  • The Alt NPS and EPA Real news regarding the NPS, EPA, climate science and environmentalism
  • Alt Science Raising awareness of climate change and other threats posed by science denial. Not affiliated with the US gov. #Resist
  • Alternative CDC Unofficial unaffiliated resistance account by concerned scientists for humanity.
  • Alternative HeHo A parody account for the Herbert Hoover National Historic Site
  • Alternative NIH Unofficial group of science advocates. Stand up for science, rights, equality, social justice, & ultimately, for the health of humanity.
  • Alternative NOAA The Unofficial “Resistance” team of the NOAA. Account not tax payer subsidized. We study the oceans, and the atmosphere to understand our planet. #MASA
  • AltBadlandsNatPark You’ll never shut us down, Drumpf!
  • Alt-Badlands NPS Bigly fake #badlandsnationalpark. ‘Sad!’ – Donald J Trump. #badlands #climate #science #datarefuge #resist #resistance
  • AltEPA He can take our official Twitter but he’ll never take our FREEDOM. UNOFFICIALLY resisting.
  • altEPA The Unofficial “Resistance” team of U.S. Environmental Protection Agency. Not taxpayer subsidised! Environmental conditions may vary from alternative facts.
  • AltFDA Uncensored FDA
  • AltGlacierNPS The unofficial Twitter site for Glacier National Park of Science Fact.
  • AltHot Springs NP The Resistance Account of America’s First Resort and Preserve. Account Run By Friends of HSNP.
  • AltLassenVolcanicNP The Unofficial “Resistance” team. Within peaceful mountain forests you will find hissing fumaroles and boiling mud pots and people ready to fight for science.
  • AltMountRainierNPS Unofficial “Resistance” Team from the Mount Rainier National Park Service. Protecting what’s important..
  • AltNASA The unofficial #resist team of the National Aeronautics and Space Administration.
  • AltOlympicNPS Unofficial resistance team of the Olympic National Park. protecting what’s important and fighting fascism with science.
  • AltRockyNPS Unofficial account that is being held for people associated with RMNP. DM if you might be interested in it.
  • AltUSARC USARC’s main duties are to develop an integrated national Arctic research policy and to assist in establishing an Arctic research plan to implement it.
  • AltUSDA Resisting the censorship of facts and science. Truth wins in the end.
  • AltUSForestService The unofficial, and unsanctioned, “Resistance” team for the U.S. Forest Service. Not an official Forest Service account, not publicly funded, citizen run.
  • AltUSFWS The Alt U.S. Fish Wildlife Service (AltUSFWS) is dedicated to the conservation, protection and enhancement of fish, wildlife and plants and their habitats
  • AltUSFWSRefuge The Alt U.S. Fish Wildlife Service (AltUSFWSRefuge) is dedicated to the conservation, protection and enhancement of fish, wildlife and plants and their habitats
  • ALTUSNatParkSer The Unofficial team of U.S. National Park Service. Not taxpayer subsidised! Come for rugged scenery, fossil beds, 89 million acres of landscape
  • AltUSNatParkService The Unofficial #Resistance team of U.S. National Park Service. Not taxpayer subsidised! Come for rugged scenery, facts & 89 million acres of landscape #climate
  • AltNWS The Unofficial Resistance team of U.S. National Weather Service. Not taxpayer subsidized! Come for non-partisan science-based weather, water, and climate info.
  • AltYellowstoneNatPar We are a group of employees and scientists in Yellowstone national park. We are here to continue providing the public with important information
  • AltYosemiteNPS “Unofficial” Resistance Team. Reporting facts & protecting what’s important!
  • Angry National Park Preserving the ecological and historical integrity of National Parks while also making them available and accessible for public use and enjoyment dammit all.
  • BadHombreLands NPS Unofficial feed of Badlands NP. Protecting rugged scenery, fossil beds, 244,000 acres of mixed-grass prairie & wildlife from two-bit cheetoh-hued despots.
  • BadlandsNPSFans Shmofficial fake feed of South Dakota’s Badlands National Park (Great Again™ Edition) Account not run by park employees, current or former, so leave them alone.
  • GlacierNPS The alternative Twitter site for Glacier National Park.
  • March for Science Planning a March for Science. Date TBD. We’ll let you know when official merchandise is out to cover march costs.
  • NOAA (uncensored)
  • Resistance_NASA We are a #Resist sect of the National Aeronautics and Space Administration.
  • Rogue NASA The unofficial “Resistance” team of NASA. Not an official NASA account. Not managed by gov’t employees. Come for the facts, stay for the snark.
  • NatlParksUnderground We post the information Donald Trump censors #FindYourPark #NPS100
  • NWS Podunk We’re the third wheel of forecast offices. We still use WSR-57. Winner of Biggest Polygon at the county fair. Not an actual NWS office…but we should be.
  • Rogue NOAA Research on our climate, oceans, and marine resources should be subject to peer [not political] review. *Not an official NOAA account*
  • Stuff EPA Would Say We post info that Donald Trump censors. We report what the U.S. Environmental Protection Agency would say. Chime in w/ #StuffEPAWouldSay
  • U.S. EPA – Ungagged Ungagged news, links, tips, and conversation that the U.S. Environmental Protection Agency is unable to tell you. Not directly affiliated with @EPA.
  • U.S. Science Service Uncensored & unofficial tweets re: the science happening at the @EPA, @USDA, @NatParkService, @NASA, @NOAA etc. #ClimateChangeIsReal #DefendScience

Why I Tweet by Donald Trump

Thursday, January 19th, 2017

David Uberti and Pete Vernon in The coming storm for journalism under Trump capture why Donald Trump tweets:


As Trump explained the retention of his personal Twitter handle to the Sunday Times recently: “I thought I’d do less of it, but I’m covered so dishonestly by the press—so dishonestly—that I can put out Twitter…I can go bing bing bing and I just keep going and they put it on and as soon as I tweet it out—this morning on television, Fox: Donald Trump, we have breaking news.

In order for Trump tweets to become news, two things are required:

  1. Trump tweets (quite common)
  2. Media evaluates the tweets to be newsworthy (should be less common)

Tweets reported as newsworthy are unlikely to match the sheer volume of Trump’s tweeting.

You have all read:

[Image: Donald Trump's tweet about Saturday Night Live]

Is Trump’s opinion, to which he is entitled, about Saturday Night Live newsworthy?

Trump on television is as trustworthy as the “semi-literate one-legged man” Dickens quoted for the title “Our Mutual Friend” is on English grammar. (Modern American Usage by Wilson Follett, edited by Jacques Barzun. Under the entry for “mutual friend.”)

Other examples abound, but suffice it to say the media needs to make its own judgments about what is newsworthy.

Otherwise the natters of another semi-literate become news by default for the next four years.

Online Database of “Verified” Twitter Accounts (Right On!)

Friday, January 6th, 2017

The WikiLeaks Task Force tweeted on 6 Jan. 2017:

We are thinking of making an online database with all “verified” twitter accounts & their family/job/financial/housing relationships.

Of the comments on this tweet, the ones containing “dox,” “doxx,” “doxing,” “creepy,” “evil,” etc. should be ignored.

Ignored because intelligence agencies, news organizations, merchants, banks, etc. are all collecting and organizing that data and more.

Ignored because the public should not preemptively disarm itself.

If anything, the WikiLeaks Task Force should start with “verified” Twitter accounts and expand outwards, rapidly.

The public should be able to rapidly find relationships of individuals nominated for office, who contribute money to candidates, who profit from contracts, who launder public money. The public should have the same advantages intelligence agencies enjoy today.

To the nay-sayers to the WikiLeaks Task Force proposal:

Why do you seek to prevent putting the public on a better footing vis-a-vis government?

Question to my readers: What do the nay-sayers gain from a disarmed public?

Three More Reasons To Learn R

Friday, January 6th, 2017

Three reasons to learn R today by David Smith.

From the post:

If you're just getting started with data science, the Sharp Sight Labs blog argues that R is the best data science language to learn today.

The blog post gives several detailed reasons, but the main arguments are:

  1. R is an extremely popular (arguably the most popular) data programming language, and ranks highly in several popularity surveys.
  2. Learning R is a great way of learning data science, with many R-based books and resources for probability, frequentist and Bayesian statistics, data visualization, machine learning and more.
  3. Python is another excellent language for data science, but with R it's easier to learn the foundations.

Once you've learned the basics, Sharp Sight also argues that R is also a great data science language to master, even though it's an old language compared to some of the newer alternatives. Every tool has a shelf life, but R isn't going anywhere and learning R gives you a foundation beyond the language itself.

If you want to get started with R, Sharp Sight Labs offers a data science crash course. You might also want to check out the Introduction to R for Data Science course on EdX.

Sharp Sight Labs: Why R is the best data science language to learn today, and Why you should master R (even if it might eventually become obsolete)

If you need more reasons to learn R:

  • Unlike Facebook, R isn’t a sinkhole of non-testable propositions.
  • Unlike Instagram, R is rarely NSFW.
  • Unlike Twitter, R is a marketable skill.

Glad to hear you are learning R!

Mining Twitter Data with Python [Trump Years Ahead]

Wednesday, December 21st, 2016

Marco Bonzanini, author of Mastering Social Media Mining with Python, has a seven part series of posts on mining Twitter with Python.

If you haven’t been mining Twitter before now, President-elect Donald Trump is about to change all that.

What if Trump continues to tweet as President and authorizes his appointees to do the same? Spontaneity isn’t the same thing as openness but it could prove to be interesting.

Auto Trump fact-checks – Alternative to Twitter Censorship

Monday, December 19th, 2016

Washington Post automatically inserts Trump fact-checks into Twitter by Sam Machkovech.

From the post:

In an apparent first for any American news outlet, the Washington Post released a Chrome plug-in on Friday designed to fact-check posts from a single Twitter account. Can you guess which one?

The new “RealDonaldContext” plug-in for the Google Chrome browser, released by WaPo reporter Philip Bump, adds fact-check summaries to selected posts by President-elect Donald Trump. Users will need to click a post in The Donald’s Twitter feed to see any fact-check information from the Washington Post, which appears as a gray text box beneath the tweet.

I differ with the Washington Post on its slavish reporting of unsubstantiated claims of the US intelligence community, but high marks for the “RealDonaldContext” plug-in for the Google Chrome browser!

What a great alternative to censoring “fake news” on Twitter! Fact check it!

Pointers to source code for similar plug-ins?

The Twitterverse of Donald Trump, in 26,234 Tweets

Tuesday, December 13th, 2016

The Twitterverse of Donald Trump, in 26,234 Tweets by Lam Thuy Vo.

From the post:


We wanted to get a better idea of where President-elect Donald Trump gets his information. So we analyzed everything he has tweeted since he launched his campaign to take a look at the links he has shared and the news sources they came from.

Step-by-step guide to the software and analysis of Trump’s tweets!
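If you want a feel for the kind of analysis described, the sketch below counts which domains appear in the links of a set of tweets. It is not Lam Thuy Vo's code. The sample texts and the name `link_domains` are hypothetical, and it assumes you already have the expanded URLs, since links on Twitter are normally wrapped in t.co short links.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simple pattern for http/https links; adequate for a sketch, not for every URL.
URL_RE = re.compile(r"https?://\S+")


def link_domains(texts):
    """Count the host names of all links found in the given tweet texts."""
    counts = Counter()
    for text in texts:
        for url in URL_RE.findall(text):
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]  # treat www.example.com and example.com alike
            if host:
                counts[host] += 1
    return counts


# Hypothetical sample with already-expanded links.
sample = [
    "Great story https://www.example.com/a and http://news.example.org/b",
    "Again: https://example.com/c",
    "No links here",
]
top = link_domains(sample).most_common()
for host, n in top:
    print(host, n)
```

Swap in the tweets of whichever public figure you decide to track and the ranking of domains gives a first look at where their links come from.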

Excellent!

Follow: @lamthuyvo.

Which public figure’s tweets are you going to track/analyze?

Gab – Censorship Lite?

Tuesday, November 29th, 2016

I submitted my email today at Gab and got this message:

Done! You’re #1320420 in the waiting list.

Only three rules:

Illegal Pornography

We have a zero tolerance policy against illegal pornography. Such material will be instantly removed and the owning account will be dealt with appropriately per the advice of our legal counsel. We reserve the right to ban accounts that share such material. We may also report the user to local law enforcement per the advice our legal counsel.

Threats and Terrorism

We have a zero tolerance policy for violence and terrorism. Users are not allowed to make threats of, or promote, violence of any kind or promote terrorist organizations or agendas. Such users will be instantly removed and the owning account will be dealt with appropriately per the advice of our legal counsel. We may also report the user to local and/or federal law enforcement per the advice of our legal counsel.

What defines a ‘terrorist organization or agenda’? Any group that is labelled as a terrorist organization by the United Nations and/or United States of America classifies as a terrorist organization on Gab.

Private Information

Users are not allowed to post other’s confidential information, including but not limited to, credit card numbers, street numbers, SSNs, without their expressed authorization.

If Gab is listening, I can get the rules down to one:

Court Ordered Removal

When Gab receives a court order from a court of competent jurisdiction ordering the removal of identified, posted content, at (service address), the posted, identified content will be removed.

Simple, fair, gets Gab and its staff out of the censorship business and provides a transparent remedy.

At no cost to Gab!

What’s there not to like?

Gab should review my posts: Monetizing Hate Speech and False News and Preserving Ad Revenue With Filtering (Hate As Renewal Resource), while it is in closed beta.

Twitter and Facebook can keep spending uncompensated time and effort trying to be universal and fair censors. Gab has the opportunity to reach up and grab those $100 bills flying overhead for filtered news services.

What is the New York Times if not an opinionated and poorly run filter on all the possible information it could report?

Apply that same lesson to social media!

PS: Seriously, before going public, I would go to the one court-based rule on content. There’s no profit and no wins in censoring any content on your own. Someone will always want more or less. Courts get paid to make those decisions.

Check with your lawyers but if you don’t look at any content, you can’t be charged with constructive notice of it. Unless and until someone points it out, then you have to follow DMCA, court orders, etc.

Preserving Ad Revenue With Filtering (Hate As Renewal Resource)

Monday, November 21st, 2016

Facebook and Twitter haven’t implemented robust and shareable filters for their respective content streams for fear of disturbing their ad revenue streams.* The power to filter is feared as the power to exclude ads.

Other possible explanations include: Drone employment, old/new friends hired to discuss censoring content; Hubris, wanting to decide what is “best” for others to see and read; NIH (not invented here), which explains silence concerning my proposals for shareable content filters; others?

* Lest I be accused of spreading “fake news,” my explanation for the lack of robust and shareable filters on content on Facebook and Twitter is based solely on my analysis of their behavior and not any inside leaks, etc.

I have a solution for fearing filters as interfering with ad revenue.

All Facebook posts and Twitter tweets will be delivered with an additional Boolean field, ad, which defaults to true (empty field), meaning the content can be filtered (following Clojure). When the field is false, that content cannot be filtered.

Once filters are registered and shared via Facebook and Twitter, testing those filters for proper operation (and not applying them if they filter ad content) is a purely algorithmic process.

Users pay to post ad content, a step where the false flag can be entered, resulting in no more ad freeloaders being free from filters.
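A minimal sketch of how such a flag could behave, assuming made-up names throughout (`Post`, `apply_filter`, `filter_is_valid`); none of this is an actual Facebook or Twitter API. The field is called `ad` as in the proposal above, with true meaning "may be filtered" and false meaning "paid ad, never filtered."

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Post:
    text: str
    # As proposed above: True (the default) means the post may be filtered;
    # False marks paid ad content that no filter may remove.
    ad: bool = True


# A filter returns True when it wants a post hidden (hypothetical convention).
Filter = Callable[[Post], bool]


def apply_filter(posts: List[Post], hide: Filter) -> List[Post]:
    """Return what the user sees: paid ads always survive, the rest pass the filter."""
    return [p for p in posts if not p.ad or not hide(p)]


def filter_is_valid(hide: Filter, sample: List[Post]) -> bool:
    """Registration check: reject any filter that would hide a paid ad in the sample."""
    return not any(hide(p) for p in sample if not p.ad)


stream = [
    Post("Buy our widgets", ad=False),  # paid ad
    Post("I hate widgets"),
    Post("Lunch was good"),
]
hide_widgets: Filter = lambda p: "widgets" in p.text.lower()
visible = apply_filter(stream, hide_widgets)
```

In this sketch the paid ad stays visible even though the filter matches it, and `filter_is_valid` would refuse to register `hide_widgets` as a shared filter because it touches ad content.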

What’s my interest? I’m interested in the creation of commercial filters for aggregation, exclusion and creating a value-add product based on information streams. Moreover, ending futile and bigoted attempts at censorship seems like a worthwhile goal to me.

The revenue potential for filters is nearly unlimited.

The number of people who hate rivals the number who want to filter the content seen by others. An unrestrained Facebook/Twitter will attract more hate and “fake news,” which in turn will drive a great need for filters.

Not a virtuous cycle but certainly a profitable one. Think of hate and the desire to censor as renewable resources powering that cycle.

PS: I’m not an advocate for hate and censorship but they are both quite common. Marketing is based on consumers as you find them, not as you wish they were.

Mute Account vs. Mute Word/Hashtag – Ineffectual Muting @Twitter

Thursday, November 17th, 2016

I mentioned yesterday the distinction between muting an account versus the new muting by word or #hashtag at Twitter.

Take a moment to check my sources at Twitter support to make sure I have the rules correctly stated. I’ll wait.

(I’m not a journalist but readers should be able to satisfy themselves that the claims I make are at least plausible.)

No feedback from Twitter on the don’t appear in your timeline vs. do appear in your timeline distinction.

Why would I want to only block notifications of what I think of as hate speech and still have those tweets in my timeline?

Then it occurred to me:

If you can block tweets from appearing in your timeline by word or hashtag, you can block advertising tweets from appearing in your timeline.

You cannot effectively mute hate speech @Twitter because you could also mute advertising.

What about it Twitter?

Must feminists, people of color, minorities of all types be subjected to hate speech in order to preserve your revenue streams?


Not that I object to Twitter having revenue streams from advertising but it needs to be more sophisticated than the Nigerian spammer model now in use. Charge a higher price for targeted advertising that users are unlikely to block.

For example, I would be highly unlikely to block ads for CS theory/semantic integration tomes. On the other hand, I would follow a mute list that blocked histories of famous cricket matches. (Apologies to any cricket players in the audience.)

In my post: Twitter Almost Enables Personal Muting + Roving Citizen-Censors I offer a solution that requires only minor changes based on data Twitter already collects plus regexes for muting. It puts what you see entirely in the hands of users.

That enables Twitter to get out of the censorship business altogether, something it doesn’t do well anyway, and puts users in charge of what they see. A win-win from my perspective.

Alt-right suspensions lay bare Twitter’s consistency [hypocrisy] problem

Thursday, November 17th, 2016

Alt-right suspensions lay bare Twitter’s consistency problem by Nausicaa Renner.

From the post:

TWITTER SUSPENDED A NUMBER OF ACCOUNTS associated with the alt-right, USA Today reported this morning. This move was bound to be divisive: While Twitter has banned and suspended users in the past (prominently, Milo Yiannopoulos for incitement), USA Today points out the company has never suspended so many at once—at least seven in this case. Richard Spencer, one of the suspended users and prominent alt-righter, also had a verified account on Twitter. He claims, “I, and a number of other people who have just got banned, weren’t even trolling.”

If this is true, it would be a powerful political statement, indeed. As David Frum notes in The Atlantic, “These suspensions seem motivated entirely by viewpoint, not by behavior.” Frum goes on to argue that a kingpin strategy on Twitter’s part will only strengthen the alt-right’s audience. But we may never know Twitter’s reasoning for suspending the accounts. Twitter declined to comment on its moves, citing privacy and security reasons.

(emphasis in original)

Contrary to the claims of the Southern Poverty Law Center (SPLC) to Twitter, these users may not have been suspended for violating Twitter’s terms of service, but for their viewpoints.

Like the CIA, FBI and NSA, Twitter uses secrecy to avoid accountability and transparency for its suspension process.

The secrecy – avoidance of accountability/transparency pattern is one you should commit to memory. It is quite common.

Twitter needs to develop better muting options for users and abandon account suspension (save on court order) altogether.

Twitter Almost Enables Personal Muting + Roving Citizen-Censors

Wednesday, November 16th, 2016

Investigating news reports of Twitter enabling muting of words and hashtags led me to Advanced muting options on Twitter. Also relevant is Muting accounts on Twitter.

Alex Hern‘s post: Twitter users to get ability to mute words and conversations prompted this search because I found:

After nine years, Twitter users will finally be able to mute specific conversations on the site, as well as filter out all tweets with a particular word or phrase from their notifications.

The much requested features are being rolled out today, according to the company. Muting conversations serves two obvious purposes: users who have a tweet go viral will no longer have to deal with thousands of replies from strangers, while users stuck in an interminable conversation between people they don’t know will be able to silently drop out of the discussion.

A broader mute filter serves some clear general uses as well. Users will now be able to mute the names of popular TV shows, for instance, or the teams playing in a match they intend to watch later in the day, from showing up in their notifications, although the mute will not affect a user’s main timeline. “This is a feature we’ve heard many of you ask for, and we’re going to keep listening to make it better and more comprehensive over time,” says Twitter in a blogpost.

to be too vague to be useful.

Starting with Advanced muting options on Twitter, you don’t have to read far to find:

Note: Muting words and hashtags only applies to your notifications. You will still see these Tweets in your timeline and via search. The muted words and hashtags are applied to replies and mentions, including all interactions on those replies and mentions: likes, Retweets, additional replies, and Quote Tweets.

That’s the second paragraph and it is displayed with a highlighted background.

So, “muting” of words and hashtags only stops notifications.

“Muted” offensive or inappropriate content is still visible “in your timeline and search.”

Perhaps really muting based on words and hashtags will be a paid subscription feature?

The other curious aspect is that “muting” an account carries an entirely different meaning.

The first sentence in Muting accounts on Twitter reads:

Mute is a feature that allows you to remove an account’s Tweets from your timeline without unfollowing or blocking that account.

Quick Summary:

  • Mute account – Tweets don’t appear in your timeline.
  • Mute by word or hashtag – Tweets do appear in your timeline.

How lame is that?

Solution That Avoids Censorship

The solution to Twitter’s “hate speech,” which means different things to different people, isn’t hard to imagine:

  1. Mute by account, word, hashtag or regex – Tweets don’t appear in your timeline.
  2. Mute lists can be shared and/or followed by others.

Which means that if I trust N’s judgment on “hate speech,” I can follow their mute list. That saves me the effort of constructing my own mute list and perhaps even encourages the construction of public mute lists.
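A minimal sketch of the two-point solution above, assuming a hypothetical `MuteList` structure and `timeline` function (neither is a Twitter feature or API). Words and hashtags are treated as simple regexes, and following someone else's mute list just means applying it alongside your own.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Tweet = Tuple[str, str]  # (author handle, tweet text)


@dataclass
class MuteList:
    owner: str
    accounts: Set[str] = field(default_factory=set)
    patterns: List[str] = field(default_factory=list)  # words, hashtags or regexes

    def matches(self, author: str, text: str) -> bool:
        """True when the tweet should be muted under this list."""
        if author.lower() in {a.lower() for a in self.accounts}:
            return True
        return any(re.search(p, text, re.IGNORECASE) for p in self.patterns)


def timeline(tweets: List[Tweet], own: MuteList, followed: List[MuteList]) -> List[Tweet]:
    """Drop every tweet matched by the user's own list or any list they follow."""
    lists = [own, *followed]
    return [(a, t) for a, t in tweets if not any(m.matches(a, t) for m in lists)]


mine = MuteList("me", patterns=[r"#spoilers?\b"])
trusted = MuteList("N", accounts={"troll42"}, patterns=[r"\bcricket\b"])

tweets = [
    ("alice", "New paper on semantic integration"),
    ("troll42", "hello"),
    ("bob", "Famous cricket matches of 1932"),
    ("carol", "No #spoiler please"),
]
seen = timeline(tweets, mine, [trusted])
```

Here only the tweet from alice reaches the timeline. Unfollowing N's list brings back the other two tweets that only N's list muted, and nothing is removed from Twitter itself.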

Twitter has the technical capability to produce such a solution in short order, so you have to wonder why they haven’t. I have no delusion of being the first person to have imagined such a solution. Twitter? Comments?

The Alternative Solution – Roving Citizen-Censors

The alternative to a clean and non-censoring solution is covered in the USA Today report Twitter suspends alt-right accounts:

Twitter suspended a number of accounts associated with the alt-right movement, the same day the social media service said it would crack down on hate speech.

Among those suspended was Richard Spencer, who runs an alt-right think tank and had a verified account on Twitter.

The alt-right, a loosely organized group that espouses white nationalism, emerged as a counterpoint to mainstream conservatism and has flourished online. Spencer has said he wants blacks, Asians, Hispanics and Jews removed from the U.S.

[I personally find Richard Spencer’s views abhorrent and report them here only by way of example.]

From the report, Twitter didn’t go gunning for Richard Spencer’s account but the Southern Poverty Law Center (SPLC) did.

The SPLC didn’t follow more than 100 white supremacists to counter their outlandish claims or to offer a counter-narrative. They followed to gather evidence of alleged violations of Twitter’s terms of service and to request removal of those accounts.

Government censorship of free speech is bad enough; enabling roving bands of self-righteous citizen-censors to do the same is even worse.

The counter-claim that Twitter isn’t the government, it’s not censorship, etc., is intellectually and morally dishonest. It is technically true in the U.S. constitutional law sense, but suppression of speech is the goal and that’s censorship, whatever fig leaf the SPLC wants to put on it. They should be honest enough to claim and defend the right to censor the speech of others.

I would not vote in their favor, that is, to say they have a right to censor the speech of others. They are free to block speech they don’t care to hear, which is what my solution to “hate speech” on Twitter enables.

Support muting, not censorship or roving bands of citizen-censors.

Debate Night Twitter: Analyzing Twitter’s Reaction to the Presidential Debate

Sunday, November 6th, 2016

Debate Night Twitter: Analyzing Twitter’s Reaction to the Presidential Debate by George McIntire.

A bit dated content-wise but George covers techniques, from data gathering to analysis, useful for future events. Possible Presidential inauguration riots on January 20, 2017, for example. Or the 2017 Super Bowl, where Lady Gaga will be performing.

From the post:

This past Sunday, Donald Trump and Hillary Clinton participated in a town hall-style debate, the second of three such events in this presidential campaign. It was an extremely contentious affair that reverberated across social media.

The political showdown was massively anticipated; the negative atmosphere of the campaign and last week’s news of Trump making lewd comments about women on tape certainly contributed to the fire. Trump further escalated the immense tension by holding a press conference with women who’ve accused former President Bill Clinton of abusing.

With having a near unprecedented amount of attention and hostility, I wanted to gauge Twitter’s reaction to the event. In this project, I streamed tweets under the hashtag #debate and analyzed them to discover trends in Twitter’s mood and how users were reacting to not just the debate overall but to certain events in the debate.

What techniques will you apply to your tweet data sets?
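As one starting point, here is a minimal sketch of bucketing tweets by minute to see when volume (or mentions of a term) spikes. It is not George's code. The sample rows are made up, and the timestamp format is an assumption; tweets from the API would need their own date parsing.

```python
from collections import Counter
from datetime import datetime


def per_minute(tweets, term=None):
    """Count tweets per HH:MM bucket, optionally only those mentioning `term`."""
    counts = Counter()
    for created_at, text in tweets:
        if term is None or term.lower() in text.lower():
            # Assumed timestamp layout; adjust to match your own data set.
            stamp = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
            counts[stamp.strftime("%H:%M")] += 1
    return dict(counts)


# Made-up sample rows for illustration only.
sample = [
    ("2016-10-09 21:05:12", "Here we go #debate"),
    ("2016-10-09 21:05:48", "Trump interrupts again #debate"),
    ("2016-10-09 21:06:03", "Clinton on emails #debate"),
]
volume = per_minute(sample)
trump_mentions = per_minute(sample, "trump")
```

Comparing the per-minute counts for different terms against a debate transcript is one simple way to tie reactions to particular moments.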

How To Use Twitter to Learn Data Science (or anything)

Wednesday, November 2nd, 2016

How To Use Twitter to Learn Data Science (or anything) by Data Science Renee.

Judging from the date on the post (May 2016), Renee’s enthusiasm for Twitter came before her recently breaking 10,000 followers on Twitter. (Congratulations!)

The one thing I don’t see Renee mentioning is the use of your own Twitter account to gain experience with a whole range of data mining tools.

Your Twitter feed will quickly out-strip your ability to “keep up,” so how do you propose to deal with that problem?

Renee suggests limiting examination of your timeline (in part), but have you considered using machine learning to assist you?

Or visualizing your areas of interests or people that you follow?

Indexing resources pointed to in tweets?

NLP processing of tweets?

Every tool of data science that you will be using for clients is relevant to your own Twitter feed.
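For instance, indexing the resources pointed to in tweets can start as small as the sketch below. It assumes you already have tweets as (id, text) pairs with expanded URLs in the text; the function name and sample data are made up for illustration.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")


def index_links(tweets):
    """Map each linked host to the ids of the tweets that point at it."""
    index = defaultdict(list)
    for tweet_id, text in tweets:
        for url in URL_RE.findall(text):
            # Strip trailing punctuation that often clings to URLs in prose.
            host = urlparse(url.rstrip(".,)")).netloc.lower()
            index[host].append(tweet_id)
    return dict(index)


# Made-up sample tweets.
sample = [
    (1, "Great read https://example.org/a."),
    (2, "See http://example.com/x and https://example.org/b"),
    (3, "no links here"),
]
links = index_links(sample)
```

From there the same index can feed a visualization of which sources your timeline leans on, or a queue of pages to fetch and search.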

What better way to learn tools than using them on content that interests you?

Enjoy!

BTW, follow Data Science Renee for a broad range of data science tools and topics!

Monetizing Twitter Trolls

Sunday, October 23rd, 2016

Alex Hern‘s coverage of Twitter’s fail-to-sell story, Did trolls cost Twitter $3.5bn and its sale?, is a typical short-on-facts story about abuse on Twitter.

When I say short on facts, I don’t deny any of the anecdotal accounts of abuse on Twitter and other social media.

Here’s the data problem with abuse at Twitter:

As of May 2016, Twitter had 310 million monthly active users across 1.3 billion accounts.

Number of Twitter users who are abusive (trolls): unknown

Number of Twitter users who are victims: unknown

Number of abusive tweets, daily/weekly/monthly: unknown

Type/frequency of abusive tweets, language, images, disclosure: unknown

Costs to effectively control trolls: unknown

Trolls and abuse should be opposed both at Twitter and elsewhere, but without supporting data, creating corporate priorities and revenues to effectively block (not end, block) abuse isn’t possible.

Since troll hunting at present is a drain on the bottom line with no return for Twitter, what if Twitter were to monetize its trolls?

That is, create a mechanism whereby trolls become the drivers of a revenue stream for Twitter.

One such approach would be to throw off all the filtering that Twitter does as part of its basic service. If you have Twitter basic service, you will see posts from everyone from committed jihadists to the Federal Reserve. No blocked accounts, no deleted accounts, etc.

Twitter removes material under direct court order only. Put the burden and expense of going to court for every tweet on both individuals and governments. No exceptions.

Next, Twitter creates the Twitter+ account, where for an annual fee, users can access advanced filtering that includes blocking people, language, image analysis of images posted to them, etc.

Price point experiments should set the fees for Twitter+ accounts. Filtering will be a decision based on real revenue numbers. Not flights of fancy by the Guardian or Salesforce.

BTW, the open Twitter I suggest creates more eyes for ads, which should also improve the bottom line at Twitter.

An “open” Twitter will attract more trolls and drive more users to Twitter+ accounts.

Twitter trolls generate the revenue to fight them.

I rather like that.

You?

Twitter Logic: 1 call on Github v. 885,222 calls on Twitter

Sunday, October 23rd, 2016

Chris Albon’s collection of 885,222 tweets (ids only) for the third presidential debate of 2016 proves bad design decisions aren’t only made inside the Capital Beltway.

Chris could not post his tweet collection, only the tweet ids under Twitter’s terms of service.

The terms of service reference the Developer Policy and under that policy you will find:


F. Be a Good Partner to Twitter

1. Follow the guidelines for using Tweets in broadcast if you display Tweets offline.

2. If you provide Content to third parties, including downloadable datasets of Content or an API that returns Content, you will only distribute or allow download of Tweet IDs and/or User IDs.

a. You may, however, provide export via non-automated means (e.g., download of spreadsheets or PDF files, or use of a “save as” button) of up to 50,000 public Tweets and/or User Objects per user of your Service, per day.

b. Any Content provided to third parties via non-automated file download remains subject to this Policy.
…(emphasis added)

Just to be clear, I find Twitter extremely useful for staying current on CS research topics and think developers should be “…good partners to Twitter.”

However, Chris is prohibited from posting a data set of 885,222 tweets on Github, where users could download it with no impact on Twitter. Instead, every user who wants to explore that data set must submit 885,222 requests to Twitter servers.

Having one hit on Github for 885,222 tweets versus 885,222 on Twitter servers sounds like being a “good partner” to me.

Multiply that by all the researchers who are building Twitter data sets and the drain on Twitter resources grows without any benefit to Twitter.
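A back-of-the-envelope sketch of that multiplication. The tweet count comes from Chris's collection; the number of researchers is a made-up figure, and `ids_per_request` is a hypothetical knob that defaults to the one-request-per-tweet premise used above.

```python
import math

TWEETS = 885_222  # size of Chris Albon's collection, from the post


def twitter_requests(researchers: int, ids_per_request: int = 1) -> int:
    """Requests hitting Twitter if every researcher rebuilds the data set from ids."""
    return researchers * math.ceil(TWEETS / ids_per_request)


def github_downloads(researchers: int) -> int:
    """Requests hitting Github if the full data set could be posted once."""
    return researchers


researchers = 100  # hypothetical number of people exploring the data set
on_twitter = twitter_requests(researchers)
on_github = github_downloads(researchers)
```

With those assumptions, 100 researchers mean 88,522,200 requests to Twitter versus 100 downloads from Github. Batching ids per request would shrink the first number, but the load would still land on Twitter's servers.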

It’s true that someday Twitter might be able to monetize references to its data collections, but server and bandwidth expenses are present line items in their budget.

Enabling the distribution of full tweet datasets is one step towards improving their bottom line.

PS: Please share this with anyone you know at Twitter. Thanks!

Political Noise Data (Tweets From 3rd 2016 Presidential Debate)

Sunday, October 23rd, 2016

Chris Albon has collected data on 885,222 debate tweets from the third Presidential Debate of 2016.

As you can see from the transcript, it wasn’t a “debate” in any meaningful sense of the term.

The quality of tweets about that debate is equally questionable.

However, the people behind those tweets vote, buy products, click on ads, etc., so despite my title description as “political noise data,” it is important political noise data.

To conform to Twitter terms of service, Chris provides the relevant tweet ids and a script to enable construction of your own data set.

BTW, Chris includes his Twitter mining scripts.

Enjoy!

ISIS Turns To Telegram App After Twitter Crackdown [Farce Alert + My Telegram Handle]

Monday, August 29th, 2016

ISIS Turns To Telegram App After Twitter Crackdown

From the post:

With the micro-blogging site Twitter coming down heavily on ISIS-sponsored accounts, the terrorist organisation and its followers are fast joining the heavily-encrypted messaging app Telegram built by a Russian developer.

On Telegram, the ISIS followers are laying out detailed plans to conduct bombing attacks in the west, voanews.com reported on Monday.

France and Germany have issued statements that they now want a crackdown against them on Telegram.

“Encrypted communications among terrorists constitute a challenge during investigations. Solutions must be found to enable effective investigation… while at the same time protecting the digital privacy of citizens by ensuring the availability of strong encryption,” the statement said.

Really?

Oh, did you notice the source? “Voanews.com reported on Monday.”

If you skip over to that post: IS Followers Flock to Telegram After being Driven from Twitter (I don’t want to shame the author so omitting their name), it reads in part:

With millions of IS loyalists communicating with one another on Telegram and spreading their message of radical Islam and extremism, France and Germany last week said that they want a continent wide effort to allow for a crackdown on Telegram.

“Encrypted communications among terrorists constitute a challenge during investigations,” France and Germany said in a statement. “Solutions must be found to enable effective investigation… while at the same time protecting the digital privacy of citizens by ensuring the availability of strong encryption.”

On private Telegram channels, IS followers have laid out detailed plans to poison Westerners and conduct bombing attacks, reports say.

What? “…millions of IS loyalists…?” IS in total is about 30K active fighters, maybe. Millions of loyalists? Documentation? Citation of some sort? Being the Voice of America, I’d say they pulled that number out of a dark place.

Meanwhile, while complaining about the strong encryption, they are party to:

detailed plans to poison Westerners and conduct bombing attacks, reports say.

You do know wishing Westerners would choke on their Fritos doesn’t constitute a plan. Yes?

Neither does wishing to have an unspecified bomb, to be exploded at some unspecified location, at no particular time, constitute planning.

Not to mention that “reports say” is a euphemism for: “…we just made it up.”

Get yourself to Telegram!

[Images: telegram-01-460, telegram-03-460]

They left out my favorite:

Annoy governments seeking to invade a person’s privacy.

Reclaim your privacy today! Telegram!


Caveat: I tried using one device for the SMS to set up my smartphone. Nada, nyet, no joy. Had to use my cellphone number to set up the account on the cellphone. OK, but annoying.

BTW, on Telegram, my handle is @PatrickDurusau.

Yes, my real name. Which excludes this account from anything requiring OpSec. 😉