Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 27, 2016

Is Conduct/Truth A Defense to Censorship?

Filed under: Censorship,Facebook,Twitter — Patrick Durusau @ 4:50 pm

While Twitter sets up its Platonic panel of censors (Plato’s Republic, Books 2/3)*, I am wondering if conduct/truth will be a defense to censorship for accounts that make positive posts about the Islamic State?

I ask because of a message suggesting accounts (Facebook?) might be suspended for posts following these rules:

  • Do no use foul language and try to not get in a fight with people
  • Do not write too much for people to read
  • Make your point easy as not everyone has the same knowledge as you about the Islamic state and/or Islam
  • Use a VPN…
  • Use an account that you don’t really need because this is like a martydom operation, your account will probably be banned
  • Post images supporting the Islamic state
  • Give positive facts about the Islamic state
  • Share Islamic state video’s that show the mercy and kindness of the Islamic state towards Muslims, and/or showing Muslim’s support towards the Islamic state. Or any videos that will attract people to the Islamic state
  • Prove rumors about the Islamic state false
  • Give convincing Islamic information about topics discussed like the legitimacy of the khilafa, killing civilians of the kuffar, the takfeer made on Arab rules, etc.
  • Or simply just post a short quick comment showing your support like “dawlat al Islam baqiaa” or anything else (make sure ppl can understand it
  • Remember to like all the comments you see that are supporting the Islamic state with all your accounts!

Posted (but not endorsed) by J. Faraday on 27 February 2016.

If we were to re-cast those as rules of conduct, non-Islamic State specific, where N is the issue under discussion:

  • Do not use foul language and try to not get in a fight with people
  • Do not write too much for people to read
  • Make your point easy [to understand] as not everyone has the same knowledge as you about N
  • Post images supporting N
  • Give positive facts about N
  • Share N videos that show the mercy and kindness of N, and/or showing A support towards N. Or any videos that will attract people to N
  • Prove rumors about N false
  • Give convincing N information about topics discussed
  • Or simply just post a short quick comment showing your support or anything else (make sure ppl can understand it)
  • Remember to like all the comments you see that are supporting N with all your accounts!

Is there something objectionable about those rules when N = Islamic State?

As for being truthful, take for example claims by the Islamic State that Arab governments are corrupt. We can’t rely on a corruption index that lists Qatar at #22 (Denmark is #1 as the least corrupt) and Saudi Arabia at #48, when Bloomberg lists Qatar and Saudi Arabia as scoring zero (0) on budget transparency.

There are more corrupt governments than Qatar and Saudi Arabia, the failed state of Somalia for example, and perhaps the Sudan. Still, I wouldn’t ban anyone for saying both Qatar and Saudi Arabia are cesspools of corruption. They don’t match the structural corruption in Washington, D.C. but it isn’t for lack of trying.

Here is the question for Twitter’s Platonic guardians (Trust and Safety Council):

Can an account that follows the rules of behavior outlined above be banned for truthful posts?

I think we all know the answer but I’m interested in seeing if Twitter will admit to censoring factually truthful information.

* Someone very dear to me objected to my reference to Twitterists (sp?) as Stalinists. It was literary hyperbole and so not literally true. Perhaps “Platonic guardians” will be more palatable. Same outcome, just a different moniker.

February 19, 2016

How to find breaking news on Twitter

Filed under: News,Searching,Tweets,Twitter — Patrick Durusau @ 3:03 pm

How to find breaking news on Twitter by Ruben Bouwmeester, Julia Bayer, and Alastair Reid.

From the post:

By its very nature, breaking news happens unexpectedly. Simply waiting for something to start trending on Twitter is not an option for journalists – you’ll have to actively seek it out.

The most important rule is to switch perspectives with the eyewitness and ask yourself, “What would I tweet if I were an eyewitness to an accident or disaster?”

To find breaking news on Twitter you have to think like a person who’s experiencing something out of the ordinary. Eyewitnesses tend to share what they see unfiltered and directly on social media, usually by expressing their first impressions and feelings. Eyewitness media can include very raw language that reflects the shock felt as a result of the situation. These posts often include misspellings.

In this article, we’ll outline some search terms you can use in order to find breaking news. The list is not intended as exhaustive, but a starting point on which to build and refine searches on Twitter to find the latest information.

Great collections of starter search terms but those are going to vary depending on your domain of “breaking” news.

Good illustration of use of Twitter search operators.
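To make the operator point concrete, here is a minimal sketch of assembling a search string from eyewitness-style phrases. The phrases are my own placeholders, not the ones from the article, and the function name `build_search_query` is hypothetical. It assumes the commonly documented search operators: quoted exact phrases, `OR`, parentheses for grouping, `-filter:retweets` and `lang:`.

```python
def build_search_query(phrases, exclude_retweets=True, lang=None):
    """Combine eyewitness phrases into one Twitter search query string."""
    if not phrases:
        raise ValueError("at least one phrase is required")
    # Quote each phrase for exact matching and join the alternatives with OR.
    alternatives = " OR ".join(f'"{p}"' for p in phrases)
    # Group the alternatives only when there is more than one.
    parts = [f"({alternatives})"] if len(phrases) > 1 else [alternatives]
    if exclude_retweets:
        parts.append("-filter:retweets")  # keep original posts only
    if lang:
        parts.append(f"lang:{lang}")      # restrict to one language
    return " ".join(parts)


# Placeholder phrases; swap in terms suited to your own domain of "breaking" news.
query = build_search_query(["just saw", "oh my god", "explosion"], lang="en")
print(query)
```

Paste the resulting string into the Twitter search box and adjust from there. The phrase list is the part that varies by domain.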

Other collections of Twitter search terms?

February 15, 2016

Twitter Suspension Tracker

Filed under: Censorship,Tweets,Twitter — Patrick Durusau @ 2:36 pm

Twitter Suspension Tracker by Lee Johnstone.

From the about page:

This site (Twitter Suspension Monitor) was created to do one purpose, log and track suspended twitter accounts.

The system periodically checks marked suspended accounts for possible reactivation and remarks them accordingly. This allows the system to start tracking how many hours, days or even weeks and months a users twitter account got suspended for. Ontop of site submitted entrys Twitter Suspension Monitor also scrapes data directly from twitter in hope to find many more suspended accounts.

Not transparency but some reflected light on the Twitter account suspension process.

Tweets from suspended accounts disappear.

Stalin would have felt right at home with Twitter’s methods if not its ideology.

Here’s a photo of Stalin for the webpage of the Twitter Trust & Safety Council:

220px-CroppedStalin1943

Members of the Twitter Trust & Safety Council should use it as their twitter profile image. Enable all of us to identify Twitter censorship collaborators.

However urgent the current hysteria, censors are judged only one way in history.

Is that what you want for your legacy? Twitter, same question.

February 6, 2016

Are You A Scientific Twitter User or Polluter?

Filed under: Science,Twitter — Patrick Durusau @ 11:22 am

Realscientists posted this image to Twitter:

science

Self-Scoring Test:

In the last week, how often have you retweeted without “read[ing] the actual paper” pointed to by a tweet?

How many times did you retweet in total?

Formula: (retweets w/o reading / retweets in total) × 100 = % of retweets w/o reading.
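If you would rather let the machine do the arithmetic, here is a small sketch of the self-scoring formula. The function name and the sample counts are mine, made up for illustration.

```python
def unread_retweet_percentage(unread_retweets: int, total_retweets: int) -> float:
    """Percentage of retweets sent without reading the linked paper."""
    if total_retweets <= 0:
        raise ValueError("total_retweets must be positive")
    if not 0 <= unread_retweets <= total_retweets:
        raise ValueError("unread_retweets must be between 0 and total_retweets")
    return 100.0 * unread_retweets / total_retweets


# Made-up week: 24 retweets in total, 6 of them sent without reading.
score = unread_retweet_percentage(unread_retweets=6, total_retweets=24)
print(f"{score:.1f}% of retweets were sent without reading")
```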

No scale with superlatives because I don’t have numbers to establish a baseline for the “average” Twitter user.

I do know that I see click-bait, out-dated and factually wrong material retweeted by people who know better. That’s Twitter pollution.

Ask yourself: Am I a scientific Twitter user or a polluter?

Your call.

February 5, 2016

Is Twitter A Global Town Censor? (Data Project)

Filed under: Censorship,Free Speech,Government,Tweets,Twitter — Patrick Durusau @ 9:51 pm

Twitter Steps Up Efforts to Thwart Terrorists’ Tweets by Mike Isaac.

From the post:

For years, Twitter has positioned itself as a “global town square” that is open to discourse from all. And for years, extremist groups like the Islamic State have taken advantage of that stance, using Twitter as a place to spread their messages.

Twitter on Friday made clear that it was stepping up its fight to stem that tide. The social media company said it had suspended 125,000 Twitter accounts associated with extremism since the middle of 2015, the first time it has publicized the number of accounts it has suspended. Twitter also said it had expanded the teams that review reports of accounts connected to extremism, to remove the accounts more quickly.

“As the nature of the terrorist threat has changed, so has our ongoing work in this area,” Twitter said in a statement, adding that it “condemns the use of Twitter to promote terrorism.” The company said its collective moves had already produced results, “including an increase in account suspensions and this type of activity shifting off Twitter.”

The disclosure follows intensifying pressure on Twitter and other technology companies from the White House, presidential candidates like Hillary Clinton and government agencies to take more action to combat the digital practices of terrorist groups. The scrutiny has grown after mass shootings in Paris and San Bernardino, Calif., last year, because of concerns that radicalizations can be accelerated by extremist postings on the web and social media.

Just so you know what the Twitter rule is:

Violent threats (direct or indirect): You may not make threats of violence or promote violence, including threatening or promoting terrorism. (The Twitter Rules)

Here’s your chance to engage in real data science and help decide the question of whether Twitter has changed from global town square to global town censor.

Here’s the data gathering project:

Monitor all the Twitter streams for Republican and Democratic candidates for the U.S. presidency for tweets advocating violence/terrorism.

File requests with Twitter for those accounts to be replaced.

FYI: When you report a message (Reporting a Tweet or Direct Message for violations), it will disappear from your Messages inbox.

You must copy every tweet you report (accounts disappear as well) if you want to keep a record of your report.

Keep track of your reports and the tweet you copied before reporting.

Post the record of your reports and the tweets reported, plus any response from Twitter.

Suggestions on how to format these reports?
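To start the discussion, here is one possible shape for a report record, written out as one JSON object per line so that reports from different people can be concatenated. Every field name here is my own suggestion, not anything Twitter defines, and the sample values are placeholders.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ReportRecord:
    """One reported tweet, copied before reporting. All field names are hypothetical."""
    account: str        # screen name of the account reported
    tweet_url: str      # link to the tweet as it existed
    tweet_text: str     # full text, copied before filing the report
    reported_at: str    # ISO 8601 timestamp of the report
    rule_cited: str     # the Twitter rule the report relied on
    twitter_response: Optional[str] = None  # filled in if Twitter replies


def to_json_line(record: ReportRecord) -> str:
    """Serialize one record as a single JSON line, suitable for appending to a log."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


# Placeholder values only; no real account or tweet is implied.
record = ReportRecord(
    account="example_candidate",
    tweet_url="https://twitter.com/example_candidate/status/1",
    tweet_text="(text of the tweet, copied verbatim)",
    reported_at="2016-02-05T21:51:00Z",
    rule_cited="Violent threats (direct or indirect)",
)
print(to_json_line(record))
```

A plain line-per-record file keeps the bar low for contributors and still loads easily for part 2, the analysis.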

Or would you rather not know what Twitter is deciding for you?

How much data needs to be collected to move onto part 2 of the project – data analysis?


Suggestions on who at Twitter to contact for a listing of the 125,000 accounts that were silenced along with the Twitter history for each one? (Or the entire history of silenced accounts at Twitter? Who gets censored by topic, race, gender, location, etc., are all open questions.)

That could change the Twitter process from a black box to having marginally more transparency. You would have to guess at why any particular account was silenced.

If Twitter wants to take credit for censoring public discourse then the least it can do is be honest about who was censored and what they were saying to be censored.

Yes?

January 29, 2016

Twitter Graph Analytics From NodeXL (With Questions)

Filed under: Graphics,Graphs,NodeXL,Twitter — Patrick Durusau @ 4:35 pm

I’m sure you have seen this rather impressive Twitter graphic:

node-js-graph

And you can see a larger version, with a link to the interactive version here: https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=61591

Impressive visualization but…, tell me, what can you learn from these tweets about big data?

I mean, visualization is a great tool but if I am not better informed after using the visualization than before, what’s the point?

If you go to the interactive version, you will find lists derived from the data, such as “top 10 vertices, ranked by Betweenness Centrality,” top 10 URLs in the graph and groups in the graph, top domains in the graph and groups in the graph, etc.

None of which is evident from casual inspection of the graph. (Top influencers might be if I could get the interactive version to resize but difficult unless the step between #11 and #10 was fairly large.)
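For readers wondering what a “ranked by betweenness centrality” list involves, here is a minimal sketch using Brandes' algorithm on an unweighted, undirected graph. I have not checked how NodeXL computes its figures, so treat this as an illustration of the measure and not a reproduction of their method. The toy graph is invented, and every neighbour must also appear as a key.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    graph maps each vertex to an iterable of its neighbours.
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []                             # vertices in order of discovery
        predecessors = {v: [] for v in graph}
        paths = {v: 0 for v in graph}          # number of shortest paths from source
        distance = {v: -1 for v in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:            # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, farthest vertices first.
        dependency = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in centrality.items()}


# Invented mention graph: "b" sits between everyone else.
graph = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
scores = betweenness_centrality(graph)
top = sorted(scores, key=scores.get, reverse=True)[:10]
print(top[0], scores[top[0]])
```

The ranking falls out of a sort on the scores, which is exactly the kind of result the picture alone does not give you.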

Nothing wrong with eye candy but for touting the usefulness of visualization, let’s look for more intuitive visualizations.

I saw this particular version in a tweet by Kirk D. Borne.

January 28, 2016

Twitter Account Details

Filed under: Twitter — Patrick Durusau @ 8:00 pm

Kirk D. Borne tweeted this page which returns all of his Twitter details.

You can try mine as well, patrickDurusau.

The generic link: http://www.twitteraccountsdetails.com/

Enjoy!

BTW, if you aren’t already following Kirk D. Borne, you should be.

January 17, 2016

Map Of A Single Tweet – Not Suitable For Current Use

Filed under: Tweets,Twitter — Patrick Durusau @ 5:38 pm

I encountered a color-coded map of a single Tweet today:

tweet-map

Either select the image to see it full-size or follow the original link: http://online.wsj.com/public/resources/documents/TweetMetadata.pdf.

I haven’t done a detailed comparison against the Twitter API documentation but suffice it to say this map should not be cited and should be used only with caution.

I don’t think anything in the map is wrong, but it isn’t complete, missing, for example, possibly_sensitive, quoted_status_id, quoted_status_id_str, quoted_status and others.
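One quick way to spot gaps like these is to compare the keys of a tweet you have retrieved against the fields a map covers. A minimal sketch follows. The `mapped_fields` set is a small stand-in I made up, not a transcription of the WSJ map, and the sample tweet is invented.

```python
# Fields named above as absent from the map.
FIELDS_MISSING_FROM_MAP = {
    "possibly_sensitive",
    "quoted_status_id",
    "quoted_status_id_str",
    "quoted_status",
}


def unmapped_fields(tweet, mapped_fields):
    """Return the top-level keys of a tweet object that the map does not cover."""
    return sorted(set(tweet) - set(mapped_fields))


# Stand-in for the fields shown in a map (an assumption, for illustration only).
mapped_fields = {"id", "text", "created_at", "user"}

# Invented tweet object with two of the newer fields present.
sample_tweet = {
    "id": 1,
    "text": "example",
    "created_at": "Sun Jan 17 17:38:00 +0000 2016",
    "user": {"screen_name": "example"},
    "possibly_sensitive": False,
    "quoted_status_id": 2,
}

print(unmapped_fields(sample_tweet, mapped_fields))
```

This only checks top-level keys. Nested objects such as the user would need the same comparison applied one level down.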

Suggestions for an updated map of a single Tweet?

Even the out-dated map gives you a good idea of the richness of information that can be transmitted by a single tweet.

Makes me wonder who is using the 140 characters and/or additional data for open but secure communication?

January 7, 2016

Twitter Fighting Censorship? (Man Bites Dog Story?)

Filed under: Censorship,Tweets,Twitter — Patrick Durusau @ 2:55 pm

Twitter sues Turkey over ‘terror propaganda’ fine

From the post:

Twitter has challenged Turkey in an Ankara court seeking to cancel a $50,000 fine for not removing content from its website, the social media site’s lawyer told Al Jazeera on Thursday.

Turkey temporarily banned access to Twitter several times in the past for failing to comply with requests to remove content. But the 150,000 lira ($50,000) fine imposed by the Information and Communication Technologies Authority (BTK) was the first of its kind imposed by Turkish authorities on Twitter.

A Turkish official told Reuters news agency on Thursday that much of the material in question was related to the Kurdistan Workers Party (PKK), which Ankara called “terrorist propaganda”.

Twitter, in its lawsuit, is arguing the fine goes against Turkish law and should be annulled, the official told Reuters.

Reading about Twitter opposing censorship is like seeing a news account about a man biting a dog. That really is news!

I say that because only a few months ago in Secretive Twitter Censorship Fairy Strikes Again!, I pointed to reports of Twitter silencing 10,000 Islamic State accounts on April 2nd of 2015. More censorship of Islamic State accounts followed but that’s an impressive total for one day.

From all reports, entirely at Twitter’s own initiative. Why Twitter decided to single out accounts that favor the Islamic State over those that favor the U.S. military isn’t clear. The U.S. military is carrying out daily bombing attacks in Iraq and Syria, something you can’t say about the Islamic State.

Now Twitter finds itself in the unhappy position of being an inadequate censor, a censor that violates the fundamental premise of being a common carrier (that it is open to all opinions, fair and foul), and a censor that has failed a state that is even less tolerant of free speech than Twitter.

Despised by one side for censorship and loathed by the other for being an inadequate toady.

Not an enviable position.

Just my suggestion but Twitter needs to reach out to the telcos and others who provide international connectivity for phones and other services to Turkey.

A 24 to 72 hour black-out of all telecommunications, for banks, media, phone, internet, should give the Turkish government a taste of the economic disruption, to say nothing of disruption of government, that will follow future attempts to censor, fine or block any international common carrier.

The telcos and others have the power to bring outlandish actors such as the Turkish government to a rapid heel.

It’s time that power was put to use.

You see, no bombs, no boots on the ground, no lengthy and tiresome exchanges of blustering speeches, just a quick trip back to the 19th century to remind Turkey’s leaders how painful a longer visit could be.

January 5, 2016

Back from the Dead: Politwoops

Filed under: Journalism,Privacy,Tweets,Twitter — Patrick Durusau @ 7:07 pm

Months after Twitter revoked API access, Politwoops is back, tracking the words politicians take back by Joseph Lichterman.

From the post:

We’ll forgive you if you missed the news, since it was announced on New Year’s Eve: Politwoops, the service which tracks politicians’ deleted tweets, is coming back after Twitter agreed to let it access the service’s API once again.

On Tuesday, the Open State Foundation, the Dutch nonprofit that runs the international editions of Politwoops, said it was functioning again in 25 countries, including the United Kingdom, the Netherlands, Ireland, and Turkey. The American version of Politwoops, operated by the Sunlight Foundation, isn’t back up yet, but the foundation said in a statement that “in the coming days and weeks, we’ll be working behind the scenes to get Politwoops up and running.”

Excellent news!

Politwoops will be reporting tweets that politicians send and then suddenly regret.

I don’t disagree with Twitter that any user can delete their tweets but strongly disagree that I can’t capture the original tweet and at a minimum, point to its absence from the “now” Twitter archive.

Politicians should not be allowed to hide from their sporadic truthful tweets.

December 2, 2015

Twitter Journalism Tips

Filed under: Journalism,News,Reporting,Tweets,Twitter — Patrick Durusau @ 1:49 pm

Twitter Journalism Tips from FirstDraftNews.

Five videos on effective use of Twitter for journalism.

The videos are:

How To Use Twitter Lists For Journalism 2:37

Why I Love Twitter Lists – Sue Llewellyn 1:48

Journalist Guide: How To Use Tweetdeck 1:14

Journalist Tweetdeck Tips – Reuters, George Sargent 1:12

Searching For Geolocated Posts On Twitter 1:26

The times shown are minutes followed by seconds.

Labeled for journalism but anyone searching Twitter, librarians, authors, researchers, even “fans” (shudder), will find useful information in these videos.

If you don’t know FirstDraftNews, you need to get acquainted.

November 18, 2015

Using Twitter To Control A Botnet

Filed under: Cybersecurity,Security,Twitter — Patrick Durusau @ 10:44 am

Twitter Direct Messages to control hacked computers by John Zorabedian.

From the post:

Direct Messages on Twitter are a way for users to send messages to individuals or a group of users privately, as opposed to regular tweets, which can be seen by everyone.

Twitter has expended a lot of effort to stamp out the predictable abuses of the Direct Message medium – namely spam and phishing attacks.

But now, self-styled security researcher Paul Amar has created a free Python-based tool called Twittor that uses Direct Messages on Twitter as a command-and-control server for botnets.

As you probably know, cybercriminals use botnets in a variety of ways to launch attacks.

But the one thing we don’t quite get in all of this is, “Why?”

Many security tools, like Nmap and Metasploit, cut both ways, being useful for researchers and penetration testers but also handy for crooks.

But publishing a free tool that helps you operate a botnet via Twitter Direct Message seems a strange way to conduct security research, especially when Twitbots are nothing new.

Amusingly indignant stance by Naked Security on yet another tool for controlling botnets.

Notice the “self-styled security researcher” (I guess Anonymous are “self-styled” hackers) and “…a strange way to conduct security research…,” as though anyone would appoint Naked Security as security research censor.

Software is neither good nor bad and the conduct of government, police departments, corporations, security researchers has left little doubt that presuming a “good side” is at best naive if not fatally stupid.

There are those who, for present purposes, are not known to be on some other side but that is about as far as you can go safely.

You can find a highly similar article at: Tool Controls Botnet With Twitter Direct Messages by Kelly Jackson Higgins, which supplies the link missing from the naked security post:

Twittor is available on Github.

Kelly reports that Amar is working on adding a data extraction tool to Twittor.

October 26, 2015

How to Get Free Access to Academic Papers on Twitter [3 Rules]

Filed under: Computer Science,Twitter — Patrick Durusau @ 8:56 pm

How to Get Free Access to Academic Papers on Twitter by Aamna Mohdin.

From the post:

Most academic journals charge expensive subscriptions and, for those without a login, fees of $30 or more per article. Now academics are using the hashtag #icanhazpdf to freely share copyrighted papers.

Scientists are tweeting a link of the paywalled article along with their email address in the hashtag—a riff on the infamous meme of a fluffy cat’s “I Can Has Cheezburger?” line. Someone else who does have access to the article downloads a pdf of the paper and emails the file to the person requesting it. The initial tweet is then deleted as soon as the requester receives the file.

3 rules to remember:

  1. Paywall link + #icanhazpdf + your email.
  2. Delete tweet when paper arrives.
  3. Don’t ask/Don’t tell.

Enjoy!

October 18, 2015

Requirements For A Twitter Client

Filed under: Curation,Data Mining,Twitter — Patrick Durusau @ 2:57 pm

Kurt Cagle writes of needed improvements to Twitter’s “Moments,” in Project Voyager and Moments: Close, but not quite there yet saying:

This week has seen a pair of announcements that are likely to significantly shake up social media as its currently known. Earlier this week, Twitter debuted its Moments, a news service where the highlights of the week are brought together into a curated news aggregator.

However, this is 2015. What is of interest to me – topics such as Data Science, Semantics, Astronomy, Climate Change and so forth, are likely not going to be of interest to others. Similarly, I really have no time for cute pictures of dogs (cats, maybe), the state of the World Series race, the latest political races or other “general” interest topics. In other words, I want to be able to curate content my way, even if the quality is not necessarily the highest, than I do have other people who I do not know decide to curate to the lowest possible denominator.

A very small change, on the other hand, could make a huge difference for Moments for myself and many others. Allow users to aggregate a set of hash tags under a single “Paper section banner” – #datascience, #data, #science, #visualization, #analytics, #stochastics, etc. – could all go under the Data Science banner. Even better yet, throw in a bit of semantics to find every topic within two hops topically to the central terms and use these (with some kind of weighting factor) as well. Rank these tweets according to fitness, then when I come to Twitter I can “read” my twitter paper just by typing in the appropriate headers (or have them auto-populate a list).

My exclusion list would include cats, shootings, bombings, natural disasters, general news and other ephemera that will be replaced by another screaming headline next week, if not tomorrow.

Starting with Kurt’s suggested improvements, a Twitter client should offer:

  • User-based aggregation based upon # tags
  • Learning semantics (Kurt’s two-hop for example)
  • Deduping tweets for user set period, day, week, month, other
  • User determined sorting of tweets by time/date, author, retweets, favorites
  • Exclusion of tweets without URLs
  • Filtering of tweets based on sender (included by # tags), etc. and perhaps regex
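To show that most of this list is not hard to build, here is a minimal sketch covering hashtag aggregation under a banner, exclusion of tweets without URLs, regex filtering, deduping and sorting by retweets. The `Tweet` class, the function names and the sample tweets are all hypothetical. It works on tweets already in hand and does not talk to Twitter, and the semantic two-hop expansion is left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    author: str
    text: str
    retweets: int = 0


URL_PATTERN = re.compile(r"https?://\S+")
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtags(tweet):
    """Lower-cased hashtags found in a tweet's text, without the leading #."""
    return {tag.lower() for tag in HASHTAG_PATTERN.findall(tweet.text)}


def build_section(tweets, banner_tags, exclude_pattern=None):
    """Select, dedupe and rank tweets for one user-defined banner."""
    wanted = {t.lower().lstrip("#") for t in banner_tags}
    seen_texts = set()
    selected = []
    for tweet in tweets:
        if not hashtags(tweet) & wanted:
            continue  # not under this banner
        if not URL_PATTERN.search(tweet.text):
            continue  # exclusion of tweets without URLs
        if exclude_pattern and re.search(exclude_pattern, tweet.text, re.IGNORECASE):
            continue  # user-supplied regex exclusion
        key = " ".join(tweet.text.lower().split())
        if key in seen_texts:
            continue  # dedupe on case- and whitespace-normalized text
        seen_texts.add(key)
        selected.append(tweet)
    # User-determined sorting; retweet count is used here as one option.
    return sorted(selected, key=lambda t: t.retweets, reverse=True)


# Invented sample data.
tweets = [
    Tweet("a", "New #DataScience survey https://example.com/1", 5),
    Tweet("b", "new #datascience  survey https://example.com/1", 2),
    Tweet("c", "#analytics without a link", 9),
    Tweet("d", "Cat #visualization https://example.com/cats", 7),
    Tweet("e", "#visualization of graphs https://example.com/2", 8),
]
section = build_section(
    tweets,
    banner_tags=["#datascience", "#analytics", "#visualization"],
    exclude_pattern=r"\bcats?\b",
)
for tweet in section:
    print(tweet.retweets, tweet.author, tweet.text)
```

Deduping within a user-set period would need a timestamp on each tweet, which this sketch omits.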

I have looked but not found any Twitter client that comes even close.

Other requirements?

October 9, 2015

Filter [Impersonating You]

Filed under: Filters,News,Twitter — Patrick Durusau @ 9:55 am

Filter

From the webpage:

Filter shows you the top stories from communities of Twitter users across a range of topics like climate change, bitcoin, and U.S. foreign policy.

With Filter, the only way you’ll miss something is if the entire community misses it too.

Following entire Twitter communities is a good idea but signing in with Twitter enables Filter to impersonate you.

This application will be able to:

  • Read Tweets from your timeline.
  • See who you follow, and follow new people.
  • Update your profile.
  • Post Tweets for you.

(emphasis added)

My complaint is general to all Sign in with Twitter applications and Filter is just an example I encountered this morning.

I can’t explore and report to you the features or shortcomings of Filter because I am happy with my current following list and have no desire to allow some unknown (read untrusted) third-party posting on my Twitter account.

If you encounter a review of Filter by someone who isn’t bothered by being randomly impersonated, send me a link. I would like to know more about the site.

Thanks!

September 22, 2015

Topic Modeling and Twitter

Filed under: Latent Dirichlet Allocation (LDA),Python,Twitter — Patrick Durusau @ 9:57 am

Alex Perrier has two recent posts of interest to Twitter users and topic modelers:

Topic Modeling of Twitter Followers

In this post, we explore LDA an unsupervised topic modeling method in the context of twitter timelines. Given a twitter account, is it possible to find out what subjects its followers are tweeting about?

Knowing the evolution or the segmentation of an account’s followers can give actionable insights to a marketing department into near real time concerns of existing or potential customers. Carrying topic analysis of followers of politicians can produce a complementary view of opinion polls.

Segmentation of Twitter Timelines via Topic Modeling

Following up on our first post on the subject, Topic Modeling of Twitter Followers, we compare different unsupervised methods to further analyze the timelines of the followers of the @alexip account. We compare the results obtained through Latent Semantic Analysis and Latent Dirichlet Allocation and we segment Twitter timelines based on the inferred topics. We find the optimal number of clusters using silhouette scoring.

Alex has Python code, an interesting topic, great suggestions for additional reading, what is there not to like?

LDA, machine learning types follow @alexip but privacy advocates should as well.

Consider this recent tweet by Alex:

In the end the best way to protect your privacy is to behave erratically so that the Machine Learning algo will detect you as an outlier!

Perhaps, perhaps, but I suspect outliers/outsiders are classed as dangerous by several government agencies in the US.

August 27, 2015

Twitter Doubles Down on Censorship

Filed under: Censorship,Twitter — Patrick Durusau @ 8:32 pm

Twitter muzzles Politwoops politician-tracking accounts by Lisa Vaas.

From the post:

When Twitter killed embarrassing-political-tweet archive Politwoops in June, the site’s founders probably looked to the 30 other countries where it was running and said, well, it might just be a matter of time before those are strangled in the crib.

Consider them strangled.

Twitter told the Open State Foundation on Friday that it had suspended API access to Diplotwoops and all remaining Politwoops sites in those 30 countries.

Part of Twitter’s explanation reads as follows:

Imagine how nerve-racking – terrifying, even – tweeting would be if it was immutable and irrevocable? No one user is more deserving of that ability than another. Indeed, deleting a tweet is an expression of the user’s voice.

Do you wonder if Twitter will use that justification when the NSA comes knocking?

I have to imagine that Twitter comes down on the side of the my-edited-history folks of the EU and recently the UK.

I find the idea that digital records will shift under our feet far more terrifying than tweets being “immutable and irrevocable.”

You?

June 11, 2015

Anewstip

Filed under: News,Tweets,Twitter — Patrick Durusau @ 4:23 pm

Anewstip

From the webpage:

Find journalists by what they tweet

Powered by all the tweets since 2006 from more than 1 million journalist & media outlets.

Search for relevant journalists

Search through 1 billion+ real-time and historical tweets (since 2006, when Twitter was born) from 1 million+ journalists and media outlets, to find out all the relevant media contacts that have talked about your product, your business, your competitors, or any other keywords in your industry.

Searches can be limited to tweets, journalists and outlets.

The advanced search interface looks useful:

anewstip-advanced

If you are mining twitter for news sources, this could prove to be very useful.

With the caveat that news sources tend to be highly repetitive. If the New York Times says the OPM hack originated in China, a large number of news lemmings will repeat that without a word of doubt or criticism. Still amounts to one unknown source cited by the New York Times. No matter how many times it is repeated.

June 5, 2015

Twitter As Censor

Filed under: Government,Politics,Twitter — Patrick Durusau @ 9:30 am

Twitter shut down a site that saved politicians’ deleted tweets by Colin Lecher

Colin reports that Politwoops (Sunlight Foundation) was shut down by Twitter. Its crime? It saved tweets that politicians deleted. Horrors. Public statements that remain public statements. Can’t imagine why anyone would think that was reasonable.

No appeal, no coherent explanation, no review of the history of a discussion that has been going on since 2012. See Colin’s post for more details.

The Sunlight Foundation has its own reasons for “honoring” the Twitter decision. However, I think the Twitter decision merits a more pointed response.

Script to detect deleted tweets? Anyone have a script they can post to search for deleted tweets? Assuming the starting point is an archive of tweets and the script checks to see if any have been deleted.
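As a starting point, the core of such a script is small. Here is a minimal sketch that assumes you already hold an archive keyed by tweet ID and can supply your own lookup function. The `still_exists` callable is hypothetical and stands in for whatever retrieval method you use. No Twitter API call is shown.

```python
from typing import Callable, Dict, List


def find_deleted(archive: Dict[str, str], still_exists: Callable[[str], bool]) -> List[str]:
    """Return the IDs of archived tweets that can no longer be retrieved.

    archive maps tweet ID to the archived text. still_exists is a
    caller-supplied (hypothetical) lookup that reports whether a tweet
    is still retrievable.
    """
    return [tweet_id for tweet_id in archive if not still_exists(tweet_id)]


# Stand-in for a real lookup: pretend only these IDs are still live.
live_ids = {"101", "103"}
archive = {"101": "first tweet", "102": "regretted tweet", "103": "third tweet"}

for tweet_id in find_deleted(archive, lambda tid: tid in live_ids):
    print(tweet_id, archive[tweet_id])
```

Note that a tweet can go missing for reasons other than deletion, such as a suspended or protected account, so a real script would want to record why the lookup failed.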

Polypoops or some similar title: Reddit? For users who detect deleted tweets to post them. Assuming that any site hosting edgy porn won’t be overly troubled by embarrassing politicians.

You may protest that such activities may be seen by Twitter as violating its “terms of service.” To be honest, I am not overly concerned with Twitter’s “dog in the manger” strategies when it comes to Twitter content.

The_Dog_in_the_Manger

Bounty for Internal Twitter Decision Making on Politwoops: If you are good with writing/managing Kickstarter campaigns, what do you think about a bounty for internal Twitter decision making documentation on the Politwoops issue? What do you think it would take? How would you authenticate a response?

Twitter management is within its legal rights to make arbitrary and capricious decisions about its terms of service.

The community is within its rights to make decisions as well.

The question is whether Twitter management wants to pull back its corporate hand or a nub.

June 4, 2015

The Archive Is Closed [Library of Congress Twitter Archive]

Filed under: Library,Tweets,Twitter — Patrick Durusau @ 1:57 pm

The Archive Is Closed by Scott McLemee.

From the post:

Five years ago, this column looked into scholarly potential of the Twitter archive the Library of Congress had recently acquired. That potential was by no means self-evident. The incensed “my tax dollars are being used for this?” comments practically wrote themselves, even without the help of Twitter bots.

For what — after all — is the value of a dead tweet? Why would anyone study 140-character messages, for the most part concerning mundane and hyperephemeral topics, with many of them written as if to document the lowest possible levels of functional literacy?

As I wrote at the time, papers by those actually doing the research treated Twitter as one more form of human communication and interaction. The focus was not on the content of any specific message, but on the patterns that emerged when they were analyzed in the aggregate. Gather enough raw data, apply suitable methods, and the results could be interesting. (For more detail, see the original discussion.)

The key thing was to have enough tweets on hand to grind up and analyze. So, yes, an archive. In the meantime, the case for tweet preservation seems easier to make now that elected officials, religious leaders and major media outlets use Twitter. A recent volume called Twitter and Society (Peter Lang, 2014) collects papers on how politics, journalism, the marketplace and (of course) academe itself have absorbed the impact of this high-volume, low-word-count medium.

As far as the Library of Congress archive, Scott reports:


The Library of Congress finds itself in the position of someone who has agreed to store the Atlantic Ocean in his basement. The embarrassment is palpable. No report on the status of the archive has been issued in more than two years, and my effort to extract one elicited nothing but a statement of facts that were never in doubt.

“The library continues to collect and preserve tweets,” said Gayle Osterberg, the library’s director of communications, in reply to my inquiry. “It was very important for the library to focus initially on those first two aspects — collection and preservation. If you don’t get those two right, the question of access is a moot point. So that’s where our efforts were initially focused and we are pleased with where we are in that regard.”

That’s as helpful as the responses I get about the secret ACM committee that determines the fate of feature requests for the ACM digital library. You can’t contact them directly nor can you find any record of their discussions/decisions.

Let’s hope greater attention and funding can move the Library of Congress Twitter Archive towards public access, for all the reasons enumerated by Scott.

One does have to wonder, given the role of the U.S. government in pushing for censorship of Twitter accounts, will the Library of Congress archive be complete and free from censorship? Or will it have dark spots depending upon the whims and caprices of the current regime?

May 28, 2015

Content Recommendation From Links Shared on Twitter Using Neo4j and Python

Filed under: Cypher,Graphs,Neo4j,Python,Twitter — Patrick Durusau @ 4:50 pm

Content Recommendation From Links Shared on Twitter Using Neo4j and Python by William Lyon.

From the post:

Overview

I’ve spent some time thinking about generating personalized recommendations for articles since I began working on an iOS reading companion for the Pinboard.in bookmarking service. One of the features I want to provide is a feed of recommended articles for my users based on articles they’ve saved and read. In this tutorial we will look at how to implement a similar feature: how to recommend articles for users based on articles they’ve shared on Twitter.

Tools

The main tools we will use are Python and Neo4j, a graph database. We will use Python for fetching the data from Twitter, extracting keywords from the articles shared and for inserting the data into Neo4j. To find recommendations we will use Cypher, the Neo4j query language.

Very clear and complete!

Enjoy!
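For a rough feel of the idea without installing Neo4j, here is a minimal sketch in plain Python. It is not William Lyon’s implementation (his tutorial uses Cypher queries against a graph database). The sample data, the `recommend` function and the Jaccard keyword-overlap score are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, shared, keywords, top_n=3):
    """Rank articles the user has not shared by keyword overlap with those they have."""
    seen = shared.get(user, set())
    # The user's profile is the union of keywords from everything they shared.
    profile = set()
    for url in seen:
        profile |= set(keywords.get(url, ()))
    scored = [
        (jaccard(profile, kws), url)
        for url, kws in keywords.items()
        if url not in seen
    ]
    # Highest score first; ties broken by URL so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for score, url in scored[:top_n] if score > 0]


# Hypothetical sample data: who shared what, and keywords per article.
shared = {"alice": {"a1", "a2"}, "bob": {"a3"}}
keywords = {
    "a1": {"graph", "neo4j"},
    "a2": {"python", "twitter"},
    "a3": {"graph", "python", "cypher"},
    "a4": {"cooking"},
}

print(recommend("alice", shared, keywords))
```

A graph database earns its keep when the data no longer fits comfortably in dictionaries like these, or when the recommendation needs to follow several hops (user to article to keyword to article) in one query.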

May 21, 2015

Twitter As Investment Tool

Filed under: Social Media,Social Networks,Social Sciences,Twitter — Patrick Durusau @ 12:44 pm

Social Media, Financial Algorithms and the Hack Crash by Tero Karppi and Kate Crawford.

Abstract:

‘@AP: Breaking: Two Explosions in the White House and Barack Obama is injured’. So read a tweet sent from a hacked Associated Press Twitter account @AP, which affected financial markets, wiping out $136.5 billion of the Standard & Poor’s 500 Index’s value. While the speed of the Associated Press hack crash event and the proprietary nature of the algorithms involved make it difficult to make causal claims about the relationship between social media and trading algorithms, we argue that it helps us to critically examine the volatile connections between social media, financial markets, and third parties offering human and algorithmic analysis. By analyzing the commentaries of this event, we highlight two particular currents: one formed by computational processes that mine and analyze Twitter data, and the other being financial algorithms that make automated trades and steer the stock market. We build on sociology of finance together with media theory and focus on the work of Christian Marazzi, Gabriel Tarde and Tony Sampson to analyze the relationship between social media and financial markets. We argue that Twitter and social media are becoming more powerful forces, not just because they connect people or generate new modes of participation, but because they are connecting human communicative spaces to automated computational spaces in ways that are affectively contagious and highly volatile.

Social sciences lag behind the computer sciences in making their publications publicly accessible and still publish behind firewalls, so all I can report on is the abstract.

On the other hand, I’m not sure how much practical advice you could gain from the article as opposed to the volumes of commentary following the incident itself.

The research reminds me of Malcolm Gladwell, author of The Tipping Point and similar works.

While I have greatly enjoyed several of Gladwell’s books, including The Tipping Point, it is one thing to look back and say: “Look, there was a tipping point.” It is quite another to be in the present and successfully say: “Look, there is a tipping point and we can make it tip this way or that.”

In retrospect, we all credit ourselves with near omniscience when our plans succeed and we invent fanciful explanations about what we knew or realized at the time. Others, equally skilled, dedicated and competent, who started at the same time, did not succeed. Of course, the conservative media (and ourselves if we are honest), invent narratives to explain those outcomes as well.

Of course, deliberate manipulation of the market with false information, via Twitter or not, is illegal. The best you can do is look for a pattern of news and/or tweets that results in a downward change in a particular stock, which then recovers, and then apply that pattern more broadly. You won’t make $millions off of any one transaction, but making $millions off of a single transaction is the sort of thing that draws regulatory attention.

April 28, 2015

One Word Twitter Search Advice

Filed under: Search Behavior,Searching,Twitter — Patrick Durusau @ 6:16 pm

The one word journalists should add to Twitter searches that you probably haven’t considered by Daniel Victor.

Daniel takes you through five results without revealing how he obtained them. A bit long but you will be impressed when he reveals the answer.

He also has some great tips for other Twitter searching. Tips that you won’t see from any SEO.

Definitely something to file with your Twitter search tips.

April 20, 2015

Twitter cuts off ‘firehose’ access…

Filed under: Data,Twitter — Patrick Durusau @ 3:11 pm

Twitter cuts off ‘firehose’ access, eyes Big Data bonanza by Mike Wheatley.

From the post:

Twitter upset the applecart on Friday when it announced it would no longer license its stream of half a billion daily tweets to third-party resellers.

The social media site said it had decided to terminate all current agreements with third parties to resell its ‘firehose’ data – an unfiltered, full stream of tweets and all of the metadata that comes with them. For companies that still wish to access the firehose, they’ll still be able to do so, but only by licensing the data directly from Twitter itself.

Twitter’s new plan is to use its own Big Data analytics team, which came about as a result of its acquisition of Gnip in 2014, to build direct relationships with data companies and brands that rely on Twitter data to measure market trends, consumer sentiment and other metrics that can be best understood by keeping track of what people are saying online. The company hopes to complete the transition by August this year.

Not that I had any foreknowledge of Twitter’s plans but I can’t say this latest move is all that surprising.

What I hope also emerges from the “new plan” is a fixed pricing structure for smaller users of Twitter content. I’m really not interested in an airline pricing model where the price you pay has no rational relationship to the value of the product. If it’s the day before the end of a sales quarter I get a very different price for a Twitter feed than mid-way through the quarter. That sort of thing.

Along with being able to specify users to follow/searches and tweet streams in daily increments of 250,000, 500,000, 750,000, 1,000,000, where they are spooled for daily pickup over high speed connections (to put less stress on infrastructure).

I suppose renewable contracts would be too much to ask? 😉

April 15, 2015

Secretive Twitter Censorship Fairy Strikes Again!

Filed under: Censorship,Twitter — Patrick Durusau @ 10:34 am

Twitter shuts down 10,000 ISIS-linked accounts in one day by Lisa Vaas.

From the post:


A Twitter representative on Thursday confirmed to news outlets that its violations department had in fact suspended some 10,000 accounts on one day – 2 April – “for tweeting violent threats”.

The Twitter representative, who spoke on the condition of anonymity, attributed the wave of shutdowns to ISIS opponents who’ve been vigilant in reporting accounts for policy violation:

We received a large amount of reports.

In early March, Twitter acknowledged shutting down at least 2000 ISIS-linked accounts per week in recent months.

Fact 1: Twitter is a private service and can adopt and apply any “terms of service” it chooses in any manner it chooses.

Fact 2: The “abuse” reporting system of Twitter and its lack of transparency, not to mention the absence of any opportunity for a public hearing and appeal, create the opportunity for, and appearance of, arbitrary and capricious application.

Fact 3: The organization sometimes known as ISIS and its supporters have been targeted for suppression of all their communications, whether those communications violate the “terms of service” of Twitter or not, without notice and a hearing, thereby depriving other Twitter users of the opportunity to hear their views on current subjects of world importance.

Twitter is under no legal obligation to avoid censorship but Twitter should take steps to reduce its role as censor:

Step 1: Twitter should alter its “abuse” policy to provide alleged abusers with notice of the alleged abuse and a reasonable amount of time to respond to the allegation of abuse. Both the notice of alleged abuse and response to the notice shall be and remain public documents hosted by Twitter and indexed under the account alleged to be used for abuse. Along with the Twitter resolution described in Step 2.

Step 2: Twitter staff should issue a written statement as to what was found to transgress its “terms of service” so that other users can avoid repeating the alleged “abuse” accidentally.

Step 3: Twitter should adopt a formal “hands-off” policy when it comes to comments by, for or against political entities or issues, including ISIS in particular. What is a “threat” in some countries is not a “threat” in others. Twitter should act as a global citizen and not a parochial organization based in rural Alabama.

I would not visit areas under the control of ISIS even if you offered me a free ticket. Support or non-support of ISIS isn’t the issue.

The issue is whether we will allow private and unregulated entities to control a common marketplace for the interchange of ideas. If Twitter likes an unregulated common marketplace then it had best make sure it maintains a transparent and fair common marketplace. Not one where some people or ideas are second-class citizens that can be arbitrarily silenced, in secret, by unknown Twitter staff.

April 5, 2015

Building a complete Tweet index

Filed under: Indexing,Searching,Twitter — Patrick Durusau @ 10:46 am

Building a complete Tweet index by Yi Zhuang.

Since it is Easter Sunday in many religious traditions, what could be more inspirational than “…a search service that efficiently indexes roughly half a trillion documents and serves queries with an average latency of under 100ms.”?

From the post:

Today [11/8/2014], we are pleased to announce that Twitter now indexes every public Tweet since 2006.

Since that first simple Tweet over eight years ago, hundreds of billions of Tweets have captured everyday human experiences and major historical events. Our search engine excelled at surfacing breaking news and events in real time, and our search index infrastructure reflected this strong emphasis on recency. But our long-standing goal has been to let people search through every Tweet ever published.

This new infrastructure enables many use cases, providing comprehensive results for entire TV and sports seasons, conferences (#TEDGlobal), industry discussions (#MobilePayments), places, businesses and long-lived hashtag conversations across topics, such as #JapanEarthquake, #Election2012, #ScotlandDecides, #HongKong, #Ferguson and many more. This change will be rolling out to users over the next few days.

In this post, we describe how we built a search service that efficiently indexes roughly half a trillion documents and serves queries with an average latency of under 100ms.

The most important factors in our design were:

  • Modularity: Twitter already had a real-time index (an inverted index containing about a week’s worth of recent Tweets). We shared source code and tests between the two indices where possible, which created a cleaner system in less time.
  • Scalability: The full index is more than 100 times larger than our real-time index and grows by several billion Tweets a week. Our fixed-size real-time index clusters are non-trivial to expand; adding capacity requires re-partitioning and significant operational overhead. We needed a system that expands in place gracefully.
  • Cost effectiveness: Our real-time index is fully stored in RAM for low latency and fast updates. However, using the same RAM technology for the full index would have been prohibitively expensive.
  • Simple interface: Partitioning is unavoidable at this scale. But we wanted a simple interface that hides the underlying partitions so that internal clients can treat the cluster as a single endpoint.
  • Incremental development: The goal of “indexing every Tweet” was not achieved in one quarter. The full index builds on previous foundational projects. In 2012, we built a small historical index of approximately two billion top Tweets, developing an offline data aggregation and preprocessing pipeline. In 2013, we expanded that index by an order of magnitude, evaluating and tuning SSD performance. In 2014, we built the full index with a multi-tier architecture, focusing on scalability and operability.
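The “simple interface” point above can be illustrated with a toy sketch. This is not Twitter’s code: the `Partition` and `Cluster` classes, the routing by integer document ID and the whitespace tokenizer are all assumptions, chosen only to show a client querying a single endpoint while the partitions stay hidden.

```python
from collections import defaultdict


class Partition:
    """One shard: a tiny in-memory inverted index (term -> set of doc IDs)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Naive tokenizer (an assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        return self.postings.get(term.lower(), set())


class Cluster:
    """Single endpoint that hides the partitions from the caller."""

    def __init__(self, num_partitions=4):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def add(self, doc_id, text):
        # Route each document to exactly one partition by its integer ID.
        self.partitions[doc_id % len(self.partitions)].add(doc_id, text)

    def search(self, term):
        # Fan the query out to every partition and merge the results.
        hits = set()
        for partition in self.partitions:
            hits |= partition.search(term)
        return sorted(hits)


# Hypothetical sample documents.
cluster = Cluster(num_partitions=2)
cluster.add(1, "setting up a new account")
cluster.add(2, "Indexing every public Tweet")
cluster.add(3, "every Tweet since 2006")
print(cluster.search("tweet"))
```

The caller only ever sees `Cluster.add` and `Cluster.search`; changing the number of partitions does not change the calling code, which is the property the quoted design goal describes.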

If you are interested in scaling search issues, this is a must read post!

Kudos to Twitter Engineering!

PS: Of course all we need now is a complete index to Hillary Clinton’s emails. The NSA probably has a copy.

You know, the NSA could keep the same name, National Security Agency, and take over providing backups and verification for all email and web traffic, including the cloud. Would have to work on who could request copies but that would resolve the issue of backups of the Internet rather neatly. No more deleted emails, tweets, etc.

That would be a useful function, as opposed to harvesting phone data on the premise that at some point in the future it might prove to be useful, despite having not proved useful in the past.

March 16, 2015

Bias? What Bias?

Filed under: Bias,Facebook,Social Media,Social Sciences,Twitter — Patrick Durusau @ 6:09 pm

Scientists Warn About Bias In The Facebook And Twitter Data Used In Millions Of Studies by Brid-Aine Parnell.

From the post:

Social media like Facebook and Twitter are far too biased to be used blindly by social science researchers, two computer scientists have warned.

Writing in today’s issue of Science, Carnegie Mellon’s Juergen Pfeffer and McGill’s Derek Ruths have warned that scientists are treating the wealth of data gathered by social networks as a goldmine of what people are thinking – but frequently they aren’t correcting for inherent biases in the dataset.

If folks didn’t already know that scientists were turning to social media for easy access to the pat statistics on thousands of people, they found out about it when Facebook allowed researchers to adjust users’ news feeds to manipulate their emotions.

Both Facebook and Twitter are such rich sources for heart pounding headlines that I’m shocked, shocked that anyone would suggest there is bias in the data! 😉

Not surprisingly, people participate in social media for reasons entirely of their own and quite unrelated to the interests or needs of researchers. Particular types of social media attract different demographics than other types. I’m not sure how you could “correct” for those biases, unless you wanted to collect better data for yourself.

Not that there are any bias-free data sets, but some biases are so obvious that they hardly warrant mentioning. Except that institutions like the Brookings Institution bump and grind on Twitter data until they can prove the significance of terrorist social media. Brookings knows better but terrorism is a popular topic.

Not to make data carry all the blame, the test most often applied to data is:

Will this data produce a result that merits more funding and/or will please my supervisor?

I first saw this in a tweet by Persontyle.

March 7, 2015

The ISIS Twitter Census

Filed under: Social Media,Social Networks,Twitter — Patrick Durusau @ 8:38 pm

The ISIS Twitter Census: Defining and describing the population of ISIS supporters on Twitter by J.M. Berger and Jonathon Morgan.

This is the Brookings Institution report that I said was forthcoming in: Losing Your Right To Decide, Needlessly.

From the Executive Summary:

The Islamic State, known as ISIS or ISIL, has exploited social media, most notoriously Twitter, to send its propaganda and messaging out to the world and to draw in people vulnerable to radicalization.

By virtue of its large number of supporters and highly organized tactics, ISIS has been able to exert an outsized impact on how the world perceives it, by disseminating images of graphic violence (including the beheading of Western journalists and aid workers and more recently, the immolation of a Jordanian air force pilot), while using social media to attract new recruits and inspire lone actor attacks.

Although much ink has been spilled on the topic of ISIS activity on Twitter, very basic questions remain unanswered, including such fundamental issues as how many Twitter users support ISIS, who they are, and how many of those supporters take part in its highly organized online activities.

Previous efforts to answer these questions have relied on very small segments of the overall ISIS social network. Because of the small, cellular nature of that network, the examination of particular subsets such as foreign fighters in relatively small numbers, may create misleading conclusions.

My suggestion is that you skim the “group think” sections on ISIS and move quickly to Section 3, Methodology. That will put you into a position to evaluate the various and sundry claims about ISIS and what may or may not be supported by their methodology.

I am still looking for a metric for “successful” use of social media. So far, no luck.

February 9, 2015

Twitter can solve harassment right now…

Filed under: Governance,Twitter — Patrick Durusau @ 9:51 am

Twitter can solve harassment right now with verified accounts by Jason Calacanis.

Jason’s proposal to stop harassment on Twitter is simplicity itself. Twitter would add a fourth privacy option that limits the tweets you see to users who have been “verified.” Where “verified” means they have a “real world” address and identity. Easier to hold them responsible for harassment. Twitter’s incentive is a nominal annual fee for the verification option.

Jason extols the many benefits of his proposal so see the original post.

Jason doesn’t mention demand for the verified option. If offered to all Twitter users at once, demand would outstrip their ability to respond. Better to offer “verification” to blocks of users and maintain a high quality experience.

Let’s get Twitter’s attention on Jason’s post. Let’s make it a trending topic on Twitter!

December 30, 2014

Twitter and CS Departments (Part 1)

Filed under: Computer Science,Twitter — Patrick Durusau @ 7:06 pm

I don’t spend all my time as Dylan says:

I’m on the pavement. Thinking about the government.

😉

Over the weekend I was looking at: The 50 Most Innovative Computer Science Departments in the U.S. in terms of how to gather information from those departments together.

One of the things that I haven’t seen is a curated list of faculty who have Twitter accounts.

What follows are the top two CS departments as a proof-of-concept only and to seek your advice on a format for a complete set.

Massachusetts Institute of Technology:

Stanford:

The names and locations, where available, are from the user profiles maintained by Twitter. As you can see, there is no common location that would suffice to capture all the faculty for either of these departments. In fact, some of these were identified only by pursuing links on Twitter profiles that identified the individuals as faculty at non-Twitter sites.

Once I have a curated set of faculty members for the top fifty (50) institutions, building the data set out (following, followers, etc.) will be a matter of querying Twitter.

On the curated set of faculty members, any preference for format? I was thinking of something simple, like a CSV file with TwitterHandle, Full Name (as appears in Twitter profile), URI of department. Does that work for everyone? (Faculty as listed by the CS department)
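A minimal sketch of that format, using Python’s standard `csv` module. The three column names follow the proposal above but their exact spelling is an assumption, and the sample row is a placeholder, not a real faculty account.

```python
import csv
import io

# Column names are an assumption based on the proposed fields.
FIELDS = ["TwitterHandle", "FullName", "DepartmentURI"]


def write_faculty(rows, handle):
    """Write faculty rows (a list of dicts) as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_faculty(handle):
    """Read the CSV back into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


# Placeholder row; the comma in the name shows that quoting is handled.
rows = [
    {
        "TwitterHandle": "@example_prof",
        "FullName": "Example, Professor",
        "DepartmentURI": "https://cs.example.edu/",
    },
]

buffer = io.StringIO()
write_faculty(rows, buffer)
buffer.seek(0)
print(read_faculty(buffer))
```

Using the `csv` module rather than joining strings by hand matters here because names as they appear in Twitter profiles can contain commas and quotes.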

Suggestions? Comments?
