Archive for the ‘Facebook’ Category

Countering Inaccurate/Ineffectual Sierra Club Propaganda

Sunday, February 26th, 2017

This Sierra Club ad is popular on Facebook:

The first problem: it is inaccurate to the point of falsehood.

“…about to start their chainsaws…. …trying to clearcut America’s largest forest, the Tongass National Forest in Alaska…. (emphasis added)”

Makes you think clearcutting is about to start in the Tongass National Forest in Alaska. Yes?

Wrong!

If you go to Forest Management Reports and Accomplishments for the Tongass, you will find Forest Service reports for logging in the Tongass going back to 1908: Cut History 1908 to Present.

The first inaccuracy/lie of the Sierra ad is its implication that logging isn’t already ongoing in the Tongass.

The Sierra ad and its links also fail to mention the harvest volumes from the Tongass (in board feet):

Calendar Year    Board Feet
2016             44,076,800
2010             35,804,970
2000            119,480,750
1990            473,983,320
1980            453,687,320
1970            560,975,120

A drop from 560,975,120 board feet in 1970 to 44,076,800 board feet in 2016, a decline of more than 92 percent, looks like the Forest Service is moving in the right direction.

But you don’t have to take my word for it. Unlike the Sierra Club, which wants to excite alarm without giving you the data to decide for yourself, I have included links to the data I cite and to data I don’t. Explore the data on your own.

I say the Sierra Club propaganda is “ineffectual” because it leaves you with no clue as to who is logging in the Tongass.

Once again the Forest Service rides to the rescue with Timber Volume Under Contract (sorry, no separate hyperlink; look for it on the Forest Management Reports and Accomplishments page). I picked Current Calendar Year Through: (select Jan).

That returns a spreadsheet that lists (among other things), ranger district, unit ID, contract form, purchaser, etc.

A word about MBF. MBF stands for thousand board feet: as in Roman numerals, M = 1,000. So to read line 4 of the spreadsheet, which starts with Ranger District “Thorne Bay,” read across to “Current Qty Est (MBF)”; the entry “6.00” represents 6,000 board feet. Likewise, line 23, which starts with “Juneau,” reads “3,601.00” under “Current Qty Est (MBF),” representing 3,601,000 board feet. And so on. (I would never have guessed that meaning without assistance from the Forest Service.)
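If you want to apply the conversion across a downloaded spreadsheet, a throwaway script is enough. A minimal sketch in Python (the column name comes from the spreadsheet described above; the rest is illustration):

    # MBF = thousand board feet, so "Current Qty Est (MBF)" converts by multiplication.
    def mbf_to_board_feet(mbf):
        return mbf * 1000

    print(mbf_to_board_feet(6.00))     # 6000.0 (the Thorne Bay line)
    print(mbf_to_board_feet(3601.00))  # 3601000.0 (the Juneau line)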

The Sierra Club leaves you with no clue as to who is harvesting the timber, who is purchasing the timber from the harvesters, who is using the timber for what products, etc. The Forest Service can’t provide the second and third steps removed, but the harvesters give you a starting point for further research.

A starting point for further research enables actions like boycotts of products made from Tongass timber, choosing products NOT made from Tongass timber, and a whole host of other actions.

Oh, but none of those require you to be a member of the Sierra Club. My bad, it’s your dues and not the fate of the Tongass that is at issue.

If the Sierra Club wants to empower consumers, it should provide links to evidence about the Tongass that consumers can use to develop more evidence and effective means of reducing the demand for Tongass timber.

BTW, I’m not an anti-environmentalist. All new factory construction should be underground in negative-pressure enclaves where management is required to breathe the same air as all workers. No discharges of any kind that don’t match the outside environment prior to construction.

That would spur far better pollution control than any EPA regulation.

Republican Regime Creates New Cyber Market – Burner Twitter/Facebook Accounts

Thursday, February 9th, 2017

The current Republican regime has embarked upon creating a new cyber market, less than a month after taking office.

Samatha Dean (Tech Times) reports:

Planning a visit to the U.S.? Your passport is not the only thing you may have to turn in at the immigration counter, be prepared to relinquish your social media account passwords as well to the border security agents.

That’s right! According to a new protocol from the Homeland Security that is under consideration, visitors to the U.S. may have to give their Twitter and Facebook passwords to the border security agents.

The news comes close on the heels of the Trump administration issuing the immigration ban, which resulted in a massive state of confusion at airports, where several people were debarred from entering the country.

John F. Kelly, the Homeland Security Secretary, shared with the Congress on Feb. 7 that the Trump administration was considering this option. The measure was being weighed as a means to sieve visa applications and sift through refugees from the Muslim majority countries that are under the 90-day immigration ban.

I say: burner Twitter/Facebook accounts. If you plan on making a second trip to the US, you will need to keep the burner accounts maintained over the years.

The need for burner Twitter/Facebook accounts, ones you can freely disclose to border security agents, presents a wide range of data science issues.

In no particular order:

  • Defeating Twitter/Facebook security on a large scale. Not trivial but not the hard part either
  • Creating accounts with the most common names
  • Automated posting to accounts in their native language
  • Posts must be indistinguishable from human user postings, i.e., no auto-retweets of Sean Spicer
  • Profile of tweets/posts shows consistent usage (see the sketch below this list)
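On that last point, a consistent usage profile is at bottom a point-process problem: posting times should follow a plausible daily rhythm rather than a uniform spray. A purely illustrative sketch in Python (all rates invented for the example, not taken from any real account data):

    import random

    # Hypothetical posting rates (posts per hour) across a 24-hour day,
    # low overnight and peaking in the evening. All numbers invented.
    HOURLY_RATE = [0.01]*7 + [0.08]*5 + [0.05]*5 + [0.20]*5 + [0.04]*2

    def sample_day():
        """Sample one day's posting times (in hours) from a piecewise Poisson process."""
        times = []
        for hour, rate in enumerate(HOURLY_RATE):
            t = random.expovariate(rate)       # wait for the first post this hour
            while t < 1.0:
                times.append(hour + t)
                t += random.expovariate(rate)  # wait for the next one
        return times

    for day in range(3):
        print("day", day, [round(t, 2) for t in sample_day()])

Maintaining a profile like that over months, rather than in a burst the week before travel, is what makes this a data science problem and not just a scripting one.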

I haven’t thought about burner bank account details before, but that certainly should be doable. Especially if you have a set of banks on the Net that don’t have much overhead but exist to keep records for one another.

Burner bank accounts could be useful to more than just travelers to the United States.

Kudos to the new Republican regime and their market creation efforts!

Three More Reasons To Learn R

Friday, January 6th, 2017

Three reasons to learn R today by David Smith.

From the post:

If you're just getting started with data science, the Sharp Sight Labs blog argues that R is the best data science language to learn today.

The blog post gives several detailed reasons, but the main arguments are:

  1. R is an extremely popular (arguably the most popular) data programming language, and ranks highly in several popularity surveys.
  2. Learning R is a great way of learning data science, with many R-based books and resources for probability, frequentist and Bayesian statistics, data visualization, machine learning and more.
  3. Python is another excellent language for data science, but with R it's easier to learn the foundations.

Once you've learned the basics, Sharp Sight also argues that R is a great data science language to master, even though it's an old language compared to some of the newer alternatives. Every tool has a shelf life, but R isn't going anywhere and learning R gives you a foundation beyond the language itself.

If you want to get started with R, Sharp Sight labs offers a data science crash course. You might also want to check out the Introduction to R for Data Science course on EdX.

Sharp Sight Labs: Why R is the best data science language to learn today, and Why you should master R (even if it might eventually become obsolete)

If you need more reasons to learn R:

  • Unlike Facebook, R isn’t a sinkhole of non-testable propositions.
  • Unlike Instagram, R is rarely NSFW.
  • Unlike Twitter, R is a marketable skill.

Glad to hear you are learning R!

Defeating “Fake News” Without Mark Zuckerberg

Sunday, January 1st, 2017

Despite a lack of proof that “fake news” is a problem, Mark Zuckerberg and others have taken up the banner of public censor on behalf of us all, whether any of us are interested in their assistance or not.

In countering calls for and toleration of censorship, you may find it helpful to point out that “fake news” isn’t new.

There are any number of spot instances of fake news. Michael J. Socolow reports in: Reporting and punditry that escaped infamy:


As the day wore on, real reporting receded, giving way to more speculation. Right-wing commentator Fulton Lewis Jr. told an audience five hours after the attack that he shared the doubts of many American authorities that the Japanese were truly responsible. He “reported” that US military officials weren’t convinced Japanese pilots had the skills to carry out such an impressive raid. The War Department, he said, is “concerned to find out who the pilots of these planes are—whether they are Japanese pilots. There is some doubt as to that, some skepticism whether they may be pilots of some other nationality, perhaps Germans, perhaps Italians,” he explained. The rumor that Germans bombed Pearl Harbor lingered on the airwaves, with NBC reporting, on December 8, that eyewitnesses claimed to have seen Nazi swastikas painted on some of the bombers.

You may object that there was much confusion, that the pundits weren’t trying to deceive, or any number of other excuses. And you can repeat those for other individual instances of “fake news.” They simply don’t compare to the flood of intentionally “fake” publications available today.

I disagree, but point taken. Let’s look back to an event that, like the internet, enabled a comparative flood of information to reach readers: the invention of the printing press.

Elizabeth Eisenstein, in The Printing Revolution in Early Modern Europe, characterizes the output of the first fifty years of printing presses, saying:

…it seems necessary to qualify the assertion that the first half-century of printing gave “a great impetus to wide dissemination of accurate knowledge of the sources of Western thought, both classical and Christian.” The duplication of the hermetic writings, the sibylline prophecies, the hieroglyphics of “Horapollo” and many other seemingly authoritative, actually fraudulent esoteric writings worked in the opposite direction, spreading inaccurate knowledge even while paving the way for purification of Christian sources later on.
…(emphasis added) (page 48)

I take Eisenstein to mean that knowingly fraudulent materials were being published, which seems to be the essence of the charge against the authors of “fake news” today.

As for the quantity of the printing-press equivalent of “fake news,” she remarks:


Compared to the large output of unscholarly vernacular materials, the number of trilingual dictionaries and Greek or even Latin editions seems so small that one wonders whether the term “wide dissemination” ought to be applied to the latter case at all.
… (page 48)

To be fair, “unscholarly vernacular materials” includes both texts intended to be accurate and “fake” ones.

The Printing Revolution in Early Modern Europe is the abridged version of Eisenstein’s The Printing Press as an Agent of Change: Communications and Cultural Transformations in Early-Modern Europe, which has the footnotes and references to enable more precision on early production figures.

Suffice it to say, however, that no 15th-century equivalent of Mark Zuckerberg arrived upon the scene to save everyone from “…actually fraudulent esoteric writings … spreading inaccurate knowledge….”

The world didn’t need Mark Zuckerberg’s censoring equivalent in the 15th century and it doesn’t need him now.

Facebook’s Censoring Rules (Partial)

Wednesday, December 21st, 2016

Facebook’s secret rules of deletion by Till Krause and Hannes Grassegger.

From the post:

Facebook refuses to disclose the criteria that deletions are based on. SZ-Magazin has gained access to some of these rules. We show you some excerpts here – and explain them.

Introductory words

These are excerpts of internal documents that explain to content moderators what they need to do. To protect our sources, we have made visual edits to maintain confidentiality. While the rules are constantly changing, these documents provide the first-ever insights into the current guidelines that Facebook applies to delete contents.

Insight into a part of the byzantine world of Facebook deletion/censorship rules.

Pointers to more complete leaks of Facebook rules please!

Achtung! Germany Hot On The Censorship Trail

Tuesday, December 20th, 2016

Germany threatens to fine Facebook €500,000 for each fake news post by Mike Murphy.

Mike reports that fears are spreading that fake news could impact German parliamentary elections set for 2017.

One source of those fears is the continued sulking of Clinton campaign staff who fantasize that “fake news” cost Sec. Clinton the election.

Anything is possible, as they say, but to date, between sobs and sighs, no proof has been offered that “fake news” (or anything else) had any impact on the election at all; only accusations.

Do you seriously think the “fake news” that the Pope had endorsed Trump impacted the election? Really?

If “fake news” is something other than an excuse for censorship (United States, UK, Germany, etc.), collect the “fake news” stories that you claim impacted the election.

Measure the impact of that “fake news” on volunteers following standard social science protocols.

Or do “fake news” criers fear the factual results of such a study?

PS: If you realize that “fake news” isn’t something new but quite traditional, you will enjoy ‘Fake News’ in America: Homegrown, and Far From New by Chris Hedges.

Facebook Patents Tool To Think For You

Wednesday, December 7th, 2016

My apologies but Facebook thinks you are too stupid to detect “fake news.” Facebook will compensate for your stupidity with a process submitted for a US patent. For free!

Facebook is patenting a tool that could help automate removal of fake news by Casey Newton.

From the post:

As Facebook works on new tools to stop the spread of misinformation on its network, it’s seeking to patent technology that could be used for that purpose. This month the US Trademark and Patent Office published Facebook’s application for Patent 0350675: “systems and methods to identify objectionable content.” The application, which was filed in June 2015, describes a sophisticated system for identifying inappropriate text and images and removing them from the network.

As described in the application, the primary purpose of the tool is to improve the detection of pornography, hate speech, and bullying. But last month, Zuckerberg highlighted the need for “better technical systems to detect what people will flag as false before they do it themselves.” The patent published Thursday, which is still pending approval, offers some ideas for how such a system could work.

A Facebook spokeswoman said the company often seeks patents for technology that it never implements, and said this patent should not be taken as an indication of the company’s future plans. The spokeswoman declined to comment on whether it was now in use.

The system described in the application is largely consistent with Facebook’s own descriptions of how it currently handles objectionable content. But it also adds a layer of machine learning to make reporting bad posts more efficient, and to help the system learn common markers of objectionable content over time — tools that sound similar to the anticipatory flagging that Zuckerberg says is needed to combat fake news.

If you substitute “user” for “administrator” where it appears in the text, Facebook would be enabling users to police the content they view.

Why Facebook finds users making decisions about the content they view objectionable isn’t clear. Suggestions on that question?

The process doesn’t appear to be either accountable or transparent.

If I can’t see the content that is removed by Facebook, how do I make judgments about why it was removed and/or how that compares to content about to be uploaded to Facebook?

Urge Facebook users to demand the power to make decisions about the content they view.

Urge Facebook shareholders to pressure management to abandon this quixotic quest to be an internet censor.

China Gets A Facebook Filter, But Not You

Thursday, November 24th, 2016

Facebook ‘quietly developing censorship tool’ for China by Bill Camarda.

From the post:


That’s one take on the events that might have led to today’s New York Times expose: it seems Facebook has tasked its development teams with “quietly develop[ing] software to suppress posts from appearing in people’s news feeds in specific geographic areas”.

As “current and former Facebook employees” told the Times, Facebook wouldn’t do the suppression themselves, nor need to. Rather:

It would offer the software to enable a third party – in this case, most likely a partner Chinese company – to monitor popular stories and topics that bubble up as users share them across the social network… Facebook’s partner would then have full control to decide whether those posts should show up in users’ feeds.

This is a step beyond the censorship Facebook has already agreed to perform on behalf of governments such as Turkey, Russia and Pakistan. In those cases, Facebook agreed to remove posts that had already “gone live”. If this software were in use, offending posts could be halted before they ever appeared in a local user’s news feed.

You can’t filter your own Facebook timeline or share your filter with other Facebook users, but the Chinese government can filter the timelines of 721,000,000+ internet users?

My proposal for Facebook filters would generate income for Facebook and filter writers, and would enable the 3,600,000,000+ internet users around the world to filter their own content.

All of Zuckerberg’s ideas:

Stronger detection. The most important thing we can do is improve our ability to classify misinformation. This means better technical systems to detect what people will flag as false before they do it themselves.

Easy reporting. Making it much easier for people to report stories as fake will help us catch more misinformation faster.

Third party verification. There are many respected fact checking organizations and, while we have reached out to some, we plan to learn from many more.

Warnings. We are exploring labeling stories that have been flagged as false by third parties or our community, and showing warnings when people read or share them.

Related articles quality. We are raising the bar for stories that appear in related articles under links in News Feed.

Disrupting fake news economics. A lot of misinformation is driven by financially motivated spam. We’re looking into disrupting the economics with ads policies like the one we announced earlier this week, and better ad farm detection.

Listening. We will continue to work with journalists and others in the news industry to get their input, in particular, to better understand their fact checking systems and learn from them.

Enthrone Zuckerberg as Censor of the Internet.

His blinding lust to be Censor of the Internet* is responsible for Zuckerberg passing up millions, if not billions, of dollars in filtering revenue.

Facebook shareholders should question this loss of revenue at every opportunity.

* Zuckerberg’s “lust” to be “Censor of the Internet” is an inference based on the Facebook centered nature of his “ideas” for dealing with “fake news.” Unpaid censorship instead of profiting from user-centered filtering is a sign of poor judgment and/or madness.

Preserving Ad Revenue With Filtering (Hate As Renewal Resource)

Monday, November 21st, 2016

Facebook and Twitter haven’t implemented robust and shareable filters for their respective content streams for fear of disturbing their ad revenue streams.* The power to filter is feared as the power to exclude ads.

Other possible explanations include: Drone employment, old/new friends hired to discuss censoring content; Hubris, wanting to decide what is “best” for others to see and read; NIH (not invented here), which explains silence concerning my proposals for shareable content filters; others?

* Lest I be accused of spreading “fake news,” my explanation for the lack of robust and shareable filters on content on Facebook and Twitter is based solely on my analysis of their behavior and not any inside leaks, etc.

I have a solution for fearing filters as interfering with ad revenue.

All Facebook posts and Twitter tweets will be delivered with an additional Boolean field, ad, which defaults to true (empty field, following Clojure), meaning the content can be filtered. When the field is false, that content cannot be filtered.

With filters registered and shared via Facebook and Twitter, testing those filters for proper operation (and refusing to apply them if they filter ad content) is a purely algorithmic process.

Users pay to post ad content, the step where the false flag can be entered, so no content escapes filters without someone paying for the privilege.
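A minimal sketch of that proposal in Python (the field names, post structure, and regex filter are my own illustration, not any Facebook or Twitter API):

    import re

    def apply_filter(posts, pattern):
        """Drop posts matching pattern, but never touch paid ad content (ad == False)."""
        rx = re.compile(pattern, re.IGNORECASE)
        kept = []
        for post in posts:
            filterable = post.get("ad", True)   # empty field defaults to true
            if filterable and rx.search(post["text"]):
                continue                        # filtered out of the timeline
            kept.append(post)
        return kept

    timeline = [
        {"text": "Buy Brand X today!", "ad": False},  # paid ad: immune to filters
        {"text": "Brand X is the best"},              # no flag: filterable by default
        {"text": "Cat pictures"},
    ]
    print([p["text"] for p in apply_filter(timeline, r"brand x")])
    # ['Buy Brand X today!', 'Cat pictures']

Testing a filter before registration is the same function run against known ad content: if it removes anything flagged false, reject it.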

What’s my interest? I’m interested in the creation of commercial filters for aggregation, exclusion and creating a value-add product based on information streams. Moreover, ending futile and bigoted attempts at censorship seems like a worthwhile goal to me.

The revenue potential for filters is nearly unlimited.

The number of people who hate rivals the number who want to filter the content seen by others. An unrestrained Facebook/Twitter will attract more hate and “fake news,” which in turn will drive a great need for filters.

Not a virtuous cycle but certainly a profitable one. Think of hate and the desire to censor as renewable resources powering that cycle.

PS: I’m not an advocate for hate and censorship but they are both quite common. Marketing is based on consumers as you find them, not as you wish they were.

Successful Hate Speech/Fake News Filters – 20 Facts About Facebook

Friday, November 18th, 2016

After penning Monetizing Hate Speech and False News yesterday, I remembered non-self-starters will be asking:

Where are examples of successful monetized filters for hate speech and false news?

Of The Top 20 Valuable Facebook Statistics – Updated November 2016, I need only two to make the case for monetized filters.

1. Worldwide, there are over 1.79 billion monthly active Facebook users (Facebook MAUs) which is a 16 percent increase year over year. (Source: Facebook as of 11/02/16)

15. Every 60 seconds on Facebook: 510,000 comments are posted, 293,000 statuses are updated, and 136,000 photos are uploaded. (Source: The Social Skinny)

(emphasis in the original)

By comparison, Newsonomics: 10 numbers on The New York Times’ 1 million digital-subscriber milestone [2015], the New York Times has 1 million digital subscribers.

If you think about it, the New York Times is a hate speech/fake news filter, although it has a much smaller audience than Facebook.

Moreover, the New York Times is spending money to generate content whereas on Facebook, content is there for the taking or filtering.

If the New York Times can make money as a filter for hate speech/fake news while carrying its overhead, imagine the potential for profit from simply filtering content generated and posted by others. Across a market of 1.79 billion viewers. Where “hate” and “fake” vary from audience to audience.

Content filters at Facebook and the ability to “follow” those filters on timelines is all that is missing. (And Facebook monetizing the use of those filters.)

Petition Mark Zuckerberg and Facebook for content filters today!

Monetizing Hate Speech and False News

Thursday, November 17th, 2016

Eli Pariser has started If you were Facebook, how would you reduce the influence of fake news? on Google Docs.

Out of the now seventeen pages of suggestions, I haven’t noticed any that promise a revenue stream to Facebook.

I view ideas to filter “false news” and/or “hate speech” that don’t generate revenue for Facebook as non-starters. I suspect Facebook does as well.

Here is a broad sketch of how Facebook can monetize “false news” and “hate speech,” all while shaping Facebook timelines to diverse expectations.

Monetizing “false news” and “hate speech”

Facebook creates user-defined filters for their timelines. Filters can block other Facebook accounts (and any material from them), content by origin, content by word, and, I would suggest, content by regex.

User-defined filters apply only to that account and can be shared with twenty other Facebook users.

To share a filter with more than twenty other Facebook users, the filter owner pays Facebook an annual fee, scaled to the number of shares.
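A sketch of that rule in Python (the free-up-to-twenty threshold comes from the proposal above; the fee schedule itself is invented for illustration):

    FREE_SHARE_LIMIT = 20

    def annual_fee(share_count):
        """Hypothetical schedule: free up to 20 shares, then a base fee plus a per-share rate."""
        if share_count <= FREE_SHARE_LIMIT:
            return 0.0
        return 100.0 + 0.05 * (share_count - FREE_SHARE_LIMIT)

    for n in (20, 1000, 1000000):
        print(n, "shares ->", "$%.2f" % annual_fee(n))
    # 20 shares -> $0.00
    # 1000 shares -> $149.00
    # 1000000 shares -> $50099.00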

Unlike the many free proposals on “false news” and “hate speech,” being a filter for more than twenty other users isn’t free.

Selling Subscriptions to Facebook Filters

Organizations can sell subscriptions to their filters; Facebook, which controls the authorization of the filters, contracts for a percentage of the subscription fee.

Pro tip: I would not invoke Facebook filters from the Washington Post and New York Times at the same time. It is likely they exclude each other as news sources.

Advantages of Monetizing Hate Speech and False News

First and foremost for Facebook, it gets out of the satisfying-every-point-of-view game. Completely. Users are free to define as narrow or as broad a point of view as they desire.

If you see something you don’t like, disagree with, etc., don’t complain to Facebook; complain to your Facebook filter provider.

That alone will expose the hidden agenda behind most, perhaps not all, of the “false news” filtering advocates. They aren’t concerned with what they are seeing on Facebook but they are very concerned with deciding what you see on Facebook.

For wannabe filters of what other people see, beyond twenty other Facebook users, that privilege is not free. Unlike the many proposals with as many definitions of “false news” as appear in Eli’s document.

It is difficult to imagine a privilege people would be more ready to pay for than the right to attempt to filter what other people see. Churches, social organizations, local governments, corporations, you name them and they will be lining up to create filter lists.

The financial beneficiary of the “drive to filter for others” is of course Facebook, but one could argue the filter owners profit by spreading their worldview. The unfortunates that follow them, well, they get what they get.

Commercialization of Facebook filters, that is, selling subscriptions to Facebook filters, creates a new genre of economic activity and yet another revenue stream for Facebook. (That’s two up to this point, if you are keeping score.)

It isn’t hard to imagine the Economist, Forbes, professional clipping services, etc., creating a natural extension of their filtering activities onto Facebook.

Conclusion: Commercialization or Unfunded Work Assignments

Preventing/blocking “hate speech” and “false news” for free has been, is, and always will be a failure.

Changing Facebook infrastructure isn’t free, but creating revenue streams off of preventing/blocking “hate speech” and “false news” creates incentives for Facebook to make the necessary changes and for people to build filters from which they can profit.

Not to mention that filtering enables everyone, including the alt-right, the alt-left, and the sane people in between, to create the Facebook of their dreams, rather than being subject to the Facebook desired by others.

Finally, it gets Facebook and Mark Zuckerberg out of the fantasy island approach where they are assigned unpaid work by others. New York Times, Mark Zuckerberg Is in Denial. (It’s another “hit” piece by Zeynep Tufekci.)

If you know Mark Zuckerberg, please pass this along to him.

Hacking Any Facebook Account – SS7 Weakness

Friday, June 17th, 2016

How to Hack Someones Facebook Account Just by Knowing their Phone Numbers by Swati Khandelwal.

From the post:

Hacking Facebook account is one of the major queries on the Internet today. It’s hard to find — how to hack Facebook account, but researchers have just proven by taking control of a Facebook account with only the target’s phone number and some hacking skills.

Yes, your Facebook account can be hacked, no matter how strong your password is or how much extra security measures you have taken. No joke!

Hackers with skills to exploit the SS7 network can hack your Facebook account. All they need is your phone number.

The weaknesses in the part of global telecom network SS7 not only let hackers and spy agencies listen to personal phone calls and intercept SMSes on a potentially massive scale but also let them hijack social media accounts to which you have provided your phone number.

Swati’s post has the details and a video of the hack in action.

Of greater interest than hacking Facebook accounts, however, is the weakness in the SS7 network. Hacking Facebook accounts is good for intelligence gathering, annoying the defenseless, etc., but fundamental weaknesses in the telecom network are something different.

Swati quotes a Facebook spokesperson as saying:

“Because this technique [SS7 exploitation] requires significant technical and financial investment, it is a very low risk for most people,”

Here’s the video from Swati’s post (2:42 in length):

Having watched it, can you point out the “…significant technical and financial investment…” involved in that hack?

What investment would you make for a hack that opens up Gmail, Twitter, WhatsApp, Telegram, Facebook, any service that uses SMS, to attack?

Definitely a hack for your intelligence gathering toolkit.

Is Conduct/Truth A Defense to Censorship?

Saturday, February 27th, 2016

While Twitter sets up its Platonic panel of censors (Plato’s Republic, Books 2/3)*, I am wondering whether conduct/truth will be a defense to censorship for accounts that make positive posts about the Islamic State.

I ask because of a message suggesting accounts (Facebook?) might be suspended for posts following these rules:

  • Do no use foul language and try to not get in a fight with people
  • Do not write too much for people to read
  • Make your point easy as not everyone has the same knowledge as you about the Islamic state and/or Islam
  • Use a VPN…
  • Use an account that you don’t really need because this is like a martydom operation, your account will probably be banned
  • Post images supporting the Islamic state
  • Give positive facts about the Islamic state
  • Share Islamic state video’s that show the mercy and kindness of the Islamic state towards Muslims, and/or showing Muslim’s support towards the Islamic state. Or any videos that will attract people to the Islamic state
  • Prove rumors about the Islamic state false
  • Give convincing Islamic information about topics discussed like the legitimacy of the khilafa, killing civilians of the kuffar, the takfeer made on Arab rules, etc.
  • Or simply just post a short quick comment showing your support like “dawlat al Islam baqiaa” or anything else (make sure ppl can understand it
  • Remember to like all the comments you see that are supporting the Islamic state with all your accounts!

Posted (but not endorsed) by J. Faraday on 27 February 2016.

If we were to re-cast those as rules of conduct, non-Islamic-State specific, where N is the issue under discussion:

  • Do not use foul language and try not to get into fights with people
  • Do not write too much for people to read
  • Make your point easy [to understand] as not everyone has the same knowledge as you about N
  • Post images supporting N
  • Give positive facts about N
  • Share N videos that show the mercy and kindness of N, and/or videos showing supporters’ support towards N. Or any videos that will attract people to N
  • Prove rumors about N false
  • Give convincing N information about topics discussed
  • Or simply just post a short quick comment showing your support, or anything else (make sure people can understand it)
  • Remember to like all the comments you see that are supporting N with all your accounts!

Is there something objectionable about those rules when N = Islamic State?

As far as truthfulness goes, take for example claims by the Islamic State that Arab governments are corrupt: we can’t rely on a corruption index that lists Qatar at #22 (Denmark is #1 as the least corrupt) and Saudi Arabia at #48, when Bloomberg lists Qatar and Saudi Arabia as scoring zero (0) on budget transparency.

There are more corrupt governments than Qatar and Saudi Arabia, the failed state of Somalia for example, and perhaps the Sudan. Still, I wouldn’t ban anyone for saying both Qatar and Saudi Arabia are cesspools of corruption. They don’t match the structural corruption in Washington, D.C. but it isn’t for lack of trying.

Here is the question for Twitter’s Platonic guardians (Trust and Safety Council):

Can an account that follows the rules of behavior outlined above be banned for truthful posts?

I think we all know the answer but I’m interested in seeing if Twitter will admit to censoring factually truthful information.

* Someone very dear to me objected to my reference to Twitterists (sp?) as Stalinists. It was literary hyperbole and so not literally true. Perhaps “Platonic guardians” will be more palatable. Same outcome, just a different moniker.

Kindermädchen (Nanny) Court Protects Facebook Users – Hunting Down Original Sources

Friday, January 22nd, 2016

Facebook’s Friend Finder found unlawful by Germany’s highest court by Lisa Vaas.

From the post:

Reuters reports that a panel of the Federal Court of Justice has ruled that Facebook’s Friend Finder feature, used to encourage users to market the social media network to their contacts, constituted advertising harassment in a case that was filed in 2010 by the Federation of German Consumer Organisations (VZBV).

Friends Finder asks users for permission to snort the e-mail addresses of their friends or contacts from their address books, thereby allowing the company to send invitations to non-Facebook users to join up.

There was a time when German civil law and the reasoning of its courts were held in high regard. I regret to say it appears that may no longer be the case.

This decision, on Facebook asking users to spread the use of Facebook, is a good example.

From the Reuters account, it appears that the sending of unsolicited email is the key to the court’s decision.

It’s difficult to say much more about the court’s decision because little can be found beyond re-tellings of the Reuters report.

You can start with the VZBV press release on the decision: Wegweisendes BGH-Urteil: Facebooks Einladungs-E-Mails waren unlautere Werbung (“Landmark BGH ruling: Facebook’s invitation emails were unfair advertising”), but it too is just a summary.

Unlike the Reuters report, it at least has, under “Auf anderen Webseiten” (“on other websites”), Pressemitteilung des BGH, which takes you to Bundesgerichtshof zur Facebook-Funktion “Freunde finden” (“Federal Court of Justice on the Facebook ‘Find Friends’ feature”), a press release by the court about its decision. 😉

The court’s press release offers: Siehe auch: Urteil des I. Zivilsenats vom 14.1.2016 – I ZR 65/14 – (“See also: judgment of the First Civil Senate of 14 January 2016”), which links to a registration facility to subscribe for a notice when the court’s opinion is published.

No promises on when the decision will appear. I subscribed today, January 22nd, and the decision was made on January 14, 2016.

I did check Aktuelle Entscheidungen des Bundesgerichtshofes (recent decisions), but it refers you back to the register for the opinion to appear in the future.

Without the actual decision, it’s hard to tell if the court is unaware of the “delete” key on German keyboards or if there is some other reason to inject itself into a common practice on social media sites.

I will post a link to the decision when it becomes available. (The German court makes its decisions available for free to the public and charges a document fee for profit making services, or so I understand the terms of the site.)

PS: For journalists, researchers, bloggers, etc. I consider it a best practice to always include pointers to original sources.

PPS: The German keyboard does include a delete key (Entf) if you had any doubts:

[Image: German T2 keyboard prototype, May 2012]


Exposure to Diverse Information on Facebook [Skepticism]

Saturday, May 9th, 2015

Exposure to Diverse Information on Facebook by Eytan Bakshy, Solomon Messing, Lada Adamic.

From the post:

As people increasingly turn to social networks for news and civic information, questions have been raised about whether this practice leads to the creation of “echo chambers,” in which people are exposed only to information from like-minded individuals [2]. Other speculation has focused on whether algorithms used to rank search results and social media posts could create “filter bubbles,” in which only ideologically appealing content is surfaced [3].

Research we have conducted to date, however, runs counter to this picture. A previous 2012 research paper concluded that much of the information we are exposed to and share comes from weak ties: those friends we interact with less often and are more likely to be dissimilar to us than our close friends [4]. Separate research suggests that individuals are more likely to engage with content contrary to their own views when it is presented along with social information [5].

Our latest research, released today in Science, quantifies, for the first time, exactly how much individuals could be and are exposed to ideologically diverse news and information in social media [1].

We found that people have friends who claim an opposing political ideology, and that the content in peoples’ News Feeds reflect those diverse views. While News Feed surfaces content that is slightly more aligned with an individual’s own ideology (based on that person’s actions on Facebook), who they friend and what content they click on are more consequential than the News Feed ranking in terms of how much diverse content they encounter.

The Science paper: Exposure to Ideologically Diverse News and Opinion

The definition of an “echo chamber” is implied in the authors’ conclusion:


By showing that people are exposed to a substantial amount of content from friends with opposing viewpoints, our findings contrast concerns that people might “list and speak only to the like-minded” while online [2].

The racism of the Deep South existed in spite of interaction between whites and blacks. So “echo chamber” should not be defined as association of like with like, at least not entirely. The Deep South was an echo chamber of racism, but not for a lack of diversity in social networks.

Besides lacking a useful definition of “echo chamber,” the authors ignore the role of confirmation bias (aka the “backfire effect”) when readers are confronted with contrary thoughts or evidence. For some readers, seeing a New York Times editorial disagreeing with their position can make them feel better about being on the “right side.”

That people are exposed to diverse information on Facebook is interesting, but until there is a meaningful definition of “echo chambers,” the role Facebook plays in the maintenance of “echo chambers” remains unknown.

Bias? What Bias?

Monday, March 16th, 2015

Scientists Warn About Bias In The Facebook And Twitter Data Used In Millions Of Studies by Brid-Aine Parnell.

From the post:

Social media like Facebook and Twitter are far too biased to be used blindly by social science researchers, two computer scientists have warned.

Writing in today’s issue of Science, Carnegie Mellon’s Juergen Pfeffer and McGill’s Derek Ruths have warned that scientists are treating the wealth of data gathered by social networks as a goldmine of what people are thinking – but frequently they aren’t correcting for inherent biases in the dataset.

If folks didn’t already know that scientists were turning to social media for easy access to the pat statistics on thousands of people, they found out about it when Facebook allowed researchers to adjust users’ news feeds to manipulate their emotions.

Both Facebook and Twitter are such rich sources for heart pounding headlines that I’m shocked, shocked that anyone would suggest there is bias in the data! 😉

Not surprisingly, people participate in social media for reasons entirely of their own and quite unrelated to the interests or needs of researchers. Particular types of social media attract different demographics than other types. I’m not sure how you could “correct” for those biases, unless you wanted to collect better data for yourself.

Not that there are any bias-free data sets, but some biases are so obvious that they hardly warrant mentioning. Except that institutions like the Brookings Institution bump and grind on Twitter data until they can prove the significance of terrorist social media. Brookings knows better, but terrorism is a popular topic.

Not to make data carry all the blame, the test most often applied to data is:

Will this data produce a result that merits more funding and/or will please my supervisor?

I first saw this in a tweet by Persontyle.

Airbnb open sources SQL tool built on Facebook’s Presto database

Friday, March 6th, 2015

Airbnb open sources SQL tool built on Facebook’s Presto database by Derrick Harris.

From the post:

Apartment-sharing startup Airbnb has open sourced a tool called Airpal that the company built to give more of its employees access to the data they need for their jobs. Airpal is built atop the Presto SQL engine that Facebook created in order to speed access to data stored in Hadoop.

Airbnb built Airpal about a year ago so that employees across divisions and roles could get fast access to data rather than having to wait for a data analyst or data scientist to run a query for them. According to product manager James Mayfield, it’s designed to make it easier for novices to write SQL queries by giving them access to a visual interface, previews of the data they’re accessing, and the ability to share and reuse queries.

It sounds a little like the types of tools we often hear about inside data-driven companies like Facebook, as well as the new SQL platform from a startup called Mode.

At this point, Mayfield said, “Over a third of all the people working at Airbnb have issued a query through Airpal.” He added, “The learning curve for SQL doesn’t have to be that high.”

From the GitHub page:

Airpal is a web-based, query execution tool which leverages Facebook’s PrestoDB to make authoring queries and retrieving results simple for users. Airpal provides the ability to find tables, see metadata, browse sample rows, write and edit queries, then submit queries all in a web interface. Once queries are running, users can track query progress and when finished, get the results back through the browser as a CSV (download it or share it with friends). The results of a query can be used to generate a new Hive table for subsequent analysis, and Airpal maintains a searchable history of all queries run within the tool.

Features

  • Optional Access Control
  • Syntax highlighting
  • Results exported to a CSV for download or a Hive table
  • Query history for self and others
  • Saved queries
  • Table finder to search for appropriate tables
  • Table explorer to visualize schema of table and first 1000 rows

Requirements

  • Java 7 or higher
  • MySQL database
  • Presto 0.77 or higher
  • S3 bucket (to store CSVs)
  • Gradle 2.2 or higher
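For a sense of what Airpal wraps, here is a minimal sketch of querying Presto directly from Python with the PyHive library (host, catalog, schema, and table names are placeholders, not anything from Airbnb’s setup):

    from pyhive import presto  # pip install 'pyhive[presto]'

    # Connect to a Presto coordinator (placeholder host and schema).
    conn = presto.connect(host="presto.example.com", port=8080,
                          catalog="hive", schema="default")
    cur = conn.cursor()
    cur.execute("SELECT page, count(*) AS views "
                "FROM pageviews GROUP BY page ORDER BY views DESC LIMIT 10")
    for row in cur.fetchall():
        print(row)

Airpal’s value-add is everything around a query like this: table discovery, previews, saved and shared queries, and access control.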

I understand to some degree the need to make SQL “simpler” but fail to see how simpler controls translate into a solution. The controls may be obvious enough but if I don’t know the semantics of the column headers, the simplicity of the interface won’t be terribly helpful.

Or to put it another way, users seem to be assumed to know the semantics of the tables they encounter. True/False?

Facebook open sources tools for bigger, faster deep learning models

Saturday, January 17th, 2015

Facebook open sources tools for bigger, faster deep learning models by Derrick Harris.

From the post:

Facebook on Friday open sourced a handful of software libraries that it claims will help users build bigger, faster deep learning models than existing tools allow.

The libraries, which Facebook is calling modules, are alternatives for the default ones in a popular machine learning development environment called Torch, and are optimized to run on Nvidia graphics processing units. Among the modules are those designed to rapidly speed up training for large computer vision systems (nearly 24 times, in some cases), to train systems on potentially millions of different classes (e.g., predicting whether a word will appear across a large number of documents, or whether a picture was taken in any city anywhere), and an optimized method for building language models and word embeddings (e.g., knowing how different words are related to each other).

“[T]here is no way you can use anything existing” to achieve some of these results, said Soumith Chintala, an engineer with Facebook Artificial Intelligence Research.

How very awesome! Keeping abreast of the latest releases and papers on deep learning is turning out to be a real chore. Enjoyable, but a time sink nonetheless.

Derrick’s post and the release from Facebook have more details.

Apologies for the “lite” posting today, but I have been proofing related specifications where one defines a term and the other uses the term without citing the other specification’s definition or giving its own. Do those mean the same thing? Probably, but users outside the process may or may not realize that. Particularly in translation.

I first saw this in a tweet by Kirk Borne.

Everything You Need To Know About Social Media Search

Sunday, December 14th, 2014

Everything You Need To Know About Social Media Search by Olsy Sorokina.

From the post:

For the past decade, social networks have been the most universally consistent way for us to document our lives. We travel, build relationships, accomplish new goals, discuss current events and welcome new lives—and all of these events can be traced on social media. We have created hashtags like #ThrowbackThursday and apps like Timehop to reminisce on all the past moments forever etched in the social web in form of status updates, photos, and 140-character phrases.

Major networks demonstrate their awareness of the role they play in their users’ lives by creating year-end summaries such as Facebook’s Year in Review, and Twitter’s #YearOnTwitter. However, much of the emphasis on social media has been traditionally placed on real-time interactions, which often made it difficult to browse for past posts without scrolling down for hours on end.

The bias towards real-time messaging has changed in a matter of a few days. Over the past month, three major social networks announced changes to their search functions, which made finding old posts as easy as a Google search. If you missed out on the news or need a refresher, here’s everything you need to know.

I suppose Olsy means in addition to search in general sucking.

Interesting tidbit on Facebook:


This isn’t Facebook’s first attempt at building a search engine. The earlier version of Graph Search gave users search results in response to longer-form queries, such as “my friends who like Game of Thrones.” However, the semantic search never made it to the mobile platforms; many supposed that using complex phrases as search queries was too confusing for an average user.

Does anyone have any user research on the ability of users to use complex phrases as search queries?

I ask because if users have difficulty authoring “complex” semantics and difficulty querying with “complex” semantics, it stands to reason they may have difficulty interpreting “complex” semantic results. Yes?

If all three of those are the case, then how do we impart the value-add of “complex” semantics without tripping over one of those limitations?

Olsy also covers Instagram and Twitter. Twitter’s advanced search looks like the standard include/exclude type of “advanced” search. “Advanced” maybe forty years ago in the early OPACs, but not really “advanced” now.

Catch up on these new search features. They will provide at least a minimum of grist for your topic map mill.

Introducing osquery

Wednesday, October 29th, 2014

Introducing osquery by Mike Arpaia.

From the post:

Maintaining real-time insight into the current state of your infrastructure is important. At Facebook, we’ve been working on a framework called osquery which attempts to approach the concept of low-level operating system monitoring a little differently.

Osquery exposes an operating system as a high-performance relational database. This design allows you to write SQL-based queries efficiently and easily to explore operating systems. With osquery, SQL tables represent the current state of operating system attributes, such as:

  • running processes
  • loaded kernel modules
  • open network connections

SQL tables are implemented via an easily extendable API. Several tables already exist and more are being written. To best understand the expressiveness that is afforded to you by osquery, consider the following examples….
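The post’s examples are elided above, but to give the flavor, here is a hedged sketch of running one such query from Python by shelling out to the osqueryi shell (osquery must be installed; processes and listening_ports are standard osquery tables, though column details can vary by version):

    import json
    import subprocess

    # Which processes are listening on which ports? One SQL join answers it.
    SQL = ("SELECT DISTINCT p.name, lp.port "
           "FROM listening_ports AS lp JOIN processes AS p USING (pid)")

    result = subprocess.run(["osqueryi", "--json", SQL],
                            capture_output=True, text=True, check=True)
    for row in json.loads(result.stdout):
        print(row["name"], row["port"])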

I haven’t installed osquery yet, but I suspect that most of the data it collects is already available through a variety of admin tools. Just not through a single tool that enables you to query across tables to combine that data. That is the part that intrigues me.

Code and documentation on Github.

Facebook teaches you exploratory data analysis with R

Monday, May 12th, 2014

Facebook teaches you exploratory data analysis with R by David Smith.

From the post:

Facebook is a company that deals with a lot of data — more than 500 terabytes a day — and R is widely used at Facebook to visualize and analyze that data. Applications of R at Facebook include user behaviour, content trends, human resources and even graphics for the IPO prospectus. Now, four R users at Facebook (Moira Burke, Chris Saden, Dean Eckles and Solomon Messing) share their experiences using R at Facebook in a new Udacity on-line course, Exploratory Data Analysis.

The more data you explore, the better data explorer you will be!

Enjoy!

I first saw this in a post by David Smith.

“Credibility” As “Google Killer”?

Sunday, May 4th, 2014

Nancy Baym tweets: “Nice article on flaws of ‘it’s not our fault, it’s the algorithm’ logic from Facebook with quotes from @TarletonG”, pointing to: Facebook draws fire on ‘related articles’ push.

From the post:

A surprise awaited Facebook users who recently clicked on a link to read a story about Michelle Obama’s encounter with a 10-year-old girl whose father was jobless.

Facebook responded to the click by offering what it called “related articles.” These included one that alleged a Secret Service officer had found the president and his wife having “S*X in Oval Office,” and another that said “Barack has lost all control of Michelle” and was considering divorce.

A Facebook spokeswoman did not try to defend the content, much of which was clearly false, but instead said there was a simple explanation for why such stories are pushed on readers. In a word: algorithms.

The stories, in other words, apparently are selected by Facebook based on mathematical calculations that rely on word association and the popularity of an article. No effort is made to vet or verify the content.

Facebook’s explanation, however, is drawing sharp criticism from experts who said the company should immediately suspend its practice of pushing so-called related articles to unsuspecting users unless it can come up with a system to ensure that they are credible. (emphasis added)

Just imagine the hue and cry had that last line read:

Imaginary Quote Google’s explanation of search results, however, is drawing sharp criticism from experts who said the company should immediately suspend its practice of pushing so-called related articles to unsuspecting users unless it can come up with a system to ensure that they are credible. End Imaginary Quote

Is demanding “credibility” of search results the long sought after “Google Killer?”

“Credibility” is closely related to the “search” problem but I think it should be treated separately from search.

In part because the “credibility” question can require multiple searches: on the author of the search result content, for reviews and comments on that content, and of other sources of data on the content, followed by a collation of all that additional content to make a credibility judgment. The procedure isn’t always that elaborate, but the main point is that answering a credibility question requires additional searching and evaluation of content.

Not to mention that why the information is being sought has a bearing on credibility. If I want to find examples of nutty things said about President Obama to cite, then finding the cases mentioned above is not only relevant (the search question) but also “credible” in the sense that Facebook did not make them up. They are published nutty statements about the current President.

What if a user wanted to search for “coffee and bagels?” The top hit on one popular search engine today is: Coffee Meets Bagel: Free Online Dating Sites, along with numerous other links to information on the first link. Was this relevant to my search? No, but search results aren’t always predictable. They are relevant to someone’s search using “coffee and bagels.”

It is the responsibility of every reader to decide for themselves what is relevant, credible, useful, etc. in terms of content, whether it is hard copy or digital.

Any other solution takes us to Plato‘s Republic, which was great to read about, but I would not want to live there.

Facebook Gets Smarter with Graph Engine Optimization

Saturday, April 12th, 2014

Facebook Gets Smarter with Graph Engine Optimization by Alex Woodie.

From the post:

Last fall, the folks in Facebook’s engineering team talked about how they employed the Apache Giraph engine to build a graph on its Hadoop platform that can host more than a trillion edges. While the Graph Search engine is capable of massive graphing tasks, there were some workloads that remained outside the company’s technical capabilities–until now.

Facebook turned to the Giraph engine to power its new Graph Search offering, which it unveiled in January 2013 as a way to let users perform searches on other users to determine, for example, what kind of music their Facebook friends like, what kinds of food they’re into, or what activities they’ve done recently. An API for Graph Search also provides advertisers with a new revenue source for Facebook. It’s likely the world’s largest graph implementation, and a showcase of what graph engines can do.

The company picked Giraph because it worked on their existing Hadoop implementation, including HDFS and its MapReduce infrastructure stack (known as Corona). Compared to running the computation workload on Hive, an internal Facebook test of a 400-billion edge graph ran 126x faster on Giraph, and had a 26x performance advantage, as we explained in a Datanami story last year.

When Facebook scaled its internal test graph up to 1 trillion edges, they were able to keep the processing of each iteration of the graph under four minutes on a 200-server cluster. That amazing feat was done without any optimization, the company claimed. “We didn’t cheat,” Facebook developer Avery Ching declared in a video. “This is a random hashing algorithm, so we’re randomly assigning the vertices to different machines in the system. Obviously, if we do some separation and locality optimization, we can get this number down quite a bit.”
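The “random hashing” partitioning Ching describes is easy to picture. A toy sketch in Python (not Giraph code; Giraph itself is Java and does far more, and the stable hash here is my own choice for the illustration):

    import hashlib

    NUM_MACHINES = 200

    def machine_for(vertex_id):
        """Assign a vertex to a machine by hashing its id: no locality optimization."""
        digest = hashlib.md5(vertex_id.encode()).digest()
        return int.from_bytes(digest[:4], "big") % NUM_MACHINES

    # Each worker processes only the vertices that hash to it; edges that
    # cross machines become network messages between workers.
    for v in ("user:123", "page:456", "user:789"):
        print(v, "-> machine", machine_for(v))

The locality optimization Ching mentions amounts to replacing machine_for with an assignment that keeps densely connected vertices on the same machine, cutting that cross-machine message traffic.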

High level view with technical references on how Facebook is optimizing its Apache Giraph engine.

If you are interested in graphs, this is much more of a real world scenario than building “big” graphs out of uniform time slices.

WebScaleSQL

Thursday, March 27th, 2014

WebScaleSQL

From the webpage:

What is WebScaleSQL?

WebScaleSQL is a collaboration among engineers from several companies that face similar challenges in running MySQL at scale, and seek greater performance from a database technology tailored for their needs.

Our goal in launching WebScaleSQL is to enable the scale-oriented members of the MySQL community to work more closely together in order to prioritize the aspects that are most important to us. We aim to create a more integrated system of knowledge-sharing to help companies leverage the great features already found in MySQL 5.6, while building and adding more features that are specific to deployments in large scale environments. In the last few months, engineers from all four companies have contributed code and provided feedback to each other to develop a new, more unified, and more collaborative branch of MySQL.

But as effective as this collaboration has been so far, we know we’re not the only ones who are trying to solve these particular challenges. So we will keep WebScaleSQL open as we go, to encourage others who have the scale and resources to customize MySQL to join in our efforts. And of course we will welcome input from anyone who wants to contribute, regardless of what they are currently working on.

Who is behind WebScaleSQL?

WebScaleSQL currently includes contributions from MySQL engineering teams at Facebook, Google, LinkedIn, and Twitter. Together, we are working to share a common base of code changes to the upstream MySQL branch that we can all use and that will be made available via open source. This collaboration will expand on existing work by the MySQL community, and we will continue to track the upstream branch that is the latest, production-ready release (currently MySQL 5.6).

Correct me if I’m wrong but don’t teams from Facebook, Google, LinkedIn and Twitter know a graph when they see one? 😉

Even people who recognize graphs may need an SQL solution every now and again. Besides, solutions should not drive IT policy.

Requirements and meeting those requirements should drive IT policy. You are less likely to own very popular, expensive and ineffectual solutions when requirements rule. (Even iterative requirements in the agile approach are requirements.)
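
A practical aside before the build details: because WebScaleSQL tracks the upstream, production-ready MySQL branch, existing clients should keep working unchanged. A minimal sketch using the stock MySQL Connector/J (the host, credentials and database are hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public final class WebScaleSqlPing {
  public static void main(String[] args) throws Exception {
    // A WebScaleSQL server speaks the ordinary MySQL wire protocol,
    // so the standard connector applies.
    String url = "jdbc:mysql://db.example.com:3306/test";
    try (Connection conn = DriverManager.getConnection(url, "app", "secret");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT VERSION()")) {
      while (rs.next()) {
        System.out.println("server version: " + rs.getString(1));
      }
    }
  }
}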

A reminder that MySQL/WebScaleSQL compiles from source with:

A working ANSI C++ compiler. GCC 4.2.1 or later, Sun Studio 10 or later, Visual Studio 2008 or later, and many current vendor-supplied compilers are known to work. (INSTALL-SOURCE)

Which makes it a target, sorry, subject for vulnerability analysis with joern.

I first saw this in a post by Derrick Harris, Facebook — with help from Google, LinkedIn, Twitter — releases MySQL built to scale.

Under the Hood: [of RocksDB]

Sunday, November 24th, 2013

Under the Hood: Building and open-sourcing RocksDB by Dhruba Borthakur.

From the post:

Every time one of the 1.2 billion people who use Facebook visits the site, they see a completely unique, dynamically generated home page. There are several different applications powering this experience–and others across the site–that require global, real-time data fetching.

Storing and accessing hundreds of petabytes of data is a huge challenge, and we’re constantly improving and overhauling our tools to make this as fast and efficient as possible. Today, we are open-sourcing RocksDB, an embeddable, persistent key-value store for fast storage that we built and use here at Facebook.

Why build an embedded database?

Applications traditionally access their data via remote procedure calls over a network connection, but that can be slow–especially when we need to power user-facing products in real time. With the advent of flash storage, we are starting to see newer applications that can access data quickly by managing their own dataset on flash instead of accessing data over a network. These new applications are using what we call an embedded database.

There are several reasons for choosing an embedded database. When database requests are frequently served from memory or from very fast flash storage, network latency can slow the query response time. Accessing the network within a data center can take about 50 microseconds, as can fast-flash access latency. This means that accessing data over a network could potentially be twice as slow as an application accessing data locally.

Secondly, we are starting to see servers with an increasing number of cores and with storage-IOPS reaching millions of requests per second. Lock contention and a high number of context switches in traditional database software prevents it from being able to saturate the storage-IOPS. We’re finding we need new database software that is flexible enough to be customized for many of these emerging hardware trends.

Like most of you, I don’t have 1.2 billion people visiting my site. 😉

However, understanding today’s “high-end” solutions will prepare you for tomorrow’s “middle-tier” solutions and the day after tomorrow’s desktop solutions.

A high level overview of RocksDB.
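
To see what “embedded” means in practice, here is a minimal sketch using the RocksJava bindings (the path and keys are mine for illustration): the database is a library call inside your process, with no network round trip.

import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public final class EmbeddedKv {
  public static void main(String[] args) throws RocksDBException {
    RocksDB.loadLibrary(); // load the native library once per process
    try (Options options = new Options().setCreateIfMissing(true);
         RocksDB db = RocksDB.open(options, "/tmp/rocksdb-demo")) {
      // put/get are in-process calls against local flash or memory,
      // not remote procedure calls over a network.
      db.put("user:42".getBytes(), "homepage-state".getBytes());
      byte[] value = db.get("user:42".getBytes());
      System.out.println(new String(value));
    }
  }
}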

Other resources to consider:

RocksDB Facebook page.

RocksDB on Github.


Update: Igor Canadi has posted a proposal to the Facebook page to add the concept of column families to RocksDB (https://github.com/facebook/rocksdb/wiki/Column-Families-proposal). Comments? (Direct comments on that proposal to the RocksDB Facebook page.)

Are You A Facebook Slacker? (Or, “Don’t ‘Like’ Me, Support Me!”)

Sunday, November 10th, 2013

Their title reads: The Nature of Slacktivism: How the Social Observability of an Initial Act of Token Support Affects Subsequent Prosocial Action by Kirk Kristofferson, Katherine White, and John Peloza. (Kirk Kristofferson, Katherine White, and John Peloza, “The Nature of Slacktivism: How the Social Observability of an Initial Act of Token Support Affects Subsequent Prosocial Action,” Journal of Consumer Research, 2013. DOI: 10.1086/674137)

Abstract:

Prior research offers competing predictions regarding whether an initial token display of support for a cause (such as wearing a ribbon, signing a petition, or joining a Facebook group) subsequently leads to increased and otherwise more meaningful contributions to the cause. The present research proposes a conceptual framework elucidating two primary motivations that underlie subsequent helping behavior: a desire to present a positive image to others and a desire to be consistent with one’s own values. Importantly, the socially observable nature (public vs. private) of initial token support is identified as a key moderator that influences when and why token support does or does not lead to meaningful support for the cause. Consumers exhibit greater helping on a subsequent, more meaningful task after providing an initial private (vs. public) display of token support for a cause. Finally, the authors demonstrate how value alignment and connection to the cause moderate the observed effects.

From the introduction:

We define slacktivism as a willingness to perform a relatively costless, token display of support for a social cause, with an accompanying lack of willingness to devote significant effort to enact meaningful change (Davis 2011; Morozov 2009a).

From the section: The Moderating Role of Social Observability: The Public versus Private Nature of Support:

…we anticipate that consumers who make an initial act of token support in public will be no more likely to provide meaningful support than those who engaged in no initial act of support.

Four (4) detailed studies and an extensive review of the literature are offered to support the authors’ conclusions.

The only source that I noticed missing was:

10 Two men went up into the temple to pray; the one a Pharisee, and the other a publican.

11 The Pharisee stood and prayed thus with himself, God, I thank thee, that I am not as other men are, extortioners, unjust, adulterers, or even as this publican.

12 I fast twice in the week, I give tithes of all that I possess.

13 And the publican, standing afar off, would not lift up so much as his eyes unto heaven, but smote upon his breast, saying, God be merciful to me a sinner.

14 I tell you, this man went down to his house justified rather than the other: for every one that exalteth himself shall be abased; and he that humbleth himself shall be exalted.

King James Version, Luke 18: 10-14.

The authors would reverse the roles of the Pharisee and the publican, finding that the Pharisee contributes “meaningful support” while the publican does not.

We contrast token support with meaningful support, which we define as consumer contributions that require a significant cost, effort, or behavior change in ways that make tangible contributions to the cause. Examples of meaningful support include donating money and volunteering time and skills.

If you are trying to attract “meaningful support” for your cause or organization, i.e., avoid slackers, there is much to learn here.

If you are trying to move beyond the “cheap grace” (Bonhoeffer)* of “meaningful support” and towards “meaningful change,” there is much to be learned here as well.

Governments, corporations, ad agencies and even your competitors are manipulating the public understanding of “meaningful support” and “meaningful change,” along with the acceptable means for both.

You can play on their terms and lose, or you can define your own terms and roll the dice.

Questions?


* I know the phrase “cheap grace” from Bonhoeffer, but in running a reference to ground, I saw a statement in Wikipedia that Bonhoeffer learned the phrase from Adam Clayton Powell, Sr. Homiletics has never been a strong interest of mine, but I will try to run down some sources on sermons by Adam Clayton Powell, Sr.

Facebook’s Presto 10X Hive Speed (mostly)

Friday, November 8th, 2013

Facebook open sources its SQL-on-Hadoop engine, and the web rejoices by Derrick Harris.

From the post:

Facebook has open sourced Presto, the interactive SQL-on-Hadoop engine the company first discussed in June. Presto is Facebook’s take on Cloudera’s Impala or Google’s Dremel, and it already has some big-name fans in Dropbox and Airbnb.

Technologically, Presto and other query engines of its ilk can be viewed as faster versions of Hive, the data warehouse framework for Hadoop that Facebook created several years ago. Facebook and many other Hadoop users still rely heavily on Hive for batch-processing jobs such as regular reporting, but there has been a demand for something letting users perform ad hoc, exploratory queries on Hadoop data similar to how they might do them using a massively parallel relational database.

Presto is 10 times faster than Hive for most queries, according to Facebook software engineer Martin Traverso in a blog post detailing today’s news.

I think my headline is the more effective one. 😉

You won’t know anything until you download Presto, read the documentation, etc.

Presto homepage.
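
Once a coordinator is running, an ad hoc query is plain SQL over Presto’s JDBC driver. A minimal sketch (the coordinator host, catalog, schema and table are hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public final class PrestoAdHoc {
  public static void main(String[] args) throws Exception {
    // URL format: jdbc:presto://host:port/catalog/schema
    String url = "jdbc:presto://coordinator.example.com:8080/hive/default";
    try (Connection conn = DriverManager.getConnection(url, "analyst", null);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT regionkey, count(*) FROM nation GROUP BY regionkey")) {
      while (rs.next()) {
        System.out.println(rs.getLong(1) + "\t" + rs.getLong(2));
      }
    }
  }
}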

The first job is to get your attention; then you have to get the information necessary to be informed.

Derrick’s post points to other SQL-on-Hadoop options as well; interesting times are ahead!

Scaling Apache Giraph to a trillion edges

Friday, September 13th, 2013

Scaling Apache Giraph to a trillion edges by Avery Ching.

From the post:

Graph structures are ubiquitous: they provide a basic model of entities with connections between them that can represent almost anything. Flight routes connect airports, computers communicate to one another via the Internet, webpages have hypertext links to navigate to other webpages, and so on. Facebook manages a social graph that is composed of people, their friendships, subscriptions, and other connections. Open graph allows application developers to connect objects in their applications with real-world actions (such as user X is listening to song Y).

Analyzing these real world graphs at the scale of hundreds of billions or even a trillion (10^12) edges with available software was impossible last year. We needed a programming framework to express a wide range of graph algorithms in a simple way and scale them to massive datasets. After the improvements described in this article, Apache Giraph provided the solution to our requirements.

In the summer of 2012, we began exploring a diverse set of graph algorithms across many different Facebook products as well as academic literature. We selected a few representative use cases that cut across the problem space with different system bottlenecks and programming complexity. Our diverse use cases and the desired features of the programming framework drove the requirements for our system infrastructure. We required an iterative computing model, graph-based API, and fast access to Facebook data. Based on these requirements, we selected a few promising graph-processing platforms including Apache Hive, GraphLab, and Apache Giraph for evaluation.

For your convenience:

Apache Giraph

Apache Hive

GraphLab

The scale appropriate for you is probably less than a trillion edges, but everybody likes a great scaling story.

This is a great scaling story.
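
For a taste of the programming model behind that story, here is a minimal sketch of the classic Pregel “maximum value” example, assuming Giraph 1.x’s BasicComputation API: every vertex adopts the largest value it has seen and gossips it to its neighbors until nothing changes.

import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;

public class MaxValueComputation extends
    BasicComputation<LongWritable, LongWritable, NullWritable, LongWritable> {

  @Override
  public void compute(Vertex<LongWritable, LongWritable, NullWritable> vertex,
      Iterable<LongWritable> messages) {
    // Adopt the largest value seen so far, from self or neighbors.
    long max = vertex.getValue().get();
    for (LongWritable m : messages) {
      max = Math.max(max, m.get());
    }
    // Propagate only when there is news (or on the first superstep).
    if (max > vertex.getValue().get() || getSuperstep() == 0) {
      vertex.setValue(new LongWritable(max));
      sendMessageToAllEdges(vertex, new LongWritable(max));
    }
    // Sleep until a message arrives; the job ends when all vertices halt.
    vertex.voteToHalt();
  }
}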

Fun with Facebook in Neo4j [Separation from Edward Snowden?]

Sunday, June 23rd, 2013

Fun with Facebook in Neo4j by Rik Van Bruggen.

From the post:

Ever since Facebook promoted its “graph search” methodology, lots of people in our industry have been waking up to the fact that graphs are über-cool. Thanks to the powerful query possibilities, people like Facebook, Twitter, LinkedIn, and let us not forget, Google have been providing us with some of the most amazing technologies. Specifically, the power of the “social network” is tempting many people to get their feet wet, and to start using graph technology. And they should: graphs are fantastic at storing, querying and exploiting social structures, stored in a graph database.

So how would that really work? I am a curious, “want to know” but “not very technical” kind of guy, and I decided to get my hands dirty (again), and try some of this out by storing my own little part of Facebook – in neo4j. Without programming any kind of production-ready system – because I don’t know how – but with enough real world data to make us see what it would be like.

Rik walks you through obtaining data from Facebook, munging it in a spreadsheet and loading it into Neo4j.

Can’t wait for Facebook graph to support degrees of separation from named individuals, like Edward Snowden.

Complete with the intervening people of course.
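
Under the hood, that is just a shortest-path query. A minimal sketch in plain Java of separation, complete with the intervening people, as a breadth-first search over a friendship map (all names hypothetical; in Neo4j itself you would reach for a shortestPath query):

import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public final class Separation {

  // Returns the chain of people linking "from" to "to" (inclusive),
  // or an empty list if they are unconnected. Degrees of separation
  // is then chain.size() - 1.
  static List<String> chain(Map<String, List<String>> friends,
      String from, String to) {
    Map<String, String> parent = new HashMap<>();
    Deque<String> queue = new ArrayDeque<>();
    parent.put(from, from);
    queue.add(from);
    while (!queue.isEmpty()) {
      String person = queue.remove();
      if (person.equals(to)) {
        LinkedList<String> path = new LinkedList<>();
        for (String p = to; !p.equals(from); p = parent.get(p)) {
          path.addFirst(p); // walk the breadcrumbs back to the start
        }
        path.addFirst(from);
        return path;
      }
      for (String f : friends.getOrDefault(person, Collections.emptyList())) {
        if (!parent.containsKey(f)) {
          parent.put(f, person); // remember who introduced whom
          queue.add(f);
        }
      }
    }
    return Collections.emptyList();
  }
}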

What’s privacy compared to a media-driven witch hunt for anyone “connected” to the latest “face” on the TV?

If Facebook does that for Snowden, they should do it for NSA chief Keith Alexander as well.

Presto is Coming!

Sunday, June 9th, 2013

Facebook unveils Presto engine for querying 250 PB data warehouse by Jordan Novet.

From the post:

At a conference for developers at Facebook headquarters on Thursday, engineers working for the social networking giant revealed that it’s using a new homemade query engine called Presto to do fast interactive analysis on its already enormous 250-petabyte-and-growing data warehouse.

More than 850 Facebook employees use Presto every day, scanning 320 TB each day, engineer Martin Traverso said.

“Historically, our data scientists and analysts have relied on Hive for data analysis,” Traverso said. “The problem with Hive is it’s designed for batch processing. We have other tools that are faster than Hive, but they’re either too limited in functionality or too simple to operate against our huge data warehouse. Over the past few months, we’ve been working on Presto to basically fill this gap.”

Facebook created Hive several years ago to give Hadoop some data warehouse and SQL-like capabilities, but it is showing its age in terms of speed because it relies on MapReduce. Scanning over an entire dataset could take many minutes to hours, which isn’t ideal if you’re trying to ask and answer questions in a hurry.

With Presto, however, simple queries can run in a few hundred milliseconds, while more complex ones will run in a few minutes, Traverso said. It runs in memory and never writes to disk, Traverso said.

Traverso goes on to say that Facebook will open source Presto this coming fall.
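
Back-of-the-envelope on those numbers: 320 TB across 850 daily users works out to roughly 375 GB scanned per analyst per day, exactly the kind of load where minutes-to-hours Hive scans hurt and sub-second in-memory scans shine.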

Note that my prior post, Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices, describes a different system that happens to share the name, not Facebook’s query engine.

Bear in mind that getting an answer from 250 PB of data quickly isn’t the same thing as getting a useful answer quickly.