Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

February 17, 2016

Rectal and Other Data

Filed under: R,Visualization — Patrick Durusau @ 3:35 pm

Hadley Wickham has posted neiss:

The neiss package provides access to all data (2009-2014) from the National Electronic Injury Surveillance System, which is a sample of all accidents reported to emergency rooms in the US.

You will recall this is the data set used by Nathan Yau in NSFW: Million to One Shot, Doc, an analysis of rectal injuries.

A lack of features in the data prevents some types of analysis, such as plotting the type of object as a function of weight.

I’m sure there are other patterns (seasonal, perhaps?) that you can derive from the data.
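If you want to poke at the data yourself, here is a minimal sketch in R. It assumes the package installs from Hadley’s GitHub repository and exposes an injuries table with a trmt_date (treatment date) column, as the package README describes:

    # Sketch: load the neiss data and take a first look.
    # Assumes an `injuries` table with a `trmt_date` column, per the README.
    # install.packages("devtools")
    devtools::install_github("hadley/neiss")
    library(neiss)

    # One row per sampled emergency room visit, 2009-2014.
    head(injuries)

    # A first cut at the seasonal question: visits by month.
    table(format(injuries$trmt_date, "%m"))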

Enjoy!

PS: R library.

For What It’s Worth: CIA Releases Declassified Documents to National Archives

Filed under: Government,Government Data — Patrick Durusau @ 3:19 pm

CIA Releases Declassified Documents to National Archives

From the webpage:

Today, CIA released about 750,000 pages of declassified intelligence papers, records, research files and other content which are now accessible through CIA’s Records Search Tool (CREST) at the National Archives in College Park, MD. This release will include nearly 100,000 pages of analytic intelligence publication files, and about 20,000 pages of research and development files from CIA’s Directorate of Science and Technology, among others.

The newly available documents are being released in partnership with the National Geospatial Intelligence Agency (NGA) and are available by accessing CREST at the National Archives. This release continues CIA’s efforts to systematically review and release documents under Executive Order 13526. With this release, the CIA collection of records on the CREST system increases to nearly 13 million declassified pages.

That was posted on 16 February 2016.

Disclaimer: No warranty express or implied is made with regard to the accuracy of the notice quoted above or as to the accuracy of anything you may or may not find in the released documents, if they in fact exist.

I merely report that the quoted material was posted to the CIA website at the location and on the date recited.

27 Reasons to Attend Clojure/West 2016 + Twitter List

Filed under: Clojure,Conferences,Functional Programming — Patrick Durusau @ 3:11 pm

Clojure/West 2016 Speakers

I extracted the speaker list plus twitter accounts where available from the speakers list for Clojure/West 2016.

Now you have twenty-seven reasons to attend! Stack those up against any reasons not to attend.

Register: April 15th-16th, Seattle Marriott Waterfront

April 15th is “tax day” in the United States.

Wouldn’t you rather be having fun with Clojure than grubbing around with smudged and/or lost receipts? I thought so. Register today!

Whether the government picks your pocket one day or the next makes little difference.

File early and attend while the tax hounds try to decide if Thai numerals printed in Braille are a legitimate tax return. 😉

  1. Matthias Felleisen, Types are like the Weather, Type Systems are like Weathermen
  2. Alex Kehayias, Functional Game Engine Design for the Web
  3. Allison Carter, From Fluxus to Functional: A Journey Through Interactive Art
  4. Amie Kuttruff, Deepen and Diversify the Clojure Community with Jr Engineers
  5. Aysylu Greenberg, (+ Loom (years 2))
  6. Bryce Covert, USE lisp WITH game – Making an Adventure Game with Clojure
  7. Christopher Small, Datalog all the way down
  8. Claire Alvis, Creating DSLs – a tale of spec-tacular success and failure
  9. Daniel Higginbotham, Parallel Programming, Fork/Join, and Reducers
  10. Devon Peticolas, One Million Clicks per Minute with Kafka and Clojure
  11. Donevan Dolby, Managing one of the world’s largest Clojure code bases
  12. Gerred Dillon, ClojureScript and Lambda: A Case Study
  13. Ghadi Shayban, Parsing Text with a Virtual Machine
  14. Jack Dubie, Fast full stack testing in om.next
  15. Jonathan Boston, Caleb Phillips, Building a Legal Data Service with Clojure
  16. Katherine Fellows, Anna Pawlicka, ClojureBridge in Practice
  17. Mario Aquino, The Age of Talkies
  18. Michael Drogalis, Inside Onyx
  19. Michał Marczyk, defrecord/deftype in Clojure and ClojureScript
  20. Mikaela Patella, Web Development is Distributed Systems Programming
  21. Nathan Marz, Specter: powerful and simple data structure manipulation
  22. Nathan Sorenson, Hybrid Automata and the Continuous Life
  23. Patrick O’Brien, Braid Chat: Reifying Online Group Conversations
  24. Paula Gearon, Production Rules in Datomic
  25. Peter Schuck, Hash Maps: more room at the bottom
  26. Priyatam Mudivarti, Caching half a billion user transactions
  27. Stuart Sierra, The Joys and Perils of Interactive Development

PS: Be careful how you use the term “weathermen.” The professionally paranoid in government remember a different meaning than the one you may intend. As do some of the rest of us.

February 16, 2016

Katia – rape screening in R

Filed under: Image Processing,Image Recognition,R — Patrick Durusau @ 8:54 pm

Katia – rape screening in R

From the webpage:

It’s Not Enough to Condemn Violence Against Women. We Need to End It.

All 12 innocent female victims above were atrociously killed, sexually assaulted, or registered missing after meeting strangers on mainstream dating, personals, classifieds, or social networking services.

INTRODUCTION TO THE KATIA RAPE SCREEN

Those 12 beautiful faces in the gallery above, are our sisters and daughters. Looking at their pictures is like looking through a tiny pinhole onto an unprecedented rape and domestic violence crisis that is destroying the American family unit.

Verified by science, the KATIA rape screen, coded in the computer programming language, R, can provably stop a woman from ever meeting her attacker.

The technology is named after a RAINN-counseled first degree aggravated rape survivor named Katia.

It is based on the work of a Google engineer from the Reverse Image Search project and a RAINN (Rape, Abuse & Incest National Network) counselor, with a clinical background in mathematical statistics, who has over a period of 15 years compiled a linguistic pattern analysis of the messages that rapists use to lure women online.

Learn more about the science behind Katia.

This project is taking concrete steps to reduce violence against women.

What more is there to say?

Is Failing to Attempt to Replicate, “Just Part of the Whole Science Deal”?

Filed under: Bioinformatics,Replication,Science — Patrick Durusau @ 8:08 pm

Genomeweb posted this summary of Stuart Firestein’s op-ed on failure to replicate:

Failure to replicate experiments is just part of the scientific process, writes Stuart Firestein, author and former chair of the biology department at Columbia University, in the Los Angeles Times. The recent worries over a reproducibility crisis in science are overblown, he adds.

“Science would be in a crisis if it weren’t failing most of the time,” Firestein writes. “Science is full of wrong turns, unconsidered outcomes, omissions and, of course, occasional facts.”

Failures to repeat experiments and the struggle to figure out what went wrong has also fed a number of discoveries, he says. For instance, in 1921, biologist Otto Loewi studied beating hearts from frogs in saline baths, one with the vagus nerve removed and one with it still intact. When the solution from the heart with the nerve still there was added to the other bath, that heart also slowed, suggesting that the nerve secreted a chemical that slowed the contractions.

However, Firestein notes Loewi and other researchers had trouble replicating the results for nearly six years. But that led the researchers to find that seasons can affect physiology and that temperature can affect enzyme function: Loewi’s first experiment was conducted at night and in the winter, while the follow-up ones were done during the day in heated buildings or on warmer days. This, he adds, also contributed to the understanding of how synapses fire, a finding for which Loewi shared the 1936 Nobel Prize.

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

You will need to read Firestein’s comments in full (just part of the scientific process) to appreciate my concerns.

For example, Firestein says:


Absolutely not. Science is doing what it always has done — failing at a reasonable rate and being corrected. Replication should never be 100%. Science works beyond the edge of what is known, using new, complex and untested techniques. It should surprise no one that things occasionally come out wrong, even though everything looks correct at first.

I don’t know, would you say an 85% failure-to-replicate rate is significant? See Drug development: Raise standards for preclinical cancer research, C. Glenn Begley & Lee M. Ellis, Nature 483, 531–533 (29 March 2012), doi:10.1038/483531a. Or over half of psychology studies? See Over half of psychology studies fail reproducibility test. Those are just two studies on replication.

I think we can agree with Firestein that replication isn’t at 100%, but at what level are the attempts to replicate?

From what Firestein says,

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

Systematic attempts at replication (and its failure) should be part and parcel of science.

Except… it obviously isn’t.

If it were, there would have been no earth-shaking announcements that fundamental cancer research experiments could not be replicated.

Failures to replicate would have been spread out over the literature and gradually resolved with better data, better methods, or both.

Failure to replicate is a legitimate part of the scientific method.

Not attempting to replicate, “I won’t look too closely at your results if you don’t look too closely at mine,” isn’t.

There is an ugly word for avoiding looking too closely at your own results or those of others.

Choose Memory over Memorex (sex tape stolen)

Filed under: Security — Patrick Durusau @ 5:56 pm

If you have never seen Ella Fitzgerald in the classic “Is it live, or is it Memorex?” commercial, do take a moment to view the video.

With regard to sex acts between consenting adults, you should choose memory over Memorex or any other type of recording.

Lisa Vaas covers the consequences of recording such acts in Teacher’s sex tape stolen from hacked Dropbox, posted on school site.

Given the option to remember an occasion or to record it, choose human memory.

Thus far, human memory cannot be accessed without you or a live witness.

Breach Fatigue? (Safe for Work)

Filed under: Cybersecurity,Security — Patrick Durusau @ 2:57 pm

Sorry! After my report of Nathan’s Million to One Shot, Doc post, I could not resist titling this post with “Breach Fatigue.”

Sarah Kuranda reports expected lower spending on security with this quote:

Wright said some customers interviewed by Technology Business Research also cited what some are calling “breach fatigue” as a reason behind lower security spending. Year after year of mega breaches have caused massive jumps in reactionary security spending, Wright said companies are now saying, “There’s not much more I can do.” (emphasis added) [Is The Security Spending Party Over?]

“…[M]assive jumps in reactionary security spending…” have benefited the security services/software vendors but not appreciably increased enterprise security. That much is known.

What remains unknown is why companies say:

There’s not much more I can do.

Post this scenario to your nearest business manager/executive:

Assume that all the locks are broken on your new Lexus and it isn’t possible to remove the ignition key:

[Image: 2016 Lexus]

Here are the options enterprises have followed to protect the Lexus:

  1. Surround the Lexus with a chain-link fence, with missing sections. (defective security software)
  2. Surround the Lexus with a chain-link fence, with a gate-lock with the key in it. (defective security software design)
  3. Staff the gate with personnel who can’t recognize authorized users. (poor security training)
  4. Purchase broken/insecure solutions to protect a broken/insecure vehicle. (poor strategy)

No doubt, enterprises can continue to throw money at defective software to protect defective software, with continuing mega-breach results.

To that extent, realizing they are throwing good money after bad is a positive sign. Sort of.

What more can enterprises do? Invest in, and require, secure software. It costs more, but layering broken software on top of broken software has failed.

Why not try something more plausible?

NSFW: Million to One Shot, Doc

Filed under: Humor,Visualization — Patrick Durusau @ 1:56 pm

Million to One Shot, Doc – All the things that get stuck, by Nathan Yau.

Nathan downloaded emergency room data from 2009 to 2014 and filtered the data to reveal:

…an estimated 17,968 emergency room visits for foreign bodies stuck in a rectum. About three-quarters of patients were male, and as you might expect, many of the foreign bodies were sex toys. But, perhaps unexpectedly, about 60 percent of those foreign bodies were not sex toys.

Nathan has created a click-through visualization of objects and ER doctor comments.

I offer this as a counter-example to the claim that all business data has value. 😉

You probably should forward the link to your home computer.

Enjoy!

PS: Is anyone working on a cross-cultural comparison on such data?

February 15, 2016

networkD3: D3 JavaScript Network Graphs from R

Filed under: D3,Graphs,Javascript,Networks,R — Patrick Durusau @ 5:41 pm

networkD3: D3 JavaScript Network Graphs from R by Christopher Gandrud, JJ Allaire, & Kent Russell.

From the post:

This is a port of Christopher Gandrud’s R package d3Network for creating D3 network graphs to the htmlwidgets framework. The htmlwidgets framework greatly simplifies the package’s syntax for exporting the graphs, improves integration with RStudio’s Viewer Pane, RMarkdown, and Shiny web apps. See below for examples.

It currently supports three types of network graphs: force directed networks (simpleNetwork and forceNetwork), Sankey diagrams (sankeyNetwork), and Reingold–Tilford tree diagrams (treeNetwork).

I haven’t compared this to GraphViz but the Sankey diagram option is impressive!
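A minimal sketch of the package in action, patterned on the sort of example its documentation gives (the edge list here is made up):

    # Sketch: an interactive force-directed graph from a two-column edge list.
    library(networkD3)

    # Made-up edges: who links to whom.
    edges <- data.frame(
      src    = c("A", "A", "B", "C"),
      target = c("B", "C", "C", "D")
    )

    # Renders in the RStudio Viewer pane or a browser.
    simpleNetwork(edges)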

BMG Seeks to Violate Privacy Rights – Cox Refuses to Aid and Abet

Filed under: Cybersecurity,Intellectual Property (IP),Privacy,Security — Patrick Durusau @ 4:58 pm

Cox Refuses to Spy on Subscribers to Catch Pirates by Ernesto Van der Sar.

From the post:

Last December a Virginia federal jury ruled that Internet provider Cox Communications was responsible for the copyright infringements of its subscribers.

The ISP was found guilty of willful contributory copyright infringement and must pay music publisher BMG Rights Management $25 million in damages.

The verdict was a massive victory for the music company and a disaster for Cox, but the case is not closed yet.

A few weeks ago BMG asked the court to issue a permanent injunction against Cox Communications, requiring the Internet provider to terminate the accounts of pirating subscribers and share their details with the copyright holder.

In addition BMG wants the Internet provider to take further action to prevent infringements on its network. While the company remained vague on the specifics, it mentioned the option of using invasive deep packet inspection technology.

Last Friday, Cox filed a reply pointing out why BMG’s demands go too far, rejecting the suggestion of broad spying and account termination without due process.

“To the extent the injunction requires either termination or surveillance, it imposes undue hardships on Cox, both because the order is vague and because it imposes disproportionate, intrusive, and punitive measures against households and businesses with no due process,” Cox writes (pdf).

Read the rest of Ernesto’s post for sure but here’s a quick summary:

Cox.com is spending money to protect your privacy.

I don’t live in a Cox service area but if you do, sign up with Cox and say their opposition to BMG is driving your new subscription. Positive support always rings louder than protesters with signs and litter.

BMG.com is spending money to violate your privacy.

BMG is a subsidiary of Bertelsmann, which claims 112,037 employees.

I wonder how many of those employees have signed off on the overreaching and abusive positions of BMG?

Perhaps members of the public oppressed by BMG and/or Bertelsmann should seek them out to reason with them.

Bearing in mind that “rights” depend upon rules you choose to govern your discussions/actions.

People NOT Technology Produce Data ROI

Filed under: BigData,Data,Data Science,Data Silos — Patrick Durusau @ 4:00 pm

Too many tools… not enough carpenters! by Nicholas Hartman.

From the webpage:

Don’t let your enterprise make the expensive mistake of thinking that buying tons of proprietary tools will solve your data analytics challenges.

tl;dr = The enterprise needs to invest in core data science skills, not proprietary tools.

Most of the world’s largest corporations are flush with data, but frequently still struggle to achieve the vast performance increases promised by the hype around so called “big data.” It’s not that the excitement around the potential of harvesting all that data was unwarranted, but rather these companies are finding that translating data into information and ultimately tangible value can be hard… really hard.

In your typical new tech-based startup the entire computing ecosystem was likely built from day one around the need to generate, store, analyze and create value from data. That ecosystem was also likely backed from day one with a team of qualified data scientists. Such ecosystems spawned a wave of new data science technologies that have since been productized into tools for sale. Backed by mind-blowingly large sums of VC cash many of these tools have set their eyes on the large enterprise market. A nice landscape of such tools was recently prepared by Matt Turck of FirstMark Capital (host of Data Driven NYC, one of the best data science meetups around).

Consumers stopped paying money for software a long time ago (they now mostly let the advertisers pay for the product). If you want to make serious money in pure software these days you have to sell to the enterprise. Large corporations still spend billions and billions every year on software and data science is one of the hottest areas in tech right now, so selling software for crunching data should be a no-brainer! Not so fast.

The problem is, the enterprise data environment is often nothing like that found within your typical 3-year-old startup. Data can be strewn across hundreds or thousands of systems that don’t talk to each other. Devices like mainframes are still common. Vast quantities of data are generated and stored within these companies, but until recently nobody ever really envisioned ever accessing — let alone analyzing — these archived records. Often, it’s not initially even clear how the all data generated by these systems directly relates to a large blue chip’s core business operations. It does, but a lack of in-house data scientists means that nobody is entirely even sure what data is really there or how it can be leveraged.

I would delete “proprietary” from the above because non-proprietary tools create data problems just as easily.

Thus I would re-write the second quote as:

Tools won’t replace skilled talent, and skilled talent doesn’t typically need many particular tools.

I substituted “particular” tools to avoid religious questions about particular non-proprietary tools.

Understanding data, recognizing where data integration is profitable and where it is a dead loss, creating tests to measure potential ROI, etc., are all tasks of a human data analyst and not any proprietary or non-proprietary tool.

That all enterprise data has some intrinsic value that can be extracted if it were only accessible is an article of religious faith, not business ROI.

If you want business ROI from data, start with human analysts and not the latest buzzwords in technological tools.

Twitter Suspension Tracker

Filed under: Censorship,Tweets,Twitter — Patrick Durusau @ 2:36 pm

Twitter Suspension Tracker by Lee Johnstone.

From the about page:

This site (Twitter Suspension Monitor) was created to do one purpose, log and track suspended twitter accounts.

The system periodically checks marked suspended accounts for possible reactivation and remarks them accordingly. This allows the system to start tracking how many hours, days or even weeks and months a users twitter account got suspended for. Ontop of site submitted entrys Twitter Suspension Monitor also scrapes data directly from twitter in hope to find many more suspended accounts.

Not transparency but some reflected light on the Twitter account suspension process.

Tweets from suspended accounts disappear.

Stalin would have felt right at home with Twitter’s methods if not its ideology.

Here’s a photo of Stalin for the webpage of the Twitter Trust & Safety Council:

[Image: Stalin, 1943]

Members of the Twitter Trust & Safety Council should use it as their twitter profile image. Enable all of us to identify Twitter censorship collaborators.

However urgent the current hysteria, censors are judged only one way by history.

Is that what you want for your legacy? Twitter, same question.

Automating Family/Party Feud

Filed under: Natural Language Processing,Reddit,Vectors — Patrick Durusau @ 11:19 am

Semantic Analysis of the Reddit Hivemind

From the webpage:

Our neural network read every comment posted to Reddit in 2015, and built a semantic map using word2vec and spaCy.

Try searching for a phrase that’s more than the sum of its parts to see what the model thinks it means. Try your favourite band, slang words, technical things, or something totally random.

Lynn Cherny suggested in a tweet to use “actually.”

If you are interested in the background on this tool, see: Sense2vec with spaCy and Gensim by Matthew Honnibal.

From the post:

If you were doing text analytics in 2015, you were probably using word2vec. Sense2vec (Trask et al., 2015) is a new twist on word2vec that lets you learn more interesting, detailed and context-sensitive word vectors. This post motivates the idea, explains our implementation, and comes with an interactive demo that we’ve found surprisingly addictive.

Polysemy: the problem with word2vec

When humans write dictionaries and thesauruses, we define concepts in relation to other concepts. For automatic natural language processing, it’s often more effective to use dictionaries that define concepts in terms of their usage statistics. The word2vec family of models are the most popular way of creating these dictionaries. Given a large sample of text, word2vec gives you a dictionary where each definition is just a row of, say, 300 floating-point numbers. To find out whether two entries in the dictionary are similar, you ask how similar their definitions are – a well-defined mathematical operation.
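That “well-defined mathematical operation” is usually cosine similarity. A minimal sketch in R, with random vectors standing in for real word2vec rows:

    # Cosine similarity between two "word vectors".
    cosine_similarity <- function(a, b) {
      sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
    }

    set.seed(42)
    v1 <- rnorm(300)                   # stand-in for one dictionary entry
    v2 <- 0.8 * v1 + 0.2 * rnorm(300)  # an entry with correlated "usage"

    cosine_similarity(v1, v2)  # near 1 = similar; near 0 = unrelated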

Certain to be a hit at technical conferences and parties.

SGML wasn’t mentioned even once during 2015 in Reddit Comments.

Try some of your favorite words and phrases.

Enjoy!

February 14, 2016

More Bad News For EC Brain Project Wood Pigeons

Filed under: Algorithms,EU,Machine Learning,Neural Networks,Neuroinformatics — Patrick Durusau @ 9:03 pm

I heard the story of how the magpie tried to instruct other birds, particularly the wood pigeon, on how to build nests in a different form, but the lesson was much the same.

The EC Brain project reminds me of the wood pigeon hearing “…take two sticks…” and running off to build its nest.

With no understanding of the human brain, the EC set out to build one, on a ten year deadline.

Byron Spice’s report, Project Aims to Reverse-engineer Brain Algorithms, Make Computers Learn Like Humans, casts further doubt on that project:

Carnegie Mellon University is embarking on a five-year, $12 million research effort to reverse-engineer the brain, seeking to unlock the secrets of neural circuitry and the brain’s learning methods. Researchers will use these insights to make computers think more like humans.

The research project, led by Tai Sing Lee, professor in the Computer Science Department and the Center for the Neural Basis of Cognition (CNBC), is funded by the Intelligence Advanced Research Projects Activity (IARPA) through its Machine Intelligence from Cortical Networks (MICrONS) research program. MICrONS is advancing President Barack Obama’s BRAIN Initiative to revolutionize the understanding of the human brain.

“MICrONS is similar in design and scope to the Human Genome Project, which first sequenced and mapped all human genes,” Lee said. “Its impact will likely be long-lasting and promises to be a game changer in neuroscience and artificial intelligence.”

Artificial neural nets process information in one direction, from input nodes to output nodes. But the brain likely works in quite a different way. Neurons in the brain are highly interconnected, suggesting possible feedback loops at each processing step. What these connections are doing computationally is a mystery; solving that mystery could enable the design of more capable neural nets.

My goodness! Unknown loops in algorithms?

The Carnegie Mellon project is exploring potential algorithms, not trying to engineer the unknown.

If the EC had titled its project the Graduate Assistant and Hospitality Industry Support Project, one could object to the use of funds for travel junkets but it would otherwise be intellectually honest.

February 13, 2016

You Can Confirm A Gravity Wave!

Filed under: Physics,Python,Science,Signal Processing,Subject Identity,Subject Recognition — Patrick Durusau @ 5:35 pm

Unless you have been unconscious since last Wednesday, you have heard about the confirmation of Einstein’s 1916 prediction of gravitational waves.

A very incomplete list of popular reports includes:

Einstein, A Hunch And Decades Of Work: How Scientists Found Gravitational Waves (NPR)

Einstein’s gravitational waves ‘seen’ from black holes (BBC)

Gravitational Waves Detected, Confirming Einstein’s Theory (NYT)

Gravitational waves: breakthrough discovery after a century of expectation (Guardian)

For the full monty, see the LIGO Scientific Collaboration itself.

Which brings us to the iPython notebook with the gravitational wave discovery data: Signal Processing with GW150914 Open Data

From the post:

Welcome! This ipython notebook (or associated python script GW150914_tutorial.py ) will go through some typical signal processing tasks on strain time-series data associated with the LIGO GW150914 data release from the LIGO Open Science Center (LOSC):

To begin, download the ipython notebook, readligo.py, and the data files listed below, into a directory / folder, then run it. Or you can run the python script GW150914_tutorial.py. You will need the python packages: numpy, scipy, matplotlib, h5py.

On Windows, or if you prefer, you can use a python development environment such as Anaconda (https://www.continuum.io/why-anaconda) or Enthought Canopy (https://www.enthought.com/products/canopy/).

Questions, comments, suggestions, corrections, etc: email losc@ligo.org

v20160208b

Unlike the toadies at the New England Journal of Medicine (Parasitic Re-use of Data? Institutionalizing Toadyism, Addressing The Concerns Of The Selfish), the scientists who have labored for decades on the gravitational wave question are giving their data away for free!

Not only giving the data away, but striving to help others learn to use it!

Beyond simply “doing the right thing,” and setting an example for other scientists, this is a great opportunity to learn more about signal processing.

Signal processing, when you stop to think about it, is an important method of “subject identification” in a large number of domains.
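The tutorial itself is in Python (numpy, scipy, matplotlib, h5py), but its core move, bandpassing the strain around the chirp band, translates to a few lines of R. A sketch using the signal package, with random numbers standing in for the LOSC strain vector:

    # Sketch: zero-phase bandpass of a strain time series.
    # Stand-in data; substitute the strain read from the LOSC HDF5 files.
    library(signal)

    fs <- 4096                # sample rate of the tutorial data
    strain <- rnorm(16 * fs)  # 16 seconds of stand-in "strain"

    # 4th-order Butterworth bandpass, roughly 50-250 Hz.
    bp <- butter(4, c(50, 250) / (fs / 2), type = "pass")
    strain_bp <- filtfilt(bp, strain)

    plot(seq_along(strain_bp) / fs, strain_bp, type = "l",
         xlab = "time (s)", ylab = "bandpassed strain")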

Detecting a gravity wave is beyond your personal means, but with the data freely available, further analysis is a matter of interest and perseverance.

Clojure for the Brave and True (Google Group)

Filed under: Clojure,Functional Programming,Programming — Patrick Durusau @ 5:02 pm

Clojure for the Brave and True (Google Group) by Daniel Higginbotham.

First there was the website: Clojure for the Brave and True.

Then there was the book: Clojure for the Brave and True.

But before the book there was Daniel’s twitter account: @nonrecursive.

There’s no truth to the rumor of a free print trade publication with the title: Clojure for the Brave and True so please direct your questions and answers to:

Clojure for the Brave and True (Google Group)

Enjoy!

‘You Were There!’ Historical Evidence Of Participation

Filed under: History,Verification,Video — Patrick Durusau @ 3:53 pm

Free: British Pathé Puts Over 85,000 Historical Films on YouTube by Jonathan Crow.

From the post:

British Pathé was one of the leading producers of newsreels and documentaries during the 20th Century. This week, the company, now an archive, is turning over its entire collection — over 85,000 historical films – to YouTube.

The archive — which spans from 1896 to 1976 – is a goldmine of footage, containing movies of some of the most important moments of the last 100 years. It’s a treasure trove for film buffs, culture nerds and history mavens everywhere. In Pathé’s playlist “A Day That Shook the World,” which traces an Anglo-centric history of the 20th Century, you will find clips of the Wright Brothers’ first flight, the bombing of Hiroshima and Neil Armstrong’s walk on the moon, alongside footage of Queen Victoria’s funeral and Roger Bannister’s 4-minute mile. There’s, of course, footage of the dramatic Hindenburg crash and Lindbergh’s daring cross-Atlantic flight. And then you can see King Edward VIII abdicating the throne in 1936, Hitler’s first speech upon becoming the German Chancellor in 1933 and the eventual Pearl Harbor attack in December 1941 (above).

But the really intriguing part of the archive is seeing all the ephemera from the 20th Century, the stuff that really makes the past feel like a foreign country – the weird hairstyles, the way a city street looked, the breathtakingly casual sexism and racism. There’s a rush in seeing history come alive. Case in point, this documentary from 1967 about the wonders to be found in a surprisingly monochrome Virginia.

A treasure trove of over 85,000 historical films!

With modern face recognition technology, imagine mining these films and matching faces up against other photographic archives.

Rather than seeing George Wallace, for example, as a single nasty piece of work during the 1960s, we may identify the followers of such “leaders.”

Those who would discriminate on the basis of race, gender, religion, sexual orientation, ethnic origin, language, etc. are empowered by those of similar views.

One use of this historical archive would be to “out” the followers of such bigots.

To protect “former” fascist supporters on the International Olympic Committee, the EU will protest any search engine that reports such results.

You should judge the IOC by their supporters as well. (Not the athletes, but the IOC.)

Valentine’s Day Hearts

Filed under: Graphics,TeX/LaTeX — Patrick Durusau @ 8:42 am

If you have an appropriate other to send Valentine’s cards, greetings, etc., consider:

Can we make a love heart with LaTeX?
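For reference, one classic parametric heart (the equations are standard; TikZ’s plot command can trace them directly):

    x(t) = 16\sin^3 t, \qquad y(t) = 13\cos t - 5\cos 2t - 2\cos 3t - \cos 4t, \qquad t \in [0, 2\pi]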

A few of the images you can customize:

[Image: heart with text]

[Image: birthday heart]

[Image: 3D heart]

For searches like “valentines day hearts TeX” and “valentines day hearts LaTeX,” you really wish that Google was less “helpful.”

As you know, TeX “corrects” to text and LaTeX, well, you know how that is corrected. 😉

Even if you convince Google that you really meant “TeX,” the returns remain mostly garbage.

Here is a search that returns 74 “hits” that Google dedupes down to 18 (most of which are dupes):

valentine heart site:tex.stackexchange.com

But 18 “hits” are manageable:

drawing water droplets with tikz mentions Example: Valentine heart at TeXample.net.

Then you will find 13 “hits” that include this sentence:

We have questions about Christmas trees and Hearts for Valentines but we have no questions that specialize in Halloween or Dia de los Muertos art.

Why Google doesn’t dedupe those isn’t known.

I tried several of the better known TeX/LaTeX sites with “valentine” and the site name. Nothing like a comprehensive survey, but there were several zero-result searches.

Is it the case that the TeX/LaTeX communities don’t have much interest in Valentine heart drawing? 😉

You will fare even worse if you search for heart limited to the domain processing.org.

On the other hand, an “SVG and valentine” search fares fairly well.

Here’s one from Wiki Commons:

[Image: Love Heart SVG from Wiki Commons]

Credit your sources (discreetly) on any artwork you reproduce.

Enjoy!

PS: Now all I have to do is corral an old inkjet color printer into working as a local printer, pray the color cartridge hasn’t dried up, etc. Happy Valentine’s Day!

February 12, 2016

Manhandled

Filed under: #gamergate,Ethics — Patrick Durusau @ 8:50 pm

Manhandled by Robert C. Martin.

From the post:

Warning: Possible sexual abuse triggers.

One of my regular bike-riding podcasts is Astronomy Cast, by Dr. Pamela Gay and Fraser Cain. Indeed, if you go to cleancodeproject.com you’ll see that Astronomy Cast is one of the charities on my list of favorites. Make a contribution and I will send you a green Clean Code wristband, or coffee cup, or sweatshirt. If you listen to Astronomy Cast you’ll also find that I am a sponsor.

This podcast is always about science; and the science content is quite good. It’s techie. It’s geeky. It’s right up my alley. I’ve listened to almost every one of the 399 episodes. If you like science — especially science about space and astronomy, this is a great resource.

But yesterday was different. Yesterday was episode 399; and it was not about science at all. It was entitled: Women in Science; and it was about — sexual harassment.

Not the big kind that gets reported. Not the notorious kind that gets people fired. Not that kind — though there’s enough of that to go around. No, this was about commonplace, everyday, normal sexual harassment.

Honestly, I didn’t know there was such a thing. I’ve always thought that sexual harassment was anomalous behavior perpetrated by a few disgusting, arrogant men in positions of power. It never occurred to me that sexual harassment was an everyday, commonplace, run-of-the-mill, what-else-is-new occurrence. But I listened, aghast, as I heard Dr. Gay recount tales of it. Tales of the kind of sexual harassment that women in Science regularly encounter; and have simply come to expect as a normal fact of life.

You need to read Bob’s post in full but in particular his concluding advice:

  • You never lay your hands on someone with sexual intent without their explicit permission. It does not matter how drunk you are. It does not matter how drunk they are. You never, ever manhandle someone without their very explicit consent. And if they work for you, or if you have power over them, then you must never make the advance, and must never accept the consent.
  • What’s more: if you see harassment in progress, or even something you suspect is harassment, you intervene! You stop it! Even if it means you’ll lose a friend, or your job, you stop it!

Bob makes those points as a matter of “professionalism” for programmers, but being considerate of others is part and parcel of being a decent human being.

Overlay Journals – Community-Based Peer Review?

Filed under: Open Access,Peer Review,Publishing — Patrick Durusau @ 8:31 pm

New Journals Piggyback on arXiv by Emily Conover.

From the post:

A non-traditional style of scientific publishing is gaining ground, with new journals popping up in recent months. The journals piggyback on the arXiv or other scientific repositories and apply peer review. A link to the accepted paper on the journal’s website sends readers to the paper on the repository.

Proponents hope to provide inexpensive open access publication and streamline the peer review process. To save money, such “overlay” journals typically do away with some of the services traditional publishers provide, for example typesetting and copyediting.

Not everyone is convinced. Questions remain about the scalability of overlay journals, and whether they will catch on — or whether scientists will demand the stamp of approval (and accompanying prestige) that the established, traditional journals provide.

The idea is by no means new — proposals for journals interfacing with online archives appeared as far back as the 1990s, and a few such journals are established in mathematics and computer science. But now, say proponents, it’s an idea whose time has come.

The newest such journal is the Open Journal of Astrophysics, which began accepting submissions on December 22. Editor in Chief Peter Coles of the University of Sussex says the idea came to him several years ago in a meeting about the cost of open access journals. “They were talking about charging thousands of pounds for making articles open access,” Coles says, and he thought, “I never consult journals now; I get all my papers from the arXiv.” By adding a front end onto arXiv to provide peer review, Coles says, “We can dispense with the whole paraphernalia with traditional journals.”

Authors first submit their papers to arXiv, and then input the appropriate arXiv ID on the journal’s website to indicate that they would like their paper reviewed. The journal follows a standard peer review process, with anonymous referees whose comments remain private.

When an article is accepted, a link appears on the journal’s website and the article is issued a digital object identifier (DOI). The entire process is free for authors and readers. As APS News went to press, Coles hoped to publish the first batch of half-dozen papers at the end of January.

My Archive for the ‘Peer Review’ Category has only a few of the high profile failures of peer review over the last five years.

You are probably familiar with at least twice as many reports on the brokenness of peer review as I have covered in this blog.

If traditional peer review is a known failure, why replicate it even for overlay journals?

Why not ask the full set of peers in a discipline, that is, the readers of articles posted in public repositories?

If a book or journal article goes uncited, isn’t that evidence that it did NOT advance the discipline in a way meaningful to its peers?

What other evidence would you have that it did advance the discipline? The opinions of friends of the editor? That seems too weak to even suggest.

Citation analysis isn’t free from issues (see Are 90% of academic papers really never cited? Searching citations about academic citations reveals the good, the bad and the ugly), but it has the advantage of drawing on the entire pool of talent that comprises a discipline.

Moreover, peer review would not be limited to a one-time judgment by traditional peer reviewers but would rest on how a monograph or article fits into the intellectual development of the discipline as a whole.

Which is more persuasive: that editors and reviewers at Science or Nature accepted a paper, or that in the ten years following publication the article was cited by every other major study in the field?

Citation analysis obviates the overhead costs that are raised about organizing peer review on a massive scale. Why organize peer review at all?

Peers are going to read and cite good literature and, more likely than not, skip the bad. Unless you need to create positions for gatekeepers and other barnacles on the profession, opt for citation-based peer review built on open repositories.

I’m betting on the communities that silently vet papers and books in spite of the formalized and highly suspect mechanisms for peer review.

Overlay journals could publish preliminary lists of articles that are of interest in particular disciplines and as community-based peer review progresses, they can publish “best of…” series as the community further filters the publications.

Community-based peer review is already operating in your discipline. Why not call it out and benefit from it?

Tufte in R

Filed under: R,Visualization — Patrick Durusau @ 7:41 pm

Tufte in R by Lukasz Piwek.

From the post:

The idea behind Tufte in R is to use R – the most powerful open-source statistical programming language – to replicate excellent visualisation practices developed by Edward Tufte. It’s not a novel approach – there are plenty of excellent R functions and related packages wrote by people who have much more expertise in programming than myself. I simply collect those resources in one place in an accessible and replicable format, adding a few bits of my own coding discoveries.

Piwek says his idea isn’t novel but I am sure this will be of interest to both R and Tufte fans!
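For a taste of the style, here is a minimal base R sketch of one Tufte idea, the range-frame, where each axis spans only the range of the data:

    # Tufte-style "range-frame" scatterplot in base R.
    x <- mtcars$wt
    y <- mtcars$mpg

    plot(x, y, pch = 16, axes = FALSE,
         xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
    axis(1, at = round(range(x), 1))  # axis drawn only over the data
    axis(2, at = round(range(y), 1))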

Is anyone else working through the Tufte volumes in R or Processing?

Those would be great projects to have bookmarked.

Google Embeds 2D & 3D Plots on Search!

Filed under: Graphs,Mathematics — Patrick Durusau @ 5:17 pm

Jake Vanderplas tweeted:

Whoah… @google now embeds interactive 2D & 3D plots when you search for a function! https://goo.gl/KOGdBq

Seeing is believing (with controls no less):

sin(x)+cos(y)

[Image: 3D plot of sin(x)+cos(y)]

or, sin(x+y)

[Image: 3D plot of sin(x+y)]

In case you want to know more:

Go to: Calculator & unit converter and select How to graph equations and Geometry Calculator.

If your browser supports WebGL, Google will render 3d graphs.
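If you want the same surface offline, a quick base R sketch:

    # Reproduce sin(x)+cos(y) as a static 3D surface.
    x <- seq(-2 * pi, 2 * pi, length.out = 100)
    y <- x
    z <- outer(x, y, function(x, y) sin(x) + cos(y))

    persp(x, y, z, theta = 30, phi = 30, col = "lightblue",
          xlab = "x", ylab = "y", zlab = "sin(x) + cos(y)")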

What functions have you used Google to render?

Brick an iOS Device with Date Setting (local or remote)

Filed under: Cybersecurity,Security — Patrick Durusau @ 4:33 pm

iOS bug warning: Setting this date on your iPhone or iPad will kill your device permanently by Justin Ferris.

From the post:

No one is quite sure yet why this happens, and Apple is still looking into it. However, the best guess is that iOS sees the date January 1, 1970, as either zero or a negative number, and that causes some or all of the iOS functions that require a date to crash.

Now, you might be thinking this isn’t a big deal, because you’d never set your gadget to this date. And it actually is a long process to do it. However, maybe you have a friend who’s a prankster or an ex-friend with a grudge that has access to your gadget. Or it could be done remotely, in the right circumstances.
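The “zero or a negative number” guess lines up with how Unix time works: seconds counted from midnight UTC, January 1, 1970. A quick illustration in R:

    # January 1, 1970 is the Unix epoch: zero seconds in UTC, and
    # negative in any time zone ahead of UTC.
    as.numeric(as.POSIXct("1970-01-01", tz = "UTC"))        # 0
    as.numeric(as.POSIXct("1970-01-01", tz = "Asia/Tokyo")) # -32400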

Not all iOS devices are affected; see http://www.komando.com/happening-now/347426/ios-date-bug-kills-iphone-and-ipad for the models at risk.

Imagine that, bricking an iOS device with a date setting.

What will Apple think of next?

It may be the case that no one could have guessed this would be an issue.

However, a January 1, 1970 date entry that bricks any device sold after 12 February 2016 should be treated as a case of strict liability against the manufacturer. At a minimum.

XSLT 3.0 Workshop – #XMLPrague

Filed under: XSLT — Patrick Durusau @ 10:56 am

XSLT 3.0 Workshop – submit your questions in advance.

Apologies for the short notice but I saw a tweet by Abel Braaksma reminding everyone of the XSLT 3.0 workshop tomorrow at #XMLPrague and the need to submit questions in advance!

What questions do you want to ask?

February 11, 2016

Designing with Data (Clojure)

Filed under: Clojure,Design,Functional Programming — Patrick Durusau @ 9:00 pm

Designing with Data by Michael Drogalis.

Slides from keynote at Clojure Remote 2016.

Without the video of the presentation, the usual difficulties of using slides in isolation obtain.

On the plus side, you have to work harder for the information and that may enhance your retention/comprehension.

Enjoy!

The Hitchhiker’s Guide to the Galaxy Game – 30th Anniversary Edition

Filed under: Games,Humor — Patrick Durusau @ 8:44 pm

The Hitchhiker’s Guide to the Galaxy Game – 30th Anniversary Edition

From the webpage:

A word of warning

The game will kill you frequently. It’s a bit mean like that.

If in doubt, before you make a move, please save your game by typing “Save” then enter. You can then restore your game by typing “Restore” then enter. This should make it slightly less annoying getting killed all the time as you can go back to where you were before it happened.

You’ll need to be signed in for this to work. You can sign in or register by clicking the BBCiD icon next to the BBC logo in the top navigation bar. Signing in will also allow you to tweet about your achievements, and to add a display name so you can get onto the high score tables.

Take fair warning: you can lose hours, if not days, playing this game.

The graphics may help you orient yourself in the various locations. They were missing in the original game.

If you maintain focus on the screen, you can use your keyboard for data entry.

Graphics are way better now, but how does the game play compare to current games?

Enjoy!

More details:

About the game

Game Technical FAQ

How to play

Legislative Data Demo Day [24 February 2016]

Filed under: Government,Law,Law - Sources — Patrick Durusau @ 8:18 pm

Legislative Data Demo Day Hosted by Rep. Seth Moulton and Rep. David Brat.

From the post:

February 24, 2016, Washington, D.C. 4:00pm – 5:00pm, location TBD

Congress is poised to transform its legislative information from outdated documents into open, searchable data. If the House and Senate adopted a consistent data format for all bills, amendments, passed laws, and legal compilations, then new software could bring better transparency and more efficient lawmaking. The bipartisan Statutes at Large Modernization Act, introduced by Reps. Brat and Moulton, takes a giant step toward a data-driven future by setting up a structured data format for the Statutes at Large. Together with similar reforms for other legislative materials, the Statutes at Large Modernization Act will enable automatic redlining between bills and the laws they amend; electronic crosswalks from appropriations to the final disbursement of taxpayer funds; and cheaper, easier legal research.

At the Legislative Data Demo Day, Reps. Moulton and Brat will preview demonstrations of the technologies that can modernize laws and lawmaking – if Congress embraces the transformation from documents into data.

Heads up for what could be a very good event!

The event is free but it looks like physical attendance is required.

Sci-Hub Tip: Converting Paywall DOIs to Public Access

Filed under: Open Access,Open Data,Publishing — Patrick Durusau @ 7:46 pm

In a tweet Jon Tenn@nt points out that:

Reminder: add “.sci-hub.io” after the .com in the URL of pretty much any paywalled paper to gain instant free access.

BTW, I tested Jon’s advice with:

http://dx.doi.org/10.****/*******

re-cast as:

http://dx.doi.org.sci-hub.io/10.****/*******

And it works!

With a little scripting, you can convert your paywall DOIs into public access with sci-hub.io.
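A sketch of that scripting in R (the DOI below is a made-up placeholder):

    # Rewrite a DOI resolver URL to its sci-hub.io equivalent by
    # splicing ".sci-hub.io" onto the host name.
    to_scihub <- function(url) {
      sub("^(https?://[^/]+)(/.*)$", "\\1.sci-hub.io\\2", url)
    }

    to_scihub("http://dx.doi.org/10.1000/example")
    # "http://dx.doi.org.sci-hub.io/10.1000/example"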

This “worked for me” so if you encounter issues, please ping me so I can update this post.

Happy reading!

UK Parliament Reports on the Draft Investigatory Powers Bill

Filed under: Cybersecurity,Government,Privacy,Security — Patrick Durusau @ 7:32 pm

I have stumbled on several “news” reports about the Investigatory Powers Bill in the UK.

Reports from committees in Parliament have started appearing, but are those reports linked in breathless accounts of the horrors of the Investigatory Powers bill?

You already know the answer to that question!

I did find UK surveillance bill condemned by a Parliamentary committee, for the third time by Cory Doctorow, which pointed to the Joint Select Committee recommendations for changes in the IP Bill.

For two other reports, Cory relied on reporting at Wired.co.uk that quoted many sources but failed to link to the reports themselves.

To get you started with the existing primary criticisms of the Investigatory Powers Bill:

There was a myth the Internet (later the WWW) would provide greater access to information, along the lines of the Memex.

Deep information is out there and when you find it, please insert a link to it.

You and everyone who reads your paper, post, tweet, etc. will be better off for it.

Who Do You Love? (Visualizing Relationships/Associations)

Filed under: Associations,Visualization — Patrick Durusau @ 2:41 pm

This Chart Shows Who Marries CEOs, Doctors, Chefs and Janitors by Adam Pearce and Dorothy Gambrell.

From the post:

When it comes to falling in love, it’s not just fate that brings people together—sometimes it’s their jobs. We scanned data from the U.S. Census Bureau’s 2014 American Community Survey—which covers 3.5 million households—to find out how people are pairing up. Some of the matches seemed practical (the most common marriage is between grade-school teachers), and others had us questioning Cupid’s aim (why do female dancers have a thing for male welders?). High-earning women (doctors, lawyers) tend to pair up with their economic equals, while middle- and lower-tier women often marry up. In other words, female CEOs tend to marry other CEOs; male CEOs are OK marrying their secretaries.

The listing of occupations and spousal relationships is interactive on mouse-over, and you can type in the name of a profession. (Warning: the profession name must be a case match for the term in this listing.)

Here’s a sample for Librarians:

[Image: spousal occupations for Librarians]

The relationships are gender-coded:

[Image: gender-coding legend]

Try to guess which occupations show “marries within occupation” and which do not.

For each of the following, what is your guess about marrying within the occupation or not?

  • Ambulance Drivers
  • Atmospheric and Space Scientists
  • Economists
  • Postal Service

This looks like a great browsing technique for exploring relationships (associations).
