Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 8, 2018

Extracting Data From FBI Reports – No Waterboarding Required!

Filed under: FBI,Government,Government Data,R — Patrick Durusau @ 1:01 pm

Wrangling Data Table Out Of the FBI 2017 IC3 Crime Report

From the post:

The U.S. FBI Internet Crime Complaint Center was established in 2000 to receive complaints of Internet crime. They produce an annual report, just released 2017’s edition, and I need the data from it. Since I have to wrangle it out, I thought some folks might like to play long at home, especially since it turns out I had to use both tabulizer and pdftools to accomplish my goal.

Concepts presented:

  • PDF scraping (with both tabulizer and pdftools)
  • asciiruler
  • general string manipulation
  • case_when() vs ifelse() for text cleanup
  • reformatting data for ggraph treemaps

Let’s get started! (NOTE: you can click/tap on any image for a larger version)

Freeing FBI data from a PDF prison, is a public spirited act.

Demonstrating how to free FBI data from PDF prisons, is a virtuous act!

Enjoy!

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress