When hunting copyright trolls (or trolls of any kind), the fewer there are to sift through, the better.
The Copyright Office is conducting the Section 512 Study, which it describes as:
The United States Copyright Office is undertaking a public study to evaluate the impact and effectiveness of the safe harbor provisions contained in section 512 of title 17, United States Code.
Enacted in 1998 as part of the Digital Millennium Copyright Act (“DMCA”), section 512 established a system for copyright owners and online entities to address online infringement, including limitations on liability for compliant service providers to help foster the growth of internet-based services. Congress intended for copyright owners and internet service providers to cooperate to detect and address copyright infringements. To qualify for protection from infringement liability, a service provider must fulfill certain requirements, generally consisting of implementing measures to expeditiously address online copyright infringement.
While Congress understood that it would be essential to address online infringement as the internet continued to grow, it may have been difficult to anticipate the online world as we now know it, where each day users upload hundreds of millions of photos, videos and other items, and service providers receive over a million notices of alleged infringement. The growth of the internet has highlighted issues concerning section 512 that appear ripe for study. Accordingly, as recommended by the Register of Copyrights, Maria A. Pallante, in testimony and requested by Ranking Member Conyers at an April 2015 House Judiciary Committee hearing, the Office is initiating a study to evaluate the impact and effectiveness of section 512 and has issued a Notice of Inquiry requesting public comment. Among other issues, the Office will consider the costs and burdens of the notice-and-takedown process on large- and small-scale copyright owners, online service providers, and the general public. The Office will also review how successfully section 512 addresses online infringement and protects against improper takedown notices.
The Office received over 92,000 written submissions by the April 1, 2016 deadline for the first round of public comments. The Office then held public roundtables on May 2nd and 3rd in New York and May 12th and 13th in San Francisco to seek further input on the section 512 study. Transcripts of the New York and San Francisco roundtables are now available online. Additional written public comments are due by 11:59 pm EST on February 21, 2017 and written submissions of empirical research are due by 11:59 pm EST on March 22, 2017.
You can see the comments at: Requests for Public Comments: Digital Millennium Copyright Act Safe Harbor Provisions, all 92,398 of them.
You can even export them to a CSV file, which runs a little over 33.5 MB in size.
It is likely that the same copyright trolls who provoked this review with non-public comments to the Copyright Office and others also posted comments, but how do you find them in a sea of 92,398 comments?
Some simplifying assumptions:
No self-respecting copyright troll will use the public comment template.
grep -v "Template Form Comment" DOCKET_COLC-2015-0013.csv | wc -l
Running grep with -v means it does NOT return matching lines. That is, only lines without “Template Form Comment” will be returned, and wc -l counts them.
We modify that to read:
grep -v "Template Form Comment" DOCKET_COLC-2015-0013.csv > no-form.csv
The > redirection (not a pipe) writes the lines without “Template Form Comment” to the file no-form.csv, overwriting that file if it already exists.
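A quick sanity check on a split like this is to confirm that the matching and non-matching line counts add up to the total. The sketch below does that on a tiny made-up sample; the file name sample.csv, its four columns and its rows are assumptions for illustration, not the real docket export.

```shell
# Hypothetical sample standing in for DOCKET_COLC-2015-0013.csv
workdir=$(mktemp -d)
sample="$workdir/sample.csv"
printf '%s\n' \
  'Document Title,First Name,Last Name,Organization' \
  'Template Form Comment,Ann,No Last Name,N/A' \
  'Comment from Example Music Co,Bob,Smith,Example Music Co' \
  'Template Form Comment,Cy,no last name,N/A' \
  'Comment from Dee Jones,Dee,Jones,N/A' > "$sample"

# Total number of lines in the file (tr strips padding some wc versions add)
total=$(wc -l < "$sample" | tr -d ' ')

# -c counts lines instead of printing them; adding -v counts the non-matching ones
form=$(grep -c "Template Form Comment" "$sample")
other=$(grep -vc "Template Form Comment" "$sample")

echo "total=$total form=$form other=$other"
```

If form plus other does not equal total, something is off with the search string or the file.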
Next, scanning the file we notice, “no last name/No Last Name.”
grep -iv "no last name" no-form.csv | wc -l
Here grep has both -i and -v: -i ignores case when matching the search string “no last name” against the lines of no-form.csv, and -v gives us only the lines without “no last name.”
The count without “no last name”: 3,359.
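The two filters can also be chained in a single pipeline, skipping the intermediate file. This is only a sketch on a made-up sample (sample.csv and its rows are assumptions), and note that wc -l counts lines, so a header line is included in the total.

```shell
# Hypothetical sample standing in for the docket export
workdir=$(mktemp -d)
sample="$workdir/sample.csv"
printf '%s\n' \
  'Document Title,First Name,Last Name,Organization' \
  'Template Form Comment,Ann,No Last Name,N/A' \
  'Comment from Example Music Co,Bob,Smith,Example Music Co' \
  'Comment from Eve,Eve,NO LAST NAME,N/A' \
  'Comment from Dee Jones,Dee,Jones,N/A' > "$sample"

# First drop the template comments, then drop "no last name" in any letter case
remaining=$(grep -v "Template Form Comment" "$sample" \
  | grep -iv "no last name" \
  | wc -l | tr -d ' ')

echo "remaining=$remaining"
```

On this sample the header, the Smith row and the Jones row survive, so the count includes one line that is not a comment.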
A lot better than 92,398 but not really good enough.
That is nearly small enough to edit by hand, so I switched to LibreOffice at this point.
Sort on column D (of columns A to AI), the organization. If you scroll down, row 123 has N/A for organization. The entry just prior to it is “musicnotes.” What? Where did Sony, etc., go?
Ah, LibreOffice sorted organizations and counted “N/A” as an organization’s name.
Let’s see, the “N/A” entries run from row 123 to row 3293, inclusive.
Well, deleting those rows leaves us with: 183 rows.
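For anyone who would rather stay on the command line, the same “drop the N/A organizations” step can be roughly sketched with awk. This assumes a simplified, made-up sample in which the organization is the fourth comma-separated field and no field contains a comma or a quoted line break; the real export has quoted fields, so a CSV-aware tool (or LibreOffice, as above) is the safer route there.

```shell
# Hypothetical sample: organization is the 4th field, no embedded commas
workdir=$(mktemp -d)
sample="$workdir/sample.csv"
printf '%s\n' \
  'Document Title,First Name,Last Name,Organization' \
  'Comment from Example Music Co,Bob,Smith,Example Music Co' \
  'Comment from Dee Jones,Dee,Jones,N/A' \
  'Comment from Example Archive,Fay,Lee,Example Archive' > "$sample"

# Keep the header (NR == 1) and every row whose 4th field is not exactly N/A
awk -F',' 'NR == 1 || $4 != "N/A"' "$sample" > "$workdir/orgs-only.csv"

kept=$(wc -l < "$workdir/orgs-only.csv" | tr -d ' ')
echo "kept=$kept"
```

The comparison is exact, so variants such as “n/a” or “None” would still need a look by hand.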
I continued by deleting comments by anonymous, individuals, etc., and my final total is 146 rows.
Check out troll-mining.zip!
Not all of them are copyright trolls, mind you; I still need to remove the Internet Archive, EFF and others on the right side of the section 512 issue.
Who else should I remove?
A few reasons for a clean copyright troll list:
First, it leads to FOIA requests about other communications to the Copyright Office by the trolls in the list. Can’t ask if you don’t have a troll list.
Second, it provides locations for protests and other ways to call unwanted attention to these trolls.
Third, well, you know, unfortunate things happen to trolls. It’s a natural consequence of a life predicated upon harming others.