You may use some other definition of “file corruption” but that’s mine and I’m sticking to it.
😉
The following are all the files that failed against my script, along with the actions I took to get each one to parse. Not today, but I will write a sed
script to correct such files as future accumulations of emails appear.
13544 00047141.eml
Date string parse failed:
Date: Wed, 17 Dec 2008 12:35:42 -0700 (GMT-07:00)
Deleted (GMT-07:00).
15431 00059196.eml
Date string parse failed:
Date: Tue, 22 Sep 2015 06:00:43 +0800 (GMT+08:00)
Deleted (GMT+08:00).
155 00049680.eml
Date string parse failed:
Date: Mon, 27 Jul 2015 03:29:35 +0000
Assuming, as the email reports, info@centerpeace.org was the sender and podesta@law.georgetown.edu was the intended receiver, then the offset from UT is clearly wrong (+0000).
Deleted +0000.
6793 00059195.eml
Date string parse failed:
Date: Tue, 22 Sep 2015 05:57:54 +0800 (GMT+08:00)
Deleted (GMT+08:00).
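The parenthetical fix above could be automated along these lines. This is only a sketch, written in Python rather than the sed script mentioned earlier, and the function names strip_date_comment and clean_headers are made up for illustration. It assumes the offending comment is always a single parenthesised group at the end of the Date line, and it does nothing about the +0000 case.

```python
import re

# A trailing parenthesised comment such as "(GMT-07:00)" at the end of a
# Date header line. Group 2 keeps a trailing carriage return, if any.
TRAILING_COMMENT = re.compile(
    r"^(Date:.*?)[ \t]*\([^()]*\)[ \t]*(\r?)$", re.IGNORECASE
)


def strip_date_comment(line):
    """Return the Date line without its trailing parenthesised comment."""
    match = TRAILING_COMMENT.match(line)
    if match:
        return match.group(1) + match.group(2)
    return line


def clean_headers(text):
    """Apply strip_date_comment to header lines only.

    Headers end at the first blank line; the body is left untouched.
    """
    out = []
    in_headers = True
    for line in text.split("\n"):
        if in_headers and line in ("", "\r"):
            in_headers = False
        out.append(strip_date_comment(line) if in_headers else line)
    return "\n".join(out)
```

Run against a copy of each file, not the original, so the untouched message is still available for comparison.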
9404 0015843.eml
DKIM signature parse failure.
All of the DKIM parse failures take the form:
Traceback (most recent call last):
  File "test-clinton-script-24Oct2016.py", line 18, in <module>
    verified = dkim.verify(data)
  File "/usr/lib/python2.7/dist-packages/dkim/__init__.py", line 604, in verify
    return d.verify(dnsfunc=dnsfunc)
  File "/usr/lib/python2.7/dist-packages/dkim/__init__.py", line 506, in verify
    validate_signature_fields(sig)
  File "/usr/lib/python2.7/dist-packages/dkim/__init__.py", line 181, in validate_signature_fields
    if int(sig[b'x']) < int(sig[b't']):
KeyError: 't'
I simply deleted the DKIM-Signature in question. Will go down that rabbit hole another day.
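One reading of the traceback: the failing line evaluates sig[b'x'] before sig[b't'], so the KeyError implies a signature that carries an expiration tag (x=) but no timestamp tag (t=). The sketch below flags such signatures using only the standard library instead of deleting them. The names parse_tags and signatures_missing_timestamp are hypothetical, and the tag parsing is deliberately simplified; it is not how the dkim package parses signatures.

```python
from email import message_from_string


def parse_tags(header_value):
    """Split a DKIM-Signature value into a {tag: value} dict (simplified)."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Drop folding whitespace inside the value.
            tags[name.strip()] = "".join(value.split())
    return tags


def signatures_missing_timestamp(raw_message):
    """Return the tag dicts of signatures that have x= but lack t=."""
    msg = message_from_string(raw_message)
    flagged = []
    for value in msg.get_all("DKIM-Signature", []):
        tags = parse_tags(value)
        if "x" in tags and "t" not in tags:
            flagged.append(tags)
    return flagged
```

A non-empty result for a file would be consistent with the KeyError above, though it does not by itself say why those signatures were generated that way.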
21960 00015764.eml
DKIM signature parse failure.
Deleted DKIM signature.
23177 00015850.eml
DKIM signature parse failure.
Deleted DKIM signature.
23728 00052706.eml
Invalid character in RFC822 header.
I discovered an errant " (double quote mark) at the start of a line.
Deleted the double quote mark and the ^M line endings.
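A rough Python sketch of that cleanup follows. It assumes the stray quote sits at the very start of a header line (a position where neither a header name nor a folded continuation can begin with a quote) and that the ^M characters are the CR half of CRLF line endings. The function name clean_message is hypothetical.

```python
def clean_message(text):
    """Normalise line endings and drop a stray leading quote in the headers."""
    # ^M is how many editors display the carriage return in CRLF.
    text = text.replace("\r\n", "\n")
    # Headers end at the first blank line; leave the body as it is.
    head, sep, body = text.partition("\n\n")
    fixed = [
        line[1:] if line.startswith('"') else line
        for line in head.split("\n")
    ]
    return "\n".join(fixed) + sep + body
```

Only header lines are touched for the quote fix, so a body line that legitimately starts with a double quote is preserved.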
25040 00015842.eml
DKIM signature parse failure.
Deleted DKIM signature.
26835 00015848.eml
DKIM signature parse failure.
Deleted DKIM signature.
28237 00015840.eml
DKIM signature parse failure.
Deleted DKIM signature.
29052 0001587.eml
DKIM signature parse failure.
Deleted DKIM signature.
29099 00015759.eml
DKIM signature parse failure.
Deleted DKIM signature.
29593 00015851.eml
DKIM signature parse failure.
Deleted DKIM signature.
Here’s an odd pattern for you: all nine (9) of the DKIM signature parse failures were on mail originating from:
From: Gene Karpinski
But there are approximately thirty-three (33) emails from Karpinski, so it doesn’t fail every time.
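To pin down that count rather than estimate it, something like the following could list every .eml file whose From header mentions a given name. It is a sketch only: the function name files_from and the directory layout (all .eml files in one folder) are assumptions, not part of the original script.

```python
import glob
import os
from email import message_from_binary_file


def files_from(directory, needle):
    """Return the names of .eml files whose From header contains needle."""
    needle = needle.lower()
    hits = []
    for path in sorted(glob.glob(os.path.join(directory, "*.eml"))):
        with open(path, "rb") as handle:
            msg = message_from_binary_file(handle)
        # str() covers the case where the header comes back as a Header object.
        sender = str(msg.get("From", ""))
        if needle in sender.lower():
            hits.append(os.path.basename(path))
    return hits
```

Comparing that list with the nine failing files would show which Karpinski messages verify cleanly and which do not.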
The file numbers are based on the 1-18 distribution of Podesta emails created by Michael Best, @NatSecGeek, at: Podesta Emails (zipped).