Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 5, 2012

Are Texts Unstructured Data? [Text Series]

Filed under: Structured Data,Text Series,Texts,Unstructured Data — Patrick Durusau @ 9:50 am

I ask because there is a pejorative tinge to “unstructured” when applied to texts. As though texts lack structure and can be improved by various schemes and designs.

Before reaching other aspects of such claims, I wanted to test the notion that texts are “unstructured.” If that is the case, then the Gettysburg Address, written:

Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.

But, in a larger sense, we can not dedicate, we can not consecrate, we can not hallow this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us—that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion—that we here highly resolve that these dead shall not have died in vain—that this nation, under God, shall have a new birth of freedom—and that government of the people, by the people, for the people, shall not perish from the earth.

Should be equivalent to the Gettysburg Address Scrambled (via the 3by3by3 Text Scrambler):

not are or freedom—and great a consecrated gave by as devotion they created in portion great and fought forth lives that are should we have that is nor died our to not a struggled the testing here, equal.

Now nation, little this to all civil highly we in these remaining say who we work perish nation, endure. engaged brave resolve that for here. the The of shall Four remember take here we fitting we liberty, forget did to It living poor that above and have honored ground. consecrate, place that is they the for have nation, the nation, for so ago what here, score not us—that it that us dead for who to we those from resting here can fathers this.

But, the that here It war. a is far never do years not what of proper new this brought and shall last earth. conceived their be of nation to detract. men dedicated here battle-field dedicated men, and in come are altogether the cause of great can which whether nobly living, this to which dead, rather to birth that who a field, will advanced. add so proposition people, long The the they power new It any us long increased and on we met or sense, for might hallow dedicate these far devotion—that to not note, in a the We dead before on full government dedicated we have from that war, the conceived God, a thus measure can gave of be people, that here We continent so a it, world people, vain—that dedicate, under can shall live. but larger task have dedicated, unfinished final our rather, can seven

It may just be me but I don’t get the same semantics from the second version as the first.

You?

My premise going forward is that texts are structured.
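For anyone who wants to repeat the experiment, here is a minimal sketch in Python. It assumes a plain whitespace split and a random shuffle; I don't know how the 3by3by3 Text Scrambler works internally, so `scramble_words` is a hypothetical stand-in, not its actual method. The point it illustrates is that the scrambled text contains exactly the same words as the original, so whatever is lost in the scrambling must be structure.

```python
import random
from collections import Counter


def scramble_words(text, seed=None):
    """Return the whitespace-delimited words of `text` in random order.

    Hypothetical helper for illustration; not the 3by3by3 scrambler.
    """
    words = text.split()
    rng = random.Random(seed)  # seeded so the result is repeatable
    rng.shuffle(words)         # shuffles the list in place
    return " ".join(words)


original = (
    "Four score and seven years ago our fathers brought forth on this "
    "continent a new nation, conceived in liberty, and dedicated to the "
    "proposition that all men are created equal."
)

scrambled = scramble_words(original, seed=2012)

# The "bag of words" is identical: same words, same counts.
same_words = Counter(original.split()) == Counter(scrambled.split())

# Only the ordering (the structure) may differ.
same_order = original.split() == scrambled.split()

print(scrambled)
print("same words:", same_words)
print("same order:", same_order)
```

If texts really were unstructured, the two versions would be interchangeable, since nothing but word order separates them.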

3 Comments

  1. Yes, but…

    Structure and semantics are in the eye of the beholder. True, *you* don’t get the same semantics from the scrambled version as from the original. But I can at least imagine a person who, in at least some state of mind (an LSD overdose, perhaps) would derive exactly the same semantics from both.

    De gustibus non disputandum est. But that doesn’t mean that one’s own appreciation of the meaning of some structured symbols is valueless. On the contrary, those semantics are the *only* semantics that are *ever* actually communicated.

    Lincoln was such a great writer. One of his lesser known works is the following letter, which now reportedly hangs on a wall at Brasenose College, Oxford University, as a treasure of English prose. The last clause has echoed in my mind for many years, and it still gives me a chill to read it. For me, that chill is the very essence of this letter, and it comes straight from Lincoln’s pen to my spine. The grieving addressee may well have derived some healing from it, knowing that her loss was well-understood, even in the halls of power.

    Executive Mansion
    Washington, Nov 21, 1864

    To Mrs. Bixby, Boston, Mass.

    Dear Madam

    I have been shown in the files of the War Department a statement of
    the Adjutant General of Massachusetts that you are the mother of five
    sons who have died gloriously on the field of battle. I feel how weak
    and fruitless must be any word of mine which should attempt to beguile
    you from the grief of a loss so overwhelming. But I cannot refrain
    from tendering you the consolation that may be found in the thanks of
    the republic they died to save. I pray that our Heavenly Father may
    assuage the anguish of your bereavement, and leave you only the
    cherished memory of the loved and lost, and the solemn pride that must
    be yours to have laid so costly a sacrifice upon the altar of freedom.

    Yours very sincerely and respectfully

    A. Lincoln

    Comment by Steve Newcomb — November 5, 2012 @ 2:31 pm

  2. I agree they’re structured, but as a student of Allen Renear I have to point out that we haven’t quite figured out what the structure is yet. =)

    http://www.stg.brown.edu/resources/stg/monographs/ohco.html

    Comment by marijane — November 5, 2012 @ 6:42 pm

  3. @Steve Newcomb: +1 on the eye of the beholder, but I suspect the “unstructured data” riff conceals the true issue, which is that the data doesn’t have a structure the viewer can easily parse. Not the same thing as having no structure.

    The last clause is moving but I remember similar rhetoric for service in Vietnam and any number of other places where freedom wasn’t the issue. No member of the armed services has died in my lifetime defeating threats to my freedom.

    Make no mistake, I think the rank and file of the military serve with honor and merit our support. I would not say the same for the leadership that spends their lives in support of corporate interests and expanded markets. And falsely calls it “the altar of freedom.”

    @marijane: Allen has been beating that horse for a number of years. Being an aficionado of Stanley Fish (http://en.wikipedia.org/wiki/Stanley_Fish), I disagree there is any universal structure to find. The structures we find are the ones we expect to find. And each of us may find different ones.

    The principal failing of OHCO was that it sought to privilege one view of the structure of texts as correct. The authors then encountered texts that were inconsiderate enough to not obey the pronounced rules for their structure.

    For some purposes it may be useful to take the OHCO view (XSL-FO comes to mind). But that is the only basis for adopting any view of the structure of a text: its utility for some purpose of the reader.

    Yes, I argue for various textual structures, produce “evidence” in favor of those arguments, but I can’t appeal to the text as judge. I have to persuade the audience to adopt my view of the text. Not the same thing.

    Comment by Patrick Durusau — November 6, 2012 @ 9:37 am
