Looking at a Plaintext Lucene Index by Florian Hopf.
From the post:
The Lucene file format is one of the reasons why Lucene is as fast as it is. An index consists of several binary files that you can’t really inspect if you don’t use tools like the fantastic Luke.
Starting with Lucene 4 the format for these files can be configured using the Codec API. Several implementations are provided with the release, among those the SimpleTextCodec that can be used to write the files in plaintext for learning and debugging purposes.
Good starting point for learning more about Lucene indexes.
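For a rough idea of what switching the codec involves, here is a minimal sketch. It is not taken from the post. It assumes a recent Lucene release (5.x or later constructor signatures) with the lucene-codecs module on the classpath, and the class name, field name, sample text, and temporary directory are all made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.simpletext.SimpleTextCodec;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical class name; assumes lucene-core and lucene-codecs are available.
public class PlaintextIndexSketch {

    // Writes a one-document index in plaintext form and returns its directory.
    public static Path writeIndex() throws Exception {
        // Assumption: a throwaway temp directory is good enough for inspection.
        Path indexPath = Files.createTempDirectory("plaintext-index");

        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        // Swap the default binary codec for the plaintext one.
        config.setCodec(new SimpleTextCodec());
        // Keep the per-segment files separate instead of packing them
        // into a compound file, so each one can be opened on its own.
        config.setUseCompoundFile(false);

        try (Directory dir = FSDirectory.open(indexPath);
             IndexWriter writer = new IndexWriter(dir, config)) {
            Document doc = new Document();
            // "title" is an illustrative field name, not one from the post.
            doc.add(new TextField("title", "Looking at a plaintext Lucene index", Field.Store.YES));
            writer.addDocument(doc);
        } // closing the writer commits the segment to disk

        return indexPath;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Index written to " + writeIndex());
    }
}
```

The files in the printed directory should open in any text editor. As the post says, this codec is meant for learning and debugging, so it is not something to use for a production index.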