Apache Ranger Audit Framework by Madhan Neethiraj.
From the post:
Apache Ranger provides centralized security for the Enterprise Hadoop ecosystem, including fine-grained access control and a centralized audit mechanism, both essential for Enterprise Hadoop. This blog covers the audit framework options available with Apache Ranger Release 0.4.0 in HDP 2.2 and how they can be configured.
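For a sense of what "audit framework options" means in practice, here is a sketch of a Ranger plugin audit configuration. Property names follow the `xasecure.audit.*` convention used by Ranger's plugins; the specific values (host names, paths) are placeholders, and the exact property set should be checked against the release's own documentation:

```xml
<!-- Hypothetical excerpt of a Ranger plugin audit configuration.
     Values below are illustrative placeholders, not verified settings. -->
<configuration>
  <!-- Master switch for audit logging in the plugin -->
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
  </property>
  <!-- Send audit events to the Ranger database -->
  <property>
    <name>xasecure.audit.db.is.enabled</name>
    <value>true</value>
  </property>
  <!-- Also write audit events to HDFS -->
  <property>
    <name>xasecure.audit.hdfs.is.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

Which destinations to enable, and what each logged field means, is exactly the kind of detail the commentary below argues must be documented.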
…
From the Ranger homepage:
Apache Ranger offers a centralized security framework to manage fine-grained access control over Hadoop data access components like Apache Hive and Apache HBase. Using the Apache Ranger console, security administrators can easily manage policies for access to files, folders, databases, tables, or columns. These policies can be set for individual users or groups and then enforced within Hadoop.
Security administrators can also use Apache Ranger to manage audit tracking and policy analytics for deeper control of the environment. The solution also provides an option to delegate administration of certain data to other group owners, with the aim of securely decentralizing data ownership.
Apache Ranger currently supports authorization, auditing and security administration of the following HDP components:
…
And you are going to document the semantics of the settings, events and other log information….where?
Ah, you already know what those settings, events and other log entries mean, and… you're not planning on getting hit by a bus, are you? Or planning to stay in your present position forever?
No joke. I know someone training their replacements in ten-year-old markup technologies. Systems built on top of other systems. And they kept records. Lots of records.
Test your logs on a visiting Hadoop systems administrator. If they can't interpret your logging with 100% accuracy, using whatever documentation you have, you had better start writing.
I hadn’t thought about the visiting Hadoop systems administrator idea before, but it would be a great way to test the documentation for Hadoop ecosystems. Better to test it that way than after a natural or unnatural disaster.
Call it the Hadoop Ecosystem Documentation Audit. Give a tester tasks to perform, which must be accomplished with existing documentation. No verbal assistance. I suspect a standard set of tasks could be useful in defining such a process.
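The audit process above can be sketched as a simple scoring harness. The task names and the pass/fail scheme here are illustrative assumptions, not part of the original proposal; the point is only that a fixed task list makes the test repeatable across sites:

```python
# Sketch of a "Documentation Audit": score a visiting tester's results on a
# fixed set of tasks that must be completed using existing docs alone, with
# no verbal assistance. Task names below are hypothetical examples.

AUDIT_TASKS = [
    "locate the audit log destination settings",
    "explain the meaning of each audit event field",
    "change the audit log destination using only the documentation",
]

def audit_score(results):
    """results maps a task name to True (completed using docs alone) or False.
    Returns the fraction of tasks passed; anything below 1.0 means the
    documentation needs work."""
    passed = sum(1 for task in AUDIT_TASKS if results.get(task, False))
    return passed / len(AUDIT_TASKS)

# Example run: the tester managed two of the three tasks from the docs.
results = {
    "locate the audit log destination settings": True,
    "explain the meaning of each audit event field": False,
    "change the audit log destination using only the documentation": True,
}
print(f"Documentation audit score: {audit_score(results):.0%}")
```

A standard task list like this could be reused across clusters, making scores comparable between sites and over time.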