Permission Resolution With Neo4j – Part 1 by Max De Marzi.
From the post:
People produce a lot of content. Messages, text files, spreadsheets, presentations, reports, financials, etc, the list goes on. Usually organizations want to have a repository of all this content centralized somewhere (just in case a laptop breaks, gets lost or stolen for example). This leads to some kind of grouping and permission structure. You don’t want employees seeing each other’s HR records, unless they work for HR, same for Payroll, or unreleased quarterly numbers, etc. As this data grows it no longer becomes easy to simply navigate and a search engine is required to make sense of it all.
But what if your search engine returns 1000 results for a query and the user doing the search is supposed to only have access to see 4 things? How do you handle this? Check the user permissions on each file realtime? Slow. Pre-calculate all document permissions for a user on login? Slow and what if new documents are created or permissions change between logins? Does the system scale at 1M documents, 10M documents, 100M documents?
Max addresses the scaling issue by checking permissions only on the results a search returns. To that extent, the size of the document store becomes irrelevant.
At least if you have a smallish number of results from the search.
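To make the idea concrete, here is a minimal sketch of checking permissions only on the search hits. It is not Max's Neo4j code: the dictionaries stand in for the permission graph, and every name in it (USER_GROUPS, DOC_READERS, filter_results, the sample users and files) is made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy data standing in for a permission graph.
# Which groups each user belongs to:
USER_GROUPS: Dict[str, Set[str]] = {
    "alice": {"hr"},
    "bob": {"engineering"},
}
# Which groups may read each document:
DOC_READERS: Dict[str, Set[str]] = {
    "salaries.xlsx": {"hr", "payroll"},
    "design.md": {"engineering"},
    "q3-numbers.pdf": {"finance"},
}


def filter_results(user: str, results: List[str]) -> List[str]:
    """Keep only the search hits that the user can read.

    The cost grows with the number of hits, not with the
    total number of documents in the store.
    """
    groups = USER_GROUPS.get(user, set())
    # A document is visible if the user shares at least one group with its readers.
    return [doc for doc in results if groups & DOC_READERS.get(doc, set())]


# Pretend the search engine returned these hits for some query.
hits = ["salaries.xlsx", "design.md", "q3-numbers.pdf"]
print(filter_results("alice", hits))
```

The work here is one permission check per hit, which is why the approach holds up only while the result list stays small.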
I haven’t seen part 2, but another scaling tactic would be to limit access to indexes by permission, segregating human resources, accounting, etc. into separate indexes.
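A rough sketch of what I mean by segregated indexes, assuming one tiny inverted index per department and a table saying which indexes each user may query. The names (INDEXES, INDEX_ACCESS, search) and the sample data are hypothetical and not taken from Max's post.

```python
from typing import Dict, List, Set

# Hypothetical per-department inverted indexes: term -> matching documents.
INDEXES: Dict[str, Dict[str, List[str]]] = {
    "hr": {"salary": ["salaries.xlsx"]},
    "accounting": {
        "salary": ["payroll-ledger.csv"],
        "invoice": ["inv-001.pdf"],
    },
}
# Hypothetical mapping of users to the indexes they may query.
INDEX_ACCESS: Dict[str, Set[str]] = {
    "alice": {"hr"},
    "carol": {"hr", "accounting"},
}


def search(user: str, term: str) -> List[str]:
    """Search only the indexes the user is allowed to see."""
    hits: List[str] = []
    # Sort index names so the result order is deterministic.
    for name in sorted(INDEX_ACCESS.get(user, set())):
        hits.extend(INDEXES.get(name, {}).get(term, []))
    return hits


print(search("alice", "salary"))
print(search("carol", "salary"))
```

The permission decision moves from each document to each index, so a query never touches content in an index the user cannot open. The trade-off is coarser control: anything finer than index-level access would still need a per-result check.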
Looking forward to where Max takes this one.