Spreadsheets are graphs too! by Felienne Hermans.

Presentation with transcript.

Felienne starts with a great spreadsheet story:

When I was in grad school, I worked with an investment bank doing spreadsheet research. On my first day, I went to the head of the Excel team.

I said, ‘Hello, can I have a list of all your spreadsheets?’

There was no such thing.

‘We don’t have a list of all the spreadsheets,’ he said. ‘You could ask Frank in Accounting or maybe Harry over at Finance. He’s always talking about spreadsheets. I don’t really know, but I think we might have 10,000 spreadsheets.’

10,000 spreadsheets was a gold mine of research, so I went to the IT department and conducted my first spreadsheet scan with root access in Windows Explorer.

Within one second, it had already found 10,000 spreadsheets. Within an hour, it was still finding more, with over one million Excel files located. Eventually, we found 2.5 million spreadsheets.

In short, spreadsheets run the world.

She goes on to outline spreadsheet horror stories and then demonstrates how the complex relationships between cells can be captured in Neo4j.

Those relationships are much easier to query with Cypher than with SQL!
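To make the "cells as a graph" idea concrete, here is a minimal sketch in plain Python rather than Neo4j. It is not the code from the talk: the toy `sheet`, the `CELL_REF` pattern, and the helper names `build_edges` and `precedents` are all assumptions made for illustration, and the pattern ignores ranges, sheet names, and absolute references. In a graph database, a transitive lookup like this is typically written as a variable-length path match instead of a hand-rolled traversal.

```python
import re
from collections import defaultdict

# Hypothetical toy sheet: cell address -> literal value or formula.
sheet = {
    "A1": "100",
    "A2": "250",
    "A3": "=A1+A2",
    "B1": "=A3*0.2",
    "C1": "=B1+A1",
}

# Simplified pattern for references such as A1 or BC12.
# Assumption: no ranges (A1:A2), no sheet prefixes, no $ anchors.
CELL_REF = re.compile(r"\b[A-Z]{1,3}[0-9]+\b")


def build_edges(sheet):
    """Map each formula cell to the set of cells it references directly."""
    edges = defaultdict(set)
    for cell, content in sheet.items():
        if content.startswith("="):
            edges[cell].update(CELL_REF.findall(content))
    return edges


def precedents(cell, edges):
    """Return every cell that `cell` depends on, directly or transitively."""
    seen = set()
    stack = [cell]
    while stack:
        current = stack.pop()
        for ref in edges.get(current, ()):
            if ref not in seen:
                seen.add(ref)
                stack.append(ref)
    return seen


edges = build_edges(sheet)
c1_inputs = precedents("C1", edges)
```

Here `c1_inputs` collects everything that feeds into `C1`, which is the kind of question that is awkward to express in SQL because the depth of the dependency chain is not known in advance.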

While I applaud:

I realized that spreadsheet information is actually very graphy. All the cells are connected to references to each other and they happen to be in a worksheet or on the spreadsheet, but that’s not really what matters. What matters is the connections.

I would be more concerned with the identity of the subjects between which connections have been made.

Think of it as documenting the column headers of a five-year-old spreadsheet that you are now using by rote.

Knowing the connections between cells is a big step forward. Knowing what the cells are supposed to represent is an even bigger one.
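One way to picture the difference is to record, next to each cell, what it is supposed to represent. The sketch below is purely illustrative: the `Cell` class, its `represents` field, and the sample descriptions are hypothetical, not something taken from the talk or from any spreadsheet tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    address: str
    content: str
    # Hypothetical field: a human-readable statement of what the cell stands for.
    represents: Optional[str] = None


# Made-up example data for illustration only.
cells = [
    Cell("A1", "100", represents="Q1 revenue (USD thousands)"),
    Cell("A2", "250"),
    Cell("A3", "=A1+A2", represents="H1 revenue (USD thousands)"),
]


def undocumented(cells):
    """List the addresses of cells whose meaning has not been recorded."""
    return [c.address for c in cells if not c.represents]


missing = undocumented(cells)
```

The connection graph says that `A3` depends on `A2`; only the `represents` annotation can say what `A2` is, and here `missing` shows that nobody has written that down.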

