Ex-KPMG AI Head Builds ‘Base Data Layer for Corporates’ with Microsoft

Engine B, a company created by Shamus Rae, previously KPMG’s innovation and AI head, is working with Microsoft, universities and professional services firms to build the equivalent of Open Banking for corporate data. This in turn will allow legal tech companies and firms to build much more powerful and cross-referenced contract analysis systems.

Or as Rae told Artificial Lawyer: ‘This is building the base data layer across all corporate activity, not just one narrow vertical of data, but across all data stores in a business.’

This matters because, as Rae stressed: ‘If you can’t exchange data then we aren’t going to get anywhere.’

That is to say, without a joined-up approach to the data that companies produce, its analysis and real-world meaning will remain trapped within silos and so deliver fewer benefits.

Such a siloed data environment does not reflect how a business really works: in day-to-day life, all parts of a commercial enterprise are connected. It also means we only get a small part of the picture, like looking into a house through a keyhole. Engine B is seeking to open the door so you can see everything there is data for.

In terms of Microsoft, Rae told Artificial Lawyer: ‘Microsoft is working with us on several fronts, data models, dynamics and unstructured data. The flow works like this: Birmingham University Law School – is working on the processes used for different types of contracts and the data flows, and the scope of content within the contract; that then passes to Imperial College – who are looking at entity extraction techniques especially using smaller data training sets; and finally this connects to our data scientists and Microsoft’s.’

They’re also using the latest version of Microsoft’s LUIS, the tech giant’s NLP platform, for some of the contract analysis work. That said, Rae told Artificial Lawyer they would also be happy to work with other companies, such as Kira Systems, that are well known in the market for their NLP capabilities.

Below is a detailed AL TV Product Walk Through of Engine B – and it really is worth watching to get the gist of things. (Approx half on the walk through, half Q&A – 17 mins total).

AL TV Productions, Oct 2020.

Rae’s key point is essentially this: what if you could build a data analysis methodology that would work across all types of data, structured and unstructured alike? Then you could link them all together and cross-reference that data to build a much more powerful picture of what is really going on across a business.

For example, what if you had a data analysis system that could read property contracts and also bring to bear payment information, occupancy data, and information from land registry databases, all at the same time? You’d have a working picture of what is really happening, not just what the contracts in one data stack in one departmental silo of a business are saying.

Once you have these ‘mind maps’ that connect everything, you create a ‘base data layer’ upon which you can build new analytical tools to gain ever greater insights.

This is how the company explains what it does: ‘Engine B provides: A set of data models, which are open source and available to all. These data models can be installed with clients to capture and house client information in an intelligent data access platform.

‘Client data can then be interrogated and analysed through dynamic knowledge graphs. Knowledge Graphs are a programmatic way to model information. They can uncover hidden data and relationships that are too complex for human cognition through applying various graph-computing techniques and algorithms.’
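To make the idea concrete, here is a minimal sketch of a knowledge graph in Python, built around the property example mentioned earlier in this piece. It is an illustration only and not Engine B's data model or software: every entity name, relation label and helper function (`build_graph`, `related`) is hypothetical, and the breadth-first search stands in for the 'graph-computing techniques' the company refers to without claiming to be one of theirs.

```python
from collections import defaultdict, deque

# Hypothetical facts as (subject, relation, object) triples.
# Each group is imagined to come from a different data silo.
TRIPLES = [
    ("lease_001", "covers_property", "unit_12"),     # contract store
    ("lease_001", "has_tenant", "acme_ltd"),         # contract store
    ("payment_778", "paid_by", "acme_ltd"),          # finance system
    ("payment_778", "settles", "lease_001"),         # finance system
    ("unit_12", "occupancy_status", "vacant"),       # facilities data
    ("unit_12", "registered_title", "title_AB123"),  # land registry extract
]


def build_graph(triples):
    """Index triples as an adjacency map so links can be followed both ways."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse:{rel}", subj))
    return graph


def related(graph, start, max_hops=2):
    """Breadth-first search: map each node within max_hops of start to its hop distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _rel, neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]  # report only the other nodes
    return seen


graph = build_graph(TRIPLES)
links = related(graph, "payment_778")
```

Starting from a single payment record, the search reaches the lease, the tenant and the property even though those facts sit in separate stores. Raising `max_hops` to 3 also reaches the occupancy status and the land registry title, which is the kind of cross-silo link a single contract database would not surface on its own.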

All very exciting. Final question: how will they make money? Rae explained that they will charge micro-payments for the use of the knowledge graphs they develop.

What next? This is a work in progress, as you can imagine given its scale and complexity. But Rae concluded by saying they are hoping the various projects they have in motion will result in a wider commercial roll-out next year.