Blue Morpho Manifesto for implementing Knowledge Graphs, Ontologies and LLMs in the enterprise

Jérémy Thomas

18 Oct 2024 — 8 min read

In many ways, building an early-stage startup feels like going through an existential crisis. “Why do we exist? Why us? Why now? Who are we? What are the things that we are after that truly matter?” Embracing doubts and uncertainties and testing things are part of the process of emerging with educated faith, not blind belief.

The goal at this very early stage is to gain extreme clarity on core hypotheses: What are our intuitions, our fundamental bets on the future? What are the impactful realities that we see and most others don’t see yet? What makes us different? What are we doing that is hard?

*It’s all about finding the right balance between thinking and doing.*

Listed here are just some of the core beliefs that Adrien, Aloÿs, more recently Thomas, and I have been battle-testing for a few months now and have gained strong conviction in. Which means we’re now building, heads down! :)

A few things we @ Blue Morpho believe in: our Manifesto for implementing Knowledge Graphs, Ontologies and LLMs in the enterprise

First belief: Organizations’ data is not ready for the AI revolution that is coming, because it lacks connectivity and a shared structure

Organizations have countless complex applications, multiple relational databases comprising columns with diverse terminologies, and an untapped reservoir of unstructured documents. In order for AI systems to know where to look to perform a given task, they need:

A layer of well-structured metadata, where all data - structured and unstructured - is mapped to shared semantic concepts, finely categorized, and standardized within each category.
Rich connections between data items, allowing seamless navigation and contextual analysis across related data points.
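As a rough illustration of the metadata layer described above, here is a minimal Python sketch. Every source name, field name, and concept label in it is hypothetical and chosen only for the example; it is not a description of Blue Morpho's platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMapping:
    source: str   # where the data lives (hypothetical names)
    field: str    # the local column or extracted field name
    concept: str  # the shared semantic concept it maps to


# Hypothetical mappings: three sources use three different names
# for what the organization considers the same concept.
MAPPINGS = [
    FieldMapping("crm.accounts", "client_nm", "Customer.name"),
    FieldMapping("billing.invoices", "cust_name", "Customer.name"),
    FieldMapping("contracts/*.pdf", "counterparty", "Customer.name"),
    FieldMapping("billing.invoices", "amt_eur", "Invoice.amount"),
]


def where_to_look(concept, mappings=MAPPINGS):
    """Return every (source, field) pair that holds data for a shared concept."""
    return [(m.source, m.field) for m in mappings if m.concept == concept]
```

With such a layer in place, a question about customer names resolves to all three sources at once, structured or not, instead of depending on someone remembering each local column name.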

Second belief: Knowledge Graphs and Ontologies solve this; they are the next evolution of knowledge representation

They align domain-specific data from various sources within a semantic framework consistent with an organization’s worldview, missions, and use cases. For example, a pharmaceutical company may focus on Drugs, Diseases, Clinical Trials, and Patients. Building a graph involves clearly defining these concepts and their hierarchy (e.g., Clinical Trial >> Phase I Trial, Phase II Trial), their properties (e.g., trial outcomes), relationships (e.g., Treatments cure Diseases), and encoding business rules applying to these objects (e.g., elderly patients should avoid drugs with heart failure risks).
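To make the four ingredients concrete (hierarchy, properties, relationships, business rules), here is a small Python sketch built on the pharmaceutical example. The identifiers, the property names, and the age threshold of 65 are assumptions made for illustration; a real ontology would more likely be expressed in a dedicated formalism.

```python
from dataclasses import dataclass, field

# Concept hierarchy: child -> parent (concept names come from the example above).
SUBCLASS_OF = {
    "Phase I Trial": "Clinical Trial",
    "Phase II Trial": "Clinical Trial",
}


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


@dataclass
class Node:
    id: str
    concept: str
    props: dict = field(default_factory=dict)  # properties, e.g. trial outcomes


# Relationships as (subject, predicate, object) triples; ids are hypothetical.
TRIPLES = [("drug:D1", "cures", "disease:X")]

ELDERLY_AGE = 65  # assumption: the source does not define "elderly"


def violates_elderly_rule(patient, drug):
    """Business rule: elderly patients should avoid drugs with heart failure risks."""
    elderly = patient.props.get("age", 0) >= ELDERLY_AGE
    risky = "heart failure" in drug.props.get("risks", ())
    return elderly and risky


trial = Node("trial:001", "Phase II Trial", {"outcome": "primary endpoint met"})
patient = Node("patient:42", "Patient", {"age": 78})
drug = Node("drug:D1", "Drug", {"risks": ["heart failure"]})
```

Here `is_a(trial.concept, "Clinical Trial")` holds because of the hierarchy, and `violates_elderly_rule(patient, drug)` flags the combination the rule is meant to catch.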

Currently, this logic resides in SaaS applications, BI tools, people’s minds, and documents. Documents in particular have caught our attention @ Blue Morpho. We believe documents, written in natural language, are the foundation of the enterprise knowledge graph. The first brick of our platform focuses on translating domain knowledge and business logic from documents - such as clinical study reports, regulatory submissions, and contracts - into formal structures and rules stored in knowledge graphs. While we think documents are the best starting point, our broader vision is to encompass both structured and unstructured data.

Third belief: Building Knowledge Graphs and Ontologies in the enterprise requires a great deal of automation…

Though they’ve existed for a long time, Knowledge Graphs and Ontologies haven’t become ubiquitous yet because building, scaling, and evolving them was nearly impossible until LLMs. Take the financial domain for example. No person or team in the world can hold the complexity of the financial domain in their heads, however expert they are. There are so many concepts involved, relations between them, geographical subtleties, regulatory updates, new financial products…

We believe LLMs are the missing piece, enabling automation of much of the work needed to build and maintain ontologies at enterprise scale. With LLMs, ontologies can be constructed from the bottom up, starting from the data they need to describe, rather than top-down, starting from people’s minds. LLMs can read data, extract entities, classify them into ontologies, and suggest changes to existing ontologies. With version control, changes can be dry-run, rolled back, and tested for robustness and coherence against a set of rules, and any two ontology versions can be compared easily.
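The version-control idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the ontology is reduced to a child-to-parent mapping, the single coherence rule (every parent must be a declared concept) is invented for the example, and the class and method names are hypothetical.

```python
ABSENT = "<absent>"  # marker for a concept missing from one of two versions


class VersionedOntology:
    def __init__(self, concepts):
        # concepts: dict mapping each concept to its parent (None for roots)
        self.versions = [dict(concepts)]

    @property
    def current(self):
        return self.versions[-1]

    @staticmethod
    def check(concepts):
        """Coherence rule: every parent must itself be a declared concept."""
        return [
            f"{child}: unknown parent {parent}"
            for child, parent in concepts.items()
            if parent is not None and parent not in concepts
        ]

    def propose(self, changes, dry_run=True):
        """Test a change (e.g. one suggested by an LLM); commit only if asked and valid."""
        candidate = {**self.current, **changes}
        errors = self.check(candidate)
        if not dry_run and not errors:
            self.versions.append(candidate)
        return errors

    def rollback(self):
        """Drop the latest version, keeping at least the initial one."""
        if len(self.versions) > 1:
            self.versions.pop()

    def diff(self, old, new):
        """Map each changed concept to its (old parent, new parent) pair."""
        a, b = self.versions[old], self.versions[new]
        return {
            k: (a.get(k, ABSENT), b.get(k, ABSENT))
            for k in a.keys() | b.keys()
            if a.get(k, ABSENT) != b.get(k, ABSENT)
        }


onto = VersionedOntology({"Financial Product": None, "Bond": "Financial Product"})
dry_run_errors = onto.propose({"Green Bond": "Bond"})          # dry run, nothing committed
bad_errors = onto.propose({"Swaption": "Derivative"}, dry_run=False)  # rejected by the rule
onto.propose({"Green Bond": "Bond"}, dry_run=False)            # committed as version 1
```

A dry run returns the rule violations without touching the history, so a suggested change can be inspected before anyone decides to accept it.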

Fourth belief: [Automation]… as well as user interaction to incorporate an organization’s unique view of the world

There is not one universal ontology of the world: ontologies reflect specific views of the world, which is why building them is inherently an interactive process. An ontology is a data model after all and, like any model, it is wrong but it might be useful, depending on the use case.

While much of the process can - and must - be automated to achieve enterprise scale, we believe user input remains crucial for very specific decisions that cannot be made automatically, today or tomorrow. Unlike companies that see user feedback as a temporary step to train their algorithms, hoping user interaction won’t be needed someday, we want users to stay in the loop, but only as much as needed to capture their unique perspective, the one that sets their organization apart from competitors.

Fifth belief: A neuro-symbolic architecture is needed to achieve this

As anyone who has played with them knows, LLMs are naughty kids. We don’t believe in architectures where LLMs converse with other LLMs without strong guardrails.

Neuro-symbolic architectures combine the creativity of neural networks with the rigor and determinism of symbolic systems, which handle logical rules, knowledge representation, and reasoning. Neuro-symbolic loops can be applied everywhere. For instance, when extracting relationships between entities (neural step), if some nodes are isolated (symbolic step), a warning is raised so that the LLM can revise its output.
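The isolated-node example can be sketched as a loop in Python. The `extract` callable stands in for an LLM call and is purely hypothetical, as are the entity names and the `fake_llm` stub; the symbolic check itself is plain set arithmetic.

```python
def isolated_nodes(entities, relations):
    """Symbolic step: entities that appear in no (subject, predicate, object) triple."""
    connected = {e for s, _, o in relations for e in (s, o)}
    return sorted(set(entities) - connected)


def extract_with_guardrail(entities, extract, max_rounds=3):
    """Alternate a neural step (`extract`) with the symbolic check.

    `extract(entities, warning)` is a placeholder for an LLM call that
    returns a list of triples and may use the warning to revise its output.
    Returns the last relations and any entities still isolated.
    """
    warning = None
    relations = []
    orphans = sorted(set(entities))
    for _ in range(max_rounds):
        relations = extract(entities, warning)        # neural step
        orphans = isolated_nodes(entities, relations)  # symbolic step
        if not orphans:
            break
        warning = "These entities have no relationships: " + ", ".join(orphans)
    return relations, orphans


def fake_llm(entities, warning):
    """Stub that forgets one entity until the warning mentions it."""
    relations = [("Aspirin", "treats", "Headache")]
    if warning and "Bayer" in warning:
        relations.append(("Bayer", "manufactures", "Aspirin"))
    return relations


relations, orphans = extract_with_guardrail(["Aspirin", "Headache", "Bayer"], fake_llm)
```

The deterministic check never decides what the missing relationship is; it only detects the gap and hands a precise warning back to the model, which keeps the loop bounded by `max_rounds`.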

Sixth belief: The implications of this for the future of AI in the enterprise are absolutely huge

RAG that works!

As some papers have recently demonstrated (Microsoft GraphRAG, StructRAG), intelligently structuring data greatly improves Retrieval Augmented Generation (RAG) results in terms of accuracy and contextual relevance. It’s even more than that: in an enterprise context, with complex queries and thousands of heterogeneous documents, RAG will not work without structuring data first.

Improving RAG with graphs is a marvelous way to get started and bring value quickly. But graphs aren’t just a better way to do RAG. There is much more to them, enough that we think we can build a multi-billion-dollar company on it.

Improving reliability and explainability of Gen AI

A Knowledge Graph serves as a single source of truth that can be audited, explained, fixed if there are errors in it, and updated incrementally when new data arrives. These features make it ideal as an external memory for LLMs. Additionally, the structured nature of Knowledge Graphs allows for automatic memory tidying, such as detecting obsolete or contradictory information (e.g. if a company has two CEOs, a symbolic warning is raised and LLM agents are asked to clarify the situation).
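The two-CEOs check is a cardinality constraint, and a minimal Python sketch of it could look like this. The predicate names and the set of single-valued predicates are assumptions for the example; what happens after the warning (asking LLM agents to clarify) is left out.

```python
from collections import defaultdict

# Assumption: these predicates admit at most one object per subject.
FUNCTIONAL_PREDICATES = {"has_ceo"}


def find_contradictions(triples):
    """Return {(subject, predicate): [objects]} wherever a single-valued
    predicate carries more than one distinct object."""
    seen = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate in FUNCTIONAL_PREDICATES:
            seen[(subject, predicate)].add(obj)
    return {key: sorted(objs) for key, objs in seen.items() if len(objs) > 1}


# Hypothetical graph content.
triples = [
    ("Acme", "has_ceo", "Alice"),
    ("Acme", "has_ceo", "Bob"),
    ("Acme", "has_office_in", "Paris"),
    ("Acme", "has_office_in", "Lyon"),   # multi-valued, so not a contradiction
    ("Globex", "has_ceo", "Carol"),
]
warnings = find_contradictions(triples)
```

Each entry in `warnings` points at the exact facts in conflict, which is what makes the memory auditable: the contradiction can be traced, resolved, and the obsolete fact retired.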

These capabilities are not just nice-to-haves: they are essential prerequisites to enter the large enterprise segment we aim to serve.

Data integration & systems interoperability

This challenge has persisted for decades already. Despite efforts to centralize data in data lakes, warehouses, and lake-houses, true integration isn’t achieved just by storing data in one place. Knowledge Graphs provide a shared semantic layer between LLMs and proprietary data, linking applications through common semantics and business logic. With an ontology in place, data integration involves mapping raw data to the ontology, while application development focuses on creating interactions with ontological objects.

In the future, Knowledge Graphs will be used not only to describe the business, but also to operationalize it

Because they are structured, Knowledge Graphs are machine-readable (the objects they contain can be instantiated!). They also contain standardized business logic. Bob Muglia, former CEO @ Snowflake and a huge fan of Knowledge Graphs, says:

“Eventually, it is possible for [the rules they contain] to be fully executable and actually run aspects of the business within the database and the knowledge graph because relational knowledge graphs have the ability to execute business logic.”

“When we have an economical, usable relational knowledge graph that takes the power of the relational mathematics and applies it to data of any shape and size, we’ll be able to model business. For the first time we will be able to actually create a digital twin of the business.”

Bob Muglia

Knowledge Graphs will change user interfaces forever; and no, we’re not talking about raw graph visualization!

While natural language interfaces are useful for some tasks, they are often overrated, and raw graphs make for an even worse UI. But graphs can be used to generate structured reports with a dynamic, locally-relevant structure. Let us explain.

Imagine being an analyst at an investment bank tasked with analyzing Mongolian Mining Corporation, a coal mining company listed in Hong Kong. With no background in the coal industry, you are overwhelmed by hundreds of pages of market research and annual reports.

The problem is, you don’t even know where to start or which questions are relevant. Instead of leaving you to struggle with random queries and poorly structured answers, Blue Morpho can structure the documents into a knowledge graph and ontology. The resulting interface wouldn’t be a raw graph visualization, however, but structured reports, similar to enriched encyclopedia pages, with the graph powering the content in the background.

For example, the Mongolian Mining Corporation page would be organized around key concepts from the ontology like Business Segments or Subsidiaries, as any other company page would be. But it would also contain more specific concepts, like Mines and Coal Reserves: the company’s mining licenses and coal reserves are linked to it in the graph and classified as such in the ontology, so this local information is injected from the page’s nearest neighbors. The text of the report would be generated offline by LLMs, and if in a sentence you stumble on a term you don’t know, such as “metallurgical coal”, chances are it is also a node in your graph, meaning you can click on it and keep exploring.
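A minimal Python sketch of that idea: the sections of a page are derived from the ontology classes of the node's direct neighbors. All node names other than the company itself are placeholders, and the real system would then hand each section to an LLM to write the prose.

```python
from collections import defaultdict

# Hypothetical ontology classification of graph nodes (placeholder names).
NODE_CLASS = {
    "Mongolian Mining Corporation": "Company",
    "Other Company": "Company",
    "Segment A": "Business Segment",
    "Segment B": "Business Segment",
    "Subsidiary A": "Subsidiary",
    "Mine A": "Mine",
    "Reserve A": "Coal Reserve",
}

# Undirected edges between nodes (placeholder data).
EDGES = [
    ("Mongolian Mining Corporation", "Segment A"),
    ("Mongolian Mining Corporation", "Subsidiary A"),
    ("Mongolian Mining Corporation", "Mine A"),
    ("Mongolian Mining Corporation", "Reserve A"),
    ("Mine A", "Reserve A"),
    ("Other Company", "Segment B"),
]


def build_page(node, edges=EDGES, node_class=NODE_CLASS):
    """Group the direct neighbors of `node` by ontology class.

    Each class becomes a section of the page, so the page structure
    depends on what is actually connected to the node.
    """
    sections = defaultdict(list)
    for a, b in edges:
        if a == node:
            neighbor = b
        elif b == node:
            neighbor = a
        else:
            continue  # edge does not touch this node
        sections[node_class[neighbor]].append(neighbor)
    return {cls: sorted(items) for cls, items in sorted(sections.items())}


mmc_page = build_page("Mongolian Mining Corporation")
other_page = build_page("Other Company")
```

Both pages share the generic Business Segment section, but only the mining company's page gains Mine and Coal Reserve sections, because only its neighborhood contains nodes of those classes.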

This navigation pattern emulates the speed (milliseconds!) and connectivity of the web for internal documents.

Wrapping up - Last but not least: our organizational beliefs for building a high-performing team

Our vision is big and wild, and we are building a team of exceptional and determined people to get there.

Everyone who joins the team has the power to change the trajectory of Blue Morpho. That’s the way we think about hiring, and that’s how high our bar is.

We are optimizing the company for the most extraordinary outcomes: we won’t settle for small gains or average results.

We believe in small teams.

We believe in sharing strategic information with everyone internally, irrespective of their job title, because great ideas can come from anywhere, provided that team members are given the right context to form valid ideas. We believe in openly discussing strategic topics, but we’re not looking for internal consensus either, because consensus too often yields average results. For every strategic consideration, there’s one and only one person in charge. The strategy is discussed openly but in the end, the person in charge makes the decision, takes responsibility for it, and everyone else 100% commits to it, whatever they think about it.

We are generous with stock options: we want team members to own a share in Blue Morpho’s success.

If this resonates with you, drop us a line on LinkedIn ;)
