Agile Architecture for software development covers a broad spectrum of ideas. Multiple authors have proposed different architectures capable of making software changes and maintenance easy, quick, and cheap. The following paragraphs extract concepts from my book Scrum in AI and present a practical example of Agile Architecture with an exciting open-source project called “Cheshire Cat.”
What does Agile Architecture mean?
I’d like to start with Alistair Cockburn’s definition of Agile:
"The ability to move and change direction quickly and with ease."
With this definition in mind, an Agile Architecture is one that makes it easy to change and improve the product or, as Craig Larman often repeats, "to turn on a dime for a dime." This means that designing for change is more important than designing for perfection on the first try.
So, if the architecture is fully emergent, in some way evolutionary, does this mean you shouldn't make any up-front design decisions? The truth is that architecture, in Agile terms, is "the set of all the decisions that you cannot not take."
The typical Agile approach encapsulates these design decisions that you cannot not take into modules. Let's define a product module as a means to deliver value that the customer perceives. A component, instead, is a convenient way to separate elements based on their technical nature or the way they are distributed.
Organizational Architecture and Product Architecture
When scaling up an Agile organization from a single Scrum Team to an organization of multiple teams and dozens or hundreds of people, you must consider multiple constraints. The first critical constraint is the relationship between the organization and the product architecture, defined by Conway's Law (1967):
Any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure.
If you have four groups working on a compiler, you'll get a 4-pass compiler.
Given Conway's Law, evaluating an organizational architecture without having the product architecture on the side and vice-versa is conceptually wrong. The two will influence each other, and the strongest will drive the other. Since the purpose of any company is to deliver frictionless value to customers, the organization should be the simplest human structure to deliver value through a product.
Agile Architecture and Artificial Intelligence
While coaching a team working on Natural Language Processing, I used the same Faraday Future video described before as a metaphor. I liked this metaphor a lot because, in electric cars, the batteries power the engine and the vehicle, but the engine charges the batteries back during deceleration. This bi-directional relationship resembles the process of Machine Learning model training (batteries powering the engine) and data enrichment (the engine charging back the batteries). A clever UX designer on the team had the idea of visualizing the modules on a canvas to facilitate the clarification and design of User Stories. The following is a re-elaboration of the original NLP Architecture Canvas for generic use:
Why do we consider these three parts of the architecture to be modules and not components?
- Each module could have its own users, and we could combine different sets of Batteries and Engines to serve different Vehicles.
- The Vehicle could also represent existing applications, not only new AI-specific developments. At the same time, Batteries and Engines could be internal developments or external tools defined only by their interfaces.
- Development pace and lifecycle were expected to differ; we anticipated the possibility of rebuilding a module without significantly changing the others.
With this Modular Architecture, we assessed people's skills to map the relationships between the Product Architecture and the Teams' Architecture.
This example alone is probably too theoretical for you to implement the concept in your company. So, to give you more details without violating my NDA with that company, I’d like to describe the same concept of an AI Agile Architecture using an open-source example: “Cheshire Cat.”
Cheshire Cat, a practical example
The Cheshire Cat is an open-source framework that allows you to develop intelligent agents on top of many Large Language Models (LLMs). The Cheshire Cat embeds a long-term memory system to save the user's inputs and produce answers informed by the context of previous conversations. You can also feed documents into the memory system to enrich the agent's contextual information. You can extend the agent's capabilities by writing Python plugins that execute custom functions or call external services.
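To make the plugin idea concrete, here is a minimal sketch of a custom tool in the Cheshire Cat plugin style. In a real plugin the decorator would be imported from the framework (`from cat.mad_hatter.decorators import tool`); it is stubbed here so the sketch runs standalone, and the tool name and prices are invented for illustration.

```python
# Stub of the framework's tool decorator so this sketch runs standalone;
# in a real plugin you would write: from cat.mad_hatter.decorators import tool
def tool(func):
    func.is_cat_tool = True  # marker standing in for the framework's registration
    return func

@tool
def socks_prices(color: str, cat=None) -> str:
    """Useful to get the price of a pair of socks. Input is the color."""
    # The docstring matters: the agent reads it to decide when to call the tool.
    prices = {"white": 5, "black": 6, "pink": 7}
    return f"{prices.get(color, 10)} euros"
```

When the plugin is installed, the agent can decide on its own to invoke such a tool whenever the conversation calls for it, based on the tool's docstring.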
You can think of Cheshire Cat as the “WordPress for Natural Language Processing with LLMs.”
Cheshire Cat Architecture
Cheshire Cat shows a very elegant software architecture, detailed in the documentation:
Following the NLP Canvas mentioned previously, the Cheshire Cat architecture can be summarized as follows:
Cheshire Cat is distributed as two Docker containers: cheshire-cat-core and cheshire-cat-admin. These two containers represent the Vehicle (cheshire-cat-admin) and the Batteries (cheshire-cat-core). To be fair, given the complexity of the Cheshire Cat agent architecture, calling it just Batteries is an understatement; “batteries and electronics” would be more accurate.
The stock Web application works as an Admin console, where you can configure the Cheshire Cat and chat with it to test its behavior. As far as I understand, it's not an application meant to be customized by people who want to build their own applications, but rather an example of how to interface with the Cheshire Cat Core.
To make the development of client applications easier, I developed some adapters, available here:
This collection of adapters includes a chat through REST API, a command-line interface, a single-page HTML client, and an Apple Siri interface.
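As a sketch of what such an adapter does, the following builds the JSON payload a chat client would send to the Cat core. The URL, the default port 1865, and the `{"text": ...}` message shape are assumptions to verify against the Cheshire Cat documentation.

```python
import json

# Assumed defaults for a locally running cheshire-cat-core container;
# check the official documentation for the actual endpoint and schema.
CAT_WS_URL = "ws://localhost:1865/ws"

def build_message(text: str) -> str:
    # Serialize a user utterance into the JSON payload sent to the core.
    return json.dumps({"text": text})

payload = build_message("Hello, Cheshire Cat!")
```

A real client would then open a WebSocket connection to `CAT_WS_URL` (for example with a third-party WebSocket library), send the payload, and read the agent's reply from the response stream.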
Interview with Piero Savastano, founder of Cheshire Cat
Thank you, Piero, for your time. How did you start your career as a Data Scientist and AI expert?
I've been following Artificial Intelligence topics since I was 19 years old; now it's been 20 years. Coming from a background in Psychology, Neuroscience, and Computer Science, I've been quite interdisciplinary. Regarding Psychology, I was very interested in understanding the patterns of brain functioning, so neural networks made me very passionate. Once I graduated, I started as a researcher at CNR (the Italian National Research Council) and worked a lot on neural networks. Then I continued as a consultant on these topics, which I am passionate about on a personal level, not only as a professional.
How did you get the idea of the Cheshire Cat?
What I have noticed over time is the continued centralization of these techniques and the predominance of the brute-force approach. This has led to a phenomenon that prevents the creation of models in-house, which has materialized both in computer vision and in language.
In wondering what I could do as a consultant and trainer, but also as a former researcher, to enrich this ecosystem and make it a little less centralized, I came to the idea of doing something to benefit even the small players. I would like Cheshire Cat to become something similar to what WordPress has been in the Web arena, which has allowed professionals and small businesses to create their online presence, from a simple website to a complex and scalable platform.
The idea is also to create a system that assumes the LLM is not the entire artificial intelligence but only a part of it. It is, therefore, a cognitive architecture in which the LLM performs the function that Broca's area performs in the human brain: the area of articulated language, which translates cognitive phenomena occurring elsewhere into language.
And finally, I would like us to be able to leverage the competition among big tech companies in creating increasingly comprehensive and less expensive LLMs and, by building an adapter that can talk to all of them, deliver a bit of a slap in the face to the centralization and monopoly of big companies and dominant countries by bringing control back to the users.
In the future, as consumer hardware gradually reaches an acceptable level, we can offer an on-premise model with the Cheshire Cat, completely disengage from the big cloud corporations, and avoid sending our data outside our corporate or private perimeters.
How did you come up with the idea of treating the Cheshire Cat prompt in two steps: process a hypothetical prompt, enrich it with information, and then create the actual prompt?
The idea comes from the "Information Retrieval" literature, in which, instead of searching directly, an assumption is made about a potential answer before searching. The original paper, published a few months ago, is titled "Precise Zero-Shot Dense Retrieval without Relevance Labels," in which the authors propose to pivot through Hypothetical Document Embeddings (HyDE): https://arxiv.org/abs/2212.10496
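The HyDE pivot can be illustrated with a toy retriever: instead of embedding the question itself, we embed a hypothetical answer (normally generated by an LLM) and search with that. The bag-of-words "embedding" below is a stand-in for a real neural encoder, and the documents are invented for the sketch.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a neural encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "the cheshire cat stores conversations in long term memory",
    "docker compose starts the core and admin containers",
]

def hyde_search(question, hypothetical_answer):
    # HyDE: retrieve with the embedding of a *hypothetical* answer
    # (normally produced by an LLM), not of the question itself.
    qvec = embed(hypothetical_answer)
    return max(documents, key=lambda d: cosine(qvec, embed(d)))

best = hyde_search(
    "Where does the Cat keep chat history?",
    # An LLM-generated guess at an answer (hard-coded here for the sketch):
    "conversations are stored in a long term memory",
)
```

The question shares almost no words with the relevant document, but the hypothetical answer does, so pivoting through it retrieves the right passage even with this crude embedding.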
Yes, LangChain can do this, but I reimplemented it entirely because I wanted the ability to insert Hooks and build customization plugins, and LangChain's sophisticated abstraction prevented me from doing so.
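To show what a hook enables, here is a minimal sketch in the Cheshire Cat plugin style. As with the framework's real hooks, the idea is that a plugin function intercepts and rewrites a value the core passes through it; the decorator is stubbed here so the sketch runs standalone, and the hook name and prompt text are illustrative.

```python
# Stub of the framework's hook decorator so this sketch runs standalone;
# in a real plugin: from cat.mad_hatter.decorators import hook
def hook(func):
    func.is_cat_hook = True  # marker standing in for the framework's registration
    return func

@hook
def agent_prompt_prefix(prefix: str, cat=None) -> str:
    # A hook receives a value from the core pipeline and returns a
    # (possibly modified) version; here it customizes the agent's persona.
    return "You are the Cheshire Cat, grinning and cryptic. " + prefix
```

Because the core calls such hooks at fixed points in its pipeline, plugins can customize behavior without forking the framework, which is exactly the extension point the heavy abstraction layers made hard to reach.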
What do you see in the future for Cheshire Cat?
From the technical point of view, we will focus on increasing the Cheshire Cat’s simplicity, extensibility, and robustness. Much work will need to be done with documentation as I want to grow the ecosystem of contributors and derivative projects. Above all, I would like to grow the community and make it a structured entity where contributing is enjoyable.
Finally, my goal is to give back to the Open Source community, from which I have benefited greatly in my career, and I think it is time for me to contribute.