The Limits of Generative AI

The Evolving Promise of AI

For decades, Artificial Intelligence has captivated organizations, promising to redefine what's possible. It has continuously pushed the boundaries of what machines can achieve, and each new announcement of AI progress has captured the imagination. However, the practical applications have not always lived up to those grand possibilities.

Now, we have a truly transformative leap: Generative AI. Unlike symbolic AI, which followed explicit rules, or even traditional machine learning, which focused on narrow predictions, Generative AI doesn't just understand patterns; it excels at creating entirely new things. Crucially, it has also learned to generalize. A single, powerful model can now tackle many different problems, dramatically reducing the complexity of older machine learning approaches. This isn't about classifying a dog; it's about drawing a new dog that has never existed, writing a story from scratch, or summarizing a document on a topic it has never seen before.

At their core, these generative AI models are trained on truly vast amounts of diverse data—from publicly available web pages to licensed content and open-access datasets. This allows them to internalize broad patterns of language, visuals, logic, and world knowledge. This impressive breadth makes them seem like instant experts across countless domains.

Understanding and Operationalizing Generative AI

So how do these remarkable models actually work? To truly harness their power, we need to look under the hood at their training, their structure, and how they're put to use.

The journey of a generative AI model begins with its foundational training. These colossal models require staggering volumes of diverse data to internalize the patterns of language, visuals, and logic. Feeding that data through powerful learning algorithms is an immense undertaking, typically running on massive clusters of specialized, high-performance computers for days, weeks, or even months at a time. This sustained computational effort builds the intricate, multi-layered connections that define these vast models.

Today, the two most prominent types of generative AI models specialize in distinct domains:

Transformers, the architecture behind the powerful Large Language Models (LLMs) that underpin chatbots like ChatGPT, excel at language. Their core idea is to master language by predicting the next word in a sequence. Given an initial phrase, the model predicts the most probable next word, appends it to the sequence, and then predicts the word after that, building coherent text one word at a time. By mastering this iterative process, they become incredibly adept at generating fluent, remarkably human-like text.
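
To make that loop concrete, here is a minimal sketch in Python. The hand-written bigram table stands in for a real model's billions of learned weights; every word and probability in it is invented purely for illustration.

```python
import random

# Toy "model": for each word, the probabilities of the word that follows it.
# A real LLM computes these probabilities with billions of learned weights;
# this hand-written table is purely illustrative.
NEXT_WORD_PROBS = {
    "the":  {"cat": 0.5, "dog": 0.5},
    "cat":  {"sat": 0.7, "ran": 0.3},
    "dog":  {"ran": 0.6, "sat": 0.4},
    "sat":  {"down": 1.0},
    "ran":  {"away": 1.0},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}

def generate(prompt_word: str, max_words: int = 10) -> str:
    """Build text one word at a time: predict the most probable next word,
    append it to the sequence, then predict again."""
    words = [prompt_word]
    for _ in range(max_words):
        choices = NEXT_WORD_PROBS.get(words[-1])
        if choices is None:
            break
        # Sample the next word according to the model's probabilities.
        next_word = random.choices(list(choices), weights=choices.values())[0]
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat down"
```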

Diffusion models specialize in generating stunning images, videos, and even audio from simple text descriptions. Their core idea is surprisingly intuitive: imagine taking a clear image and progressively adding random static until it’s pure noise. A diffusion model is then trained to meticulously reverse that process—removing the noise step-by-step until a clear, generated image emerges from the chaos.
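
The sketch below shows only the forward, noise-adding half of that process, using a small random array as a stand-in for an image; the reverse, denoising half appears as conceptual comments, since in a real system it is performed by a trained neural network. The array size and noise schedule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a clear image: an 8x8 grayscale array (values illustrative).
image = rng.uniform(0.0, 1.0, size=(8, 8))

NUM_STEPS = 100
NOISE_SCALE = 0.1  # how much static each step adds (an assumed schedule)

def add_noise_step(x: np.ndarray) -> np.ndarray:
    """Forward process: blend the image one step closer to pure static."""
    noise = rng.normal(0.0, 1.0, size=x.shape)
    return np.sqrt(1 - NOISE_SCALE) * x + np.sqrt(NOISE_SCALE) * noise

# Progressively destroy the image; after enough steps it is ~pure noise.
noisy = image
for step in range(NUM_STEPS):
    noisy = add_noise_step(noisy)

# A diffusion model is a neural network trained to predict and subtract the
# noise added at each step. Generation runs that network in reverse:
#
#   x = pure_noise
#   for step in reversed(range(NUM_STEPS)):
#       x = x - trained_network.predict_noise(x, step)  # conceptual only
#
# After all reverse steps, x is a brand-new, clear image.
print("variance after noising:", noisy.var())  # ~1, i.e. pure static
```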

Following this foundational training, models undergo a critical refinement phase. When first trained on the unfiltered expanse of the public internet, they are, in a sense, 'wild.' They absorb everything—the good, the bad, and the ugly. A raw model might generate harmful, biased, or dangerous content; for instance, it might give a detailed, accurate answer to a question about how to build a chemical weapon. This is akin to bringing home a powerful, energetic puppy: without consistent guidance, it might grow up to do unpredictable, even destructive, things. This is where specialized techniques, often involving human feedback (such as Reinforcement Learning from Human Feedback, or RLHF), are used to 'teach' the model to behave. This refinement process guides the model to refuse harmful requests, avoid bias, and follow ethical guidelines, transforming it from a raw capability into a controlled, reliable tool.

Key Challenges for Organizational Use

Once a model is trained and refined, understanding its internal organization is key to using it effectively. This is where the major challenges for business implementation appear.

The 'Black Box' Knowledge Problem

Perhaps the most counterintuitive aspect of these models is how they "know" things. The information isn't stored in any logical, organized, or structured way. It's a true 'black box.' There are no neatly compiled lists of countries or organized tables of presidents stored internally. Instead, their vast knowledge is represented as billions of intricate connections and numerical 'weights' within a colossal network of artificial neurons.

To picture this, imagine a world-class chef who consistently creates perfect, complex dishes. Their culinary intuition is flawless. But if asked for the exact, written recipe, they might struggle or give a slightly different version each time. Their expertise isn't stored as a cookbook in their head, but as countless patterns of taste, touch, and experience encoded into their intuition—their own neural network.

That's precisely how these AI models hold knowledge: as embedded patterns, not as retrievable facts. They are powerful statistical engines, incredibly good at predicting patterns, but they don't 'know' objective truth. When asked a question, they aren't accessing a factual database; they're calculating the most probable answer based on the patterns they've seen. This is why they can generate plausible-sounding but entirely incorrect information—a phenomenon known as 'hallucination.' This presents a fundamental challenge for any business, as consistently accurate and verifiable answers are paramount for critical operations.
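
A toy example makes the distinction vivid. The candidate answers and probabilities below are entirely invented, but they show the mechanism: the model emits the likeliest continuation, which is not always the true one.

```python
# Invented probabilities: a model that saw "Sydney" mentioned far more often
# alongside "Australia" in training may rank it above the correct "Canberra".
next_token_probs = {
    "Sydney":    0.46,  # frequent in training text -> high probability
    "Canberra":  0.41,  # the factually correct answer
    "Melbourne": 0.13,
}

# There is no fact table to consult; the model simply emits the likeliest token.
answer = max(next_token_probs, key=next_token_probs.get)
print(f"Q: What is the capital of Australia?\nA: {answer}")  # -> "Sydney"
```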

The Deployment Dilemma

With an understanding of the models, the next consideration is how to deploy them. A large enterprise might consider installing these powerful AIs on its own servers, but the reality is that these foundational models are truly colossal. Managing the required computer clusters is an incredibly complex and expensive undertaking. Furthermore, with new, better models emerging every few months, the investment would be outdated almost as soon as it's made.

This is why most companies access these models remotely through a provider's API. This allows companies to integrate cutting-edge AI without managing the underlying machinery. However, this means a company's data, including sensitive proprietary information, must leave the safety of its own digital walls, a prospect that makes many organizations uncomfortable.
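
In practice, that remote access looks something like the sketch below. The endpoint, model name, and file are hypothetical, but the shape is representative: the sensitive document travels to the provider inside the request body.

```python
import requests

# Hypothetical provider endpoint and key; real providers differ in detail,
# but the shape is the same: your data travels in the request body.
API_URL = "https://api.example-ai-provider.com/v1/chat"
API_KEY = "sk-..."  # secret credential issued by the provider

# Illustrative filename for sensitive internal data.
proprietary_report = open("q3_internal_financials.txt").read()

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "provider-large-model",
        "messages": [
            {"role": "user",
             # The proprietary text leaves your network inside this payload.
             "content": f"Summarize this report:\n\n{proprietary_report}"},
        ],
    },
    timeout=60,
)
print(response.json())
```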

The 'Temporary Memory' Bottleneck

Finally, the most critical limitation is the model's 'temporary memory.' When you interact with a generative AI model, any specific data it needs to form its answer—a particular document, an email thread, or a set of numbers—must be sent along with your prompt.

This input has to fit within the model's small, temporary working memory (its 'context window') for that single interaction. This memory, while growing, remains severely limited. It can hold only a tiny fraction of the information an enterprise possesses, akin to describing an entire library by showing someone just a single, small index card. For the model to generate a relevant response, the information sent in the prompt must be highly curated and precise. The model processes this small, temporary batch, generates its response, and then immediately forgets it for the next interaction.
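
Here is a rough sketch of that curation step, assuming a 4,096-token window; splitting on whitespace is a deliberate simplification (real systems count tokens with the model's own tokenizer), and the filename is illustrative.

```python
# Trim input so the prompt fits a fixed context window.
CONTEXT_LIMIT = 4096         # assumed window size, in tokens
RESERVED_FOR_ANSWER = 512    # leave room for the model's reply

def fit_to_context(document: str, question: str) -> str:
    """Keep only as much of the document as the window can hold."""
    budget = CONTEXT_LIMIT - RESERVED_FOR_ANSWER - len(question.split())
    doc_tokens = document.split()
    if len(doc_tokens) > budget:
        # The whole library will not fit: send only the 'index card'.
        doc_tokens = doc_tokens[:budget]
    return "\n\n".join([" ".join(doc_tokens), question])

prompt = fit_to_context(open("employee_handbook.txt").read(),
                        "What is our parental leave policy?")
# The model sees only this prompt; after it answers, nothing is remembered.
```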

This fleeting memory severely limits the model's ability to build a comprehensive, contextual understanding of your unique business operations. Without a robust external system designed to manage and feed it the right knowledge on demand, its utility remains sharply restricted.

The Path Forward

Generative AI acts as half of the brain. While it offers unprecedented power, its 'black box' nature, immense operational costs, and critical memory limitations create a formidable barrier for enterprise use. The single greatest challenge is bridging the gap between this powerful, generalist technology and the specific, proprietary knowledge of a business.

To address this challenge, the other half of the brain is required: a structured organizational memory working in perfect harmony with Gen AI models, transforming a brilliant but amnesiac expert into a true corporate asset. Together, they will form the Enterprise Brain.


Gerard Francis
