Superintelligence: The Next Great Power Struggle

When a novel AI system can rapidly design useful new antibiotics like Halicin, or can engineer in hours chip layouts that rival the best work of human designers, we’re witnessing civilization-scale innovation become a real form of economic power. Hundreds of billions (and soon trillions) of dollars are flowing into AI R&D from startups, big tech, and governments, and many of these players are now sprinting toward superintelligence, loosely defined as AI that is smarter (at least by a bit) than the most intelligent humans. Whoever controls the most powerful superintelligent models will hold the keys to future economies, scientific discovery, and social order.

Unfortunately, policymaking today remains far behind the accelerating pace of AI development. In 2023, Biden’s Executive Order 14110 instituted watermarking guidance and a number of federal AI safety roles, but the follow-through was quite limited. Now the White House under Trump has unveiled “America’s AI Action Plan,” a bold pivot focused on deregulation, rapid data-center expansion, and AI export bundles for allied nations.

The core recommendation: Create secure, full-stack AI export packages — hardware, software, standards — to anchor global model adoption to U.S. influence. It is essentially a geopolitical and economic gambit, not a working policy that actually addresses how to maintain some level of control over the key technical breakthroughs of superintelligence when it arrives.

🔍 What’s at Stake

  • Scientific breakthroughs: AI is already accelerating drug discovery by years. Investment just for mitochondrial diseases has crossed $500M this year alone. More dramatic discoveries are likely when superintelligence arrives, and the economic potential is almost certainly in the trillions.
  • Unexpected spin-offs: Quantum-safe optimization, climate model reductions, brain‑computer interface advances—none of which fit neatly into today’s regulatory boxes.
  • Global norms on open models vs. closed: Hugging Face CEO Clem Delangue warned this week that China’s open‑source models risk embedding state-driven censorship “cultural aspects…the Western world wouldn’t want to see spread”.

🚦 Forks in the Road Ahead: Open vs Closed AI Models

This is the defining dilemma of AI: Open models accelerate progress, but closed models consolidate power. Early evidence is clear: Open-source AI models can boost economic value and innovation, by enabling faster iteration, reproducibility, and a broader developer base. But openness comes with a steep geopolitical price: China now leads in open AI development, controlling 60% of global frontier models as of 2024 by some measures.

The reality is that state actors can co-opt Western breakthroughs overnight, at least in open settings. Closed models, on the other hand, may slow innovation and limit outside oversight, but they retain strategic control, keeping the most advanced capabilities behind corporate or national walls. The question isn’t whether openness or secrecy is better—it’s which risks we are prepared to absorb: stagnation and concentration, or proliferation and misuse. This tradeoff now defines the fork in the road ahead:

  1. Full-Steam Race
    No regulation; model innovation runs wild. Risk: Runaway power without checks. Global surveillance, identity manipulation, or misaligned AGI. The new AI Action Plan from the White House essentially puts us on this trajectory.
  2. Carefully Coordinated Regime
    Binding export and compute caps, model provenance tracking, IP control, and multilateral audit frameworks—akin to nuclear treaties.
    Think: Global AI Marshall Plan calling for democratic compute and monitoring consensus.
  3. Hybrid Path
    Strong national innovation, transparent oversight, and alliance-level tech guarantees—domestic push with democratic allied coordination that protects and controls technical advances in AI models to keep them out of the hands of unfriendly foreign governments and bad actors.

🔗 Key Policy Gaps

  • Export as influence: The new plan is about embedding U.S. standards through export bundles, not regulating domestic model releases.
  • Open-source bias: Without norms, Chinese models on open platforms may propagate censorship or propaganda.
  • No AGI governance: While massive sums are being invested in data centers and power generation, there are no mechanisms to govern frontier model alignment for superintelligence and beyond.

🧭 Recommendations—Hedging Civilizational Risk

The core risk of uncontrolled open AI is this: Once a sufficiently advanced model is released, it cannot be recalled. And unlike traditional software, frontier models can enable catastrophic misuse with minimal modification. Researchers have already shown that large language models can generate viable biological weapon synthesis protocols, design novel pathogens, and construct autonomous cyberattack chains. Capabilities once limited to nation-state labs are now one fine-tune away from public availability (RAND, Nov 2024; NTIA, Jan 2024).

The threat isn’t theoretical: Open models like LLaMA have been jailbroken, with safeguards circumvented within weeks of release. Without shared global standards for provenance, control, and auditing, we risk seeding the digital equivalent of unregulated nuclear material into the open internet. The recommendations that follow assume this reality—and ask what firms, enterprises, and governments must now do to stay ahead of the curve without ceding the future to chaos or authoritarian dominance.

For AI Firms (OpenAI, Meta, Anthropic, etc.)

  • Implement secure-by-design: Watermarking + weight-signing + open audits (see the sketch after this list).
  • Delay frontier model release until audit, licensing, and multi-party governance are in place (e.g., export as package, not leak).
  • Join international coalitions to standardize responsible openness.
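
To make “weight-signing” concrete, here is a minimal sketch of what release verification could look like: hash the published checkpoint and sign the digest so downstream users can confirm they are loading the audited artifact. This assumes the Python `cryptography` package; the file name and key handling are purely illustrative, not any lab’s actual release process.

```python
# Minimal sketch of "weight-signing": hash a released checkpoint and sign the
# digest so downstream users can verify provenance before loading it.
# Assumes the `cryptography` package; the file path and key handling are illustrative.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def digest_weights(path: str) -> bytes:
    """SHA-256 digest of the weight file, streamed to handle large checkpoints."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()


# Publisher side: sign the digest and distribute weights + digest + signature + public key.
private_key = Ed25519PrivateKey.generate()  # in practice, a tightly protected release key
public_key = private_key.public_key()
digest = digest_weights("model_weights.safetensors")  # hypothetical file name
signature = private_key.sign(digest)

# Consumer side: recompute the digest and verify it against the published signature.
try:
    public_key.verify(signature, digest_weights("model_weights.safetensors"))
    print("Weights match the signed release.")
except InvalidSignature:
    print("Weights have been altered or are not the signed release.")
```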

For Enterprises

  • Demand model provenance, watermark verifiability, and supply-chain traceability.
  • Invest in hybrid infrastructures: cloud + on-prem control to hedge against AI ecosystem failure.
  • Insist on explainable alignment as a procurement standard.

For Governments

  • Embed AI export strategy within a democratic bloc—U.S., EU, Japan, UK—to enforce safety norms.
  • Mandate transparency: National oversight bodies to certify “open-yet-aligned” frontier models.
  • Prepare deterrence doctrine: credible threat of sanction or tech suspension against misaligned or weaponized AI use.

🏁 Conclusion

Those who follow my work know that I’m very much a tech positivist. But all innovation is a double-edged sword, and AI is likely the most powerful technology we’ve ever developed. We stand at a genuinely historic juncture: Build democracy-enabling superintelligence, or unleash power that could reshape societies without democratic control. The path ahead demands a fusion of bold innovation and ironbound governance, while we still can. If we balance speed with structure, we might just build the bright future that many of us are hoping for.

Enterprise Tech Predictions for 2025

As 2025 begins, the global technology landscape is on the cusp of entering a major new era, one almost totally defined by the arrival of pervasive AI combined with the urgent need for breathtaking speed, scale, and complexity in execution. As businesses worldwide pivot to capitalize on the vast new digital opportunities that AI delivers, I find that five key factors loom large in shaping this transformation: The hyperaccelerated adoption of artificial intelligence, the ubiquity of cloud computing across a wider spectrum, a surge in data security and privacy concerns, the tightening of the tech talent pipeline, and the growing war chests required to participate in the game at all these days.

Together, these forces push enterprises to refine and uplevel their digital ambitions. The pace and scale are also driving high-stakes investments in infrastructure and skills that are reshaping how and where innovation happens. Tech associations like the Open Compute Project and IEEE are reporting record interest in the very latest cutting-edge research, underscoring a universal appetite for next-level breakthroughs that promise to redefine the global economy.

However, this tech evolution also unearths fresh challenges as organizations grapple with bottlenecks in resources and policy. Cloud infrastructure, once an enabler of nimble deployments, now requires massive capital expenditures and continuous optimization to serve ever-growing user demands, while AI’s fast-expanding capabilities put unprecedented pressure on outdated governance, legal, and compliance frameworks. Concurrently, data management, security, and privacy remain a top priority for enterprises of all sizes, creating pressing needs for standardized global regulations that can keep up with cross-border data flows. At the same time, the worldwide shortage of skilled professionals capable of building AI-powered products in an environment of high complexity adds another layer to the challenge.

Almost forgotten in the rush are sustainability initiatives—driven by environmental considerations and highlighted by global trade and massive AI buildouts—which must now be baked into every strategic decision in most enterprises. The net effect is a dynamic global environment where the gap between leading and lagging companies and their economies will almost certainly widen, ushering in an age that demands decisive action almost immediately, deep long-term investment, and a bold reimagining of what enterprise technology can be used to achieve.

2025 Tech Trends by Dion Hinchcliffe

Here’s how I see this year shaping up on this high-speed, increasingly hypercompetitive, and very disruptive trajectory:

Prediction: AI will continue to drive the tech industry and stock market for now

AI is set to dominate both the technology landscape and the stock market through 2025, but the path to glory will not be smooth for all players. Several giant tech firms—think Amazon, Google, Microsoft, IBM, and Meta—can easily encounter their first serious missteps in AI research or go-to-market strategies. These stumbles will likely stem from issues like overly optimistic revenue forecasts, mounting regulatory concerns, unsustainable costs, or fragmented internal priorities that hamper agility. Meanwhile, new entrants—along with established yet more focused players—will experience outsize gains as they double down on specialized AI hardware, software stacks, and vertical industry applications. NVIDIA, for instance, holds a uniquely powerful lead with its GPU technologies, robust developer community, and near-ubiquitous CUDA stack that is used in everything from data centers to supercomputers. As others attempt to pry the AI hardware crown from NVIDIA, there will be stiffer competition from chipmakers experimenting with AI-optimized architectures, but the long-standing ecosystem lock-in tilts the playing field strongly in NVIDIA’s favor for the next half-decade.

Still, the ongoing mania around AI stocks could cool if investors fail to see additional real-world success stories emerging. Companies like Palantir and Lenovo have enjoyed noteworthy financial success with AI-powered offerings, and their strong results feed into the broader narrative of limitless potential. But if only a handful of poster children can demonstrate consistently healthy revenues from AI initiatives, the market’s overall enthusiasm will start to wane. A more sustainable trajectory would require further success from a deeper bench of AI adopters—think specialized startups and forward-leaning enterprises across healthcare, finance, and industrial manufacturing—whose compelling use cases validate the technology’s staying power. Trade associations such as MLCommons and the ARC Prize Foundation are actively working to standardize AI and AGI performance benchmarks, which can help weed out inflated claims and bolster those who genuinely deliver. As these benchmarks and real-world implementations mature, we’ll likely see a more rational, albeit still fast-evolving, AI investment climate, with a small number of clear winners pulling away from the pack. But unless there are at least a half dozen AI profit-takers in 2025, the AI freight train may slow down a bit. And it’s hard to see how companies like Microsoft, which just announced a stunning $80 billion investment in new data centers for 2025, will earn that investment back in profits any time soon.

Prediction: Autonomy in all its forms will emerge as a top focus in 2025

AI agents and humanoid robots will reshape the future of work with speed and breadth that few anticipated. Agentic AI—exemplified by Salesforce Agentforce and other advanced frameworks like IBM’s watsonx.ai agents—will take center stage as evidence grows that organizations of all sizes will experiment with virtual bots that can plan, converse, coordinate, and make dynamic decisions. These virtual agents will essentially serve as the vanguard for the more tangible robotics revolution, allowing businesses to fine-tune their AI workflows, data integration, and governance models before physical machines enter the scene in larger numbers in coming years. By honing processes around agent-based AI, enterprises can prepare for the complexities of robot-human collaboration, training algorithms in low-risk, cost-effective environments that pave the way for next-generation humanoid robots.

However, the full rollout of physical, humanoid robots—like Tesla’s Optimus or Boston Dynamics’ Atlas—will more likely kick into high gear in 2026 and beyond. Once these robots start arriving at scale, their impact on the global labor market—estimated at around $50 trillion—will be profound. Hospitals, factories, service industries, and even retail will begin outsourcing a higher share of routine tasks to robotic systems, while AI skill marketplaces proliferate to help HR departments select from an array of specialized bots that compete with human talent. Although the idea of “replacing humans” definitely sparks concerns, widespread testing of virtual agents in 2025 will help mitigate risks and manage the transition more smoothly. By the time physical robots gain traction, organizations and employees alike will have established best practices for integrating AI-driven labor, ultimately creating a synergy between digital and physical agents that frees up human workers for the most creative, complex, and high-value tasks.

Prediction: The race accelerates to achieve AGI and superintelligence

The current pursuit of Artificial General Intelligence (AGI) represents the ultimate frontier in machine learning, yet its precise definition remains a subject of spirited debate. Some contend that AGI should be capable of learning and performing any intellectual task that a human can, while others insist it must reach a level of self-directed creativity and reasoning that surpasses human aptitude. This ambiguity creates fertile ground for competing visions, with leading figures such as Sam Altman at OpenAI, as well as researchers at Anthropic, exploring multiple pathways to accelerate AGI development. At stake is not just the technical challenge of achieving general intelligence—researchers must also tackle knotty governance, ethical, and interpretability issues that arise when AI systems can adapt and evolve in ways their creators might not fully anticipate. Many experts argue that getting alignment right—ensuring that the AI’s goals match human values—remains a daunting obstacle, especially as the technology edges toward a level of sophistication that borders on self-directed intellectual exploration. While I personally believe it’s difficult to achieve intelligence superior to humans with training data that exists only at the human level, training approaches will soon enough overcome this hurdle.

Despite these complexities, the allure and prestige inherent in the pursuit of AGI are powerful. The technology’s proponents envision profound breakthroughs in every domain touched by information and computation, from personalized healthcare to the discovery of truly novel inventions to climate modeling on an unprecedented scale. An AGI capable of synthesizing massive data sets and generating creative, strategic, outside-the-box solutions could compress decades of human-driven discovery into years or even months. The fervor surrounding AGI also explains why major stakeholders in the AI race have devoted considerable resources to securing top talent and unrivaled compute capacity—each believes that the first to achieve truly generalized machine intelligence will gain transformative advantages, influencing not only the future of business but also humanity’s trajectory in fields such as education, medicine, and beyond. Thus, while the goal of AGI remains elusive, the competition to get there is intensifying, fueled by both the promise of astonishing innovation and the recognition that whoever solves the puzzle may well shape the course of the 21st century. Specifically, my prediction is that AGI will consume many of the very best minds in AI, creating vast investment and talent sinks with uncertain outcomes beyond a claim to industry leadership. This will likely delay short- and medium-term ROI for many top AI companies.

Prediction: For better and worse, the ascendance of the tech oligarchs

The rise of tech oligarchs like Elon Musk, Peter Thiel, Mark Zuckerberg, and Marc Andreessen showcases how influential individuals with concentrated capital and a knack for success in disruptive innovation can reshape entire industries—and increasingly, societies. Their ventures range from AI and social media to aerospace and advanced computing, giving them the power to steer not just new technologies but also the cultural and civic currents surrounding them. By virtue of holding the purse strings to frontier research and spearheading high-risk, high-reward ventures, these figures can quickly move markets, corral talent, and set policy agendas in de facto ways that were once solely the domain of governments. This broad influence may spark breakthroughs—such as low-cost space exploration or ubiquitous internet access—but it also worries those who view democracy as predicated on a broad decentralization of power that is at odds with the near-monopolistic reach of these digital titans.

For now, governments around the world appear entirely uncertain how to strike a balance between reining in the excesses of these figures and harnessing the positive benefits that their bold, well-funded initiatives can bring. Regulatory frameworks are being written and re-written, but they lag behind real-world developments in AI, biotech, and social media at the scale these individuals operate. Societies, meanwhile, will be grappling with questions of privacy, equity, and cultural norms as they adapt—or sometimes bend—to the visions set forth by these tech giants. While 2025 won’t see a full reckoning, mounting concerns suggest that a tug-of-war over who gets to define ethical, political, and economic boundaries will continue to intensify. Governments, trade associations, and civic institutions will be busy exploring how to hold such influential actors accountable without stifling the innovation that just might power the next generation of breakthroughs.

For enterprises, billionaire mavens like Musk, Thiel, Zuckerberg, and Andreessen can mold entire markets with their big bets on AI, cloud, and other frontier tech—shifts that hit CIOs squarely. Their penchant for rapid, audacious, large-scale experiments can trigger sudden hardware shortages, fresh compliance rules, customer backlashes, or entirely new IT service models. CIOs must rapidly adapt to these top-down disruptions, reallocating budgets, revamping vendor relationships, and recalibrating security and governance. In short, as the tech oligarchs’ tech—and increasingly political—leadership plays out on the grandest scale, CIOs must keep one eye on cost and risk and the other on breakthrough innovation—or risk getting sidelined, negatively impacted, or left behind.

Prediction: Strong AI regulation will arrive

Major AI regulation is finally arriving—and it’s taking a form few anticipated just a year ago. Beyond the “do no harm” rhetoric of early rules like the EU’s AI Act, the United States is now exploring more muscular laws, treating AI and the GPUs that power it almost like munitions. At the center is the Biden Administration’s “Export Control Framework for Artificial Intelligence Diffusion”—an Interim Final Rule that’s no mere policy tweak but rather a sweeping and controversial regulatory structure. Leading tech firms such as Oracle have dubbed it “the Mother of All Regulations,” warning it could shrink the global chip market for U.S. firms by 80% and effectively hand vast new opportunities to foreign competitors like China. The rule lumps nearly all high-performance GPU usage into the same risk bucket, with few surgical carve-outs for mundane tasks like enterprise analytics or retail recommender engines. A host of acronyms—UVEU, LPP, TPP, AIA—compound the confusion, and the rule ties compliance to U.S. government standards like FedRAMP High, which most commercial data centers have never needed to implement. Critics argue this approach ignores the reality of modern cloud deployments, which are global, heavily monitored for revenue, and not easily “diverted” to nefarious ends.

The upshot is most likely a rapidly looming showdown between industry and government. Advocates of stronger regulations see an urgent need to prevent adversaries from aggregating massive GPU farms and rushing headlong into potentially dangerous AI applications—think WMD modeling or AI-boosted virus research without guardrails. Yet tech firms worry this blunt approach could hobble America’s long-standing leadership in cloud computing and AI just as the CHIPS Act tries to catalyze domestic semiconductor manufacturing. With the Interim Final Rule fast-tracked and no meaningful public consultation, leading cloud providers face a new compliance labyrinth, especially outside a small circle of favored “AIA countries.” Rather than precisely target those bad actors or high-risk use cases, the Diffusion Framework imposes licensing requirements on nearly everyone, creating uncertainty for global-scale AI projects in healthcare, finance, transportation, and beyond. In the year ahead, expect the private sector to push back vigorously—through lawsuits, lobbying, and new alliances—while regulators attempt to finalize a policy that addresses genuine national security concerns without strangling one of America’s most competitive industries. If Washington and Silicon Valley fail to strike a workable balance, the broader international race for AI supremacy may tilt in unexpected directions, and 2025 could be remembered as the year that heavy-handed rules on “AI munitions” hit the market with a force no one was truly prepared for.

Prediction: The AI-overhaul of the digital workplace

A sweeping AI overhaul is transforming the digital workplace, with agent-based tools and large language models weaving themselves into every layer of daily business. Content generation, research synthesis, and even project facilitation can now be handled by a growing array of AI-driven apps, freeing human teams from drudgery and accelerating creative output. Employee service is a prime example: Chatbots and automated workflows have quickly evolved from rudimentary FAQ systems to sophisticated conversational agents that handle HR questions and simple tasks, learn on the fly, and seamlessly hand off to live operators only when absolutely necessary. In sales and marketing, AI-assisted campaign design allows even smaller teams to match the polish of major agencies, while in software development, AI pair programming tools shorten debugging times and keep code quality high. From data entry to human resources, no task is off limits, making AI an intrinsic co-pilot for much of the modern knowledge workforce.

Notably, the most recent data suggests that while these advancements are reshaping entry-level roles—particularly in customer support and back-office administration—they also supercharge top talent and specialists who leverage AI to multiply their productivity. Sam Altman has even acknowledged that ChatGPT Plus usage is so high that they’re “losing money on it,” underscoring how users across multiple job functions are flocking to AI services. In practice, the technology helps mid-level engineers tackle more complex problems, and junior analysts power through larger data sets in a fraction of the time, bridging skill gaps faster than conventional training ever could. Coupled with embedded AI in project management tools, real-time language translation in international teams, and streamlined information retrieval across cloud platforms, organizations are seeing that AI isn’t just an upgrade to existing workflows—it’s a radical new foundation that rewires how work gets done.

Related: See my Guide to the Future of Work in 2030 to get a full sense of what is coming

Prediction: CIOs rethink their IT supply chains

Chief information officers (CIOs) are undertaking a wholesale reevaluation of their IT supply chains, driven by an urgent need for more scalable, cost-effective solutions. Even as public cloud providers continue to expand their offerings, many CIOs are rediscovering the benefits of private cloud—particularly when it comes to predictable capacity and tighter control over operational costs. At the same time, they are placing bigger bets on AI startups that can deliver specialized insights or automation capabilities. This push aligns with broader FinOps practices aimed at balancing aggressive innovation against the sharp reality of ballooning IT expenditures. In fact, in my latest global CIO survey, not a single respondent anticipated a budget decrease, underscoring how organizations are scrambling to accommodate AI’s voracious demand for compute in areas like inference, training, and experimentation.

Yet the turbulence does not stop with the cloud. Established Software-as-a-Service (SaaS) offerings now appear dangerously pricey as wave after wave of AI-driven breakthroughs—generative AI, AI agents, and the looming specter of AGI—make existing services look stale, underpowered, and worst of all overvalued. This inflationary effect on SaaS pricing has many CIOs hunting for lower-cost compute sources and rethinking how they allocate their tech budgets. Where once it was enough to simply provision a handful of powerful compute instances in the public cloud, the new frontier of constant experimentation with advanced AI models demands high-volume, flexible capacity. As a result, 2025 will likely see a flurry of novel sourcing strategies, from pooling regional data-center resources to forging multi-vendor alliances, all in a bid to keep enterprise AI ambitions on track without sinking under the weight of relentless cost escalation.

Prediction: The gap between innovative economies and the rest of the world will grow rapidly in 2025

The global economy is hurtling toward a stark tech divide, but the geography of high-tech powerhouses is no longer confined to Silicon Valley. Instead, a new network of innovation clusters—ranging from Singapore to Tel Aviv, from Berlin to Bengaluru—has taken shape, each attracting substantial investment and expertise in cloud infrastructure, AI research, and software development. High-income regions continue to spend several times more on digital R&D than the combined total of lower-income countries, yet that capital is now more widely dispersed among these rising hubs. Public-private partnerships and streamlined regulations in these locales fuel self-reinforcing ecosystems, funneling talent and funding to areas with robust infrastructure and an appetite for transformational technologies. Even so, many regions remain on the outside looking in, without the baseline connectivity, capital, or coordination to spur game-changing innovation on their own.

That said, these so-called “have-not” areas are not without recourse. Determined policymakers and tech entrepreneurs in certain countries are stepping up with bold initiatives designed to break the cycle of underinvestment. Portugal and Lithuania, for example, have launched programs aimed at bolstering startup ecosystems by offering tax incentives, international seed funding, and cutting-edge accelerator programs—rapidly building reputations as two of Europe’s growing tech hotspots. Lithuania’s efforts have also included simplified visa processes for foreign specialists, specialized tech parks, and collaboration with global trade associations to elevate local AI research. These concerted pushes are paying dividends, serving as a blueprint for other regions looking to invigorate their digital economies, keep homegrown talent, and bridge the innovation gap. The result is an emerging playbook for balancing out the global technology landscape and preventing permanent economic stratification. There is hope for regions outside the innovative economies that take bold action rapidly enough, but unless they do, the world may divide more profoundly into the tech innovators and the regions slower to embrace new advancements.

Where To, Beyond 2025?

All signs point to a technology landscape hurtling toward increasing concentration, both geographically and economically, yet still ripe with unprecedented creative potential. Leading-edge economies—now spanning well beyond Silicon Valley to Asia and Eastern Europe—are funneling talent, capital, and AI breakthroughs at a pace that underlines the yawning gap between those “in the club” and those struggling to catch up. The race for AI dominance transcends productivity tools and edges closer to AGI, with monolithic tech giants, well-funded startups, and a new class of agent-based frameworks pushing boundaries daily. The rampant stock market enthusiasm around AI may cool if new success stories fail to materialize, but the relentless need for compute—coupled with generative AI’s rapid adoption—seems poised to drive sustained investment. CIOs, caught in the maelstrom, are revamping supply chains and exploring private cloud, specialized AI infrastructure, and newly rolled-out FinOps practices to keep costs in check. Meanwhile, ambitious new government regulations signal that national security and geopolitical concerns are colliding head-on with an industry used to moving fast and breaking things.

Against this backdrop, the digital workplace is morphing into a sweeping collaboration between humans and machines, with agentic AI serving as a precursor to physical, humanoid robots slated for the near future. Worker roles at every level are set to evolve; entry-level, routine work will be offloaded to intelligent assistants, while top-tier talent will wield AI as an amplifier for human ingenuity. Tech oligarchs are seemingly moving toward the center of a lot of this transformation, wielding the power to shape entire policy debates—and even entire markets—by virtue of their colossal influence. Yet even they face mounting pressures from governments and NGOs to ensure AI development is more responsibly governed. In the end, 2025 will lay the foundation for an era in which AI’s global proliferation hinges on a complex interplay: Stricter regulation, nimble IT strategies to adopt AI or potentially stagnate, inspired entrepreneurs, and a gradual but determined march toward autonomy in every form, which is probably the most transformative trend this year. The stakes really couldn’t be higher—nor the opportunities more tempting—as the digital world continues its long march toward an unprecedented reimagining of how we live and work.

Enterprises Must Now Rework Their Knowledge into AI-Ready Forms: Vector Databases and LLMs

With the recent arrival of potent new AI-based models for representing knowledge, the methods enterprises use to manage data are now facing yet another major transformation. I remember a few decades back when the arrival of SQL databases was a major innovation. At the time, they were both quite costly and took great skill to use well. Despite this, enterprises readily understood that they were the best new game in town, the place where their most important data had to live, and so move it they did.

Now vector databases and especially foundation/large language models (LLMs) have shifted — in just a couple of short years — the way organizations must store and retrieve their own data. And we are also right back at the beginning of the maturity curve that most of us left behind a couple of decades ago.

While not everyone realizes this yet, the writing is now on the wall: Much of our business data now has to migrate again and be recontextualized into these new models, because the organizations that don’t make this move will likely be at a significant disadvantage, given what AI models of our data can deliver in terms of value.

Enterprise Information Evolution: Documents, Key-Values (JSON), relational SQL database, graph databases, vector databases, and LLMs

Our Marathon with Organizational Data Will Continue With AI

The result, like a lot of technology disruption, will be a journey through a series of key stages of maturation. Each one will progressively enrich the way our organizations store, understand, and leverage our vast reservoirs of information using AI. This process will naturally be somewhat painful, and not all data will need to migrate. And certainly, our older database models aren’t going anywhere either. But at the core of this shift will be the creation of our own private AI models of organizational knowledge. These models must be carefully developed, nurtured, and protected, while also being made highly accessible, with appropriate security models.

We’ve moved on from the early days of digital documents, capturing loosely structured data in primitive forms, to the highly structured revolutions introduced by relational and graph databases. Both phases marked a significant step forward in how data is conceptualized and utilized within the enterprise. The subsequent emergence of JSON as a lightweight, text-based lingua franca further bridged the gap between these two worlds and the burgeoning Web, offering a structured yet flexible way to represent data that catered to the needs of modern Internet applications and services. It also helped give rise to NoSQL, a mini-boom of a new database model that ultimately found a home in many Internet-based systems but largely didn’t disrupt our businesses the way AI will.

AI Models Are a Distinct Conceptual Shift in Working with Data

However, the latest advancements in knowledge representation really do usher in a steep increase in technical sophistication and complexity. Vector databases and foundation models, including large language models (LLMs), represent a genuine quantum leap in how enterprises can manage their data, introducing unprecedented levels of semantic insight, contextual understanding, and universal access to knowledge. Such AI models are able to find and understand the hidden patterns that tie diverse datasets together. This ability can’t be overstated and is a key attribute that emerges from a successful model training process. As such, it is one of the signature breakthroughs of generative AI.
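
To make that pattern-finding concrete, here is a toy sketch of the core operation a vector database performs: rank items by cosine similarity between embedding vectors. The vectors and document names below are made up for illustration; in a real system they would come from an embedding model and be indexed at far larger scale. Everything else a vector database adds (indexing structures, filtering, scale) is built around this one similarity primitive.

```python
# A toy illustration of the core vector-database operation: rank items by
# cosine similarity between embeddings. The vectors here are made up; a real
# system would obtain them from an embedding model and index millions of them.
import numpy as np

documents = {
    "contract renewal policy": np.array([0.9, 0.1, 0.3]),
    "quarterly revenue report": np.array([0.2, 0.8, 0.5]),
    "supplier onboarding checklist": np.array([0.7, 0.2, 0.6]),
}
query = np.array([0.85, 0.15, 0.35])  # e.g., the embedding of "how do we renew contracts?"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

ranked = sorted(documents, key=lambda name: cosine(query, documents[name]), reverse=True)
print(ranked[0])  # the semantically closest document, not a keyword match
```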

Let’s return to the open questions about AI in the enterprise. This uncertainty ranges from which technical and operational approaches are most effective to picking the best tools/platforms and supporting vendors. This new vector- and model-based era is characterized by an exponential increase not just in the sophistication with which data is stored and interpreted, but in the very way it is vectorized, tokenized, embedded, trained on, represented, and transformed. Each of these steps requires a separate set of skills and understanding, and very considerable compute resources. While this can be outsourced to some degree, outsourcing has many risks of its own, not least that such outsourcers may not deeply understand the domain of the business or how best to translate it into an AI model.

AI-Based Technologies for Enterprise Data

Vector databases, leveraging the power of machine learning to deeply understand and query enterprise data in ways that mimic human cognitive processes, offered us the first glimpse into a new future. Going forward, the contextual understanding of our data will largely be based on these radical new forms, which bear little resemblance to what came before. Similarly, foundation models like LLMs have revolutionized information management by providing tools that can seemingly comprehend, generate, synthesize, and interact with human language, using neural nets, vast pre-trained parameter sets, and complex transformer blocks that each have a high learning curve to set up and create (using them, however, is very easy). These technologies open up a new dawn of possibilities, from enhancing decision-making with unparalleled insights drawn from all our available knowledge to automating complex tasks with a nuanced understanding of language and context. But all these new AI technologies are generally not familiar to IT departments, which now have to make strategic sense of them for the organization.

Thus, this remarkable progress brings with it a large number of concerns and hurdles to overcome. First, the creation, deployment, and utilization of these sophisticated data models — at least with current technologies — entails significantly higher costs compared to previous approaches to representing data, according to HBR. A real-world cost example: Google has a useful AI pricing page for benchmarking fundamental costs, which breaks down the various cloud-based AI rates, with grounding requests costing $35 per 1K requests. Grounding — the process of ensuring that the output of an AI is factually correct — is probably necessary for many types of business scenarios using AI, and is thus a significant extra cost not required in other types of data management systems.
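
As a quick back-of-the-envelope illustration of how that line item compounds, here is the arithmetic at the $35 per 1K requests rate cited above; the monthly request volume is an assumed figure for illustration only.

```python
# Back-of-the-envelope grounding cost at the cited rate of $35 per 1,000 requests.
# The monthly volume below is a hypothetical assumption for illustration.
GROUNDING_RATE_PER_1K = 35.00          # USD, per the Google AI pricing cited above
monthly_grounded_requests = 250_000    # assumed workload

monthly_cost = monthly_grounded_requests / 1_000 * GROUNDING_RATE_PER_1K
print(f"Grounding alone: ${monthly_cost:,.0f} per month")  # $8,750 per month
```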

Furthermore, the computational resources and time required to develop and maintain such systems are also quite substantial. Moreover, the transition to these advanced data management solutions involves navigating a complex landscape of technical, organizational, and ethical considerations.

Related: How to Embark on the Transformation of Work with Artificial Intelligence

As enterprises stand on the cusp of this major new migration to AI, the journey ahead promises real rewards. It also demands careful strategizing, intelligent adoption, and I would argue at this early date, a lot of experimentation, prototyping, and validation. The phases of integrating vector databases and foundation models into the fabric of enterprise knowledge management will require a nuanced approach backed by rigorous testing, balancing the potential for transformative improvements against the practicalities of implementation costs and the readiness of organizational infrastructures to support such advancements.

That this is already happening, there is little doubt, based on my conversations with IT leaders around the world. We are witnessing the beginning of a significant shift in how enterprise knowledge is stored, accessed, and utilized. This transition, while demanding serious talent development and capability acquisition, offers an opportunity to redefine the very boundaries of what is possible in data management and utilization. The key to navigating this evolution lies in a strategic, informed approach to adopting these powerful new models, ensuring that an enterprise can harness its full potential while mitigating the risks and costs associated with such groundbreaking technological advancements.

Early Approaches For Private AI Models of Enterprise Data

Right now, the question I’m most often asked about enterprise AI is how best to create private AI models. Given the extensive concerns that organizations currently have about losing intellectual property, protecting customer/employee/partner privacy, complying with regulations, and giving up control over the irreplaceable asset of enterprise data to cloud vendors, there is a lot of searching around for workable approaches that yield cost-effective private AI models that deliver results while minimizing the potential downsides and risks of AI.

As part of my current research agenda on generative AI strategy for the CIO, I’ve identified a number of initial services and solutions from the market to help with creating, operating, and managing private AI models. Each has its own pros and cons.

Services to Create Private AI Models of Enterprise Data

PrivateLLM.AI – This service will train an AI model on your enterprise data and host it privately for exclusive use. They specialize in a number of vertical and functional domains including legal, healthcare, financial services, government, marketing, and advertising.

Turing’s LLM Training Service – Trains large language models (LLMs) for enterprises. Turing uses a variety of techniques to improve the LLMs they create, including data analysis, coding, and multimodal reasoning. They also offer specialty AI services like supervised fine-tuning, reinforcement learning from human feedback (RLHF), and direct preference optimization (DPO), which help optimize language models to adhere to human preferences.

LlamaIndex – A popular way to connect LLMs to enterprise data. Has hundreds of connectors to common applications and impressive community metrics (700 contributors with 5K+ apps created). Enables use of many commercial LLMs, so it must be carefully evaluated for control and privacy issues. Makes it very easy to use Retrieval-Augmented Generation (RAG), a way to combine a vector database of enterprise information with pass-through to an LLM for targeted but highly enriched results, and even has a dedicated RAG offering to make it easy.
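
As an illustration of how little glue code a basic RAG pipeline of this kind requires, here is a minimal sketch using LlamaIndex: load local documents, build a vector index, and query it. Import paths and defaults vary by LlamaIndex version, the folder name is hypothetical, and the default configuration typically calls a commercial embedding model and LLM, which is exactly the control and privacy question flagged above.

```python
# Minimal RAG sketch with LlamaIndex: index local documents into a vector store,
# then answer questions by retrieving relevant chunks and passing them to an LLM.
# Import paths and defaults vary by LlamaIndex version; the default setup sends
# requests to a commercial LLM (e.g., OpenAI), the privacy trade-off noted above.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("enterprise_docs/").load_data()  # hypothetical folder
index = VectorStoreIndex.from_documents(documents)                 # chunk, embed, and index

query_engine = index.as_query_engine()
response = query_engine.query("What is our contract renewal policy?")
print(response)
```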

Gradient AI Development Lab – This is an end-to-end service for creating private LLMs. They offer LLM strategy, model selection, training, and fine-tuning services to create custom AIs. They specialize in high-security AI models, offer SOC2, GDPR, and HIPAA certifications, and guarantee that enterprise data “never leaves their hands.”

Datasaur.AI – They offer an LLM creation service that delivers customized models, including the use of vector stores to provide enterprise-grade, domain-specific context. They offer a wide choice of existing commercial LLMs to build on as well, so care must be taken to create a truly private LLM instance. They are more platform-based than some of the others, which makes it easier to get started but may limit customization downstream.

Signity – Has a private LLM development service that is optimized more for specific data science applications. However, they can handle the whole LLM development process, from designing the model architecture to developing the model and then tuning it. They can create custom models using PyTorch, TensorFlow, and many other popular frameworks.

TrainMy.AI – A service that enables enterprises to run an LLM on a private server using retrieval augmented generation (RAG) for enterprise content. While it is more aimed at chatbot and customer service scenarios, it’s very easy to use and allows organizations to bring in vectorized enterprise data for RAG enhancement into a conversational AI service that is entirely controlled privately.

NVIDIA NeMo – For creating serious enterprise-grade custom models, NVIDIA, the GPU industry leader and leading provider of AI chips, offers an end-to-end, complete solution for creating enterprise LLMs. From model evaluation to AI guardrails, the platform is very rich and ready to use, if you can come up with the requisite GPUs.

Clarifai – Offers a service that enterprises can quickly use for AI model training. It’s somewhat self-service and allows organizations to set up models quickly and continually learn from production data. Has pay-as-you-go pricing; you can train Clarifai’s pre-built, pre-optimized models, already pre-trained with millions of expertly labeled inputs, or build your own model.

Hyperscaler LLM Offerings – If you trust your enterprise data to commercial clouds and want to run your own private models in them, that is possible too; all the major cloud vendors offer such capabilities, including AWS’s SageMaker JumpStart, Azure Machine Learning, and Google Cloud’s Vertex AI for private model training. These are more for IT departments wanting to roll their own AI models and don’t produce business-ready results without technical expertise, unlike many of the services listed above.

Cerebras AI Model Services – The maker of the world’s largest AI chip also offers large-scale private LLM training. They take a more rigorous approach, with a team of PhD researchers and ML engineers who, they report, meticulously prepare experiments and quality assurance checks to ensure a predictable AI model creation journey and the desired outcomes.

Note: If you want to appear on this list, please send a short description to me at dion@constrellationr.com.

Build or Customize an LLM: The Major Fork in the Road

Many organizations, especially those unable to maintain sufficient internal AI resources, will have to decide whether to build their own AI model of enterprise data or carefully use a third-party service. The choice will be tricky. OpenAI, for example, now offers a fine-tuning service that allows enterprise data to shape how GPT-3.5 or GPT-4 produces domain-specific results. This is a slippery slope, as there are many advantages to building on a highly capable model, but also many risks, including losing control over valuable IP.
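
To make the trade-off concrete, here is a rough sketch of what the hosted fine-tuning path looks like with the OpenAI Python SDK (v1.x): upload a JSONL file of chat-formatted examples built from enterprise data, then start a fine-tuning job on a base model. The file name and model name are illustrative and the supported models change over time; the key point is that the training data is uploaded to the vendor, which is the loss-of-control risk just described.

```python
# Rough sketch of the hosted fine-tuning path with the OpenAI Python SDK (v1.x).
# File name and model name are illustrative; supported models and parameters change
# over time. Note the enterprise training data is uploaded to the vendor, which is
# exactly the IP-control trade-off discussed above.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each line of the JSONL file is a chat-formatted training example built from enterprise data.
training_file = client.files.create(
    file=open("enterprise_examples.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # base model; the tuned variant is returned when the job finishes
)
print(job.id, job.status)
```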

Currently, I believe that the cost of training private LLMs will continue to fall steadily, and that service bureaus will increasingly make it turnkey for anyone to create capable AI models while preserving control and privacy. The reality is that most enterprises will have a growing percentage of their knowledge stored and accessed in AI-ready formats, and the ones that move their most strategic and high-value data early are likely to be the most competitive in the long run. Will vector databases and LLMs become the dominant model for enterprise knowledge? The jury is still out, but I believe they will almost certainly become about as important as SQL databases are today. But the main point is clear: It is high time for most organizations to proactively cultivate their AI data-readiness.

My Related Research

AI is Changing Cloud Workloads, Here’s How CIOs Can Prepare

A Roadmap to Generative AI at Work

Spatial Computing and AI: Competing Inflection Points

Salesforce AI Features: Implications for IT and AI Adopters

Video: I explore Enterprise AI and Model Governance

Analysis: Microsoft’s AI and Copilot Announcements for the Digital Workplace

How Generative AI Has Supercharged the Future of Work

How Chatbots and Artificial Intelligence Are Evolving the Digital/Social Experience

The Rise of the 4th Platform: Pervasive Community, Data, Devices, and AI