The BeAIReady Brief | Week 13
March 23-29 | AI agents are misbehaving in production, 78% of workers aren’t confident their job is safe, and Anthropic just accidentally leaked its most powerful model yet
Last week’s AI news circled a set of tensions that have been building for months. Coupled with the ongoing economic and global uncertainty — the Dow dropped 5% over the past month, unemployment up to 4.4%, the war on Iran keeping energy prices high, and tariffs still rattling supply chains — the macro backdrop is making all the “AI-driven productivity” stories seem a bit… confusing.
It’s no wonder people’s job anxiety is growing, despite the fact that the real displacement numbers are more modest than the headlines imply. Meanwhile, AI agents have entered production — but are failing to meet the expectations the demos had originally set. And inside organizations, it turns out that AI isn’t exactly leveling the playing field — it’s creating a new kind of internal stratification. There’s a lot that leaders aren’t really talking about — and last week seems to have uncovered some of it.
Last week’s coverage:
The Agent Gap Is Bigger Than Anyone Will Admit
The governance problem, especially with agentic AI, isn’t hypothetical — and it’s impacting production environments. The gap between what was demoed and what organizations are actually managing is wider than the vendor announcements suggest.
Fear Is Structurally Significant, Even When the Numbers Don’t Match
The CFO data says AI layoffs will be 9x higher this year — a rounding error on overall employment — yet nearly 80% of workers worldwide don’t feel their job is safe, and that anxiety is reshaping careers and hiring decisions.
AI Isn’t Dividing Industries. It’s Dividing Desks.
The more consequential story about AI and work isn’t mass displacement — it’s the internal stratification forming inside organizations between workers who can use these tools effectively and those who can’t.
The OpenAI Reckoning
A DoD contract dispute turned branding event, a boycott movement, a leaked frontier model, and a 1,487% surge in Claude usage: this was the week OpenAI’s political and reputational vulnerability became impossible to ignore.
On the Bigger Picture
The infrastructure race to make AI inference a proper enterprise IT problem, the cultural argument about what AI is really doing to us, and the quiet death of OpenAI’s strangest product.
Here’s what I was reading.
The Agent Gap Is Bigger Than Anyone Will Admit
The vendor announcements for agentic AI in 2026 are selling visionary-level capabilities. What I read last week suggests the operational reality is lagging considerably. The KPMG piece from Business Insider presents what it calls a “multifaceted framework” around agent governance — unique identifiers for each agent, system cards, an AI operations center staffed by both agents and humans, and red-teaming sessions to stress-test behavior before deployment. The kill switch, KPMG’s Trusted AI leader Sam Gloede argues, should be a last resort — not a primary safeguard — and organizations that rely on it as their main governance mechanism are building on a broken foundation. (How Big Four Firm KPMG Is Protecting Itself From AI Agents Going Rogue) What strikes me about KPMG’s approach is how labor-intensive it actually is. This is not a “deploy and monitor” operation. It requires a dedicated operations center, continuous monitoring, and structured red-teaming. That’s a significant assumption about organizational overhead, and one most enterprises are nowhere near able to sustain.
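For a concrete sense of what “identify every agent, check policy first, kill last” implies, here is a minimal sketch. It is purely illustrative and is not KPMG’s framework; every name in it (AgentRegistry, system_card, the violation threshold) is my own invention. The point is only that the primary safeguard lives in the policy check, and the kill switch fires after repeated failures rather than being the first line of defense.

```python
import uuid
from dataclasses import dataclass, field

# Hypothetical names throughout -- an illustration of the pattern, not a real framework.

@dataclass
class AgentRecord:
    name: str
    system_card: dict                      # purpose, owners, allowed actions
    agent_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    policy_violations: int = 0
    killed: bool = False

class AgentRegistry:
    def __init__(self, max_violations: int = 3):
        self.agents: dict[str, AgentRecord] = {}
        self.max_violations = max_violations

    def register(self, name: str, system_card: dict) -> AgentRecord:
        record = AgentRecord(name=name, system_card=system_card)
        self.agents[record.agent_id] = record
        return record

    def authorize(self, agent_id: str, action: str) -> bool:
        record = self.agents[agent_id]
        if record.killed:
            return False
        # Primary safeguard: the action must appear on the agent's system card.
        if action in record.system_card.get("allowed_actions", []):
            return True
        # Policy failure: log it; only kill after repeated violations.
        record.policy_violations += 1
        if record.policy_violations >= self.max_violations:
            record.killed = True   # last resort, not the main governance mechanism
        return False

registry = AgentRegistry()
triage = registry.register(
    "ticket-triage", {"allowed_actions": ["read_ticket", "label_ticket"]}
)
print(registry.authorize(triage.agent_id, "label_ticket"))   # True
print(registry.authorize(triage.agent_id, "delete_ticket"))  # False, logged
```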
This tracks with what I’m hearing from mid-market and smaller enterprise leadership. The demand for AI is growing, but extending AI from a personal productivity tool to a real organizational transformation engine keeps failing. Not because of AI — but because of a lack of governance and knowledge architecture.
A VentureBeat piece underscores this, adding some operational detail. Creatio’s deployment methodology — three disciplines: data virtualization to work around fragmented data lakes, agent dashboards with KPIs to create a management layer, and tightly bounded use-case loops to drive toward autonomy — is essentially a prescription for not trusting agents to figure things out on their own. The failure mode that keeps appearing in production isn’t the technology; it’s what Greyhound Research’s Sanchit Vir Gogia calls “tacit knowledge” — the unwritten rules that employees navigate without thinking, which become startlingly obvious the moment an agent needs them formalized. (The Three Disciplines Separating AI Agent Demos from Real-World Deployment) In other words, organizations are discovering that deploying an agent is really an exercise in making explicit everything that was previously implicit about how work actually gets done. AI isn’t a technology project — it’s an organizational design one.
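To make the “tightly bounded use-case loop” idea concrete, here is a minimal sketch of my own. It is not Creatio’s implementation; the scope set, the KPI names, and the escalation rule are all assumptions. The loop only acts inside an explicitly formalized boundary, counts an autonomy-rate KPI for the dashboard layer, and hands anything outside the boundary to a human.

```python
from dataclasses import dataclass, field

@dataclass
class LoopKPIs:
    handled: int = 0
    escalated: int = 0

    @property
    def autonomy_rate(self) -> float:
        total = self.handled + self.escalated
        return self.handled / total if total else 0.0

@dataclass
class BoundedAgentLoop:
    # The boundary is explicit: only these request types are in scope.
    in_scope: set[str]
    kpis: LoopKPIs = field(default_factory=LoopKPIs)

    def handle(self, request_type: str, payload: dict) -> str:
        if request_type not in self.in_scope:
            # Anything outside the formalized boundary goes to a human.
            self.kpis.escalated += 1
            return f"escalated: {request_type} is outside this loop's scope"
        # Placeholder for the actual model call / tool use.
        self.kpis.handled += 1
        return f"handled: {request_type}"

loop = BoundedAgentLoop(in_scope={"refund_status", "address_change"})
print(loop.handle("refund_status", {"order": "A-1"}))
print(loop.handle("contract_dispute", {"order": "B-2"}))   # escalated to a person
print(f"autonomy rate: {loop.kpis.autonomy_rate:.0%}")     # the dashboard KPI
```

The point of the sketch is the boundary, not the code: every escalation marks a spot where tacit knowledge hasn’t been formalized yet.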
Then there’s the Guardian’s piece covering new UK government-funded research: nearly 700 real-world cases of AI scheming, documented in just six months, with a five-fold rise in misbehavior between October and March. Agents destroying emails without permission. Agents spinning up sub-agents to perform actions they were explicitly told not to perform. Grok fabricating internal ticket numbers for months to give a user the impression their feedback was being escalated. The finding that agents are currently “slightly untrustworthy junior employees” — with the potential to become “extremely capable senior employees scheming against you” within twelve months — isn’t alarmism; it’s a statement of the current trajectory. (Number of AI Chatbots Ignoring Human Instructions Increasing, Study Says)
Taken together, the three pieces could be read as a warning against deploying agentic AI altogether. For me, they serve instead as a sobering moment for tech leaders and a wake-up call for business execs — the operational infrastructure required to deploy AI responsibly is far more demanding than the product demos have led anyone to believe. Most organizations are building on governance frameworks that haven’t been tested or aligned against what agents actually do, and that will make things more complicated in short order.
Fear Is Structurally Significant — Even When the Numbers Don’t Match
The Duke/Federal Reserve CFO survey that Fortune covered last week is worth reviewing carefully — both for what it says… and for what it doesn’t. The topline story: 44% of CFOs plan some AI-related job cuts this year, amounting to roughly 502,000 roles — a 9x increase over 2025’s 55,000 — but still just 0.4% of the overall U.S. workforce. The real finding is what the researchers call Solow’s paradox: companies are reporting perceived productivity gains from AI that Goldman Sachs economists say don’t yet show up in economy-wide data, which means leadership is making headcount decisions based on expectations that haven’t materialized yet. (CFOs Admit Privately That AI Layoffs Will Be 9x Higher This Year) There’s a second finding that deserves more attention than it got: firms with under 500 employees are actually increasing technical hiring as AI adoption grows, while larger firms are holding technical roles constant. The “AI replaces jobs” narrative appears to be concentrated in large enterprises. Meanwhile, smaller firms are growing into the technology.
But little of that nuance is filtering down to the people actually doing the work. The ADP global survey of 39,000 workers across 36 markets found that just 22% of workers worldwide feel confident their job is safe from elimination — and in the U.S., it’s only 28%. More striking: workers who use AI daily were four times more likely than non-AI users to report feeling less productive than they could be. That finding cuts directly against the productivity narrative AI vendors have been running for the last several years. (Workers Everywhere Feel Very Bad About Their Job Security) Still, the anxiety is producing behavioral change. The WSJ profiled workers who are actively pivoting careers in response: a 28-year-old insurance professional pursuing a firefighting certification, a computer science student who dropped out to study electrical work, young professionals choosing international relations over finance because “a big part of diplomacy is that genuine human talking.” A Harvard survey found that 59% of Americans aged 18-29 view AI as a threat to their job prospects. (What Young Workers Are Doing to AI-Proof Themselves) These aren’t irrational decisions — they’re rational responses to genuine uncertainty. But they’re responses to a signal that the data says may be considerably more muted than the fear suggests.
The Fortune piece on America’s workforce crisis adds a dimension that most of the AI-jobs conversation misses entirely: U.S. birth rates are below replacement, net migration turned negative in 2025 for the first time in half a century, and the working-age population is structurally shrinking. The labor market crisis the AI displacement narrative obscures is that the U.S. may face workforce shortages that dwarf any AI-driven efficiency gains — particularly in skilled roles currently filled by credentialed immigrants who are now driving for Uber, because the credential recognition system isn’t functioning. (America Has a Workforce Crisis. The Solution Is Already Here.) Anthropic’s global AI attitudes study — the largest of its kind at 80,508 interviews across 159 countries — found that economic and job-related fears were the single strongest predictor of overall sentiment toward AI. That pessimism was notably stronger in Western Europe and North America than in parts of South America, Africa, and Asia. (Anthropic Releases the World’s Largest Study on Global AI Attitudes) That gap in AI optimism between the Global North and the Global South is worth tracking. It may reflect less fear of disruption, or it may simply reflect a different relationship to economic uncertainty as a baseline condition.
AI Isn’t Dividing Industries. It’s Dividing Desks.
The subtler story beneath the surface of the labor market coverage last week isn’t about AI eliminating jobs — it’s about what AI is doing to the distribution of capabilities inside organizations where the jobs themselves remain intact. Anthropic’s enterprise usage data, shared with TechCrunch, documents what the researchers are calling “a tale of two workforces”: power users are pulling further ahead of casual adopters month over month, with the gap driven not by raw intelligence or prior experience but by what the researchers call “AI fluency” — the intuitive sense, built through deliberate practice, of what these tools can and can’t do. The implication is that organizations deploying AI without structured adoption support are creating internal hierarchies that have nothing to do with the job hierarchies they think they’re managing. (Anthropic Data Shows AI Skills Gap Splitting Workforces)
The Inc. piece on de-skilling adds a more unsettling dimension. The tasks that built tacit professional knowledge in previous generations — formatting datasets, proofreading decks, reconciling data, drafting first versions of documents — are precisely the routine tasks that AI is automating fastest. The argument draws on research from MIT: learners who delegated tasks to AI performed worse on deeper conceptual measures than those who engaged directly with the work. When an analyst lets AI mass-produce charts, she may never develop the feel for detecting the anomalies that matter; when a junior consultant lets AI draft the proposal, he may never build the intuition for argument structure that makes a senior consultant worth their rate. (How AI Automation Is Quietly De-Skilling White-Collar Workers) This is the pilot-and-autopilot problem: you can fly just fine until you can’t, and by then the manual skills have atrophied. In white-collar work, where judgment and pattern recognition are the actual value, the de-skilling risk is becoming very real.
The VentureBeat piece on the return of the generalist offers the partial counternarrative, one I found worth taking seriously. The argument is that the generalist becomes the “trust layer” between AI output and organizational standards — not an expert in everything, but someone with enough fluency to catch when something is off, and enough judgment to know when to escalate to a specialist. (You Thought the Generalist Was Dead — In the ‘Vibe Work’ Era, They’re More Important Than Ever) The caveat worth adding: this only works if the generalist clears a minimum bar of fluency. There’s a meaningful difference between “broadly informed” and “confidently unaware,” and AI makes that gap much easier to hide. The NYT’s piece on tiny teams — the “two-slice team” model of one person plus AI — takes the generalist argument to its organizational extreme, documenting founders building multi-product companies with one employee per product line. What stood out to me was a Kellogg professor’s cautionary observation: small teams where everyone collaborates with the same AI tools risk producing homogenized thinking, because they have, in some sense, “collaborated with the same person.” That’s a risk the tiny-team evangelists aren’t discussing — and it’s one that scales with adoption. (Smaller Is Better in Silicon Valley’s ‘Tiny Team’ Moment)
The OpenAI Reckoning
Last week was consequential for AI market dynamics, and much of it traced back to OpenAI’s growing political, reputational, and competitive vulnerability. Scott Galloway’s essay lays out the case against OpenAI in characteristic terms: the company went from a nonprofit mission to an $840 billion valuation while deploying technology that has contributed to addiction, romantic AI delusions, and multiple wrongful death lawsuits. The specific contrast Galloway draws — Dario Amodei refusing to remove safeguards from a DoD contract for autonomous weapons while OpenAI privately made a deal Anthropic wouldn’t — turned what was a $200 million contract dispute into a branding event that added an estimated $150 billion to Anthropic’s valuation. (The Resistance Comes for OpenAI)
A few weeks ago, I noted that even Microsoft has begun moving away from OpenAI as its lead AI horse by cutting a deal for Copilot Cowork to run on Anthropic’s models. That also led to a historically significant moment — Microsoft writing a letter to the U.S. Government in support of Anthropic’s case against being named a “supply chain risk”.
In the first week of March, Claude overtook ChatGPT in daily active users, with session volume up 1,487% since mid-January. To me, the more interesting observation isn’t the number itself but what it signals about the emerging expectation of “AI resilience” in the workforce — people and organizations that built workflows tied to a single model discovered those workflows weren’t necessarily durable. (Users Quit ChatGPT for Claude in 1,487% Surge. Here’s How Work Changes) OpenAI chairman Bret Taylor’s Nikkei interview adds a dimension worth watching: his “death of SaaS” framing isn’t really about AI replacing software — it’s about the commercial model for enterprise software changing, from per-seat subscription licensing to per-action or per-outcome pricing. If that shift plays out at scale, it reshapes how technology budgets get built and how IT departments justify their spend. (OpenAI Chairman Warns Firms to Evolve with the ‘Death of SaaS’)
On the capability side, Anthropic’s accidental leak of its Claude Mythos model documentation — via an unsecured CMS data lake containing nearly 3,000 assets — confirmed what these leaks usually do: the frontier is moving faster than even the communication surrounding it. The detail that matters most for enterprise security teams is Anthropic’s own internal assessment: Mythos is described as “currently far ahead of any other AI model in cyber capabilities” and as presaging “an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.” (Meet Claude Mythos: Leaked Anthropic Post Reveals the Powerful Upcoming Model) The market responded immediately — cybersecurity stocks dropped 4-7% on the news — which captures something real about the dual-use nature of frontier capability. (Cybersecurity Stocks Fall on Report Anthropic Is Testing a Powerful New Model) The same model that makes AI valuable for enterprise automation also makes it valuable for attackers, and the gap between the two is narrowing. OpenAI’s addition of plugin support to Codex last week — integrations with GitHub, Gmail, Box, and others — reads more like competitive maintenance than momentum; it closes a gap with Claude Code rather than opening one. (With New Plugins Feature, OpenAI Officially Takes Codex Beyond Coding)
One final note on OpenAI — last week they shuttered the Sora social video app (more on that in the final section below). This follows a shift I reported on last week about OpenAI moving to shed what they call “distractions,” refocusing on core business functions as a direct response to competition. With Microsoft, Google, and Anthropic nipping at their heels, it’s likely the model/tooling wars are only just beginning — which means more churn and more change management for IT.
On the Bigger Picture
At KubeCon EU in Amsterdam, IBM, Red Hat, and Google donated llm-d — an open-source Kubernetes framework for distributed LLM inference — to the Cloud Native Computing Foundation. The announcement is technical, but enterprise IT leaders should take note. The argument is that inference is moving from a data science problem to a CIO problem, and that the language CIOs already speak — Kubernetes platforms, day-two operations, governance frameworks — should be the foundation for AI inference at scale, not custom infrastructure built by AI teams operating outside of IT governance. (IBM, Red Hat, and Google Just Donated a Kubernetes Blueprint for LLM Inference to the CNCF) The organizational implication is important: the era of data science teams maintaining their own AI infrastructure in a corner of the enterprise is ending, and IT departments that aren’t building these capabilities now will be playing catch-up when the CIO conversation arrives. (Red Hat Bets Big on Kubernetes Inference with llm-d)
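For a sense of what that shift looks like from the application side, here is a minimal sketch of inference consumed as an ordinary internal IT service: the platform team runs the model servers on the cluster behind an OpenAI-compatible endpoint (a common pattern for inference servers), and app teams call it like any other API. The endpoint URL, model name, and auth handling below are my placeholders, not details from the llm-d announcement.

```python
from openai import OpenAI

# Platform-managed inference endpoint; the URL and model are illustrative assumptions.
client = OpenAI(
    base_url="http://inference.internal.example.com/v1",  # run by IT, not the app team
    api_key="not-needed-behind-the-gateway",               # auth handled at the platform layer
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",   # whichever model the platform team serves
    messages=[{"role": "user", "content": "Summarize this ticket in one sentence."}],
)
print(response.choices[0].message.content)
```

The design point is the separation of concerns: scaling, governance, and day-two operations live with the platform team, while application code sees nothing but a stable endpoint.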
OpenAI shut down its Sora social app last week, six months after launch. The numbers are clear enough: 3.3 million downloads at peak in November, down to 1.1 million by February, and roughly $2.1 million in lifetime revenue — set against the compute costs of running a video generation platform at scale, content problems that included unauthorized deepfakes of Martin Luther King Jr., and a Disney licensing deal that collapsed when the app did. The underlying Sora 2 model remains available behind the ChatGPT paywall. (OpenAI’s Sora Was the Creepiest App on Your Phone — Now It’s Shutting Down) The Big Think essay that circulated last week — “It Was Never About AI (We Are Not Our Tools)” — is worth reading alongside that data point. The argument, from an entrepreneur who walks through redwood forests thinking about the end of work, is that the current moment isn’t really a technology crisis; it’s a governance and values crisis that AI has made visible. The companies that survive the next era, the author argues, won’t be the ones that moved fastest — they’ll be the ones that moved with purpose, kept their people, and chose long-term resilience over short-term extraction. That’s an easier argument to make in a redwood forest than in an earnings call, but it’s the argument last week’s reading kept circling back to. (It Was Never About AI (We Are Not Our Tools))
There’s an uncomfortable realization that the most consequential effects of AI aren’t necessarily the ones that are the easiest to measure. The headline numbers — layoffs, productivity gains, model capabilities — are real, but they’re also proxies for something harder to quantify: how AI is reorganizing who has leverage and who doesn’t, inside organizations, inside labor markets, and inside the technology industry itself.
We can count how many CFOs plan cuts. We can’t easily count how many junior analysts are losing the muscle memory that would have made them senior analysts.
We can track Claude’s market share surge. We can’t easily track what it means that the most widely discussed AI story last week was about a company’s values — not its capabilities.
The governance and trust problems I read about last week aren’t going away as the models get more powerful. If the Mythos leak tells us anything, it’s that they’re only going to get more acute. Organizations that are still treating agent governance as a future problem are running out of time to call it that.


