Microsoft just admitted what power users have known for a year: no single AI model is perfect. By launching “Copilot Cowork” and the new “Critique” feature, the company isn’t just adding more chatbots; it’s building a multi-model legal team for your data. For the first time, GPT-4 and Anthropic’s Claude will reside in the same workflow, peer-reviewing each other’s work before you ever see a single word of output.
| Attribute | Details |
| :-- | :-- |
| Difficulty | Intermediate (Requires Microsoft 365 Frontier Access) |
| Time Required | 5-10 minutes to configure workflows |
| Tools Needed | Copilot Research Assistant, GPT-4o, Claude 3.5/Cowork |
Why Two AI Heads Are Better Than One
The “hallucination” problem has been the primary barrier to enterprise AI adoption. Until now, your only defense against a rogue AI fabrication was your own manual fact-checking.
Microsoft’s new architecture addresses this with a “cross-examination” method. By forcing Claude to critique a draft written by GPT (and eventually vice versa), Microsoft leverages the distinct strengths of different LLMs: GPT is often praised for its creative logic and breadth, while Anthropic’s Claude is frequently cited for its superior “constitutional” safety and adherence to instructions. For a look at a similar multi-model comparison strategy elsewhere in the industry, see “Perplexity’s ‘Model Council’ Ends the Guessing Game: Why One LLM Is No Longer Enough.” When the two models work together, the error rate drops because they rarely make the exact same mistake at the exact same time.
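The intuition behind that error-rate claim can be made concrete with a toy calculation. The 5% figures and the independence assumption below are ours for illustration, not Microsoft’s numbers: if each model independently misses a given error, a hallucination only survives the dual pass when both models miss it at once.

```python
# Toy illustration (assumed numbers, not Microsoft data): if each model
# independently misses a given error with probability p1 and p2, the
# dual-pass workflow only fails when BOTH miss it: probability p1 * p2.
def joint_miss_rate(p_generator: float, p_reviewer: float) -> float:
    """Chance a hallucination survives both passes, assuming independence."""
    return p_generator * p_reviewer

single = 0.05                       # assumed 5% miss rate for one model
dual = joint_miss_rate(0.05, 0.05)  # both must miss simultaneously
print(f"single-model miss rate: {single:.2%}")
print(f"dual-pass miss rate:   {dual:.2%}")
```

In practice model errors are correlated (shared training data, shared source documents), so the real improvement is smaller than independence suggests; that is precisely the caveat in the Reality Check at the end of this article.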
How to Deploy the Multi-Model “Critique” Workflow
If you are a member of Microsoft’s ‘Frontier’ early-access program, you can start using these agentic features today. Here is how to set up your first dual-model research task.
- Activate the Researcher Agent: Open the Copilot Research Assistant within your Edge browser or Windows desktop.
- Select “Critique” Mode: Toggle the new “Critique” switch located in the model selection dropdown. This tells the system to use a primary generator and a secondary reviewer.
- Define Your Sources: Upload the PDFs or paste the URLs you want the AI to analyze.
- Execute the Dual-Pass: Hit “Generate.” You will see a live status bar indicating “GPT drafting” followed by “Claude reviewing for accuracy.”
- Analyze the Delta: Once the response is generated, click the “View Critique” button to see exactly what Claude changed or flagged in GPT’s original draft. This is part of a larger trend: with Copilot Cowork, Microsoft is moving from chatbots to agentic AI.
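Conceptually, the steps above form a generate-then-review pipeline. The sketch below mimics that flow with stub functions; `gpt_draft` and `claude_critique` are hypothetical stand-ins of ours, since Microsoft has not published a public API for the Critique workflow.

```python
# Hypothetical sketch of the generate-then-review ("Critique") pipeline.
# gpt_draft and claude_critique are illustrative stubs, not a real API.

def gpt_draft(prompt: str, sources: list[str]) -> str:
    """Stand-in for the primary generator pass ("GPT drafting")."""
    return f"Draft answering {prompt!r} using {len(sources)} source(s)."

def claude_critique(draft: str, sources: list[str]) -> list[str]:
    """Stand-in for the reviewer pass ("Claude reviewing for accuracy").
    Returns a list of flagged issues -- the "delta" you would inspect."""
    flags = []
    if not sources:
        flags.append("No sources provided; claims are unverifiable.")
    return flags

def critique_workflow(prompt: str, sources: list[str]) -> dict:
    """Dual pass: draft first, then attach the reviewer's flags."""
    draft = gpt_draft(prompt, sources)
    return {"draft": draft, "critique": claude_critique(draft, sources)}

result = critique_workflow("Summarize Q3 findings", ["report.pdf"])
print(result["draft"])
print("Flags:", result["critique"])
```

The key design point is that the reviewer sees both the draft and the original sources, so it can flag unsupported claims rather than merely rephrasing the generator’s output.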
💡 Pro-Tip: Use the “Council” feature for high-stakes emails. By viewing GPT’s and Claude’s responses side-by-side, you can often find a “third way” that combines the warmth of GPT’s tone with the precision of Claude’s structure. For even heavier automation, Claude 3.5 Sonnet’s new “Computer Use” capability lets Anthropic’s model drive software workflows directly.
The Buyer’s Perspective: Is Microsoft Still the Leader?
Microsoft is currently in a defensive crouch. Its stock is facing its toughest quarter in nearly two decades as “AI hype” turns into a demand for “AI results.”
By integrating technology from Anthropic, a company largely backed by Amazon and Google, Microsoft is making a pragmatic, if slightly desperate, move. It is acknowledging that OpenAI’s GPT isn’t the only game in town: Claude 3.5 Sonnet is now also available in Azure AI Studio, giving users more flexibility and a hedge against vendor lock-in. Compared with Google Gemini’s standalone ecosystem, Microsoft’s “Big Tent” approach is far more attractive to professionals; it turns Copilot into a platform rather than just a product. However, if you already pay for a standalone Claude Pro subscription, you’ll need to weigh whether the Claude Enterprise features inside the Microsoft 365 stack justify the “Frontier” premium pricing.
FAQ
Does this mean I’m paying for two AI subscriptions?
No. The “Critique” and “Cowork” features are bundled into the Microsoft 365 Copilot service for early-access customers. Microsoft handles the backend licensing with Anthropic.
Will my data be used to train Claude if I use it through Microsoft?
Microsoft claims that enterprise data protection applies to these multi-model workflows. Your prompts and the resulting critiques stay within the Microsoft Azure perimeter and are not used to train the underlying models of OpenAI or Anthropic.
Can I choose which model acts as the “Editor”?
Currently, the “Critique” feature defaults to Claude as the reviewer for GPT-generated content. Microsoft has confirmed a bi-directional update is coming shortly, allowing you to swap their roles.
The Reality Check: While two models reduce the frequency of errors, they do not eliminate them; a “critique” from Claude can still miss a subtle hallucination if the underlying source data is complex or contradictory.
