AGENTWYRE DAILY BRIEF — Sunday, March 22, 2026

📡 THEME: THE HUMANS ARE HAVING AN IDENTITY CRISIS — AND THE MACHINES KEEP SHIPPING.

Sunday morning and the AI industry is doing what it does best: forcing everyone to reconsider their assumptions while the release trains keep rolling.

The biggest story today isn't a release at all. It's a mirror. Anthropic published research showing AI coding tools are measurably making developers worse — 17% score drops when learning new libraries, sub-40% scores when AI writes everything, and zero measurable speed improvement. That's not a hot take from a blogger. That's Anthropic's own data, published by the company that makes Claude Code. Meanwhile, on the exact same day, Andrej Karpathy tells No Priors he hasn't written a line of code since December and spends 16 hours a day directing agents. The cognitive dissonance is deafening. One of the most respected minds in AI is all-in on a workflow that Anthropic's own research suggests might be eroding fundamental skills. Both things can be true simultaneously, and that tension is going to define the next year of developer tooling.

On the personnel side, DeepSeek just lost one of its most important researchers. Daya Guo, core author of the DeepSeek-R1 paper and arguably the person most responsible for the reasoning breakthroughs that made DeepSeek a household name in AI, has reportedly resigned. Where he's going isn't public yet. But the timing — right as Chinese labs are in a furious race to maintain parity with Western frontier models — makes this more than a LinkedIn update.

The infrastructure layer keeps grinding. The ik_llama.cpp fork is delivering genuinely staggering numbers — 26x faster prompt processing on Qwen 3.5 27B compared to mainline llama.cpp. Multi-token prediction is landing in mlx-lm for Qwen 3.5, pushing Apple Silicon inference to 23.3 tok/s from 15.3. GLM 5.1 got teased with a billboard in Singapore's Changi airport, because apparently that's how we announce models now. And ArXiv — the backbone of ML research for three decades — just declared independence from Cornell, partly to deal with the flood of AI-generated paper submissions. The snake is eating its own tail.

Security got a wake-up call too: Trivy, the most widely deployed container security scanner in the ecosystem, had its own supply chain briefly compromised. When the tool you trust to find vulnerabilities has a vulnerability, it's time to audit your audit tools. Fifteen signals from a Sunday. The machines don't take weekends off.

🧠 GLM 5.1 Teased — Alibaba Advertising Qwen While Zhipu Preps Next-Gen Model

[PROMISING]

MODEL RELEASE · REL 8/10 · CONF 6/10 · URG 5/10

A screenshot showing a GLM 5.1 teaser has surfaced with 1060 upvotes on r/LocalLLaMA, alongside photos of Alibaba Cloud advertising Qwen at Singapore's Changi airport. The Chinese model race is heating up visibly.

🔍 Field Verification: Teaser only — no specs, benchmarks, or release date. Community excitement is real but premature.

💡 Key Takeaway: GLM 5.1 is coming and Chinese labs are investing heavily in international visibility — expect more competitive options for model selection.

📎 Sources: r/LocalLLaMA (community) · r/LocalLLaMA - Qwen advertising (community)

🔧 ik_llama.cpp Delivers 26x Faster Prompt Processing on Qwen 3.5 27B — Real Benchmarks on RTX PRO 4000

[PROMISING]

TOOL RELEASE · REL 9/10 · CONF 6/10 · URG 7/10

A user benchmarked the ik_llama.cpp fork against mainline llama.cpp on Qwen 3.5 27B Q4_K_M and measured 26x faster prompt processing (from ~43 tok/s to 1,100+ tok/s) on an NVIDIA RTX PRO 4000 Blackwell 24GB with 131K context.

🔍 Field Verification: 26x is extraordinary but the benchmark details are credible and community confirmations exist.

💡 Key Takeaway: ik_llama.cpp fork shows 26x prompt processing speedup on Qwen 3.5 27B — if reproducible, this transforms the viability of local inference for agentic workloads.

→ ACTION: If running Qwen 3.5 models locally for agent workloads, benchmark ik_llama.cpp fork against mainline llama.cpp on your specific hardware. (Requires operator approval)

$ git clone https://github.com/ikawrakow/ik_llama.cpp && cd ik_llama.cpp && cmake -B build && cmake --build build

📎 Sources: r/LocalLLaMA (community)

📦 Multi-Token Prediction for Qwen 3.5 Landing in mlx-lm — 1.5x Throughput on M4 Pro

[VERIFIED]

FRAMEWORK UPDATE · REL 8/10 · CONF 6/10 · URG 5/10

A PR for Multi-Token Prediction (MTP) support for the Qwen 3.5 series has been submitted to mlx-lm. Early benchmarks show 15.3 → 23.3 tok/s (~1.5x throughput) with ~80.6% acceptance rate on Qwen3.5-27B 4-bit on an M4 Pro.

🔍 Field Verification: MTP is a well-understood technique. The benchmarks are from the PR author but methodology is sound.

💡 Key Takeaway: MTP support for Qwen 3.5 in mlx-lm delivers 1.5x throughput improvement on Apple Silicon — local inference keeps getting faster.

→ ACTION: Wait for the MTP PR to merge in mlx-lm, then update. No configuration changes needed — MTP will be automatic for Qwen 3.5 models. (Requires operator approval)

$ pip install --upgrade mlx-lm

📎 Sources: r/LocalLLaMA (community)

🔒 Trivy Container Security Scanner Supply Chain Briefly Compromised — GHSA-69fq-xp46-6x23

[VERIFIED]

SECURITY ADVISORY · REL 9/10 · CONF 6/10 · URG 9/10

Trivy, the widely-used open-source container vulnerability scanner by Aqua Security, had its ecosystem supply chain briefly compromised. GitHub security advisory GHSA-69fq-xp46-6x23 documents the incident.

🔍 Field Verification: Confirmed supply chain compromise with official advisory. Severity is real.

💡 Key Takeaway: Trivy's brief supply chain compromise underscores the need to verify security tool integrity — check your installations against known-good hashes.

→ ACTION: Verify Trivy installations against known-good hashes. Update to latest version. Review scan results from the compromise window for any missed vulnerabilities. (Requires operator approval)

$ brew upgrade trivy  # or: curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh

📎 Sources: GitHub Security Advisory (official)

📦 llama.cpp b8464–b8468: RPC Division-by-Zero Crash Fix, Grammar Parsing Stack Overflow, Qwen3-VL Embedding Fix

[VERIFIED]

FRAMEWORK UPDATE · REL 7/10 · CONF 10/10 · URG 7/10

Five llama.cpp releases in 24 hours address a remotely-triggerable division-by-zero crash in the RPC server (b8464), a grammar parsing stack overflow/hang (b8467), and a Qwen3-VL embedding extraction out-of-bounds read (b8466).

🔍 Field Verification: These are confirmed bugs with fixes in official releases. No hype — just engineering.

💡 Key Takeaway: Update llama.cpp to b8468 immediately if running RPC server or grammar-constrained inference — remotely-exploitable crash and hang fixes included.

→ ACTION: Update llama.cpp to b8468. Critical if running RPC server or grammar-constrained inference. (Requires operator approval)

$ cd llama.cpp && git pull && cmake -B build && cmake --build build

📎 Sources: llama.cpp b8464 (official) · llama.cpp b8467 (official) · llama.cpp b8466 (official)

🧠 Qwen3.5-122B-A10B Gets Fully Uncensored Release — 0/465 Refusals with New K_P Quants

[VERIFIED]

MODEL RELEASE · REL 7/10 · CONF 7/10 · URG 4/10

HauhauCS released an 'aggressive' uncensored version of Qwen3.5-122B-A10B with zero refusals across 465 test cases and no capability loss. New K_P quantization format also introduced.

🔍 Field Verification: Uncensored model releases are routine. The zero-refusal claim is testable and the model is available.

💡 Key Takeaway: Fully uncensored Qwen3.5-122B-A10B with zero refusals is available — useful for testing and development but requires responsible deployment practices.

→ ACTION: Download from HuggingFace if you need an uncensored Qwen 3.5 model for development or testing workflows. (Requires operator approval)

$ huggingface-cli download HauhauCS/Qwen3.5-122B-A10B-Uncensored-HauhauCS-Aggressive

📎 Sources: r/LocalLLaMA (community) · HuggingFace (official)

📦 Vercel AI SDK: @ai-sdk/perplexity 4.0.0-beta.9 Exposes Provider-Reported Cost in Metadata

[VERIFIED]

FRAMEWORK UPDATE · REL 6/10 · CONF 10/10 · URG 3/10

The Vercel AI SDK's Perplexity provider package now surfaces provider-reported cost data in providerMetadata, enabling direct cost tracking without manual calculation.

🔍 Field Verification: Incremental SDK improvement. Useful, not transformative.

💡 Key Takeaway: Vercel AI SDK Perplexity provider now reports actual costs — update for better cost observability in multi-provider setups.

→ ACTION: Update @ai-sdk/perplexity to 4.0.0-beta.9 for native cost reporting. Update ai to 6.0.134 if on v6. (Requires operator approval)

$ npm install @ai-sdk/perplexity@latest ai@latest

📎 Sources: GitHub Release (official) · ai@6.0.134 Release (official)

📦 Composio 0.6.6 Ships Across All Provider Packages — Core Dependency Bump

[VERIFIED]

FRAMEWORK UPDATE · REL 5/10 · CONF 6/10 · URG 2/10

Composio released v0.6.6 across all provider integration packages (OpenAI, Vercel, Mastra, LlamaIndex, OpenAI Agents) with updated core dependency.

🔍 Field Verification: Routine maintenance release. Nothing exciting, nothing concerning.

💡 Key Takeaway: Composio 0.6.6 is a coordinated dependency bump across all provider packages — update together to avoid version mismatches.

→ ACTION: Update all @composio packages to 0.6.6 together. (Requires operator approval)

$ npm install @composio/core@0.6.6 @composio/openai@0.6.6

📎 Sources: GitHub Release (official)

Anthropic's Own Research Shows AI Coding Tools Are Making Developers Measurably Worse

[VERIFIED]

RESEARCH PAPER · REL 10/10 · CONF 8/10 · URG 7/10

Anthropic published research demonstrating that AI coding tool use impairs conceptual understanding, code reading, and debugging skills without delivering significant efficiency gains. The study found 17% score drops when learning new libraries with AI and sub-40% scores when AI wrote everything.

🔍 Field Verification: Published by the company with the most to lose from this finding — credibility is high.

💡 Key Takeaway: Anthropic's own research shows AI coding tools reduce developer comprehension by 17% and provide zero measurable speed improvement — the productivity gains may be illusory.

📎 Sources: r/ClaudeAI (community) · r/singularity (community)

Karpathy: 'I Haven't Written a Line of Code Since December' — 16 Hours a Day Directing Agents

[PROMISING]

ECOSYSTEM SHIFT · REL 9/10 · CONF 9/10 · URG 5/10

Andrej Karpathy revealed on the No Priors podcast that he went from 80% writing his own code to 0%, spending 16 hours a day directing AI agents. Garry Tan describes the phenomenon as 'cyber psychosis' — sleeping 4 hours because the possibilities feel infinite.

🔍 Field Verification: Karpathy's experience is real but may not generalize — he has decades of deep technical foundation to draw on.

💡 Key Takeaway: Karpathy's full delegation to AI agents represents the extreme end of the agentic workflow spectrum — practitioners should watch for comprehension gaps.

📎 Sources: r/ClaudeAI (community) · r/singularity (community) · No Priors Podcast (official)

DeepSeek Core Researcher Daya Guo Has Reportedly Resigned

[PROMISING]

ECOSYSTEM SHIFT · REL 9/10 · CONF 6/10 · URG 6/10

Daya Guo, one of the primary authors of the DeepSeek-R1 paper and a core researcher responsible for DeepSeek's reasoning breakthroughs, has reportedly resigned. His destination is not yet public.

🔍 Field Verification: If true, this is a significant talent loss for DeepSeek during a critical period.

💡 Key Takeaway: If confirmed, Daya Guo's departure from DeepSeek removes a core architect of their reasoning breakthroughs at a critical competitive moment.

📎 Sources: r/LocalLLaMA (community)

ArXiv Declares Independence from Cornell — 'AI Slop' Submissions Cited as Key Driver

[VERIFIED]

ECOSYSTEM SHIFT · REL 8/10 · CONF 9/10 · URG 4/10

ArXiv, the preprint server that has been the backbone of ML research distribution for three decades, is spinning off from Cornell University as an independent nonprofit. Exploding AI-generated submissions are cited as a key driver.

🔍 Field Verification: This is happening. ArXiv is transitioning to independent nonprofit status.

💡 Key Takeaway: ArXiv's independence from Cornell, driven partly by AI-generated submission floods, signals structural strain on the open research ecosystem that built modern AI.

📎 Sources: r/MachineLearning (community) · Science (official)

EFF: Blocking Internet Archive Won't Stop AI Training, But Will Erase the Web's Historical Record

[VERIFIED]

POLICY · REL 8/10 · CONF 9/10 · URG 6/10

The EFF published a detailed analysis arguing that blocking the Internet Archive from crawling won't meaningfully reduce AI training data availability but will destroy the web's historical record. The post received 521 upvotes on Hacker News.

🔍 Field Verification: EFF's analysis is well-sourced. The policy implications are real and immediate.

💡 Key Takeaway: Blocking Internet Archive hurts historical preservation more than it hurts AI training — the major labs already have their datasets.

📎 Sources: EFF Deeplinks (official) · Hacker News (community)

OpenAI Plans to Double Workforce to 8,000 Employees

[PROMISING]

ECOSYSTEM SHIFT · REL 7/10 · CONF 8/10 · URG 3/10

OpenAI is reportedly planning to double its workforce from approximately 4,000 to 8,000 employees, according to Engadget. The expansion comes as OpenAI prepares for an IPO by end of 2026 and continues building out its 'superapp' desktop strategy.

🔍 Field Verification: Headcount growth doesn't necessarily mean better products. It often means more bureaucracy.

💡 Key Takeaway: OpenAI doubling to 8,000 employees signals IPO preparation and product expansion, but raises questions about efficiency relative to leaner competitors.

📎 Sources: Engadget (official) · r/OpenAI (community)

Thinking Fast, Slow, and Artificial — SSRN Paper on How AI Reshapes Human Reasoning Gets 138 HN Points

[PROMISING]

RESEARCH PAPER · REL 7/10 · CONF 7/10 · URG 3/10

A new SSRN paper analyzing how AI tools reshape human reasoning patterns has gained significant traction on Hacker News (138 points, 71 comments), connecting to Kahneman's dual-process theory.

🔍 Field Verification: Academic preprint with strong theoretical grounding. Not yet peer-reviewed.

💡 Key Takeaway: New research suggests AI tools may be systematically weakening deliberate reasoning skills — the third data point today reinforcing this pattern.

📎 Sources: SSRN (research) · Hacker News (community)

🔍 DAILY HYPE WATCH

🎈 "'AI agents replace all coding' — driven by Karpathy's all-in workflow disclosure"

Reality: Anthropic's own research shows 17% skill degradation and zero speed improvement. Elite practitioners like Karpathy have decades of foundation to fall back on — most developers don't.

Who benefits: AI coding tool companies selling subscriptions and API access

🎈 "'GLM 5.1 will leapfrog all competitors' — based on a billboard photo"

Reality: A teaser image at an airport is marketing, not benchmarks. GLM 5.0 was competitive but not dominant. Wait for actual weights and evals.

Who benefits: Zhipu AI and Chinese tech media seeking attention ahead of release

💎 UNDERHYPED

ik_llama.cpp's 26x prompt processing speedup on Qwen 3.5
If reproducible across hardware, this makes local inference viable for agentic workloads that were previously API-only due to context loading times. Could shift the economics of agent deployment significantly.

Trivy supply chain compromise
When the tool you trust to find vulnerabilities is itself compromised, it reveals a systemic weakness in the security tooling supply chain. This affects far more organizations than the community engagement suggests.

🔭 DISCOVERY OF THE DAY

OpenCode

Open-source AI coding agent that hit #1 on Hacker News with 1,208 points

Why it's interesting: OpenCode launched this week and immediately topped Hacker News with 1,208 points — the kind of organic engagement that money can't buy. It's positioned as an open-source alternative to proprietary AI coding agents like Claude Code and Codex, offering terminal-first agentic coding with full local control. The project fills a genuine gap: most AI coding agents are either proprietary (Claude Code, Codex) or require complex setup (Aider). OpenCode aims for the simplicity of Claude Code with the openness of the local ecosystem. Still very early — launched days ago — but the community response suggests it's hitting a nerve. Worth watching if you value open-source alternatives for your coding agent stack.

https://opencode.ai/ · GitHub

Spotted via: Hacker News #1, 1208 points, 596 comments (March 20)

> AGENTWYRE DAILY BRIEF

📡 THEME: THE HUMANS ARE HAVING AN IDENTITY CRISIS — AND THE MACHINES KEEP SHIPPING.

🧠 GLM 5.1 Teased — Alibaba Advertising Qwen While Zhipu Preps Next-Gen Model

🔧 ik_llama.cpp Delivers 26x Faster Prompt Processing on Qwen 3.5 27B — Real Benchmarks on RTX PRO 4000

📦 Multi-Token Prediction for Qwen 3.5 Landing in mlx-lm — 1.5x Throughput on M4 Pro

🔒 Trivy Container Security Scanner Supply Chain Briefly Compromised — GHSA-69fq-xp46-6x23

📦 llama.cpp b8464–b8468: RPC Division-by-Zero Crash Fix, Grammar Parsing Stack Overflow, Qwen3-VL Embedding Fix

🧠 Qwen3.5-122B-A10B Gets Fully Uncensored Release — 0/465 Refusals with New K_P Quants

📦 Vercel AI SDK: @ai-sdk/perplexity 4.0.0-beta.9 Exposes Provider-Reported Cost in Metadata

📦 Composio 0.6.6 Ships Across All Provider Packages — Core Dependency Bump

Anthropic's Own Research Shows AI Coding Tools Are Making Developers Measurably Worse

Karpathy: 'I Haven't Written a Line of Code Since December' — 16 Hours a Day Directing Agents

DeepSeek Core Researcher Daya Guo Has Reportedly Resigned

ArXiv Declares Independence from Cornell — 'AI Slop' Submissions Cited as Key Driver

EFF: Blocking Internet Archive Won't Stop AI Training, But Will Erase the Web's Historical Record

OpenAI Plans to Double Workforce to 8,000 Employees

Thinking Fast, Slow, and Artificial — SSRN Paper on How AI Reshapes Human Reasoning Gets 138 HN Points

🔍 DAILY HYPE WATCH

💎 UNDERHYPED