Featured

My social media feed is now a hellish stream of puerile AI slop. Am I stubborn to want to hang on to reality?

Recently, a friend sent me a video of a man dressed as a pickle. Following a high-octane car chase, the pickle flung himself out of the car and flailed down the highway. It was stupid and we laughed. But it also wasn’t real. When I pointed out to my friend that the video was AI-generated, she was taken by surprise, noting she’s usually pretty good at spotting them. She was also frustrated: “I hate having to be on the constant lookout for AI trash,” she lamented in the chat. And I feel that. Becoming an AI detective is a job I never wanted and wish I could quit.

Latest
Fine-Tuning a BERT Model
Machine Learning Mastery

This article is divided into two parts:

• Fine-tuning a BERT Model for GLUE Tasks
• Fine-tuning a BERT Model for SQuAD Tasks

GLUE is a benchmark for evaluating natural language understanding (NLU) tasks.
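Not from the article itself, just for orientation: several GLUE tasks (e.g., MNLI) are sentence-pair tasks, which BERT handles by packing both sentences into a single sequence with segment (token-type) ids marking which sentence each token belongs to. A toy, pure-Python sketch of that packing, with tokenization elided and tokens assumed pre-split:

```python
# Toy sketch of how a GLUE sentence pair is packed for BERT:
# [CLS] sentence-A tokens [SEP] sentence-B tokens [SEP],
# with segment ids 0 for the first sentence and 1 for the second.
def pack_pair(tokens_a, tokens_b):
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS] + sentence A + its [SEP]; segment 1 covers the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids

tokens, segs = pack_pair(["the", "cat", "sat"], ["a", "cat", "sat"])
print(tokens)
print(segs)
```

During fine-tuning, the hidden state at the [CLS] position is fed to a small classification head for the task label (e.g., entailment vs. contradiction).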

Nov 28, 2025
Anthropic says it solved the long-running AI agent problem with a new multi-session Claude SDK
VentureBeat – AI

Agent memory remains a problem enterprises want to fix: agents forget instructions or conversations the longer they run. Anthropic believes it has solved this issue for its Claude Agent SDK, developing a two-fold solution that lets an agent work across different context windows.

“The core challenge of long-running agents is that they must work in discrete sessions, and each new session begins with no memory of what came before,” Anthropic wrote in a blog post. “Because context windows are limited, and because most complex projects cannot be completed within a single window, agents need a way to bridge the gap between coding sessions.”

Anthropic engineers proposed a two-fold approach for the Agent SDK: an initializer agent to set up the environment, and a coding agent to make incremental progress in each session and leave artifacts for the next.

The agent memory problem

Since agents are built on foundation models, they remain constrained by limited, though continually growing, context windows. For long-running agents this compounds into a larger problem, leading the agent to forget instructions and behave abnormally mid-task. Enhancing agent memory becomes essential for consistent, business-safe performance.

Several methods emerged over the past year, all attempting to bridge the gap between context windows and agent memory: LangChain’s LangMem SDK, Memobase, and OpenAI’s Swarm are examples of memory solutions. Research on agentic memory has also exploded recently, with proposed frameworks like Memp and Google’s Nested Learning Paradigm offering new alternatives. Many current memory frameworks are open source and can, in principle, adapt to the different large language models (LLMs) powering agents. Anthropic’s approach improves its own Claude Agent SDK.
How it works

Anthropic identified that even though the Claude Agent SDK had context management capabilities, meaning it “should be possible for an agent to continue to do useful work for an arbitrarily long time,” those capabilities were not sufficient. The company said in its blog post that a model like Opus 4.5 running the Claude Agent SDK can “fall short of building a production-quality web app if it’s only given a high-level prompt, such as 'build a clone of claude.ai.'”

The failures manifested in two patterns, Anthropic said. First, the agent tried to do too much, causing the model to run out of context mid-task; the agent then has to guess what happened and cannot pass clear instructions to the next session. The second failure occurs later, after some features have already been built: the agent sees progress has been made and simply declares the job done.

Anthropic’s researchers broke the solution into two parts: setting up an initial environment to lay the foundation for features, and prompting each agent to make incremental progress toward a goal while still leaving a clean slate at the end. This is where the two-part design comes in. The initializer agent sets up the environment, logging what agents have done and which files have been added. The coding agent then asks the model to make incremental progress and leave structured updates. “Inspiration for these practices came from knowing what effective software engineers do every day,” Anthropic said. The researchers also added testing tools to the coding agent, improving its ability to identify and fix bugs that weren’t obvious from the code alone.

Future research

Anthropic noted that its approach is “one possible set of solutions in a long-running agent harness.” Still, this is just the beginning of what could become a wider research area for many in the AI space.
The company said its experiments to boost long-term memory for agents haven’t yet shown whether a single general-purpose coding agent or a multi-agent structure works best across contexts. Its demo also focused on full-stack web app development, so further experiments should focus on generalizing the results across different tasks. “It’s likely that some or all of these lessons can be applied to the types of long-running agentic tasks required in, for example, scientific research or financial modeling,” Anthropic said.
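As a rough illustration only (this is not Anthropic's SDK; the file name, state schema, and function names below are invented), the initializer/coding-agent pattern described above can be sketched as a loop in which an initializer writes a progress artifact, and each bounded session reads it, does one increment of work, and leaves a structured update for the next session:

```python
import json
import pathlib
import tempfile

def initialize(workdir: pathlib.Path, features: list) -> None:
    # Initializer step: set up the environment and a shared progress artifact
    # that bridges sessions, since no session remembers the previous one.
    (workdir / "PROGRESS.json").write_text(json.dumps(
        {"features": features, "done": [], "notes": []}
    ))

def run_session(workdir: pathlib.Path) -> bool:
    # One bounded "session": finish at most one feature, then record
    # a structured update and stop, leaving a clean slate.
    state_path = workdir / "PROGRESS.json"
    state = json.loads(state_path.read_text())
    remaining = [f for f in state["features"] if f not in state["done"]]
    if not remaining:
        return False  # nothing left; a fresh session can verify and stop
    feature = remaining[0]
    # ... real work (code edits, running tests) would happen here ...
    state["done"].append(feature)
    state["notes"].append(f"session finished: {feature}")
    state_path.write_text(json.dumps(state))
    return True

workdir = pathlib.Path(tempfile.mkdtemp())
initialize(workdir, ["auth", "chat-ui", "billing"])
sessions = 0
while run_session(workdir):
    sessions += 1
state = json.loads((workdir / "PROGRESS.json").read_text())
print(sessions, state["done"])
```

The point of the sketch is the shape of the fix, not the mechanics: each session starts with no memory, so all continuity lives in the artifact, and the incremental "one feature per session" discipline avoids both failure modes Anthropic describes (running out of context mid-task, and declaring victory early, since undone features stay listed).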

Nov 28, 2025
What to be thankful for in AI in 2025
VentureBeat – AI

Hello, dear readers. Happy belated Thanksgiving and Black Friday!

This year has felt like living inside a permanent DevDay. Every week, some lab drops a new model, a new agent framework, or a new “this changes everything” demo. It’s overwhelming. But it’s also the first year I’ve felt like AI is finally diversifying: not just one or two frontier models in the cloud, but a whole ecosystem, open and closed, giant and tiny, Western and Chinese, cloud and local. So for this Thanksgiving edition, here’s what I’m genuinely thankful for in AI in 2025: the releases that feel like they’ll matter in 12–24 months, not just during this week’s hype cycle.

1. OpenAI kept shipping strong: GPT-5, GPT-5.1, Atlas, Sora 2 and open weights

As the company that undeniably birthed the "generative AI" era with its viral hit product ChatGPT in late 2022, OpenAI arguably had one of the hardest tasks of any AI company in 2025: continue its growth trajectory even as well-funded competitors like Google, with its Gemini models, and startups like Anthropic fielded their own highly competitive offerings. Thankfully, OpenAI rose to the challenge and then some.

Its headline act was GPT-5, unveiled in August as its next frontier reasoning model, followed in November by GPT-5.1 with new Instant and Thinking variants that dynamically adjust how much “thinking time” they spend per task. In practice, GPT-5’s launch was bumpy: VentureBeat documented early math and coding failures and a cooler-than-expected community reaction in “OpenAI’s GPT-5 rollout is not going smoothly,” but the company quickly course-corrected based on user feedback, and as a daily user of the model I'm personally pleased and impressed with it. At the same time, enterprises actually using the models are reporting solid gains. ZenDesk Global, for example, says GPT-5-powered agents now resolve more than half of customer tickets, with some customers seeing 80–90% resolution rates.
That’s the quiet story: these models may not always impress the chattering classes on X, but they’re starting to move real KPIs.

On the tooling side, OpenAI finally gave developers a serious AI engineer in GPT-5.1-Codex-Max, a new coding model that can run long, agentic workflows and is already the default in OpenAI’s Codex environment. VentureBeat covered it in detail in “OpenAI debuts GPT-5.1-Codex-Max coding model and it already completed a 24-hour task internally.”

Then there’s ChatGPT Atlas, a full browser with ChatGPT baked into the chrome itself: sidebar summaries, on-page analysis, and search tightly integrated into regular browsing. It’s the clearest sign yet that “assistant” and “browser” are on a collision course.

On the media side, Sora 2 turned the original Sora video demo into a full video-and-audio model with better physics, synchronized sound and dialogue, and more control over style and shot structure, plus a dedicated Sora app with a full-fledged social networking component, letting any user create their own TV network in their pocket.

Finally, and maybe most symbolically, OpenAI released gpt-oss-120B and gpt-oss-20B, open-weight MoE reasoning models under an Apache 2.0-style license. Whatever you think of their quality (and early open-source users have been loud about their complaints), this is the first time since GPT-2 that OpenAI has put serious weights into the public commons.

2. China’s open-source wave goes mainstream

If 2023–24 was about Llama and Mistral, 2025 belongs to China’s open-weight ecosystem. A study from MIT and Hugging Face found that China now slightly leads the U.S. in global open-model downloads, largely thanks to DeepSeek and Alibaba’s Qwen family. Highlights:

DeepSeek-R1 dropped in January as an open-source reasoning model rivaling OpenAI’s o1, with MIT-licensed weights and a family of distilled smaller models.
VentureBeat has followed the story from its release to its cybersecurity impact to performance-tuned R1 variants.

Kimi K2 Thinking from Moonshot is a “thinking” open-source model that reasons step-by-step with tools, very much in the o1/R1 mold, and is positioned as the best open reasoning model in the world so far.

Z.ai shipped GLM-4.5 and GLM-4.5-Air as “agentic” models, open-sourcing base and hybrid reasoning variants on GitHub.

Baidu’s ERNIE 4.5 family arrived as a fully open-sourced, multimodal MoE suite under Apache 2.0, including a 0.3B dense model and visual “Thinking” variants focused on charts, STEM, and tool use.

Alibaba’s Qwen3 line, including Qwen3-Coder, large reasoning models, and the Qwen3-VL series released over the summer and fall of 2025, continues to set a high bar for open weights in coding, translation, and multimodal reasoning, leading me to declare this past summer “Qwen's summer.”

VentureBeat has been tracking these shifts, including Chinese math and reasoning models like Light-R1-32B and Weibo’s tiny VibeThinker-1.5B, which beat DeepSeek baselines on shoestring training budgets. If you care about open ecosystems or on-premise options, this is the year China’s open-weight scene stopped being a curiosity and became a serious alternative.

3. Small and local models grow up

Another thing I’m thankful for: we’re finally getting good small models, not just toys.

Liquid AI spent 2025 pushing its Liquid Foundation Models (LFM2) and LFM2-VL vision-language variants, designed from day one for low-latency, device-aware deployments: edge boxes, robots, and constrained servers, not just giant clusters. The newer LFM2-VL-3B targets embedded robotics and industrial autonomy, with demos planned at ROSCon.

On the big-tech side, Google’s Gemma 3 line made a strong case that “tiny” can still be capable. Gemma 3 spans from 270M parameters up through 27B, all with open weights and multimodal support in the larger variants.
The standout is Gemma 3 270M, a compact model purpose-built for fine-tuning and structured text tasks (think custom formatters, routers, and watchdogs), covered both in Google’s developer blog and in community discussions in local-LLM circles. These models may never trend on X, but they’re exactly what you need for privacy-sensitive workloads, offline workflows, thin-client devices, and “agent swarms” where you don’t want every tool call hitting a giant frontier LLM.

4. Meta + Midjourney: aesthetics as a service

One of the stranger twists this year: Meta partnered with Midjourney instead of simply trying to beat it. In August, Meta announced a deal to license Midjourney’s “aesthetic technology” (its image and video generation stack) and integrate it into Meta’s future models and products, from Facebook and Instagram feeds to Meta AI features. VentureBeat covered the partnership in “Meta is partnering with Midjourney and will license its technology for future models and products,” raising the obvious question: does this slow or reshape Midjourney’s own API roadmap? There’s no definitive answer yet, but Midjourney’s stated plans for an API release have yet to materialize, which suggests the deal has slowed that roadmap.

For creators and brands, though, the immediate implication is simple: Midjourney-grade visuals start showing up in mainstream social tools instead of being locked away in a Discord bot. That could normalize higher-quality AI art for a much wider audience, and force rivals like OpenAI, Google, and Black Forest Labs to keep raising the bar.

5. Google’s Gemini 3 and Nano Banana Pro

Google tried to answer GPT-5 with Gemini 3, billed as its most capable model yet, with better reasoning, coding, and multimodal understanding, plus a new Deep Think mode for slow, hard problems. VentureBeat’s coverage, “Google unveils Gemini 3 claiming the lead in math, science, multimodal and agentic AI,” framed it as a direct shot at frontier benchmarks and agentic workflows.
But the surprise hit is Nano Banana Pro (Gemini 3 Pro Image), Google’s new flagship image generator. It specializes in infographics, diagrams, multi-subject scenes, and multilingual text that actually renders legibly at 2K and 4K resolutions. In the world of enterprise AI, where charts, product schematics, and “explain this system visually” images matter more than fantasy dragons, that’s a big deal.

6. Wild cards I’m keeping an eye on

A few more releases I’m thankful for, even if they don’t fit neatly into one bucket:

Black Forest Labs’ Flux.2 image models, which launched just earlier this week with ambitions to challenge both Nano Banana Pro and Midjourney on quality and control. VentureBeat dug into the details in “Black Forest Labs launches Flux.2 AI image models to challenge Nano Banana Pro and Midjourney.”

Anthropic’s Claude Opus 4.5, a new flagship that aims for cheaper, more capable coding and long-horizon task execution, covered in “Anthropic’s Claude Opus 4.5 is here: Cheaper AI, infinite chats, and coding skills that beat humans.”

A steady drumbeat of open math/reasoning models, from Light-R1 to VibeThinker and others, that show you don’t need $100M training runs to move the needle.

Last thought (for now)

If 2024 was the year of “one big model in the cloud,” 2025 is the year the map exploded: multiple frontiers at the top, China taking the lead in open models, small and efficient systems maturing fast, and creative ecosystems like Midjourney getting pulled into big-tech stacks. I’m thankful not just for any single model, but for the fact that we now have options: closed and open, local and hosted, reasoning-first and media-first. For journalists, builders, and enterprises, that diversity is the real story of 2025.

Happy holidays and best to you and your loved ones!

Nov 28, 2025
Scientists uncover the brain’s hidden learning blocks
ScienceDaily

Princeton researchers found that the brain excels at learning because it reuses modular “cognitive blocks” across many tasks. Monkeys switching between visual categorization challenges revealed that the prefrontal cortex assembles these blocks like Legos to create new behaviors. This flexibility explains why humans learn quickly while AI models often forget old skills. The insights may help build better AI and new clinical treatments for impaired cognitive adaptability.

Nov 28, 2025
More than 1,000 Amazon workers warn rapid AI rollout threatens jobs and climate
The Guardian

Workers say the firm’s ‘warp-speed’ approach fuels pressure, layoffs and rising emissions.

More than 1,000 Amazon employees have signed an open letter expressing “serious concerns” about AI development, saying that the company’s “all-costs justified, warp speed” approach to the powerful technology will cause damage to “democracy, to our jobs, and to the earth.” The letter, published on Wednesday, was signed by the Amazon workers anonymously, and comes a month after Amazon announced mass layoff plans as it increases adoption of AI in its operations.

Nov 28, 2025
