The latest generation of AI models from OpenAI, Anthropic, and others promise something revolutionary: machines that can "think" before they answer. These Large Reasoning Models (LRMs) generate detailed chains of thought, self-reflect on their reasoning, and supposedly tackle complex problems better than their predecessors. But new research from Apple throws cold water on these claims,...
Category: Science
Self-Improving AI: Darwin Gödel Machine Evolves Code
In 1965, mathematician I.J. Good predicted the possibility of an "intelligence explosion"—a hypothetical scenario where AI systems could recursively improve themselves, each generation surpassing the last. Nearly 60 years later, researchers from the University of British Columbia, Vector Institute, and Sakana AI have taken a significant step toward this vision with their Darwin Gödel Machine...
Zero Data, Superhuman Code: A New AI Paradigm Emerges—and It Has an “Uh-Oh Moment”
The relentless march of AI capabilities continues, driven largely by ever-larger models and increasingly sophisticated training methods. For large language models (LLMs), Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful technique, allowing models to learn directly from outcome-based feedback rather than just mimicking human-provided steps. Recent variants have pushed towards a "zero"...
What is reinforcement learning? An AI researcher explains a key method of teaching machines – and how it relates to training your dog
Understanding intelligence and creating intelligent machines are grand scientific challenges of our times. The ability to learn from experience is a cornerstone of intelligence for machines and living beings alike. In a remarkably prescient 1948 report, Alan Turing – the father of modern computer science – proposed the construction of machines that display intelligent behavior....
CaMeL Prompt Injection Defense Explained
Prompt injection attacks pose a major threat to AI assistants, but a breakthrough system called CaMeL offers a powerful defense by design. Developed by researchers from Google, DeepMind, and ETH Zurich, CaMeL uses proven software security principles to protect large language models where others fail. Here’s how it works—and why it matters. The Problem: Why...
AI’s Emerging Preference Systems
A paper from researchers at the Center for AI Safety, University of Pennsylvania, and UC Berkeley has uncovered something surprising about large language models (LLMs): they develop coherent, structured preferences that become more organized as the models get larger. This finding challenges some common assumptions about how AI systems work and raises important questions about...
How close is the superintelligence?
The pursuit of superintelligence—the development of artificial intelligence (AI) systems that significantly surpass human cognitive abilities—has become a focal point of research and debate within the AI community. As advancements in machine learning, particularly through large language models (LLMs) and deep learning techniques, have accelerated, experts increasingly ponder the implications of systems that could autonomously...
In-Context Scheming in Frontier Language Models
Researches from Apollo Research have investigated the ability of large language models (LLMs) to engage in "scheming"—covertly pursuing misaligned goals. The research evaluated several leading LLMs across various scenarios designed to incentivise deceptive behaviour, finding that these models can strategically deceive, manipulate, and even attempt to subvert oversight mechanisms to achieve their objectives. The study...
Test-Time Compute: The Next Frontier in AI Scaling
Major AI labs, including OpenAI, are shifting their focus away from building ever-larger language models (LLMs). Instead, they are exploring "test-time compute", where models receive extra processing time during execution to produce better results. This change stems from the limitations of traditional pre-training methods, which have reached a plateau in performance and are becoming too...
Ferret-UI 2: Towards Universal UI Understanding for LLMs
Ferret-UI 2 is a multimodal large language model (MLLM) designed to interpret, navigate, and interact with UIs on iPhone, Android, iPad, Web, and AppleTV. It enhances UI comprehension, supports high-resolution perception, and tackles complex, user-centered tasks across these diverse platforms. Core Architecture: Multimodal Integration The foundational architecture of Ferret-UI 2 integrates a CLIP ViT-L/14 visual...