TMCnet News
Causal Dynamics Lab outperforms Anthropic and OpenAI in multiple coding tests

San Francisco, CA, May 05, 2026 (GLOBE NEWSWIRE) -- AI coding tools are now producing code faster than teams can check what it will do in real use. Today, Causal Dynamics Lab (CDL) announced new research explaining why this happens, along with a new product called Cielara Code. This product achieved the highest accuracy in code localization among AI coding tools, outperforming both Claude Code (Opus-4.6) and OpenAI Codex (GPT-5.4) across three independent tests.

CDL studied how coding agents operate by tracking their actions across thousands of coding sessions. They found 56.8% of agents' actions involved reading files, and 24.2% involved using grep. Less than 1% of their actions were actual code edits. The problem was not that agents couldn't write code; they had difficulty finding the correct code to edit. The situation worsened with more complex tasks: when a correct fix involved more than six files, the agents' ability to recall the necessary information dropped significantly, and the computing power used in failed attempts increased by a factor of 4 compared to successful ones.

"Every coding agent out there today uses grep, which is like a surgeon operating without imaging," said Hasibul Haque, CEO at Causal Dynamics Lab. "We created Cielara Code to help agents see better: it provides a clear understanding of the working environment, making the reasons behind each change clear and verifiable."
The 2025 DORA report showed the use of AI coding tools led to a 7.2% drop in deployment stability. AWS CTO Werner Vogels called this problem "dynamic verification debt." A well-known issue with Claude Code (GitHub issue #42796) illustrates the same problem on a larger scale: current agents treat code as flat text without showing how files connect, how functions call each other, or how changes affect the overall system.

How Cielara Code works

Benchmark results
REASONARA: causal memory at enterprise scale

Cielara Code is a safety layer for AI coding agents. It aims to enhance the safety of their output rather than replace them. Currently, 11 Fortune 100 and over 40 Fortune 500 companies use Cielara Code on their codebases.

"Board members and auditors expect more proactive risk management. Leaders now want proof that security can anticipate risks caused by fast-moving AI and automation, instead of just reacting after incidents," said the CISO of one of the largest law firms in the United States, who is also a Cielara Code customer.

Phillip Miller, Vice President, Global Chief Information Security Officer, H&R Block added: “Enterprises need solutions to problems they cannot solve with people alone. Cielara's technology is a generational leap towards the original promise of AI: tackling complexity 7x24 with acquired knowledge, deep reasoning, and unbeatable accuracy. For engineering teams, this means a single engine to discover faults in real-world deployments (including legacy, cloud) and provide clear resolution steps. When I wrote Hacking Success, I described a world where AI needs strong, directive policy (not rules / guardrails) to be safe and effective. Information Security lags behind the innovation curve, as most options rely on legacy thinking including posture, gateways, and logging. Enterprises now have an option to leverage Cielara's models to oversee deployments of AI agents, models, and their supporting infrastructure.”

The team

"AI has already changed how people find information. The next step is to change how people make decisions by exploring possibilities, comparing options, and understanding the outcomes before making a choice," said Matt Fisher, former Co-Founder and CTO of Daydream and an Adjunct Professor at Brown University. "That shift towards exploring outcomes is what CDL is focusing on."

What's next
Methodology: Benchmarks were run against Claude Code (Opus-4.6) and OpenAI Codex (GPT-5.4) using the publicly available MULocBench, UltraDomain, LoCoMo, and LongMemEval test harnesses. Full methodology, configuration, and reproduction instructions are available at [research.causaldynamics.com/benchmarks].

About Causal Dynamics Lab

For further information please contact the Causal Dynamics Lab press office via Bilal Mahmoood on [email protected] or +447714007257.



