The story of how Arcseer
came to exist — and why it matters.
Arcseer was not built in a boardroom. It grew out of a shared conviction, held by a group of offensive security specialists who had spent years working on the front line of enterprise security — and who believed that AI was about to change the nature of that work entirely.
For most of 2023 and into 2024, we watched the development of large language models with the particular attention of people who understood both their potential and their limits. The question we kept returning to was not whether AI could be applied to security assurance — we were convinced it could — but whether the economics and the capability had yet reached the point where it could be done properly. For most of that period, the honest answer was no.
By early 2025, model capability and commercial viability had converged. What had been research speculation became something we could build seriously — and build properly.
From too expensive
to irresistible.
Early LLM inference costs made serious, production-grade use of the technology for security assurance prohibitively expensive. Running meaningful assessments at the token volumes required was simply not a viable commercial proposition. We were not alone in recognising this, but we were also not willing to build something that only worked on paper.
What changed that calculation was not a single breakthrough but the compounding of several. Model capability improved rapidly while inference costs fell sharply. Reasoning quality, the ability of a system to work through a multi-step problem with the structured logic that security work demands, reached a threshold that made previously theoretical applications tractable. By early 2025, the picture had changed materially. The tipping point had arrived.
Initial experiments yielded results that moved us from interested to genuinely excited. Not because the technology was perfect — it was not — but because the direction was clear and the trajectory was compelling. We began a more serious development effort.
Building something
that actually works.
What followed was a genuine engineering and research challenge. Building an AI-driven penetration testing capability that produces outputs a senior security professional would trust required solving a series of interconnected problems that do not yield to off-the-shelf answers.
Multi-agent architectures gave us the ability to decompose complex assessments into parallel workstreams — running reconnaissance, exploitation, and analysis simultaneously rather than sequentially. Budget-managed assessment design allowed us to control the scope and depth of each engagement with the precision that enterprise clients require. MCP adoption and tool access gave the agents the ability to interact with real systems in the ways that matter. Memory techniques and embedding management addressed the challenge of maintaining coherent context across long and complex assessments. Each of these capabilities was developed, tested, and refined against real-world conditions.
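As a minimal illustrative sketch only (the names, numbers, and structure here are hypothetical and are not Arcseer's actual implementation), a budget-managed multi-agent decomposition can be pictured as parallel workstreams drawing on a shared engagement budget:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

@dataclass
class Budget:
    """Caps total token spend across every agent in an engagement."""
    limit: int
    spent: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def charge(self, tokens: int) -> bool:
        # Atomically refuse any work that would exceed the budget.
        with self._lock:
            if self.spent + tokens > self.limit:
                return False
            self.spent += tokens
            return True

def run_agent(name: str, cost: int, budget: Budget) -> str:
    # Each workstream must secure budget before doing model-backed work.
    if not budget.charge(cost):
        return f"{name}: skipped (budget exhausted)"
    return f"{name}: completed ({cost} tokens)"

budget = Budget(limit=100_000)
workstreams = [("recon", 40_000), ("exploitation", 35_000), ("analysis", 30_000)]

# Reconnaissance, exploitation, and analysis run in parallel rather
# than sequentially, within the engagement's overall budget.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda w: run_agent(w[0], w[1], budget), workstreams))
```

The point of the sketch is the control structure: the budget bounds scope and depth centrally, while the workstreams proceed concurrently.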
We also witnessed, at close range, the rise of what the industry came to call AI slop — the flood of low-quality, unverified, hallucination-prone output that AI tooling was producing in adjacent fields, and the damage it was beginning to cause in security specifically. Bug bounty programmes were being overwhelmed with automated noise. Reports were being filed that had not been validated. Findings that looked real in a summary were incoherent under scrutiny. The reputational damage to the idea of AI-assisted security work was real, and it was being caused by tools that had been built fast without being built carefully.
Hallucination management became one of our core technical concerns. An AI system that invents vulnerabilities is worse than useless — it actively undermines the trust that security assurance depends on. Our approach required that every finding be grounded in evidence that could be traced, reproduced, and reviewed. If it could not be verified, it did not make the report.
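That rule can be pictured with a simplified sketch (the field and function names are illustrative, not Arcseer's code): a finding only reaches the report if it carries traceable evidence and has been reproduced.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    title: str
    # Evidence that can be traced, reproduced, and reviewed.
    evidence: list = field(default_factory=list)
    reproduced: bool = False

def verified(finding: Finding) -> bool:
    # A finding makes the report only when it is backed by traceable
    # evidence and has actually been reproduced.
    return bool(finding.evidence) and finding.reproduced

candidates = [
    Finding("SQL injection in /login",
            evidence=["request/response capture"], reproduced=True),
    Finding("Suspected RCE", evidence=[], reproduced=False),  # unverified
]

# Unverified findings are dropped rather than reported.
report = [f.title for f in candidates if verified(f)]
```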
Whatever we built needed to produce outcomes that were verified, reproducible, and defensible under expert review. That was not a marketing position. It was an engineering requirement.
We believe the best offensive security work in the coming years will come from teams that know how to work with AI — not teams that have been replaced by it.
Humans in the loop —
not out of it.
We had the technical ambition to build a fully automated assessment platform. But we did not want to take humans out of the process. That distinction matters, and it is not a compromise — it is a deliberate position.
The value of experienced offensive security professionals does not disappear when AI can automate reconnaissance or chain exploits at scale. It concentrates. What seasoned practitioners bring — contextual judgement, pattern recognition developed across hundreds of real engagements, the ability to recognise when an anomaly is interesting rather than just anomalous — becomes more valuable in an AI-augmented workflow, not less. The question is not whether to include human expertise, but how to deploy it where it creates the most impact.
Our model ensures that qualified security professionals are present throughout. In review, in supervision, in the interpretation of complex findings, and in direct contribution when the nature of an engagement calls for it. Automation handles the volume and the velocity. Humans handle the judgement. The result is something neither could produce alone.
From project
to platform.
The point at which internal excitement became commercial conviction was not a single demonstration. It was a gradual accumulation of evidence that what we had built could add real value to the day-to-day challenges facing the security professionals responsible for defending large, complex organisations.
CISOs and their teams are operating under conditions that have not been adequately served by the tools available to them. Attack surfaces are expanding faster than teams can test them. Regulatory demands are increasing. The skills market is structurally unable to fill the gap between what is needed and what is available. The problems are real, they are growing, and they are not going to be solved by doing the same things more expensively.
What we had built addressed those problems in a way we could defend — technically, operationally, and commercially. The platform had reached the point of commercial viability. The decision to take it to market was not a hard one.
Arcseer was born from that conviction — to bring this research to market as an AI penetration testing platform, supervised by offensive security specialists, built to meet the challenge of both accelerated technology adoption and continuous development at enterprise scale.
What we are here to do.
To provide an automated security assurance platform — supervised by offensive security specialists — that addresses the real-world challenges organisations face in defending complex estates, while supporting the regulatory compliance requirements that enterprise and regulated clients must meet.
That means continuous testing rather than point-in-time snapshots. It means findings that are verified and reproducible. It means outputs that a board, an auditor, or a regulator can engage with. And it means doing this at a scale and cost that makes it a viable operational choice rather than an aspirational one.
We are not trying to replace the security profession. We are trying to give it better tools — and to deploy those tools in a way that is worthy of the trust that enterprise clients place in their security partners.
The journey has only just started.
We do not believe we have finished building Arcseer. We believe we have started. Our research identifies new capabilities and possibilities with remarkable regularity, and we intend to pursue them rigorously.
The cybersecurity industry has always been driven by research — research undertaken by some of the most talented and intellectually committed individuals in technology. That will not change. What AI changes is the speed at which insights can be operationalised, the scale at which hypotheses can be tested, and the range of attack scenarios that can be explored in a single assessment. The practitioners who understand both domains — offensive security and AI capability — are among the most valuable people in the industry right now. We have built a team around that intersection.
The threat landscape will continue to evolve. Attacker tooling will continue to incorporate AI. The demands placed on defenders will continue to grow. We are building a platform designed to move with that landscape — not to describe it from a distance, but to stay ahead of it.
We believe AI's greatest impact on offensive security is not yet visible. We intend to be part of making it so.
Want to see what
we have built?
Request a Proof of Value engagement or speak with our team.