Validating software and AI tools in your QMS
Computer software validation has never mattered more than it does in the age of AI. Software now sits at the centre of most medical device quality systems. A medical device manufacturer may use software to track requirements, control documents, record complaints, manage CAPAs, train staff and maintain technical documentation.
A useful way to think about this is to picture a factory – say, one that produces rubber ducks. If your goal is to manufacture nice, consistently yellow rubber ducks, you need machinery that is high-quality and well-maintained. You can churn out a single duck with a bad machine, but you can't consistently produce them to a high standard if your machines are poor or poorly maintained. Validating your production process, which inherently means verifying that your machinery is working correctly, is the key to ensuring consistent and reliable quality output.
The same applies to software. You can build a good prototype without much process. But you can't deliver updates to a consistent quality standard over years of a software lifecycle without a set of processes governing how that software is built and maintained. This is what we call a QMS – a Quality Management System.
Shelves of folders are gone. Particularly in the current world of hybrid and remote working, everything from complaint records to design files is handled digitally by increasingly sophisticated software, often in the cloud. The most recent addition to this landscape is AI, which is being used to draft, summarise and review documents.
As the toolkit of software that helps us manage quality grows, it invites a different question: who is watching the watchers? Put in a less literary way, how do we make sure that the software we use to manage quality is itself of sufficient quality? It is the kind of question that is likely only asked with a straight face in medical device regulatory compliance, but that doesn’t diminish its seriousness.
Where do we begin? This question has two sides to it: first, classic software, and second, AI. In a previous Hardian Health blog, Ben Howes asked whether compliance has really happened if regulatory documentation is created by AI and then reviewed by AI. The concern was clear: AI-generated documentation can lack context, invent references, create contradictions, or describe processes that the company does not actually follow.
This article looks at the next practical question:
If you use software, including AI, in your QMS or regulatory workflow, what do you need to validate?
When does software need validation?
Software usually needs to be assessed for validation when it is used as part of production, testing, labelling, complaint handling, the quality management system, or regulatory documentation.
Translated from the legalese, a more intuitive way to think about it is: if this piece of software fails, would it affect product quality, patient safety, data integrity, or regulatory compliance?
Another helpful question to ask yourself is: does the software automate, support, or record a QMS or regulatory activity?
If the answer is yes to either question, the software should at least be assessed for validation, with the depth of validation driven by risk and intended use.
More formally, this is how the International Organization for Standardization (ISO) puts it in its publication PD ISO/TR 80002-2:2017 Medical device software - Validation of software for medical device quality systems, §5.2.2:
a) Could the failure or latent flaws of the software affect the safety or quality of medical devices?
b) Does the software automate or execute an activity required by regulatory requirements (in particular, the requirements for medical device quality management systems)? Examples may include capturing electronic signatures and/or records, maintaining product traceability, performing and capturing test results, maintaining data logs such as CAPA, non-conformances, complaints, calibrations, etc.
A “yes” answer to any of the questions identifies software that is required to be validated and is within scope of this document.
CSV starts with the intended use
Faced with an open-ended question like either of the two above, I always feel there should be an exhaustive list of exactly which software must be validated.
Unfortunately, compiling such a list is not feasible, and not only because new software keeps emerging every day. To better understand why we have to rely on the abstract test instead of concrete guidance, it helps to start from the definition. Computer software validation, or CSV, is the documented process of showing that software performs as intended in its actual use environment. The important phrase is as intended.
Software is never validated in the abstract, only against a concrete use case. Let’s link this point back to the first section with a simple example: suppose you have Excel licences. Do you need to validate Excel? This question cannot be answered in the abstract.
It would take considerable creativity to argue that an Excel failure could affect product quality, patient safety, data integrity, or regulatory compliance if the only intended use of this software in your company is to keep track of lunch orders. On the other hand, it is not hard to see how a sneaky bug in an Excel spreadsheet used to trace product and system requirements may directly affect both product quality and regulatory compliance.
The same is true for AI.
AI used to improve the grammar of an internal draft may be low risk. An AI agent used to review whether a submission is ready is a very different matter.
That is why the first CSV question should always be: What are we using this software to do?
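To make the point concrete, here is a minimal sketch of the two screening questions applied per intended use rather than per product. The class, field and function names are illustrative assumptions, and the yes/no answers are judgements an assessor would record, not something the code can work out for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolUse:
    """One tool in one concrete use; all names here are illustrative."""
    name: str
    intended_use: str
    failure_could_affect_safety_or_quality: bool  # screening question a)
    automates_regulated_activity: bool            # screening question b)


def requires_validation(use: ToolUse) -> bool:
    # A "yes" to either question puts this use of the software in scope.
    return (use.failure_could_affect_safety_or_quality
            or use.automates_regulated_activity)


# Same product, two intended uses, two different answers.
lunch = ToolUse("Excel", "track lunch orders", False, False)
trace = ToolUse("Excel", "trace product and system requirements", True, True)

print(requires_validation(lunch))  # False
print(requires_validation(trace))  # True
```

The useful design choice is that the record is keyed on the intended use, so the same product can legitimately appear twice in your software inventory with different outcomes.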
What software is usually in scope?
While a complete list of software that needs to be validated doesn’t exist, there is still a little ‘cheat sheet’ that may help you to make the right call when uncertain. If you have applied the ‘two-question test’ and are uncertain about the results, take a look at the table below. The tools in it almost always require validation.
| Software | Why it matters |
|---|---|
| eQMS (Hardian Flow, Greenlight Guru, Veeva Vault QMS) | Controls QMS processes and quality records |
| eDMS (electronic document management system) (Hardian Flow, Qualio Document Management) | Controls document lifecycle and approvals |
| CAPA / complaints tools | Maintains regulated quality records |
| Training systems | Demonstrates training and competence |
| Electronic signature tools (eSign, DocuSign) | Records approval, responsibility, authorship, or review |
| Test tools (Mocha, Playwright, pytest, etc.) | Captures or processes verification and validation evidence |
| Code repositories (GitHub, GitLab, Hugging Face) | Controls software source code and development history |
| Data repositories (Hugging Face, S3, Blob Storage, DVC) | Controls data, evidence, and model artifacts, particularly for AI model training and validation |
| AI tools, even public general-purpose ones such as Anthropic Claude, OpenAI ChatGPT, Google Gemini, Microsoft Copilot | Drafts, reviews, compares, or checks QMS and regulatory documents |
However, even if you know that a certain piece of software must be validated, this is still not a complete answer. The point of a risk-based approach, the standard methodology in quality management, is that not every tool needs the same level of validation.
A tool used to support a low-risk administrative activity may only need a simple assessment and procedural control. A tool used to approve controlled records, support regulatory submissions, or maintain traceability will need stronger evidence.
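As a rough sketch of that proportionality, the snippet below maps a tool's activities to one of two tiers. The two tiers and the three high-impact activities come from the paragraph above; the function name, the exact strings and the idea of a simple lookup are assumptions for illustration, and a real risk assessment would be more nuanced than a set intersection.

```python
from typing import Iterable

# Activities named above as needing stronger evidence.
HIGH_IMPACT_ACTIVITIES = frozenset({
    "approve controlled records",
    "support regulatory submissions",
    "maintain traceability",
})


def validation_depth(activities: Iterable[str]) -> str:
    """Return an illustrative validation tier for a tool's activities."""
    # One high-impact activity is enough to raise the tier for the whole tool.
    if HIGH_IMPACT_ACTIVITIES.intersection(activities):
        return "stronger evidence"
    return "simple assessment and procedural control"


print(validation_depth(["store meeting notes"]))
print(validation_depth(["store templates", "maintain traceability"]))
```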
What should a CSV package include?
A reasonable question: if a regulator doesn’t prescribe exactly which buttons to click and which tests to run, how do I know what needs to be tested and evidenced for compliance?
Let’s introduce a second ‘two-question test’:
1. What is the software intended to do?
2. What evidence shows it is fit for that use?
The mindset companies need for CSV is not maximum paperwork, but justified evidence: understand the intended use, then work backwards from it to decide what evidence would be sufficient proof that the software is fit for that use.
For an eQMS, the package may include:
intended use assessment;
scope and system description;
risk assessment;
validation plan;
user requirements/software requirements;
configuration specification;
executed test evidence;
traceability;
issues/deviations found during testing;
validation report;
release approval;
change control and revalidation approach;
periodic review;
retirement/data retention plan.
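If you track the package in a script or spreadsheet export, a gap check can be as simple as the sketch below. The item names mirror the eQMS list above; the function name and the `drafted` set are hypothetical, and the list is a starting point for one kind of system, not a required template.

```python
# Items from the eQMS example above, in the same order.
EQMS_PACKAGE_ITEMS = (
    "intended use assessment",
    "scope and system description",
    "risk assessment",
    "validation plan",
    "user requirements/software requirements",
    "configuration specification",
    "executed test evidence",
    "traceability",
    "issues/deviations found during testing",
    "validation report",
    "release approval",
    "change control and revalidation approach",
    "periodic review",
    "retirement/data retention plan",
)


def missing_items(available):
    """Return expected package items that are not yet available, in list order."""
    available = set(available)
    return [item for item in EQMS_PACKAGE_ITEMS if item not in available]


# Hypothetical state of a package that is still being written.
drafted = {"intended use assessment", "risk assessment", "validation plan"}
for item in missing_items(drafted):
    print("missing:", item)
```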
There is no universal CSV pack that works for every system. The right evidence depends on what the software does, where it sits in the QMS, and what could happen if it fails. The following references are useful when deciding how much evidence is enough:
ISO/TR 80002-2:2017, the technical report on validation of software for medical device quality systems.
FDA’s Computer Software Assurance guidance. It applies to computers and automated data processing systems used as part of medical device production or the quality management system. It describes a risk-based approach to establish confidence in automation used for production or QMS activities (U.S. Food and Drug Administration).
ISO 13485:2016. The current FDA QMSR also incorporates this standard, so if you are familiar with it, it should be a good reference.
21 CFR Part 11 is also relevant for electronic records and signatures. It states that electronic records and electronic signatures are considered trustworthy, reliable, and generally equivalent to paper records and handwritten signatures when the requirements of Part 11 are met. (21 CFR Part 11)
Example: eQMS validation
An eQMS is one of the clearest examples of software that needs validation. It supports quality management of the actual product, from document control to electronic signatures, which makes it one of the most sensitive pieces of software in a healthtech company. As such, it is also a good example of CSV.
A good eQMS validation should show that the system is configured correctly and supports the company’s actual procedures.
For example:
Can only authorised users approve documents?
Are CAPA workflows aligned with the procedure?
Are audit trails enabled where needed?
Can records be retrieved for audit or inspection?
In this example, the question we are trying to answer is: “Does your configured use of the system fit your QMS?”
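To show what an executed test case for the first question might look like, here is a minimal sketch. `ToyEqms` is a hypothetical in-memory stand-in, not the API of any real eQMS, and the user names and document ID are made up. In a real validation the same checks would be run against your configured system, with the results kept as evidence.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Document:
    doc_id: str
    status: str = "draft"
    audit_trail: list = field(default_factory=list)


class ToyEqms:
    """Hypothetical stand-in for a configured eQMS; not a real product API."""

    def __init__(self, approvers):
        self.approvers = set(approvers)

    def approve(self, doc, user):
        allowed = user in self.approvers
        # Record every attempt, allowed or not, in the audit trail.
        doc.audit_trail.append({
            "user": user,
            "action": "approve",
            "allowed": allowed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not allowed:
            raise PermissionError(f"{user} may not approve {doc.doc_id}")
        doc.status = "approved"


def test_only_authorised_users_can_approve():
    eqms = ToyEqms(approvers={"qa.manager"})
    doc = Document("SOP-001")
    try:
        eqms.approve(doc, "intern")
    except PermissionError:
        pass
    else:
        raise AssertionError("unauthorised approval was accepted")
    assert doc.status == "draft"        # the rejected attempt changed nothing
    eqms.approve(doc, "qa.manager")
    assert doc.status == "approved"
    # Both attempts are visible in the audit trail, in order.
    assert [e["allowed"] for e in doc.audit_trail] == [False, True]


test_only_authorised_users_can_approve()
```

Note that the test checks the negative case first. Showing that an unauthorised user is refused is usually more informative than showing that an authorised one succeeds.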
Where AI changes the conversation
AI makes CSV more important but also trickier. The risk with AI is that it can feel like a shortcut: draft a procedure in seconds, review a technical file quickly, identify inconsistencies across documents faster than a human reviewer.
On the other hand, the current literature and guidelines on the use of AI consistently emphasise that control should not be traded for speed.
Because AI can hallucinate, be ‘confidently wrong’ and sycophantically try to please the user, it has several characteristic failure modes, such as creating policy or evidence records that perfectly match compliance requirements but don’t match how the company actually operates. We at Hardian Health keep track of the recurring failure modes that we see across the industry, and it is clear that AI requires strong guardrails: hallucination and sycophancy are dangerous traits for compliance, a trade that thrives on precision and honest acknowledgement of gaps and shortcomings.
As an ‘honest and truthful’ AI is yet to be invented, the best we can do with real-life deployments is ask: “What is the AI allowed to do, and what is it not allowed to do?”
A low-risk intended use might be:
“AI may be used to improve the readability of draft text, subject to human review.”
A higher-risk intended use might be:
“AI is used to support review of QMS and regulatory documents by identifying inconsistencies, missing sections, unsupported references, and traceability gaps.”
That second use needs stronger controls.
For AI-supported QMS or regulatory work, the validation record should follow the same logic as other QMS software: intended use, risk-based analysis, testing performed, result for each test case, issues found, conclusion, and approval. What changes with AI is not the structure of the validation record, but the type of evidence needed. The evidence should challenge whether the AI output is accurate, complete, consistent, and appropriately reviewed. It should also make clear where human judgment remains required.
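One way to challenge an AI reviewer is to seed test documents with known defects and compare what it reports against what was planted. The sketch below assumes that approach; it is an illustration, not a prescribed method. The names are hypothetical, the `reported` set is hard-coded where real AI output would go, and treating any unexpected finding as a failed case is a deliberately strict assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChallengeResult:
    case_id: str
    missed: frozenset      # seeded defects the AI did not report
    unexpected: frozenset  # reported findings that were not seeded

    @property
    def passed(self) -> bool:
        # Strict by assumption: any miss or unexplained finding fails the case.
        return not self.missed and not self.unexpected


def evaluate(case_id, seeded, reported):
    """Compare planted defects with the findings the AI reported."""
    seeded, reported = frozenset(seeded), frozenset(reported)
    return ChallengeResult(case_id, seeded - reported, reported - seeded)


result = evaluate(
    "TC-01",
    seeded={"missing risk section", "unsupported reference"},
    reported={"missing risk section", "invented clause number"},
)
print(result.passed)              # False
print(sorted(result.missed))      # ['unsupported reference']
print(sorted(result.unexpected))  # ['invented clause number']
```

An unexpected finding is not automatically a hallucination. It may be a genuine issue nobody planted, which is exactly why a human has to look at it before the test case is closed.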
The most important control – mandatory until Artificial General Intelligence (AGI) arrives – is simple: AI can support the reviewer, but it should not become the reviewer of record.
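In a system that records sign-off, that control can be enforced rather than just written into a procedure. The following is a minimal sketch under stated assumptions: the record structure, the `ai-assistant` account name and the `sign_off` function are all hypothetical, and a real eQMS would enforce this through its own roles and permissions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

AI_ACCOUNTS = frozenset({"ai-assistant"})  # hypothetical service account name


@dataclass
class ReviewRecord:
    document_id: str
    ai_findings: List[str] = field(default_factory=list)  # supporting input only
    reviewer_of_record: Optional[str] = None


def sign_off(record: ReviewRecord, reviewer: str) -> ReviewRecord:
    """Record the accountable reviewer; AI accounts are rejected."""
    if reviewer in AI_ACCOUNTS:
        raise ValueError("AI may support the review but cannot be reviewer of record")
    record.reviewer_of_record = reviewer
    return record


record = ReviewRecord("SOP-001", ["possible traceability gap in section 4"])
sign_off(record, "qa.manager")
print(record.reviewer_of_record)  # qa.manager
```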
And if you apply ISO 27001 or ISO 42001
Do not forget that validation of tools should be on your Statement(s) of Applicability (SoAs) and Risk Treatment Plans (RTPs) for information security management under ISO 27001 and, if applicable, for AI management under ISO 42001.
Conclusion
Computer software validation is no longer only about traditional eQMS platforms. Companies that produce SaMD (including AI) products rely on repositories such as GitHub and Hugging Face, continuous integration and deployment tools, testing tools and plenty of other software. On top of that, AI is rapidly taking over the compliance and quality management workflows.
Still, the principle remains the same:
Define the intended use. Assess the risk. Validate proportionately. Keep evidence. Control change. Review periodically.
Used well, software can make the QMS faster, cleaner, and more consistent. However, introducing quality management software without validation can create hidden compliance risk.
CSV is how companies show that the software they rely on is fit for purpose.
You could do it yourself: the process is reasonably well documented, but successful completion requires deep knowledge of the regulatory thinking around what risks would be deemed acceptable, what validation evidence is appropriate, and the other elements of the compliance landscape.
Alternatively, at Hardian Health, we provide a pre-built eQMS/eDMS system, prepare a validation plan, and report on the performed testing - bringing you to compliance fast and with little overhead.