Can you use Gen AI to create your regulatory documentation?
We're witnessing a rapid rise in the use of generative AI (ChatGPT et al.) in the medical device regulatory space to produce and review documentation. So much so that there are probably already device files largely created by AI which are also being reviewed largely by AI (I have heard anecdotally of auditors openly using ChatGPT during audits).
That raises the question:
If all compliance documentation is created by a machine, and all compliance documentation is audited by a machine - has any compliance actually happened?
To answer, let's first assume we're in a world where AI is at least equivalent to human experts at performing every aspect of the process. We'll also assume that the gen AI tools have been validated for use in the QMS (per ISO 13485:2016 §4.1.6).
In that case, I would argue compliance has happened. The authoring AI has followed the relevant guidance and regulations as fully as possible, and the audit will have assessed the file against those regulations to ensure correctness and find any faults.
But we're not currently in that world - we're in a world where LLM-based gen AI hallucinates and is far from perfect, especially in the hands of anyone who isn't a skilled "prompt engineer".
LLM-generated files today typically contain:
Lack of context - processes and documentation which are entirely divorced from what the company actually does
Hallucinations - made-up standards, false references, confidently incorrect statements, etc.
Contradictions - LLMs produce a lot of content, and when it is generated across multiple chats it often ends up with conflicting stylistic and structural choices
Irrelevant information - processes which will be very hard to follow in practice (e.g. a design control process collecting 25 performance metrics rather than the 3-5 that really matter)
Taking the first three as the points relevant to compliance (the fourth is bad too, but more of a business problem):
The context challenge is arguably fixable through careful prompting and having all the relevant standards available to the AI at the time the documents are generated. I doubt this is happening much in the real world, though, due to a lack of understanding. And even with all the context in place, you're still vulnerable to the next two problems.
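To illustrate, here's a minimal sketch of what "having the context in place" might look like, assuming the OpenAI Python SDK; the file paths, model name and prompts are all hypothetical, and the output would still need full expert verification:

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical local excerpts of the standards and SOPs the document must align with
context_files = [
    "context/iso_13485_excerpts.txt",
    "context/company_design_sop.txt",
]
context = "\n\n".join(Path(p).read_text() for p in context_files)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {
            "role": "system",
            "content": (
                "You are drafting medical device QMS documentation. "
                "Use ONLY the standards excerpts and company SOPs provided below. "
                "If information is missing, say so rather than inventing it.\n\n"
                + context
            ),
        },
        {
            "role": "user",
            "content": (
                "Draft a first-pass outline of a software development plan "
                "for expert review, noting which excerpt each section relies on."
            ),
        },
    ],
)
print(response.choices[0].message.content)
```

Even a setup like this only narrows the context gap - it does nothing to guarantee the output is free of hallucinations or contradictions.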
Including hallucinations in a generated file risks falsely representing a device's performance or other characteristics and is arguably fraud.
Including contradictions puts passing an audit at risk, depending on your auditor's (or auditing AI's) ability to pick them up. No doubt they will cause confusion in any case.
For auditing bodies' use of AI to review documentation - it's much harder to reason about without seeing how it's actually being done behind closed doors. But where auditors are simply using ChatGPT, all the same arguments apply on the review side, and they are compounded further if the input documentation also contains the problems described above.
Use Gen AI with caution in regulatory processes
Don't get me wrong, there are ways that gen AI can be very useful in the regulatory process, but it needs to be approached with extreme caution and used for what LLMs are good at - text manipulation, not authoring content.
AI may appear to help with creating processes and records, but unless those processes are actually followed by staff, and the generated records are truthful and accurate representations of those processes being carried out, many of the above concerns will be present in the documentation. As the producer of that documentation, you have a legal responsibility to ensure its truthfulness.
Looking to other industries, it's already well documented that the use of gen AI in legal cases has led to the inadvertent inclusion of exactly these kinds of errors - hallucinations resulting in fake cases being cited - and to serious consequences for the lawyers involved.
In a parallel to the legal cases above, a single "natural or legal person" signs the declaration of conformity for medical devices - taking legal responsibility for the correctness and truthfulness of the file. AI cannot take on that responsibility.
Simply put - prompting ChatGPT to "write me a comprehensive and detailed software development plan", then repeating this for every document, will open you up to inadvertently committing fraud unless you understand and check everything produced, and then actually follow the plan you produced.
Our advice on using Gen AI
Do: use Gen AI to restructure existing text, ask it to search for references (insisting on verifiable links to sources), and use it to generate feedback and ideas - ALWAYS VERIFY (a minimal verification sketch follows below)
Don't: prompt "Generate me a complete MDF for my mental health tool"
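To make "ALWAYS VERIFY" slightly more concrete, here's a hypothetical sketch of one narrow verification step: cross-checking the standards cited in a generated draft against a curated list of standards your QMS actually references. The list, pattern and draft text are illustrative, and this catches only one class of hallucination - it is no substitute for expert review:

```python
import re

# Hypothetical curated list of the standards your QMS actually references
KNOWN_STANDARDS = {
    "ISO 13485:2016",
    "ISO 14971:2019",
    "IEC 62304:2006",
}

def flag_unverified_standards(generated_text: str) -> set[str]:
    """Return standard citations that are not on the curated list.

    Anything flagged needs a human to confirm the standard exists,
    is current, and actually says what the document claims.
    """
    cited = set(re.findall(r"\b(?:ISO|IEC|EN) ?\d{4,5}(?::\d{4})?", generated_text))
    return cited - KNOWN_STANDARDS

# Illustrative draft containing one real and one hallucinated standard
draft = "Risk management follows ISO 14971:2019 and testing follows ISO 99999:2021."
for unknown in sorted(flag_unverified_standards(draft)):
    print(f"VERIFY MANUALLY: {unknown}")
```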
We may one day end up in a world where AI is able to do this well - but we sure aren't anywhere near there yet.
Hardian Health is a clinical digital consultancy focused on leveraging technology into healthcare markets through clinical strategy, scientific validation, regulation, health economics and intellectual property.