As organizations develop AI applications across their business units, they need to identify relevant regulations and frameworks quickly. With generative AI, specifically large language models (LLMs), CDOs, Analytics Leads, and other data professionals can more easily identify common patterns and differences across varied texts to inform their AI governance strategies.
- Accelerated Search: Quickly explore a company-relevant global library of regulations, frameworks, white papers, directives, and best practices.
- Simplified: With an LLM, results are returned as a generated answer with references, rather than keyword highlights in a document.
- Transparent: Full access to sources, with the ability to dive deeper.
How It Works: Architecture
The organization identifies relevant documents issued by governments, specialized institutions, or international organizations, mixing regulations, vision papers, and voluntary frameworks. A flow is then built in Dataiku to read the documents and split them into meaningful blocks of a few hundred words. Word blocks are encoded using a sentence embedding transformer.
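The split step can be sketched in Python. This is a minimal illustration, not the actual Dataiku flow: the block size and overlap are arbitrary choices made here for the example, and in practice each resulting block would then be passed to a sentence-embedding model.

```python
def split_into_blocks(text, max_words=300, overlap=50):
    """Split a document into overlapping blocks of a few hundred words.

    Overlap keeps a sentence that straddles a boundary retrievable
    from at least one block.
    """
    words = text.split()
    step = max_words - overlap
    blocks = []
    for start in range(0, len(words), step):
        blocks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return blocks
```

Each block, rather than the whole document, is then the unit that gets encoded and later retrieved.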
An application lets users enter questions, which are evaluated by the AI model. When a question is raised about a specific regulation or policy, the app uses the relevant documentation to generate an answer. A user might ask questions such as:
- Our business unit dedicated to personal finance wants to enhance its credit scoring approach with machine learning. What does the EU AI Act say about credit scoring?
- In which regulations and frameworks is credit scoring mentioned?
- What are the recommendations on explainability formulated in Canada's Directive on Automated Decision-Making?
- How is Personal Data defined by the GDPR and by the California Consumer Privacy Act of 2018?
The application encodes the question and identifies the 5 to 10 encoded word blocks that match it most closely. The relevant sections of the documents are sent to an LLM along with a prompt. The LLM generates an answer in the application based on the word blocks provided, and sources are reordered and displayed based on their similarity with the answer. The references to the source documentation are made available to the user for further review as needed.
This use case involves publicly available documents and provides a summary report for end users. Outputs should be regularly reviewed for consistency and correctness by subject matter experts in the field covered, in this case data and AI governance.
Additionally, end users should be aware that the summaries or answers provided by a model are not guaranteed to be correct, and they should be encouraged to use their best judgment when acting upon information returned by the model. In particular, this use case should not be treated as a source of legal advice.
Finally, the organization should have an overarching Responsible AI policy to enforce consistent practices across AI projects.