AI Governance & Risk 33% of exam
A third of the AAIA exam lives here. Domain 1 is where the auditor decides whether an organization has the structures, policies, risk discipline, data governance, and ethical guardrails to deploy AI responsibly, long before a single model is tested. Master the governance and risk vocabulary and you have already de-risked a third of the exam.
AI Models, Considerations & Requirements D1 · A
Before auditing any AI system you must classify what kind of model it is, judge whether AI was the right tool at all, and confirm the organization defined what the system needs to run safely, because each choice drives a different control set.
1.1 Recognizing AI model types and their audit implications
AI is an umbrella over very different technologies: deterministic rule-based / expert systems; classical ML (supervised learning from labelled data, unsupervised learning that finds structure in unlabelled data); deep learning neural networks that learn features automatically but become opaque; generative AI / LLMs that produce new text, code, or images; foundation models (large pre-trained models adapted to many tasks); and agentic AI that plans and takes actions with little human intervention. You are not asked to build them; you are asked to know the risk profile of each.
Auditor's angle: the model type sets the inherent risk and therefore the control expectations. The risk is mis-scoping the audit, for example by applying lightweight controls to a deep-learning or agentic system. Evidence to look for: a model card or design document that names the architecture, the training data, and the intended use.
- Rule-based = explainable, brittle; verify rules match policy and are version-controlled.
- Deep learning & LLMs = opaque; expect explainability tooling and human review of high-impact outputs.
- Foundation / vendor models = inherited (third-party) bias, data, and security posture.
- Agentic AI = highest blast-radius; expect action boundaries, approval gates, kill switches, full action logs.
| Model type | Dominant risk | Control the auditor expects |
|---|---|---|
| Rule-based / expert system | Brittleness, stale logic | Version control, rule-to-policy traceability |
| Classical ML (supervised) | Label quality, bias, drift | Data provenance, train/test split, monitoring |
| Deep learning | Opacity / black box | Explainability tooling, human-in-the-loop |
| Generative AI / LLM | Hallucination, leakage, IP | Grounding, output review, acceptable-use policy |
| Foundation model | Third-party / supply chain | Vendor due diligence, audit rights |
| Agentic AI | Autonomy / blast radius | Approval gates, kill switch, action logging |
Classifying an unlabeled "AI chatbot"
You are scoping an audit of a new customer-service "AI chatbot." Management's project brief simply calls it "an AI assistant." You need to classify it before you can set the control expectations.
- Pull the architecture documentation
Request the design document or model card and confirm whether it is a rule-based decision tree, a retrieval system, or an LLM. The brief's label tells you nothing about risk.
- Identify the underlying model and its source
Confirm it is a third-party LLM accessed via API, which immediately flags both generative risk (hallucination) and third-party risk (you inherit the vendor's posture).
- Check whether it can take actions
Ask whether the bot only answers or can also issue refunds, change accounts, or call other systems. If it transacts, it is agentic and the blast radius changes the entire audit scope.
- Map the type to expected controls
For an LLM that can transact, expect grounding/RAG, output guardrails, an acceptable-use policy, approval gates for actions, and full logging.
- Confirm controls exist or record the gap
Test for each expected control. Document any missing control as a finding tied to the model type, not a generic "improve documentation" note.
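The last two steps (map the type to expected controls, then record the gap) can be sketched as a lookup and a set difference. This is a minimal illustration, assuming the control names from the table in 1.1; the dictionary keys, the `missing_controls` helper, and the sample observations are hypothetical, not part of any standard.

```python
# Illustrative mapping taken from the model-type table in 1.1.
# Keys and the helper name are hypothetical labels chosen for this sketch.
EXPECTED_CONTROLS = {
    "rule_based": {"version control", "rule-to-policy traceability"},
    "classical_ml": {"data provenance", "train/test split", "monitoring"},
    "deep_learning": {"explainability tooling", "human-in-the-loop"},
    "generative_llm": {"grounding", "output review", "acceptable-use policy"},
    "foundation_model": {"vendor due diligence", "audit rights"},
    "agentic": {"approval gates", "kill switch", "action logging"},
}


def missing_controls(model_types, observed):
    """Return, per model type, the expected controls that were not observed."""
    gaps = {}
    for model_type in model_types:
        gap = EXPECTED_CONTROLS[model_type] - set(observed)
        if gap:
            gaps[model_type] = sorted(gap)
    return gaps


# A vendor LLM that can transact falls under three types at once.
chatbot_types = ["generative_llm", "foundation_model", "agentic"]
observed = {
    "grounding",
    "acceptable-use policy",
    "action logging",
    "vendor due diligence",
}
gaps = missing_controls(chatbot_types, observed)
for model_type, controls in gaps.items():
    print(f"{model_type}: missing {', '.join(controls)}")
```

Each entry in `gaps` is a candidate finding already tied to a model type, which is the framing the final step asks for.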
1.2 Is AI even appropriate? The fit decision
A surprising number of scenarios test whether AI should be used at all. AI is a poor fit when the problem is fully deterministic (use rules), when decisions must be perfectly explainable by law, when data is too sparse or biased to learn from, or when an error is catastrophic and unrecoverable. The classic red flag is AI adopted for hype: no defined problem, no success metric, no fallback.
Auditor's angle: the risk is solution-first thinking that introduces opacity and bias to solve a problem a simpler method handled safely. The control is a documented suitability/fit assessment; evidence is a problem statement, a success metric, a considered-alternatives section, and a fallback plan.
- Deterministic, regulated, or sparse-data problems often argue against ML.
- Look for a defined success metric and a non-AI fallback.
- "Because competitors use AI" is not a business justification.
An LLM for regulatory calculations
A finance team proposes using an LLM to compute regulatory capital figures because "AI is faster." The calculation is a fixed formula defined by the regulator.
- State the problem precisely
Document that the task is a deterministic, regulator-defined formula with one correct answer for any input.
- Test the fit against AI's strengths
Note that LLMs are probabilistic and can hallucinate, which is exactly the wrong property for a calculation that must be exact and reproducible.
- Check explainability requirements
Confirm the regulator expects a fully traceable calculation; an LLM cannot guarantee the same output twice or show its arithmetic reliably.
- Compare to the simpler alternative
A coded, tested formula (rule-based) is fully explainable, reproducible, and cheaper to validate.
- Recommend the proportionate solution
Advise that AI is not appropriate here; a deterministic implementation with unit tests meets the need with far less risk.
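To make "a deterministic implementation with unit tests" concrete, here is a minimal sketch. The formula (capital divided by risk-weighted assets, expressed as a percentage) is only a placeholder assumption; the scenario does not state the regulator's actual formula, and the function name `capital_ratio` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def capital_ratio(capital: Decimal, risk_weighted_assets: Decimal) -> Decimal:
    """Placeholder formula: ratio in percent, rounded to two decimals.

    Decimal arithmetic avoids binary floating-point surprises, so the
    same inputs always give the same, traceable result.
    """
    if risk_weighted_assets <= 0:
        raise ValueError("risk_weighted_assets must be positive")
    ratio = capital / risk_weighted_assets * Decimal("100")
    return ratio.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Reproducibility check: same input, same output, every time.
first = capital_ratio(Decimal("120"), Decimal("1000"))
second = capital_ratio(Decimal("120"), Decimal("1000"))
assert first == second == Decimal("12.00")
```

A handful of assertions like the one above is the whole validation burden for a fixed formula, which is why the rule-based option is cheaper to evidence than an LLM.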
1.3 Business case & strategic alignment
Every AI initiative should trace to a documented business objective, a measurable benefit, an accountable owner, and sign-off at the right level of authority. The cost-benefit analysis must include risk and compliance cost, not just expected savings. AI deployed in a silo outside the strategy and outside governance is shadow AI, a classic finding.
Auditor's angle: the risk is unmanaged spend and exposure on initiatives with no owner and no measure of success. The control is a governed business-case and approval process; evidence is an approved business case, a benefits-realization metric, and minutes of the approval authority.
- Trace each initiative to enterprise strategy and a named sponsor.
- Cost-benefit must price in risk, compliance, and ongoing monitoring.
- No business case + no owner = shadow AI.
The pilot that became production
A marketing manager spun up a generative-AI tool as a "quick pilot." A year later it drafts customer emails at scale, with no business case, owner, or governance entry. You discover it during fieldwork.
- Confirm production status
Establish that the tool is no longer a pilot: it touches real customers at scale, so full governance should apply.
- Search for the business case and approval
Request the approved business case and the sign-off authority. Their absence confirms shadow AI.
- Check the AI inventory
Confirm the tool is not in the central AI registry, meaning it is invisible to risk and oversight functions.
- Trace strategic alignment and benefit metric
Determine whether anyone measures benefit or owns outcomes; document that no metric or owner exists.
- Recommend formalization, not just removal
Advise registering the system, assigning an owner, building a retrospective business case, and routing it through the approval gate, or formally decommissioning it.
1.4 Build vs. buy & third-party / vendor AI risk
Buying or using a vendor model does not transfer accountability; the deploying organization remains responsible to its customers and regulators. The auditor checks vendor due diligence, contractual rights (audit rights, model documentation, data-use restrictions, indemnities), evidence of the vendor's own controls (independent attestations), and an exit/continuity plan.
Auditor's angle: the risk is "the vendor handles compliance" complacency that leaves AI-specific risks (hallucination, bias, leakage) uncontrolled. The control is a vendor risk-management process; evidence is a completed due-diligence file, the executed contract clauses, attestations, and an exit plan.
A vendor's SOC 2 covers the vendor's operational controls; it does not transfer accountability for your outcomes or cover AI-specific risks. Accountability always stays with the deploying organization.
The vendor that "handles everything"
An insurer deploys a third-party generative-AI tool to draft claims correspondence. Management says, "The vendor is SOC 2 certified and contractually responsible for compliance, so no internal review is needed."
- Read the SOC 2 scope
Confirm what the report actually covers: typically security and availability of the vendor's platform, not bias, hallucination, or your regulatory obligations.
- Examine the contract clauses
Check for audit rights, model documentation, data-use restrictions, and indemnities. A generic "vendor is responsible" clause does not move legal accountability.
- Identify residual AI-specific risks
List the risks the SOC 2 leaves open: hallucinated claim language, data leakage of policyholder PII, biased tone or decisions.
- Test for internal controls
Verify whether the insurer reviews outputs, restricts inputs, and has an inventory entry and named owner. Document gaps.
- Recommend an internal control layer
Advise an internal vendor risk assessment, output-review controls, data-use limits, and accountability assignment: a proportionate response, not a ban.
1.5 Requirements: data, compute, talent, explainability
Before deployment the organization must define what the system needs: sufficient, representative, lawful data; adequate compute; the talent to operate and challenge the model; and the level of explainability the use case demands. A high-stakes decision (credit, hiring, healthcare) needs far more explainability than a content recommendation.
Auditor's angle: the risk is a model put into a high-impact role without the explainability or human expertise to govern it. The control is a documented requirements specification tied to the use-case impact; evidence is data-sufficiency analysis, a staffing/skills plan, and a stated explainability threshold.
- Explainability requirement should scale with decision impact and legal exposure.
- "Talent to challenge" means second-line/MRM capability, not just builders.
- Data requirement includes representativeness and lawful basis, not just volume.
A black-box model for loan decisions
A lender wants to use a complex neural network for consumer credit decisions because it scores slightly higher in accuracy than a simpler model. Applicants must, by law, receive a reason for denial.
- Identify the legal explainability requirement
Confirm adverse-action / right-to-explanation rules require a meaningful reason for each denial.
- Assess the chosen model against it
Note the neural network is opaque and cannot natively produce a clear, individual reason for a decision.
- Evaluate compensating tooling
Check whether explainability tooling can produce reliable, defensible reason codes, and whether the lender has the talent to validate them.
- Weigh the accuracy gain against the requirement
Document that a marginal accuracy gain does not justify failing a binding explainability requirement.
- Recommend the requirement-driven choice
Advise either a more interpretable model or proven, validated explainability with human review, so that every denial carries a lawful reason.
AI Governance & Program Management D1 · B
Governance is the system of accountability, oversight, and decision rights over AI; the exam wants you to know who is responsible, what artifacts prove it, and how AI governance plugs into the enterprise governance that already exists.
2.1 Governance structures & operating model
An AI governance operating model defines the bodies, decision rights, and escalation paths for AI. It should extend existing enterprise governance (ERM, IT, data governance) rather than create a disconnected parallel regime. Typical structures include an AI governance/ethics committee, defined approval gates, and reporting lines into the board.
Auditor's angle: the risk is fragmented or duplicated governance where AI decisions bypass enterprise risk processes. The control is a documented operating model with clear decision rights; evidence is a committee charter, an org chart, and an escalation/approval workflow.
- AI governance should integrate with, not duplicate, ERM and IT governance.
- Decision rights and escalation paths must be explicit.
- The board needs a reporting line for AI risk.
Two committees, no clarity
An organization has an IT steering committee and a new "AI council," but neither charter says who approves a high-risk AI use case. A risky use case was deployed because each body assumed the other had approved it.
- Collect both charters
Read the IT steering and AI council charters to see how each defines its authority over AI.
- Map decision rights to use-case types
Build a simple matrix of who approves what; find the gap where high-risk AI approval is unassigned.
- Trace the deployed use case
Confirm no body formally approved it; each assumed the other did.
- Identify the control failure
Document the root cause as ambiguous decision rights, not individual negligence.
- Recommend an integrated operating model
Advise a single, clear approval path with explicit decision rights and escalation, integrated with existing governance.
2.2 Roles, accountability & the three lines of defense
Clear roles make accountability auditable: the board/senior leadership set risk appetite and approve strategy and policy; an AI ethics/governance committee reviews high-risk use cases; model owners are accountable for a model's performance and outcomes; Model Risk Management (MRM) independently validates and challenges. Applied to the three lines of defense: 1st line (business/data science) owns and operates; 2nd line (risk, compliance, MRM) sets policy and challenges; 3rd line (internal audit) provides independent assurance.
Auditor's angle: the risk is collapsed lines: builders validating their own models, or internal audit building what it later audits. The control is segregation of the three lines; evidence is RACI charts, independent validation reports, and audit-charter independence.
Internal audit (3rd line) must never own, build, or validate the models it audits. Objectivity, not familiarity, is what assurance provides.
Who validates the model?
The data-science team builds models and performs the only pre-deployment validation. Management says this is efficient because "the builders understand the model best."
- Map who builds and who validates
Confirm the same first-line team does both, so no independent party challenges the model.
- Name the line-of-defense breach
Document that the second-line challenge function is missing: the first line is validating its own work.
- Assess the risk it creates
Identify that bias, overfitting, or flawed assumptions can pass unchallenged into production.
- Check for any compensating review
Look for a qualified independent reviewer or MRM sign-off; find none.
- Recommend independent validation
Advise that an independent MRM function or qualified reviewer validate models before deployment.
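The "map who builds and who validates" step lends itself to a simple segregation test over the model inventory. The sketch below assumes each record carries `built_by` and `validated_by` team names; those field names, the sample models, and the helper are hypothetical.

```python
# Hypothetical inventory extract: which team built and which validated each model.
models = [
    {"name": "credit-score-v2", "built_by": "data-science", "validated_by": "data-science"},
    {"name": "fraud-v5", "built_by": "data-science", "validated_by": "mrm"},
    {"name": "churn-v1", "built_by": "marketing-analytics", "validated_by": None},
]


def lacks_independent_validation(records):
    """Names of models with no validator, or validated by their own builders."""
    return [
        m["name"]
        for m in records
        if m["validated_by"] is None or m["validated_by"] == m["built_by"]
    ]


exceptions = lacks_independent_validation(models)
print(exceptions)
```

A match on team name is only a first screen; confirming real independence still means reading the validation report and checking reporting lines.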
2.3 AI policy
A foundational AI policy defines acceptable use, prohibited use, approval gates, roles, and the principles the organization commits to (fairness, transparency, human oversight). The auditor checks that the policy exists, is approved at the right level, is communicated, and is actually followed, not shelfware.
Auditor's angle: the risk is a policy that exists on paper but is unknown or unenforced. The control is an approved, communicated, monitored policy; evidence is the approval record, communication/training logs, and compliance-monitoring results.
- Policy must name acceptable, prohibited, and approval-gated uses.
- Approved at the right authority and demonstrably communicated.
- Test for adherence, not just existence.
The policy nobody read
An organization shows you a board-approved generative-AI policy that prohibits pasting confidential data into public tools. You suspect it is shelfware.
- Confirm approval and currency
Verify the policy was approved at board level and is the current version.
- Check communication
Request training-completion and acknowledgment records; find that few staff have seen it.
- Test operating effectiveness
Review DLP or proxy logs for confidential data sent to public AI tools, or sample user activity.
- Compare evidence to the policy
Find violations occurring despite the policy, confirming it is not operating.
- Recommend enforcement, not rewriting
Advise communication, training, and technical enforcement (DLP rules) with monitoring. The policy text is fine; adherence is the gap.
2.4 AI inventory / model registry
You cannot govern what you cannot see. A central inventory (registry) of AI models and use cases, with owner, purpose, data sources, risk tier, status, and last-review date, is one of the highest-value controls in the domain. Its absence means shadow AI is almost certainly present.
Auditor's angle: the risk is uncatalogued AI that no one risk-assesses or monitors. The control is a maintained, complete registry; evidence is the registry itself, its completeness/reconciliation, and last-review dates.
If a scenario says the organization "isn't sure how many AI systems are in use," the best first recommendation is almost always to establish/complete the AI inventory.
"How many AI systems do we have?"
The CISO cannot tell you how many AI systems are in production. You need to assess governance maturity and recommend a first step.
- Request the AI inventory
Ask for the central registry of models and use cases; discover it does not exist or is partial.
- Independently discover AI usage
Reconcile across procurement records, SaaS spend, network/proxy logs, and team interviews to surface unregistered AI.
- Quantify the gap
Compare discovered systems to the (missing) inventory to size the shadow-AI problem.
- Define required registry fields
Specify owner, purpose, data sources, model type, risk tier, status, and last-review date for each entry.
- Recommend establishing the inventory first
Advise building and maintaining the registry as the foundation for risk assessment, monitoring, and policy enforcement.
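A registry entry and the reconciliation step can be sketched in a few lines. The fields mirror the list in the scenario; the class name, the sample entry, and the discovered system names are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RegistryEntry:
    """One row of the AI inventory, using the fields named in the scenario."""
    name: str
    owner: str
    purpose: str
    data_sources: list
    model_type: str
    risk_tier: str
    status: str
    last_review: date


def unregistered(discovered, registry):
    """Systems found during discovery that have no registry entry."""
    registered = {entry.name.lower() for entry in registry}
    return sorted(s for s in discovered if s.lower() not in registered)


# Hypothetical data: one registered system, three discovered in fieldwork.
registry = [
    RegistryEntry(
        name="claims-triage",
        owner="J. Doe",
        purpose="Prioritize incoming claims",
        data_sources=["claims_db"],
        model_type="classical_ml",
        risk_tier="high",
        status="production",
        last_review=date(2024, 1, 15),
    )
]
discovered = {"claims-triage", "email-drafter", "support-bot"}
shadow_ai = unregistered(discovered, registry)
print(f"{len(shadow_ai)} unregistered system(s): {shadow_ai}")
```

The size of `shadow_ai` relative to the registry is one way to quantify the gap before recommending the inventory as the first step.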
2.5 Governance across the AI lifecycle
Governance is not a one-time gate at launch. It spans the lifecycle (ideation → design → development → validation → deployment → monitoring → retirement) with checkpoints and approvals at each stage. Many failures occur after launch (drift, scope creep), so post-deployment governance matters as much as pre-deployment.
Auditor's angle: the risk is "launch and forget," where a system passes one gate then runs unmonitored for years. The control is lifecycle gates with re-approval triggers; evidence is stage-gate sign-offs, periodic review dates, and retirement records.
- Each lifecycle stage needs a checkpoint and an owner.
- Material change (new data, new use) should trigger re-approval.
- Retirement must be governed too (model decommissioned, data disposed).
Approved once in 2023
A model was approved at launch in 2023, has since been repurposed for a new, higher-impact decision, and has had no review since. Governance points to the original approval as proof of oversight.
- Retrieve the original approval
Confirm the 2023 sign-off covered the model's original purpose only.
- Identify the material change
Establish that the use case changed to a higher-impact decision β a new risk profile.
- Check for a re-approval trigger
Determine whether governance requires re-assessment on material change; find no trigger fired.
- Look for periodic review
Confirm no scheduled review occurred since launch, so drift and scope creep went unchecked.
- Recommend lifecycle gates
Advise re-approval on material change plus periodic review dates tied to the model's risk tier.
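The two recommended triggers (material change and periodic review by risk tier) can be expressed as one check. This is a sketch under stated assumptions: the review intervals, the purpose strings, and the dates are hypothetical, since the scenario gives only the year of approval.

```python
from datetime import date, timedelta

# Hypothetical review intervals per risk tier, in days.
REVIEW_INTERVAL_DAYS = {"high": 180, "medium": 365, "low": 730}


def needs_reapproval(approved_purpose, current_purpose, last_review, risk_tier, today):
    """Return the reasons, if any, that a model must go back through the gate."""
    reasons = []
    if current_purpose != approved_purpose:
        reasons.append("material change: purpose differs from approval")
    if today - last_review > timedelta(days=REVIEW_INTERVAL_DAYS[risk_tier]):
        reasons.append("periodic review overdue")
    return reasons


# Illustrative values only; the source scenario names no specific purposes.
reasons = needs_reapproval(
    approved_purpose="marketing segmentation",
    current_purpose="credit eligibility",
    last_review=date(2023, 3, 1),
    risk_tier="high",
    today=date(2025, 3, 1),
)
print(reasons)
```

Either reason alone would invalidate reliance on the original sign-off; in the scenario both apply.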
AI Risk Management D1 · C
This is the analytical heart of Domain 1: know the AI-specific risk taxonomy, the identify → assess → treat → monitor cycle, how risk appetite and use-case tiering work, and how it all maps to the NIST AI RMF functions.
3.1 The AI-specific risk taxonomy
AI introduces risks beyond traditional IT: bias (discriminatory outcomes), hallucination (confident false output), drift (decay as the world changes), opacity (unexplainable decisions), automation bias (humans over-trusting the machine), security (poisoning, prompt injection, model theft), privacy (memorization, re-identification), IP/copyright, third-party (inherited vendor risk), and reputational/regulatory exposure.
Auditor's angle: the risk is using a generic IT risk taxonomy that misses AI-specific failure modes. The control is an AI-specific risk taxonomy applied per use case; evidence is a risk register with AI categories mapped to the system.
| Risk | Typical control | Evidence to request |
|---|---|---|
| Bias / unfair outcomes | Fairness testing; representative data | Bias test reports, data demographics |
| Hallucination | Grounding/RAG, output review | Review logs, guardrail config |
| Drift | Monitoring, retraining triggers | Dashboards, alert/retraining tickets |
| Opacity | Explainability tooling, human review | Explainability reports, override records |
| Privacy | Minimization, de-identification, DPIA | DPIA, data lineage, consent records |
| Third-party | Due diligence, audit rights | Due-diligence files, contracts |
| Security | Input validation, adversarial testing | Red-team results, access logs |
Building the risk register for a CV-screening tool
You are reviewing the risk assessment for an AI résumé-screening tool. The register lists only "availability" and "data breach", the same risks as any IT system.
- Walk the AI taxonomy against the use case
Step through bias, opacity, automation bias, drift, privacy, third-party, and regulatory risk for hiring specifically.
- Flag the missing top risk
Identify bias/discrimination as the dominant, unlisted risk for a hiring tool.
- Add automation bias and opacity
Note recruiters may defer to scores (automation bias) and cannot explain rejections (opacity), creating legal exposure.
- Map each risk to a control
For bias add fairness testing; for opacity add explainability and human review; for drift add monitoring.
- Re-rate the register
Document that the original register understated risk by ignoring AI-specific categories.
3.2 Risk identification & assessment
Risk management starts with identifying risks per use case (considering model type and context) and assessing them by likelihood and impact. The assessment must distinguish inherent risk (before controls) from residual risk (after controls) so the organization sees what its controls actually buy.
Auditor's angle: the risk is a vague "high/medium/low" with no rationale or inherent-vs-residual split. The control is a documented, repeatable assessment method; evidence is scored risk entries with stated likelihood, impact, controls, and residual rating.
- Assess per use case and context, not per technology in the abstract.
- Always separate inherent from residual risk.
- The method should be repeatable and documented.
Inherent vs. residual on a chatbot
A customer-facing LLM chatbot is rated "low risk" overall, with no explanation. You need to test whether the assessment is sound.
- Establish inherent risk
Rate the risk before controls: a public LLM can hallucinate and leak data, so it is inherently high, not low.
- Inventory the controls in place
List grounding, output filters, and human review; confirm which actually operate.
- Derive residual risk
Recompute risk after controls; only then can a "low/medium" rating be justified, and only if controls are effective.
- Test the controls' effectiveness
Sample chatbot transcripts and guardrail logs to confirm the controls reduce risk as claimed.
- Challenge the rating
If controls are weak or unproven, document that the "low risk" rating is unsupported.
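A worked example helps show what "inherent versus residual" looks like in a scored register. The sketch assumes a common 5x5 likelihood-times-impact matrix and assumes that effective controls reduce likelihood; the rating bands and the specific scores are hypothetical, not prescribed by the exam material.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Score on a 5x5 matrix: likelihood and impact each range from 1 to 5."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be between 1 and 5")
    return likelihood * impact


def rating(score: int) -> str:
    """Map a score to a band. Hypothetical bands; policy would define real ones."""
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"


# Inherent risk: the public LLM chatbot before any controls.
inherent = risk_score(likelihood=4, impact=4)

# Residual risk. Assumption: grounding, output filters, and human review
# were tested and found effective, lowering likelihood from 4 to 2.
residual = risk_score(likelihood=2, impact=4)

print(f"inherent={inherent} ({rating(inherent)}), residual={residual} ({rating(residual)})")
```

If the control testing fails, the likelihood reduction cannot be claimed and the residual rating stays at the inherent level, which is the basis for challenging an unexplained "low".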
3.3 Risk appetite, tolerance & use-case tiering
The board sets the risk appetite (how much AI risk the organization will accept) and tolerance (acceptable variation around it). Use cases are then risk-tiered by impact β a loan-eligibility model demands far more oversight than an internal tag-suggester. Tiering keeps controls proportionate; the exam loves "risk-based."
Auditor's angle: the risk is one-size-fits-all controls β either over-controlling trivial uses or under-controlling high-impact ones. The control is a defined risk-tiering scheme tied to appetite; evidence is a tiering rubric, each use case's assigned tier, and tier-specific control requirements.
- Appetite (board-set) drives the tiering thresholds.
- Tier should reflect impact on people, money, safety, and compliance.
- Controls scale with tier; proportionality is the goal.
Same controls for every model
An organization applies an identical lightweight control checklist to all AI, from a cafeteria-menu recommender to a model that approves medical claims. You assess proportionality.
- Check for a tiering scheme
Request the risk-tiering rubric; find a single flat checklist instead.
- Tier two contrasting use cases
Score the menu recommender as minimal impact and the medical-claims model as high impact on individuals.
- Compare controls to impact
Show the high-impact model is under-controlled while the trivial one is over-controlled.
- Tie tiers to risk appetite
Map thresholds to the board's appetite so high-impact uses get heavier oversight.
- Recommend a tiered control model
Advise tier-specific requirements: human oversight and fairness testing for high tiers, light touch for minimal.
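A tiering rubric can be as simple as scoring each impact dimension and letting the worst one set the tier. The sketch below uses the four dimensions listed in 3.3; the 0 to 3 scale, the cut-offs, the scores, and the medium and minimal control lists are assumptions made for illustration.

```python
# Controls per tier. "high" follows the recommendation in the scenario;
# "medium" and "minimal" entries are hypothetical placeholders.
TIER_CONTROLS = {
    "high": ["human oversight", "fairness testing"],
    "medium": ["periodic review", "performance monitoring"],
    "minimal": ["inventory entry"],
}


def assign_tier(people: int, money: int, safety: int, compliance: int) -> str:
    """Each dimension is scored 0 (none) to 3 (severe); the worst sets the tier."""
    worst = max(people, money, safety, compliance)
    if worst >= 3:
        return "high"
    if worst == 2:
        return "medium"
    return "minimal"


# Hypothetical scores for the two contrasting use cases.
menu_tier = assign_tier(people=0, money=1, safety=0, compliance=0)
claims_tier = assign_tier(people=3, money=3, safety=2, compliance=3)

print("menu recommender:", menu_tier, TIER_CONTROLS[menu_tier])
print("medical claims:", claims_tier, TIER_CONTROLS[claims_tier])
```

Taking the maximum rather than an average is a deliberate choice here: one severe impact should not be diluted by several trivial ones.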
3.4 Risk treatment & residual-risk acceptance
After assessment, risk is treated: accept, mitigate (controls), transfer (contract/insurance), or avoid (don't deploy). Whatever residual risk remains must be formally accepted by an accountable owner within the organization's appetite. A control reduces risk; risk accepted by no one is a finding.
Auditor's angle: the risk is residual risk drifting along with nobody owning it. The control is documented treatment decisions and a sign-off; evidence is a treatment plan and a dated residual-risk acceptance by a named accountable owner.
Know the difference between inherent and residual risk, and that residual risk must be formally accepted by an accountable owner within appetite. Risk that no one accepts is the implied finding.
Residual risk in limbo
A model's residual bias risk is rated "medium." The team mitigated what it could; the rest sits unaddressed with no acceptance and no further plan.
- Confirm the treatment applied
Verify which mitigations were implemented and what residual risk remains.
- Check against risk appetite
Compare the medium residual risk to the board's stated appetite to see if it is even acceptable.
- Look for a formal acceptance
Request a dated, signed residual-risk acceptance by an accountable owner; find none.
- Identify the accountability gap
Document that the residual risk is effectively owned by no one.
- Recommend a treatment decision and sign-off
Advise either further mitigation or a formal acceptance within appetite by the named owner.
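The acceptance test in this scenario reduces to two questions: is the residual rating within appetite, and has a named owner signed and dated the acceptance? A minimal sketch follows, assuming a three-level ordinal scale; the class, field names, and finding wording are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed ordinal scale for comparing a rating against appetite.
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class ResidualRisk:
    description: str
    rating: str
    accepted_by: Optional[str] = None
    accepted_on: Optional[date] = None


def acceptance_finding(risk: ResidualRisk, appetite: str) -> Optional[str]:
    """Return a finding, or None when the risk is within appetite and accepted."""
    if LEVELS[risk.rating] > LEVELS[appetite]:
        return "exceeds appetite: mitigate further or avoid"
    if risk.accepted_by is None or risk.accepted_on is None:
        return "within appetite but not formally accepted by a named owner"
    return None


bias_risk = ResidualRisk(description="residual bias risk", rating="medium")
finding = acceptance_finding(bias_risk, appetite="medium")
print(finding)
```

Note the order of the checks: a sign-off on risk that exceeds appetite would still be a finding, so appetite is tested first.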
3.5 Monitoring & mapping to the NIST AI RMF
Risk must be monitored continuously: residual risk, performance, and drift are tracked, with re-assessment on change. The whole cycle maps cleanly to the NIST AI RMF functions: GOVERN (culture, policy, roles), MAP (context and risk identification), MEASURE (test and track risks), MANAGE (prioritize, treat, respond).
Auditor's angle: the risk is "deploy and forget" with no drift monitoring. The control is continuous monitoring with thresholds; evidence is monitoring dashboards, alert logs, and re-assessment records mapped to MEASURE/MANAGE.
| NIST function | What it does | Auditor looks for |
|---|---|---|
| GOVERN | Culture, policies, roles, accountability | AI policy, committee, risk-appetite statement |
| MAP | Context; identify risks per use case | Use-case docs, risk register, tiering |
| MEASURE | Analyze, test, track risks | Test results, fairness metrics, dashboards |
| MANAGE | Prioritize, treat, respond, recover | Treatment plans, residual sign-off, incident response |
Drift in production
A fraud-detection model performed well at launch but accuracy has quietly degraded over eight months as fraud patterns changed. There is no monitoring. The team proposes simply retraining it once now.
- Confirm the decay and its cause
Establish that accuracy fell because the data distribution shifted (concept drift) and that the shift went undetected.
- Find the missing control (MEASURE)
Identify that no performance monitoring or thresholds exist, so drift was invisible.
- Evaluate the proposed one-time retrain
Note a single retrain treats the symptom; the model will silently decay again without monitoring.
- Define the durable control (MANAGE)
Specify continuous monitoring with thresholds and automated retraining triggers, plus re-assessment.
- Recommend the root-cause fix
Advise establishing ongoing drift monitoring; consider short-term restriction if current risk is intolerable.
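"Continuous monitoring with thresholds" can be illustrated with a rolling accuracy check. This sketch assumes confirmed fraud labels eventually arrive for each prediction; the class name, window size, threshold, and sample stream are hypothetical, and a real deployment would pick them from the model's risk tier.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the last `window` labelled predictions."""

    def __init__(self, window: int, threshold: float):
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual) -> bool:
        """Record one outcome; return True if an alert should fire."""
        self.outcomes.append(predicted == actual)
        # Only alert once the window is full, to avoid noisy early alarms.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        accuracy = sum(self.outcomes) / len(self.outcomes)
        return window_full and accuracy < self.threshold


monitor = AccuracyMonitor(window=4, threshold=0.75)
# (predicted, actual) pairs: four correct, then two misses as patterns shift.
stream = [(1, 1), (0, 0), (1, 1), (1, 1), (1, 0), (0, 1)]
alerts = [monitor.record(p, a) for p, a in stream]
print(alerts)
```

The alert fires only on the sixth observation, once rolling accuracy falls to 0.5. An alert like this is what would feed a retraining or re-assessment ticket, the evidence the auditor asks for under MEASURE and MANAGE.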
Privacy & Data Governance Programs D1 · D
AI is data-hungry, and data is where most legal and ethical risk concentrates; the exam expects fluency in the data-governance and privacy concepts that touch AI and the artifacts that evidence them.
4.1 Data provenance & lineage
Provenance is where data came from; lineage is its journey through transformations into the model. Without them you cannot prove data was lawfully obtained, suitable, or free of contamination (e.g., test data leaking into training).
Auditor's angle: the risk is training on data of unknown or unlawful origin, or contaminated pipelines. The control is documented provenance and lineage; evidence is source documentation, lineage diagrams, data dictionaries, and transformation logs.
- Provenance answers "where from and under what right?"
- Lineage answers "what happened to it on the way in?"
- Lineage also exposes train/test contamination.
The mystery training set
A model performs suspiciously well in testing. The team cannot say exactly where the training data came from or how it was prepared.
- Request provenance documentation
Ask for the source of every dataset and the legal right to use it; find gaps.
- Trace the lineage
Map the pipeline from source through transformations into training; look for where test and training data may have mixed.
- Identify contamination
Discover that test records leaked into training, inflating the performance score.
- Assess lawful-use risk
Flag that unknown provenance means the right to use some data is unproven.
- Recommend lineage controls
Advise documented provenance, lineage tracking, and a clean train/test separation before the results can be trusted.
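One simple way to test for the contamination found in this scenario is to fingerprint every record and look for test records that also appear in training. The sketch below detects exact duplicates only; near-duplicates and leakage through derived features need other techniques. Function names and sample records are hypothetical.

```python
import hashlib


def fingerprint(record: dict) -> str:
    """Stable hash of a record, independent of key order."""
    canonical = "|".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def leaked_records(train, test):
    """Test records whose exact content also appears in the training set."""
    train_prints = {fingerprint(r) for r in train}
    return [r for r in test if fingerprint(r) in train_prints]


train = [{"id": 1, "amount": 50}, {"id": 2, "amount": 75}]
test = [{"amount": 75, "id": 2}, {"id": 3, "amount": 20}]
leaks = leaked_records(train, test)
print(f"{len(leaks)} of {len(test)} test records also appear in training")
```

Any non-zero overlap means the reported test score overstates real-world performance and should be re-measured on a clean split.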
4.2 Consent, lawful basis & purpose limitation
Personal data needs a lawful basis (e.g., consent, legitimate interest). Purpose limitation means data collected for one purpose generally cannot be repurposed to train a model without a valid basis, a frequent AI failure. A privacy policy on file does not by itself make secondary use lawful.
Auditor's angle: the risk is repurposing operational data for training with no fresh basis. The control is a lawful-basis and purpose check before training; evidence is the basis record, the original collection purpose, and (where needed) fresh consent.
Using data collected for service delivery to train a new model, without a valid basis for the new purpose, is a textbook purpose-limitation and lawful-basis violation. Possession is not permission.
The convenient training set
A retailer wants to train a recommendation model on purchase history collected to fulfil orders. The data team says, "We already have it and it's our data."
- Establish the original purpose
Confirm the data was collected for order fulfilment, not model training.
- Test purpose limitation
Determine that training is a new, incompatible purpose requiring its own basis.
- Check the lawful basis for the new use
Look for consent or another valid basis covering model training; find none.
- Assess whether a DPIA is needed
Conclude the high-risk reuse triggers a DPIA before processing.
- Recommend basis-first, then minimize
Advise confirming a lawful basis (or fresh consent), running a DPIA, and minimizing data before any training.
4.3 Data minimization & purpose limitation in practice
Data minimization says collect and retain only what is needed for the defined purpose; combined with purpose limitation it constrains how much PII ever enters a model. AI teams often default to "collect everything, it might help accuracy", which is the opposite of minimization.
Auditor's angle: the risk is excessive PII collection that enlarges the breach and re-identification surface. The control is documented necessity for each data field; evidence is a data-necessity/justification mapping and field-level minimization decisions.
- Each field should have a documented reason it is necessary.
- Minimization shrinks breach and re-identification risk.
- "More data improves accuracy" does not override necessity.
Hoovering up every field
A churn-prediction model ingests full customer records including national ID and health flags "in case they help." You assess minimization.
- List the fields ingested
Enumerate every attribute the model consumes, including sensitive ones.
- Demand a necessity justification
Ask the team to justify each field against the churn-prediction purpose; many fail.
- Flag sensitive and excessive data
Identify national ID and health flags as unnecessary and high-risk for this purpose.
- Quantify the added risk
Note the extra fields enlarge breach impact and re-identification potential with no proven benefit.
- Recommend field removal
Advise dropping unjustified fields and documenting the necessity decision for those retained.
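The necessity test in this scenario can be sketched as a comparison between the fields the model ingests and a documented justification register. The field names, the register contents, and the severity labels below are hypothetical; the point is only that every ingested field should resolve to a written reason.

```python
# Hypothetical inputs: fields the model consumes, and the team's necessity register
# (field -> documented justification; missing or blank means no justification on file).
ingested_fields = ["tenure_months", "monthly_spend", "support_tickets", "national_id", "health_flag"]
necessity_register = {
    "tenure_months": "Core churn predictor; documented in model design",
    "monthly_spend": "Core churn predictor; documented in model design",
    "support_tickets": "Proxy for dissatisfaction",
    "national_id": "",
}
sensitive_fields = {"national_id", "health_flag"}

def review_minimization(fields: list[str], register: dict[str, str], sensitive: set[str]) -> list[dict]:
    """Flag every ingested field that has no documented necessity; sensitive fields rate higher."""
    findings = []
    for field in fields:
        justified = bool(register.get(field, "").strip())
        if not justified:
            severity = "high" if field in sensitive else "medium"
            findings.append({"field": field, "issue": "no documented necessity", "severity": severity})
    return findings

minimization_findings = review_minimization(ingested_fields, necessity_register, sensitive_fields)
for finding in minimization_findings:
    print(finding)
```

A written justification is only the first gate: the auditor still has to judge whether each stated reason actually holds for the churn-prediction purpose.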
4.4 PII in training data & de-identification
PII in training data risks memorization and re-identification. Controls include de-identification: anonymization (irreversible) and pseudonymization (reversible, and still personal data). Recognizing that pseudonymized data remains in scope of privacy law is a common exam point.
Auditor's angle: the risk is treating pseudonymized data as anonymous and dropping controls, or a model regurgitating memorized PII. The control is appropriate de-identification plus testing for memorization; evidence is the de-identification method, re-identification risk assessment, and output tests.
- Anonymization (irreversible) vs. pseudonymization (reversible, still personal data).
- LLMs can memorize and emit training PII verbatim.
- Re-identification risk rises when datasets are combined.
"It's anonymized." Is it?
A team claims its training set is "anonymized" because names were replaced with customer IDs that map back to records in another table.
- Inspect the de-identification method
Find that names were swapped for IDs that still map to identities: pseudonymization, not anonymization.
- Classify the data correctly
Conclude the data is still personal data and remains in scope of privacy law and controls.
- Assess re-identification risk
Check whether quasi-identifiers (ZIP, DOB, gender) allow re-identification even without the mapping table.
- Test the model for memorization
Probe the model to see if it can output verbatim PII from training.
- Recommend stronger controls
Advise correct classification, retained safeguards, and stronger de-identification or memorization mitigations.
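One way to approximate the quasi-identifier check above is a k-anonymity style group count: how many records share each combination of ZIP, date of birth, and gender. The text does not name k-anonymity, so treat this as a swapped-in illustrative technique; the sample records and function names are assumptions.

```python
from collections import Counter

def group_sizes(records: list[dict], quasi_identifiers: list[str]) -> Counter:
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def smallest_group(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group (the k in k-anonymity); k = 1 means someone is unique."""
    return min(group_sizes(records, quasi_identifiers).values())

def unique_records(records: list[dict], quasi_identifiers: list[str]) -> list[dict]:
    """Records that are the only member of their quasi-identifier group."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records if sizes[tuple(r[q] for q in quasi_identifiers)] == 1]

# Hypothetical de-identified training rows: names removed, quasi-identifiers retained.
rows = [
    {"zip": "02139", "dob": "1985-04-12", "gender": "F"},
    {"zip": "02139", "dob": "1985-04-12", "gender": "F"},
    {"zip": "02139", "dob": "1990-09-30", "gender": "M"},
    {"zip": "94105", "dob": "1972-01-08", "gender": "F"},
]
qi = ["zip", "dob", "gender"]

k = smallest_group(rows, qi)
singled_out = unique_records(rows, qi)
print(f"k = {k}; {len(singled_out)} of {len(rows)} records are unique on {qi}")
```

A low k on its own does not prove re-identification, but it shows that removing names and the mapping table would still leave individuals who can be singled out.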
4.5 DPIAs & retention
A Data Protection Impact Assessment (DPIA) is required for high-risk processing (which many AI use cases are) and is a key evidence artifact. Retention (storage limitation) means data is kept only as long as needed for the purpose, then securely disposed β applying to training data, logs, and model outputs alike.
Auditor's angle: the risk is high-risk processing with no DPIA, or indefinite data hoarding. The control is a DPIA before high-risk processing and an enforced retention schedule; evidence is the completed DPIA, a retention schedule, and deletion logs.
- DPIA is required before high-risk processing begins.
- Retention applies to training data, logs, and outputs.
- Secure disposal must be evidenced, not assumed.
No DPIA, no end date
A health-prediction model went live without a DPIA, and its training data and prediction logs are kept indefinitely "for future research."
- Confirm high-risk processing
Establish that health data plus automated prediction is high-risk processing requiring a DPIA.
- Check for the DPIA
Request the DPIA performed before go-live; find it was never done.
- Review the retention schedule
Determine that data and logs have no defined retention period or disposal trigger.
- Assess the "future research" basis
Note that vague future use does not justify indefinite retention or override storage limitation.
- Recommend DPIA and retention controls
Advise completing a DPIA, setting purpose-based retention periods, and evidencing secure disposal.
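The retention review can be sketched as a check of each data asset against a schedule: is a period defined, and has anything outlived it without evidence of disposal? The asset names, categories, periods, and the `as_of` date below are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention schedule in days per asset category.
retention_schedule = {"training_data": 365, "prediction_logs": 90}

# Hypothetical asset register; deleted_on is None when no disposal evidence exists.
assets = [
    {"name": "train_2022.parquet", "category": "training_data",
     "created": date(2022, 3, 1), "deleted_on": None},
    {"name": "pred_log_2024_q4", "category": "prediction_logs",
     "created": date(2024, 11, 1), "deleted_on": date(2025, 1, 15)},
    {"name": "model_outputs_2023", "category": "model_outputs",
     "created": date(2023, 6, 1), "deleted_on": None},
]

def retention_findings(assets: list[dict], schedule: dict[str, int], as_of: date) -> list[tuple[str, str]]:
    """Flag assets with no defined retention period, or kept past their disposal date."""
    findings = []
    for asset in assets:
        days = schedule.get(asset["category"])
        if days is None:
            findings.append((asset["name"], "no retention period defined"))
            continue
        due = asset["created"] + timedelta(days=days)
        if asset["deleted_on"] is None and as_of > due:
            findings.append((asset["name"], f"overdue for disposal since {due.isoformat()}"))
    return findings

retention_results = retention_findings(assets, retention_schedule, as_of=date(2025, 6, 1))
for name, issue in retention_results:
    print(f"{name}: {issue}")
```

The check covers training data, logs, and outputs alike, mirroring the point that storage limitation applies to all three.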
Leading Practices, Ethics, Regulations & Standards D1 · E
This subtopic ties the domain together with the responsible-AI principles and external frameworks the exam tests by name. Know what each is for and how it maps to controls; don't memorize clause numbers.
5.1 Responsible / trustworthy-AI principles
Across virtually every framework the same principles recur: fairness, transparency/explainability, accountability, safety, robustness/reliability, privacy, security, and human oversight. Learn them as a checklist you can apply to any use case.
Auditor's angle: the risk is a values statement with no operational controls behind it. The control is each principle traced to a concrete mechanism; evidence is a mapping from principle to control to test result.
- Principles are universal; the exam expects you to recognize them.
- Each principle must map to a real, testable control.
- A poster of principles is not a control.
Principles on the wall
An organization publishes a "Responsible AI" charter listing fairness, transparency, and accountability, but you find no controls tied to them.
- List the stated principles
Capture each principle the organization commits to.
- Map each to a control
For fairness, look for bias testing; for transparency, explainability and AI-use disclosure; for accountability, named owners.
- Identify unbacked principles
Find that fairness and transparency have no operating control behind them.
- Test the controls that do exist
Where a control is claimed, request evidence it operates.
- Recommend operationalization
Advise tying each principle to a specific, testable control with an owner.
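The principle-to-control-to-evidence mapping can be held as a small traceability table and checked for gaps. The structure below is a hypothetical sketch; the control names, owner, and evidence entries are invented for illustration.

```python
# Hypothetical traceability map: principle -> control, owner, and operating evidence.
principle_map = {
    "fairness": {"control": None, "owner": None, "evidence": None},
    "transparency": {"control": None, "owner": None, "evidence": None},
    "accountability": {
        "control": "Named model owner recorded in the AI inventory",
        "owner": "Head of Data Science",
        "evidence": "AI inventory export",
    },
}

def unbacked_principles(mapping: dict[str, dict]) -> list[str]:
    """Principles missing a control, an owner, or evidence that the control operates."""
    required = ("control", "owner", "evidence")
    return sorted(p for p, entry in mapping.items() if not all(entry.get(k) for k in required))

gaps = unbacked_principles(principle_map)
print("Principles with no operating control behind them:", gaps)
```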
5.2 The EU AI Act: risk tiers
The EU AI Act is the most heavily tested regulation. Its core idea is a risk-tiered, proportionate approach: the higher the risk to people, the heavier the obligations. It is binding law, not a voluntary framework.
Auditor's angle: the risk is mis-tiering a system (e.g., calling a hiring tool "limited risk") and missing binding obligations. The control is a documented EU AI Act classification per use case; evidence is the classification rationale and the obligations met for that tier.
| Tier | Examples | Obligation |
|---|---|---|
| Unacceptable | Social scoring, manipulative systems | Prohibited |
| High risk | Credit, hiring, biometrics, medical | Risk mgmt, data governance, documentation, human oversight, conformity assessment |
| Limited risk | Chatbots, deepfakes | Transparency: disclose AI / AI-generated content |
| Minimal risk | Spam filters, AI in games | Largely unregulated; voluntary codes |
Tiering a hiring tool under the AI Act
A company plans to deploy an AI CV-screening tool in the EU and has classified it as "limited risk: it's just a chatbot-style helper."
- Identify the use case precisely
Confirm it screens candidates and influences employment decisions.
- Match it to the Act's categories
Recognize employment/recruitment as an explicitly listed high-risk category.
- Correct the classification
Reclassify from limited to high risk, triggering the strict obligations.
- List the triggered obligations
Enumerate risk management, data governance, documentation, human oversight, and conformity assessment.
- Test against each obligation
Check evidence for each; document gaps as compliance findings.
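The tiering logic in this scenario can be sketched as a lookup built directly from the table above: take the strictest tier that any tag of the use case matches, then list the obligations with no evidence. This is a deliberately simplified study aid, not a legal classification; the tag vocabulary, function names, and evidence flags are assumptions, and a real determination requires the Act itself.

```python
# Tiers and obligations copied from the study table above; tags are hypothetical labels.
TIERS = {
    "unacceptable": {"examples": {"social scoring", "manipulative systems"},
                     "obligations": ["prohibited"]},
    "high": {"examples": {"credit", "hiring", "biometrics", "medical"},
             "obligations": ["risk management", "data governance", "documentation",
                             "human oversight", "conformity assessment"]},
    "limited": {"examples": {"chatbots", "deepfakes"},
                "obligations": ["transparency disclosure"]},
    "minimal": {"examples": {"spam filters", "ai in games"},
                "obligations": []},
}
TIER_ORDER = ["unacceptable", "high", "limited", "minimal"]  # strictest first

def classify(use_case_tags: set[str]) -> str:
    """Return the strictest tier whose examples overlap the tags, else 'unclassified'."""
    for tier in TIER_ORDER:
        if TIERS[tier]["examples"] & use_case_tags:
            return tier
    return "unclassified"

def obligation_gaps(tier: str, evidence: dict[str, bool]) -> list[str]:
    """Obligations for the tier that have no supporting evidence on file."""
    return [o for o in TIERS[tier]["obligations"] if not evidence.get(o, False)]

# The CV-screening tool has a chat-style interface but is used for hiring decisions.
claimed_tier = "limited"
actual_tier = classify({"hiring", "chatbots"})
tier_gaps = obligation_gaps(actual_tier, {"documentation": True})
print(f"claimed={claimed_tier}, assessed={actual_tier}, gaps={tier_gaps}")
```

Checking tiers strictest-first is the design point: a chat interface does not pull a hiring use case down into the limited-risk tier.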
5.3 NIST AI RMF, ISO/IEC 42001 & 23894
Know what each framework is for: NIST AI RMF is a voluntary US framework (GOVERN/MAP/MEASURE/MANAGE) for operationalizing AI risk; ISO/IEC 42001 is the certifiable AI Management System (AIMS) standard (the ISO 27001 of AI); ISO/IEC 23894 is AI risk-management guidance aligning ISO 31000 to AI.
Auditor's angle: the risk is treating any voluntary framework as proof of legal compliance, or confusing the three. The control is selecting the right framework for the need; evidence differs by framework. NIST: a risk process; 42001: certified AIMS scope, policy, internal audits, management review; 23894: risk-method guidance applied.
| Framework | Nature | Best used for |
|---|---|---|
| NIST AI RMF | Voluntary process framework | Operationalizing AI risk management |
| ISO/IEC 42001 | Certifiable management system | Auditable AIMS, certification |
| ISO/IEC 23894 | Risk-management guidance | Applying ISO 31000 to AI risk |
Choosing the right framework
A client wants an externally certifiable way to demonstrate mature AI governance to customers. Someone suggests "just adopt the NIST AI RMF."
- Clarify the requirement
Confirm the goal is external, certifiable assurance, not just an internal process.
- Assess NIST AI RMF against it
Note NIST is voluntary and not certifiable, so it cannot provide a certificate.
- Identify the fit
Match the need to ISO/IEC 42001, which provides a certifiable AI management system.
- Define the evidence ISO requires
List AIMS scope, policy, objectives, internal audits, and management review.
- Recommend ISO/IEC 42001
Advise 42001 for certification, optionally using NIST AI RMF and 23894 to inform the underlying risk work.
5.4 OECD principles & the frameworks-aren't-law trap
The OECD AI Principles are influential international values (inclusive growth, human-centred values, transparency, robustness, accountability) underpinning many national policies. Crucially, complying with a voluntary framework or set of principles does not equal legal compliance; binding law (EU AI Act, GDPR) and sector regulators still apply.
Auditor's angle: the risk is "we follow NIST/OECD, so we're compliant." The control is a regulatory mapping that separates voluntary frameworks from binding law; evidence is a list of applicable laws per use case and how each is met.
NIST/ISO/OECD are voluntary; the EU AI Act and GDPR are binding. A voluntary framework is never proof of legal compliance, and complying with one framework does not imply the others.
"We follow NIST, so we're compliant"
A multinational deploying an AI hiring tool in the EU tells you it has adopted the NIST AI RMF and is "therefore compliant."
- Classify NIST's status
Confirm NIST AI RMF is a voluntary framework, not a legal compliance vehicle.
- Identify the binding laws
Determine the EU AI Act (high-risk hiring) and GDPR both apply to this use case.
- List the legal obligations
Enumerate risk management, data governance, human oversight, documentation, and conformity assessment.
- Compare NIST adoption to those obligations
Show that good risk practice does not automatically satisfy binding legal duties.
- Recommend a regulatory mapping
Advise mapping all applicable laws per use case and evidencing compliance with each.
5.5 Human oversight & human-in-the-loop
High-impact AI decisions should be subject to meaningful human review, with the human able to understand and override the output (human-in-the-loop). The control fails silently when automation bias sets in: humans rubber-stamp the machine and stop challenging it.
Auditor's angle: the risk is a human-oversight control that exists on paper but is hollowed out by automation bias. The control is meaningful, evidenced override capability; evidence is override rates, reviewer training, time-on-decision, and whether reviewers can actually understand the output.
- Oversight must be meaningful: the human needs information and authority to override.
- Near-zero override rates can signal automation bias, not perfection.
- Reviewers need explanation of the output to challenge it.
The rubber-stamp reviewer
A loan model has a "human-in-the-loop" control: an officer must approve each decision. Records show the officer approves 100% of the model's outputs in seconds.
- Confirm the control on paper
Verify the documented requirement for human approval of each decision.
- Pull override and timing data
Examine override rates and time-per-decision; find 100% approval in seconds.
- Test for meaningful review
Check whether the officer receives an explanation and has authority and time to dissent.
- Diagnose automation bias
Conclude the human is rubber-stamping; the oversight control is not operating.
- Recommend a meaningful-oversight redesign
Advise giving reviewers explanations, training, time, and clear override authority, and monitoring override rates.
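The override and timing test in this scenario can be sketched as two metrics computed from the decision log. The log layout and both thresholds are hypothetical; a flag here is a prompt for further inquiry into automation bias, not proof of it.

```python
from statistics import median

# Hypothetical decision log: model recommendation, human decision, seconds spent reviewing.
decisions = [
    {"model": "approve", "human": "approve", "seconds": 4},
    {"model": "decline", "human": "decline", "seconds": 3},
    {"model": "approve", "human": "approve", "seconds": 5},
    {"model": "approve", "human": "approve", "seconds": 6},
    {"model": "decline", "human": "decline", "seconds": 2},
]

def oversight_metrics(log: list[dict]) -> dict:
    """Override rate (human disagreed with model) and median time spent per decision."""
    overrides = sum(1 for d in log if d["human"] != d["model"])
    return {
        "override_rate": overrides / len(log),
        "median_seconds": median(d["seconds"] for d in log),
    }

def automation_bias_flag(metrics: dict, min_override_rate: float = 0.01,
                         min_median_seconds: float = 30) -> bool:
    """Flag when reviewers almost never override AND decide faster than a plausible review."""
    return (metrics["override_rate"] < min_override_rate
            and metrics["median_seconds"] < min_median_seconds)

metrics = oversight_metrics(decisions)
flagged = automation_bias_flag(metrics)
print(metrics, "possible automation bias:", flagged)
```

The thresholds should be set per process, since a reasonable review time for a loan decision differs from one for, say, a content-moderation queue.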
Exam focus: quick recap
Most Domain 1 answers reward the auditor who establishes or restores clear accountability, applies risk-based, proportionate controls, and tells a control that exists on paper from one that is designed, owned, evidenced, and operating. When a scenario describes AI with no named owner, no inventory entry, or no risk assessment, the finding writes itself.
- NIST AI RMF = GOVERN, MAP, MEASURE, MANAGE (voluntary risk process).
- EU AI Act tiers = unacceptable (prohibited), high, limited (transparency), minimal. Binding law; hiring/credit/biometrics are high risk.
- ISO/IEC 42001 = certifiable AI management system; ISO/IEC 23894 = AI risk-management guidance; OECD = high-level principles.
- Frameworks ≠ law: voluntary frameworks never prove legal compliance, and one framework doesn't imply another.
- Three lines of defense: internal audit (3rd line) must never own, build, or validate the models it audits.
- Inherent vs. residual risk: residual risk must be formally accepted within appetite by a named owner.
- The AI inventory enables everything β no registry means no risk assessment, monitoring, or governance.
- Vendor ≠ off the hook: third-party and foundation models never transfer accountability.
- Purpose limitation & lawful basis: operational data can't be repurposed for training without a valid basis; DPIA for high-risk processing.
- Human oversight must be meaningful; beware automation bias hollowing out human-in-the-loop controls.