Test yourself

Practice Questions

Scenario-style questions with full explanations — just like the real exam.

Click an option to lock in your answer. The correct choice and a full explanation appear instantly, and your score updates above. Use the domain filters to drill a single area, or hit Reset quiz to start over. There's exactly one best answer per question — pick what an auditor should do FIRST or BEST.

0 / 0

0 of 90 answered this session

History: 0 correct · 0 to review · 90 untried

⏱️ Take a timed mock exam

💾

Your progress is saved

Each question remembers whether you last got it right or wrong (stored in this browser). Use ★ Review mistakes to drill only the ones you missed.

Domain 1 · Governance & Risk

An auditor reviewing a newly deployed credit-scoring model finds that the data science team built and validated it, but no business or risk owner is formally accountable for its decisions. What should the auditor do FIRST?

ARecommend the data science team be made accountable, since they built the model.
BRaise a finding that accountability for the model has not been assigned, and recommend a named model owner in the business be established.
CDisable the model until a fairness assessment is completed.
DConclude the model is acceptable because it was independently validated.

Answer: B. A core governance control is clear accountability — a named business/risk owner who answers for the model's outcomes. The gap is the root governance risk, so document it and recommend assigning ownership. A is wrong: developers shouldn't own the business decision (and that would weaken segregation). C is a management action the auditor doesn't take, and it overreacts to a governance gap. D ignores the missing accountability entirely.

Domain 1 · Governance & Risk

An organization is building an AI system that assigns risk scores used to screen job applicants. Under the EU AI Act, how should this system most likely be classified, and what does that imply?

AMinimal risk — no specific obligations apply.
BLimited risk — only transparency notices are required.
CProhibited — employment-related AI is banned outright.
DHigh risk — it triggers obligations such as risk management, data governance, human oversight, and conformity assessment.

Answer: D. The EU AI Act lists AI used in employment and worker management (including recruitment and candidate evaluation) as high-risk, which brings obligations like risk management, high-quality data governance, logging, transparency, human oversight, and conformity assessment. A and B understate the tier. C is wrong: most employment AI is high-risk, not prohibited — the prohibited tier covers things like social scoring and certain biometric categorization, not recruitment scoring generally.

Domain 1 · Governance & Risk

Management asks internal audit to help define which AI use cases the company should pursue and to design the model approval workflow. The CAE is concerned. What is the BEST response?

ADecline to design or own the workflow, as that would impair independence; offer to advise and later audit the controls management establishes.
BAccept fully — audit's involvement guarantees the controls will be strong.
CAccept, but have a different auditor sign the final report.
DRefuse any involvement with AI governance whatsoever.

Answer: A. Internal audit can advise, but designing and owning a control process is a management responsibility — taking it on creates a self-review threat to independence when audit later evaluates it. B and C don't fix the impairment (rotating signatures doesn't cure ownership of the design). D over-corrects: advisory input is appropriate, only ownership is the problem.

Domain 1 · Governance & Risk

A company wants a single framework to set up an auditable, certifiable management system for governing AI across its lifecycle. Which is the most appropriate primary reference?

AISO/IEC 42001, the AI management system standard.
BThe EU AI Act.
CThe OWASP Top 10 for LLMs.
DGDPR.

Answer: A. ISO/IEC 42001 is a certifiable management-system standard (like ISO 27001 for security) specifically for governing AI — exactly what's asked. B is law, not a management system you certify against. C is a security-focused vulnerability list for LLM applications, not a governance management system. D governs personal data processing, not AI management broadly.

Domain 1 · Governance & Risk

An auditor maps the organization's AI program to the NIST AI RMF and finds strong activity in Map, Measure, and Manage, but little in the Govern function. What is the most significant implication?

ANone — Govern is optional once the other three functions are in place.
BGovern is the cross-cutting function that establishes culture, accountability, and policies; its weakness undermines the consistency and oversight of the other three.
CIt only matters for generative AI systems.
DThe organization should drop NIST and adopt a different framework.

Answer: B. In the NIST AI RMF, Govern is the cross-cutting function that creates the policies, roles, accountability, and culture the other functions depend on. Weak governance means Map/Measure/Manage are happening without consistent oversight or risk tolerance. A and C are wrong: Govern applies to all AI and is foundational. D is an overreaction — fix the gap, don't change frameworks.

Domain 1 · Governance & Risk

A marketing team plans to deploy an AI tool that profiles customers using sensitive personal data to predict health-related interests. What governance step is MOST important before deployment?

AA Data Protection Impact Assessment (DPIA) to evaluate privacy risk and necessity.
BA penetration test of the hosting environment.
CA marketing A/B test to confirm the tool increases conversions.
DA press release describing the new capability.

Answer: A. High-risk processing of sensitive personal data for profiling is a textbook trigger for a DPIA, which assesses necessity, proportionality, and risks to data subjects before processing begins. B is a security control but doesn't address the privacy/necessity question. C measures business value, not risk or lawfulness. D is irrelevant to governance.

Domain 1 · Governance & Risk

During a governance review, an auditor finds the AI policy prohibits "unacceptable bias" but defines no metrics, thresholds, or owner for measuring fairness. How should the auditor characterize this?

AAcceptable — having any policy statement is sufficient.
BA minor wording issue with no control impact.
CA control design weakness — the policy is not operable because it lacks measurable criteria and accountability.
DOut of scope, because fairness is a Domain 2 operational concern.

Answer: C. A policy that can't be measured or enforced is a design weakness: without metrics, thresholds, and an owner, "unacceptable bias" can't be tested or governed. A and B understate the impact — unmeasurable policy is effectively no control. D is wrong: policy and fairness criteria are governance (Domain 1), even if the testing happens operationally.

Domain 1 · Governance & Risk

A bank's AI risk register lists model-performance risks but omits third-party and supply-chain risk for the foundation model it licenses from a vendor. What is the auditor's BEST recommendation?

ANo action — vendor risk is the vendor's responsibility, not the bank's.
BReplace the vendor model with an in-house model immediately.
CRemove model-performance risks since the vendor handles performance.
DExpand the risk register and due-diligence process to cover third-party/foundation-model risks, since the bank retains accountability for outcomes.

Answer: D. Outsourcing the model does not outsource accountability — the bank still owns the outcomes and must assess vendor/supply-chain risk (data provenance, updates, security, exit). A wrongly transfers accountability. B is a drastic management decision, not an audit recommendation, and ignores risk-based analysis. C removes a legitimate risk and worsens coverage.

Domain 1 · Governance & Risk

A generative AI chatbot is deployed to answer customer questions, but users are not told they are interacting with an AI system. Which concern is MOST directly raised?

AModel drift.
BTransparency / disclosure obligations toward affected individuals.
CInsufficient compute capacity.
DLack of a rollback plan.

Answer: B. Letting people interact with AI without disclosure is a transparency failure; frameworks and the EU AI Act's limited-risk tier require informing users they're dealing with an AI. A (drift) and D (rollback) are operational matters not raised by the disclosure gap. C is a capacity issue unrelated to the ethical/transparency concern described.

Domain 1 · Governance & Risk

An organization claims its AI governance is mature because it has an AI ethics committee. The auditor wants evidence the committee is effective. Which evidence is MOST persuasive?

AThe committee's charter and member list.
BA slide deck announcing the committee's creation.
CMinutes showing the committee reviewed specific use cases and made documented decisions that changed outcomes.
DAn email from the CEO endorsing the committee.

Answer: C. Operating effectiveness is shown by evidence the control actually functioned — decisions made, use cases reviewed, outcomes influenced. A charter (A) shows design, not operation; an announcement deck (B) and a CEO email (D) show intent or tone, not that the committee did its job. Auditors weight evidence of actual operation most heavily.

Domain 2 · AI Operations

A fraud-detection model that performed well at launch is now flagging far fewer transactions, and fraud losses are rising. Production input data patterns have shifted since training. What is the MOST likely cause?

AOverfitting during training.
BModel/data drift — the live data distribution has diverged from the training data.
CA prompt-injection attack.
DInsufficient training-time hyperparameter tuning.

Answer: B. Performance degrading over time as real-world data shifts away from the training distribution is the definition of drift (data/concept drift), and it's why ongoing monitoring exists. A (overfitting) would have shown at launch, not emerged later. C (prompt injection) applies to LLM/prompt-driven systems, not a tabular fraud classifier reacting to changed data. D is a training issue that wouldn't explain a model that started strong then declined.

Domain 2 · AI Operations

An auditor wants to understand a deployed model's intended use, training data, performance across subgroups, and known limitations in one document. Which artifact should they request?

AThe data dictionary.
BThe model card.
CThe network architecture diagram.
DThe incident response runbook.

Answer: B. A model card is the standard document summarizing intended use, training data, evaluation metrics (including subgroup performance), limitations, and ethical considerations — exactly what's asked. A documents fields/data, not model behavior. C shows infrastructure, not model purpose/performance. D is for handling incidents, not describing the model.

Domain 2 · AI Operations

An LLM-based assistant retrieves snippets from a customer-facing knowledge base and includes them in its prompt. An attacker plants text in a public document that instructs the model to ignore its rules and reveal internal data. What threat is this?

AData poisoning of the training set.
BModel inversion.
CConcept drift.
D(Indirect) prompt injection.

Answer: D. Malicious instructions smuggled into content the model ingests at inference time is prompt injection (indirect, via retrieved content). A (data poisoning) corrupts the training data, not runtime input. B (model inversion) reconstructs training data from outputs. C (drift) is a performance-degradation phenomenon, not an attack. Controls: input/output filtering, separating instructions from data, least-privilege on what the model can access.

Domain 2 · AI Operations

A medical-triage classifier reports 98% accuracy, and the team calls it excellent. The condition it screens for occurs in about 2% of patients. What is the auditor's BEST concern?

A98% accuracy proves the model is safe to deploy.
BAccuracy should have been reported as an F1 of 98%.
CWith a rare positive class, accuracy is misleading; recall/precision on the positive class matter far more.
DThe model needs a larger learning rate.

Answer: C. On an imbalanced problem, a model that always predicts "negative" scores 98% accuracy while catching zero cases — so accuracy is the wrong headline. Recall (catching true positives) and precision matter, often summarized by F1 or AUC. A trusts the misleading metric. B confuses metrics — F1 and accuracy aren't interchangeable. D is an unrelated training tweak.

Domain 2 · AI Operations

A data scientist pushed an updated model straight to production over the weekend without an approval or a record of what changed, because "it scored better offline." What control weakness is MOST evident?

AInadequate change management for models — missing approval, versioning, and documentation.
BInsufficient training data.
CLack of model explainability.
DExcessive human oversight.

Answer: A. Deploying without approval, version control, or a change record is a change-management failure — there's no segregation, no traceability, and no ability to roll back safely. B and C describe different issues not evidenced here. D is backwards: the problem is too little oversight and process, not too much.

Domain 2 · AI Operations

A model is trained on data scraped from a source an attacker can edit. The attacker inserts mislabeled examples so the model learns a hidden, harmful behavior. What is this attack called?

APrompt injection.
BMembership inference.
CData poisoning.
DDenial of service.

Answer: C. Corrupting the training data so the model learns malicious or degraded behavior is data poisoning. A acts at inference time on prompts, not training. B (membership inference) tries to determine whether a record was in the training set. D is an availability attack. Controls: data provenance/integrity checks, curation, anomaly detection on training data.

Domain 2 · AI Operations

An auditor reviews the monitoring setup for a deployed recommendation model and finds dashboards for infrastructure uptime and latency only. What is the MOST important gap?

ANothing — uptime and latency fully cover model health.
BThere is no monitoring of model quality — prediction accuracy, drift, and data-quality signals.
CThe dashboards refresh too frequently.
DThe model lacks a public model card.

Answer: B. Infrastructure metrics show the service is up, not that the model is still correct. Effective AI monitoring tracks prediction quality, input/output drift, and data-quality issues so degradation is caught early. A is false — a fast, low-latency model can still be quietly wrong. C is trivial. D is a documentation gap, not the monitoring gap described.

Domain 2 · AI Operations

A production model starts producing clearly harmful outputs to customers. The team has no defined procedure for who decides to take it offline or how to communicate. What should the auditor recommend FIRST?

AEstablish an AI incident-response process with defined roles, escalation, containment (including a kill switch), and communication.
BRetrain the model on more data.
CAdd more GPUs to improve performance.
DPublish a model card.

Answer: A. The gap is the absence of an incident-response capability — defined roles, escalation, containment/rollback, and communication so harm can be stopped quickly. B may fix this instance but not the missing process. C is irrelevant to harmful outputs. D is useful documentation but doesn't address responding to live incidents.

Domain 2 · AI Operations

An auditor examines an image classifier's training data and finds it was labeled by a single annotator with no review, and labeling guidelines were never written down. What risk is MOST directly created?

ALabel quality risk — inconsistent or biased labels propagate into model behavior with no way to verify correctness.
BNetwork latency risk.
COverspending on compute.
DExcessive model explainability.

Answer: A. A model is only as good as its labels: a single unreviewed annotator with no documented guidelines means inconsistent, potentially biased ground truth that the model will learn — and it can't be independently verified. B and C are infrastructure/cost concerns not raised here. D isn't a risk and isn't evidenced.

Domain 2 · AI Operations

A team reports their model achieves 99% accuracy on the test set, but performance in production is much worse. The test set was created by sampling from the same cleaned file used for training. What is the MOST likely problem?

AThe production hardware is slower.
BData leakage / non-representative test data — the test set overlaps with or is too similar to training data, inflating offline scores.
CThe model is underfitting.
DThe model is too explainable.

Answer: B. Drawing the test set from the same cleaned file risks leakage and a non-representative evaluation, so offline metrics overstate real-world performance — exactly the gap seen in production. A wouldn't cause an accuracy gap. C (underfitting) would show poor scores everywhere, not 99% offline. D is not a performance cause.

Domain 2 · AI Operations

In reviewing an MLOps pipeline, an auditor wants assurance that any production prediction can be traced back to the exact model version and data that produced it. Which capability is MOST important?

AAuto-scaling of inference servers.
BA faster GPU.
CA larger marketing budget.
DReproducibility — versioning of models, data, and code plus logging that links predictions to the version that made them.

Answer: D. Traceability and reproducibility require versioning model, data, and code and logging that ties each prediction to the artifact that produced it — essential for investigations, audits, and rollback. A and B are scaling/performance features unrelated to traceability. C is irrelevant.

Domain 2 · AI Operations

A hiring model shows equal overall accuracy for two groups, but its false-negative rate (qualified candidates rejected) is twice as high for one group. The team says the model is fair because accuracy is equal. What is the auditor's BEST position?

AAgree — equal accuracy is sufficient evidence of fairness.
BDisagree — equal accuracy can mask disparate error rates; the unequal false-negative rates indicate a fairness concern needing assessment.
CDisagree, and demand the model be permanently banned.
DAgree, but recommend a faster model.

Answer: B. Fairness has multiple definitions; equal overall accuracy can hide very different error distributions. A doubled false-negative rate for one group is a real, harmful disparity that must be assessed against the chosen fairness criteria. A accepts a misleading metric. C overreaches into a management decision and ignores assessment. D is irrelevant to fairness.

Domain 2 · AI Operations

A company deploys a third-party foundation model via API and builds features on top. The vendor silently updates the model, and downstream behavior changes. What control would have MOST helped detect this?

AA bigger firewall.
BDisabling all logging to reduce noise.
CIncreasing the model's temperature setting.
DOngoing output monitoring with a regression/benchmark test suite run against the API over time.

Answer: D. When you don't control the model, you must monitor its behavior: a maintained benchmark/regression suite run regularly against the API catches behavioral shifts from silent vendor updates. A addresses network security, not behavior. B removes the very evidence you'd need. C changes randomness and would make behavior less stable, not detectable.

Domain 2 · AI Operations

An auditor finds that a model's input-validation layer accepts free-text that is passed directly into a system prompt used to query a database. Which combined risk is MOST relevant?

AOnly slower response times.
BPrompt injection leading to unauthorized data access — untrusted input is mixed with trusted instructions and privileged actions.
CReduced model accuracy on the test set.
DHigher cloud storage costs.

Answer: B. Passing untrusted free-text straight into a privileged, database-querying prompt is a classic injection path: an attacker can manipulate the instructions and reach data they shouldn't. Controls include separating instructions from user data, input/output filtering, and least-privilege. A, C, and D are performance/cost issues that miss the security exposure.

Domain 3 · Audit Techniques

An auditor concludes a model's controls are effective based solely on a verbal assurance from the lead data scientist that "testing is thorough." What is the primary problem with this conclusion?

ANothing — management inquiry is the strongest form of evidence.
BThe evidence is neither sufficient nor appropriate; inquiry alone, without corroboration, doesn't support an effectiveness conclusion.
CThe auditor should have asked two data scientists instead of one.
DThe conclusion is fine as long as it's documented.

Answer: B. Inquiry is the weakest evidence and must be corroborated with inspection, re-performance, or observation to be sufficient and appropriate for a conclusion. A inverts the evidence hierarchy. C still relies only on inquiry. D documenting weak evidence doesn't make the conclusion supportable.

Domain 3 · Audit Techniques

An auditor needs to test whether a model-approval control operated for every release over the past year, where there were only 18 releases. What is the BEST testing approach?

AStatistical attribute sampling of 30 items.
BTest a single release and extrapolate.
CTest the full population of 18 releases, since it is small enough to examine entirely.
DRely on the data scientist's summary spreadsheet only.

Answer: C. When the population is small, testing 100% is more efficient and gives complete assurance — sampling adds risk for no benefit. A would require sampling more items than exist or is needlessly imprecise. B (sample of one) gives no basis to conclude. D relies on unverified, second-hand evidence rather than the source records.

Domain 3 · Audit Techniques

While planning an AI audit, the auditor has limited time and must focus the engagement. Which approach to scoping is MOST appropriate?

AUse a risk-based approach: prioritize the highest-risk models and controls (e.g., high-impact, customer-facing, or regulated use cases).
BTest every control equally regardless of risk.
CAudit whichever systems are easiest to access.
DLet the data science team choose what gets audited.

Answer: A. Risk-based scoping directs scarce audit effort to where the consequences of failure are greatest — the core of audit planning. B spreads effort thinly and ignores materiality. C optimizes for convenience, not risk. D surrenders auditor judgment and independence to the auditee.

Domain 3 · Audit Techniques

An auditor wants to independently verify that a deployed model produces the documented outputs for a set of known inputs. Which technique provides the strongest evidence?

AReading the model's documentation.
BAsking the developer whether the outputs are correct.
CRe-performance: running the auditor's own test cases through the model and comparing to expected results.
DReviewing last year's audit report.

Answer: C. Re-performance — independently running test cases and comparing actual to expected outputs — is direct, auditor-generated evidence and among the strongest available. A (documentation review) and D (prior report) are indirect. B (inquiry) is the weakest and isn't independent. The strength order: re-performance/observation > inspection > inquiry.

Domain 3 · Audit Techniques

In drafting the audit report, the auditor must present a finding that a high-risk model lacks bias testing. What makes the finding MOST useful to management?

AStating only that "the model is biased."
BListing the names of the data scientists responsible.
CIncluding the full source code in the report.
DPresenting condition, criteria, cause, effect (risk/impact), and a clear, actionable recommendation.

Answer: D. A well-structured finding gives condition, criteria, cause, and effect plus an actionable recommendation so management understands the gap, why it matters, and what to do. A is an unsupported, overstated claim (and bias wasn't even tested). B blames individuals rather than addressing the control. C dumps detail that obscures the message and adds no decision value.

Domain 3 · Audit Techniques

An auditor plans to use a data-analytics script to test 100% of model decisions for policy violations. Before relying on the results, what should the auditor do?

ANothing — analytics output is always reliable.
BHave the auditee write and run the script for them.
CValidate the completeness and accuracy of the input data and confirm the analytics logic is correct.
DReduce the test to a sample of 10 decisions to save time.

Answer: C. Analytics conclusions are only as trustworthy as the data and logic behind them, so the auditor must verify the source data's completeness/accuracy and that the script does what it's intended to. A assumes reliability without basis. B compromises independence and reliability. D discards the advantage of full-population testing for no good reason.