Reference
AAIA Glossary
Key terms for the exam.
A – E
- Accuracy
- The proportion of a model's predictions that are correct. A headline metric, but misleading on imbalanced data. Auditor relevance: never accept accuracy alone; ask for precision, recall, and class balance before judging performance.
- Adversarial example
- An input deliberately perturbed (often imperceptibly) to make a model produce a wrong output. Auditor relevance: evidence of adversarial/robustness testing is a key control for safety- and security-critical models.
- Agentic AI
- AI that plans and takes actions (calling tools, making transactions) with reduced human intervention. Auditor relevance: highest autonomy and blast-radius risk; look for action boundaries, approval gates, kill switches, and full logging of every action taken.
- AI inventory / registry
- A central catalogue of every AI model and use case, with owner, purpose, data sources, risk tier, status, and last-review date. Auditor relevance: one of the highest-value governance controls; its absence almost guarantees shadow AI and is a common first recommendation.
- AIMS (AI Management System)
- The interrelated policies, objectives, processes, roles, and controls used to govern AI responsibly: the management system specified by ISO/IEC 42001. Auditor relevance: audit it like any management system (scope, leadership, risk, controls, internal audit, management review).
- Alignment
- How well a model's behaviour matches its intended goals and human values. Auditor relevance: misalignment underlies many harm scenarios; check guardrails, evaluation against intended use, and human oversight.
- AUC / ROC
- The ROC curve plots true-positive rate against false-positive rate across thresholds; AUC (area under the curve) summarizes ranking quality from 0.5 (random) to 1.0 (perfect). Auditor relevance: a threshold-independent way to evidence and compare classifier performance.
- Automation bias
- The human tendency to over-trust automated output and stop challenging it. Auditor relevance: silently defeats "human-in-the-loop" controls; verify reviewers actually exercise meaningful judgement, not rubber-stamping.
- Backpropagation
- The algorithm that trains neural networks by propagating prediction error backward to adjust weights. Auditor relevance: background concept; you assess the controls around training, not the math itself.
- Bias (statistical vs societal)
- Statistical bias is systematic error between predictions and truth; societal/harmful bias is unfair discriminatory outcomes across groups. Auditor relevance: distinguish the two; fairness testing across protected groups and representative data are the controls for harmful bias.
- Black-box model
- A model whose internal decision logic is opaque to humans (e.g., deep neural nets). Auditor relevance: opacity threatens fairness, recourse, and regulatory explainability; look for XAI tooling and human review of high-impact decisions.
- Canary deployment
- Releasing a new model to a small slice of traffic first, monitoring it, then expanding. Auditor relevance: a change-management control that limits blast radius; check rollback criteria and monitoring during the canary phase.
- Concept drift
- When the relationship between inputs and the target changes over time (the world changes), degrading the model even if inputs look the same. Auditor relevance: requires ongoing monitoring with retraining triggers; distinct from data drift.
- Confusion matrix
- A table of true/false positives and negatives used to derive precision, recall, accuracy, and F1. Auditor relevance: the source artifact for performance metrics; request it to validate reported figures.
- CRISP-DM
- Cross-Industry Standard Process for Data Mining: a lifecycle of business understanding, data understanding, data preparation, modelling, evaluation, and deployment. Auditor relevance: a recognizable lifecycle to map governance checkpoints and evidence against.
- Data drift
- When the distribution of input data shifts from what the model was trained on. Auditor relevance: a leading indicator of degradation; expect monitoring of input distributions, not just output accuracy.
- Data lineage / provenance
- Provenance is where data came from; lineage is its journey through transformations into the model. Auditor relevance: without it you cannot prove data was lawful, suitable, or uncontaminated β request lineage records and data dictionaries.
- Data poisoning
- An attack that corrupts training data to implant bad behaviour or backdoors. Auditor relevance: controls include data validation, source integrity, access control over training pipelines, and provenance.
- Datasheet (for datasets)
- Standardized documentation of a dataset's motivation, composition, collection, and recommended uses. Auditor relevance: evidence of data governance and a check on fitness-for-purpose and lawful collection.
- De-identification
- Removing or obscuring identifiers via anonymization or pseudonymization. Auditor relevance: note that pseudonymized data is still personal data; verify the technique actually prevents re-identification given the context.
- Deep learning
- Machine learning using many-layered neural networks that learn features automatically. Auditor relevance: powerful but opaque; drives the need for explainability tooling and monitoring.
- DPIA (Data Protection Impact Assessment)
- A documented assessment required for high-risk personal-data processing, which many AI use cases are. Auditor relevance: a primary piece of audit evidence; its absence on high-risk processing is a finding.
- Embeddings
- Numeric vector representations of text, images, or other data that capture semantic meaning. Auditor relevance: underpin search and RAG; embeddings can leak sensitive information, so consider access and storage controls.
- Explainability (XAI)
- Techniques that make a model's outputs and behaviour understandable to humans. Auditor relevance: a trustworthy-AI characteristic; the required level scales with decision impact (credit/health need far more than a tag suggester).
F – J
- Fairness
- The principle that AI does not produce unjust or discriminatory outcomes across individuals or groups. Auditor relevance: assessed via disparate-impact/fairness metrics across protected groups and representative training data.
- F1 score
- The harmonic mean of precision and recall, balancing the two in a single number. Auditor relevance: a fairer headline than accuracy on imbalanced data; useful when both false positives and false negatives matter.
- Feature store
- A central repository for curated, reusable model input features, serving both training and inference. Auditor relevance: supports consistency and lineage; check access control, versioning, and that training/serving features match (no skew).
- Foundation model
- A large model pre-trained on broad data and adapted to many downstream tasks (often third-party). Auditor relevance: concentrates supply-chain risk: you inherit the vendor's training data, bias, and security posture; demands due diligence and contractual rights.
- Generative AI
- AI that produces new content: text, code, images, audio. Auditor relevance: key risks are hallucination, prompt injection, data leakage, and IP exposure; controls include grounding, output review, and acceptable-use policy.
- GPAI (General-Purpose AI)
- The EU AI Act's term for general-purpose/foundation models, with extra duties for models posing systemic risk. Auditor relevance: triggers transparency, documentation, training-data summaries, and copyright-policy obligations; flag GPAI use in scope.
- Ground truth
- The verified correct labels or outcomes against which model output is compared. Auditor relevance: the quality and integrity of ground truth bounds the credibility of every performance metric.
- Groundedness
- The degree to which a generative model's output is supported by provided source material rather than invented. Auditor relevance: a measurable control against hallucination, especially in RAG systems; check grounding evaluation.
- Hallucination
- Confident but false or fabricated output from a generative model. Auditor relevance: controls include grounding/RAG, human review, confidence thresholds, and acceptable-use limits; request review logs.
- Human-in-the-loop (HITL)
- A design where a human reviews, approves, or can override AI decisions. Auditor relevance: a core oversight control for high-impact AI β verify it is meaningful, not undermined by automation bias.
- Hyperparameter
- A configuration value set before training (e.g., learning rate, tree depth) that shapes how a model learns. Auditor relevance: changes affect performance and reproducibility; expect them to be recorded and version-controlled.
- Inference
- The phase where a trained model produces predictions on new inputs in production. Auditor relevance: where real-world risk materializes; check input validation, monitoring, logging, and access control at inference time.
- Jailbreak
- Crafting prompts that bypass a model's safety guardrails to elicit prohibited output. Auditor relevance: tested via red-teaming; controls include layered guardrails, output filtering, and monitoring for abuse patterns.
K – O
- LIME
- Local Interpretable Model-agnostic Explanations: explains an individual prediction by fitting a simple model around it locally. Auditor relevance: evidence of explainability for case-level decisions and recourse.
- LLM (Large Language Model)
- A large model trained on vast text to generate and understand language. Auditor relevance: the engine behind most generative-AI use cases and its risks (hallucination, injection, leakage).
- Membership inference
- An attack that determines whether a specific record was in the training data. Auditor relevance: a privacy threat; controls include differential privacy, regularization, and limiting model memorization.
- MLOps
- Practices and tooling to deploy, monitor, and maintain ML models in production reliably. Auditor relevance: the operational backbone for change management, monitoring, versioning, and reproducibility evidence.
- Model card
- Standardized documentation of a model's intended use, performance, limitations, and ethical considerations. Auditor relevance: a key transparency artifact; check it is current, honest about limitations, and matches actual use.
- Model drift
- The general decay of model performance over time, caused by data drift, concept drift, or both. Auditor relevance: demands continuous monitoring with thresholds and retraining triggers; a one-time retrain is not a control.
- Model extraction
- An attack that steals a model by querying it repeatedly to reconstruct its behaviour. Auditor relevance: controls include rate limiting, query monitoring, and access restrictions on prediction APIs.
- Model inversion
- An attack that reconstructs sensitive training inputs from model outputs. Auditor relevance: a privacy threat for models trained on personal data; mitigations include output limiting and privacy-preserving training.
- Model Risk Management (MRM)
- The independent validation, challenge, and ongoing oversight of models, their assumptions, and limitations. Auditor relevance: provides the second-line independent challenge; verify validation is independent of the build team.
- NIST AI RMF
- The NIST AI Risk Management Framework, a voluntary US framework structured as GOVERN, MAP, MEASURE, MANAGE, plus seven trustworthy-AI characteristics. Auditor relevance: excellent outcome-based audit criteria, but voluntary, not proof of legal compliance.
- Overfitting
- When a model learns training-data noise and fails to generalize to new data. Auditor relevance: signalled by strong training but weak test/production performance; check train/validation/test discipline and monitoring.
P – T
- Precision
- Of the items the model flagged positive, the fraction that truly are positive (limits false positives). Auditor relevance: the metric to emphasize when false positives are costly (e.g., wrongful fraud blocks).
- Prompt injection
- Malicious instructions hidden in input or retrieved content that hijack an LLM's behaviour. Auditor relevance: a top generative-AI threat; controls include input/output filtering, privilege separation, and not trusting retrieved content blindly.
- Recall
- Of all true positives that exist, the fraction the model caught (limits false negatives). Auditor relevance: the metric to emphasize when missing a positive is costly (e.g., missed fraud or disease).
- Red-teaming
- Structured adversarial testing to find failures, harmful outputs, and security weaknesses before attackers do. Auditor relevance: strong evidence of robustness and safety testing, especially for generative and high-risk systems.
- Reinforcement learning
- Training an agent through reward and penalty as it interacts with an environment. Auditor relevance: reward design can produce unintended behaviour; check objectives, guardrails, and testing of edge cases.
- Reproducibility
- The ability to recreate a model's results given the same data, code, and configuration. Auditor relevance: required for credible validation and investigation; expect versioned data, code, seeds, and hyperparameters.
- RAG (Retrieval-Augmented Generation)
- Augmenting an LLM with retrieved documents at query time to ground its answers. Auditor relevance: a key hallucination control; verify source quality, access control on the knowledge base, and groundedness evaluation.
- Robustness
- A model's ability to maintain performance under noise, distribution shift, or adversarial input. Auditor relevance: a trustworthy-AI characteristic; evidenced by stress and adversarial testing.
- Shadow deployment
- Running a new model alongside production on real traffic without serving its outputs, to compare safely. Auditor relevance: a change-management control to validate before cutover; check the comparison criteria and sign-off.
- SHAP
- SHapley Additive exPlanations: attributes a prediction to its input features using game-theoretic Shapley values. Auditor relevance: widely used explainability evidence for both global and case-level transparency.
- Supervised learning
- Learning from labelled data to predict outcomes (classification, regression). Auditor relevance: risks centre on label quality, training-data bias, and drift; check provenance and train/test separation.
- Synthetic data
- Artificially generated data that mimics real data's statistical properties. Auditor relevance: can reduce privacy and scarcity issues but may embed bias or fail to represent edge cases; validate fidelity and residual re-identification risk.
- Three lines of defense
- A risk model where the 1st line owns and operates controls, the 2nd line (risk/compliance/MRM) sets policy and challenges, and the 3rd line (internal audit) gives independent assurance. Auditor relevance: protect independence β audit must not own, build, or validate the models it audits.
- Tokens
- The chunks of text an LLM processes; cost and context limits are measured in tokens. Auditor relevance: context-window limits affect reliability and what evidence a model can consider; relevant to cost and data-handling controls.
- Training / validation / test split
- Partitioning data so the model trains on one set, is tuned on another, and is evaluated on an unseen third. Auditor relevance: prevents over-optimistic results and leakage; verify the split is genuine and test data stayed unseen.
- Transparency
- Making meaningful information about an AI system (its existence, purpose, data, and decisions) available to stakeholders. Auditor relevance: a trustworthy-AI characteristic and a legal duty (e.g., EU AI Act, GDPR); check disclosures and documentation.
- Trustworthy AI
- AI that is valid & reliable, safe, secure & resilient, accountable & transparent, explainable & interpretable, privacy-enhanced, and fair with harmful bias managed (NIST's seven characteristics). Auditor relevance: a ready checklist for audit criteria.
U – Z
- Underfitting
- When a model is too simple to capture the data's patterns, performing poorly on both training and new data. Auditor relevance: signals inadequate model or features; relevant to fitness-for-purpose, not just monitoring.
- Unsupervised learning
- Finding structure in unlabelled data (clustering, anomaly detection) with no ground truth. Auditor relevance: harder to validate; ask how outputs are validated and acted upon before they drive decisions.
- Vector database
- A store optimized for similarity search over embeddings, central to RAG and semantic search. Auditor relevance: may hold sensitive content; check access control, encryption, and data-retention/deletion governance.
- XAI (Explainable AI)
- The field of techniques (LIME, SHAP, and others) that make model behaviour interpretable. Auditor relevance: supplies the evidence behind the explainability characteristic and supports recourse for affected individuals.
- Zero-shot learning
- A model performing a task it was not explicitly trained for, with no task-specific examples. Auditor relevance: capability claims need validation on the actual use case; check evaluation before relying on untested behaviour.
Audit terms
- Tests of design vs operating effectiveness
- A test of design asks whether a control, if it works, would address the risk; a test of operating effectiveness asks whether it actually works consistently over time. Auditor relevance: a control can be well-designed yet fail in operation; assess both, and only conclude after testing operation.
- Sufficient & appropriate evidence
- Sufficient is about quantity; appropriate is about quality (relevance and reliability). Auditor relevance: conclusions must rest on enough evidence of the right quality; independent, corroborated evidence outweighs management assertion.
- Materiality
- The threshold above which an error, weakness, or risk would influence the decisions of report users. Auditor relevance: focuses scope and effort on what matters; helps prioritize findings by impact rather than treating all equally.
- The 4 Cs finding structure
- A way to write a finding: Condition (what is), Criteria (what should be), Cause (why the gap exists), and Consequence/Effect (the risk or impact), leading to a recommendation. Auditor relevance: a clear, defensible finding addresses the root cause and ties to the risk, exactly what the exam rewards.
- Inherent vs residual risk
- Inherent risk is risk before controls; residual risk is what remains after controls. Auditor relevance: residual risk must be formally accepted by a named, accountable owner within risk appetite; risk accepted by no one is a finding.
- Independence & objectivity
- The auditor's freedom from conflicts that would bias judgement (independence) and unbiased mental attitude (objectivity). Auditor relevance: internal audit (3rd line) must not own, build, or validate the AI it audits; protect this in every answer.