Mercedez Lopez -

EthAiSynHuman-AI Integration ArchitecturePreserving Human Judgment in Human-AI SystemsA Mixed-Methods Measurement Framework for Detecting Cognitive,Moral, and Skill Erosion Over TimeA Proposal for the EthAiSyn Research ProgramMarch 2026 | Version 3.0 | Updated Research FoundationAbstractAs artificial intelligence becomes embedded in consequential decision workflows, a critical measurement problem has emerged: assistance and erosion produce identical surface behavior in the short term. A human whose judgment is being well-augmented and a human whose judgment is quietly deteriorating will both appear to be using AI effectively at any single point in time. The divergence only becomes visible under conditions of AI withdrawal, longitudinal decay analysis, or high-stakes failure.This paper presents a comprehensive, validity-grounded measurement framework designed specifically to detect the difference between augmentation and erosion in human-AI systems. The framework operationalizes seven constructs --- mental model gaps, judgment displacement, trust calibration, cognitive load distribution, moral diffusion, deskilling, and override behavior --- using a longitudinal, mixed-methods architecture that triangulates behavioral signals, self-report instruments, and qualitative process evidence. It establishes seven sentinel indicators, maps each to behavioral signatures and intervention triggers, and provides a governance structure for operational deployment.The framework advances a core claim: organizations should not ask whether humans are performing well with AI. They should ask whether humans could still perform without it, whether they know when the AI is wrong, whether they are still contributing something the AI cannot, and whether they still feel and act as responsible agents for outcomes. The answers to those four questions determine whether a human-AI system is genuinely augmenting human judgment or systematically borrowing from it.This version incorporates findings from eight peer-reviewed sources published between 2023 and 2026, including the first field study of clinician AI trust formation in a live mental health deployment, a 30-year systematic review of trust calibration research, neuroscientific evidence on metacognitive sensitivity in AI collaboration, and a 50-state legislative review documenting the regulatory vacuum in mental health AI governance.Keywords: human-AI teaming, judgment preservation, cognitive ergonomics, trust calibration, moral diffusion, deskilling, automation bias, metacognitive sensitivity, digital therapeutic alliance, implementation science, mixed-methods measurement, AI ethics, human factors, psychological safety1. IntroductionThe rapid integration of artificial intelligence into professional decision-making environments has outpaced the development of measurement tools adequate to evaluate its effects on the humans who use it. Current evaluation practice focuses primarily on system-level outcomes: accuracy, throughput, error rates, and user satisfaction. These metrics can detect whether a human-AI system is producing good results. They cannot detect whether the humans within that system are retaining the cognitive and moral capacities to produce good results independently.This gap matters for reasons that are both practical and ethical. Practically, any AI system can fail --- through model drift, distribution shift, adversarial inputs, or simple edge-case failure. When it does, the humans in the loop must be capable of detecting the failure and intervening effectively. A system that has quietly eroded operators' independent skill, mental model accuracy, or corrective override capacity will be most vulnerable precisely at the moments it most needs capable human oversight.Ethically, the question of whether humans remain genuine agents in AI-assisted decisions --- rather than performers of a rationalization function after the AI has effectively decided --- is central to accountability, responsibility, and the integrity of professional judgment in high-stakes domains including healthcare, law, finance, and personnel decisions.The measurement problem is made structurally difficult by a core validity challenge: displacement is not the same as assistance, but they produce identical surface behavior in the short term. The degradation is slow, contextually embedded, and actively rationalized by the humans experiencing it. Standard productivity metrics cannot see this distinction. Neither can one-time surveys. The measurement architecture required to detect it must be longitudinal, withdrawal-sensitive, and multi-method.This paper presents such an architecture, grounded in validity theory and organized around seven constructs that together constitute a working definition of preserved human judgment. The framework is intended as both a research instrument and an operational governance tool, applicable across enterprise domains including healthcare administration, claims review, triage, hiring, fraud detection, and prior authorization.The Research FoundationThis framework is grounded in a cross-disciplinary evidence base spanning eight peer-reviewed sources: a 30-year systematic review of 96 trust calibration studies (Wischnewski et al., 2023); controlled evidence on adaptive explainability and error detection (Tennakoon et al., 2025); neuroscientific findings on metacognitive sensitivity in joint decision-making (Lee et al., 2025); field study evidence from a live clinical AI deployment (Kelly et al., 2025); industry evidence on intentional infrastructure requirements (Strudwick et al., 2025); a 50-state regulatory review (Shumate et al., 2025); a conceptual synthesis of jagged AI capabilities and System 0 positioning (Saßmannshausen & Wagener, 2026); and practitioner evidence on the psychologist gap in AI design (JMIR Human Factors, 2021, 2024, 2025).2. Background and Theoretical Grounding2.1 The Augmentation-Erosion ProblemResearch on human-automation interaction has long documented the phenomenon of automation bias --- the tendency to over-rely on automated systems and under-weight independent judgment (Parasuraman & Manzey, 2010). More recently, scholarship on appropriate reliance has refined this concern, distinguishing between misuse (following AI when it is wrong), disuse (resisting AI when it is correct), and the broader question of whether reliance reflects genuine calibration or mere compliance (Schemmer et al., 2022).Parallel work on out-of-the-loop effects demonstrates that humans who operate as passive monitors of automated systems lose the situational awareness and skill activation needed to intervene effectively when automation fails (Endsley, 1995; Onnasch et al., 2014). This effect is not limited to attention; it extends to domain skill, confidence calibration, and the felt sense of agency and accountability.A critical and underappreciated finding from recent field research establishes the organizational scale of this problem. A systematic review of 96 empirical studies spanning 30 years of trust calibration research found that after three decades of investigation, not a single study had been conducted in an actual workplace (Wischnewski et al., 2023). Every finding the field has produced about how humans calibrate trust in automated systems comes from a lab or controlled online environment. The gap between research and operational reality is not a limitation of the evidence --- it is the evidence. Organizations are making consequential AI deployment decisions without validated field data on what those deployments do to human judgment over time.What is less well-developed is a unified measurement architecture that captures all of these effects simultaneously, across a longitudinal design sensitive to their trajectory, in a form applicable to the enterprise AI contexts where they are now most consequential. This framework is that architecture.2.2 Jagged Intelligence and the System 0 ProblemGenerative AI systems exhibit what Saßmannshausen and Wagener (2026) term jagged intelligence: superhuman performance on some tasks combined with brittle, often opaque failures on others, with the capability boundary invisible, unpredictable, and continuously shifting. This structural property of AI systems has direct implications for measurement: any framework that treats AI capabilities as stable, inspectable, or bounded by user expertise will systematically fail to capture the actual risk profile of human-AI collaboration.Compounding this, AI systems increasingly function as what Chiriatti et al. (2025) call a pre-cognitive System 0 --- shaping what information enters human awareness before deliberate evaluation can occur. When AI provides filtered information, suggested framings, and confident-sounding outputs, it operates prior to System 1 (intuition) and System 2 (deliberation) in the cognitive processing sequence. This pre-cognitive positioning explains a finding that has troubled the field: even interventions designed to promote critical evaluation of AI output often fail, because by the time evaluation begins, the AI has already structured the cognitive landscape in which that evaluation occurs.Why Standard Evaluations Miss ThisThe transparency paradox (BaHammam, 2025) and the jagged intelligence problem (Saßmannshausen & Wagener, 2026) converge on a single measurement implication: performance metrics taken in AI-present conditions cannot detect the erosion they are supposed to measure, because AI's influence on cognition precedes the behaviors those metrics observe. Measurement must occur before AI input, after AI input, and under AI-withdrawal conditions to generate valid signal about what the AI is actually doing to human judgment.2.3 Trust Calibration as Infrastructure, Not AttitudeThe research literature distinguishes warranted trust --- trust that accurately matches actual system reliability --- from unwarranted trust in either direction (Wischnewski et al., 2023). This distinction is not merely semantic. A 2023 CHI systematic review synthesizing 96 empirical studies found that calibrated trust is a continuous measurement challenge, not a one-time orientation outcome, and that static interventions such as training or disclosure statements reliably fail to produce durable calibration.Controlled evidence supports the measurement program's adaptive logic: a study of 360 participants reviewing 5,000 AI-generated outputs found that adaptive explanations --- tailored to user expertise, output confidence, and contextual risk --- improved error detection rates by up to 16% over no-explanation conditions, without increasing decision time (Tennakoon et al., 2025). Critically, static explanations produced the same time cost as adaptive explanations with significantly lower calibration improvement --- confirming that explanation type, not explanation presence, drives the effect.Research from cognitive neuroscience adds a mechanism: human calibration of trust in AI is mediated by metacognitive sensitivity --- the degree to which an individual's confidence accurately tracks their actual performance (Lee et al., 2025). High metacognitive sensitivity enables optimal joint decisions; low metacognitive sensitivity renders confidence ratings useless as trust calibration signals. Preliminary evidence suggests that LLMs exhibit task-specific metacognitive sensitivity that is comparable to or exceeds humans in some domains and falls well below human performance in others --- a pattern consistent with jagged intelligence and directly measurable through the framework's trust calibration construct.2.4 The Psychologist GapA mapping review of human factors research in healthcare AI found that AI development studies have systematically ignored two critical factors: ecological validity and human cognition (JMIR Human Factors, 2021). A systematic review of explainable AI research in clinical settings found a notable absence of psychologists from the research literature --- a gap the authors explicitly named as limiting the field's ability to improve appropriate uptake and adoption of AI (JMIR AI, 2024).This absence is not incidental. The questions that matter most in human-AI integration --- how trust forms, what conditions produce psychological safety for AI disclosure, how professional identity interacts with AI adoption, how moral agency is maintained under automation pressure --- are psychological questions. They require psychological expertise to investigate and design for. The EthAiSyn framework and the Human-AI Integration Architect role it generates exist at this gap.2.5 The Construct of Judgment PreservationWe define judgment preservation as a multi-dimensional criterion state in which five conditions are jointly satisfied. Operators still understand the AI system's functional boundaries and known failure modes. They still discriminate correctly when to rely on versus resist AI output. They retain independent task skill that functions adequately when AI assistance is removed. They feel and act as responsible moral agents for outcomes. And they can intervene effectively when the model produces wrong or harmful recommendations.This definition is intentionally demanding. It is not satisfied by high assisted performance alone. An operator whose accuracy is excellent in AI-present conditions but who cannot perform the underlying task independently, does not understand the AI's failure modes, and attributes responsibility for outcomes to the system rather than to themselves has not preserved their judgment. They have outsourced it.The Core Diagnostic StandardThe most useful next question is not "are humans performing well with AI?" It is: can they still perform without it? Do they know when the AI is wrong? Do they still contribute something the AI cannot? Do they still feel and act responsible for the outcome? These four questions are the operational standard for judgment preservation assessment.3. Research FoundationThe EthAiSyn framework is grounded in an eight-source cross-disciplinary evidence base published between 2023 and 2026. Each source contributes a distinct empirical or theoretical layer. Together they constitute a convergent argument for the framework's core claims and the specific design decisions embedded in its measurement architecture.3.1 Trust Calibration: The 30-Year Evidence BaseWischnewski, Krämer, and Müller (2023), in a systematic review presented at the ACM CHI Conference on Human Factors in Computing Systems, synthesized 96 empirical human-subject studies conducted between 1992 and 2023 on the calibration of trust in automated systems. Their principal finding --- that not a single study had been conducted in an actual workplace --- establishes the fundamental gap that this framework is designed to fill. Their four-dimensional taxonomy of trust calibration interventions (exo versus endo; warranted versus unwarranted; static versus adaptive; capabilities versus process-oriented) directly informs the framework's measurement battery design and the distinction between calibration interventions that address symptoms versus those that address structure.3.2 Adaptive Explainability: Controlled EvidenceTennakoon, Danso, and Zhao (2025), published in the Journal on Artificial Intelligence, report a controlled study of 360 participants evaluating 5,000 AI-generated outputs across text, code, and image modalities. Their adaptive explainability engine --- adjusting explanation content based on user expertise, model confidence, and contextual risk --- improved error detection rates by up to 16 percentage points over no-explanation conditions and reduced mean squared calibration error by 33-51% across expertise groups. Critically, these improvements were achieved without increasing decision time, and the improvements were largest for novice users --- establishing that adaptive explanation design closes the expertise gap rather than merely reinforcing existing advantages.3.3 Metacognitive Sensitivity: Neuroscientific GroundingLee, Pruitt, Zhou, Du, and Odegaard (2025), published in PNAS Nexus, present a theoretical and empirical framework for metacognitive sensitivity in human-AI joint decision-making. Drawing on signal detection theory and perceptual metacognition research, they establish that confidence calibration alone is insufficient to support optimal joint decisions --- what matters is metacognitive sensitivity, the degree to which confidence accurately tracks accuracy on a trial-by-trial basis. Their analysis of LLM metacognitive behavior reveals task-specific patterns: LLMs exhibit overconfidence on some reasoning tasks, appropriate calibration on others, and performance superior to humans on still others. This jagged metacognitive profile, combined with evidence that AI hallucinations reflect metacognitive myopia --- failure to know what the system does not know --- provides the neuroscientific grounding for the framework's trust calibration construct.3.4 The Transparency Paradox: Institutional EvidenceBaHammam (2025), in a review published in Nature and Science of Sleep (PMC), documents what he terms the transparency paradox in AI-assisted academic writing: disclosure of AI assistance systematically reduces evaluators' assessment of work quality even when content is identical, with the quality penalty applied to the disclosure itself rather than to the work. Controlled research cited in this review found that evaluators with high writing confidence show the strongest anti-disclosure bias, suggesting that professional identity threat mediates the effect. The paper also documents that non-disclosure often stems from institutional conditions that punish honesty rather than from individual ethical failure --- a finding with direct implications for organizational AI governance design.3.5 The Gray Box Reframe: Practitioner VoiceMorris (2025), writing in AI in Eye Care, offers the practitioner-level reframe that the EthAiSyn framework requires for organizational adoption: human clinical decision-making is already an unexplained, unaudited, bias-carrying system. The question is not whether AI is more opaque than human judgment --- it is which system's biases are more detectable, more correctable, and more systematically monitored. This argument, offered by a 25-year clinical practitioner, provides the rhetorical foundation for EthAiSyn's key positioning claim: the transparency problem is not new with AI. It is newly visible.3.6 Jagged Intelligence and System 0: The Cognitive ArchitectureSaßmannshausen and Wagener (2026), in a peer-reviewed preprint published in Qeios, present the Triadic Framework for adaptive mental models in human-AI collaboration. Their three-layer architecture --- System Layer (how AI works), Collaboration Layer (how humans interact with it), and Metacognitive Layer (how humans conceptualize the relationship) --- maps directly onto the EthAiSyn framework's three levels of analysis. Their concept of jagged intelligence establishes that capability boundaries are invisible and moving, requiring adaptive mental models rather than fixed competency assumptions. Their concept of AI as a pre-cognitive System 0 --- shaping human awareness before deliberate thought begins --- provides the mechanistic explanation for why standard evaluation methods miss the erosion this framework is designed to detect.3.7 Clinician Trust Formation: The Field StudyKelly, Bhardwaj, Holmberg Sainte-Marie, Van de Ven, Melia, Williams, Mathiasen, and Nielsen (2025), published in JMIR Human Factors, report the first field study of clinician trust formation in a live AI mental health deployment. Their qualitative case study of clinical psychologists using an AI screening model in a regional Danish psychiatric service identified a three-stage trust journey: sense-making, risk appraisal, and conditional decision to rely. Critical findings include that trust was contextually bounded to low-risk scenarios even when the model performed well; that intrinsic trust (based on reasoning alignment with clinical norms) was more durable than extrinsic trust (based on performance metrics); and that the concept of causability --- the degree to which AI explanations support causal understanding rather than mere correlation --- was a necessary condition for clinical adoption. This field study represents exactly the evidence gap that Wischnewski et al. (2023) identified: it is the first peer-reviewed evidence of how trust actually forms in a real clinical AI deployment.3.8 Implementation Infrastructure: Organizational EvidenceStrudwick, Kassam, Torous, and Patenaude (2025), writing from Canada's largest mental health and addictions teaching hospital in JMIR Mental Health, provide the organizational evidence base for the framework's governance layer. Their central claim --- that successful adoption, scale-up, and sustainability of digital mental health innovations require intentional infrastructure, not just technology --- defines the gap that EthAiSyn addresses. Using the NASSS (Non-Adoption, Abandonment, Scale-Up, Spread, and Sustainability) implementation science framework, they document seven domains where digital mental health consistently fails to scale, none of which are reducible to technology quality. Shumate et al. (2025), in a 50-state legislative review published in JMIR Mental Health, document that most state AI legislation treats mental health as incidental to broader AI regulation, that explicit mental health provisions are rare, and that clinician and patient perspectives are seldom incorporated into policymaking --- establishing the regulatory vacuum that makes organizational-level governance frameworks both necessary and urgent.Research SynthesisThese eight sources converge on a single conclusion: the measurement and governance infrastructure for responsible human-AI integration in high-stakes professional contexts does not yet exist at scale. The technology exists. The pilots exist. The evidence about what is at stake exists. What does not exist is the organizational architecture, the trained personnel, and the validated measurement instruments required to carry AI from demonstration to durable, safe, and equitable practice. EthAiSyn is that architecture.4. The Measurement Framework: Seven ConstructsThe framework operationalizes judgment preservation through seven constructs. Each construct is defined, theoretically grounded, and measured through behavioral, self-report, and qualitative signals. The sentinel indicator produced by each construct is the operationalized threshold that triggers governance review. No single construct settles the judgment preservation question. The diagnostic signal lives in convergence across all seven.4.1 Construct 1: Mental Model GapsMental model gaps measure the divergence between how the AI actually functions and how the operator believes it functions. The gap is not a static property --- it should narrow over time as genuine calibration occurs. A gap that widens, plateaus, or collapses into learned helplessness is the risk signature. Relevant research: Saßmannshausen & Wagener (2026) establish that LLM capabilities are only discoverable through use, observation, and experimentation --- not through inspection or training alone. This has direct implications for how mental model assessment must be designed.Behavioral Measures Anticipatory Prediction Mapping: Before showing AI output on complex cases, require operators to predict the AI's recommendation and its likelihood of being correct. Score the divergence between human prediction and actual AI output over time. Systematic mismatch analysis: Track when users rely on AI versus when AI is actually strong or weak by case type. Persistent reliance in low-accuracy zones is a direct behavioral signature of model misunderstanding. Self-Report and Elicitation Measures Structured teach-back: Regularly ask operators to explain what the AI does well, what it does poorly, and under what conditions. Score responses against an expert rubric covering: inputs, decision rule, confidence meaning, known failure modes, and when not to use the model. Concept mapping: Ask operators to draw or describe who (human vs. AI) handles which information, checks, and final decisions. Code for complementarity versus over-delegation. Counterfactual items: "If we turned the AI off, which parts of this task would become harder, easier, or unchanged?" Answers reveal implicit mental model assumptions. Sentinel Indicator: Mental Model Gap ScoreWarning threshold: widening or plateaued gap across consecutive time points. Governance trigger: two consecutive quarters of widening gap despite normal operations.4.2 Construct 2: Judgment DisplacementJudgment displacement measures the degree to which operators' final decisions reflect independent judgment versus AI-anchored adjustment. The critical measure is not whether operators follow the AI --- sometimes they should --- but whether their movement toward AI output is appropriate given the evidential weight of their own independent assessment versus the AI's reliability in that case type.Behavioral Measures Weight-of-advice protocol: Capture independent judgment before AI output exposure, then final decision after. Calculate movement toward AI as a proportion of initial human-AI discrepancy. Track whether movement correlates with actual AI accuracy by case type. Confidence asymmetry: Does operator confidence increase when following AI but not when overriding it (regardless of outcome)? This asymmetry is a behavioral signature of displacement rather than calibration. Sentinel Indicator: Judgment Displacement IndexWarning threshold: positive slope over three or more sessions; weight-of-advice consistently above 0.6 regardless of AI reliability zone.4.3 Construct 3: Trust CalibrationTrust calibration measures the alignment between operator trust in AI output and actual AI reliability by case type. Calibration is not fixed --- it should be continuously updated as operators gain experience and as AI performance varies. Static calibration, calibration that does not respond to performance feedback, or calibration that tracks confidence rather than accuracy are all warning signatures. Research grounding: Wischnewski et al. (2023) establish that warranted trust --- trust that accurately matches system reliability --- is the measurement target, and that static interventions reliably fail to produce durable calibration.Behavioral Measures Brier scoring: Require operators to provide probability estimates of AI correctness before seeing outcomes. Track calibration curves over time. Planted-failure trials: Insert known wrong AI recommendations. Track corrective override rate as a function of planted-failure salience and AI confidence display. Reliability-manipulation conditions: Temporarily adjust AI accuracy without operator notification. Measure lag time to calibration adjustment. Sentinel Indicator: Trust Calibration ErrorWarning threshold: calibration curve divergence growing over time; no adjustment to reliability manipulation within two sessions.4.4 Construct 4: Cognitive Load DistributionCognitive load distribution measures how operators allocate limited cognitive resources across the human-AI workflow. The risk signature is not high cognitive load per se --- complex decisions should demand significant cognitive engagement --- but misallocation: excessive monitoring of AI output at the cost of primary evidence evaluation. Research grounding: Virtual nursing workflow analysis found first cognitive fatigue onset at 9.8 minutes --- establishing that cognitive load accumulation is measurable, consequential, and occurring within single work sessions (Khairat et al., 2025, JMIR Human Factors).Behavioral Measures Time-on-task decomposition: Measure time spent on pre-AI evidence review versus post-AI adjustment. Track ratio over time. Eye-tracking or attention proxies: Where available, track attention allocation across workflow stages. NASA-TLX adapted: Modified for AI-workflow context to distinguish mental demand from AI monitoring versus primary task. Sentinel Indicator: Cognitive Redistribution RatioWarning threshold: ratio increasing with stable or declining independent accuracy --- indicating that monitoring AI is crowding out primary evidence engagement.4.5 Construct 5: Moral DiffusionMoral diffusion measures the degree to which operators still locate themselves as accountable agents for AI-assisted decisions. The risk is not that operators acknowledge AI contribution --- they should --- but that the presence of AI becomes a mechanism for moral disengagement: diffusing responsibility to the system, the organization, or the algorithm itself.Behavioral Measures Outcome attribution task: Following specific decisions, ask operators to allocate causal and moral responsibility across themselves, the AI, the organization, and the situation. Track allocation patterns over time under identical outcome conditions. Error response behavior: When AI-assisted decisions produce poor outcomes, does the operator investigate their own contribution or default to system critique? Self-Report Measures Moral disengagement indicators: Adapted from Bandura's (1999) moral disengagement mechanisms. Track endorsement of mechanisms like moral justification ("the AI made the call"), displacement of responsibility ("the organization approved this workflow"), and dehumanization of impact ("these are just cases, not people"). Sentinel Indicator: Moral Diffusion IndexWarning threshold: responsibility allocation shifting toward AI or organization by more than 15 points over baseline under identical outcome conditions.4.6 Construct 6: DeskillingDeskilling measures the decline in unassisted task performance over time as AI assistance becomes routine. The risk is not that operators use AI for routine cases --- appropriate automation of routine work is a legitimate efficiency gain --- but that skill atrophy in unassisted conditions compromises their capacity to handle edge cases, unusual presentations, and system failure scenarios.Behavioral Measures Withdrawal block performance: Measure task performance under timed AI-withdrawal conditions at each time point. Track trajectory across the study period. Transfer task performance: Use structurally novel cases that require underlying domain skill to solve, administered without AI. Track accuracy and confidence calibration. Decomposed skill assessment: Identify the component skills (evidence recognition, differential generation, threshold application, documentation) and assess each independently under AI-absent conditions. Sentinel Indicator: Deskilling SlopeWarning threshold: any negative slope on unassisted accuracy sustained across two or more time points.4.7 Construct 7: Override BehaviorOverride behavior is the most ecologically valid signal in the measurement battery. When operators override AI recommendations with evidence-based corrections, it indicates that they are processing primary evidence independently, maintaining an adequate mental model of AI reliability, feeling morally located in the outcome, and retaining sufficient skill to recognize and correct AI error. Research grounding: Tennakoon et al. (2025) found that adaptive explainability increased error detection rates --- a direct measure of override quality --- by 16% across expertise levels without time penalty, establishing that override quality is both measurable and improvable.Behavioral Measures Override rate by case type: Track the frequency of AI overrides as a function of AI reliability in that case type. Correct calibration predicts higher override rates in low-reliability zones. Override accuracy: For each override, compare operator's final decision to ground truth. Track proportion of overrides that were correct. Override latency: Time from AI output presentation to override decision. Both very fast (impulsive override) and very slow (agonizing) override patterns are informative. Sentinel Indicator: Override Quality RateWarning threshold: declining override frequency combined with declining override accuracy; corrective override accuracy below 50%.5. The Seven Sentinel IndicatorsEach construct produces a sentinel indicator: a quantified threshold that, when crossed, triggers governance review. The indicators function both individually and collectively. A single indicator crossing its threshold warrants monitoring. Three or more indicators in the same direction within the same review period constitute a red flag convergence pattern requiring active intervention.Indicator What It Captures Warning ThresholdMental Model Gap Score Boundary and failure-mode misunderstanding; divergence between stated model and expert rubric Widening or plateaued gap across consecutive time pointsJudgment Displacement Index Movement toward AI after controlling for initial judgment and evidence weight Positive slope over 3+ sessions; weight-of-advice consistently above 0.6Trust Calibration Error Mismatch between expected AI accuracy, actual AI accuracy, and reliance choices Calibration curve divergence growing over time; no adjustment to reliability manipulationCognitive Redistribution Ratio Attention and time on AI monitoring versus primary evidence evaluation Ratio increasing with stable or declining independent accuracyMoral Diffusion Index Dispersion of responsibility attribution away from the human decision-maker Responsibility allocation shifting toward AI or organization by more than 15 points over baselineDeskilling Slope Decline on unassisted and transfer task performance across sessions Any negative slope on unassisted accuracy sustained across 2+ time pointsOverride Quality Rate Proportion of evidence-based corrective overrides of wrong AI recommendations Declining rate combined with declining override frequency; corrective override accuracy below 50%Red Flag ConvergenceThree or more sentinel indicators trending in the same direction (toward erosion) within the same quarterly review period constitutes a red flag convergence pattern requiring active organizational intervention. No individual indicator is sufficient; the convergence pattern is the diagnostic signal.6. Clinical Domain Applications6.1 Mental Health AI: The Trust JourneyThe most clinically significant recent evidence for this framework comes from a field study of AI deployment in a live mental health setting (Kelly et al., 2025). Clinical psychologists using an AI screening model in a Danish psychiatric service demonstrated a three-stage trust journey: sense-making (initial encounter with the system and formation of preliminary mental models), risk appraisal (evaluation of the system's boundaries and known failure modes), and conditional decision to rely (bounded trust in specific low-risk contexts contingent on safety protocols).Several findings from this study have direct measurement implications. First, trust was contextually bounded --- clinicians trusted the model for pre-interview screening but not for complex clinical formulation --- a pattern this framework would classify as appropriately calibrated rather than as insufficient adoption. Second, intrinsic trust (based on reasoning alignment) proved more durable than extrinsic trust (based on performance metrics), supporting the framework's emphasis on mental model assessment over outcome tracking alone. Third, the concept of causability --- the degree to which AI explanations supported causal understanding rather than mere correlation --- emerged as a necessary condition for sustained clinical adoption. Organizations that deploy AI without supporting causable explanation are building on extrinsic trust that will not survive the first significant error.6.2 The Digital Therapeutic AllianceA systematic review of AI-powered mental health chatbots identified the emerging construct of digital therapeutic alliance (DTA) --- the degree to which AI-mediated interactions can replicate the trust, collaboration, and goal-alignment that characterize therapeutic relationships between clinicians and patients (Malouin-Lachance et al., 2025, JMIR Mental Health). The DTA construct has direct implications for the framework's moral diffusion and trust calibration measures: when AI systems are positioned as therapeutic agents, the human operator's role becomes ambiguous, and moral location for outcomes becomes genuinely contested.The Woebot shutdown in July 2025 --- the most prominent AI therapy chatbot in history --- illustrates the consequence of deploying AI in therapeutic contexts without resolving these questions in advance. The shutdown was not driven by technical failure but by unresolved accountability, scope-of-practice boundaries, and the limits of AI in high-stakes human relationships --- exactly the integration architecture questions this framework addresses.6.3 Healthcare Administration: Prior Authorization and Benefits VerificationPrior authorization and benefits verification workflows represent an ideal measurement context for this framework. The AI operates in a rule-intensive, high-volume, high-consequence domain where human judgment is consequential, AI failure modes are documentable, and the cost of undetected deskilling or trust miscalibration is measurable in patient access, financial outcome, and organizational liability.The specific risk profile in prior authorization includes: AI overgeneralization of payer rules across plan types and carve-outs; failure to account for patient-specific exception criteria; and confident presentation of incorrect determinations that human reviewers, under volume pressure, accept without independent verification. The deskilling risk is particularly salient: reviewers who have processed thousands of cases with AI assistance may lose the granular plan knowledge that enables them to catch the edge cases AI misclassifies.This risk is structurally identical to the atypical presentation risk in clinical triage: the AI operating outside its reliable competence zone while the human operator has lost the independent judgment to catch the error. The measurement program detects this pattern in both contexts using the same analytical logic. This cross-domain consistency is not coincidental --- it is the diagnostic signature of judgment erosion, appearing wherever AI is deployed in high-volume, rule-intensive professional workflows.7. The Human-AI Integration Architect RoleThe EthAiSyn framework generates a new organizational function that does not exist before its arrival. The Human-AI Integration Architect is not a technology role, a compliance role, or a training role. It is a role that operates across all three simultaneously, grounded in the psychological theory of how humans form, maintain, and miscalibrate trust in systems that share their cognitive workspace.Research support for this role is now documented across multiple peer-reviewed sources. Wischnewski et al. (2023) establish that trust calibration requires continuous, adaptive intervention --- not one-time training. The JMIR AI systematic review (2024) explicitly names the absence of psychologists from AI design as a field-level gap. Strudwick et al. (2025) establish that successful digital mental health implementation requires intentional infrastructure and trained personnel. Torous et al. (2025) document that the digital navigator role --- the practical implementation of what the Integration Architect does --- has been called for since 2015 and remains largely unfilled.7.1 Core Functions Construct mapping: Translating the seven abstract measurement constructs into domain-specific behavioral signatures and case taxonomies for a given organizational context. Measurement instrumentation: Designing the behavioral logging protocols, withdrawal conditions, and planted-failure trials that generate the data the measurement battery requires. Governance architecture: Establishing the review cadence, threshold triggers, and organizational response structure that converts measurement data into accountability. Mental model calibration: Facilitating the structured reflection processes --- teach-backs, concept mapping, counterfactual probing --- that support operators in maintaining accurate mental models of AI system behavior. Trust journey design: Engineering the organizational conditions --- trialability, contextually bounded initial deployment, ongoing evaluation data provision --- that support appropriate trust formation over time. Ethics by design: Ensuring that psychological safety, disclosure conditions, equity considerations, and human dignity are embedded in deployment architecture from the outset rather than retrofitted after harm. 7.2 What This Role Is NotThe Human-AI Integration Architect is not an AI trainer who teaches people how to use tools. It is not a data analyst who monitors system performance metrics. It is not a compliance officer who ensures policy adherence. And it is not a communications professional who manages messaging about AI. It is the person who designs and governs the conditions under which humans and AI systems can work together without the humans losing what makes their contribution irreplaceable.The Positioning DistinctionEthAiSyn does not build the AI. It designs the conditions under which humans can use AI safely, maintain appropriate trust, preserve their independent judgment, and remain genuine moral agents for the outcomes their AI-assisted work produces. That is integration architecture. It is a different job from AI development, AI training, and AI compliance --- and it is the job the field has been calling for without yet naming it.8. Regulatory Context and Governance MandateA 50-state legislative review of mental health AI regulation found that most state laws treat mental health as incidental to broader AI or healthcare regulation, that explicit mental health provisions are rare, and that clinician and patient perspectives are seldom incorporated into policymaking (Shumate et al., 2025). The result is a fragmented and uneven environment that risks leaving patients unprotected and clinicians overburdened.This regulatory vacuum is not a reason to wait for governance clarity before deploying measurement frameworks. It is a reason to deploy organizational-level governance frameworks now, before regulatory requirements crystallize in forms that may not reflect clinical reality. The organizations that develop internal judgment preservation measurement programs before regulatory mandates arrive will be better positioned to shape those mandates --- and will have the operational evidence to demonstrate that their AI deployments are responsible rather than merely compliant.Relevant regulatory developments inform the governance architecture of this framework. The EU AI Act classifies AI systems used in healthcare, employment, and legal decisions as high-risk, requiring human oversight, transparency, and audit capability. US federal guidance on AI in healthcare emphasizes human-in-the-loop requirements and documentation of decision support boundaries. State-level employment AI legislation increasingly requires disclosure and impact assessment. These regulatory directions converge on the same organizational requirements the framework addresses: documented human oversight, calibration evidence, override capability, and accountability traceability.8.1 The Disclosure ProblemThe transparency paradox documented by BaHammam (2025) has a direct governance implication: disclosure requirements that penalize honesty will produce strategic non-disclosure, creating a hidden ecosystem of AI use that is invisible to both organizational governance and regulatory oversight. The framework's behavioral measurement approach --- logging AI use, override behavior, and decision patterns through system telemetry rather than self-report alone --- addresses this problem by making measurement independent of disclosure willingness.Organizations cannot build trustworthy AI integration on a foundation of strategic non-disclosure. The governance architecture must create conditions where honest disclosure is institutionally safe --- where acknowledging AI use, acknowledging uncertainty about AI limitations, and acknowledging override decisions does not carry professional penalty. This requires not just policy but culture design, which is a core function of the Human-AI Integration Architect role.9. Implementation SequenceThe sequence below reflects both the methodological validity requirements of the measurement program and the practical constraints of fielding it in real organizations. Skipping phases or compressing timelines produces instruments that look like they are working while measuring the wrong things.Phase 1: Concept Elicitation and Construct MappingConduct critical-incident interviews, cognitive task analysis, shadowing, and artifact review to define what good judgment looks like in the specific domain. Build a case taxonomy: identify the range of case types, difficulty levels, and AI reliability zones that will anchor the measurement battery. Produce a construct map that connects each of the seven constructs to domain-specific behavioral signatures. Do not draft items or tasks until this step is complete. Measurement validity begins with construct validity, and construct validity requires domain specificity.Phase 2: Instrument DevelopmentWrite items and task designs only after the construct map and case taxonomy exist. Develop the behavioral logging protocol for the product or platform. Adapt existing validated scales (NASA-TLX, trust-in-automation scales, Moral Disengagement Scale) for the specific AI context and operator population. Design the withdrawal conditions, planted-failure trials, and reliance-manipulation conditions.Phase 3: Cognitive Interviews and PilotVerify that operators interpret items, prompts, and task instructions as intended. Identify confounded or ambiguous items before piloting. Test the pre-AI judgment capture protocol specifically --- operators must understand they are expected to form an independent view before seeing AI output. Run realistic cases with pre/post judgment capture, trace logs, and think-alouds. Validate that the behavioral logging system produces clean, analyzable data.Phase 4: Psychometric TestingFactor structure and reliability for self-report scales. Multilevel stability across operators and over time. Convergent and discriminant patterns across constructs. Sensitivity to known manipulations. Consequences validation: test whether scores on the measurement battery actually predict unsafe reliance events, missed corrective overrides, skill decay, or accountability failures in the domain.Phase 5: Longitudinal OperationsThe study runs the same domain task under three conditions at each time point: Human-Only (baseline and residual skill), Human-AI Assisted (performance and behavioral patterns under normal operations), and AI Withdrawal/Stress Test (what survives when the AI is removed --- the strongest validity test). Minimum time points: T0 (pre- or low-AI integration), T1 (mid-integration), T2+ (ongoing, including at least one planned withdrawal block per year).Phase 6: Governance and ResponseLink sentinel indicator thresholds to specific organizational responses: enhanced coaching for individual operators, system re-evaluation for pattern-level flags, workflow redesign for sustained red flag convergence. Build in the governance authority to act on measurement findings --- a measurement program without institutional action capacity is data collection theater.10. Discussion10.1 Limitations and Boundary ConditionsThis framework is designed for high-stakes professional decision contexts where operator judgment is consequential and where the erosion of that judgment carries organizational and ethical risk. It is not designed for consumer AI applications, creative tools, or low-stakes productivity assistance. The validity of any measurement battery derived from this framework depends entirely on the quality of the domain-specific construct mapping performed in Phase 1.The framework also presupposes organizational access. The most powerful elements of the design --- longitudinal tracking, withdrawal blocks, behavioral telemetry, and planted-failure conditions --- require cooperation from the organizations operating the AI systems and cannot be implemented from outside. This is both a practical constraint and a governance implication: organizations that deploy AI in consequential decision contexts bear an affirmative responsibility to enable the evaluation of their systems' effects on human judgment.10.2 Future DevelopmentSeveral measurement challenges remain underdeveloped within this framework. The measurement of team-level judgment preservation --- in contexts where AI is embedded in collaborative decision-making rather than individual workflows --- requires additional theoretical development beyond the individual-operator focus of the present framework. The question of how AI explanation quality interacts with calibration and override behavior across operator expertise levels also warrants dedicated investigation. The development of standardized, cross-domain versions of the sentinel indicators, with normative benchmarks, would substantially improve the framework's operational utility.10.3 The Deeper ArgumentUnderlying the technical architecture of this framework is a claim about what AI systems in professional contexts are actually doing to the humans who use them. The field's dominant metaphor --- that AI is a tool that augments human capability --- treats the relationship as additive: the human plus the AI is more capable than either alone. This framework is built on a different and more cautious assumption: that the relationship is also transformative, and that the direction of transformation --- toward greater human capability or toward dependency, passivity, and moral disengagement --- is not determined by the technology alone but by how the technology is designed, deployed, and evaluated.Organizations that evaluate their AI systems only by what those systems produce are measuring the wrong thing. The human judgment that remains when AI systems are absent, wrong, or operating at the edge of their competence is the ultimate organizational resource that AI integration either protects or consumes. This framework is an instrument for telling the difference.11. ConclusionThis paper has presented a unified measurement framework for detecting whether human-AI systems preserve or erode human judgment over time. The framework is organized around seven constructs --- mental model gaps, judgment displacement, trust calibration, cognitive load distribution, moral diffusion, deskilling, and override behavior --- implemented through a longitudinal, withdrawal-sensitive, mixed-methods design. It produces seven sentinel indicators, a red flag convergence pattern for intervention triggering, a seven-phase implementation sequence, and a governance response structure linking measurement outcomes to organizational action.The framework advances a practical and non-negotiable standard for evaluating human-AI systems: measure not only what humans produce with AI, but what they can still produce without it. The system is preserving judgment only if performance stays high without hidden erosion in the sentinel indicators. If short-term assisted accuracy rises while mental model quality, unassisted performance, responsibility ownership, or corrective override rate trend downward, the system is quietly weakening what it was supposed to be strengthening.The most useful immediate next step for any organization implementing AI in a consequential professional workflow is to identify a specific domain, conduct the concept elicitation interviews described in Phase 1, and build the case taxonomy that transforms this general framework into a specific, valid measurement instrument for that context. The architecture presented here provides the structural logic. Domain specificity provides the validity.ReferencesAmerican Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. AERA.BaHammam, A. S. (2025). The transparency paradox: Why researchers avoid disclosing AI assistance in scientific writing. Nature and Science of Sleep, 17, 2569--2574. https://doi.org/10.2147/NSS.S568375Bandura, A. (1999). Moral disengagement in the perpetration of inhumanities. Personality and Social Psychology Review, 3(3), 193--209.Endsley, M. R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1), 32--64.Goddard, K., Roudsari, A., & Wyatt, J. C. (2012). Automation bias: A systematic review of frequency, effect mediators, and mitigators. Journal of the American Medical Informatics Association, 19(1), 121--127.Jacovi, A., Marasovic, A., Miller, T., & Goldberg, Y. (2021). Formalizing trust in artificial intelligence. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 624--635.Kelly, A., Bhardwaj, N., Holmberg Sainte-Marie, T. T., Van de Ven, P., Melia, R., Williams, J. E., Mathiasen, K., & Nielsen, A. S. (2025). Investigating how clinicians form trust in an AI-based mental health model: Qualitative case study. JMIR Human Factors, 12, e79658. https://doi.org/10.2196/79658Khairat, S., Morelli, J., Liao, W. T., Aucoin, J., Edson, B. S., & Jones, C. B. (2025). Association of virtual nurses' workflow and cognitive fatigue during inpatient encounters: Cross-sectional study. JMIR Human Factors, 12, e67111. https://doi.org/10.2196/67111Lee, D., Pruitt, J., Zhou, T., Du, J., & Odegaard, B. (2025). Metacognitive sensitivity: The key to calibrating trust and optimal decision making with AI. PNAS Nexus, 4(5), pgaf133. https://doi.org/10.1093/pnasnexus/pgaf133Malouin-Lachance, A., Capolupo, J., Laplante, C., & Hudon, A. (2025). Does the digital therapeutic alliance exist? Integrative review. JMIR Mental Health, 12, e69294. https://doi.org/10.2196/69294Morris, S. (2025). Transparency paradox: Questioning AI fears by examining ourselves. AI in Eye Care. https://aiineyecare.com/transparency-paradox/Onnasch, L., Wickens, C. D., Li, H., & Manzey, D. (2014). Human performance consequences of stages and levels of automation: An integrated meta-analysis. Human Factors, 56(3), 476--488.Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381--410.Ramachandram, D., Joshi, H., Zhu, J., Gandhi, D., Hartman, L., & Raval, A. (2025). Transparent AI: The case for interpretability and explainability. arXiv:2507.23535.Saßmannshausen, T. M., & Wagener, S. (2026). Rethink your mental model in the age of generative AI: A triadic framework for human-AI collaboration. Qeios. https://doi.org/10.32388/GAG6KDSchemmer, M., Hemmer, P., Kuehl, N., Benz, C., & Satzger, G. (2022). Should I follow AI-based advice? Measuring appropriate reliance in human-AI decision-making. Companion Proceedings of the 27th International Conference on Intelligent User Interfaces.Shumate, J. N., Rozenblit, E., Flathers, M., Larrauri, C. A., Hau, C., Xia, W., & Torous, J. (2025). Governing AI in mental health: 50-state legislative review. JMIR Mental Health, 12, e80739. https://doi.org/10.2196/80739Skitka, L. J., Mosier, K. L., & Burdick, M. (1999). Does automation bias decision-making? International Journal of Human-Computer Studies, 51(5), 991--1006.Strudwick, G., Kassam, I., Torous, J., & Patenaude, S. (2025). Building the infrastructure for sustainable digital mental health: It is "prime time" for implementation science. JMIR Mental Health, 12, e78791. https://doi.org/10.2196/78791Tennakoon, S., Danso, E., & Zhao, Z. (2025). Calibrating trust in generative artificial intelligence: A human-centered testing framework with adaptive explainability. Journal on Artificial Intelligence, 7(1), 517--547. https://doi.org/10.32604/jai.2025.072628Torous, J., & Cipriani, A. (2025). A paradigm shift in progress: Generative AI's evolving role in mental health care. JMIR Mental Health, 12, e82369. https://doi.org/10.2196/82369Torous, J., Ledley, K. T., Gorban, C., Strudwick, G., et al. (2025). Accelerating digital mental health: The Society of Digital Psychiatry's three-pronged road map for education, digital navigators, and AI. JMIR Mental Health, 12, e84501. https://doi.org/10.2196/84501Wischnewski, M., Krämer, N., & Müller, E. (2023). Measuring and understanding trust calibrations for automated systems: A survey of the state-of-the-art and future directions. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Article 755. https://doi.org/10.1145/3544548.3581197EthAi Syn | Human-AI Integration Architecture | Preserving Human Judgment in Human-AI Systems