Trustworthy AI? What AI Reveals About Human Trust Mechanisms and How It Redefines Them
by Eva Simone Lihotzky and Imane Berjamy
Abstract
The question of trust in artificial intelligence, if it is framed at all, is too often framed as a narrow inquiry into whether a given tool is reliable. We argue that this framing needs to be extended.
In contemporary AI environments, trust is no longer directed only toward a discrete system or output, but toward a layered assemblage of infrastructures, institutions, incentives, and cultural assumptions that remain largely opaque to users. The problem of trust in AI is therefore not merely technical, but at once organizational, political, epistemic, and anthropological.
This paper proposes that AI is transforming trust from a relational outcome into an infrastructural precondition. The distinction matters because, while interfaces become smoother and more socially persuasive, the technological systems behind them are evolving faster than human comprehension. As a result, the societal and organizational systems built around and with them fail to keep pace. We suggest that the central task is not simply to make AI more explainable, but to design institutions, interactions, and arrangements of use in which trust becomes deserved, contestable, and collectively intelligible.
Trust as an Infrastructural Condition
When we talk about AI and trust, we usually ask whether AI (be it as a tool, as infrastructure, or as the output it creates) can be trusted. The question assumes that trust is a stable quality that can be attributed in a yes-or-no fashion. But it is not.
Trust has always been multilayered. It exists at the interpersonal level, where individuals decide whether to rely on another person or system under conditions of uncertainty. It exists at the organizational level, where institutions shape accountability, routinize reliance, and decide what kinds of uncertainty are acceptable. And it exists at the societal level, where legal frameworks, infrastructures, public discourse, and geopolitical dependencies define the wider horizon within which trust becomes thinkable. This understanding resonates with Niklas Luhmann’s insight that trust functions as a mechanism for reducing social complexity: trust becomes necessary precisely where full knowledge is impossible and action must proceed despite uncertainty (Luhmann 1979). Trust thus guides reliance especially when complexity and unpredictability make complete understanding impractical.
Yet trust is not the opposite of uncertainty, nor is it merely a response to it. Research in developmental psychology, neuroscience, and evolutionary theory suggests that while trust is certainly useful in uncertain situations, its origins lie much deeper. Humans have evolved as highly social and cooperative organisms whose survival depended on their capacity to coordinate, cooperate, and rapidly establish social bonds. Trust is therefore deeply embedded in the cognitive architectures that support social life. The human brain continuously anticipates cooperation, interprets intentions, and projects relational meaning onto others. These mechanisms are not secondary cultural constructions; they are part of the neurocognitive conditions that made collective human life possible. In this sense, uncertainty does not create trust; rather, experience modulates an already existing disposition to trust.[1]
[1] Developmental psychologist Erik Erikson described this as “basic trust.” According to Erikson, humans enter the world with an initial orientation toward relational openness and dependence. Trust therefore comes before rational evaluation. Experiences throughout life may strengthen or erode this baseline disposition, but they do not create it from nothing.
The issue is therefore not simply whether AI systems are trustworthy, but how deeply they interact with pre-existing human predispositions toward trust and cooperation. Humans engage with AI through inherited mechanisms originally evolved for human-to-human cooperation. This may explain why opaque AI systems can generate trust even in the absence of genuine understanding. Users frequently rely less on technical comprehension than on cognitive shortcuts such as fluency, familiarity, consistency, and perceived authority. Now, if we assume that people project coherence, intentionality, authority, and personality onto AI systems, especially when these systems communicate fluently, we can say that AI activates pre-existing cognitive mechanisms of social trust.
Therefore, in AI-mediated environments, trust increasingly shifts from an outcome of interaction to an input of use. One does not first understand a system and then decide to trust it. One must already trust enough to enter the system at all. Trust should therefore be examined where it is produced: in interface design, in defaults, in workflow integration, in the reduction of friction, and in the coherence of outputs that appear immediately useful.
This shift becomes particularly visible when AI systems are compared to the technologies that preceded them. In conversations with practitioners building AI assessment frameworks, a recurring observation is that traditional technologies are in essence deterministic: identical conditions reliably produce identical results, and the reasons are traceable. AI systems, by contrast, are probabilistic, adaptive, and frequently opaque: their outputs can change over time, sometimes in ways that resist anticipation even by those who built them. Trust in such systems cannot be verified once and then held. It must be continuously re-earned through ongoing testing, monitoring, and transparency. What makes this historically distinctive is the speed at which acceptance has formed regardless. It took ultrasound technology approximately ten years to gain sufficient trust for widespread clinical use. Generative AI achieved comparable social acceptance in a fraction of that time, but without equivalent institutional scaffolding having been built around it first. The pace of adoption has outrun the pace of accountability.
In the context of AI, several layers of trust come into play: one is asked to trust the platform that hosts the system, the data pipelines that sustain it, the cloud infrastructures that execute it, the organizational actors that procure and deploy it, and the regulatory environment that permits its circulation. What appears as a single “tool” is only the visible surface of a denser assemblage. As Kate Crawford has argued, AI is not an immaterial intelligence but a material and political infrastructure shaped by extractive supply chains, labor regimes, environmental costs, and concentrated ownership (Crawford 2021).
Further, trust increasingly resides in the space between capability and comprehension. AI systems are not only becoming more capable, but also more layered, while presenting themselves through increasingly seamless and familiar interfaces. The user encounters a conversational surface; what recedes from view are the infrastructures, incentives, values, and dependencies beneath it. Less visible still is the way the system learns from the interaction itself: if designed accordingly, every exchange can feed back into the model's understanding of what the user accepts, corrects, or ignores. Comprehension, in this sense, is not just obscured by opacity. It is gradually displaced by a system that adapts faster than the user's awareness of that adaptation.
In such a setting, trust can be destabilized not only by opacity, but by ongoing technological acceleration and increasing complexity itself. Systems evolve faster than humans, institutions, and legal frameworks can metabolize them. In this sense, trust breaks because people can no longer make sense of the speed and consequences of the systems on which they are already expected to rely. This is why the question of trust posed at the beginning of this paper cannot be reduced to a matter of personal confidence in a tool. The larger question is whether trust has become a central factor at the level of system design and governance. The OECD AI Principles define trustworthy AI in relation not only to robustness and security, but also to human rights, democratic values, transparency, accountability, and responsible stewardship (OECD 2019). Likewise, the NIST[2] AI Risk Management Framework emphasizes that public trust depends on managing risks across the entire lifecycle of AI systems, including social, organizational, and institutional dimensions, not merely technical performance (NIST 2023). These frameworks implicitly recognize that trust in AI is not exhausted by the question of whether an output is correct; it also depends on, among other things, who designed the system, under what incentives, for whose benefit, and with what capacity for contestation.
[2] U.S. National Institute of Standards and Technology (NIST). 2023. AI Risk Management Framework (AI RMF 1.0).
Complexity, Nuance, and Miscalibrated Reliance
The widespread assumption that opacity produces mistrust while transparency restores trust overlooks how many interwoven factors influence trust. In AI systems, the relationship between opacity, explanation, and trust is not linear: some forms of opacity are clearly corrosive, but in many instances more explanation does not lead to better judgment. In practice, users may oscillate between two equally problematic positions: deferential trust based on usability and fluency, or skeptical withdrawal triggered by complexity that exceeds their capacity to interpret. But trust is seldom ‘either/or’. It depends on context and use case, and it calls for the perspective of the individuals concerned as much as for an organizational point of view.
What matters, therefore, is not transparency in the abstract, but intelligibility in context. To say that a system is transparent is not yet to say that its functioning has become meaningfully understandable to those affected by it. This distinction is crucial because AI is altering not only decision processes but the cognitive environment in which trust is formed. Its rhythm privileges speed, synthesis, and compressed coherence. It offers summaries, recommendations, and fluent formulations that often feel immediately actionable. Yet serious judgment frequently begins where coherence breaks down: in contradiction, ambivalence, contextual tension, and the slow work of interpretation. Complexity, in this sense, requires the active preservation of nuance rather than its premature resolution into narrative.
The empirical literature underlines this point: studies on automation bias have consistently demonstrated that people tend to over-accept automated advice even when it is incorrect, and this pattern holds with generative AI. A 2024 empirical study of over 300 healthcare professionals in Germany found that non-specialists using AI decision support were most susceptible to automation bias, despite being the group with the most to gain from such tools (Kücking et al. 2024). A recent PRISMA-compliant systematic review of 35 studies confirmed that automation bias manifests consistently across both experienced and inexperienced users, with 2023 and 2024 representing the most productive years of research in this domain (Romeo and Conti 2025). Recent theoretical integration of the field shows that human responses to algorithmic advice are not unidirectional: people exhibit both algorithm aversion and algorithm appreciation depending on task type, interface design, and context (Jussupow, Benbasat, and Heinzl 2024). In the context of LLMs specifically, users tend to systematically overestimate response accuracy in ways that are difficult to self-correct without explicit calibration cues (Steyvers et al. 2025). These findings suggest that AI does not merely support judgment. Under certain conditions, it reorganizes the conditions under which judgment is exercised.
What gives these findings additional weight is that miscalibration operates not only at the level of individual decisions but at the level of self-knowledge. Before one can calibrate trust in AI, one first has to trust one's own agency and judgment. The problem of algorithm appreciation is not simply that people over-trust a competent tool, but that the tool becomes a substitute for a kind of self-trust that was already uncertain. One could say that the system absorbs a passivity that already existed.
What follows is that trust in AI is often miscalibrated: too high in situations of fluency and deference, too low where visible imperfection triggers rejection. In both cases, the problem is the mismatch between reliance and actual understanding.
This matters even more because the domains in which AI is used are no longer restricted to technical assistance. Public conversations around AI increasingly present these systems as companions, organizers, tutors, therapists, and meaning-making devices. Once AI moves from being a specialized tool to becoming a structuring presence in daily cognition, the trust question deepens. The issue becomes what habits of dependence, simplification, delegation, and epistemic passivity are being normalized. If users are habitually exposed to synthesized consensus rather than sustained uncertainty, they may gradually lose not only patience for nuance but the very capacity to remain with it. In that sense, the complexity of applying trust is inseparable from a larger question about what kinds of subjects and institutions AI encourages us to become.
This concern is not hypothetical. Those working at the intersection of AI and creative practice have observed something concrete[3]: when people consistently outsource cognitive tasks to AI tools without genuine engagement, reading the outputs rather than thinking through problems, the brain does not form the neural connections it would otherwise build. The risk, as it has been articulated in these conversations, is not only individual cognitive atrophy but a generational one: the gradual emergence of people who have been shaped, from early in their working lives, by environments that do not require them to question, construct, or sit with difficulty. Add to this the estimate that more than half of the long-form content now circulating on LinkedIn is AI-generated[4], meaning that people are increasingly reading what machines wrote without necessarily knowing it, and the epistemic environment begins to look less like a resource for independent judgment and more like a closed loop. The habits of synthesized consensus are not just being normalized; they are becoming the infrastructure of how knowledge feels.
[3] Source: the MIT Media Lab study "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task" (Kosmyna et al., arXiv, June 2025). It used EEG to track brain activity across writing conditions and found that participants who used LLMs showed the lowest neural engagement and learning outcomes. Over repeated sessions, ChatGPT users progressively reduced their cognitive effort, and their essays were described by teachers as "soulless" — well-structured but devoid of original thought.
[4] Source: study by Originality.AI analyzing 8,795 long-form LinkedIn posts found that as of October 2024, 54% were estimated to be AI-generated — and a 2025 follow-up found that over 50% of LinkedIn's long-form content was likely AI across 99 influential profiles spanning tech, finance, healthcare, and other industries.
Designing Institutions Worthy of Trust
For this reason, trust in AI must be treated institutionally before it is treated psychologically. Rather than focusing on whether individuals trust a system, the emphasis should be on whether institutions have created the conditions for that trust to be deserved, contestable, and governable. Why? Because trust should not be understood as a feeling that emerges spontaneously from exposure to a competent tool, but as a fragile institutional achievement that must be designed into interactions long before it appears in outcomes.
This requires moving beyond the language of implementation alone. Too often, organizations approach AI governance as a matter of adding policy after the tool is designed and proposed, or of inserting ethical controls once systems are already operational. But if trust is shaped before it is measured, then comprehension must become a design requirement rather than a communication add-on. This means asking, from the outset, whether those affected by a system can form a judgment proportionate to the complexity they face. It means asking what assumptions are built into problem framing, what forms of exclusion are sedimented in the data, what dependencies are created by infrastructure choices, and what kinds of behavior a system induces once embedded in organizational routines.
Three propositions follow from this perspective. First, enlightened trust breaks when systems evolve faster than humans do. Second, robust trust grows when people learn to navigate nuance rather than merely choose sides. Third, trust is designed in interactions long before it appears in outcomes. Taken together, these propositions suggest a shift from viewing trust as an emergent property to treating it as a design objective.
First, organizations need to create processes in which comprehension is co-produced rather than passively expected. This requires forms of design review that ask not only whether a system is efficient, but whether its introduction makes relationships of reliance more intelligible. In practical terms, this may mean revisiting the way teams document assumptions, how they communicate limitations, and how they stage moments of reflection before systems become normalized.
Second, individuals and institutions need to preserve paradox long enough for learning to occur. One of the costs of contemporary AI environments is that they intensify the pressure for closure. They reward quick synthesis, actionable conclusions, and singular narratives. Digital design and digital environments further amplify that pressure. Yet trustworthy judgment often requires the opposite: the ability to hold together competing truths, unresolved tensions, and divergent futures without collapsing complexity too early. This is especially important in governance contexts, where the pressure to produce certainty can conceal ethical ambiguity rather than resolve it.
Research into what might be called fake futures, a term designating the chain that runs from manipulated information, through normalized distortion, to solidified social outcomes, suggests that the danger is not merely that individual facts become unreliable, but that the architecture of shared reality gradually shifts. Fake inputs, as this analysis has shown, do not stay fake: they consolidate into expectations, into beauty standards, into political candidates, into what people believe the world is and therefore what they plan for. In parallel, the democratic values that depend on a shared factual baseline (the capacity to disagree about interpretation while still agreeing on evidence) become harder to sustain, not because trust is actively betrayed, but because the conditions under which trust can be contested and repaired are progressively eroded. The institutional task of preserving paradox is therefore inseparable from a prior task: maintaining the shared reference points within which productive disagreement remains possible.
Third, trust must be understood in relation to power and dependency. The more organizations, publics, and states rely on AI systems built elsewhere, the less trust can be separated from questions of sovereignty. This point has become particularly acute given the dependence on infrastructures developed predominantly in the United States and China, which has intensified concerns about cultural representation, regulatory asymmetry, and strategic vulnerability. In conversations with infrastructure architects and platform engineers building within this environment, a sharper formulation of the problem has emerged: every interaction with an AI system built and hosted elsewhere is, in a structural sense, a one-way knowledge export. The capabilities of the organization using the system are deposited into infrastructures it does not own, incrementally strengthening systems that are not accountable to it. The geopolitical implications of this have become increasingly concrete. The Romanian presidential elections offered one unusually visible illustration: a candidate rose to frontrunner status through activity on foreign-controlled digital platforms, without appearing in a single domestic news outlet, and without most citizens in the country having any prior awareness of who he was. Digital sovereignty and national sovereignty, in that moment, turned out to be the same thing. Europe's response, building consumer-facing regulations while the infrastructure itself remains elsewhere, could be described by some as "playing defensive while others play offensive." The question being raised is not simply whether we trust the tool, but whether we have preserved enough of the conditions that would make it possible to answer that question for ourselves. This reformulation shifts the object of trust from interface performance to infrastructural power.
Here the role of regulation remains indispensable, but not sufficient. Regulation can establish floors, obligations, and rights. It can limit harmful uses, assign liability, and increase procedural accountability. But trust cannot be produced by compliance alone. If governance is conceived only as ex post control, it will remain reactive to systems whose pace already exceeds institutional response. What is needed is a combination of governance and design, of public frameworks and relational practices, of accountability structures and forms of participation that allow those affected to remain subjects of the system rather than merely objects within it. This implies a fundamentally different view of what institutional governance and regulation must become in the age of AI: a move from conceptual top-down frameworks to embedded conditions of the exercise of technology, where the objective is not only to regulate systems but to preserve the conditions for equal, meaningful, and sovereign participation in increasingly mediated social and technological environments.
Conclusion
The challenge before us is not to decide once and for all whether AI deserves trust. The deeper challenge is to understand that AI is transforming trust from a relational outcome into an infrastructural precondition. We increasingly enter systems we do not fully understand, rely on layers we cannot directly inspect, and inhabit environments shaped by actors, incentives, and dependencies that remain only partially visible. Under these conditions, trust cannot be reduced to confidence in a tool. It becomes a question about the social, political, and institutional architecture within which that tool is embedded.
This is why the future of trustworthy AI will not be secured by technical performance alone. It will depend on whether we can build institutions capable of making complexity intelligible without pretending to eliminate it, of preserving nuance in environments biased toward speed, and of distributing agency in systems that otherwise centralize power. Trust, in this emerging landscape, is no longer a soft sentiment attached to usability. It is a hard institutional achievement. If AI is reorganizing the conditions under which trust is formed, then our task is not only to improve machines. It is to make the worlds around them more worthy of reliance.
About the Authors
Eva Simone Lihotzky is a technology and ethics specialist working at the intersection of artificial intelligence, organizational transformation, and systems thinking. She served as Managing Director of the Serviceplan Group AI Lab and contributes to the Value AI Institute as Chair on Moral AI. Her work focuses on integrating ethical considerations, human agency, and trust into the design and deployment of AI systems across organizations. She is co-author of 10 Moral Questions: How to Design Tech and AI Responsibly and host of The in-between Tech & Trust Podcast, exploring trust, technology, and societal transformation.
Imane Berjamy is a business strategist and a tech-for-industries expert. She is a Cultural Studies researcher and the program coordinator of the Value AI Institute’s Chair Program, dedicated to connecting thought leaders exploring the societal and strategic implications of AI. She holds a master's degree in Corporate Strategy from Sciences Po Paris and a research master's degree in Cultural Studies from Paul Valéry University, Montpellier.
About the Value AI Institute
The Value AI Institute is an independent and global think tank dedicated to advancing human-centered and responsible artificial intelligence. It brings together experts from technology, business, governance, and the arts to explore the practical, ethical, societal, and governance challenges raised by AI. Through its Chair programs, the Institute aims to foster interdisciplinary dialogue and develop thought leadership and frameworks for ensuring that AI systems create value while remaining aligned with human values, responsibility, and trust.
References
Bostrom, Nick, and Eliezer Yudkowsky. “The Ethics of Artificial Intelligence.” In The Cambridge Handbook of Artificial Intelligence, edited by Keith Frankish and William M. Ramsey, 316–334. Cambridge: Cambridge University Press, 2014.
Crawford, Kate. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press, 2021.
Dietvorst, Berkeley J., Joseph P. Simmons, and Cade Massey. “Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them Err.” Journal of Experimental Psychology: General 144, no. 1 (2015): 114–126.
Dunbar, Robin. Grooming, Gossip, and the Evolution of Language. Cambridge, MA: Harvard University Press, 1996.
Erikson, Erik H. Childhood and Society. New York: W. W. Norton & Company, 1950.
Goddard, Kate, Babak Roudsari, and Jeremy C. Wyatt. “Automation Bias: A Systematic Review of Frequency, Effect Mediators, and Mitigators.” Journal of the American Medical Informatics Association 19, no. 1 (2012): 121–127.
Goddard, Kate, Babak Roudsari, and Jeremy C. Wyatt. “Automation Bias: Empirical Results Assessing Influencing Factors.” International Journal of Medical Informatics 83, no. 5 (2014): 368–375.
Henrich, Joseph. The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter. Princeton: Princeton University Press, 2015.
Jussupow, Ekaterina, Izak Benbasat, and Armin Heinzl. “An Integrative Perspective on Algorithm Aversion and Appreciation in Decision-Making.” MIS Quarterly 48, no. 4 (2024): 1575–1590.
Kücking, Florian, Ursula Hübner, Mareike Przysucha, Niels Hannemann, Jan-Oliver Kutza, Maurice Moelleken, Cornelia Erfurt-Berge, Joachim Dissemond, Birgit Babitsch, and Dorothee Busch. “Automation Bias in AI-Decision Support: Results from an Empirical Study.” Studies in Health Technology and Informatics 317 (2024): 298–304.
Lee, John D., and Katrina A. See. “Trust in Automation: Designing for Appropriate Reliance.” Human Factors 46, no. 1 (2004): 50–80.
Lihotzky, Eva Simone, Marc Roman Franke, and Patrick Laske. 10 Moral Questions: How to Design Tech and AI Responsibly. Munich: House of Beautiful Business / Q Collective, 2024.
Logg, Jennifer M., Julia A. Minson, and Don A. Moore. “Algorithm Appreciation: People Prefer Algorithmic to Human Judgment.” Organizational Behavior and Human Decision Processes 151 (2019): 90–103.
Luhmann, Niklas. Trust and Power. Chichester: Wiley, 1979.
National Institute of Standards and Technology (NIST). Artificial Intelligence Risk Management Framework (AI RMF 1.0). Gaithersburg, MD: U.S. Department of Commerce, 2023.
Organisation for Economic Co-operation and Development (OECD). OECD Principles on Artificial Intelligence. Paris: OECD, 2019.
Reeves, Byron, and Clifford Nass. The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places. Stanford, CA: CSLI Publications, 1996.
Romeo, Giuseppe, and Daniela Conti. “Exploring Automation Bias in Human–AI Collaboration: A Review and Implications for Explainable AI.” AI & Society. Springer, 2025. https://doi.org/10.1007/s00146-025-02422-7
Steyvers, Mark, Heliodoro Tejeda, Aakriti Kumar, Catarina G. Belém, Sheer Karny, Xinyue Hu, Lukas W. Mayer, and Padhraic Smyth. “What Large Language Models Know and What People Think They Know.” Nature Machine Intelligence 7 (2025): 221–231. https://doi.org/10.1038/s42256-024-00976-7
Tomasello, Michael. Why We Cooperate. Cambridge, MA: MIT Press, 2009.
Turkle, Sherry. Alone Together: Why We Expect More from Technology and Less from Each Other. New York: Basic Books, 2011.
Weizenbaum, Joseph. Computer Power and Human Reason: From Judgment to Calculation. San Francisco: W. H. Freeman, 1976.