Michael Timothy Bennett

Computational Dualism and Objective Superintelligence

Michael Timothy Bennett

November 12, 2024

Awarded "Best Student Paper" at the 17th Conference on Artificial General Intelligence, 2024.

The concept of intelligent software is flawed. The behaviour of software is determined by the hardware that "interprets" it. This undermines claims regarding the behaviour of theorised, software superintelligence. Here we characterise this problem as "computational dualism", where instead of mental and physical substance, we have software and hardware. We argue that to make objective claims regarding performance we must avoid computational dualism. We propose a pancomputational alternative wherein every aspect of the environment is a relation between irreducible states. We formalise systems as behaviour (inputs and outputs), and cognition as embodied, embedded, extended and enactive. The result is cognition formalised as a part of the environment, rather than as a disembodied policy interacting with the environment through an interpreter. This allows us to make objective claims regarding intelligence, which we argue is the ability to "generalise", identify causes and adapt. We then establish objective upper bounds for intelligent behaviour. This suggests AGI will be safer, but more limited, than theorised.

Emergent Causality and the Foundation of Consciousness

Michael Timothy Bennett

April 17, 2024

Awarded “Best Student Paper” and published in Proceedings of The 16th International Conference on Artificial General Intelligence, Stockholm, 2023.

To make accurate inferences in an interactive setting, an agent must not confuse passive observation of events with having intervened to cause them. The do operator formalises interventions so that we may reason about their effect. Yet there exist Pareto optimal mathematical formalisms of general intelligence in an interactive setting which, presupposing no explicit representation of intervention, make maximally accurate inferences. We examine one such formalism. We show that in the absence of a do operator, an intervention can be represented by a variable. We then argue that variables are abstractions, and that the need to explicitly represent interventions in advance arises only because we presuppose these sorts of abstractions. The aforementioned formalism avoids this and so, initial conditions permitting, representations of relevant causal interventions will emerge through induction. These emergent abstractions function as representations of one’s self and of any other object, inasmuch as the interventions of those objects impact the satisfaction of goals. We argue that this explains how one might reason about one’s own identity and intent, those of others, of one’s own as perceived by others and so on. In a narrow sense this describes what it is to be aware, and is a mechanistic explanation of aspects of consciousness.
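
The distinction between observing and intervening, and the idea that an intervention can be carried by an ordinary variable, can be sketched with a toy model. This is only an illustration under assumed numbers, not the formalism examined in the paper: the variables `Z` (a hidden common cause), `X`, `Y`, the indicator `A` ("the agent set X itself") and every probability below are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical toy model: Z is a common cause of X and Y.
P_Z = {0: F(1, 2), 1: F(1, 2)}
# A is an ordinary variable recording whether the agent set X itself.
P_A = {0: F(1, 2), 1: F(1, 2)}
# How X depends on Z when nobody interferes (assumed values).
P_X1_GIVEN_Z = {0: F(1, 10), 1: F(9, 10)}


def p_x(x, z, a):
    """P(X=x | Z=z, A=a). When A=1 the agent forces X=1, ignoring Z."""
    p1 = F(1) if a == 1 else P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1 - p1


def p_y1(x, z):
    """P(Y=1 | X=x, Z=z): X has a small effect on Y, Z a large one."""
    return F(2, 10) + F(1, 10) * x + F(6, 10) * z


def prob_y1_given(**cond):
    """P(Y=1 | cond) by exact enumeration over the joint of (A, Z, X)."""
    num = den = F(0)
    for a, z, x in product((0, 1), repeat=3):
        world = {"a": a, "z": z, "x": x}
        if any(world[k] != v for k, v in cond.items()):
            continue  # this world contradicts what we conditioned on
        w = P_A[a] * P_Z[z] * p_x(x, z, a)
        num += w * p_y1(x, z)
        den += w
    return num / den


observed = prob_y1_given(x=1, a=0)    # X=1 was passively observed
intervened = prob_y1_given(x=1, a=1)  # X=1 was caused by the agent
print(observed, intervened)
```

In this sketch, conditioning on `A=1` gives the same answer a do operator would give for setting `X=1`, because forcing `X` removes its dependence on `Z`. Conditioning on `A=0` gives a higher probability, since seeing `X=1` is then also evidence that `Z=1`. An agent that ignored `A` would mistake that correlation for the effect of its own action.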

Is Complexity an Illusion?

Michael Timothy Bennett

June 07, 2024

Simplicity is held by many to be the key to general intelligence. Simpler models tend to “generalise”, identifying the cause or generator of data with greater sample efficiency. The implications of the correlation between simplicity and generalisation extend far beyond computer science, addressing questions of physics and even biology. Yet simplicity is a property of form, while generalisation is of function. In interactive settings, any correlation between the two depends on interpretation. In theory there could be no correlation and yet in practice, there is. Previous theoretical work showed generalisation to be a consequence of “weak” constraints implied by function, not form. Experiments demonstrated choosing weak constraints over simple forms yielded a 110-500% improvement in generalisation rate. Here we show that all constraints can take equally simple forms, regardless of weakness. However, if forms are spatially extended, then function is represented using a finite subset of forms. If function is represented using a finite subset of forms, then we can force a correlation between simplicity and generalisation by making weak constraints take simple forms. If function is determined by a goal directed process that favours versatility (e.g. natural selection), then efficiency demands weak constraints take simple forms. Complexity has no causal influence on generalisation, but appears to, due to confounding.

In Press: Accepted for publication in the Proceedings of The 17th Conference on Artificial General Intelligence, 2024.

The Optimal Choice of Hypothesis Is the Weakest, Not the Shortest

Michael Timothy Bennett

April 17, 2024

If A and B are sets such that A is a subset of B, generalisation may be understood as the inference from A of a hypothesis sufficient to construct B. One might infer any number of hypotheses from A, yet only some of those may generalise to B. How can one know which are likely to generalise? One strategy is to choose the shortest, equating the ability to compress information with the ability to generalise (a “proxy for intelligence”). We examine this in the context of a mathematical formalism of enactive cognition. We show that compression is neither necessary nor sufficient to maximise performance (measured in terms of the probability of a hypothesis generalising). We formulate a proxy unrelated to length or simplicity, called weakness. We show that if tasks are uniformly distributed, then there is no choice of proxy that performs at least as well as weakness maximisation in all tasks while performing strictly better in at least one. In other words, weakness is the Pareto optimal choice of proxy. In experiments comparing maximum weakness and minimum description length in the context of binary arithmetic, the former generalised at between 1.1 and 5 times the rate of the latter. We argue this demonstrates that weakness is a far better proxy, and explains why DeepMind’s Apperception Engine is able to generalise effectively.

Are Biological Systems More Intelligent Than Artificial Intelligence?

Michael Timothy Bennett

January 07, 2025

Is a biological self-organising system more ‘intelligent’ than an artificial intelligence? If so, why? We frame intelligence as adaptability, and explore this question using a mathematical formalism of enactive causal learning. We extend it to formalise the multilayer, multiscale, bottom-up distributed computational architecture of biological self-organisation. We then show that this architecture allows for more efficient adaptation than the static top-down interpreters typically used in computers. To put it provocatively, biology is more intelligent because cells adapt to provide a helpful inductive bias, and static interpreters do not. We call this multilayer-causal-learning. However, it inherits a flaw of biological self-organisation. Cells become cancerous when isolated from the collective informational structure, reverting to primitive transcriptional behaviour. We show that, in the context of our formalism, failure states like cancer occur when systems are too tightly constrained by the abstraction layer in which they exist. This suggests control should be distributed (bottom-up rather than top-down) to ensure graceful degradation. We speculate about what this implies for systems in general, from machine learning hardware to human organisational and economic systems. Our result shows how we can design more robust systems and, though theoretical in nature, it lays a foundation for future empirical research.

Why Is Anything Conscious?

Michael Timothy Bennett

and 2 more

December 27, 2024

We tackle the hard problem of consciousness taking the naturally selected, embodied organism as our starting point. We provide a formalism describing how biological systems self-organise to hierarchically interpret unlabelled sensory information according to valence. Such interpretations imply behavioural policies which are differentiated from each other only by the qualitative aspect of information processing. Natural selection favours systems that intervene in the world to achieve homeostatic and reproductive goals. Quality is a property arising in such systems to link cause to affect to motivate interventions. This produces interoceptive and exteroceptive classifiers and determines priorities. In formalising the seminal distinction between access and phenomenal consciousness, we claim that access consciousness at the human level requires the ability to hierarchically model i) the self, ii) the world/others and iii) the self as modelled by others, and that this requires phenomenal consciousness. Phenomenal without access consciousness is likely common, but the reverse is implausible. To put it provocatively: death grounds meaning, and Nature does not like zombies. We then describe the multilayered architecture of self-organisation from rocks to Einstein, illustrating how our argument applies. Our proposal lays the foundation of a formal science of consciousness, closer to human fact than zombie fiction.

On the Computation of Meaning, Language Models and Incomprehensible Horrors

Michael Timothy Bennett

April 17, 2024

Accepted for full oral presentation at the 16th Conference on Artificial General Intelligence, taking place in Stockholm, 2023.

We integrate foundational theories of meaning with a mathematical formalism of artificial general intelligence (AGI) to offer a comprehensive mechanistic explanation of meaning, communication, and symbol emergence. This synthesis holds significance for both AGI and broader debates concerning the nature of language, as it unifies pragmatics, logical truth conditional semantics, Peircean semiotics, and a computable model of enactive cognition, addressing phenomena that have traditionally evaded mechanistic explanation. By examining the conditions under which a machine can generate meaningful utterances or comprehend human meaning, we establish that the current generation of language models does not possess the same understanding of meaning as humans, nor does it intend any meaning that we might attribute to its responses. To address this, we propose simulating human feelings and optimising models to construct weak representations. Our findings shed light on the relationship between meaning and intelligence, and how we can build machines that comprehend and intend meaning.

Computable Artificial General Intelligence

Michael Timothy Bennett

November 30, 2022

Artificial general intelligence (AGI) may herald our extinction, according to AI safety research. Yet claims regarding AGI must rely upon mathematical formalisms – theoretical agents we may analyse or attempt to build. AIXI appears to be the only such formalism supported by proof that its behaviour is optimal, a consequence of its use of compression as a proxy for intelligence. Unfortunately, AIXI is incomputable and claims regarding its behaviour are highly subjective. We argue that this is because AIXI formalises cognition as taking place in isolation from the environment in which goals are pursued (Cartesian dualism). We propose an alternative, supported by proof and experiment, which overcomes these problems. Integrating research from cognitive science with AI, we formalise an enactive model of learning and reasoning to address the problem of subjectivity. This allows us to formulate a different proxy for intelligence, called weakness, which addresses the problem of incomputability. We prove optimal behaviour is attained when weakness is maximised. This proof is supplemented by experimental results comparing weakness and description length (the closest analogue to compression possible without reintroducing subjectivity). Weakness outperforms description length, suggesting it is a better proxy. Furthermore, we show that, if cognition is enactive, then minimisation of description length is neither necessary nor sufficient to attain optimal performance. These results undermine the notion that compression is closely related to intelligence. We conclude with a discussion of limitations, implications and future research. There remain several open questions regarding the implementation of scalable general intelligence. In the short term, these results may be best utilised to improve the performance of existing systems. For example, our results explain why DeepMind’s Apperception Engine is able to generalise effectively, and how to replicate that performance by maximising weakness. Likewise in the context of neural networks, our results suggest both limitations of “scale is all you need”, and how those limitations can be overcome.