
This post grew out of reading the following two sections of Peter Millican's Introduction to An Enquiry Concerning Human Understanding (Oxford University Press, 2007):
- From Ancient to Modern Cosmology
- From Aristotelian to Cartesian Intelligibility
In particular, the following excerpt ignited my inquiry into this question: is this the beginning of AI's determinism? As it turns out, the shift it describes makes the concept of AI's determinism not only possible but inevitable.
From the beginning of the second paragraph of From Aristotelian to Cartesian Intelligibility, page xiv:
“Galileo and Descartes between them established a new way of understanding the physical world, replacing purposive strivings (what Aristotle had called ‘final’ causes) by mathematically formulated laws framed exclusively in terms of mechanical, ‘efficient’ causation”.
- Peter Millican
Philosophical Ground Zero for Computational Determinism
The shift from Aristotle's "final causes" (see below) to the mechanistic, mathematical framework of Galileo and Descartes is arguably the foundational philosophical moment for modern determinism, which is the bedrock on which AI operates. Together, Galileo and Descartes signaled a crucial turning point in Western thought.
The Aristotelian World: Purpose and Striving
The concept of the "final cause" is most systematically introduced in the first volume of Aristotle's Physics (LCL 228 in the Loeb series). The relevant section begins at Bekker number 194b 16.
In the Aristotelian view, everything in nature has an intrinsic purpose or goal (telos). To understand anything, one must understand its final cause - the ultimate goal for which it exists. For example:
- An acorn’s final cause is to become an oak tree. All its biological processes are directed toward this end.
- A rock falls because its final cause is to reach its natural resting place at the center of the earth. It “desires” to be there.
The universe is a purposeful system, filled with strivings and inherent aims. The operative question is "Why?", i.e. final causation.
This revolution was about changing the very question we ask about the world. Instead of asking “Why?” in a purposive sense, science began to ask “How?” in a mechanical sense. To understand this connection, it’s essential to grasp the monumental change in worldview that occurred.
Galileo and Descartes argued that the universe is not striving for anything; it is simply operating according to fixed, mathematical laws. A rock falls because the force of gravity acts upon its mass. The universe is not an organism but a mechanism, like a giant clock (a clockwork mechanism). To understand it, we don't need to know its purpose; we need to know the mathematical laws governing its parts. The only question that matters for science is "How?", focusing on efficient causation, i.e. the processes that bring things about rather than purposes.
- Galileo focused on describing motion with mathematical formulas (such as the times-squared law of fall, in which the distance fallen grows as the square of the elapsed time). He didn't ask why the ball rolled down the incline in a purposive sense; he described precisely how it did so based on the forces acting upon it.
The beginning of Day Three of Galileo's Two New Sciences (original Italian title: Discorsi e Dimostrazioni Matematiche, intorno a due nuove scienze), where he introduced the concept of inertia, overthrowing Aristotelian physics.
- Descartes went further, proposing that all of physical reality (res extensa) is just matter in motion. The entire physical world, including animal bodies, could be understood as a complex automaton, governed entirely by mechanical, efficient causation (the direct push-pull of physical contact).
TIP
The book that focuses on this topic is René Descartes’ most comprehensive work on physics and cosmology, Principles of Philosophy (Principia Philosophiae), published in 1644.
The famous Latin phrase "cogito, ergo sum" as it appears in the original (1644) Principia Philosophiae (at the bottom of the page shown above)
This is Descartes’ magnum opus on natural philosophy, intended to be a systematic replacement for the Aristotelian texts used in universities. The specific ideas mentioned in Peter Millican’s commentary are laid out in detail here:
- Part II: The Principles of Material Things: This is where Descartes formally argues that the essence of matter is simply spatial extension. From this single idea, he derives its properties, including its passivity and the principle of inertia.
- Part III: Of the Visible World: This section is dedicated to cosmology. Descartes explains why the universe must be a plenum (completely filled with matter) and then elaborates his famous theory of vortices. He uses these celestial whirlpools of matter to provide a purely mechanical explanation for the orbits of the planets around the Sun, without resorting to forces acting at a distance like gravity.
The Earlier, Unpublished Version: The World (c. 1633) 🤫
It is also important to know that Descartes had fully articulated this entire mechanical universe, including the vortex theory, in an earlier book titled The World (Le Monde ou Traité de la lumière).
He wrote this book around 1633 but famously suppressed its publication after hearing that Galileo had been condemned by the Inquisition for defending a similar heliocentric system. He did not want to enter into conflict with the Church. The work was only published posthumously in 1664.
Therefore, while The World contains the original formulation of these ideas, Principles of Philosophy is the definitive, mature, and intentionally published work where Descartes formally presented his complete mechanical vision of the universe.
This shift from a purposeful cosmos to a mechanical one is the birth of modern scientific determinism. If the universe is just a machine executing a set of rules, then its future is, in principle, perfectly predictable from its present state. If we knew the initial position, mass, and velocity of every particle, and we knew the mathematical laws of physics, we could calculate the entire future of the cosmos. This is the very definition of mechanical determinism. The most famous expression of this idea came later from Pierre-Simon Laplace, who imagined an intellect (now called Laplace’s Demon) that could know the entire past and future from a single snapshot of the present. There is no choice, no striving - only the relentless ticking of cause and effect.
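To make the idea concrete, here is a minimal sketch (my own illustration, not anything from Millican's text) of mechanical determinism in code: a falling rock stepped forward by a fixed law of motion. Run it twice from the same initial state and the "future" it computes is identical.

```python
# A toy "Laplace's demon": given the present state (position, velocity) and the
# law of motion (constant gravitational acceleration), the entire future of the
# falling rock is fixed. Running the same computation twice yields identical
# trajectories, bit for bit.

def simulate_fall(y0, v0, g=9.81, dt=0.01, steps=100):
    """Step Newton's law forward in time with a simple Euler integrator."""
    y, v = y0, v0
    trajectory = []
    for _ in range(steps):
        v -= g * dt          # efficient causation: the force changes the velocity
        y += v * dt          # the velocity changes the position
        trajectory.append(y)
    return trajectory

run1 = simulate_fall(y0=100.0, v0=0.0)
run2 = simulate_fall(y0=100.0, v0=0.0)
assert run1 == run2  # same present state + same laws -> same future
```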
From the Clockwork Universe to AI
Once the universe is seen as a giant, deterministic machine, it’s a short leap to see the human mind - and by extension, intelligence itself - in the same way. This created a direct intellectual line to modern AI.
Building on the clockwork universe of Galileo, Descartes, and Newton, in which the physical world is a deterministic system governed by mathematical laws, thinkers like Thomas Hobbes then argued that human reasoning was nothing more than a form of calculation ("Reasoning is but reckoning"). This extended the mechanical view from physics to thought itself. The idea culminated in Alan Turing's theoretical "Turing Machine," an abstract model of a device that could compute anything that is computable. A Turing machine is the epitome of determinism: given a set of instructions (a program) and an initial state (input), its every subsequent action is rigidly determined. A modern computer is a physical realization of a Turing machine. A classical AI program is a set of instructions running on this deterministic hardware. Given the same input, it will always produce the same output, following its programmed rules without any purpose or "striving" of its own. It is the perfect expression of a universe of efficient, not final, causes.
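As a toy illustration of that rigidity, here is a minimal deterministic Turing-style machine in Python; the transition table (which simply inverts a binary string) is a hypothetical example of mine, not anything from the sources above. Given the same tape and the same rules, it can only ever do one thing.

```python
# A minimal deterministic Turing machine: at every step, the current state and
# the symbol under the head pick exactly one entry from the rule table, so the
# whole computation is fixed in advance by the program and the input.

def run_turing_machine(tape, rules, state="scan", blank="_"):
    tape = list(tape)
    head = 0
    while state != "halt":
        symbol = tape[head] if head < len(tape) else blank
        if head >= len(tape):
            tape.append(blank)                    # extend the tape on demand
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip(blank)

# Transition table: (state, read symbol) -> (write symbol, move, next state)
rules = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("10110", rules))  # always prints "01001"
```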
AI is Deterministic
Given the same input and the same internal state (its weights), a standard AI model will always produce the exact same output. The "randomness" we often see is either pseudo-randomness (an algorithm that appears random but is perfectly predictable) or a result of slightly different inputs (like conversation history). The underlying process is as deterministic as a clock.
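A minimal sketch of that claim, assuming a toy two-layer network with frozen, randomly initialized weights standing in for a real trained model:

```python
import numpy as np

# Hypothetical two-layer network with frozen weights: with the weights (the
# model's internal state) and the input both fixed, inference is a pure
# deterministic function -- the output is identical on every run.

rng = np.random.default_rng(seed=0)        # fixed seed stands in for trained weights
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 2))

def forward(x):
    hidden = np.maximum(0, x @ W1)         # ReLU hidden layer
    return hidden @ W2

x = np.array([0.5, -1.0, 2.0, 0.1])
out1, out2 = forward(x), forward(x)
assert np.array_equal(out1, out2)          # same weights + same input -> same output
```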
So, when Galileo and Descartes threw out the universe’s “soul” and “purpose” and replaced it with mathematical laws of motion, they were laying the necessary philosophical groundwork. They created a vision of reality as a computable system. Without that fundamental shift, the idea of creating an artificial intelligence by manipulating symbols and numbers according to formal rules—the very project of AI—would be philosophically inconceivable.
From AI to Modern AI
While this deterministic lineage holds true for classical AI, it’s worth noting that modern machine learning introduces a fascinating wrinkle. In a deep learning model, while the underlying computation on the silicon chip is still perfectly deterministic, the resulting behavior is emergent. We don’t program the rules for “recognizing a cat” directly; we create a system that learns the rules from data. The final system works, but its internal logic is often a “black box,” not transparently designed like a simple clockwork mechanism.
So, while the Galilean and Cartesian revolution was the necessary starting point for any form of computation, modern AI is evolving from a simple deterministic clockwork into a system whose complexity makes its behavior appear almost organic, creating a new and fascinating chapter in this long history.
Does that mean deep learning or future AI will be non-deterministic? It should be noted that current deep learning is not truly non-deterministic, but its behavior is so complex that it appears to be. Even the most advanced deep learning model today is, at its core, a deterministic system running on a deterministic machine. If we could perfectly control every variable, the result would be perfectly repeatable.
- Underlying Hardware: The silicon chips (CPUs, GPUs) that run AI models operate on the principles of classical physics. Given the same input and the same initial state, they will execute the same instructions and produce the exact same output.
- Pseudo-randomness: The “randomness” used in training AI models (e.g., for initializing the network’s weights or in processes like “dropout”) is not true randomness. It’s pseudo-randomness - a complex but deterministic algorithm that produces a sequence of numbers that looks random. If we use the same starting “seed” for this algorithm, we will get the exact same sequence of “random” numbers every time.
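A small sketch of this second point, using NumPy's seeded generator as a stand-in for the seeding controls that deep learning frameworks expose:

```python
import numpy as np

# Pseudo-randomness in practice: the same seed reproduces the exact same
# "random" weight initialization, which is why a seeded run can be replayed.

def init_weights(seed):
    rng = np.random.default_rng(seed)
    return rng.normal(size=(3, 3))          # "random" initial weights

assert np.array_equal(init_weights(42), init_weights(42))      # identical every time
assert not np.array_equal(init_weights(42), init_weights(7))   # a different seed gives a different sequence
```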
TIP
A key distinction exists between emergent behavior and non-determinism. The behavior of a flock of birds is emergent; it arises from simple, deterministic rules followed by each bird, but the flock's overall pattern is complex and not explicitly programmed. Similarly, an AI's ability to "recognize a cat" is an emergent property of a vast network of simple, deterministic calculations. The behavior is unpredictable in practice but deterministic in principle.
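A compact way to see emergence without non-determinism is a one-dimensional cellular automaton such as Wolfram's Rule 30 (my example, not the post's): every cell follows a fixed lookup table, yet the overall pattern is intricate and hard to anticipate without simply running the rule.

```python
# Rule 30: each cell's next value is fully determined by itself and its two
# neighbors, yet the triangle of cells that grows from a single "on" cell looks
# irregular -- complex, emergent behavior from a deterministic rule.

RULE_30 = {(1, 1, 1): 0, (1, 1, 0): 0, (1, 0, 1): 0, (1, 0, 0): 1,
           (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}

def step(cells):
    padded = [0] + cells + [0]
    return [RULE_30[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]

row = [0] * 15 + [1] + [0] * 15   # a single "on" cell in the middle
for _ in range(16):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```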
From Modern AI to Future AI - The Path to True Non-Determinism
While current deep learning is not truly non-deterministic, future AI could become genuinely non-deterministic if it incorporates new types of hardware. For an AI to become truly non-deterministic, it would have to be based on processes that are fundamentally random, not just complex. This is where future hardware comes in.
Quantum Computing is the most promising path. A quantum computer’s operations are based on the principles of quantum mechanics, which are inherently probabilistic. An AI running on quantum hardware could base a “decision” on the outcome of a genuinely random quantum event (like the measurement of a qubit’s state). This would break the deterministic chain of cause and effect, making its behavior fundamentally unpredictable.
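The sketch below only simulates such a measurement classically with NumPy (the Born rule applied to an equal superposition), so its "randomness" is still pseudo-random; the point is that real quantum hardware would supply the genuinely indeterminate outcomes that no classical chip can.

```python
import numpy as np

# A single-qubit measurement under the Born rule. The qubit is in the equal
# superposition |+> = (|0> + |1>)/sqrt(2); measuring it yields 0 or 1 with
# probability |amplitude|^2 = 1/2 each. On a classical machine we can only
# mimic this with a pseudo-random generator; a quantum device would realize
# the randomness physically.

state = np.array([1, 1]) / np.sqrt(2)       # amplitudes for |0> and |1>
probabilities = np.abs(state) ** 2          # Born rule: p(outcome) = |amplitude|^2

rng = np.random.default_rng()
outcomes = rng.choice([0, 1], size=1000, p=probabilities)
print(np.bincount(outcomes))                # roughly 500 zeros and 500 ones
```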
This image depicts a "quantum corral" - a precisely arranged ring of atoms on a surface that traps and confines electrons. The ripples within the corral represent the probability waves of these electrons. In quantum mechanics, the exact position or momentum of an electron cannot be simultaneously known with absolute certainty; instead, we can only predict the probability of finding it in a particular location. This intrinsic unpredictability, or non-determinism, is a fundamental characteristic of the quantum world, illustrating that reality at the smallest scales is not fixed but rather a realm of probabilities.
A Quantum Revolution (1925-1926) - A Glimpse into the Probabilistic Universe
By 1924, the old Bohr model of the atom was failing. Physicists knew a new theory was needed. Two prominent figures, de Broglie and Heisenberg, provided the first two breakthroughs (although they attacked the problem from completely opposite directions).
The Bohr Model
The Bohr model was an early, pivotal model of the atom, proposed by Niels Bohr in 1913, which pictured electrons orbiting the nucleus in specific, fixed circular paths, much like planets orbiting the sun. It was the first model to introduce the idea of quantization to the structure of the atom, serving as a crucial bridge between classical physics and the new world of quantum mechanics.
Key Features of the Bohr Model include:
- Fixed Orbits: Unlike the classical model where an electron could orbit at any distance, Bohr proposed that electrons could only exist in specific, discrete circular orbits or “shells” around the nucleus.
- Quantized Energy Levels: Each of these allowed orbits corresponded to a specific, fixed energy level. Electrons in orbits closer to the nucleus had lower energy, while those in orbits farther away had higher energy.
- Quantum Jumps: Bohr stated that an electron could “jump” between these allowed orbits by absorbing or emitting a precise amount of energy (a photon) corresponding to the exact difference in energy between the two levels. It could not exist in between orbits.
The Bohr model explained the spectral lines of the hydrogen atom with incredible accuracy: the energies it calculated for the photons emitted during "quantum jumps" matched the observed emission spectrum of hydrogen. It was nevertheless a dead end, because it was an ad-hoc mixture of classical and quantum ideas that completely failed to predict the spectral lines of any atom with more than one electron (like helium). It also couldn't explain why some spectral lines are brighter than others, or the effect of magnetic fields on spectra.
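As a back-of-the-envelope check of that one success (my calculation, not from the sources), the Bohr energy levels E_n = -13.6 eV / n² reproduce the visible Balmer lines of hydrogen:

```python
# Hydrogen's Balmer series from the Bohr model: a "quantum jump" from level
# n_initial down to n = 2 emits a photon whose wavelength is hc / (E_i - E_f).

HC_EV_NM = 1239.84                  # Planck constant times speed of light, in eV*nm

def bohr_energy(n):
    """Energy of the nth Bohr orbit of hydrogen, in eV."""
    return -13.6057 / n**2

for n_initial in (3, 4, 5):         # Balmer series: jumps down to n = 2
    photon_energy = bohr_energy(n_initial) - bohr_energy(2)
    print(f"{n_initial} -> 2: {HC_EV_NM / photon_energy:.1f} nm")
# prints ~656, ~486, ~434 nm, matching the observed H-alpha, H-beta, H-gamma lines
```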
The Bohr model was a brilliant and necessary stepping stone, but it was replaced around 1925 by the far more complete and correct theories of matrix mechanics and wave mechanics.
In late 1924, Louis de Broglie presented his radical hypothesis that all matter exhibits wave-like properties in his PhD thesis. He proposed that if waves (light) could act like particles, then particles (like electrons) could act like waves. The central equation is
$$\lambda = \frac{h}{p}$$
where $\lambda$ is the de Broglie wavelength, $h$ the Planck constant, and $p$ the momentum.
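As a quick worked example (mine, not de Broglie's), the wavelength of an electron moving at 1% of the speed of light:

```python
# De Broglie wavelength lambda = h / p for an electron at 1% of the speed of
# light (slow enough that the non-relativistic momentum p = m*v is adequate).

h = 6.626e-34          # Planck constant, J*s
m_e = 9.109e-31        # electron mass, kg
v = 0.01 * 2.998e8     # 1% of the speed of light, m/s

wavelength = h / (m_e * v)
print(f"{wavelength * 1e9:.3f} nm")   # ~0.243 nm, comparable to atomic spacings
```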
In July 1925, Heisenberg's matrix mechanics was born. Werner Heisenberg, seeking to build a theory based only on observable quantities, submitted his groundbreaking paper "On a Quantum-Theoretical Re-interpretation of Kinematic and Mechanical Relations." This was the foundation of matrix mechanics.
Heisenberg was deeply troubled by the contradictions of the old Bohr model, which still talked about unobservable things like the precise path and position of an electron in its “orbit.” He decided that a true physical theory must be built only on quantities that can actually be measured (observables). What can we actually observe from an atom? We can’t see the electron orbiting. What we can see is the light it emits or absorbs when it jumps between energy levels. This light has two measurable properties:
- Frequency (which gives its color)
- Intensity (which gives its brightness)
Heisenberg's revolutionary idea was to build a new mechanics from the ground up using only these observable transition properties, completely throwing away any notion of electron orbits. He started by trying to organize the relationships between these transitions and represented the collection of all possible transitions in an atom as a two-dimensional array of numbers. For example, the number in the $n$th row and $m$th column would represent the transition from energy state $n$ to state $m$.
When Heisenberg showed his work to his mentor, Max Born, Born realized that Heisenberg’s arrays and the strange multiplication rules he had devised were identical to the matrices of pure mathematics. This was the breakthrough. The physical properties of an atomic system (like position, momentum, energy, etc.) would no longer be represented by simple numbers, but by infinite-dimensional matrices.
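A small numerical sketch of what "physical properties as matrices" means, using the harmonic oscillator in dimensionless units and truncating the infinite matrices so they fit in memory (an illustration of the idea, not a reconstruction of Heisenberg's own calculation):

```python
import numpy as np

# Position and momentum as matrices (harmonic oscillator, units hbar = m = omega = 1),
# truncated to N x N. The striking feature of matrix mechanics: X @ P != P @ X.

N = 6
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # lowering operator on the superdiagonal
X = (a + a.T) / np.sqrt(2)                   # position matrix
P = 1j * (a.T - a) / np.sqrt(2)              # momentum matrix

commutator = X @ P - P @ X
print(np.round(commutator.diagonal(), 3))
# Approximately i (i.e. i*hbar in these units) down the diagonal; the last
# entry is an artifact of truncating an infinite matrix to finite size.
```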
Why Don't We Learn It First?
Why does quantum mechanics as taught in universities begin with Schrödinger's equation even though matrix mechanics came first? There are two reasons:
- It’s more intuitive: The idea of waves is easier for the human mind to visualize than abstract, infinite-dimensional matrices.
- The math is more familiar: Most physics students learn differential equations before they learn the advanced linear algebra required for matrix mechanics.
Later, Schrödinger and others proved that wave mechanics and matrix mechanics were mathematically equivalent. They are two different languages describing the same, strange quantum reality.
Matrix Mechanics
Finding a conventional textbook that teaches quantum mechanics from the historical matrix mechanics perspective is challenging, as the approach was quickly superseded for pedagogical reasons by Schrödinger's wave mechanics. A first step in the other direction is to look at Heisenberg's original paper.
The paper that founded matrix mechanics
- Original Title: “Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen”
- English Translation: “On a Quantum-Theoretical Re-interpretation of Kinematic and Mechanical Relations”
- Author: Werner Heisenberg
- Submitted: July 1925
Reading this paper cold is exceptionally difficult, but not because the math is overly complex. It’s difficult because Heisenberg was inventing a new way of thinking, and the familiar language of matrices hadn’t even been applied yet. He was working with arrays of numbers and a novel multiplication rule he had discovered.
We will need a guide for the next step; luckily, there is an indispensable tool for this exact purpose: Sources of Quantum Mechanics by B. L. van der Waerden. This book provides a superb English translation of Heisenberg's paper, but more importantly, it wraps it in historical commentary that explains the context, the terminology, and the importance of each step in the argument. It's less of a textbook and more of a guided tour of the original discovery.
Quantum Mechanics
Imagine a particle of mass $m$, constrained to move along the $x$ axis, subject to some specified force $F(x, t)$ (see figure below). The program of classical mechanics is to determine the position of the particle at any given time: $x(t)$. Once we know that, we can figure out the velocity ($v = dx/dt$), the momentum ($p = mv$), the kinetic energy ($T = \tfrac{1}{2}mv^2$), or any other dynamical variable of interest. How do we go about determining $x(t)$? We apply Newton's second law: $F = ma$. (For conservative systems - the only kind we shall consider, and, fortunately, the only kind that occur at the microscopic level - the force can be expressed as the derivative of a potential energy function, $F = -\partial V/\partial x$, and Newton's law reads $m\, d^2x/dt^2 = -\partial V/\partial x$.) This, together with appropriate initial conditions (typically the position and velocity at $t = 0$), determines $x(t)$.
Quantum mechanics approaches this same problem quite differently. In this case what we're looking for is the particle's wave function, $\Psi(x, t)$, and we get it by solving the Schrödinger equation
$$i\hbar \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2 \Psi}{\partial x^2} + V\Psi$$
Erwin Schrödinger came up with this famous equation by taking Louis de Broglie’s radical idea of “matter waves” seriously and using the mathematics of classical wave theory as his guide. He was seeking a more intuitive, continuous description of the atom than the strange “quantum jumps” of the Bohr model or the abstract algebra of Heisenberg’s matrix mechanics.
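To close the loop, here is a minimal sketch (my own illustration, in units where ħ = m = 1) of the quantum program described above: discretize the time-independent Schrödinger equation for a particle in a box and read the allowed energies off as eigenvalues of a matrix, which, fittingly, brings us back to matrices.

```python
import numpy as np

# Time-independent Schrodinger equation -1/2 psi'' + V psi = E psi on a grid,
# for a particle in an infinite square well of width L = 1 (units hbar = m = 1).
# The second derivative becomes a tridiagonal finite-difference matrix, and the
# allowed energies are its lowest eigenvalues.

N, L = 500, 1.0
dx = L / (N + 1)
x = np.linspace(dx, L - dx, N)               # interior grid points
V = np.zeros(N)                              # zero potential inside the well

main = np.full(N, 1.0 / dx**2) + V           # diagonal of -1/2 d^2/dx^2 plus V
off = np.full(N - 1, -0.5 / dx**2)           # off-diagonals of -1/2 d^2/dx^2
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

energies = np.linalg.eigvalsh(H)[:3]
exact = [(n * np.pi) ** 2 / 2 for n in (1, 2, 3)]   # analytic E_n = n^2 pi^2 / 2
print(np.round(energies, 2), np.round(exact, 2))    # numerical vs exact energies
```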