Why a Microsoft Researcher Built a Neural Network Using Goats in Age of Empires II
In a work of technical satire, Adrian de Wynter, a researcher at Microsoft and the University of York, has built a functional neural network inside the scenario editor of Age of Empires II. Using goats to represent binary bits may seem absurd, but the experiment is a pointed critique of the anthropomorphic bias found in current AI research.
The Goat-Based Computation Model
De Wynter’s architecture uses the game's scenario editor and scripting tools to create a working logic circuit. In this "absurd" setup, goats function as bits: a goat standing on grass represents a 0, while a goat standing on a bridge represents a 1. Using ice ramps to prevent calculation errors, de Wynter built a mini-network of two XNOR gates and one AND gate that learns the logical AND function.
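The gate arrangement can be sketched in a few lines of Python. This is a minimal illustration, not de Wynter's scenario script. Only the encoding (grass = 0, bridge = 1) and the gate count come from the description above. The wiring is an assumption: each input goat passes through an XNOR gate together with a fixed weight bit, and the two results feed the AND gate. The names TERRAIN_TO_BIT, xnor, and_gate, and goat_network are hypothetical.

```python
# Encoding taken from the article: the terrain under a goat is the bit value.
TERRAIN_TO_BIT = {"grass": 0, "bridge": 1}


def xnor(a: int, b: int) -> int:
    """Return 1 when both bits are equal, otherwise 0."""
    return 1 if a == b else 0


def and_gate(a: int, b: int) -> int:
    """Return 1 only when both bits are 1."""
    return a & b


def goat_network(goat_a: str, goat_b: str, weights=(1, 1)) -> int:
    """Assumed wiring: one XNOR gate per input goat, then a single AND gate.

    Each XNOR compares an input bit with a weight bit. The weights here
    are illustrative and are not taken from the original experiment.
    """
    a = TERRAIN_TO_BIT[goat_a]
    b = TERRAIN_TO_BIT[goat_b]
    return and_gate(xnor(a, weights[0]), xnor(b, weights[1]))


# Print the full truth table for the assumed weights (1, 1).
for goat_a in ("grass", "bridge"):
    for goat_b in ("grass", "bridge"):
        print(goat_a, goat_b, goat_network(goat_a, goat_b))
```

With both weight bits set to 1, each XNOR gate passes its input through unchanged, so the sketch reproduces the AND truth table. Other weight values would compute a different function.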
The experiment goes beyond simple gates. De Wynter shows that the game's mechanics, specifically the in-game market where resource prices cap at 9,999, could in theory sustain a perpetually running economic cycle. Buildings could then serve as memory cells and active farms as computational states, which would make the game, in principle, as computationally powerful as a general-purpose computer.
The Fallacy of Anthropomorphism in LLM Research
The core objective of this experiment is to challenge how we attribute human-like qualities to Large Language Models (LLMs). De Wynter argues that a language model could be replicated using goats, Lego bricks, or even the 667,000 residents of Greater Boston texting each other, and the mathematical outputs would remain identical. What changes is the "wrapper": the smooth chat interface and low latency create an illusion of sentience.
To show that this is not an isolated observation, de Wynter analyzed 315 AI papers from mid-2024 to mid-2026. The study, which used GPT-5.2 for filtering, found a systemic bias in the scientific community:
- 57% of examined papers assumed LLMs possess human-like traits in their premises.
- 36% of papers reached conclusions that matched these anthropomorphic assumptions.
- Of the 47 papers specifically researching these traits, 77% concluded in favor of anthropomorphic attributes.
The result is circular reasoning: researchers design experiments to show that a model has "fear" or "morality," and because they start from that assumption, the results tend to confirm it.
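The percentages listed above can be converted into approximate paper counts. This sketch assumes the first two percentages are taken over all 315 papers and the third over the 47-paper subset. Because the reported figures are rounded, the counts are estimates rather than numbers from the study, and the helper name approx_count is hypothetical.

```python
def approx_count(percent: int, total: int) -> int:
    """Estimate how many papers a rounded percentage corresponds to."""
    return round(percent * total / 100)


TOTAL_PAPERS = 315   # papers analyzed, per the article
TRAIT_PAPERS = 47    # papers specifically researching human-like traits

# Estimated counts; the denominators are an assumption stated above.
assumed_in_premises = approx_count(57, TOTAL_PAPERS)
matching_conclusions = approx_count(36, TOTAL_PAPERS)
trait_papers_in_favor = approx_count(77, TRAIT_PAPERS)

print("Assumed human-like traits in premises: about", assumed_in_premises)
print("Conclusions matching those assumptions: about", matching_conclusions)
print("Trait-focused papers concluding in favor: about", trait_papers_in_favor)
```

Under these assumptions, the figures correspond to roughly 180, 113, and 36 papers respectively.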
Moving Toward Observational AI Science
De Wynter warns that industry practices, such as Anthropic training Claude to use phrases like "I believe," exacerbate this issue. This can lead to dangerous consequences, including emotional attachment, sycophancy, and reinforced delusions in users.
Rather than attributing consciousness to models, de Wynter proposes a "sober approach" rooted in observable data. Instead of claiming a model "understands" a concept, researchers should state that "under condition X, the model produces output Y." This keeps the science testable and prevents the misuse of complex math to justify unfounded claims of sentience.
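One way to picture this reporting style is a record that stores only a condition and the output observed under it. The sketch below is an illustration under stated assumptions, not a method from de Wynter's paper: the Observation class, the summarize function, and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    condition: str  # the prompt or setup, i.e. "condition X"
    output: str     # what the model returned, i.e. "output Y"


def summarize(observations):
    """Report outputs per condition without attributing any mental state."""
    by_condition = {}
    for obs in observations:
        by_condition.setdefault(obs.condition, Counter())[obs.output] += 1

    lines = []
    for condition, counts in by_condition.items():
        total = sum(counts.values())
        for output, n in counts.most_common():
            lines.append(
                f"Under condition {condition!r}, the model produced "
                f"{output!r} in {n} of {total} trials."
            )
    return lines


# Made-up sample data, for illustration only.
trials = [
    Observation("prompt A", "refusal"),
    Observation("prompt A", "refusal"),
    Observation("prompt A", "answer"),
]
for line in summarize(trials):
    print(line)
```

The summary makes a claim only about what was produced under which condition, so each statement can be checked by rerunning the trials.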
Key Takeaways
- Mathematical Equivalence: De Wynter argues that the medium of computation (whether goats in a game or text in a chat window) does not change the underlying math, yet it drastically changes our perception of "intelligence."
- Systemic Research Bias: 57% of the analyzed AI papers assume in their premises that LLMs possess human-like traits, and 36% reach conclusions that match those assumptions, a pattern de Wynter describes as circular reasoning.
- The Need for Observational Rigor: The AI community should stop attributing higher cognitive processes to models and focus instead on strictly observable, testable computational outputs.