Cosmological Simulations: The Universe in a Computer

Cosmological simulations have become essential because we cannot experimentally manipulate the actual universe.

N-body methods track dark matter through hierarchical algorithms, with resolution and box size in perpetual tension.

Hydrodynamical simulations must rely on subgrid prescriptions for star formation and feedback, calibrated rather than derived.

Modern surveys depend on simulations for covariance matrices, pipeline validation, and parameter emulation.

The boundary between theory, observation, and computation has grown genuinely difficult to draw.

Consider a peculiar inversion of the scientific method. In most disciplines, we conduct experiments to test our theories against reality. But cosmology denies us this luxury. We cannot rerun the Big Bang, perturb the initial conditions, or wait a Hubble time for structures to form. The universe offers itself only once, and we are latecomers to its unfolding.

And yet, over the past four decades, cosmologists have discovered a remarkable workaround. On supercomputers around the world, we now grow synthetic universes from quantum seeds, watch dark matter halos collapse, and witness galaxies ignite within their gravitational wells. These are not mere visualizations but rigorous numerical experiments, each one a digital cosmos governed by gravity in an expanding spacetime (in practice usually Newtonian dynamics on an expanding relativistic background), hydrodynamics, and the messy thermochemistry of baryons.

The implications cut to the heart of how cosmology operates as a science. Simulations have become the indispensable bridge between the linear, analytically tractable physics of the early universe and the deeply nonlinear, observationally rich present. They are how we translate the parameters of Lambda-CDM into predictions for galaxy clustering, weak lensing maps, and the Lyman-alpha forest. To understand modern cosmology is, increasingly, to understand the architecture, limitations, and quiet philosophical strangeness of the computational universe.

N-body Methods: Choreographing Dark Matter

At the foundation of cosmological simulation lies a deceptively simple problem: track a collection of massive particles under their mutual gravitational attraction within an expanding spacetime. This is the N-body problem, and for cosmology it must be solved with billions of particles, across cosmic time, in a representative volume of the universe.

The naive approach, computing pairwise forces between all particles, scales as N squared and becomes prohibitive long before the billions of particles cosmology requires. Modern simulations sidestep this through hierarchical algorithms. Tree codes partition space recursively, approximating distant clusters of particles as single multipole sources, which reduces the cost to roughly N log N. Particle-mesh methods solve Poisson's equation on a grid via fast Fourier transforms, capturing long-range forces at a cost that also grows only as N log N in the number of grid cells. Hybrid TreePM schemes marry both, computing short-range forces with a tree and long-range forces spectrally on the mesh.
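
To make the N-squared scaling concrete, here is a minimal sketch of direct summation in plain Python. It assumes Newtonian gravity in a static, non-periodic box (no cosmic expansion) and a Plummer-style softening length; the function name `direct_accelerations` and its parameters are illustrative, not taken from any production code.

```python
def direct_accelerations(positions, masses, softening, G=1.0):
    """Direct-summation gravitational accelerations with Plummer softening.

    positions: list of (x, y, z) tuples; masses: list of floats.
    Every unordered pair is visited once, so the work grows as N*(N-1)/2.
    Returns (accelerations, number_of_pairs_evaluated).
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector pointing from particle i to particle j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            # Softened squared distance: keeps the force finite at small r.
            r2 = sum(c * c for c in d) + softening ** 2
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += G * masses[j] * d[k] * inv_r3  # i is pulled toward j
                acc[j][k] -= G * masses[i] * d[k] * inv_r3  # j is pulled toward i
            pair_count += 1
    return acc, pair_count
```

Doubling the particle count roughly quadruples `pair_count`, which is the cost that tree and particle-mesh methods are designed to avoid.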

Resolution, however, is the eternal currency. A simulation's spatial resolution is bounded by its softening length, below which gravitational forces are deliberately weakened to prevent unphysical two-body scattering. Mass resolution is set by the particle mass, which is the mean matter density times the box volume divided by the particle count: too coarse, and small halos vanish into noise; too fine, and the computational cost becomes prohibitive. Every cosmologist running such a simulation makes a Faustian bargain between box size and resolution.
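
The bargain can be put in numbers. The sketch below computes the particle mass for a dark-matter-only box, assuming the standard value of the critical density today and lengths in Mpc/h. The function names and the 100-particle threshold for a "resolved" halo are illustrative assumptions; the appropriate threshold depends on the halo finder and the property being measured.

```python
# Critical density of the universe today, in h^2 Msun / Mpc^3.
RHO_CRIT = 2.775e11

def particle_mass(box_size, n_per_side, omega_m):
    """Mass per particle in Msun/h.

    box_size: comoving box side in Mpc/h.
    n_per_side: particles per dimension (total count is n_per_side**3).
    omega_m: matter density parameter.
    """
    mean_spacing = box_size / n_per_side  # mean interparticle spacing, Mpc/h
    return omega_m * RHO_CRIT * mean_spacing ** 3

def min_resolved_halo_mass(box_size, n_per_side, omega_m, min_particles=100):
    """Smallest halo mass (Msun/h) containing at least min_particles particles.

    min_particles=100 is a rule-of-thumb assumption, not a universal standard.
    """
    return min_particles * particle_mass(box_size, n_per_side, omega_m)
```

For example, `particle_mass(1000.0, 1024, 0.3)` gives roughly 7.8e10 Msun/h. Halving the box side at fixed particle count lowers the particle mass by a factor of eight, at the price of an eight times smaller volume.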

Codes like GADGET, RAMSES, and Abacus (the code behind the AbacusSummit suite) have pushed these tradeoffs to extraordinary limits, with trillion-particle simulations now feasible on modern supercomputers, many of them GPU-accelerated. The Euclid Flagship and Uchuu simulations cover gigaparsec volumes while resolving the halos and subhalos that host individual galaxies, a dynamic range of several orders of magnitude in halo mass.

What emerges from these calculations is the cosmic web in its full nonlinear glory: filaments threading between voids, knots crystallizing into clusters, the universe's skeleton drawn in dark matter. It is, in a precise sense, the only way we can see this skeleton at all.

Takeaway
Every simulation embodies a fundamental epistemic tradeoff: we cannot simultaneously resolve the small and survey the large. Cosmology's computational reach is bounded not by ignorance but by arithmetic.

Hydrodynamics and the Subgrid Problem

Dark matter is computationally polite. It interacts only through gravity, requiring no thermodynamics, no chemistry, no radiative transfer. Baryons are a different beast entirely. To simulate gas, stars, and black holes, one must couple the Euler equations to gravity, track cooling and heating across orders of magnitude in temperature, and somehow capture the violent feedback that regulates galaxy formation.

Two paradigms dominate hydrodynamical simulation. Smoothed Particle Hydrodynamics represents gas as a swarm of fluid elements, naturally adaptive but historically poor at resolving shocks and instabilities. Adaptive Mesh Refinement and moving-mesh schemes like AREPO discretize space itself, refining grid cells where physics demands it. Each method carries its biases, and convergence between them remains an active concern.
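
As an illustration of the particle picture, the following sketch estimates gas density the way an SPH code does at its core: by summing particle masses weighted with a smoothing kernel. It assumes the widely used cubic spline kernel in three dimensions, written in the convention where `h` is the radius at which the kernel falls to zero. The function names are illustrative, and a real code would use a neighbor search rather than a loop over all particles.

```python
import math

def cubic_spline_kernel(r, h):
    """3D cubic spline kernel W(r, h), normalized to integrate to 1.

    Convention assumed here: the kernel vanishes for r > h.
    """
    q = r / h
    norm = 8.0 / (math.pi * h ** 3)
    if q <= 0.5:
        return norm * (1.0 - 6.0 * q * q + 6.0 * q ** 3)
    if q <= 1.0:
        return norm * 2.0 * (1.0 - q) ** 3
    return 0.0

def sph_density(point, positions, masses, h):
    """Kernel-weighted density estimate at `point`.

    Brute-force sum over all particles, for clarity rather than speed.
    """
    rho = 0.0
    for pos, m in zip(positions, masses):
        rho += m * cubic_spline_kernel(math.dist(point, pos), h)
    return rho
```

In practice `h` adapts so that each estimate draws on a roughly fixed number of neighbors, which is what makes the method naturally adaptive in dense regions.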

Yet even with perfect hydrodynamics, the deeper problem persists: the physics that matters most—star formation in molecular clouds, supernova explosions, accretion onto supermassive black holes—occurs on scales of parsecs or smaller, far below any cosmological simulation's resolution. These processes cannot be computed from first principles within the simulation. They must be prescribed.

This is the subgrid problem, and it is cosmology's worst-kept secret. Star formation efficiencies, supernova energy coupling, AGN feedback prescriptions—these are tuned to reproduce observed scaling relations like the galaxy stellar mass function. Projects like IllustrisTNG, EAGLE, and SIMBA each make different subgrid choices, and the resulting galaxy populations differ in subtle but measurable ways.
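
A generic example shows what "tuned" means here. The sketch below implements a Schmidt-type star formation recipe: gas above a density threshold turns into stars at a fixed efficiency per free-fall time, applied stochastically to each gas element. It is not the recipe of IllustrisTNG, EAGLE, SIMBA, or any other specific project; the function names and default values of `efficiency` and `rho_threshold` are hypothetical, and they are exactly the kind of knobs that get calibrated against observations.

```python
import math

# Gravitational constant in pc^3 / (Msun Myr^2).
G_PC_MSUN_MYR = 4.498e-3

def free_fall_time(rho, G=G_PC_MSUN_MYR):
    """Free-fall time in Myr for gas of density rho (Msun / pc^3)."""
    return math.sqrt(3.0 * math.pi / (32.0 * G * rho))

def star_formation_probability(rho, dt, efficiency=0.01, rho_threshold=1.0):
    """Probability that a gas element becomes a star particle during dt (Myr).

    efficiency: fraction of gas converted per free-fall time (tunable).
    rho_threshold: minimum density for star formation (tunable).
    Both defaults are placeholder values chosen for illustration.
    """
    if rho < rho_threshold:
        return 0.0
    return 1.0 - math.exp(-efficiency * dt / free_fall_time(rho))
```

A simulation would draw a uniform random number for each eligible gas element at each timestep and convert the element when the draw falls below this probability. Changing `efficiency` or `rho_threshold` shifts the resulting stellar masses, which is why reproducing the stellar mass function is a calibration target rather than a prediction.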

The honest position is that we are calibrating, not predicting, when it comes to baryonic physics. The cosmological skeleton is robust; the flesh upon it remains an inference.

Takeaway
When a simulation reproduces observations, ask whether it predicted them or was tuned to match them. The distinction marks the boundary between discovery and curve-fitting.

From Mock Universes to Precision Cosmology

Simulations have transformed from theoretical curiosities into essential infrastructure for observational cosmology. When Euclid measures weak lensing across a third of the sky, or DESI maps thirty million galaxy redshifts, the interpretation of those data depends on simulations at nearly every step.

Consider the covariance matrix problem. To extract cosmological parameters from a survey, one must quantify the statistical uncertainty in summary statistics like the power spectrum, including the correlations between bins. Analytic predictions become unreliable in the nonlinear regime, and internal estimates from the data itself, such as jackknife or bootstrap resampling, are noisy and can be biased on large scales. Suites of hundreds to thousands of mock catalogs, each an independent realization of the same cosmology, remain the most direct route to robust error estimation.
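
The estimate itself is simple; the expense lies in producing the mocks. Below is a minimal sketch, assuming each mock has already been reduced to a data vector of equal length (for example, binned power spectrum values). It also includes the correction factor commonly applied to the inverse of a covariance estimated from a finite number of mocks, usually called the Hartlap factor. The function names are illustrative.

```python
def mock_covariance(mocks):
    """Unbiased sample covariance of a data vector across mock realizations.

    mocks: list of n_mocks sequences, each of length n_bins.
    Returns an n_bins x n_bins matrix as a list of lists.
    """
    n_mocks = len(mocks)
    n_bins = len(mocks[0])
    mean = [sum(m[i] for m in mocks) / n_mocks for i in range(n_bins)]
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for m in mocks:
        delta = [m[i] - mean[i] for i in range(n_bins)]
        for i in range(n_bins):
            for j in range(n_bins):
                cov[i][j] += delta[i] * delta[j]
    return [[c / (n_mocks - 1) for c in row] for row in cov]

def hartlap_factor(n_mocks, n_bins):
    """Factor multiplying the inverse sample covariance to remove its bias.

    Only defined when there are more mocks than n_bins + 2.
    """
    if n_mocks <= n_bins + 2:
        raise ValueError("need more than n_bins + 2 mocks")
    return (n_mocks - n_bins - 2) / (n_mocks - 1)
```

With 1000 mocks and a 20-element data vector the factor is 978/999, a correction of about two percent. As the data vector grows toward the number of mocks, the factor falls toward zero, which is one reason surveys need such large mock suites.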

Simulations also serve as testbeds for analysis pipelines. Before any cosmological constraint can be trusted, the pipeline must recover the input parameters from synthetic data generated with known cosmology. Systematic biases in photometric redshifts, intrinsic alignments, or selection effects can all be diagnosed in this controlled setting.

More ambitiously, emulators trained on simulation grids now provide fast, often differentiable mappings from cosmological parameters to observables, standing in for expensive simulations (and, in some pipelines, for Boltzmann codes) inside Markov chain inference. Machine learning models built on simulation outputs are beginning to extract information beyond traditional two-point statistics, mining the full nonlinear field for cosmological signal.
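
The core idea of an emulator fits in a few lines: run the expensive calculation at a handful of parameter values, then interpolate. The toy below is a deliberately simple stand-in, assuming a single parameter and piecewise-linear interpolation. Real emulators typically use Gaussian processes or neural networks over many parameters, and the class name `LinearEmulator` is hypothetical.

```python
import bisect

class LinearEmulator:
    """Piecewise-linear surrogate for an expensive function of one parameter."""

    def __init__(self, params, outputs):
        # Sort the training set by parameter value.
        pairs = sorted(zip(params, outputs))
        self.x = [p for p, _ in pairs]
        self.y = [o for _, o in pairs]

    def _segment(self, value):
        """Index of the right-hand training point bracketing `value`."""
        if not self.x[0] <= value <= self.x[-1]:
            raise ValueError("value lies outside the training range")
        i = bisect.bisect_right(self.x, value)
        return min(max(i, 1), len(self.x) - 1)

    def predict(self, value):
        """Interpolated output at `value`."""
        i = self._segment(value)
        x0, x1 = self.x[i - 1], self.x[i]
        y0, y1 = self.y[i - 1], self.y[i]
        return y0 + (y1 - y0) * (value - x0) / (x1 - x0)

    def gradient(self, value):
        """Slope of the surrogate at `value`, constant within each segment."""
        i = self._segment(value)
        return (self.y[i] - self.y[i - 1]) / (self.x[i] - self.x[i - 1])
```

Each `predict` call costs a binary search instead of a simulation, which is what makes it usable inside a sampler that needs many thousands of evaluations. The refusal to extrapolate is deliberate: an emulator is only as trustworthy as the parameter range its training simulations cover.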

There is something philosophically peculiar here. We use simulated universes to interpret the real one, then use the real one to constrain the parameters of further simulations. The loop is tight, and the question of where theory ends and computation begins has become genuinely difficult to answer.

Takeaway
Modern cosmology operates as a feedback loop between observation and simulation. The computer is no longer merely a tool of theory; it has become a third epistemic pillar alongside theory and observation.

There is a quiet vertigo in contemplating what cosmological simulations actually are. Inside silicon, we instantiate not models of universes but functional universes themselves—governed by the same laws, seeded with the same fluctuations, evolving toward statistically indistinguishable endpoints. The distinction between simulation and reality becomes, at minimum, technically subtle.

Yet we must not overstate the case. Every simulation is a compromise: finite volume, finite resolution, prescribed subgrid physics, assumed cosmology. The map is detailed and useful, but it is not the territory. The real universe still holds surprises that no simulation anticipated, from the Hubble tension to unexpected features in the cosmic web.

Perhaps this is the deeper lesson. Simulations do not replace the universe; they sharpen our ability to ask it questions. They are instruments of inquiry, calibrated against reality, illuminating the gaps in our understanding precisely where their predictions fail.