What is a proton made of? The question sounds simple enough, but the answer resists every attempt at a clean, static picture. A proton is not three quarks sitting quietly in a bag. It is a seething quantum field system—quarks and gluons appearing, interacting, and dissolving at every energy scale. The deeper you look, the more structure you find.

Parton distribution functions, or PDFs, are the mathematical objects that capture this internal complexity. They tell us the probability of finding a quark or gluon carrying a particular fraction of the proton's momentum at a given resolution. They are not derived from first principles in any closed form. They must be extracted from nature, painstakingly, through experiment.

This makes PDFs a remarkable bridge between the abstract elegance of quantum chromodynamics and the concrete reality of particle collisions. Every prediction the Large Hadron Collider makes about new physics depends, at its foundation, on how well we understand the quarks and gluons inside a proton. The story of PDFs is the story of learning to read the proton's inner life.

Bjorken Scaling: The Clue That Protons Have Parts

In the late 1960s, experiments at the Stanford Linear Accelerator Center began firing high-energy electrons at protons and measuring how they scattered. If the proton were a smooth, featureless blob of charge, the scattering patterns would depend sensitively on both the energy of the collision and the angle of deflection—two independent variables tangled together in complicated ways.

Instead, physicists observed something striking. The structure functions describing the scattering depended primarily on a single dimensionless ratio: the fraction of the proton's momentum carried by whatever the electron had struck. This phenomenon, known as Bjorken scaling, was the fingerprint of point-like constituents inside the proton. Richard Feynman called them partons. They were the quarks and gluons of quantum chromodynamics, seen for the first time through their momentum signatures.

Bjorken scaling says, in essence, that at sufficiently high energies, the internal structure of the proton looks the same regardless of how hard you probe it. The structure function F₂(x) depends on the momentum fraction x but not on the momentum transfer Q². This is exactly what you would expect if the electron were bouncing off free, structureless particles inside the proton—a deeply counterintuitive result, given that quarks are permanently confined.
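
In the parton model, this statement has a compact form: the structure function is a charge-weighted sum over quark and antiquark distributions, with no Q² dependence. This is the standard leading-order relation, with e_q the electric charge of quark flavor q:

```latex
F_2(x) = \sum_q e_q^2 \, x \left[ q(x) + \bar{q}(x) \right]
```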

The resolution to this paradox is asymptotic freedom: at short distances, the strong coupling constant becomes small, and quarks behave as if they are nearly free. Bjorken scaling is not exact—it is violated logarithmically, and those violations turn out to be profoundly informative. But the approximate scaling was the key that unlocked the parton model and gave us the conceptual vocabulary of parton distribution functions in the first place.
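
Asymptotic freedom can be made quantitative with the textbook leading-order formula for the running coupling. The sketch below is illustrative only: the value of Λ and the number of active flavors are assumptions chosen for simplicity, not a precision determination.

```python
import math

# Leading-order running of the strong coupling (standard one-loop formula):
#   alpha_s(Q^2) = 12*pi / ((33 - 2*n_f) * ln(Q^2 / Lambda^2))
# Lambda ~ 0.2 GeV and n_f = 5 are illustrative choices, not fitted values.
LAMBDA_QCD = 0.2  # GeV, assumed for illustration
N_FLAVORS = 5

def alpha_s(q_gev):
    """One-loop strong coupling at momentum scale q (in GeV)."""
    beta0 = 33.0 - 2.0 * N_FLAVORS
    return 12.0 * math.pi / (beta0 * math.log(q_gev**2 / LAMBDA_QCD**2))

# The coupling shrinks as the probe energy grows: asymptotic freedom.
for q in (2.0, 10.0, 100.0):
    print(f"alpha_s({q:6.1f} GeV) = {alpha_s(q):.3f}")
```

At short distances (large Q) the coupling is small enough that quarks scatter almost as free particles, which is why the parton model works at all.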

Takeaway

Bjorken scaling revealed that protons contain point-like constituents by showing that scattering depends on momentum fraction rather than absolute energy—structure emerges most clearly when you ask the right dimensionless question.

QCD Evolution: Structure That Changes With Resolution

Bjorken scaling is approximate. When you probe a proton at higher and higher energies—increasing the momentum transfer Q²—the parton distributions shift. Quarks carrying large momentum fractions become rarer, while the population of low-momentum quarks and gluons grows. The proton, viewed at finer resolution, reveals an increasingly busy interior, with gluons splitting into quark-antiquark pairs and quarks radiating gluons.

This evolution is not arbitrary. It is governed by the DGLAP equations—named after Dokshitzer, Gribov, Lipatov, Altarelli, and Parisi—which are the renormalization group equations of QCD applied to parton distributions. They describe how the probability of finding a parton with momentum fraction x changes as the resolution scale increases. The equations encode the splitting functions: the calculable probabilities for a quark to radiate a gluon, or a gluon to split into a quark pair.
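
Written out for the quark case alone, the leading-order DGLAP equation has the standard convolution structure, with P_qq and P_qg the splitting functions for a quark emitting a gluon and a gluon splitting into a quark pair:

```latex
\frac{\partial q(x, Q^2)}{\partial \ln Q^2}
  = \frac{\alpha_s(Q^2)}{2\pi} \int_x^1 \frac{dz}{z}
    \left[ P_{qq}(z)\, q\!\left(\tfrac{x}{z}, Q^2\right)
         + P_{qg}(z)\, g\!\left(\tfrac{x}{z}, Q^2\right) \right]
```

The integral runs over all the higher-momentum partons that could have branched down to momentum fraction x, which is why evolution redistributes momentum from large x toward small x.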

What makes the DGLAP framework so powerful is that while the parton distributions themselves cannot be calculated from QCD, their evolution can. If you measure the PDFs at one energy scale, the DGLAP equations predict what they will look like at any other scale. This is the essence of QCD's predictive power for collider physics—a single set of measurements, evolved through rigorous perturbative equations, enables predictions across an enormous range of energies.

The logarithmic violations of Bjorken scaling that the DGLAP equations describe were among the earliest and most compelling confirmations of QCD as the correct theory of the strong force. Every time a collider measurement at a new energy matches the evolved prediction, it is a quiet triumph of quantum field theory—the running of the coupling and the branching of fields made visible in data.

Takeaway

You cannot calculate the proton's internal structure from first principles, but you can calculate exactly how it changes with energy—QCD's predictive power lives in the evolution, not the initial conditions.

Global Fits: Assembling the Proton From Many Experiments

No single experiment can determine parton distribution functions. Deep inelastic scattering reveals certain combinations of quark distributions. Drell-Yan processes—where a quark from one hadron annihilates with an antiquark from another to produce a lepton pair—illuminate the antiquark sea. Jet production at hadron colliders constrains the gluon distribution. W and Z boson production probes the flavor structure of the quark sea. Each measurement provides a different window into the proton.

The process of combining all this data into a coherent set of PDFs is called a global fit. Groups like CTEQ, MMHT, and NNPDF parameterize the parton distributions at a low starting scale, evolve them using DGLAP equations, compute predictions for hundreds of experimental observables, and then adjust the parameters to achieve the best agreement across all data simultaneously. It is an enormous optimization problem, involving thousands of data points from experiments spanning decades.
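
The core logic of such a fit can be sketched in a few lines. This is a deliberately minimal toy, not a real global fit: the parameterization x f(x) = A x^a (1-x)^b, the pseudodata, and the error model are all illustrative assumptions standing in for DGLAP-evolved predictions compared against thousands of measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy parameterization at a starting scale: x*f(x) = A * x^a * (1 - x)^b.
# Real fits use far more flexible forms (or neural networks, as NNPDF does).
def xf(x, A, a, b):
    return A * x**a * (1.0 - x)**b

# Synthetic "measurements": pseudodata generated from known parameters plus
# Gaussian noise, standing in for real experimental observables.
rng = np.random.default_rng(0)
true_params = (5.0, 0.8, 3.0)
x_data = np.linspace(0.05, 0.7, 40)
errors = np.full_like(x_data, 0.002)
y_data = xf(x_data, *true_params) + rng.normal(0.0, errors)

# The "global fit": adjust (A, a, b) to minimize chi-squared against the data.
popt, pcov = curve_fit(xf, x_data, y_data, p0=(1.0, 0.5, 1.0), sigma=errors)
param_errors = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties on the parameters
```

The covariance matrix returned by the fit is the toy analogue of the PDF uncertainty bands: it quantifies how well the data actually pin down each parameter.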

What emerges is not a single answer but a family of distributions with quantified uncertainties. The NNPDF collaboration, for instance, uses neural networks to parameterize the PDFs, avoiding restrictive functional forms and letting the data speak with minimal theoretical bias. The resulting uncertainty bands propagate directly into predictions for processes at the LHC—including searches for new physics, where PDF uncertainties can be a dominant source of theoretical error.

There is something philosophically striking about this enterprise. The proton—the most common hadron in the visible universe—cannot be fully described from theory alone. Its internal structure must be learned empirically, one experiment at a time, and assembled through careful statistical inference. PDFs are a living, evolving body of knowledge, updated with each new measurement. They are our best collective portrait of what lives inside a proton.

Takeaway

The proton's internal structure is not derived from a single elegant equation but painstakingly assembled from the combined evidence of many experiments—our understanding of the most basic building block of matter is fundamentally empirical.

Parton distribution functions sit at a fascinating boundary in physics—where the calculable meets the empirical, where the symmetries of QCD meet the irreducible complexity of a bound state. They are not elegant in the way a Lagrangian is elegant. They are elegant in the way a map is elegant: faithful to the territory.

Every collision at the LHC begins with two protons, and every prediction for what emerges depends on knowing what those protons contain. PDFs are the quiet infrastructure beneath every headline discovery in particle physics.

The proton, it turns out, is not a thing you can simply write down. It is a thing you must continually learn.