For a robot to pick up an object, it needs to know more than what the object looks like. It needs to know where it is — not just in two dimensions, but in three. Passive stereo cameras can approximate depth, but they struggle with textureless surfaces, uniform lighting, and the kind of speed that industrial manipulation demands.
Active depth sensing solves this by projecting energy into the scene and measuring what comes back. Two dominant technologies have emerged: structured light and time-of-flight. Both produce dense 3D point clouds. Both integrate with standard robot perception pipelines. But they work on fundamentally different physical principles, and those differences have real consequences for system design.
Understanding how each technology generates depth data — and where each one breaks down — is essential for selecting the right sensor for a given robotic application. The choice affects everything from grasp planning accuracy to cycle time to whether your system will function reliably when a second robot is running three meters away.
Structured Light Principles: Projecting Geometry to Recover Depth
Structured light systems work by projecting a known pattern — typically infrared dots or coded stripe sequences — onto a scene and then observing how that pattern deforms across surfaces. A camera offset from the projector captures the distorted pattern. Because the geometry between projector and camera is calibrated precisely, the system uses triangulation to compute the depth of each point where the pattern lands.
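The triangulation step reduces to the pinhole-stereo relation z = f · B / d, where f is the focal length in pixels, B the projector-camera (or camera-camera) baseline, and d the observed disparity. A minimal sketch, using illustrative values rather than any specific sensor's calibration:

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Pinhole-stereo triangulation: z = f * B / d.

    focal_px:     focal length in pixels (from camera calibration)
    baseline_m:   projector-to-camera or camera-to-camera baseline in meters
    disparity_px: horizontal shift of the matched pattern feature in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers (not taken from a datasheet): f = 640 px, B = 55 mm.
# A 20-pixel disparity then corresponds to a depth of 1.76 m.
z = depth_from_disparity(640.0, 0.055, 20.0)
```

Note how depth precision degrades with distance: because z is inversely proportional to disparity, a one-pixel matching error costs far more depth accuracy at 3 m than at 0.5 m, which is one reason these sensors are tuned for close range.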
The Intel RealSense D400 series is a widely deployed example. It projects a pseudo-random dot pattern in the near-infrared spectrum and uses a pair of IR cameras to perform stereo matching on the textured scene. The projected pattern gives the stereo algorithm consistent features to lock onto, even on surfaces — like cardboard boxes or white plastic housings — that would defeat passive stereo entirely. This is why structured light sensors became the default for bin-picking and tabletop manipulation.
Resolution and accuracy are genuine strengths. At working distances under two meters, structured light systems routinely achieve sub-millimeter depth precision. This matters when a robot needs to distinguish between tightly packed objects or estimate the pose of a part with fine geometric features. The dense, high-resolution point clouds these sensors produce feed directly into algorithms for surface normal estimation, object segmentation, and grasp candidate generation.
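The point cloud itself comes from back-projecting each depth pixel through the camera intrinsics. A sketch of the standard pinhole back-projection, with placeholder intrinsic values (fx, fy, cx, cy would come from your sensor's calibration):

```python
def backproject(u: float, v: float, z: float,
                fx: float, fy: float, cx: float, cy: float) -> tuple:
    """Back-project pixel (u, v) with measured depth z into a 3D point
    in the camera frame, using pinhole intrinsics."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# Placeholder intrinsics for a 640x480 sensor: fx = fy = 600 px,
# principal point at the image center.
# The center pixel at 1 m depth maps to the optical axis: (0, 0, 1).
point = backproject(320.0, 240.0, 1.0, 600.0, 600.0, 320.0, 240.0)
```

Running this over every valid depth pixel yields the dense cloud that downstream normal-estimation and grasp-planning stages consume.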
The engineering tradeoff is range and ambient light sensitivity. Structured light patterns wash out in direct sunlight because the IR projector cannot compete with the sun's broadband infrared output. Effective range also drops off quickly — most systems are optimized for 0.3 to 3 meters. Beyond that, the projected pattern becomes too sparse to produce reliable depth. For indoor, short-range manipulation tasks, these constraints are acceptable. For outdoor mobile robots or long-range logistics, they are not.
Takeaway: Structured light excels at close-range precision by imposing known geometry on a scene. Its fundamental limitation is that projected patterns must remain visible against ambient illumination — a constraint that defines where it can and cannot be deployed.
Time-of-Flight Operation: Measuring Depth with Photon Round Trips
Time-of-flight sensors take a conceptually simpler approach: flood the scene with modulated infrared light, then measure how long that light takes to return to the sensor. Since the speed of light is known, the round-trip time gives you distance. Every pixel on the sensor independently computes its own depth measurement, which means ToF cameras produce a complete depth map in a single capture — no pattern matching, no stereo correspondence.
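The core relation is simply d = c · t / 2: distance is half the round-trip path. A sketch of the direct-ToF calculation:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def dtof_distance(round_trip_s: float) -> float:
    """Direct ToF: the measured round-trip time covers the path to the
    target and back, so distance is half the total path length."""
    return C * round_trip_s / 2.0

# A target at 10 m returns light after roughly 66.7 nanoseconds —
# which is why dToF needs picosecond-class timing electronics.
d = dtof_distance(66.7e-9)
```

The tiny time scales involved are the practical engineering challenge: resolving millimeters of depth means resolving picoseconds of photon travel time.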
There are two main variants. Direct ToF (dToF) emits short pulses and times their return with single-photon avalanche diodes (SPADs). This method works well at long range — ten meters and beyond — and is the basis for most LiDAR systems used in autonomous vehicles. Indirect ToF (iToF) emits continuously modulated light and measures the phase shift of the returning signal. iToF sensors like the PMD or Melexis families are compact, inexpensive, and common in robotic arms and mobile platforms operating at room scale.
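For the indirect variant, distance follows from the measured phase shift: d = c · φ / (4π · f_mod). Because phase wraps at 2π, an iToF sensor also has an unambiguous range of c / (2 · f_mod), beyond which returns alias to shorter distances. A sketch, with 20 MHz as an illustrative (not vendor-specific) modulation frequency:

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def itof_distance(phase_rad: float, mod_freq_hz: float) -> float:
    """Indirect ToF: d = c * phi / (4 * pi * f_mod)."""
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz: float) -> float:
    """Phase wraps at 2*pi, so measurements alias beyond c / (2 * f_mod)."""
    return C / (2.0 * mod_freq_hz)

# At 20 MHz modulation the unambiguous range is about 7.5 m — a target
# at 8 m would read back as roughly 0.5 m unless the ambiguity is resolved
# (e.g. by interleaving a second modulation frequency).
r_max = unambiguous_range(20e6)
```

This range-versus-precision coupling is why many iToF sensors alternate between two modulation frequencies and fuse the results.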
The key advantage of ToF for robotics is ambient light immunity. Because the sensor measures modulation phase or pulse timing rather than pattern visibility, it tolerates bright environments far better than structured light. This makes ToF viable for outdoor applications, warehouse environments with skylights, and any scenario where lighting conditions are unpredictable or uncontrolled.
The tradeoff is resolution and multipath interference. ToF sensors typically produce lower-resolution depth maps than structured light — often 320×240 or 640×480 compared to megapixel structured light outputs. Multipath artifacts, where reflected light bounces off multiple surfaces before reaching the sensor, can corrupt depth readings in corners or near reflective objects. These errors are systematic and predictable, but they require careful calibration and sometimes algorithmic correction in the perception pipeline.
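Full multipath correction requires modeling the scene's reflectance, but a common first-pass mitigation is simply flagging suspect pixels — readings that jump sharply away from both neighbors, as "flying pixels" at depth edges and multipath-corrupted corners tend to do. A simplified one-row sketch of that screen (the 5 cm jump threshold is an illustrative value, not a standard):

```python
def flag_depth_outliers(depth_row: list, max_jump_m: float = 0.05) -> list:
    """Flag pixels whose depth differs from BOTH neighbors by more than
    max_jump_m — a crude screen for flying pixels and multipath artifacts.
    Real pipelines apply a 2D version of this over the full depth map."""
    flags = [False] * len(depth_row)
    for i in range(1, len(depth_row) - 1):
        jump_left = abs(depth_row[i] - depth_row[i - 1])
        jump_right = abs(depth_row[i] - depth_row[i + 1])
        if jump_left > max_jump_m and jump_right > max_jump_m:
            flags[i] = True
    return flags

# A lone 2 m reading in a row of 1 m readings gets flagged as suspect.
flags = flag_depth_outliers([1.00, 1.01, 2.00, 1.01, 1.00])
```

Filtering like this trades a few valid edge pixels for much cleaner input to downstream planning — usually a good bargain.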
Takeaway: Time-of-flight sensors trade spatial resolution for robustness. Each pixel independently measures depth, which simplifies the computation but introduces unique error modes — multipath interference chief among them — that structured light systems never encounter.
Selection Criteria: Matching Sensor Physics to Application Constraints
Choosing between structured light and ToF is not a question of which technology is better. It is a question of which set of physics-based tradeoffs aligns with your application's constraints. Four factors dominate the decision: working range, depth resolution, ambient light conditions, and multi-sensor interference.
For precision manipulation at close range — bin-picking, assembly verification, quality inspection — structured light is usually the stronger choice. The sub-millimeter depth accuracy and high spatial resolution justify the limitations on range and sunlight tolerance. Most of these tasks happen in controlled indoor environments where those limitations are irrelevant. The Intel RealSense D415, for example, was specifically designed for this envelope, with a narrower field of view and tighter baseline tuned for close-range depth precision.
For mobile robots, outdoor navigation, or applications requiring depth beyond three meters, ToF is the more reliable option. Its per-pixel depth computation is inherently parallel and fast, supporting real-time obstacle avoidance at frame rates of 30 Hz or higher. The lower resolution is acceptable for path planning and collision avoidance, where you need to know that something is at 4.2 meters, not what its surface geometry looks like at micron scale.
Multi-sensor interference is often the overlooked factor. When multiple structured light sensors operate in overlapping fields of view, their projected patterns collide, producing corrupted depth maps. ToF sensors modulated at similar frequencies can also interfere with each other, but frequency-division multiplexing offers a straightforward mitigation. In multi-robot workcells — increasingly common in logistics and fulfillment — this interference characteristic can be the deciding factor. It is worth testing early, because it is expensive to discover in production.
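The four criteria above can be condensed into a toy decision helper. This is a sketch of the reasoning in this section, using the rough figures quoted here (3 m range boundary, sub-millimeter precision need) as thresholds — real selection still requires datasheet review and in-situ testing:

```python
def recommend_sensor(range_m: float, needs_submm: bool,
                     outdoor: bool, multi_sensor: bool) -> str:
    """Toy decision helper mirroring the four selection criteria:
    working range, depth resolution, ambient light, and interference.
    Thresholds are illustrative, not universal rules."""
    if outdoor or range_m > 3.0:
        # Structured light patterns wash out in sunlight and thin out
        # beyond a few meters; ToF handles both.
        return "time-of-flight"
    if needs_submm and range_m <= 2.0:
        # Close-range precision work is structured light's envelope.
        return "structured light"
    if multi_sensor:
        # Overlapping projected patterns collide; ToF interference can
        # be mitigated with frequency-division multiplexing.
        return "time-of-flight"
    return "either; prototype both against your actual parts and lighting"

choice = recommend_sensor(range_m=1.0, needs_submm=True,
                          outdoor=False, multi_sensor=False)
```

The point of writing the logic down is not automation — it is forcing the constraints to be stated explicitly before a sensor is ordered.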
Takeaway: No depth sensor is universally optimal. The right selection comes from honestly mapping your application's range, resolution, lighting, and multi-sensor requirements against each technology's physics — not from defaulting to whichever sensor has the better datasheet headline number.
Structured light and time-of-flight are not competing technologies so much as complementary tools shaped by different physics. One projects geometry and triangulates. The other measures photon travel time. Each approach carries inherent strengths and irreducible limitations.
The practical lesson for system designers is this: start with the application constraints, not the sensor catalog. Define your working range, required depth precision, lighting environment, and whether multiple sensors must coexist. The physics will point you toward the right answer.
Depth sensing is what transforms a robot from a blind actuator into a spatial reasoner. Getting that foundation right — choosing the sensor whose failure modes you can actually manage — determines how robust everything built on top of it will be.