WhyChips

A professional platform focused on electronic component information and knowledge sharing.

800G Guide: SR8 vs DR8 vs FR4 Cost & Power for AI


Introduction: The “Optical Wall” in AI Factories

As we transition from the cloud era to the AI factory era, the fundamental constraints of data center architecture are shifting. In traditional hyperscale clouds, the primary metric was Cost per Gigabyte ($/GB) of storage or compute. In the age of Generative AI, defined by trillion-parameter models like GPT-4 and massive clusters like NVIDIA’s SuperPODs, the new ruling metrics are Power per Bit (pJ/bit) and Interconnect Latency.

The deployment of 800G networking is not just an upgrade; it is a necessity to feed the voracious bandwidth appetite of GPUs like the H100 and the upcoming Blackwell B200. However, optics have become a bottleneck. Industry analysis suggests that in an optimized AI cluster, optical interconnects can account for 20-30% of the total Bill of Materials (BOM) and up to 25% of the total cluster power budget.

This guide provides a deep-dive technical comparison of the three dominant 800G module form factors—SR8, DR8, and 2xFR4—analyzing their trade-offs in cost, power consumption, and link budget. We will also explore how emerging technologies like LPO (Linear Pluggable Optics) and CPO (Co-Packaged Optics) are reshaping these decisions for the 1.6T era.

1. The Contenders: SR8, DR8, and FR4 Explained

Selecting the right transceiver is a function of reach, fiber infrastructure (multimode vs. single-mode), and cost sensitivity.

800G-SR8 (Short Reach)

  • Technology: 8x100G PAM4 lanes using 850nm VCSELs (Vertical-Cavity Surface-Emitting Lasers).
  • Fiber Type: Multimode Fiber (MMF), typically OM4 or OM5.
  • Reach: Up to 50m (OM3) or 100m (OM4).
  • Use Case: Intra-rack connections (Server to ToR switch) or short inter-rack links in high-density pods.
  • Pros: Lowest module cost due to mature VCSEL technology; lower power than EML-based solutions.
  • Cons: Requires bulky, expensive MPO-16 cabling; limited reach makes it unsuitable for large spine-leaf fabrics.

800G-DR8 (Datacenter Reach)

  • Technology: 8x100G PAM4 lanes using SiPh (Silicon Photonics) or 1310nm EMLs.
  • Fiber Type: Single-Mode Fiber (SMF) with MPO-12 or MPO-16 connectors.
  • Reach: Up to 500m.
  • Use Case: Leaf-Spine interconnects; the standard workhorse for hyperscale AI backends.
  • Pros: Future-proof fiber infrastructure (SMF); balanced cost/performance; high volume adoption drives cost down.
  • Cons: Higher module cost than SR8; higher power consumption due to DSP requirements (unless using LPO).

800G-2xFR4 (2 km Reach)

  • Technology: 2 groups of 4x100G CWDM (Coarse Wavelength Division Multiplexing) wavelengths (1271, 1291, 1311, 1331 nm).
  • Fiber Type: Two duplex SMF pairs (dual-LC or dual-CS connectors), one pair per 400G FR4 lane group.
  • Reach: Up to 2km (a 2xLR4 variant extends this to 10km).
  • Use Case: Inter-building or long-span campus connections; connecting different AI clusters or data halls.
  • Pros: Uses standard duplex fiber pairs rather than MPO trunks (saving cabling cost over long distances).
  • Cons: Most expensive module due to MUX/DEMUX components and cooled lasers; highest power consumption.

2. Power Consumption Analysis: The TCO Killer

In an AI cluster with 32,000 GPUs, saving 2W per transceiver translates to nearly 1 Megawatt of power savings at the data center level (considering cooling overhead, PUE 1.2).
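
The arithmetic behind that claim can be sketched as follows. The transceiver-per-GPU ratio is an assumption here (roughly 10 optics per GPU across all switch tiers of a multi-tier fabric is a plausible planning figure, not a quoted number):

```python
# Back-of-envelope facility-level savings from shaving 2 W per transceiver.
# Assumptions (illustrative, not vendor data): ~10 optical transceivers
# per GPU across the fabric tiers, and a facility PUE of 1.2.

def fleet_savings_mw(gpus, optics_per_gpu, watts_saved, pue):
    """Total facility-level power savings in megawatts."""
    transceivers = gpus * optics_per_gpu
    return transceivers * watts_saved * pue / 1e6

savings = fleet_savings_mw(gpus=32_000, optics_per_gpu=10,
                           watts_saved=2, pue=1.2)
print(f"{savings:.2f} MW")  # prints 0.77 MW — "nearly 1 MW" at the facility
```

With a denser fabric (more optics per GPU) the same 2 W delta pushes past the 1 MW mark.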

The DSP Problem

Traditional 800G pluggable modules rely on a Digital Signal Processor (DSP) to recover signals and perform retiming.

  • Standard DSP Module Power: 14W – 18W per module.
  • The DSP alone accounts for ~50% of this power draw.

The LPO Revolution (Linear Pluggable Optics)

LPO removes the DSP from the module, relying on the host ASIC’s SerDes (Serializer/Deserializer) to drive the signal.

  • LPO Power: 8W – 10W per module.
  • Latency Benefit: Eliminating the DSP shaves off ~100ns of latency, which is critical for the collective communication patterns (All-Reduce) in AI training.

Verdict: For power-constrained AI racks (e.g., >100kW per rack), shifting from DSP-based DR8 to LPO-DR8 is the most effective immediate lever to reduce power without changing the switch architecture.

3. Cost Analysis & BOM Impact

When calculating the BOM for an AI cluster, one must look at the Total Link Cost, which includes both the transceiver and the fiber cabling.

  • Module Cost: SR8 < DR8 < 2xFR4
    • SR8 is roughly 30% cheaper than DR8 due to high-yield VCSELs.
  • Cabling Cost: MMF (for SR8) >> SMF (for DR8/FR4)
    • Multimode fiber is significantly more expensive per meter than single-mode fiber due to the larger core and manufacturing process.
    • The Crossover Point: For distances > 30-50 meters, the higher cost of MMF cabling negates the savings of the SR8 module.
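
That crossover can be made concrete with a simple total-link-cost model. All prices below are hypothetical placeholders chosen to match the rough ratios in the text (SR8 module ~30% cheaper, MMF trunk cabling far more expensive per installed meter than SMF), not actual quotes:

```python
# Illustrative total-link-cost crossover between SR8 (MMF) and DR8 (SMF).
# Prices are hypothetical placeholders, not vendor quotes.
SR8_MODULE, DR8_MODULE = 700.0, 1000.0   # $ per module (SR8 ~30% cheaper)
MMF_PER_M, SMF_PER_M = 16.0, 2.0         # $ per installed meter of trunk cable

def link_cost(module_price, fiber_per_m, length_m):
    # Two transceivers per link plus the cable run between them.
    return 2 * module_price + fiber_per_m * length_m

for length in (10, 30, 50, 100):
    sr8 = link_cost(SR8_MODULE, MMF_PER_M, length)
    dr8 = link_cost(DR8_MODULE, SMF_PER_M, length)
    winner = "SR8" if sr8 < dr8 else "DR8"
    print(f"{length:>4} m: SR8 ${sr8:.0f} vs DR8 ${dr8:.0f} -> {winner}")
```

With these placeholder prices the crossover lands in the low-40-meter range; the exact point shifts with real pricing, but the shape of the curve is the same.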

Strategic Advice: Use SR8 strictly for DAC (Direct Attach Copper) replacements or very short intra-rack links (< 10m). For any structured cabling traversing the data hall, DR8 + SMF is the superior long-term investment, as SMF can support future speeds (1.6T, 3.2T) without ripping out cable trays.

4. Link Budget and Optical Margin

AI workloads demand zero packet loss. The link budget defines how much optical loss (from connectors, splices, and fiber attenuation) a system can tolerate before Bit Error Rate (BER) spikes.

  • SR8 Budget: ~1.9 dB (OM4). Very tight. Allows for only 2-3 connector pairs.
  • DR8 Budget: ~3.0 – 4.0 dB. More robust. Supports structured cabling with patch panels.
  • FR4 Budget: ~6.3 dB. Robust. Designed for complex paths with multiple patch points.

Engineering Note: In PAM4 modulation, the signal-to-noise ratio (SNR) is much lower than in NRZ. “Flapping” links are often caused by dirty connectors eating up the slim 1.9 dB margin of SR8 links. Connector cleanliness is a tier-0 priority.
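
A quick budget check makes the tightness tangible. The per-element losses below are typical planning values (assumptions for illustration, not spec limits):

```python
# Sanity-check a proposed link against the module's optical loss budget.
# Per-element losses are typical planning values (assumptions, not spec).
CONNECTOR_LOSS_DB = 0.5      # per mated connector pair (dirty ones run higher)
OM4_LOSS_DB_PER_KM = 3.0     # MMF at 850 nm (SMF at 1310 nm is closer to 0.4)

def link_loss_db(connector_pairs, length_m, fiber_db_per_km):
    return connector_pairs * CONNECTOR_LOSS_DB + (length_m / 1000) * fiber_db_per_km

budget_sr8 = 1.9
loss = link_loss_db(connector_pairs=3, length_m=80,
                    fiber_db_per_km=OM4_LOSS_DB_PER_KM)
print(f"loss {loss:.2f} dB vs budget {budget_sr8} dB -> "
      f"{'OK' if loss <= budget_sr8 else 'FAIL'}")
```

An 80 m SR8 run with three connector pairs lands at 1.74 dB, leaving only 0.16 dB of margin: one dirty ferrule is enough to push it over.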

5. Future Outlook: 1.6T, CPO, and the End of Pluggables?

As we look toward 1.6T (Blackwell Ultra / Rubin generation), the power density of pluggables becomes unmanageable.

  • 1.6T Pluggables: Projected to consume 22W – 25W each. A 102.4T switch (64 such ports) would need >1400W just for optics.
  • CPO (Co-Packaged Optics): Moves the optical engine onto the switch substrate.
    • Target Power: < 5W per 800G.
    • Status: High manufacturing complexity; ecosystem lock-in risks.
  • NVIDIA’s Bet: NVIDIA is pushing aggressive NVLink scalability, which may favor proprietary CPO-like implementations (like the NVLink Switch tray) over standard Ethernet pluggables for the GPU-to-GPU fabric, leaving Ethernet (DR8) for the front-end network.
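
The faceplate-power gap between pluggables and CPO falls out of simple multiplication (the 23 W figure is a midpoint of the projected range above; the switch radix is a hypothetical 64-port example):

```python
# Faceplate optics power for a hypothetical 102.4T switch:
# 64 x 1.6T pluggables vs a CPO design at the cited per-800G target.
ports_16t = 64
pluggable_w = ports_16t * 23       # ~22-25 W per 1.6T pluggable (midpoint)
cpo_w = ports_16t * 2 * 5          # CPO target: <5 W per 800G engine
print(pluggable_w, cpo_w)          # → 1472 640
```

Even at the CPO target's upper bound, the optics budget drops by more than half per switch.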

FAQ: Common Questions on 800G Selection

Q: Can I interoperate SR8 and DR8?

A: No. SR8 uses 850nm wavelength over Multimode fiber. DR8 uses 1310nm over Single-mode fiber. They are physically and optically incompatible.

Q: Is LPO ready for mass deployment in 2025?

A: Yes, but it requires strict qualification. Because LPO lacks a DSP, the host switch ASIC and the module must be tuned together. It is not “plug-and-play” in the traditional sense; it is “plug-tune-and-play.”

Q: Why not use 2xFR4 for everything to be safe?

A: Cost and Power. 2xFR4 modules run hotter and cost 2-3x more than DR8. Using them for 50m links is a waste of CAPEX and OPEX.

Conclusion

For AI Cluster builders in 2025/2026, the “Golden Rule” of 800G selection is:

  1. Intra-Rack (< 3m): Use DAC (Passive Copper). Zero power, lowest cost.
  2. Neighbor Racks (< 10m): Use AEC (Active Electrical Cables) or SR8.
  3. Leaf-Spine / Pod-to-Pod (< 500m): Standardize on DR8 (SiPh). It is the volume leader, balances power/cost, and uses future-proof SMF. Evaluate LPO-DR8 to slash power by 40%.
  4. Campus / DCI (> 500m): Use 2xFR4 or coherent ZR optics for very long reaches.
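
The Golden Rule above reduces to a distance-based lookup. This is an illustrative sketch only; real designs also weigh power, cost, and the existing fiber plant, and the thresholds simply follow the list:

```python
# The "Golden Rule" of 800G selection as a simple distance lookup (illustrative).
def pick_interconnect(reach_m: float) -> str:
    if reach_m < 3:
        return "DAC (passive copper)"
    if reach_m < 10:
        return "AEC or 800G-SR8"
    if reach_m <= 500:
        return "800G-DR8 (consider LPO-DR8)"
    return "800G-2xFR4 or coherent ZR"

for d in (1, 5, 200, 2000):
    print(f"{d:>5} m -> {pick_interconnect(d)}")
```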

The “AI factory” is an exercise in efficiency. Every watt saved on optics is a watt that can be given back to the GPU for compute.
