DEEP RESEARCH · NVIDIA · SILICON PHOTONICS

NVIDIA's Photonics Gambit: Dissecting the Silicon Photonics Strategy Behind the Rubin AI Platform

A dual playbook — in-house MRM-based CPO with TSMC plus equity stakes in ELS startups — turning the interconnect into the new CUDA

Posted: 2025-10-06 · Semiconductors / AI Infrastructure · Original Naver post

All investment decisions are your own responsibility. This is research, not a buy/sell recommendation.

0. Bottom line first

To break the interconnect bottleneck of its next-gen Rubin AI platform, NVIDIA is executing a sophisticated "make & buy" dual strategy: (1) aggressively co-developing an in-house Co-Packaged Optics (CPO) platform built on Micro-Ring Modulators (MRMs) in partnership with TSMC, and (2) taking strategic equity stakes in key startups — notably Scintil Photonics and Xscape Photonics — to lock in the External Light Source (ELS) supply chain and the future roadmap.

Official fact: The Rubin platform — anchored by the NVLink 6 switch and the CX9 SuperNIC — will be the first generation to fully adopt this CPO architecture. NVIDIA targets more than 3.5× power efficiency and more than 10× resiliency versus traditional pluggable optical modules.

Interpretation: This is not an incremental upgrade — it is a fundamental redesign of the data-center fabric, designed to build a fresh moat around NVIDIA's full-stack AI offering. Picking MRMs over more exotic platforms like TFLN or EOP is a pragmatic call that prioritizes manufacturability with TSMC and ecosystem control over peak lab performance.


1. The interconnect bottleneck: why photonics is core to NVIDIA's AI dominance

The analysis starts with the problem. The shift from generative AI to "Agentic AI" forces models to reason through complex multi-step processes, which explodes token generation per query. That is exactly why Jensen Huang argues we need "100× more" compute than the industry previously assumed.

That sets up two existential threats to data-center scalability: the Power Wall and the Bandwidth Wall. AI factories will consume enormous amounts of power, and the interconnect accounts for a major share of it. NVIDIA's one-year-cadence roadmap (Blackwell → Rubin → Feynman) hinges entirely on solving this scalability problem.

NVIDIA has a near-monopoly grip on the compute layer through GPUs and CUDA. As rivals close the gap with their own accelerators, cluster-level performance is increasingly limited by the network, not the GPU. That makes the interconnect stack (NVLink, InfiniBand, and now CPO) the next battleground. Rubin is not just a new chip; it is a fully integrated rack-scale system in which the network matters as much as the silicon.

From pluggable optics to CPO: an inevitable transition

Pluggable transceivers are running out of room. The long electrical path on the PCB from the switch ASIC to the module cage incurs roughly 22 dB of signal loss and costs significant power. At 1.6T per port, a pluggable module burns about 30 W, while CPO can do the same job in around 9 W.

Official fact: CPO places the photonic engine right next to the switch ASIC on the same substrate. That dramatically shortens the electrical path, improves signal integrity, cuts power by up to 3.5×, and increases bandwidth density.
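As a rough sanity check on the figures above, the per-port arithmetic can be sketched in a few lines. The 30 W and 9 W values are the approximate 1.6T figures quoted in this post; the port count is a hypothetical number chosen only to show how the savings scale, not a product spec.

```python
# Approximate per-port optics power at 1.6T, as quoted in this post.
PLUGGABLE_W = 30.0
CPO_W = 9.0

# Hypothetical port count (an assumption for illustration only).
EXAMPLE_PORTS = 1_000


def power_ratio(pluggable_w: float, cpo_w: float) -> float:
    """Factor by which pluggable power exceeds CPO power per port."""
    return pluggable_w / cpo_w


def savings_kw(pluggable_w: float, cpo_w: float, ports: int) -> float:
    """Total optics power saved across `ports` ports, in kilowatts."""
    return (pluggable_w - cpo_w) * ports / 1_000


ratio = power_ratio(PLUGGABLE_W, CPO_W)
saved_kw = savings_kw(PLUGGABLE_W, CPO_W, EXAMPLE_PORTS)

print(f"Pluggable draws about {ratio:.1f}x the power of CPO per port")
print(f"Savings across {EXAMPLE_PORTS} ports: {saved_kw:.0f} kW")
```

With these rounded inputs the ratio comes out near 3.3×, which is in line with the "up to 3.5×" headline figure.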


2. The in-house engine: NVIDIA's CPO platform and Micro-Ring Modulators

Quantum-X vs Spectrum-X Photonics

Quantum-X Photonics (InfiniBand)

Designed for the highest-performance, lowest-latency InfiniBand fabric. 144 ports at 800 Gb/s. Targets the most tightly coupled training and inference clusters.

Spectrum-X Photonics (Ethernet)

For hyperscale and enterprise AI clouds. Up to 400 Tb/s aggregate bandwidth, 200 Gb/s × 2,048 ports. The scale-out fabric for AI factories with millions of GPUs.

These switches will be the backbone of the Vera Rubin NVL144 and Rubin Ultra NVL576 systems.
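The aggregate figures follow directly from the port counts and per-port rates quoted above. A minimal sketch of that multiplication, using only numbers from this post (the function name is just an illustration):

```python
def aggregate_tbps(ports: int, gbps_per_port: float) -> float:
    """Aggregate switch bandwidth in Tb/s (1 Tb/s = 1,000 Gb/s)."""
    return ports * gbps_per_port / 1_000


# Port counts and per-port rates as quoted in this post.
quantum_x = aggregate_tbps(ports=144, gbps_per_port=800)
spectrum_x = aggregate_tbps(ports=2_048, gbps_per_port=200)

print(f"Quantum-X Photonics:  {quantum_x:.1f} Tb/s")
print(f"Spectrum-X Photonics: {spectrum_x:.1f} Tb/s")
```

The Spectrum-X product works out to 409.6 Tb/s, which matches the rounded "up to 400 Tb/s" figure; the Quantum-X configuration works out to 115.2 Tb/s.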

Technical deep-dive: Micro-Ring Modulators (MRMs)

Official fact: MRMs are tiny resonant structures that modulate light extremely efficiently. NVIDIA's MRMs are engineered for direct 200 Gbps PAM4 modulation per wavelength with an ultra-low footprint — the very density that enables the massive port counts of the new switches.

Interpretation: The catch is that MRMs are notoriously sensitive to temperature. As the ring heats or cools, its resonance drifts and modulation performance degrades. Historically this was the biggest barrier to commercialization, which is precisely where TSMC comes in.
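To get a feel for how tight the thermal budget is, here is a back-of-the-envelope sketch using the standard first-order relation for a ring resonator, Δλ ≈ (λ / n_g) · (dn_eff/dT) · ΔT. Every constant below is an illustrative, textbook-style assumption (wavelength, group index, thermo-optic coefficient, quality factor); none of them is a published parameter of NVIDIA's or TSMC's devices.

```python
# All constants are illustrative assumptions, not measured values
# for any NVIDIA or TSMC device.
WAVELENGTH_NM = 1310.0    # assumed operating wavelength
GROUP_INDEX = 4.0         # assumed group index of the silicon waveguide
DNEFF_DT_PER_K = 1.8e-4   # assumed thermo-optic coefficient of the mode, 1/K
Q_FACTOR = 5_000          # hypothetical loaded quality factor of the ring


def drift_nm_per_k() -> float:
    """First-order resonance shift per kelvin: (lambda / n_g) * dn_eff/dT."""
    return WAVELENGTH_NM / GROUP_INDEX * DNEFF_DT_PER_K


def linewidth_nm() -> float:
    """Resonance linewidth (full width at half maximum): lambda / Q."""
    return WAVELENGTH_NM / Q_FACTOR


slope = drift_nm_per_k()
width = linewidth_nm()
kelvin_per_linewidth = width / slope

print(f"Resonance drift:  {slope * 1_000:.0f} pm per kelvin")
print(f"Linewidth:        {width * 1_000:.0f} pm")
print(f"Temperature change that shifts the ring by one linewidth: "
      f"{kelvin_per_linewidth:.1f} K")
```

Under these assumptions a temperature change of only a few kelvin moves the resonance by a full linewidth, which is why the ring needs active thermal control and tight process control to stay locked to its laser wavelength.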

The TSMC partnership: from lab to mass production

The deep NVIDIA × TSMC collaboration is the key to solving MRM's historical pain points at volume. TSMC contributes advanced process engineering, precise manufacturing control, and 3D chip stacking via TSMC-SoIC to mitigate thermal sensitivity and guarantee reliable, repeatable performance.

NVIDIA's choice of MRM is a strategic bet on manufacturability and ecosystem control over raw performance. TFLN (thin-film lithium niobate) and EOP (electro-optic polymer) may beat MRM on individual metrics, but they rely on non-standard materials and processes with a fragmented supply chain. By choosing silicon-based MRMs, NVIDIA leverages the dominant CMOS ecosystem controlled by TSMC. It trades a sliver of theoretical peak performance for supply chain stability, mass producibility, a predictable cost roadmap, and deep co-optimization with the world's best foundry.


3. Building the moat: strategic investments in the photonics ecosystem

The CPO paradox: you still need an external light source

Silicon's indirect bandgap means it cannot efficiently generate light. So even with sophisticated on-chip MRMs, a CPO engine still needs a separate, high-power, high-efficiency laser source. NVIDIA's CPO design explicitly calls for high-power ELS modules — and this is where the strategic investments come in.

Case 1 — Scintil Photonics: locking in the near-term supply chain

Core technology

Proprietary SHIP™ (SCINTIL Heterogeneous Integrated Photonics) platform, integrating III-V materials (such as InP) onto silicon photonics chips.

Flagship product

LEAF Light™ — a single-chip DWDM optical engine. A perfect ELS for CPO: many precisely-spaced wavelengths from one compact source.

NVIDIA's investment

Participated in the $58M Series B.

Interpretation: Scintil offers a direct answer to the most pressing question facing NVIDIA's CPO: where do we source a scalable, high-performance multi-wavelength light source? This is not a speculative bet — it is a direct supply-chain de-risking move for the Rubin launch.

Case 2 — Xscape Photonics: a bet on next-gen bandwidth

Core technology

Light source built on optical frequency combs. The ChromX platform aims to deliver hundreds of wavelengths from a single source — a massive leap over the tens of wavelengths in today's DWDM systems.

NVIDIA's investment

Participated in the $44M Series A.

Strategic meaning

If Scintil is for the Rubin era, Xscape represents the post-Rubin (e.g. Feynman) future. Comb sources could unlock petabit-per-second interconnects.
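The scaling argument can be made concrete with a small sketch. It assumes each comb line is modulated at the 200 Gb/s per-wavelength rate quoted earlier for NVIDIA's MRMs; the wavelength counts are hypothetical stand-ins for "tens" and "hundreds", not ChromX specifications.

```python
import math

# Per-wavelength rate quoted in this post for NVIDIA's MRMs.
GBPS_PER_WAVELENGTH = 200.0


def fiber_tbps(wavelengths: int,
               gbps_per_wavelength: float = GBPS_PER_WAVELENGTH) -> float:
    """Bandwidth carried by one fiber, in Tb/s."""
    return wavelengths * gbps_per_wavelength / 1_000


def wavelengths_needed(target_tbps: float,
                       gbps_per_wavelength: float = GBPS_PER_WAVELENGTH) -> int:
    """Number of wavelength channels needed to reach a target bandwidth."""
    return math.ceil(target_tbps * 1_000 / gbps_per_wavelength)


# Hypothetical wavelength counts, for illustration only.
for count in (16, 128):
    print(f"{count:>4} wavelengths -> {fiber_tbps(count):.1f} Tb/s per fiber")

# 1 Pb/s = 1,000 Tb/s.
channels = wavelengths_needed(1_000)
print(f"Channels needed for 1 Pb/s at 200 Gb/s each: {channels}")
```

At a fixed per-wavelength rate, a petabit per second takes thousands of channels in total, so packing hundreds of wavelengths into each fiber is what keeps the fiber count manageable.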

NVIDIA's corporate venture capital arm functions less as a passive investor and more as a strategic R&D and supply-chain tool. Equity stakes — rather than mere supply contracts — secure preferential supply, deep visibility into startup roadmaps, and influence over development priorities. They also create a kingmaker effect, signaling which technologies are likely to become industry standard.

| Company | Core tech | Recent funding | NVIDIA's role | Strategic value |
| --- | --- | --- | --- | --- |
| Scintil Photonics | III-V laser heterogeneous integration (SHIP™) → DWDM optical engine (LEAF Light™) | $58M Series B | Participant | Near-term (Rubin): secure core ELS supply chain for in-house CPO, de-risk product roadmap |
| Xscape Photonics | Frequency-comb-based multi-wavelength source (ChromX) | $44M Series A | Participant | Long-term (post-Rubin): access to next-gen bandwidth scaling, strategic bet on the future interconnect |

4. Comparative technology assessment: NVIDIA's strategic position

Optical modulator platform comparison

The table below summarizes the trade-offs behind NVIDIA's choice. While TFLN and EOP may win on individual metrics, the MRM, combined with TSMC's manufacturing capabilities, offers the best balance of performance, density, manufacturing readiness, and ecosystem integration.

| Property | Standard Si (MZM) | NVIDIA (MRM) | Thin-film LiNbO₃ (TFLN) | Electro-optic polymer (EOP) |
| --- | --- | --- | --- | --- |
| Physical effect | Plasma dispersion | Resonant plasma dispersion | Pockels effect | Pockels effect |
| Bandwidth (GHz) | ~50–70 | >100 (200 Gbps PAM4) | >110 (220 demonstrated) | >100 (800 projected) |
| Footprint | Large (mm-scale) | Very small (μm-scale) | Small | Small |
| Power efficiency | Low | High | Very high | Potentially highest |
| Thermal stability | Medium | Low (active control needed) | Very high | Medium (improving) |
| CMOS integration | Monolithic (FEOL) | Monolithic (FEOL) | Heterogeneous (bonding) | Heterogeneous (BEOL) |
| Manufacturing readiness level (MRL, estimated) | 9–10 (mass production) | 7–8 (pilot / early production) | 7–8 | 5–6 (prototype) |
| Key suppliers | Multiple foundries | NVIDIA / TSMC | HyperLight, Liobate | Lightwave Logic |

Light source integration: a spectrum of approaches

Heterogeneous integration (Scintil's path)

Bonding III-V dies onto silicon. Mature and reliable, but assembly is complex.

Monolithic growth (the 'holy grail')

Growing III-V materials directly on silicon. Ideal for cost and scale, but faces lattice-mismatch and defect challenges. Quantum-dot lasers are a promising solution.

Advanced external sources (Xscape's path)

Frequency combs that create a paradigm shift in wavelength density.

Interpretation: NVIDIA's investment strategy reads as a pragmatic portfolio approach across all of these. Back the mature heterogeneous-integration play (Scintil) for the near term, while exploring potentially paradigm-shifting advanced sources (Xscape) for the long term. Risk is spread, optionality is preserved.


5. Synthesis: the future of interconnect in the Rubin era and beyond

Predicted Rubin platform interconnect architecture

Vera Rubin NVL144 / Rubin Ultra NVL576: a fully integrated rack-scale AI factory

GPU layer: Rubin GPU + HBM, next-gen Tensor Cores
Switch layer: NVLink 6 + Spectrum-X Photonics (CPO-embedded)
Optical engine: NVIDIA × TSMC MRM-based CPO (200 Gbps/λ)
Light source (ELS): Likely supplied by Scintil LEAF Light™ DWDM

3.5× power efficiency, 10× resiliency: the infrastructure substrate for the Agentic AI era.

The "make & buy" doctrine: ecosystem control as the ultimate moat

NVIDIA's photonics approach is not just about technology — it's about business strategy. By co-manufacturing the core CPO engine with TSMC, NVIDIA controls the central platform and its integration with the GPU. By investing in the ELS ecosystem, NVIDIA secures the supply chain, mitigates risk, and steers the industry toward solutions optimized for its platform. This is a direct copy of the CUDA playbook — provide a core platform, build a dependent ecosystem around it. The interconnect is becoming the new CUDA: a proprietary, performance-critical layer that locks customers into NVIDIA's stack.

Implications for competitors and investors

Official fact: Competing with NVIDIA now requires more than a better chip — it demands a comprehensive interconnect strategy that can match the performance and integration of NVIDIA's CPO platform.

Interpretation: Indicators investors should track: (1) the ramp of NVIDIA CPO switch manufacturing at TSMC, (2) the commercial success of Scintil's ELS modules in volume deployments, and (3) technical milestones from Xscape proving frequency combs are viable in the data center. The success of the Rubin platform will be the ultimate validation of this entire photonics strategy.

Long term, the industry is moving from a single-material (silicon) paradigm to a future of multi-material, heterogeneous integration. Whoever masters the difficult art of integrating silicon with III-V compounds, lithium niobate, and other advanced materials will define computing for the next decade. With its dual strategy, NVIDIA is positioning itself to be the master of that new era.