Intel After Gelsinger: Packaging, PowerVia, and the Real Work Lip-Bu Tan Must Finish

Pat Gelsinger gave Intel what it needed. A modern blueprint that puts chiplets, packaging, and device physics at the center of performance. He also tried to rebuild too much at once. The ideas are right. The timing was not. Lip-Bu Tan now inherits the correct plan and the hard part. Make the technology deliver on time and at scale, quarter after quarter.

Pat Gelsinger, fairly

Pat Gelsinger is the engineer’s engineer. He can talk transistors, power grids, cache hierarchies, and product cadence without losing the room. His fingerprints are on Core, on Xeon, and on Intel’s pivot to chiplets and advanced packaging. He saw four simple truths before most people did. Chiplets beat monoliths once reticle limits and thermals cap die size. Packaging is a performance feature, not an afterthought. Foundry sourcing has to be flexible. Power delivery must change if you want stable clocks in dense logic.

As a technologist he was right. As a CEO he tried to fix process, foundry, GPUs, AI hardware, and culture at the same time. That is five turnarounds. You can pick two and still need luck. He picked all of them. The result is a direction that is correct and a delivery engine that did not hit calendars often enough to rebuild trust quickly.

Pat’s next chapter

After stepping down, Gelsinger moved into board and advisory roles across semiconductors and infrastructure. That fits him. He is a technologist first. He will influence the next wave of silicon and manufacturing from the boardroom and the lab bench, not from a CEO office. That is a good outcome for the industry.

Where Intel is strong today

Packaging that matters. EMIB gives short lateral hops at high bandwidth without paying for a giant interposer. Foveros stacks logic on logic. Foveros Direct moves to hybrid bonding that cuts parasitics and lowers energy per bit. This is real performance per watt and real latency reduction. It is not a slide.

Device physics that buy headroom. RibbonFET restores gate control as geometries shrink. PowerVia moves power to the backside so top metal can carry signals. That reduces IR drop and stabilises rails under heavy switching. Fewer micro dips. More predictable clocks. Better link margins across die to die paths.

Culture that is more honest. Engineers have a voice again. Managers talk in limits and test coverage more than adjectives. Intel still overshares roadmaps, but teams sound realistic about risk. That is how trust starts to return.

Where it broke

Too many revolutions at once. Running five transformations in parallel split focus and created integration debt. Hero fixes took the place of predictable builds. Calendars drifted. Customers padded Intel’s dates. Once that happens you are no longer on the critical path for your buyers.

Foundry without repeat orders. A foundry is judged by second orders. The first tape out is easy. The second order proves you can deliver without a war room. Intel did not move fast enough from pilots to quiet, repeatable shipments.

Platform churn. Client sockets moved. Server SKUs multiplied. Firmware stacks shifted too often. OEMs and hyperscalers pay for that churn in validation cycles. Predictability is a feature. Intel forgot that for a while.

Deep technical analysis: what Intel’s stack really buys you

Foveros and Foveros Direct

Foveros puts an active die on top of another active die. The early versions used micro bumps that limit pitch and add resistance. Foveros Direct uses hybrid bonding. Copper meets copper. Pitch shrinks. Resistance falls. The vertical path shortens. You lose fewer picoseconds and joules in the stack. In practice, this helps shared last level cache structures and fabric agents that chat constantly. Fewer repeaters. Fewer serializers. Lower energy per bit. Lower latency at steady state, not just at boot.

Failure modes. Hybrid bonds need clean surfaces and tight planarity. Small defects kill channels. If bond yield varies, package yield drops even when wafers look fine. Rework windows are narrow. That forces early binning and honest yield bands by pitch class, or you spill good die on dead stacks.

EMIB for short lateral links

EMIB is a silicon bridge embedded in the organic substrate. It links adjacent tiles over short distances at high bandwidth. You get multi terabit class aggregate links without paying the full interposer tax. That is enough to keep fabric latency low across compute and cache tiles and still leave headroom for HBM channels.

Failure modes. Bridge placement tolerance, coplanarity, and underfill control drive yield. If attach processes wander, link margins shift. Bring up stretches. Errata accumulate. Customers feel that as longer validation and later shipments.

UCIe and custom fabrics

UCIe standardises die to die links so vendors can mix and match IP. Intel also uses custom fabrics where it wants lower energy per bit or a fabric tuned to a specific cache hierarchy. The practical question is the IP catalog. Do customers get golden PHYs, stable controllers, and training algorithms that do not change with every PDK drop? If yes, adoption rises. If not, customers waste quarters rolling their own glue logic.

PowerVia and backside power

Moving power rails to the backside frees top metal for signals. That reduces via stacks and wire length on critical nets. It also stabilises rails under bursty traffic. In packages with close vertical and lateral paths, this improves link jitter and reduces retries. The wins are boring and real. Fewer frequency dips. Better sustained throughput. Less lot to lot variance.

RibbonFET versus classic FinFET

Gate all around devices improve electrostatic control. Leakage falls. Subthreshold slope improves. You can hit the same work at lower voltage, or more work at the same voltage if thermals permit. But library timing changes and PDK churn can blow up schedules if the ladder moves too often. Intel needs fewer PDK drops with wider corners so design teams can stabilise flows.

Yield math and the cost of known good packages

Buyers do not purchase wafers. They purchase known good packages. The only number that matters is cost per known good package.

C_pkg = (C_wafers + C_assembly + C_test + C_rework) / Y_pkg

Now expand Y_pkg into tile yields, EMIB yield, hybrid bond yield, substrate yield, and test escapes. Add one more 95 percent tile and your package yield drops by 5 percent before bonding. Stack three planes at 98 percent bond yield and that term alone is about 94 percent. You multiply these, and headroom vanishes fast. The lesson is simple: early binning, clear rework steps, and honest yield bands by pitch class. Otherwise cost curves go the wrong way and ship dates slide.
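
To make that multiplication concrete, here is a minimal sketch of the arithmetic in Python. Every yield and cost figure in it is invented for illustration; the point is how quickly the product of per tile, bond, bridge, and substrate yields eats headroom, and how that flows straight into cost per known good package.

```python
# Illustrative only: all yields and costs below are invented, not Intel data.

def package_yield(tile_yields, bond_yields, emib_yield, substrate_yield, test_escape=0.0):
    """Known good package yield is the product of every step that can kill the package."""
    y = 1.0
    for step in list(tile_yields) + list(bond_yields) + [emib_yield, substrate_yield]:
        y *= step
    return y * (1.0 - test_escape)

def cost_per_known_good_package(c_wafers, c_assembly, c_test, c_rework, y_pkg):
    """C_pkg = (C_wafers + C_assembly + C_test + C_rework) / Y_pkg."""
    return (c_wafers + c_assembly + c_test + c_rework) / y_pkg

# Four tiles at 95 percent each, three bonded planes at 98 percent each.
y_pkg = package_yield(
    tile_yields=[0.95] * 4,   # 0.95^4 is already about 0.81
    bond_yields=[0.98] * 3,   # 0.98^3 is about 0.94
    emib_yield=0.99,
    substrate_yield=0.99,
)
print(f"package yield: {y_pkg:.3f}")  # roughly 0.75
print(f"cost per known good package: {cost_per_known_good_package(900, 150, 60, 40, y_pkg):,.0f}")
```

Each extra tile or bonded plane multiplies in another factor below one, which is why the denominator, not the wafer bill, usually decides whether the cost curve bends the right way.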

Packaging telemetry as part of the product

  • PDN droop versus clock stability across 30, 60, and 90 minute soaks. Steady frequency under soak matters more than short boosts; one way to score it is sketched after this list.
  • Vertical link error rates across temperature cycles. If error correction carries the link, performance falls and power rises.
  • Thermal maps across logic tiles and HBM at fixed inlet temperatures. Hot edges throttle quietly a month later.
  • Derate curves versus altitude and inlet temperature. Data centers do not live at room temperature with perfect air.
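
As one example of what the first item could look like in practice, here is a small scoring sketch. The sample format, the window choice, and the idea of reporting sustained over burst are assumptions made for illustration, not a defined Intel deliverable.

```python
# Illustrative soak scoring; the sample format and steady-state window are invented.
from statistics import mean

def soak_stability(freq_mhz, rail_mv, steady_fraction=0.5):
    """Compare the steady-state tail of a soak run against the early burst.

    freq_mhz, rail_mv: evenly spaced samples across a 30, 60, or 90 minute soak.
    steady_fraction: portion of the run treated as steady state (the tail).
    """
    split = int(len(freq_mhz) * (1.0 - steady_fraction))
    burst = max(freq_mhz[: split or 1])          # best short boost early in the run
    sustained = mean(freq_mhz[split:])           # what the buyer actually gets
    worst_droop = mean(rail_mv) - min(rail_mv)   # deepest rail dip below the average, in mV
    return {
        "burst_mhz": burst,
        "sustained_mhz": sustained,
        "sustained_over_burst": sustained / burst,
        "worst_droop_mv": worst_droop,
    }
```

A package that holds, say, 97 percent of its burst clock after an hour is a more useful number for a platform builder than one that boosts a little higher and fades.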

Thermal and mechanical stack design

Multi die packages bow and warp. Contact pressure must be even so edge HBM stacks do not starve. Lids and interface materials need to survive many cycles without pump out. Publish bow and warpage statistics. Publish long soak stability plots. Do not bury them. Platform builders will trust the package that looks boring after an hour.

Client and server platform discipline

Client buyers want sockets and power envelopes that stay put for two cycles. Firmware should evolve without breaking validation each quarter. Server buyers want a clean split between efficiency dense and performance core lines. They want memory channels, CXL lanes, and cache hierarchy that stay predictable for two generations. They buy total cost of ownership at the one hour mark, not peak numbers at ten seconds.

Accelerators with a reason to exist

Chasing a moving general GPU target with parity features is a losing game. Intel’s accelerator edge must come from package topology and memory access paths. If shorter vertical and lateral links produce more tokens per joule under steady load, ship frameworks that show it without hero tuning. Publish end to end pipelines that strangers can reproduce. When that works, customers reorder.
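
That claim is measurable, and the measurement is simple enough to hand to a stranger. A sketch of the arithmetic, with invented numbers rather than benchmark results:

```python
# Illustrative tokens per joule arithmetic; every number here is invented.
def tokens_per_joule(tokens_generated, avg_package_watts, wall_clock_s):
    """Efficiency under sustained load, not a ten second burst."""
    return tokens_generated / (avg_package_watts * wall_clock_s)

# One hour of steady inference at an average 620 W of package power.
print(f"{tokens_per_joule(9_000_000, 620, 3_600):.2f} tokens per joule")
```

If the topology advantage is real, that number rises when the hot path stays on closely coupled tiles, and it should rise in a script anyone can rerun.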

Software that proves the hardware

  • Reference stacks for AI inference, analytics, and media that run outside Intel labs. Real scripts. Reproducible results.
  • Schedulers that place work with locality in mind. Keep hot paths on tiles that sit close vertically and laterally; a toy placement sketch follows this list.
  • Telemetry that feeds job managers so operators can trade energy for deadlines with signals that come from the package, not a guess.
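
For the scheduler bullet, here is a toy sketch of locality aware placement. The tile names, hop classes, and energy costs are all invented, and a production scheduler would use real package topology and a heuristic rather than brute force.

```python
# Toy locality-aware placement; the tile graph and hop costs are invented for illustration.
from itertools import permutations

HOP_COST = {"stacked": 1.0, "emib": 2.5, "substrate": 6.0}  # relative energy per bit

TILE_LINKS = {  # which hop class connects each tile pair in this imaginary package
    ("cpu0", "cpu1"): "stacked",
    ("cpu0", "cpu2"): "emib",
    ("cpu1", "cpu2"): "substrate",
}

def hop_cost(a, b):
    kind = TILE_LINKS.get((a, b)) or TILE_LINKS.get((b, a)) or "substrate"
    return HOP_COST[kind]

def place(tasks, tiles, traffic):
    """Pick the task-to-tile assignment that minimises traffic-weighted hop cost."""
    best, best_cost = None, float("inf")
    for perm in permutations(tiles, len(tasks)):
        assign = dict(zip(tasks, perm))
        cost = sum(bytes_moved * hop_cost(assign[a], assign[b])
                   for (a, b), bytes_moved in traffic.items())
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost

placement, cost = place(
    tasks=["producer", "consumer"],
    tiles=["cpu0", "cpu1", "cpu2"],
    traffic={("producer", "consumer"): 10.0},  # relative bytes exchanged
)
print(placement, cost)  # the chatty pair lands on the stacked tiles
```

The telemetry bullet closes the loop: if droop, error rate, and thermal data feed the same cost table, the scheduler trades energy for deadlines on package signals rather than guesses.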

Supply chain alignment

Substrates, bonders, underfill, lids, and HBM allocations decide success. Intel should give customers a capacity map under NDA. If HBM slips, offer a variant with fewer stacks and publish the throughput and power impact. Buyers can plan around bad news. They cannot plan around silence.

Comparative analysis: how AMD, TSMC, and Apple approach the same problems

AMD: chiplet pragmatism and 3D V-Cache

AMD’s approach is simple and effective. Small CCDs on a mature node for yield and cost. An IOD on a cheaper node for I/O, memory controllers, and PHYs. Glue with Infinity Fabric that is good enough at realistic distances. Then add 3D V-Cache where latency matters. You get modularity, yield resilience, and a clean cost structure. In servers this means wide core counts without a giant monolith. In client it means flexible SKUs and margins that survive price cuts.

Where AMD trades off. Infinity Fabric energy per bit across package is higher than a tight hybrid bonded vertical stack. Lateral links are longer than logic on logic. AMD works around this with cache hierarchy and smart firmware, but physics still shows up under certain loads. AMD also depends on TSMC for both logic and packaging capacity, which means lead time and allocation risk during AI booms.

What Intel can learn. Bin hard and often. Keep tiles small. Put expensive process only where it pays back. Use packaging to add value without forcing hero yields on every tile. AMD’s playbook is boring by design. That is why it works.

TSMC: CoWoS and SoIC as the industry’s backbone

TSMC’s CoWoS dominates the lateral high bandwidth world. Large interposers and HBM farms are how AI training systems get fed today. SoIC brings hybrid bonding for logic on logic and memory on logic. TSMC sells capacity and predictability. Packaging slots are the new currency. If you want a giant AI accelerator, CoWoS plus HBM allocation decides whether you ship this year or next.

Where TSMC trades off. CoWoS is expensive and interposer size is a constraint. Thermal density rises fast. Yield math pushes customers toward partitioning that fits the packaging line rather than the ideal microarchitecture. TSMC still wins because the line is mature and customers trust the calendar.

What Intel can learn. Treat packaging lead time as a product. Publish slot allocation rules under NDA. Put boring first. If your EMIB plus Foveros flow ships on the day you promised, customers will design for it even if a slide claims a better pitch elsewhere.

Apple: vertical integration with tight thermal control

Apple optimises for power efficiency and sustained performance inside thin thermals. Its packaging is conservative on paper but ruthless in integration quality. Apple avoids exotic stacks unless it moves user experience. That keeps yields high and schedules intact. Where Apple does push is in unified memory, interconnect locality, and power management. The result is predictable performance for hours in fan limited designs.

Where Apple trades off. Less headline grabbing packaging density. Fewer dramatic leaps. Slower adoption of risky process features. But Apple ships. On time. In volume. With few surprises. That is the lesson.

What Intel can learn. Pick your fights. If a packaging trick does not change real workloads, do not ship it. Spend risk budget where it buys sustained performance or battery life. Customers will not thank you for exotic stacks that create RMA headaches.

What Intel must do next

Focus and follow through. Choose fewer bets. Finish them. Measure success in yields, not slides. Make packaging telemetry a standard deliverable. Publish tested good die rates per tile, bond yield bands by pitch class, and rework windows. Tell the truth early. Customers can plan around honesty.

Pick a foundry persona. Intel does not need to be TSMC. It needs to be indispensable to designs that value vertical density and short lateral hops. Land three lighthouse customers. Assign executive sponsors who live in factories. Force a reorder inside twelve months. That is the only scoreboard that matters.

Stabilise client and server roadmaps. Lock sockets and power envelopes for two cycles. Reduce SKUs. Keep the server split between efficiency dense and performance core lines. Prove TCO at the one hour mark, not the ten second burst.

Ship software that proves the hardware. Provide reference stacks and schedulers that respect packaging locality. Deliver predictable throughput per watt at cluster scale without vendor engineers in the room. When third parties can reproduce your wins, your pipeline fills on its own.

Twelve month playbook

Quarter 1. Freeze client sockets and power envelopes for two generations. Publish the packaging telemetry template and make it mandatory. Lock a slim PDK ladder with wider corners and fewer drops.

Quarter 2. Confirm three external tape outs on 18A that use EMIB plus Foveros. Start monthly yield band reports to those customers under NDA. Publish rework windows by step for those packages.

Quarter 3. Ship one packaging heavy product with the full telemetry pack and uniform bring up scripts. No one off code. Field diagnostics that customers can run.

Quarter 4. Secure at least one quiet reorder with a shorter lead time than the first run. If something slips, publish the cause and the gate you fixed. Move on.

Twenty four month targets

  • External customers on 18A with EMIB plus Foveros who reorder inside twelve months.
  • Telemetry packs that look the same across products because the line is mature.
  • Client and server platforms with fewer SKUs, fewer socket changes, and stronger sustained performance per watt.
  • Gross margin that stabilises and rises because capex backs programs that actually ship.

My bottom line

Intel has the right ingredients. Packaging capability that rivals cannot match end to end. PowerVia and RibbonFET that buy real headroom. A culture that is moving back toward engineering reality. The gap is consistency. Lip-Bu Tan does not need a new slogan. He needs fewer slides and more wafers. When a second and third external customer reorder without drama, the reset becomes real. The plan is right. Now build it. On time. At scale.
