The Future of Chiplet Reliability: Interconnect Failure Prediction With 100% Lane Coverage

Chipmakers are increasingly turning to advanced packaging to overcome the reticle size limit of silicon manufacturing without increasing transistor density. This approach also allows heterogeneous devices that combine dies from different process nodes, and it improves yield, which falls roughly exponentially as die area grows.
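To make the yield argument concrete, here is a minimal sketch using the Poisson yield model, a common simplification in which yield is exp(-area × defect density). The die areas and the defect density are illustrative assumptions, not figures from any real process.

```python
import math

def poisson_yield(area_cm2: float, defect_density_per_cm2: float) -> float:
    """Fraction of good dies under the Poisson yield model: Y = exp(-A * D0)."""
    return math.exp(-area_cm2 * defect_density_per_cm2)

D0 = 0.1  # defects per cm^2 (illustrative assumption)

monolithic = poisson_yield(8.0, D0)  # one hypothetical 800 mm^2 die
chiplet = poisson_yield(2.0, D0)     # one hypothetical 200 mm^2 die

print(f"monolithic die yield: {monolithic:.1%}")
print(f"single chiplet yield: {chiplet:.1%}")
```

Under these assumptions the large die yields roughly 45% while each small die yields roughly 82%, which is why splitting a design into smaller dies is attractive.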

However, 2.5D/3D designs introduce a fair share of new challenges, one of the most significant being poor visibility into the interconnect.

The toll of poor visibility into die-to-die (D2D) interconnects

Engineers spend months designing a chiplet, only to discover that almost no internal die pins are accessible to the test program – it's a quality nightmare. Even if the individual dies undergo thorough testing, the numerous lanes that connect them in the advanced package are often left in the dark.

Traditional test methods based on design-for-test (DFT) built-in self-test (BIST) offer only limited relief. They run only in test mode, leaving a big question mark over what might happen in real-life operation. They also cover only a sample of the lanes, so critical malfunctions can be overlooked.

Therefore, when assembling the dies in the SiP (system in package), a variety of D2D interconnect defects can go undetected:

Solder microbumps: voids, cracks or missing balls
TSV problems: cracks or partial fill of the drilled holes
Lane trace issues: bridge shorts due to residual material

What if engineers had 100% lane coverage in mission mode?

As mentioned earlier, common practices provide sample lane coverage in test mode only.

On the other hand, with 100% lane coverage, quality risks can be averted. And if tests can run in mission-mode, that’s the holy grail of reliability. Engineers can now detect defects under real-life conditions in every lane, even if there are thousands of them.

This level of coverage provides complete confidence that the interconnect will perform well in the field.
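A quick calculation shows why sample coverage is risky. The sketch below assumes lanes are sampled uniformly at random, which is a simplification, and uses made-up lane counts. It computes the hypergeometric probability that none of the defective lanes lands in the tested sample.

```python
from math import comb

def miss_probability(total_lanes: int, defective_lanes: int, sampled_lanes: int) -> float:
    """Chance that a uniform random sample of lanes contains no defective lane."""
    if not 0 <= defective_lanes <= total_lanes:
        raise ValueError("defective_lanes must be between 0 and total_lanes")
    if not 0 <= sampled_lanes <= total_lanes:
        raise ValueError("sampled_lanes must be between 0 and total_lanes")
    # Hypergeometric probability of drawing zero defective lanes.
    clean_samples = comb(total_lanes - defective_lanes, sampled_lanes)
    return clean_samples / comb(total_lanes, sampled_lanes)

# Illustrative numbers only: 1,000 lanes, one defective lane.
print(miss_probability(1000, 1, 100))   # 10% sample coverage
print(miss_probability(1000, 1, 1000))  # 100% lane coverage
```

With a single defective lane among 1,000 and a 10% sample, the defect escapes about 90% of the time. When every lane is tested, the escape probability drops to zero.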

Interconnect failure prediction using parametric lane grading

Another downside of typical testing methods is their black-or-white output. They indicate only whether a lane passes or fails, with no further insight into marginal performance. This becomes a problem when underperforming lanes, which tend to degrade faster, make their way into the final product and lead to premature failures.

In contrast, parametric lane grading goes beyond just pass/fail testing. It provides a grade of marginality per lane. This granularity allows setting performance thresholds for swapping lanes with spare ones even if their tests currently pass. It also enables yield improvement in the manufacturing line, followed by quality and safety enhancements as well as lifetime extension in the field.
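As a rough illustration of how a per-lane grade can drive spare-lane swaps, consider the sketch below. The grade names, the single `margin` number per lane, the two thresholds, and the lane names are all hypothetical. This is not the proteanTecs grading algorithm, only a minimal example of grading beyond pass/fail.

```python
from enum import Enum

class LaneGrade(Enum):
    HEALTHY = "healthy"
    MARGINAL = "marginal"  # passes today, but is a candidate for a spare-lane swap
    FAIL = "fail"

def grade_lane(margin: float, fail_limit: float, marginal_limit: float) -> LaneGrade:
    """Grade one lane from its measured margin (higher is better)."""
    if fail_limit > marginal_limit:
        raise ValueError("fail_limit must not exceed marginal_limit")
    if margin < fail_limit:
        return LaneGrade.FAIL
    if margin < marginal_limit:
        return LaneGrade.MARGINAL
    return LaneGrade.HEALTHY

def lanes_to_swap(margins: dict[str, float], fail_limit: float,
                  marginal_limit: float, spares: int) -> list[str]:
    """Pick the worst non-healthy lanes, limited by the number of spare lanes."""
    candidates = [
        (margin, lane)
        for lane, margin in margins.items()
        if grade_lane(margin, fail_limit, marginal_limit) is not LaneGrade.HEALTHY
    ]
    candidates.sort()  # lowest margin first
    return [lane for _, lane in candidates[:spares]]

# Hypothetical margins in arbitrary units.
margins = {"lane0": 0.42, "lane1": 0.18, "lane2": 0.05, "lane3": 0.40}
print(lanes_to_swap(margins, fail_limit=0.10, marginal_limit=0.25, spares=2))
# prints ['lane2', 'lane1']
```

Note that `lane1` would pass a plain pass/fail test, yet it is still flagged for replacement because its margin sits below the marginal threshold.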

Figure: Best known method vs. parametric lane grading

proteanTecs D2D interconnect monitoring solution

At proteanTecs, we combined 100% lane coverage in mission-mode with parametric lane grading, providing the best of both worlds:

See exactly what's happening with each of the numerous lanes
Run either in test mode or in mission-mode under real-life conditions
Predict when a chip may fail and replace degraded lanes before it happens

Dedicated proteanTecs Agent IP is integrated into the design to provide complete lane coverage without an area penalty, since the agents use otherwise empty silicon in the bump array. The agents' readings are sent to a data analytics platform that grades each lane based on several parameters, enabling early failure prediction:

Maximum eye width measurement (green arrows below)
Minimum eye width measurement (red arrows below)
Eye width crossing jitter, max-min (orange arrows below)
Eye right-hand side (data to clock) and left-hand side (clock to data) are measured separately for symmetry analysis
Measurements are done for each of the clock phases (typically two for DDR or four for QDR) separately
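The parameters listed above can be sketched in a few lines. The example assumes each reading reports a left-hand and a right-hand margin in picoseconds, that the eye width is their sum, and that asymmetry is the difference between the average right and left margins. These definitions, and the names `EyeSample` and `summarize_phase`, are illustrative assumptions rather than the actual agent output format.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class EyeSample:
    left: float   # clock-to-data margin, in ps (assumed unit)
    right: float  # data-to-clock margin, in ps (assumed unit)

    @property
    def width(self) -> float:
        # Assumption: total eye width is the sum of both sides.
        return self.left + self.right

def summarize_phase(samples: list[EyeSample]) -> dict[str, float]:
    """Summarize repeated eye readings taken on one lane for one clock phase."""
    if not samples:
        raise ValueError("at least one sample is required")
    widths = [s.width for s in samples]
    return {
        "max_width": max(widths),
        "min_width": min(widths),
        "jitter": max(widths) - min(widths),  # max-min, as in the list above
        "asymmetry": mean(s.right for s in samples) - mean(s.left for s in samples),
    }

# Hypothetical readings for the two clock phases of a DDR lane.
readings_by_phase = {
    0: [EyeSample(20, 22), EyeSample(18, 21), EyeSample(21, 23)],
    1: [EyeSample(19, 19), EyeSample(20, 21)],
}
for phase, samples in readings_by_phase.items():
    print(phase, summarize_phase(samples))
```

Keeping the phases separate, as the loop does, avoids averaging away a problem that affects only one clock phase.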

The accompanying eye diagram shows the measurements taken for one of the clock phases. These measurements are useful during New Product Introduction (NPI), when the first chips arrive, to characterize lane performance across a range of process, voltage, and temperature (PVT) conditions.

In mass production, outlier detection on each assembled unit improves yield: defective or weak lanes are replaced with spare ones instead of the entire chip being discarded.
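One simple way to flag weak lanes within a unit is a robust outlier test. The sketch below uses the modified z-score based on the median absolute deviation (MAD). This is a generic statistical technique chosen for illustration, not necessarily what the analytics platform uses, and the margins and the 3.5 cutoff are assumptions.

```python
from statistics import median

def weak_lane_outliers(margins: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return lanes whose margin is an unusually low outlier (modified z-score)."""
    values = list(margins.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread: this sketch gives up rather than divide by zero.
        return []
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [
        lane for lane, value in margins.items()
        if 0.6745 * (value - med) / mad < -threshold
    ]

# Hypothetical eye-width margins (ps) for the lanes of one assembled unit.
unit = {"lane0": 40, "lane1": 41, "lane2": 39, "lane3": 42,
        "lane4": 40, "lane5": 41, "lane6": 25, "lane7": 40}
print(weak_lane_outliers(unit))  # prints ['lane6']
```

Only lanes that fall well below the rest of the population are flagged, so a uniformly slow but consistent unit is not penalized by this check alone.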

Finally, in the field, detecting lane degradation enables predictive maintenance by proactively swapping to spare lanes before failures occur, or replacing modules when no spare lanes are left.
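A minimal sketch of the predictive step: fit a straight line to a lane's margin readings over time and extrapolate to the point where it would cross a limit. Linear degradation is an assumption made for simplicity, since real wear-out is often nonlinear, and the readings and limit are made up.

```python
from typing import Optional, Sequence

def time_to_limit(times: Sequence[float], margins: Sequence[float],
                  limit: float) -> Optional[float]:
    """Extrapolate a least-squares line through the readings.

    Returns the time at which the fitted margin reaches `limit`,
    or None if the margin is not shrinking.
    """
    n = len(times)
    if n < 2 or n != len(margins):
        raise ValueError("need two or more readings, one margin per timestamp")
    t_mean = sum(times) / n
    m_mean = sum(margins) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct timestamps")
    slope = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, margins)) / sxx
    if slope >= 0:
        return None  # stable or improving: no predicted crossing
    intercept = m_mean - slope * t_mean
    return (limit - intercept) / slope

# Hypothetical field readings: operating hours and eye-width margin in ps.
hours = [0, 100, 200, 300]
margin_ps = [40, 38, 36, 34]
print(time_to_limit(hours, margin_ps, limit=30))
```

For these readings the margin shrinks by 2 ps every 100 hours, so the fitted line reaches the 30 ps limit at about 500 hours, leaving time to swap in a spare lane beforehand.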

In conclusion, proteanTecs' combination of 100% lane coverage and parametric lane grading in mission mode addresses one of the most significant challenges in heterogeneous integration testing: poor interconnect visibility. It enhances chiplet quality, reliability, and safety and, when combined with spare lanes and lane repair, improves yield.

proteanTecs’ D2D solutions support a wide range of interfaces and substrates, including HBM3 and UCIe.

Want to learn how you can benefit from the future of chiplet reliability? Download this whitepaper or contact us here.
