RMAs: Root Problem Found

Written By
Yuval Bonen
Co-Founder & VP Software Development

Deep Data needed to address the elephant in the room

For decades, costs of production and maintenance have been driven down through manufacturing, process and logistical innovation, creating more breathing room for margin to maintain viable growth. There are other costs, however, that we seemingly accept as inevitable and simply get better at factoring in as par for the course, or ‘eggs broken’ to make the omelet. The ubiquitous presence of Return Material Authorization (RMA) is a testament to this fact. It’s a status quo the value chain has learned to live with, budget for, and maintain.

For an advanced electronic system to function reliably in the field, countless contributions are made from multiple sources. All of them need to align in precise detail along the value chain for outcomes to remain robust and feasible at scale and over time. However, despite staggering progress in many corners, gaps still exist that create critical blind spots in the value chain, compromising field reliability.

How does the industry fill those gaps? With the RMA process, a somewhat cumbersome and labored set of forensic sub-processes that keep semiconductor manufacturers and OEMs up at night. If a system fails in the field, the race is on to determine why, how and, above all, who’s to blame. Engineers are on a perpetual hamster wheel of recreating problems in the lab instead of focusing on creating new innovations. The case for a seamless, at-the-root solution couldn’t be clearer.

The cost-benefit equation of RMAs

In an ideal world, the benefits and lessons to fall out of the RMA process would bring an upward trend in process and test quality. RMA would undergo continuous improvements and become a minimal, negligible consideration over time. But this has yet to happen.

From customer defect data and out-of-box audits to functional verification and in-circuit tests, the RMA process creaks and aches under the strain of its own painstakingly granular complexity, as it is asked to push the likely failure point and moment of accountability further and further back up the value chain.

As much as the insight of RMA can add value through iterative testing or post-mortem reenactments on failed hardware, it can also escalate into an overactive process of rising costs passed around the value chain, from system to board to chip manufacturers. You need only look at the many stakeholders in the process and the slow, granular nature of the RMA methodology to realize the potential for it to flare up into an active pain point in need of treatment.

No Trouble Found

We are left with a situation where many parts fail for no apparent reason, leading to frustratingly high No Trouble Found (NTF) rates, commonly above 50%. FutureDial has found that within the consumer electronics sector, 68% of returns are classified as NTF. If we look at Facebook’s intra-data-center study covering 2011 to 2018, an eye-watering 29% of incidents involving a technician resulted in an inconclusive root cause.

In an industry that holds reliability and innovation at such high standards, this should be unacceptable to everyone down the value chain. But it seems to be an unavoidable and accepted statistic.

Universal Chip Telemetry & the future of RMA

As electronics advance and complexities scale, quality-related issues peak. The cost-benefit ratio of RMA starts to skew. Before long, the opportunity cost becomes hard to ignore. The ‘elephant in the room’ nature of RMA is staring data-centric industries in the face. It’s time we started making direct eye contact.

For too long, RMA has relied heavily on outdated modes of operation that look obsolete against a backdrop of breakneck innovation in other areas. Chip telemetry is starting to fill that innovation gap, showing how electronics value chains can integrate reliability assurance through Deep Data. Rather than retrospective forensics and after-the-fact RMA inquisitions, suppliers, manufacturers and customers could all be on the same digital page.

Universal Chip Telemetry (UCT) empowers the chip with its own means of data creation and interpretation to play an active role in its own production, maintenance and function. The mystery of what went wrong can be solved at the click of a button. It doesn’t have to drain the resources of the value chain, and everyone can have clear visibility. Together.

With parametric precision, the exact source of the issue can be pinpointed, paving the way for commonality analysis, and preventing epidemics. Not only that, UCT provides early indication of problems in the field, and alerts on faults before they become failures. This protects against excruciating costs and liabilities associated with system failures and provides an “insurance policy” to alleviate unnecessary urgencies and stress.
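To make the idea of commonality analysis concrete, here is a minimal Python sketch. It is an illustration only, not a description of how UCT or any proteanTecs product works. The function name `commonality`, the record fields (`lot`, `board_rev`) and the sample data are all hypothetical. The sketch simply asks, for each attribute, which value is shared by the largest fraction of failed units.

```python
from collections import Counter

def commonality(failed_units, attributes):
    """For each attribute, return (most common value, share of failed units).

    `failed_units` is a list of dicts; the field names are hypothetical.
    """
    if not failed_units:
        return {}
    result = {}
    for attr in attributes:
        counts = Counter(unit[attr] for unit in failed_units)
        value, n = counts.most_common(1)[0]
        result[attr] = (value, n / len(failed_units))
    return result

# Hypothetical records for four units that failed in the field.
failed = [
    {"lot": "A", "board_rev": "r1"},
    {"lot": "A", "board_rev": "r2"},
    {"lot": "A", "board_rev": "r3"},
    {"lot": "B", "board_rev": "r1"},
]

summary = commonality(failed, ["lot", "board_rev"])
print(summary)
```

In this made-up data, three of the four failures share one lot, which would make that lot the first place to look. A real analysis would also compare these shares against the healthy population before drawing conclusions.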

How Does Chip Telemetry Work?

proteanTecs embeds on-chip Agents™ at the pre-silicon design stage. These non-disruptive nano-monitors enable high-coverage parametric measurements that paint a detailed picture of the chip’s vital signs during application. Algorithms that analyze changes over time, together with expert systems, are applied to the Agent readouts to provide a comprehensive analysis of the system’s health and performance.
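As a rough illustration of what analyzing changes over time can look like, the Python sketch below flags the first readout that drifts away from an early-life baseline. This is a generic z-score check written under stated assumptions, not the algorithm proteanTecs uses. The function name `detect_drift`, the `margin_ps` readouts, the baseline size and the threshold are all hypothetical.

```python
from statistics import mean, stdev

def detect_drift(readouts, baseline_size=5, threshold=3.0):
    """Return the index of the first readout that deviates from the baseline
    mean by more than `threshold` baseline standard deviations, else None."""
    if len(readouts) <= baseline_size:
        return None
    baseline = readouts[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline) or 1e-9  # guard against a perfectly flat baseline
    for i in range(baseline_size, len(readouts)):
        if abs(readouts[i] - mu) / sigma > threshold:
            return i
    return None

# Hypothetical timing-margin readouts (picoseconds) from one monitor over time.
margin_ps = [50.1, 49.9, 50.0, 50.2, 49.8, 50.0, 49.7, 48.9, 47.5, 45.0]

first_alert = detect_drift(margin_ps)
print(first_alert)
```

The point of such a check is that the alert fires while the margin is still shrinking, before the part actually fails. A production system would likely need to account for temperature, voltage and workload, which this sketch ignores.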

Chips can be either the source of the issue or serve as a system sensor. With chips able to offer active status and health reports through a common analytics dashboard, the value chain can start speaking the same fact-based, data-driven language and quickly converge on root issues. As mutual visibility and proactive collaboration increase, timescales drop and quality and accountability improve. We are now literally talking about milliseconds instead of months. Ultimately, RMA costs start to flatten out, and RMA can finally take its rightful place as a value contributor instead of a resource consumer. Just imagine a world where for every RMA, you are sending back data, not hardware.

