Abstract
Obtaining quadrillion-transistor logic systems despite imperfect manufacture, hardware failure, and incomplete system specification
Lisa J.K. Durbeck and Nicholas J. Macias
 
In Nano, Quantum and Molecular Computing, Sandeep K. Shukla and R. Iris Bahar, eds. Kluwer Academic Publishers, Boston, 2004. ISBN 1-4020-8067-0, pp. 109-131.
Copyright © 2004 Kluwer Academic Publishers
 
New approaches to manufacturing low-level logic (switches, wires, gates) are under development that depart starkly from current techniques and may drastically advance logic system manufacture. At some point in the future, possibly within 20 years, logic designers may have access to a billion times more switches than they do now. It is sometimes useful to let a large milestone such as this set some of the directions of contemporary research. What questions must be answered so that we reach this milestone, at which logic systems contain a billion times more components, sooner and more gracefully? The problems include how to design, implement, maintain, and control such large systems so that the increase in complexity yields a similar increase in performance. When logic systems contain 10^17 switches or components, it will be prohibitively difficult or expensive to manufacture them perfectly. Also, handling and correcting operating errors will consume substantial system resources. We believe these tendencies can be minimized by introducing low-cost redundancy so that, in essence, if one switch or transistor fails, the one next to it can take over for it. This reduces effective hardware size by some factor, in exchange for a way both to use imperfect manufacturing techniques and, through similar means, to maintain the system during its life cycle. It may also be possible to apply similar basic principles to a more complex problem: designing a system that can catch and compensate for operating errors, at a cost in time and resources low enough to allow incorporation into all large systems. We suggest that such a system will be a distributed, parallel system or mode of operation in which system failure detection is a hierarchical set of increasingly simple, local tasks run while the system is running.
Work toward answering these questions also appears to yield useful ways to approach a more general question: how to construct systems whose structure and function cannot be completely predetermined.