The architecture of a cell matrix has some natural benefits in terms of fault tolerance. Among these are:
robustness: Because the hardware is just a very large array of identical, simple cells, there are very few critical points within the cell matrix. If a manufacturing defect disables one section of the hardware, there's a good chance the rest of the matrix will still function normally (the power lines and the single system-wide clock are the exceptions). This type of fault tolerance has been observed in manufactured prototypes of the cell matrix.
redundancy: Should part of the system fail at run time, you won't lose any specialized hardware. Since all the hardware is the same, all you've lost is the configuration information stored in the failed cells. If that information is stored redundantly throughout the matrix, it is theoretically possible to restore the damaged functionality by programming other cells with the lost configuration. While there are formidable programming challenges associated with this, the basic capability is supported at the hardware level.
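The redundancy idea can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Matrix` class and its `program`, `fail`, and `restore` methods are hypothetical names invented for illustration, not part of any Cell Matrix tool or hardware interface, and a "cell" here is nothing more than a slot holding configuration bytes.

```python
# Toy model only. Matrix, program, fail, and restore are hypothetical names,
# not a real Cell Matrix API. A cell is modeled as a slot that is either
# unconfigured (None) or holds a (function_name, config_bits) pair.

class Matrix:
    def __init__(self, rows, cols):
        # Every cell is identical hardware, so any working cell can act as a spare.
        self.cells = {(r, c): None for r in range(rows) for c in range(cols)}
        self.dead = set()  # cells lost to a fault

    def program(self, pos, name, bits):
        """Load configuration bits into a working cell."""
        if pos in self.dead:
            raise ValueError(f"cell {pos} is faulty")
        self.cells[pos] = (name, bits)

    def fail(self, pos):
        """Simulate a run-time fault: the cell and its stored configuration are lost."""
        self.dead.add(pos)
        self.cells[pos] = None

    def holders(self, name):
        """Working cells that currently hold a copy of the named configuration."""
        return [p for p in sorted(self.cells)
                if self.cells[p] is not None and self.cells[p][0] == name]

    def spare(self):
        """First working, unconfigured cell in row-major order, or None."""
        for pos in sorted(self.cells):
            if pos not in self.dead and self.cells[pos] is None:
                return pos
        return None

    def restore(self, name, wanted):
        """Copy a surviving configuration onto spares until `wanted` copies exist."""
        live = self.holders(name)
        if not live:
            raise RuntimeError(f"no surviving copy of {name!r}")
        bits = self.cells[live[0]][1]
        while len(self.holders(name)) < wanted:
            pos = self.spare()
            if pos is None:
                raise RuntimeError("no spare cells left")
            self.program(pos, name, bits)
        return self.holders(name)


m = Matrix(2, 3)
m.program((0, 0), "adder", b"\x12\x34")  # primary copy (made-up bits)
m.program((1, 2), "adder", b"\x12\x34")  # redundant copy elsewhere in the matrix
m.fail((0, 0))                           # fault destroys the primary copy
print(m.restore("adder", wanted=2))      # a spare cell is programmed from the survivor
```

The model deliberately ignores the hard parts mentioned above, such as detecting the fault and re-routing connections to the replacement cell; it only shows that losing a cell costs configuration data, not unique hardware.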
This fault tolerance is important not only from a reliability standpoint, but from a manufacturing one as well. Since large matrices are necessary for highly parallel problems (see parallel), the ability to use a manufactured matrix despite its defects is essential.
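A rough calculation shows why defect tolerance matters more as matrices grow. The sketch below assumes, purely for illustration, that each cell is defective independently with the same probability; that model and the example numbers are assumptions, not figures from cell matrix prototypes. Under that assumption the chance that every one of N cells works is (1 - p)^N, which shrinks quickly as N grows, while the expected fraction of working cells stays at 1 - p.

```python
# Illustrative arithmetic only. The independent-defect model and the numbers
# below are assumptions, not measured data.

def defect_free_probability(n_cells, p_defect):
    """Probability that all n_cells work, assuming independent defects."""
    return (1.0 - p_defect) ** n_cells

def expected_working_fraction(p_defect):
    """Expected fraction of cells that work under the same assumption."""
    return 1.0 - p_defect

p = 1e-6  # hypothetical per-cell defect probability
for n in (10**4, 10**6, 10**8):
    print(n, defect_free_probability(n, p), expected_working_fraction(p))
```

A design that needs a perfect die becomes less and less likely to yield as the cell count rises, whereas a matrix that can work around bad cells still has almost all of its hardware available.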