37 points luu | 2 comments
1. trebligdivad No.43698092
Is this limited to lockstep between softcores on a die - so good for low level error failures like soft error, but no good if the package dies? (Still very neatly done)
replies(1): >>43712239 #
2. addaon No.43712239
> Is this limited to lockstep between softcores on a die - so good for low level error failures like soft error, but no good if the package dies? (Still very neatly done)

Depends on what you mean by "good for." The intent of lockstep is to convert essentially all undetectable errors into detectable errors, usually to allow fail-silent behavior, rather than to eliminate detectable errors. This property, that every failure has a defined failure mode, is then used at the system level to build robust systems. For example, downstream actuators can receive multiple command streams from multiple lockstep systems and, relying on the invariant that a correctly received message came from a correctly operating system, can safely act on any of them rather than needing to vote on the received messages. A package failure should be very unlikely to introduce an undetectable error in this context.
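
To make the "act on any valid message, no voting" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `lockstep_channel`, `actuator_select`, `frame`, and `unframe` are hypothetical, the software comparison stands in for a hardware lockstep comparator, and a CRC-32 stands in for whatever integrity check the real link would use.

```python
import zlib
from typing import Callable, Iterable, Optional

Core = Callable[[int], bytes]


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect corruption in transit."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def unframe(msg: bytes) -> Optional[bytes]:
    """Return the payload if the CRC matches, otherwise None."""
    if len(msg) < 4:
        return None
    payload, crc = msg[:-4], msg[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != crc:
        return None
    return payload


def lockstep_channel(core_a: Core, core_b: Core, sensor: int) -> Optional[bytes]:
    """Run two cores on the same input and compare their outputs.

    A mismatch is a detected error, so the channel goes silent (None)
    instead of emitting a possibly wrong command.
    """
    out_a = core_a(sensor)
    out_b = core_b(sensor)
    if out_a != out_b:
        return None  # fail silent
    return frame(out_a)


def actuator_select(messages: Iterable[Optional[bytes]]) -> Optional[bytes]:
    """Act on the first correctly received message; no voting needed.

    This relies on the invariant that a correctly received message
    came from a correctly operating channel.
    """
    for msg in messages:
        if msg is None:
            continue  # channel was silent
        payload = unframe(msg)
        if payload is not None:
            return payload
    return None  # every channel silent or corrupted: a detectable failure


# Hypothetical cores: both should compute 2 * sensor as a 2-byte command.
def healthy_core(sensor: int) -> bytes:
    return (2 * sensor).to_bytes(2, "big")


def faulty_core(sensor: int) -> bytes:
    # Simulates a soft error flipping the low bit of the result.
    return ((2 * sensor) ^ 1).to_bytes(2, "big")
```

With one channel containing `faulty_core` and another with two healthy cores, the first channel stays silent and `actuator_select` simply takes the second channel's command. If a whole package dies, its channel is also silent, which the actuator handles the same way.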