If two nodes are both on, but the connection between them is negative, that pair makes the energy higher.
If one of those nodes switches off, energy is reduced.
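As a rough illustration of the two-node case, here is a minimal sketch. It assumes nodes are binary (0 = off, 1 = on) and uses the common Hopfield-style energy, the negative sum of weight times both node states over all connected pairs. The function name `energy` and the weight value of -1.0 are made up for the example.

```python
def energy(states, weights):
    """Energy of a network of binary (0/1) nodes.

    Assumed form: E = -sum over connected pairs (i, j) of w_ij * s_i * s_j.
    `weights` maps a pair of node indices to the connection weight.
    """
    total = 0.0
    for (i, j), w in weights.items():
        total -= w * states[i] * states[j]
    return total


# One negative connection between node 0 and node 1 (illustrative value).
weights = {(0, 1): -1.0}

both_on = energy([1, 1], weights)  # both nodes on: the negative link is "violated"
one_off = energy([1, 0], weights)  # one node off: the link no longer contributes

print(both_on, one_off)
```

With both nodes on, the negative connection contributes a positive term to the energy. Once either node is off, the product of the two states is zero and the term disappears.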
With two nodes this is trivial. With 10 nodes it is harder, and with billions of nodes it is impossible to "solve" exactly, because the number of possible on/off combinations doubles with every node you add.
All you can do then is try to get the energy as low as possible.
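One simple way to "get the energy as low as possible" without checking every combination is to flip one node at a time and keep a flip only if it lowers the energy. The sketch below assumes binary nodes and the same Hopfield-style energy as in the two-node case. `greedy_descent`, the three-node weights and the starting state are all hypothetical, and this is only one of several possible strategies. It stops at a local minimum, which is not guaranteed to be the lowest energy overall.

```python
import random


def energy(states, weights):
    """Assumed energy: E = -sum over connected pairs of w_ij * s_i * s_j."""
    return -sum(w * states[i] * states[j] for (i, j), w in weights.items())


def greedy_descent(states, weights, sweeps=10, seed=0):
    """Flip nodes one at a time, keeping only flips that lower the energy."""
    rng = random.Random(seed)
    states = list(states)  # work on a copy, leave the caller's list alone
    for _ in range(sweeps):
        order = list(range(len(states)))
        rng.shuffle(order)  # visit the nodes in random order
        changed = False
        for i in order:
            before = energy(states, weights)
            states[i] = 1 - states[i]  # tentatively flip node i
            if energy(states, weights) < before:
                changed = True  # keep the flip
            else:
                states[i] = 1 - states[i]  # undo the flip
        if not changed:
            break  # no single flip helps any more: local minimum
    return states


# Illustrative three-node network: one negative and two positive connections.
weights = {(0, 1): -1.0, (1, 2): 0.5, (0, 2): 0.5}
start = [1, 1, 0]
final = greedy_descent(start, weights)

print(start, energy(start, weights))
print(final, energy(final, weights))
```

Recomputing the full energy for every flip keeps the sketch short but is wasteful. A real implementation would only look at the connections of the node being flipped.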
This is also how a neural network can arrive at "new" information: something it was never explicitly taught, but which is consistent with the constraints it has learned about the world so far.