
346 points | swatson741 | 1 comment
WithinReason No.45789232
Karpathy suggests the following error function:

  def clipped_error(x):
    return tf.select(tf.abs(x) < 1.0,
                     0.5 * tf.square(x),
                     tf.abs(x) - 0.5)  # condition, true, false
Following the same principles that he outlines in the post, the "- 0.5" part is unnecessary for the gradient: the derivative of a constant is 0, so subtracting 0.5 doesn't change the backpropagated gradient. In addition, a nicer formula that achieves the same goal is √(x² + 1).
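For reference, a minimal sketch of both versions in current TensorFlow (assuming TF 2.x, where the old tf.select has been renamed tf.where; the function names are illustrative, not from Karpathy's post):

  import tensorflow as tf

  def clipped_error(x):
      # Huber-style loss: quadratic for |x| < 1, linear outside.
      # tf.where(condition, true_branch, false_branch) replaces the old tf.select.
      return tf.where(tf.abs(x) < 1.0,
                      0.5 * tf.square(x),
                      tf.abs(x) - 0.5)

  def smooth_error(x):
      # The smooth alternative mentioned above: sqrt(x^2 + 1).
      # Its gradient, x / sqrt(x^2 + 1), also saturates at +/-1 for large |x|,
      # and the function is differentiable everywhere.
      return tf.sqrt(tf.square(x) + 1.0)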
replies(3): >>45789324 #>>45790005 #>>45791588 #
1. kingstnap No.45790005
You do that to keep the loss continuous, so it looks smooth when plotted. You could in theory add some crazy stair-step that adds a hundred to the middle piece; it would make your loss curves spike and even increase toward convergence, but those spikes would just be visual artifacts of doing weird discontinuous nonsense with your loss.
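A quick numeric check of that continuity point (plain Python, illustrative only): with the offset, the two pieces agree at |x| = 1; without it, the loss jumps by 0.5 there.

  def piecewise_loss(x, keep_offset=True):
      # 0.5 * x**2 for |x| < 1, |x| - 0.5 otherwise.
      # At |x| = 1 both pieces equal 0.5, so the offset keeps the loss continuous.
      if abs(x) < 1.0:
          return 0.5 * x * x
      return abs(x) - 0.5 if keep_offset else abs(x)

  print(piecewise_loss(0.999), piecewise_loss(1.001))         # ~0.499, ~0.501  -> continuous
  print(piecewise_loss(0.999), piecewise_loss(1.001, False))  # ~0.499 vs ~1.001 -> jump of 0.5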