WithinReason:
Karpathy suggests the following error function:

  def clipped_error(x):
    return tf.select(tf.abs(x) < 1.0,
                     0.5 * tf.square(x),
                     tf.abs(x) - 0.5)  # condition, true, false
Following the same principles he outlines in the post, the "- 0.5" part is unnecessary: the gradient of a constant is 0, so subtracting 0.5 doesn't change the backpropagated gradient. In addition, a nicer formula that achieves the same goal is √(x²+1).
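A minimal sketch of both points, assuming TensorFlow 2.x (where tf.select has become tf.where; the function names below are just illustrative): the gradient is the same with or without the -0.5 offset, and √(x²+1) is a smooth loss that is roughly quadratic near zero and linear in the tails:

  import tensorflow as tf

  def clipped_error(x):
      # Huber-style loss: quadratic for |x| < 1, linear otherwise
      return tf.where(tf.abs(x) < 1.0,
                      0.5 * tf.square(x),
                      tf.abs(x) - 0.5)

  def clipped_error_no_const(x):
      # same loss without the -0.5; the constant shifts the value, not the gradient
      return tf.where(tf.abs(x) < 1.0,
                      0.5 * tf.square(x),
                      tf.abs(x))

  def smooth_error(x):
      # smooth alternative: ~0.5*x^2 + const near zero, ~|x| for large |x|
      return tf.sqrt(tf.square(x) + 1.0)

  x = tf.constant([-3.0, -0.5, 0.0, 0.5, 3.0])
  with tf.GradientTape(persistent=True) as tape:
      tape.watch(x)
      y1 = tf.reduce_sum(clipped_error(x))
      y2 = tf.reduce_sum(clipped_error_no_const(x))
      y3 = tf.reduce_sum(smooth_error(x))

  print(tape.gradient(y1, x).numpy())  # gradients clipped to [-1, 1]
  print(tape.gradient(y2, x).numpy())  # identical to the line above
  print(tape.gradient(y3, x).numpy())  # x / sqrt(x^2 + 1), also bounded by 1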
slashdave:
square roots are expensive
WithinReason:
They are negligible, especially since the post was written at a time when ops were not fused. The extra memory needed to store the extra tensors with the original version is more expensive.
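A rough way to sanity-check the cost claim (a hedged sketch, assuming TensorFlow 2.x; actual timings depend on hardware and on whatever fusion the runtime applies):

  import timeit
  import tensorflow as tf

  @tf.function
  def clipped_loss(x):
      # piecewise Huber-style loss; tf.where evaluates both branch tensors
      return tf.reduce_sum(tf.where(tf.abs(x) < 1.0,
                                    0.5 * tf.square(x),
                                    tf.abs(x) - 0.5))

  @tf.function
  def smooth_loss(x):
      # smooth alternative: a single elementwise expression
      return tf.reduce_sum(tf.sqrt(tf.square(x) + 1.0))

  x = tf.random.normal([1_000_000])

  # warm up to trigger tracing before timing; .numpy() forces synchronization
  clipped_loss(x).numpy(); smooth_loss(x).numpy()

  print("clipped:", timeit.timeit(lambda: clipped_loss(x).numpy(), number=100))
  print("smooth: ", timeit.timeit(lambda: smooth_loss(x).numpy(), number=100))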