1 | 1 | # What to return for non-differentiable points |
2 | 2 | !!! info "What is the short version?" |
3 | | - If the function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in-between (e.g. for `abs` claiming 0 is a good idea). |
4 | | - If it is not differentiable due to the primal not being defined on one side, you can set it to what ever you like. |
5 | | - Your rule should claim a derivative that is *useful*. |
6 | | -In calculus one learns that if the derivative as computed by approaching from the left, |
7 | | -and the derivative one computes as approaching from the right are not equal then the derivative is not defined, |
8 | | -and we say the function is not differentiable at that point. |
9 | | -This is distinct from the notion captured by [`NoTangent`](@ref), which is that the tangent space itself is not defined: because in some sense the primal value can not be perturbed e.g. is is a discrete type. |
| 3 | + If the function is not differentiable, choose to return something useful rather than erroring. |
| 4 | + If a function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in-between. |
| 5 | + In particular, for local optima (as in the case of `abs`), claiming the derivative is 0 is a good idea. |
| 6 | + Similarly, if the derivative from one side is not defined, or is not finite, return the derivative from the other side. |
| 7 | + Throwing an error, or returning `NaN`, is generally the least useful option. |
10 | 8 |
11 | 9 | However, contrary to what calculus says, most autodiff systems will return an answer for such functions.
12 | 10 | For example, for `abs_left(x) = (x <= 0) ? -x : x`, AD will say the derivative at `x=0` is `-1`.
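
To make the `abs_left` example concrete, here is a minimal sketch that estimates the slope on each side of `x=0` with finite differences. The helpers `left_derivative` and `right_derivative` are hypothetical names introduced only for illustration; they are not part of ChainRules or of any AD package.

```julia
# Hypothetical helpers for illustration only (not part of any package).
# One-sided finite-difference estimates of the slope of `f` at `x`.
left_derivative(f, x; h=1e-6) = (f(x) - f(x - h)) / h
right_derivative(f, x; h=1e-6) = (f(x + h) - f(x)) / h

# The function from the text: at x = 0 the `x <= 0` branch (`-x`) is taken.
abs_left(x) = (x <= 0) ? -x : x

l = left_derivative(abs_left, 0.0)   # slope approaching from the left
r = right_derivative(abs_left, 0.0)  # slope approaching from the right
```

The two one-sided slopes disagree (`l` is negative, `r` is positive), so calculus says the derivative at `0` is undefined. An AD system that differentiates whichever branch is actually executed sees only `-x` at `x=0`, which is why it reports `-1`.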
@@ -137,4 +135,4 @@ These rough rules are: |
137 | 135 | - If the derivative from one side is finite and the other isn't, say it is the derivative taken from the finite side. |
138 | 136 | - When the derivatives from each side are not equal, strongly consider reporting the average. |
139 | 137 |
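
As a rough sketch of how the two rules shown above could be combined, assume the one-sided derivatives `left` and `right` are already known. The function `pick_derivative` is a hypothetical name used only for illustration, and it covers only the rules visible here, not the full list.

```julia
# Hypothetical helper for illustration only (not part of ChainRules).
# Chooses a derivative to report from the two one-sided derivatives.
function pick_derivative(left, right)
    if isfinite(left) && isfinite(right)
        # Both sides are finite: report the average
        # (this equals either side when they already agree).
        return (left + right) / 2
    elseif isfinite(left)
        # Only the left side is finite: report it.
        return left
    elseif isfinite(right)
        # Only the right side is finite: report it.
        return right
    else
        # Neither side is finite; the rules shown here do not cover this case,
        # so this sketch simply passes the left value through.
        return left
    end
end

pick_derivative(-1.0, 1.0)  # `abs` at 0: the average of the two branches
pick_derivative(Inf, 2.0)   # one side not finite: use the finite side
```

For `abs` at `0` this gives `0.0`, which matches the advice to report `0` at a local optimum.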
140 | | -Our goal as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone. |
| 138 | +Our goal, as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone. |