
290 points by sebg | 1 comment
StableAlkyne No.41891352
Random forests are incredibly cool algorithms.

The idea that you can take hundreds of bad models that overfit (the individual decision trees), add even more randomness by randomly picking training data and features*, and average them together - it's frankly amazing that this leads to consistently OK models (see the sketch below). Often not the best but rarely the worst. There's a reason they're such a common baseline to compare against.

*Unless you're using Sklearn, whose RandomForestRegressor defaults to max_features=1.0 - every feature is considered at every split - so out of the box it's bagged trees, not a random forest. You have to set max_features yourself to get Breiman-style feature subsampling (the classifier at least defaults to "sqrt"). Why a class with that name ships with that default is beyond me.
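
For anyone curious how little machinery this takes, here's a minimal sketch of the idea in Python - a toy, not anyone's production implementation, and fit_forest/predict_forest are made-up names. Bootstrap the rows, let each tree subsample features at its splits, and average the predictions (X and y are assumed to be numpy arrays):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Toy random-forest-style regressor: bootstrap the rows, subsample
    # features at each split (via the tree's own max_features), average.
    def fit_forest(X, y, n_trees=100, max_features="sqrt", seed=0):
        rng = np.random.default_rng(seed)
        trees = []
        for _ in range(n_trees):
            idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample
            tree = DecisionTreeRegressor(
                max_features=max_features,  # per-split feature subsampling
                random_state=int(rng.integers(2**31)),
            )
            trees.append(tree.fit(X[idx], y[idx]))
        return trees

    def predict_forest(trees, X):
        # Averaging many deep, high-variance trees is what tames the overfitting.
        return np.mean([t.predict(X) for t in trees], axis=0)

With the built-in class, RandomForestRegressor(max_features="sqrt") turns the same per-split feature subsampling back on; leaving the regressor's default of 1.0 is what reduces it to bagging.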

replies(2): >>41891898 >>41898764
jncfhnb No.41891898
With a relatively small variation to make it gradient boosted trees, it's pretty much as good as it gets for tabular data these days.
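
A toy sketch of that variant for squared-error loss, again using sklearn trees as the base learners (real libraries like XGBoost or LightGBM layer shrinkage schedules, subsampling, and regularization on top; fit_gbt/predict_gbt are made-up names). The key difference from a forest: each shallow tree is fit to the residuals of the ensemble so far and added in with a small learning rate, sequentially rather than averaged in parallel:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Toy gradient boosting for squared error: trees are fit sequentially
    # to residuals, not averaged in parallel like a random forest.
    def fit_gbt(X, y, n_trees=200, lr=0.1, max_depth=3):
        base = y.mean()                   # start from a constant prediction
        pred = np.full(len(y), base)
        trees = []
        for _ in range(n_trees):
            residuals = y - pred          # negative gradient of squared error
            tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
            pred += lr * tree.predict(X)  # shrink each tree's contribution
            trees.append(tree)
        return base, trees

    def predict_gbt(base, trees, X, lr=0.1):
        return base + lr * sum(t.predict(X) for t in trees)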