86 points | alexop | 1 comment
1. ZoomZoomZoom No.43327546
I've just written a tag-similarity calculation for a custom set of tagged documents, and if you're going to actually use the metric repeatedly, it's probably a good idea to fill a similarity table once instead of recalculating everything on the spot.

While doing that, the next logical step is precalculating the squared magnitudes for each item; in my small test case of <1000 items that sped the calculation up almost 300x. The gain is just a constant factor, not asymptotic, but saving that per-item work on every pair considered is far from insignificant, especially on small data sets (of course, with large sets a full pairwise table won't fit in memory and you need more sophisticated techniques).
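To illustrate the idea, here's a minimal sketch assuming cosine similarity over tag-count vectors (the comment doesn't name the metric, so the function and variable names here are illustrative, not the author's code). The magnitudes are computed once in a single pass rather than once per pair:

```python
# Sketch: pairwise similarity table with precomputed magnitudes.
# Assumes cosine similarity over dense tag vectors (an assumption,
# not stated in the original comment).
import math
from itertools import combinations

def build_similarity_table(vectors):
    # Precompute each item's magnitude once, instead of redoing
    # the sum-of-squares inside every pairwise comparison.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    table = {}
    for i, j in combinations(range(len(vectors)), 2):
        dot = sum(a * b for a, b in zip(vectors[i], vectors[j]))
        denom = norms[i] * norms[j]
        table[(i, j)] = dot / denom if denom else 0.0
    return table

docs = [
    [1, 0, 1],  # doc 0 carries tags 0 and 2
    [1, 1, 0],  # doc 1 carries tags 0 and 1
    [0, 1, 1],  # doc 2 carries tags 1 and 2
]
sims = build_similarity_table(docs)
print(round(sims[(0, 1)], 3))  # 0.5
```

With n items the naive version does O(n^2) magnitude computations (one per pair endpoint) versus O(n) here, which is where the constant-factor win on small sets comes from.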