ARM needs less complicated logic than x86 to deal with the instruction stream, which frees more of the silicon for things like better reordering and more execution units. OTOH, SMT somewhat mitigates the delays caused in reordering by working on more than one instruction stream at once. I'd say the 16-thread chip will end up being faster overall than the 8-core one, as long as cache misses don't incur a huge penalty on the x86's slower memory bus. The i9-9980HK is also two generations behind, which doesn't help it.
When I said there is no magic, I was warning that we shouldn't expect huge speedups or a crushing advantage, at least not for long. The edge the M1 has is due to a simpler ISA (which is less demanding to run efficiently, freeing more resources for optimization and execution) and a faster memory interface (which makes an L3 miss less of a punishment). This fast memory interface also limits it to, for now, 16GB of memory. If the dataset is 17GB, it'll suffer. Another difference is that all of the i9's cores are designed to be fast, whereas only 4 of the M1's 8 cores are. That big/little split adds flexibility, which can be put to good use by moving CPU-bound processes to the big cores and IO-bound and low-priority ones to the little ones.
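To put a rough shape on the cache-miss point, here's a back-of-the-envelope sketch using the usual average-access-time formula (hit time plus miss rate times miss penalty). Every number in it is made up for illustration, not measured on either chip, and `avg_access_ns` is just a name I picked.

```python
def avg_access_ns(hit_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus the expected cost of a miss."""
    return hit_ns + miss_rate * miss_penalty_ns

# Hypothetical figures, NOT measurements of the M1 or the i9.
LLC_HIT_NS = 10.0   # assumed last-level cache hit latency
MISS_RATE = 0.05    # assumed fraction of accesses that miss the last-level cache

slow_dram = avg_access_ns(LLC_HIT_NS, MISS_RATE, 100.0)  # assumed slower memory path
fast_dram = avg_access_ns(LLC_HIT_NS, MISS_RATE, 60.0)   # assumed faster memory path

print(f"slower memory: {slow_dram:.1f} ns, faster memory: {fast_dram:.1f} ns")
```

With the same miss rate, the chip with the cheaper miss comes out ahead on every access, and the gap grows as the working set outgrows the cache and the miss rate climbs. How big the gap really is depends entirely on the real latencies, which is why measurements matter.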
In the end, they are very different chips (in design and TDP). It'd be interesting to compare them, along with newer Intel chips, using actual measurements.