There's a certain change of perspective with modern AI (by "modern" I mean ResNet and beyond). When I was deep into neural nets in the 1990s, they weren't that large, and I would think of them in terms of the number of weights and nodes - but modern deep learning seems to have moved up a few levels of abstraction. (I stepped away from the field for a while.) And there's a certain understanding people seem to have now regarding the "gradient flow" through the net and why certain architectures work well (ResNet, U-Net, etc.). I must say I'm finding it tricky to shift into this new level of thinking. Also Transformers - I'm still looking for an intuitive sense of how they work, haha.
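As far as I can tell, the core of the gradient-flow story is just the skip connection: a residual block computes y = x + F(x), so the Jacobian is I + dF/dx, and the backward pass always has an identity path to earlier layers even when dF/dx is tiny. A minimal sketch of that, assuming PyTorch (the class and layer choices here are mine, just for illustration):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: y = x + F(x)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # The "+ x" skip means dy/dx = I + dF/dx, so gradients reach
        # early layers without being attenuated by every weight matrix.
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

x = torch.randn(1, 8, 16, 16)
y = ResidualBlock(8)(x)
```

Stack a hundred of these and the identity terms compose, which (as I understand it) is why very deep ResNets train where plain stacks of the same layers wouldn't.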
replies(1):