I spent a lot of time with a business partner and an expert looking at the design space for accelerators, and it was made very clear to me that the memory interface puts a hard limit on what you can do, and that it is difficult to make the most of it. Particularly if a half-baked product is being rushed out because of FOMO, you’d practically expect them to ship something that delivers only a few percent of the potential performance because the memory interface doesn’t really work. It happens to the best of them:
https://en.wikipedia.org/wiki/Cell_(processor)