The short version is that if we can approximate the sensory and motor experience of an organism, and we can successively refine that approximation as measured by the similarity in behavior between bat and man-bat, then I would argue that we can in fact imagine what it is like to be a bat.
Essentially, it is a Chinese Bat Room argument. Put a human controlling a robot bat in one box and a real bat in another, then ask someone to determine which is the human and which is the bat. Once science can no longer tell the difference (because we have refined the human/bat interface sufficiently), you can ask the human controlling the robot bat to write down their experience, and it would be strikingly similar to what the bat would say if we could teach it English.
The bat case is actually easier than one might suppose, as is the case of, say, a jumping spider, because we can translate their sensory inputs into signals our nervous system can take in. If we tune our reward system and motor system so that we get even an approximate set of inputs and a similar set of actuators, then we can experience what it is like to be a bat.
Further, if I improve the fidelity of the experimental man-bat simulation rig, the experience will likewise converge. While we will never be able to truly be a bat, since that is asymptotically mutually exclusive with our biology, the fact that we can build systems that allow a progressive approach to bat sensorimotor experience means that we actually do have the ability to imagine the experience of other beings. That is, our experiences are converging, and they differ only because we lack the technical ability to overcome the limitations of our biological differences.
The harder case is when we literally don't have the molecule that is used to detect something, as in the tetrachromat case. That said, one of my friends has always wanted to find a way to do an experiment where a trichromat has the additional photoreceptor expressed in one eye, to see what happens.
The general argument for why we would expect something similar to happen, should the technical hurdles be overcome, is that basically all nervous systems wire themselves up by learning. Therefore, as long as the input and output ranges can be mapped to something that a human can learn, a human nervous system should likewise converge on being able to sense those inputs and produce those outputs (modulo certain critical periods in neural development, though even those can be overcome, e.g. language acquisition by slowing down speech for adults).
Some examples of technical hurdles: converting a trichromat into a tetrachromat by CRISPRing someone's left eye; learning dolphin by slowing dolphin speech down in time while also providing a way for humans to produce high-frequency dolphin speech via some transform on the human orofacial vocal system. There are limitations because we can't literally dilate time, but I suppose if we are going all the way, we can accelerate the human to whatever fraction of the speed of light will compensate for the fact that the human motor system can't quite operate fast enough to allow a rapid-fire conversation with a dolphin.