Eh, the coordinate frame can really be anything, so it's important to spell out which one is meant. The usual convention for images is +X-Y (x increases to the right, y increases downward from the top-left corner), but for certain applications the PNG may represent data that is +X+Y, or mirrored -X+Y, in landscape or portrait. And is the coordinate system the camera's or the world's?
It's true that automatically handling every kind of input image is difficult, but imo the expected convention should at least be documented.
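To make the y-down vs. y-up difference concrete, here is a minimal sketch in plain Python. It assumes the image is a row-major list of rows with row 0 at the top; the helper names `flip_rows` and `to_y_up` are made up for illustration and don't come from any library.

```python
def flip_rows(image):
    """Return a copy of a row-major image with the row order reversed.

    Assumption: `image` is a list of rows and row 0 is the top row
    (y pointing down). In the result, row 0 is the bottom row (y pointing up).
    """
    return [list(row) for row in reversed(image)]


def to_y_up(col, row, height):
    """Map a (col, row) index from a y-down image to the y-up equivalent."""
    return col, height - 1 - row


image = [
    [1, 2, 3],  # top row in the y-down convention
    [4, 5, 6],  # bottom row
]
flipped = flip_rows(image)

# The same pixel is reachable through either convention.
col, row = to_y_up(0, 0, height=len(image))
assert flipped[row][col] == image[0][0]
```

Mirroring in x (the -X+Y case) would be the same idea applied to each row instead of to the list of rows.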
An example I recently ran into: in neurological imaging the axes point to the patient's right, anterior, and superior (RAS), whereas in radiology they point to the patient's left, anterior, and superior (LAS). Tricky to get right...
http://www.grahamwideman.com/gw/brain/orientation/orientterm...
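As a rough sketch of what the right/left difference means in practice: if both frames share the same origin and the same anterior and superior axes, going between a right-anterior-superior frame and a left-anterior-superior one is just a sign flip on the first coordinate. The function names are hypothetical, and real file formats carry more than this (voxel order, affine transforms), so treat it as an illustration only.

```python
def ras_to_las(point):
    """Convert an (x, y, z) point from a right-anterior-superior frame to a
    left-anterior-superior frame.

    Assumption: both frames share the same origin and the same y and z axes,
    so only the left/right axis changes sign.
    """
    x, y, z = point
    return (-x, y, z)


def las_to_ras(point):
    """Inverse conversion; the same sign flip undoes itself."""
    x, y, z = point
    return (-x, y, z)


p_ras = (10.0, -5.0, 3.0)  # 10 units toward the patient's right
p_las = ras_to_las(p_ras)  # the same location, now with +x toward the left

assert las_to_ras(p_las) == p_ras
```

Nothing in the numbers themselves says which frame they are in, which is why the convention has to be written down next to the data.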