So yes, any image was extremely ephemeral at the time.
PS: Apparently it’s called a Noddy: a video camera controlled by a servomotor to pan and tilt (or 'nod', hence the name). See https://en.wikipedia.org/wiki/Noddy_(camera)
The Noddy was used because the idents went out live, and it “allowed the idents to be of no fixed length as the clock symbols could continue for many minutes at a time”.
So, it’s not really because they couldn’t store video. It’s because they needed an indefinite amount of video for the clock idents and couldn’t generate them digitally.
In contrast, pointing a TV camera at a spinning globe was much easier. And for showing the time, pointing at a physical clock was much easier than, what, having twelve hours of film footage available and having to sync to the right frame?
I think what may surprise people more than moving station idents typically being in-camera props is that, pre-digital, broadcasting even a static image was most easily done by pointing a camera at a piece of card. Repeating a single frame over and over was not something that could easily be done any other way; having a camera continually capture and immediately broadcast the frame was just much easier.
Videotape, once it came in, allowed freeze frames, but continually reading from the same spot on a tape caused wear, so you couldn’t rely on being able to show a single frame from tape indefinitely.
Digital freeze frame machines that could capture a frame of video and repeatedly play it back from a memory buffer only started showing up in the 1980s.