I'm in an adjacent space, so I can imagine some of the difficulties. Basically live streaming is a parallel infrastructure that shares very little with pre-recorded streaming, and there are many failure points.
* Encoding - low-latency encoders are quite different from storage encoders. There is a tradeoff between key frame frequency and overall encoding efficiency. More key frames mean that anyone can tune in or recover from a loss more quickly, but key frames are much larger than predicted frames, so at a fixed bitrate quality drops. The encoder and infrastructure should emit transport streams (MPEG-TS), which carry more overhead than file-oriented containers like MP4 but tolerate loss and mid-stream joins better.
* Adaptation - Netflix normally encodes its content as a ladder of various codecs and bitrates. This ensures that people get roughly the maximum quality that their bandwidth will allow without buffering. For a live event, you need the same ladder, and the clients need to switch between rungs invisibly.
* Buffering - for static content, you can easily buffer 30 seconds to a minute of video. This means that small latency or packet loss spikes are handled invisibly at the transport/buffering layer. You can't do this for a live event, since that level of delay would usually be unacceptable for a sporting event. You may only be able to buffer 5-10 seconds. If the stream starts to falter, the client has only a few seconds to detect and shift to a lower rung.
* Transport - Pre-recorded media can use a reliable transport like TCP (usually via HLS over HTTP). In contrast, live video would ideally use an unreliable transport like UDP, but with FEC (forward error correction). TCP reacts to packet loss by cutting its congestion window (roughly in half under classic congestion control), which cuts throughput about as much and can force the client to throw away the connection in order to shift to a lower-bandwidth rung.
* Serving - pre-recorded media can be synchronized to global DCs. Live events have to be streamed reliably and redundantly to a tree of servers. Those servers need to be load balanced, and the clients must implement exponential backoff or you can have cascading failures.
* Timing - Unlike with pre-recorded media, a client whose clock runs slightly fast will run out of frames and must either repeat frames and stretch audio or suffer glitches. If you resolve this on the server side by stretching the media, you add complication and your stream slowly falls behind the live event.
* DVR - If you allow the users to pause, rewind, catch up, etc., you now have a parallel pre-recorded infrastructure and the client needs to transition between the two.
* DRM - I have no idea how/if this works on a live stream. It would not be ideal for all clients to use the same decryption keys and receive the same streams with the same metadata. That would make tracing the source of a pirate stream very difficult. Differentiation/watermarking adds substantial complexity, however.
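
To make the adaptation, buffering and serving points a bit more concrete, here is a rough Python sketch of two pieces of client logic: picking a rung when the buffer is only a few seconds deep, and spacing out reconnect attempts with exponential backoff plus jitter. Everything in it is an assumption for illustration: the ladder values, the thresholds, and the names `pick_rung` and `backoff_delay` are made up, and a real player would be considerably more careful.

```python
import random

# Hypothetical bitrate ladder in kbit/s, lowest rung first (illustrative values).
LADDER_KBPS = [400, 1200, 2500, 5000, 8000]


def pick_rung(throughput_kbps, buffer_s, current, ladder=LADDER_KBPS,
              safety=0.8, panic_buffer_s=2.0, up_buffer_s=6.0):
    """Return the ladder index to request for the next segment.

    All thresholds are assumed values, not taken from any real player.
    """
    # Highest rung whose bitrate fits within a safety margin of measured throughput.
    budget = throughput_kbps * safety
    affordable = 0
    for i, kbps in enumerate(ladder):
        if kbps <= budget:
            affordable = i

    if buffer_s < panic_buffer_s:
        # Nearly out of video: drop at least one rung, whatever the estimate says.
        return min(affordable, max(current - 1, 0))
    if affordable > current:
        # Only step up when the (small) live buffer has some headroom,
        # and only one rung at a time to keep switches unobtrusive.
        return current + 1 if buffer_s >= up_buffer_s else current
    # Throughput no longer supports the current rung: move down right away.
    return affordable


def backoff_delay(attempt, base_s=0.5, cap_s=30.0, rng=random):
    """Seconds to wait before reconnect number `attempt` (0-based).

    Exponential backoff with full jitter: a uniform draw between 0 and
    min(cap_s, base_s * 2**attempt), so clients do not retry in lockstep.
    """
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

With these made-up numbers, a client on rung 3 whose buffer falls to 1 second gets `pick_rung(10000, 1.0, 3) == 2` even though the throughput estimate looks healthy, and the jitter in `backoff_delay` is what keeps a large audience from hammering a recovering server at the same instant.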