Hey @richardbrand
It also means these services have to provide their own custom app at the client end to handle TCP!
Strictly, every TCP packet has to be acknowledged as received and correct in an individual message sent back to the server,
I believe this is not fully correct. The TCP stack at the client and at the server handles the buffering and out-of-order reassembly, so the app doesn’t have to. The receiving TCP stack sends acknowledgements (ACKs) back for the data it receives; ACKs are cumulative, so a single ACK can cover several packets. If data is not ACK’d in time, the sender re-transmits it until it gets an ACK covering that data.
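To make the reordering and ACK behaviour a bit more concrete, here is a minimal toy sketch in Python. It is not how a real TCP stack is written: the `receive` function is a made-up name, sequence numbers count whole segments instead of bytes, and it sends one ACK per arrival for simplicity. It only illustrates that data reaches the app in order and that each ACK says "this is the next segment I still need".

```python
def receive(segments):
    """Toy TCP-style receiver (illustrative only, not a real stack).

    `segments` is a list of (seq, data) tuples in arrival order.
    Returns the data delivered to the app and the ACK sent after each arrival.
    """
    expected = 0      # next in-order segment number we are waiting for
    pending = {}      # out-of-order segments parked until the gap is filled
    delivered = []    # what the application sees, always in order
    acks = []         # cumulative ACK after each arrival: "next seq I need"
    for seq, data in segments:
        if seq >= expected:          # ignore duplicates of delivered segments
            pending[seq] = data
        while expected in pending:   # hand over everything now contiguous
            delivered.append(pending.pop(expected))
            expected += 1
        acks.append(expected)
    return delivered, acks


# Segment 1 shows up last, e.g. after being lost and re-transmitted.
arrivals = [(0, "a"), (2, "c"), (3, "d"), (1, "b")]
delivered, acks = receive(arrivals)
print(delivered)  # ['a', 'b', 'c', 'd']
print(acks)       # [1, 1, 1, 4]
```

The repeated ACK of 1 is the receiver telling the sender it is still missing segment 1; once it arrives, a single ACK of 4 covers everything received so far.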
Of course, there are limits and timeouts, but this is how all TCP streams are handled most of the time, until things get TOO bad to fix. What the client app (Netflix, Roon, etc.) does is try to buffer enough of the stream, say 10 seconds as an example, so that when these packets are delayed they don’t affect playback.
The TCP/IP stream may have built-in resiliency, but it comes with HIGH JITTER (I’m using this term metaphorically, not specifically). That is, if you lose 5 packets, or end up with packets taking 2 separate routes, which means constant re-ordering, this causes stuttering or pauses in the network pipeline.
If you have ever started a large download and watched the progress bar you may have noticed that it seems to go quickly, then pause, then run again, then slow down. All this is being handled by the TCP layer for the most part, but you certainly don’t want to hear music or watch a TV stream like that.
The role of the client software buffers is to remove that network-induced jitter, so that to the human consuming the media it sounds and looks like one complete uninterrupted presentation.
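As a rough illustration of why that buffer helps, here is a small Python simulation. Everything in it is an assumption for the sake of the example: `count_stalls` is a made-up function, the "frames" are 1 second each, and the arrival times are invented to mimic the fast-pause-fast pattern of a TCP download. It is not how Netflix or Roon actually implement their buffers.

```python
def count_stalls(arrival_times, frame_interval, prebuffer):
    """Count playback stalls for frames arriving at `arrival_times` (seconds).

    Playback starts once the first `prebuffer` frames have arrived; after
    that, one frame is needed every `frame_interval` seconds.
    """
    start = arrival_times[prebuffer - 1]  # wait for the initial buffer to fill
    stalls = 0
    slip = 0.0                            # how far playback has been pushed back
    for i, arrived in enumerate(arrival_times):
        due = start + i * frame_interval + slip
        if arrived > due:                 # frame not there when needed: stall
            stalls += 1
            slip += arrived - due         # playback resumes when it arrives
    return stalls


# Bursty delivery: three frames at once, a 5 second pause, three more, ...
arrivals = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 10.0, 10.0, 10.0]

print(count_stalls(arrivals, 1.0, prebuffer=1))  # 2 (starts at once, stalls twice)
print(count_stalls(arrivals, 1.0, prebuffer=4))  # 0 (starts 5 s later, no stalls)
```

Same network, same packets; the only difference is that the second run waits a few seconds before starting playback.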
When watching Hulu, or listening to Qobuz, a little delay at the start is preferable to stuttering, but when in a meeting, we want our video and audio in real time, and we're happier to simply lose packets than to wait for them to be re-transmitted and reassembled. That's why there's a difference in TCP vs. UDP for media playback.
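For contrast, here is a toy Python sketch of the real-time approach, where a late or lost packet is simply skipped instead of waited for. Again, the function name `realtime_playout`, the 20 ms packet interval and the 50 ms wait budget are all assumptions picked for the example, not values from any particular meeting app.

```python
def realtime_playout(arrivals, slots, interval, max_wait):
    """Return what is played in each slot; None marks a gap.

    `arrivals` maps packet number -> arrival time in seconds (a missing key
    means the packet was lost). Slot i is played at i * interval + max_wait
    whether or not packet i has arrived by then.
    """
    played = []
    for seq in range(slots):
        deadline = seq * interval + max_wait
        arrived = arrivals.get(seq)
        on_time = arrived is not None and arrived <= deadline
        played.append(seq if on_time else None)  # skip it rather than wait
    return played


# Packet 2 is lost, packet 4 arrives far too late to be useful.
arrivals = {0: 0.01, 1: 0.03, 3: 0.07, 4: 0.30, 5: 0.11}
print(realtime_playout(arrivals, slots=6, interval=0.02, max_wait=0.05))
# [0, 1, None, 3, None, 5]
```

The two gaps are a brief glitch in the call, but the conversation never falls behind, which is the trade-off being described above.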