BT is generally a 1:1 secure connection. That's a very inaccurate description, but it serves our purpose. BT was originally designed waaay before smartphones as a way of maintaining low-power wireless connectivity to stuff like keyboards and mice.
It's become more capable since then, but a lot of the focus has understandably been on increasing the capacity of the 'host' of the party rather than the guests. More people can join the party (devices connected to your phone/PC), but they're doing different activities at different times.
The challenge with audio is that every receiver has to get the exact same data at the exact same time, otherwise humans notice - which the protocol wasn't really designed for. There have been some inroads, but it's a bit of a protocol limit. This is why most BT headsets are a single unit - one receiver (guest) receiving the data, then disseminating it to its family group. AirPods get around this by having one pod that actually connects to your phone, then the second pod syncs to the first. (Someone at the party chatting on the phone with a friend elsewhere, to stretch the analogy.)
When you start mixing multiple guests of varying hardware, all wanting the same thing at the same time from the one host with varying latencies, it can get real messy, and we're really good at picking up audio discrepancies.
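To put rough numbers on that, here's a minimal sketch of the problem. It's not anything from the Bluetooth spec: the device names, the latencies and the 20 ms tolerance are all made-up assumptions for illustration. The idea is just that if every guest gets the same audio with a different delay, the fast ones have to be held back to match the slowest.

```python
# Hypothetical per-device latencies in milliseconds (made-up numbers).
latencies_ms = {"earbud_left": 40, "earbud_right": 55, "speaker": 180}

# Assumed tolerance before listeners notice devices are out of sync.
# Illustrative figure only, not a value from any spec.
TOLERANCE_MS = 20


def skew_ms(latencies):
    """Gap between the fastest and the slowest device."""
    return max(latencies.values()) - min(latencies.values())


def compensation_ms(latencies):
    """Extra delay each device needs so it plays in step with the slowest one."""
    slowest = max(latencies.values())
    return {name: slowest - ms for name, ms in latencies.items()}


skew = skew_ms(latencies_ms)
print(f"skew: {skew} ms, noticeable: {skew > TOLERANCE_MS}")
print(compensation_ms(latencies_ms))
```

The catch is that padding like this only works if the latencies are known and stay stable, and it makes every device as laggy as the worst one in the group.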