I, a layman, have had similar experiences. It seems tinfoil-tier to think my phone's mic is always on and running some NLP. If I weren't loosely technical I would dismiss it as such.
It's basically common knowledge that if you turn on "hear me say OK Google", the phone needs to always be listening in order to hear "OK Google". That's a green light to start parsing everything said all the time, which will then be turned around for advertising, because that's how Google makes money.
Expecting Google to use data to advertise isn't tinfoil?
But detecting “oh, this is voice” is easy. Recording the time when that happens is also easy. Knowing when people are chatting around the phone, and roughly what the fundamental frequency of their voice is, could be worth something like $0.001 per person. At Google's scale, that's worth it.
So long as they don't get caught, anyway, because that's all sorts of illegal. Especially at Google's scale. (I don't think Google does this… probably.)
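To give a sense of how little it takes to detect sound and guess a rough fundamental frequency, here is a minimal sketch in plain Python. It is an illustration only, not anything Google is known to run: the function names, the energy threshold, and the 60-400 Hz search range are all assumptions picked for the example. A bare energy gate also fires on any loud sound, not just speech.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)

def has_sound_activity(frame, energy_threshold=0.01):
    """Crude activity gate: True if the frame is louder than the threshold.

    The threshold is an arbitrary assumption; real detectors adapt it.
    """
    return frame_energy(frame) > energy_threshold

def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Rough fundamental frequency (Hz) via autocorrelation, or None.

    Searches lags corresponding to f_min..f_max (assumed speech range)
    and returns the frequency of the lag with the strongest correlation.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None

# Usage: a synthetic 120 Hz tone stands in for a voiced frame (50 ms at 8 kHz).
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 120 * n / sample_rate) for n in range(400)]
if has_sound_activity(tone):
    print("activity detected, f0 is roughly", estimate_f0(tone, sample_rate), "Hz")
```

Logging a timestamp whenever the gate fires is one more line, which is the point: none of this needs anything close to full speech recognition.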