Gonna be a bit of a weirdo and argue for the paper-y bubbles: not only do they add visual interest, they also add to the mood of the scene in an immersive way. Combined with the color scheme and watercolor textures, they make things feel more dreamlike and wistful, with just a hint of the macabre. And if that's what you're going for, I'd keep 'em. I'd even prioritize that OVER readability, if the situation called for it.
I think of it like this: generally, in media with audio, you want characters to speak clearly and loudly enough for the audience to hear. But... you don't actually always want that, because that's not always what a scene needs. Sometimes people mumble, sometimes they stutter, sometimes there's a lot of background noise in the setting.
And if you portray these things, your audience may not catch every word... but sometimes that's okay, because as long as the mood of a character or a scene comes across correctly, that's enough, and they can always rewind (or in this case, zoom in) if they want all the details. And of course, when it IS important for them to hear the words clearly, you'll make sure they can.
...On the other hand, you could compromise and just make the paper texture a bit lighter. ^^; But that's just my 2 cents.