Except, again, that "minimum 12-bit" refers to the file format, which must contain a minimum of 12 bits of information per channel.
My point: how many bits of information come off the sensor? If it produces only 8 or 10 bits per channel, and the camera then encodes that data in the 12-bit file format, we don't magically gain any extra bits of information; the extra bits are just padding, all set to '0'.
Imagine a black and white image; that's 1 bit of information per channel. White is represented by a '1', and black by a '0'. Now encode that information in 8 bits; white becomes 00000001, and black becomes 00000000. No new information is added; just a bunch of zeroes.
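If it helps to see that in code, here is a rough Python sketch of the same idea. The function names (pad_to_container, distinct_levels) are made up for illustration, and it assumes the padding zeroes go in the high bits, as in the example above; a real camera pipeline may pad or scale differently, but the count of distinct values comes out the same.

```python
def pad_to_container(value: int, source_bits: int, container_bits: int) -> str:
    """Write a source_bits-wide sample into a container_bits-wide field.

    Assumption: unused high bits are filled with zeroes, matching the
    1-bit-in-8-bit example in the post.
    """
    if container_bits < source_bits:
        raise ValueError("container is narrower than the source data")
    if not 0 <= value < 2 ** source_bits:
        raise ValueError("value does not fit in source_bits")
    # The 'b' format type prints binary; the leading 0 and width zero-pad it.
    return format(value, f"0{container_bits}b")


def distinct_levels(source_bits: int, container_bits: int) -> int:
    """Count how many different codes actually appear in the container."""
    codes = {
        pad_to_container(v, source_bits, container_bits)
        for v in range(2 ** source_bits)
    }
    return len(codes)


# The black and white case: 1 bit of data stored in 8 bits.
print(pad_to_container(1, 1, 8))  # 00000001
print(pad_to_container(0, 1, 8))  # 00000000

# A hypothetical 10-bit sensor written to a 12-bit format.
print(distinct_levels(10, 12))    # 1024 levels actually used
print(2 ** 12)                    # 4096 levels the format could hold
```

The container can hold 4096 levels, but only the 1024 that came off the (hypothetical) 10-bit sensor ever show up.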
Again, it would be great to have 12 bits per channel, and pretty great to have 10 bits per channel. But the file format used by the X5R does not determine how many bits are reported by the sensor. We still have to wait and see on that.