I think they mean whatever's handling the model: a program into which you feed this inherently restricted format, one that exploits those limitations to run more efficiently.
Like if every weight is -1, 0, or 1, you don't need floating-point multiplication at all: multiplying by a weight just means adding the activation, subtracting it, or skipping it.
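A toy sketch of what that buys you (not any particular engine's kernel, just the idea):

```python
def ternary_dot(weights, activations):
    """Dot product where every weight is in {-1, 0, +1}.

    No multiplies needed: each weight either adds the
    activation, subtracts it, or skips it entirely.
    """
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x
        elif w == -1:
            total -= x
        # w == 0: contributes nothing, skip
    return total

print(ternary_dot([1, 0, -1, 1], [0.5, 2.0, 1.5, -0.25]))  # -1.25
```

Real inference engines also pack the weights and vectorize this, but the core trick is the same: the multiply unit sits idle.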
Oh sure, Google only fucks with how texting works, whether my car GPS cooperates, what the entire UI looks like...
I know this particular abuse wasn't Google's fault specifically. But it's the same god damn thing. Nobody asked me if I wanted that. They just shoved it onto the machine I paid to own and pay to use, like it's not mine enough to deserve respect, let alone control.
It's ternary, and I understand why they say "1-bit" instead, but the name still bugs me.
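For what it's worth, one ternary weight carries log2(3) ≈ 1.58 bits of information, so "1.58-bit" would be the honest label. Quick check:

```python
import math
print(math.log2(3))  # 1.584962...: bits of information per ternary weight
```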
I'd love to see how low they can push this and still get spooky results. Something with ten million parameters would fit in the RAM of a Macintosh Classic II, and if it ran at any speed worth calling interactive, it'd undercut a lot of loud complaints about energy use. Training takes a zillion watts; using the model is like running a video game.
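Back-of-envelope, under an assumed dense base-3 packing (3**5 = 243 ≤ 256, so five ternary weights fit in one byte):

```python
# Rough memory footprint for a hypothetical 10M-parameter ternary model,
# assuming 5 weights packed per byte. Ignores activations and overhead.
params = 10_000_000
bytes_needed = params // 5
print(f"{bytes_needed / 2**20:.1f} MiB")  # ~1.9 MiB of packed weights
```

The Classic II shipped with 2 MB of RAM and maxed out at 10 MB, so the weights at least fit on paper. Whether the 16 MHz 68030 could run it at interactive speed is another question.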
Guinan: "More?"
Data: "Please."