Large-Scale Learning with Less RAM via Randomization by Daniel Golovin, D. Sculley, H. Brendan McMahan, Michael Young.
Abstract:
We reduce the memory footprint of popular large-scale online learning methods by projecting our weight vector onto a coarse discrete set using randomized rounding. Compared to standard 32-bit float encodings, this reduces RAM usage by more than 50% during training and by up to 95% when making predictions from a fixed model, with almost no loss in accuracy. We also show that randomized counting can be used to implement per-coordinate learning rates, improving model quality with little additional RAM. We prove these memory-saving methods achieve regret guarantees similar to their exact variants. Empirical evaluation confirms excellent performance, dominating standard approaches across memory versus accuracy tradeoffs.
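To make the "randomized rounding" in the abstract concrete, here is a minimal sketch of unbiased rounding onto a coarse grid. It is my own illustration, not the authors' code: the function name `randomized_round`, the `resolution` parameter, and the grid spacing of 2^-6 are all assumptions chosen for the example, and the paper's actual encoding is not reproduced here.

```python
import math
import random


def randomized_round(x, resolution, rng=random):
    """Round x to a multiple of `resolution` (hypothetical helper).

    The value is rounded up with probability equal to its fractional
    position between the two neighboring grid points, so the result
    equals x in expectation.
    """
    scaled = x / resolution
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    # Round up with probability `frac`, down otherwise.
    k = lower + (1 if rng.random() < frac else 0)
    return k * resolution


if __name__ == "__main__":
    rng = random.Random(0)   # seeded for repeatability
    eps = 2 ** -6            # assumed grid spacing, for illustration only
    w = 0.3                  # an example weight
    samples = [randomized_round(w, eps, rng) for _ in range(100_000)]
    # Every sample is one of the two grid points around w,
    # yet the average stays close to w itself.
    print(sorted(set(samples)), sum(samples) / len(samples))
```

Each stored weight is individually "wrong" by up to one grid step, but because the rounding is unbiased the errors tend to average out rather than accumulate, which is presumably why the coarse encoding costs so little accuracy.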
I mention this in part because topic map authoring can be assisted by the results of machine learning.
It is also a data point for the proposition that, unlike their human masters, machines are too precise.
Perhaps the vagueness of human reasoning has significant advantages over the disk-grinding precision of our machines.
The question then becomes: How do we capture vagueness in a system where every point is either 0 or 1?
Not probability, because that can already be expressed, but vagueness, which I experience as something different.
Suggestions?
PS: Perhaps that is what makes artificial intelligence artificial. It is too precise. 😉
I first saw this in a tweet by Stefano Bertolo.