Comments on the first draft

This is good and well written, but it needs more insight (for the intended audience). I suggest you add:

  • Some hints at the theorems. For example, in the definition of coresets, you talk about "for any C of appropriate size" (why isn't it P?), and then it would be good to say what size is needed. (Can we just select some at random?) You give us the definitions of Bayesian coresets, but then don't tell us what is known about the approximation (are there known bounds?) It would be good to have some details about what is known (for at least one case), instead of teasing us with "several convergence results exist"; perhaps give us the simplest one (I sketch the kind of statement I mean just after this list). It would be nice to have both theoretical and empirical results (or to say why some are missing).
  • Some case(s) in more detail. It is usually easiest for a reader to understand a theoretical topic if they have one example in detail. People are good at generalizing from examples (even single examples). I suggest that you choose one of the construction examples and do it in more detail (tell us how the examples are chosen, and what bounds are known in theory and in practice); a minimal sketch of the kind of construction I mean also follows this list. Then say how the other methods relate.
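
For concreteness, here is roughly the kind of statement I have in mind; I am assuming the usual sparse weighted-likelihood formulation (the notation below is mine, not necessarily the draft's), so please check the constants against whatever source you cite. Write the full log-likelihood as \mathcal{L}(\theta) = \sum_{n=1}^{N} \mathcal{L}_n(\theta) and a coreset approximation as \mathcal{L}(w, \theta) = \sum_{n=1}^{N} w_n \mathcal{L}_n(\theta), with w \ge 0 and at most M \ll N nonzero weights, so the construction problem is

  \min_{w \ge 0} \left\| \mathcal{L}(w) - \mathcal{L} \right\| \quad \text{subject to} \quad \|w\|_0 \le M,

in some norm on log-likelihood functions. The simplest guarantee I know of is for importance sampling: draw M points with probabilities p_n = \sigma_n / \sigma, where \sigma_n = \|\mathcal{L}_n\| and \sigma = \sum_n \sigma_n, and reweight to keep the estimate unbiased; then with probability at least 1 - \delta the error is, up to constants, of the order

  \|\mathcal{L}(w) - \mathcal{L}\| \le \frac{\sigma}{\sqrt{M}} \left( 1 + \sqrt{2 \log \tfrac{1}{\delta}} \right),

i.e. it decays like 1/\sqrt{M}; as I understand it, the greedy constructions improve on this. Even one statement of this form would answer the "what size is needed" question.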
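
For the "one case in detail" suggestion, here is a minimal numpy sketch of the random/importance-sampling construction, just to show the level of detail I mean. It assumes each \mathcal{L}_n is represented by a finite-dimensional vector (for example, its values at a handful of samples from an approximate posterior), and the function name and interface below are my own invention, not taken from the draft or any particular library:

  import numpy as np

  def importance_sampling_coreset(vecs, M, seed=0):
      # vecs: (N, D) array; row n is a finite-dimensional stand-in for L_n,
      # so its Euclidean norm plays the role of ||L_n|| (assumed nonzero).
      rng = np.random.default_rng(seed)
      sigma_n = np.linalg.norm(vecs, axis=1)   # per-point norms ||L_n||
      sigma = sigma_n.sum()
      p = sigma_n / sigma                      # sampling probabilities
      counts = rng.multinomial(M, p)           # draw M points with replacement
      w = sigma * counts / (M * sigma_n)       # unbiased importance weights
      return w                                 # at most M nonzero entries

  # Toy usage: 1000 points, each represented by a 50-dimensional projection,
  # compressed to a weighted coreset of at most 30 points.
  X = np.random.default_rng(1).normal(size=(1000, 50))
  w = importance_sampling_coreset(X, M=30)
  print(int((w > 0).sum()), "of", len(w), "points kept")

The greedy methods replace the sampling step with an iterative selection, which (as I understand it) is where the stronger guarantees come from; that is the comparison I would like the section to spell out, along with any empirical results.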

I'd like to see some hints about the relevance for continual learning. Note that "retaining past experiences" is ambiguous; indeed the standard solution used in reinforcement learning is to keep retraining on past experiences that have been stored. But I don't think you mean that, even though that approach seems more closely related to Bayesian coresets than what you gave (I think). Are these proposals theoretical (with proofs), empirical (with experimental results), or just proposals? It would be nice to know. I'm thinking about a student who knows that you talked about catastrophic forgetting, and wants some idea of what is known.

Minor comments:

  • I kept reading it as "co-resets" not "core-sets". I'm not sure you can do anything about that.
  • You need a space after many of your equations.
  • I was trying to work out what \overline means in "streaming and parallel computation", but then realized it was a name constructor.

I don't think I'm suggesting a big change, essentially just one case in more detail.

DavidPoole (talk) 21:39, 12 February 2019