The multiplicative weights algorithm is simple to define and analyze, and it has several applications, but both its definition and its analysis seem to come out of nowhere. We mentioned that all the quantities arising in the algorithm and its analysis have statistical physics interpretations, but even this observation raises more questions than it answers. The Gibbs distribution, for example, does put more weight on lower-energy states, and so it makes sense in an optimization setting; to get good approximations, however, one wants to use lower temperatures, while the distributions used by the multiplicative weights algorithm have temperature ${1/\epsilon}$, where ${2\epsilon}$ is the final “amortized” regret bound, so that one uses, quite counterintuitively, higher temperatures for better approximations.
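To make the temperature trade-off concrete, here is a minimal numerical sketch. The `gibbs` helper and the particular loss values are illustrative choices of mine, not part of the discussion above; the point is only that, with weights proportional to ${e^{-\epsilon L_i}}$ (a Gibbs distribution at temperature ${1/\epsilon}$ over cumulative losses ${L_i}$), a lower temperature concentrates the distribution more sharply on the lowest-loss state:

```python
import math

def gibbs(energies, temperature):
    """Gibbs distribution: p_i proportional to exp(-energy_i / temperature).

    In the multiplicative-weights setting, energy_i plays the role of the
    cumulative loss of expert i, and the temperature is 1/epsilon.
    """
    weights = [math.exp(-e / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical cumulative losses ("energies") of three experts.
losses = [0.0, 1.0, 2.0]

# Lower temperature (larger epsilon) concentrates mass on the best expert...
cold = gibbs(losses, temperature=0.5)
# ...while higher temperature (smaller epsilon) spreads mass more evenly.
hot = gibbs(losses, temperature=5.0)

assert cold[0] > hot[0]  # the cold distribution favors the minimum more
```

This is what makes the regret bound counterintuitive from the physics viewpoint: a smaller ${\epsilon}$ means a better amortized regret, yet it corresponds to a *higher* temperature, i.e. to a flatter, less greedy distribution.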
Furthermore, it is not clear how we would generalize the ideas of multiplicative weights to the case in which the set of feasible solutions ${K}$ is anything other than the set of distributions.