Editorial
Welcome to the fifth issue of the Journal on Selected Areas In Information Theory (JSAIT), dedicated to 鈥淪equential, active, and reinforcement learning鈥.
Welcome to the fifth issue of the Journal on Selected Areas In Information Theory (JSAIT), dedicated to 鈥淪equential, active, and reinforcement learning鈥.
Gradient coding allows a master node to derive the aggregate of the partial gradients, calculated by some worker nodes over the local data sets, with minimum communication cost, and in the presence of stragglers. In this paper, for gradient coding with linear encoding, we characterize the optimum communication cost for heterogeneous distributed systems with arbitrary data placement, with $s \in \mathbb {N}$ stragglers and $a \in \mathbb {N}$ adversarial nodes.
In this paper we consider the problem of best-arm identification in multi-armed bandits in the fixed confidence setting, where the goal is to identify, with probability $1-\delta $ for some $\delta >0$ , the arm with the highest mean reward in minimum possible samples from the set of arms $\mathcal {K}$ . Most existing best-arm identification algorithms and analyses operate under the assumption that the rewards corresponding to different arms are independent of each other.
Actor-critic algorithm and their extensions have made great achievements in real-world decision-making problems. In contrast to its empirical success, the theoretical understanding of the actor-critic seems unsatisfactory. Most existing results only show the asymptotic convergence, which is developed mainly based on approximating the dynamic system of the actor and critic using ordinary differential equations. However, the finite-time convergence analysis of the actor-critic algorithm remains to be explored.
We study the best-arm identification problem in multi-armed bandits with stochastic rewards when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a successive elimination algorithm for strictly optimal best-arm identification, show that it is $\delta $ -PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors.
We construct and analyze active learning algorithms for the problem of binary classification with abstention, in which the learner has an additional option to withhold its decision on certain points in the input space. We consider this problem in the fixed-cost setting, where the learner incurs a cost $\lambda \in (0, 1/2)$ every time the abstain option is invoked. Our proposed algorithm can work with the three most commonly used active learning query models, namely, membership-query, pool-based, and stream-based models.