Section summary
We propose a novel semi-bandit feedback model based on pairwise influence (Section 4). Our feedback model is weaker than the edge-level feedback proposed by Chen et al. (2016) and Wen et al. (2017). Under this feedback, we formulate the IM semi-bandit as a linear bandit problem and propose a scalable LinUCB-based algorithm (Section 5). We bound the cumulative regret of this algorithm (Section 6) and show that our regret bound has the optimal dependence on the time horizon, is linear in the cardinality of the seed set, and, compared to the previous literature, has a better dependence on the size of the network. In Section 7, we describe how to construct features from the graph Laplacian eigenbasis and give a practical implementation of our algorithm. Finally, in Section 8, we empirically evaluate the proposed algorithm on a real-world network and show that it is statistically efficient and robust to the underlying diffusion model.
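To make the two ingredients above concrete, the following is a minimal sketch (not the paper's implementation) of how node features might be built from the graph Laplacian eigenbasis and plugged into a generic LinUCB-style ridge update. The toy path graph, the feature dimension `d`, and the exploration weight `alpha` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def laplacian_eigen_features(adj, d):
    """Features from the eigenvectors of the unnormalized Laplacian L = D - A.

    The eigenvectors for the d smallest eigenvalues are the smoothest
    functions on the graph; each node gets one d-dimensional feature row.
    """
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return vecs[:, :d]

# Toy 4-node path graph (hypothetical example, not from the paper).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = laplacian_eigen_features(A, d=2)

# Generic LinUCB-style update after observing one (feature, reward) pair.
dim = X.shape[1]
gram = np.eye(dim)        # ridge-regularized Gram matrix
b = np.zeros(dim)         # reward-weighted feature sum
x, r = X[0], 1.0          # pretend node 0 yielded reward 1
gram += np.outer(x, x)
b += r * x
theta = np.linalg.solve(gram, b)   # ridge estimate of the unknown parameter

# Upper confidence bounds: estimated mean plus an exploration bonus.
alpha = 1.0                        # assumed exploration weight
gram_inv = np.linalg.inv(gram)
ucb = X @ theta + alpha * np.sqrt(np.einsum('ij,jk,ik->i', X, gram_inv, X))
```

The UCB scores would then drive seed selection at each round; the paper's actual algorithm (Section 5) combines this linear-bandit machinery with the pairwise-influence feedback model.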
Model-Independent Online Learning for Influence Maximization
