Adaptive Stochastic Gradient Descent for Fast and Communication-Efficient Distributed Learning [2208.03134]