Achieving Human Parity on Visual Question Answering [2111.08896]