An Initial Exploration: Learning to Generate Realistic Audio for Silent Video [2308.12408]