Learning to Jointly Transcribe and Subtitle for End-to-End Spontaneous Speech Recognition [2210.07771]