The Program Testing Ability of Large Language Models for Code [2310.05727]