Classification-based detection and quantification of cross-domain data bias in materials discovery [2311.09891]