VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception [2501.00510]