Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining [2410.08102]