Privacy-preserved LLM Cascade via CoT-enhanced Policy Learning [2410.08014]