MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception [2211.10593]