GTR CTRL
"CTRL" (Controllable Text and Representation Learning) encompasses a range of research efforts focused on enhancing the controllability and fidelity of generative models across diverse domains, including image generation, video synthesis, and music composition. Current research emphasizes developing novel architectures and algorithms, such as diffusion models and transformers, to achieve finer-grained control over generated outputs while mitigating issues like hallucinations and inconsistencies. This work is significant for improving the reliability and usability of AI-generated content, impacting fields ranging from medical image analysis and robotics to creative content generation and autonomous driving.
Papers
CtrlA: Adaptive Retrieval-Augmented Generation via Probe-Guided Control
Huanshuo Liu, Hao Zhang, Zhijiang Guo, Kuicai Dong, Xiangyang Li, Yi Quan Lee, Cong Zhang, Yong Liu
Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain
Juntao Zhang, Shaogeng Liu, Kun Bian, You Zhou, Pei Zhang, Wenbo An, Jun Zhou, Kun Shao