Grounding Network

Grounding, in the context of artificial intelligence, refers to connecting a model's abstract representations (such as language or knowledge graphs) to the real world, with the aim of making the model's outputs more accurate and reliable. Current research seeks to improve grounding across modalities, including vision, audio, and text, often using large language models (LLMs) and multimodal architectures together with techniques such as instruction tuning, contrastive learning, and knowledge graph integration. This work matters for building more trustworthy and robust AI systems, with applications that include conversational agents, embodied robots, visual question answering, and anomaly detection.
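As a rough illustration of one technique named above, contrastive learning, the sketch below computes a one-directional (text-to-image) InfoNCE-style loss over paired embeddings. It is a minimal toy under stated assumptions, not the method of any particular paper: the function names `cosine` and `info_nce`, the temperature value, and the two-dimensional vectors are all hypothetical choices made for illustration, and every embedding is assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(text_embs, image_embs, temperature=0.1):
    """Mean cross-entropy of selecting the paired image for each text.

    Assumes text_embs[i] and image_embs[i] describe the same item, so
    every other image in the batch acts as a negative example.
    The temperature is an illustrative value, not a recommended setting.
    """
    n = len(text_embs)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(text_embs[i], image_embs[j]) / temperature
            for j in range(n)
        ]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(x - peak) for x in logits)
        )
        total += log_denominator - logits[i]
    return total / n


# Toy data: each text vector points roughly the same way as its image.
texts = [[1.0, 0.0], [0.0, 1.0]]
images = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce(texts, images)
mismatched_loss = info_nce(texts, list(reversed(images)))
print(aligned_loss < mismatched_loss)
```

The loss is lower when each text embedding lies closest to its own image embedding, which is the sense in which a contrastive objective ties a textual representation to a perceptual one. Practical systems typically compute this over learned encoder outputs in large batches and often add the symmetric image-to-text term.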

Papers