Low Latency
Low latency, the minimization of delay in information processing, is a critical objective across diverse fields, driving research into efficient algorithms and hardware architectures. Current efforts focus on optimizing large language models (LLMs) for faster inference through techniques such as speculative decoding and efficient GPU resource allocation, and on developing low-latency solutions for speech processing, image recognition, and other real-time tasks using spiking neural networks and specialized hardware such as FPGAs. Achieving low latency is crucial for real-time responsiveness in applications ranging from autonomous vehicles and interactive virtual reality to hearing aids and industrial IoT systems, where it directly affects performance and user experience.
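The paragraph above names speculative decoding as one latency-reduction technique for LLM inference. As a rough illustration only, the sketch below shows the core idea with toy stand-in models (draft_model, target_model, VOCAB, and speculative_decode_step are illustrative placeholders, not code from any listed paper): a cheap draft model proposes a short block of tokens, and the expensive target model verifies them, accepting each proposal with probability min(1, p_target/p_draft) and resampling from the residual distribution at the first rejection, so several tokens can be emitted per expensive verification round.

```python
import numpy as np

VOCAB = 32  # toy vocabulary size (illustrative)

def _dist(prefix, temperature):
    """Deterministic toy next-token distribution derived from the prefix."""
    seed = hash(tuple(prefix)) % (2**32)
    logits = np.random.default_rng(seed).standard_normal(VOCAB) / temperature
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def draft_model(prefix):
    # Cheap, lower-quality proposer (stand-in for a small LM).
    return _dist(prefix, temperature=2.0)

def target_model(prefix):
    # Expensive, higher-quality verifier (stand-in for the large LM).
    return _dist(prefix, temperature=1.0)

def speculative_decode_step(prefix, k=4, seed=0):
    """One round of speculative decoding:
    1. The draft model proposes k tokens autoregressively.
    2. The target model scores each proposed position (a single batched
       forward pass in a real implementation).
    3. Each proposal is accepted with prob min(1, p_target/p_draft); at the
       first rejection, a corrected token is sampled from the residual.
    (The full algorithm also samples one extra target token when all k
    proposals are accepted; omitted here for brevity.)
    """
    rng = np.random.default_rng(seed)

    # Step 1: draft k tokens cheaply.
    proposals, draft_probs = [], []
    ctx = list(prefix)
    for _ in range(k):
        q = draft_model(ctx)
        tok = rng.choice(VOCAB, p=q)
        proposals.append(tok)
        draft_probs.append(q[tok])
        ctx.append(tok)

    # Steps 2-3: verify with the target model.
    accepted = []
    ctx = list(prefix)
    for tok, q_tok in zip(proposals, draft_probs):
        p = target_model(ctx)
        if rng.random() < min(1.0, p[tok] / q_tok):
            accepted.append(tok)              # target agrees: keep the draft token
            ctx.append(tok)
        else:
            q = draft_model(ctx)
            residual = np.maximum(p - q, 0.0)  # correct the distribution mismatch
            residual /= residual.sum()
            accepted.append(int(rng.choice(VOCAB, p=residual)))
            break                              # stop at the first rejection
    return accepted

if __name__ == "__main__":
    print("tokens emitted this round:", speculative_decode_step(prefix=[1, 2, 3], k=4))
```

Because every accepted draft token is one fewer sequential call to the large model, the expected latency per generated token drops whenever the draft model's proposals are accepted often enough to offset its own (much smaller) cost.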
Papers
Towards Interactively Improving ML Data Preparation Code via "Shadow Pipelines"
Stefan Grafberger, Paul Groth, Sebastian Schelter
Deep low-latency joint speech transmission and enhancement over a Gaussian channel
Mohammad Bokaei, Jesper Jensen, Simon Doclo, Jan Østergaard
PEFSL: A deployment Pipeline for Embedded Few-Shot Learning on a FPGA SoC
Lucas Grativol Ribeiro, Lubin Gauthier, Mathieu Leonardon, Jérémy Morlier, Antoine Lavrard-Meyer, Guillaume Muller, Virginie Fresse, Matthieu Arzel