Sequence Similarity

Sequence similarity analysis quantifies the resemblance between sequences such as DNA, proteins, or source code in order to infer relationships, predict properties, or improve computational efficiency. Current research emphasizes new algorithms and models, including approaches based on abstract syntax trees, Gaussian process regression, and neural network embeddings (e.g., fitted with a Poisson regression objective), for tasks such as protein variant effect prediction, code error detection, and DNA storage optimization. These advances support more accurate biological predictions, better software development tools, and more efficient data storage and retrieval.
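
As a concrete illustration of what "quantifying the resemblance" can mean, the sketch below computes a normalized similarity score from the Levenshtein (edit) distance. This is one common baseline measure, chosen here as an assumption; it is not one of the methods named above, and the function names `levenshtein` and `similarity` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    # Keep the shorter sequence in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of `b`
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1], where 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sequences are treated as identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(similarity("ACGT", "AGGT"))
```

The same functions work on any string-like sequence, so a DNA read, a protein sequence, or a line of source code can be compared in the same way. Dividing by the longer length is only one possible normalization.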

Papers