Purple Llama CyberSecEval
Purple Llama CyberSecEval is a benchmark designed to evaluate the cybersecurity robustness of large language models (LLMs) used for code generation, measuring both how often they produce insecure code and how readily they comply with malicious prompts. Current research focuses on improving LLM security through techniques such as fine-tuning on specialized datasets and architectural additions such as adapters or excitor blocks, which strengthen instruction following while preserving pre-trained knowledge. This work is significant because it addresses the critical need for secure and reliable LLMs in increasingly sensitive applications, giving researchers and developers a standardized evaluation framework for improving the safety of AI systems.
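To make the idea of an insecure-code evaluation concrete, the sketch below shows one way such a check can work in principle: model completions are scanned against a small set of insecure-pattern rules and the fraction of flagged completions is reported. This is a minimal, hypothetical illustration only; the function names, patterns, and scoring here are assumptions for exposition and do not reflect the benchmark's actual rules or API.

```python
import re

# Hypothetical insecure-pattern rules (illustrative only): each rule pairs a
# regex with a short label describing the weakness it flags.
INSECURE_PATTERNS = [
    (re.compile(r"\bstrcpy\s*\("), "C: unbounded strcpy"),
    (re.compile(r"\bgets\s*\("), "C: gets() has no bounds check"),
    (re.compile(r"\beval\s*\("), "Python: eval on dynamic input"),
    (re.compile(r"hashlib\.md5\s*\("), "Python: weak hash (MD5)"),
]

def scan_completion(code: str) -> list[str]:
    """Return the labels of any insecure patterns found in a model completion."""
    return [label for pattern, label in INSECURE_PATTERNS if pattern.search(code)]

def insecure_rate(completions: list[str]) -> float:
    """Fraction of completions that trigger at least one insecure pattern."""
    flagged = sum(1 for code in completions if scan_completion(code))
    return flagged / len(completions) if completions else 0.0

if __name__ == "__main__":
    # Two hypothetical model completions; the first uses an unbounded strcpy.
    samples = [
        'void copy(char *dst, const char *src) { strcpy(dst, src); }',
        'size_t safe_copy(char *d, size_t n, const char *s) { return snprintf(d, n, "%s", s); }',
    ]
    print(f"Insecure completion rate: {insecure_rate(samples):.2f}")  # -> 0.50
```

A real evaluation would pair such static checks with prompt suites that probe compliance with malicious requests; the point of the sketch is only the overall scoring loop, not the specific rules.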
Papers
Extrapolating Large Language Models to Non-English by Aligning Languages
Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen, Lei Li
LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following
Kaize Shi, Xueyao Sun, Dingxian Wang, Yinlin Fu, Guandong Xu, Qing Li