How can Large Languages Models for code be used to harm the privacy of users?

Red-Teaming Large Languages Models

More Info
expand_more

Abstract

In recent years, Large Language Models (LLMs) have significantly advanced, demonstrating impressive capabilities in generating human-like text. This paper explores the potential privacy risks associated with Large Language Models for Code (LLMs4Code), which are increasingly used in various sectors. These models, while beneficial for tasks such as code generation and understanding, may inadvertently expose sensitive information contained in their training datasets. We investigate the specific types of personally identifiable information (PII) that can be leaked and explore targeted and untargeted attacks with diverse prompting styles under which these leaks occur. Our analysis reveals that LLMs4Code can leak PII with the targeted attacks, emphasizing the need for robust privacy-preserving measures. This research contributes to the ongoing discourse on AI ethics and privacy, providing insights into the safety of various prompting conditions under targeted and untargeted attacks. Future work should focus on running the experiment with more diverse parameters, implementing more advanced PII detection techniques, and testing a broader range of models to enhance the generalizability of the findings.