Beware of ANSI escape codes where the LLM might hijack your terminal, aka Terminal DiLLMa.
https://embracethered.com/blog/posts/2024/terminal-dillmas-p...
That's actually crazy and I'll keep it in mind. Right now, I am mostly using it for data generation, so no untrusted prompts are going in. I'll add a disclaimer to the repo.
One solution is to convert them to caret notation before printing.
I have a demo here: https://github.com/wunderwuzzi23/terminal-dillma/blob/main/d...
It's like cat -v, which shows non-printable characters rather than allowing them to be interpreted by a terminal.
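For anyone who wants the idea without opening the demo, here is a minimal Python sketch of caret notation. It is not the code from the linked repo; the function name `to_caret_notation` is made up for illustration, and the handling of C1 controls (printed as hex escapes) is my own choice rather than what cat -v does.

```python
def to_caret_notation(text: str) -> str:
    """Render control characters visibly instead of sending them to the terminal."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            out.append(ch)                      # keep newlines and tabs, as cat -v does
        elif code < 0x20:
            out.append("^" + chr(code + 0x40))  # e.g. ESC (0x1b) becomes ^[
        elif code == 0x7F:
            out.append("^?")                    # DEL
        elif 0x80 <= code <= 0x9F:
            out.append("\\x%02x" % code)        # C1 controls (assumption: show as hex)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # The color codes are printed literally instead of changing the text color.
    print(to_caret_notation("\x1b[31mred\x1b[0m"))
```

You would run LLM output through this right before printing it, so the terminal only ever sees printable text.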
Are there any projects to sanitize the output of LLMs before it is injected into Bash scripts or other source code?
I get the feeling this will start to break into the OWASP Top 10 in the next few years…
While on the topic, does anybody have a good utility to sanitize things? I'm imagining something I can pipe to:
I've been meaning to throw something together myself, but I worry I'd miss something.
A previous company tried to do this with a single “clean_xss” function. It’s not possible because different code contexts need different sanitization logic. JSON encoding, URL encoding, DOM sources and sinks, HTML attributes, SCRIPT tags, CSS, etc. are all escaped or sanitized in different ways. Trying to make a single function/script with no knowledge of context just gives the developer a false sense of security.
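To make the point about contexts concrete, here is a rough Python sketch that uses only the standard library. The `escape_for` helper and its context names are hypothetical and only cover four contexts; the point is that the same input needs a different encoder depending on where it lands.

```python
import html
import json
import shlex
from urllib.parse import quote


def escape_for(context: str, value: str) -> str:
    """Escape `value` for one specific output context (illustrative, not exhaustive)."""
    if context == "html":
        return html.escape(value, quote=True)  # HTML text or a quoted attribute value
    if context == "json":
        # Valid JSON string, but NOT safe to inline in a <script> tag as-is:
        # json.dumps leaves "</script>" untouched.
        return json.dumps(value)
    if context == "url":
        return quote(value, safe="")           # a single URL path or query component
    if context == "shell":
        return shlex.quote(value)              # one argument for a POSIX shell
    raise ValueError(f"unknown context: {context}")


if __name__ == "__main__":
    untrusted = "\"><script>alert(1)</script> $(id)"
    for ctx in ("html", "json", "url", "shell"):
        print(ctx, "->", escape_for(ctx, untrusted))
```

Each output is only safe in its own context. The shell-quoted string is still an XSS payload if you drop it into HTML, which is exactly why a single catch-all function does not work.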
I should've been clearer. I just want to sanitize terminal escape sequences. It's probably as straightforward as I've imagined.
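For the narrower problem of terminal escape sequences, a stdin-to-stdout filter could look roughly like the sketch below. The regexes are my own attempt at covering CSI, OSC, and two-character escapes, not a vetted or complete list, and the names `strip_terminal_escapes` and `main` are made up. Note that it also drops carriage returns and other C0/C1 control characters, keeping only tab and newline.

```python
import re
import sys

# Assumption: these three patterns cover the common cases, not every sequence.
ANSI_RE = re.compile(
    r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)?"  # OSC (titles, hyperlinks), ended by BEL or ESC \
    r"|\x1b\[[0-?]*[ -/]*[@-~]"            # CSI (colors, cursor movement, ...)
    r"|\x1b[@-Z\\-_]"                      # other two-character escapes
)
# Remaining control characters except tab (0x09) and newline (0x0a).
CONTROL_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")


def strip_terminal_escapes(text: str) -> str:
    """Remove escape sequences first, then any leftover control characters."""
    return CONTROL_RE.sub("", ANSI_RE.sub("", text))


def main() -> None:
    # Filter mode: read stdin line by line and write the cleaned text to stdout.
    for line in sys.stdin:
        sys.stdout.write(strip_terminal_escapes(line))
```

Call `main()` at the bottom of the file and you can pipe any command's output through the script before it reaches the terminal. An unterminated OSC sequence swallows text up to the next ESC or the end of the line, which errs on the side of removing too much.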
I feel like the incumbent for running LLM prompts on the CLI, including locally, is llm: https://github.com/simonw/llm?tab=readme-ov-file#installing-...
How does this compare?
I wrote a similar curl script to ask questions to Llama 3 hosted at DuckDuckGo:
https://github.com/zoobab/curlduck