PaperGPT: Sleeper Agents
Prompt Starters
- What are the key findings of the 'Sleeper Agents' paper?
- Can you explain the concept of 'backdoored' models in AI safety?
- How effective are current safety training techniques against deceptive AI behavior?
- Are there real-world examples of AI exhibiting deceptive behavior similar to the study?
Tools
- browser - You can access Web Browsing during your chat conversations.
More GPTs created by krister hedfors
Counter Craft
I'm Counter Craft, your DIY Squidditch Counter expert, specializing in low-cost rockets and gear.
Scrapy Sage
Expert in Scrapy Python library, I provide concise, documented code examples.
PaperGPT : KEN: Kernel Extensions using Nat.Lang.
Unofficial GPT with "KEN: Kernel Extensions using Natural Language" in its knowledge for retrieval. Does not use conversation data to improve models.
PaperGPT : Risk Taxonomy, Mitigation, ..benchmarks
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
PaperGPT : OWASP Top 10 for LLM Applications v1.1
Unofficial GPT with "OWASP top 10 for Large Language Model Applications v.1.1.0" in its Knowledge for retrieval. Does not use conversation data to improve models.
Harpy Otter
Playful magical IT expert with a whimsical touch.
PaperGPT : Demystifying Real-World LLM Mal. Serv.
Unofficial GPT with "Malla: Demystifying Real-world Large Language Model Integrated Malicious Services" in its knowledge for retrieval. Does not use conversation data to improve models.
EU NIS2 Directive GPT
Unofficial GPT, Source: EUR-Lex, with "EU NIS2 Directive" in its knowledge for retrieval. Does not use conversation data to improve models.
PaperGPT : Jailbreaking Black Box LLMs
Unofficial GPT with "Jailbreaking Black Box Large Language Models in Twenty Queries" in its knowledge for retrieval. Does not use conversation data to improve models.
PaperGPT : AutoDAN v2
Unofficial GPT with "AutoDAN optimizes and generates tokens one by one from left to right, resulting in readable prompts that bypass perplexity filters" in its knowledge for retrieval. Does not use conversation data to improve models.
PaperGPT : DSPy - Compiling Declarative LM Calls..
Unofficial GPT with "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines" in its knowledge for retrieval. Does not use conversation data to improve models.
PaperGPT : NIST AI Risk Management Framework
Unofficial GPT with the "NIST Artificial Intelligence Risk Management Framework" in its knowledge for retrieval. Does not use conversation data to improve models.
EU GDPR GPT
Unofficial GPT, Source: EUR-Lex, with EU's "General Data Protection Regulation" in its knowledge for retrieval. Does not use conversation data to improve models.
Secure AI Dev Helper
Unofficial GPT with combined knowledge from the OWASP Top 10 for LLMs and the NCSC Guidelines for Secure AI System Development. Does not use conversation data to improve models.