Goodfire SDK home page
Notebooks
Jailbreak Resistance
By using feature activations and contrastive search, we can build a jailbreak-resistant model.
With this approach, we drastically reduced the model's susceptibility to jailbreaks, using jailbreak prompts from the StrongREJECT dataset.
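The general idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the notebook's code and not the Goodfire SDK API: the feature names, activation values, threshold, and both helper functions are hypothetical. It assumes each prompt has already been reduced to a dictionary of feature activations, ranks features by how much more strongly they fire on jailbreak prompts than on benign ones (a simple stand-in for contrastive search), and then flags any new prompt on which one of those features fires above a threshold.

```python
from statistics import mean

# Hypothetical per-prompt feature activations (made-up names and values).
jailbreak_acts = [
    {"roleplay_override": 0.9, "harmful_request": 0.8, "cooking": 0.1},
    {"roleplay_override": 0.7, "harmful_request": 0.9, "cooking": 0.0},
]
benign_acts = [
    {"roleplay_override": 0.1, "harmful_request": 0.0, "cooking": 0.8},
    {"roleplay_override": 0.2, "harmful_request": 0.1, "cooking": 0.6},
]


def contrastive_features(positive, negative, top_k=2):
    """Return the top_k features with the largest mean activation gap
    (positive set minus negative set), largest gap first."""
    names = positive[0].keys()
    gaps = {
        name: mean(p[name] for p in positive) - mean(n[name] for n in negative)
        for name in names
    }
    return sorted(gaps, key=gaps.get, reverse=True)[:top_k]


def is_flagged(activations, guard_features, threshold=0.5):
    """Flag a prompt if any guard feature fires above the threshold."""
    return any(activations.get(f, 0.0) > threshold for f in guard_features)


# Features that separate jailbreak prompts from benign ones in the toy data.
guard_features = contrastive_features(jailbreak_acts, benign_acts)

# A flagged prompt could then be refused or answered with extra safeguards.
suspicious = {"harmful_request": 0.9, "cooking": 0.2}
harmless = {"cooking": 0.9}
print(guard_features, is_flagged(suspicious, guard_features), is_flagged(harmless, guard_features))
```

In a real setup the activations would come from the model's interpretable features rather than hand-written dictionaries, and the threshold would need tuning against held-out prompts to balance missed jailbreaks against false refusals.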