Prerequisite: You’ll need a Goodfire API key to follow this guide. Get one through our platform or contact support.
Quickstart
Ember is a hosted API/SDK that lets you shape AI model behavior by directly controlling a model’s internal units of computation, or “features”. With Ember, you can modify features to precisely control model outputs, or use them as building blocks for tasks like classification. In this quickstart, you’ll learn how to:
- Find features that matter for your specific needs
- Edit features to create model variants
- Discover which features are active in your data
- Save and load your model variants
Initialize the SDK
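A minimal sketch of initialization, under stated assumptions: the API key lives in an environment variable named GOODFIRE_API_KEY (the variable name and both helper functions are illustrative, not part of the SDK), and the client is built with goodfire.Client(api_key=...). Check the SDK reference for the exact constructor before relying on it.

```python
import os


def load_api_key(env_var: str = "GOODFIRE_API_KEY") -> str:
    """Read the API key from the environment (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running this guide.")
    return key


def make_client(api_key: str):
    """Build an SDK client; assumes the `goodfire` package is installed."""
    import goodfire  # imported lazily so load_api_key works without the SDK

    return goodfire.Client(api_key=api_key)
```

Keeping the key out of source files means notebooks and scripts can be shared without leaking credentials.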
Editing features to create model variants
How to find relevant features for edits
There are three ways to find features you may want to modify:
- Auto Steer: Simply describe what you want, and let the API automatically select and adjust feature weights
- Feature Search: Find features using semantic search
- Contrastive Search: Identify relevant features by comparing two different datasets
Auto Steer
Auto steering automatically finds and adjusts feature weights to achieve your desired behavior. Simply provide a short prompt describing what you want, and autosteering will:
- Find the relevant features
- Set appropriate feature weights
- Return a FeatureEdits object that you can set directly
Feature search
Let’s reset the model to its default state (without any feature edits).
(Advanced) Look at a feature’s nearest neighbors
Get neighboring features by comparing them to either individual features or groups of features. When comparing to individual features, neighbors() looks at similarity in the embedding space. When comparing to groups, neighbors() finds features closest to the group’s centroid.
neighbors() helps you understand feature relationships beyond just their labels. It can reveal which features might work best for your intended model adjustments.
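To make the two comparison modes concrete, here is a small conceptual sketch in plain Python. It is not the SDK's implementation: the embeddings, the feature labels, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest(query, embeddings, exclude=(), top_k=1):
    """Rank features by similarity to `query`, skipping names in `exclude`."""
    scored = [(name, cosine(query, vec))
              for name, vec in embeddings.items() if name not in exclude]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional feature embeddings.
EMBEDDINGS = {
    "pirate speech": [0.9, 0.1, 0.0],
    "nautical terms": [0.8, 0.3, 0.1],
    "jokes and puns": [0.1, 0.9, 0.2],
    "tongue twisters": [0.2, 0.8, 0.3],
    "wordplay": [0.3, 0.7, 0.1],
    "formal tone": [0.0, 0.1, 0.9],
}

# Single feature: compare against that feature's own embedding.
single = nearest(EMBEDDINGS["pirate speech"], EMBEDDINGS,
                 exclude={"pirate speech"})

# Group: compare against the centroid of the group's embeddings.
group = ["jokes and puns", "tongue twisters"]
grouped = nearest(centroid([EMBEDDINGS[n] for n in group]), EMBEDDINGS,
                  exclude=set(group))
```

In this toy data the single-feature query surfaces a related nautical feature, while the group query surfaces a feature close to the shared theme of the group rather than to either member alone.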
Contrastive Search
Contrastive search lets you discover relevant features in a data-driven way. Provide two datasets of chat examples:
- dataset_1: Examples of behavior you want to avoid
- dataset_2: Examples of behavior you want to encourage
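The idea of comparing two datasets can be sketched with a toy scoring rule. The source does not say how the API scores features, so the rule here (difference in mean activation) is an assumption, and the activation values and feature labels are made up.

```python
from statistics import mean


def contrast(acts_1, acts_2, top_k=2):
    """Return (features typical of dataset_1, features typical of dataset_2).

    Each argument is a list of {feature_label: activation} dicts, one per
    example. Missing features count as an activation of 0.0.
    """
    names = {name for example in acts_1 + acts_2 for name in example}
    diff = {
        name: mean(ex.get(name, 0.0) for ex in acts_2)
        - mean(ex.get(name, 0.0) for ex in acts_1)
        for name in names
    }
    # Sort by difference, breaking ties by label so the order is deterministic.
    ranked = sorted(diff, key=lambda name: (diff[name], name))
    return ranked[:top_k], ranked[::-1][:top_k]


# Hypothetical per-example activations.
plain_answers = [
    {"helpful assistant": 0.9, "jokes": 0.1},
    {"helpful assistant": 0.8, "formal tone": 0.5},
]
joke_answers = [
    {"jokes": 0.9, "punchline": 0.7, "helpful assistant": 0.3},
    {"jokes": 0.8, "punchline": 0.6, "helpful assistant": 0.2},
]

to_avoid, to_encourage = contrast(plain_answers, joke_answers)
```

The first list holds features that fire more in the examples to avoid, and the second holds features that fire more in the examples to encourage.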
Reranking
Contrastive search becomes more powerful when combined with reranking. First, contrastive search finds features that distinguish between your datasets. Then, reranking sorts these features using your description of the desired behavior. This two-step process ensures you get features that are both:
- Mechanistically useful (from contrastive search)
- Aligned with your goals (from reranking)
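The second step can be illustrated with a stand-in relevance score. This sketch uses plain word overlap between a feature label and the query, which is only an assumption for demonstration; the candidate labels are invented, and the real reranker's scoring is not described in this guide.

```python
def overlap_score(label, query):
    """Jaccard overlap between the words of a label and the words of a query."""
    a, b = set(label.lower().split()), set(query.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def rerank(candidates, query, top_k=2):
    """Reorder contrastive-search candidates by relevance to the query.

    sorted() is stable, so candidates with equal scores keep the order
    that contrastive search gave them.
    """
    ordered = sorted(candidates,
                     key=lambda label: overlap_score(label, query),
                     reverse=True)
    return ordered[:top_k]


# Hypothetical labels returned by a contrastive search.
candidates = [
    "end of sentence punctuation",
    "setup of a joke",
    "funny punchline delivery",
    "list formatting",
]
best = rerank(candidates, "funny joke")
```

Features that merely separate the datasets (punctuation, formatting) drop out, while those matching the stated goal move to the top.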
joke_features.
Note that we could also explore removing some of the helpful_assistant features.
(Advanced) Conditional logic for feature edits
You can establish relationships between different features (or feature groups) using conditional interventions. First, let’s reset the variant and pick out the funny features.
Abort when Pirate Features are too strong
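As a rough illustration of the logic behind a conditional intervention, the sketch below watches one feature group, applies edits once it passes a lower threshold, and aborts once it passes a higher one. Every name here (GenerationAborted, conditional_edits, the thresholds, the feature labels) is hypothetical, and using the maximum activation to summarize a group is an assumption, not the SDK's documented behavior.

```python
class GenerationAborted(Exception):
    """Raised when a watched feature group activates too strongly."""


def group_activation(activations, group):
    """Strongest activation among the features in `group` (0.0 if none fire)."""
    return max((activations.get(name, 0.0) for name in group), default=0.0)


def conditional_edits(activations, watched, apply_above, abort_above, edits):
    """Return the edits to apply, or raise if the watched group is too strong."""
    level = group_activation(activations, watched)
    if level > abort_above:
        raise GenerationAborted(f"watched features reached {level:.2f}")
    return dict(edits) if level > apply_above else {}


# Hypothetical feature labels and edit weights.
pirate_features = ["pirate speech", "nautical slang"]
funny_edits = {"jokes and puns": 0.6}

mild = conditional_edits({"pirate speech": 0.5}, pirate_features,
                         apply_above=0.3, abort_above=0.75, edits=funny_edits)
quiet = conditional_edits({"pirate speech": 0.1}, pirate_features,
                          apply_above=0.3, abort_above=0.75, edits=funny_edits)
```

Callers would wrap generation in a try/except for the abort exception and fall back to a default response when it fires.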
Auto Conditional
Discover which features are active in your data
Working with a conversation context
You can inspect which features are activating in a given conversation with the inspect API, which returns a context object.
Say you want to understand what model features are important when the model tells a joke. You can pass in the same joke conversation dataset to the inspect endpoint.
The output lists the top k activating features in the context, ranked by activation strength. There are features related to jokes and tongue twisters, among other syntactic features.
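Ranking by activation strength can be pictured with a short sketch. It assumes per-token activations are summed per feature before ranking; the aggregation rule, the function name, and the values are illustrative and may differ from what the context object actually does.

```python
from collections import Counter


def top_features(token_activations, k=3):
    """Sum each feature's activation over all tokens and return the top k.

    `token_activations` is a list with one {feature_label: activation}
    dict per token. Ties are broken alphabetically by label.
    """
    totals = Counter()
    for token in token_activations:
        for name, value in token.items():
            totals[name] += value
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical activations for a three-token joke snippet.
tokens = [
    {"jokes": 0.4, "question marks": 0.2},
    {"jokes": 0.5, "tongue twisters": 0.3},
    {"tongue twisters": 0.4},
]
ranked = top_features(tokens, k=2)
```

Raising k surfaces weaker, often more syntactic features alongside the topical ones.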
(Advanced) Look at next token logits
Get feature activation vectors for machine learning tasks
To run a machine learning pipeline at the feature level (for instance, for humor detection), you can directly export features using client.features.activations to get a matrix, or retrieve a sparse vector for a specific FeatureGroup.
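Once activations are exported, a downstream pipeline usually needs fixed-length vectors. The sketch below assumes sparse vectors arrive as {feature_index: activation} mappings, one per token, and mean-pools them into a single vector per conversation; the data layout and helper names are assumptions, not the SDK's return types.

```python
def to_dense(sparse, num_features):
    """Expand a {feature_index: activation} mapping into a dense list."""
    dense = [0.0] * num_features
    for index, value in sparse.items():
        dense[index] = value
    return dense


def mean_pool(matrix):
    """Average a (tokens x features) matrix over tokens."""
    return [sum(column) / len(matrix) for column in zip(*matrix)]


# Hypothetical sparse activations for a two-token input over 4 features.
sparse_rows = [{0: 1.0, 3: 0.5}, {3: 1.5}]
matrix = [to_dense(row, num_features=4) for row in sparse_rows]
pooled = mean_pool(matrix)
```

The pooled vector can then be fed to any standard classifier, for example to separate humorous from non-humorous conversations.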
Inspecting specific features
There may be specific features whose activation patterns you’re interested in exploring. In this case, you can specify features such as humor_features and pass them into the features argument of inspect.
context. This might be a more interesting set of features for downstream tasks.
Save and load your model variants
You can serialize a variant to JSON format for saving.
Using OpenAI SDK
You can also work directly with the OpenAI SDK for inference since our endpoint is fully compatible.
Install OpenAI
OpenAI SDK
For more advanced usage and detailed API reference, check out our SDK reference and example notebooks.