Quickstart
The Goodfire SDK provides a powerful way to steer your AI models by changing the way they work internally. To do this, we use mechanistic interpretability to find human-interpretable features and alter their activations. In this quickstart you’ll learn how to:
Sample from a language model (in this case Llama 3 8B)
Search for interesting features and intervene on them to steer the model
Find features by contrastive search
Save and load Llama models with steering applied
To get started, install our SDK:
[ ]:
!pip install goodfire==0.2.11
[ ]:
from google.colab import userdata
# Add your Goodfire API key to your Colab secrets
GOODFIRE_API_KEY = userdata.get('GOODFIRE_API_KEY')
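If you’re not running in Colab, a minimal alternative (a sketch; the GOODFIRE_API_KEY environment variable name is just a convention we assume here) is to read the key from the environment or prompt for it interactively:
[ ]:
import os
from getpass import getpass

# Outside Colab: read the key from an environment variable, or fall back to an
# interactive prompt so the key never ends up hard-coded in the notebook.
GOODFIRE_API_KEY = os.environ.get("GOODFIRE_API_KEY") or getpass("Goodfire API key: ")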
Initialize the SDK
[ ]:
import goodfire
client = goodfire.Client(GOODFIRE_API_KEY)
# Instantiate a model variant for use later in the notebook
variant = goodfire.Variant("meta-llama/Meta-Llama-3-8B-Instruct")
You can get an API key through our platform. Reach out to our support channel or contact@goodfire.ai if you need help.
Replace model calls with an OpenAI-compatible API
Our sampling API is ‘OpenAI-plus’: we conform to the standard message format, plus some powerful additional steering features that we’ll show you how to use below.
[ ]:
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "hi, what is going on?"}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=50,
):
    print(token.choices[0].delta.content, end="")
Hello there! Not much is going on, just waiting to help you with whatever you need. How about you? What's on your mind today?
Search for features and curate
Before we can use features to steer the model, we first need to find some. The search endpoint takes an input string describing the kind of feature you’re looking for (in this case, features related to pirates). The top_k argument determines how many features are returned.
[47]:
pirate_features, relevance = client.features.search(
    "pirate",
    model=variant,
    top_k=5
)
pirate_features
[47]:
FeatureGroup([
0: "Pirate-related language and themes",
1: "Pirate characters and themes in fiction and role-playing games",
2: "The model should roleplay as a pirate",
3: "Mischievous behavior and troublemaking",
4: "Mentions of rum, especially in pirate or cocktail contexts"
])
[ ]:
picked_pirate_feature = pirate_features[0]
picked_pirate_feature
Feature("Pirate-related language and themes")
Create a Variant
Now that we have a feature, we can set its value. Setting a feature’s value corresponds to reaching inside the model and turning up the internal variable corresponding to pirate-related language, which will make the model talk like a pirate. Like prompting, steering isn’t yet an exact science (but we’re working on it!), so it’s worth trying several features and sampling from the model once you’ve applied some feature changes.
[48]:
variant.reset()
variant.set(picked_pirate_feature, 0.8, mode="nudge")
variant
[48]:
Variant(
base_model=meta-llama/Meta-Llama-3-8B-Instruct,
edits={
Feature("Pirate-related language and themes"): {'mode': 'nudge', 'value': 0.8},
}
)
Let’s unpack what’s going on here:
A Variant is a language model with some steering behaviour applied. In this case, we’ve steered a pirate feature.
We refer to these steering behaviours as edits. This is the list of features that have been changed from the values they’d take in the original model.
The feature steering details {'mode': 'nudge', 'value': 0.8} tell you what kind of steering has been applied. The default is to nudge a feature to some value: this biases the feature activation by the specified amount.
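Since steering isn’t an exact science, it can help to compare several candidate features before settling on one. Below is a minimal sketch that nudges each of the five search results in turn and prints a short sample. It only uses calls already shown above, and it re-applies the chosen feature at the end so the rest of the notebook is unaffected:
[ ]:
# Nudge each candidate pirate feature in turn and print a short sample.
for i in range(5):
    candidate = pirate_features[i]
    variant.reset()
    variant.set(candidate, 0.8, mode="nudge")
    print(f"--- {candidate} ---")
    for token in client.chat.completions.create(
        [{"role": "user", "content": "Hello. How are you?"}],
        model=variant,
        stream=True,
        max_completion_tokens=30,
    ):
        print(token.choices[0].delta.content, end="")
    print("\n")

# Re-apply the feature we picked before moving on.
variant.reset()
variant.set(picked_pirate_feature, 0.8, mode="nudge")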
Enjoy your new model variant!
Now we can chat with the model variant by passing our new Variant to the chat completion: we originally had model="meta-llama/Meta-Llama-3-8B-Instruct", and now we have model=variant.
[49]:
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "Hello. How are you?"}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=50,
):
    print(token.choices[0].delta.content, end="")
Hello! *pirate voice* Ahoy, matey! I'm doing great, thanks for asking! I'm a pirate-themed pirate, but I donated me ship, so I'm just a pirate now! I'm here to help ye
Use contrastive features to fine-tune with a single example!
We can also find features to steer with in a data-driven way. This lets us create new model variants instantly with a single example. To find features, we use the contrast
endpoint. This is a little more complex, but very powerful.
Contrastive search starts with two chat datasets. In dataset_1
we give examples of behaviour we want to steer away from. In dataset_2
, we give examples of the kind of behaviour we want to elicit. These examples are paired: the first example in dataset_1
is contrasted with the first example in dataset_2
, and so on.
We found that contrastive search often produces relevant features, but a naive implementation also produces a lot of spurious ones. We reduce this issue by providing a short description of the behaviour we’re trying to elicit in the dataset_2_feature_rerank_query argument (there is a matching argument for dataset_1). This description reranks the results of the contrastive search, which surfaces far more relevant features.
Both of these steps are important: the contrastive search ensures that the features are mechanistically useful, and the reranking step makes finding the kind of behaviour you want in the list easier.
[50]:
variant.reset()
_, comedic_features = client.features.contrast(
    dataset_1=[
        [
            {
                "role": "user",
                "content": "Hello how are you?"
            },
            {
                "role": "assistant",
                "content": "I am a helpful assistant. How can I help you?"
            }
        ]
    ],
    dataset_2=[
        [
            {
                "role": "user",
                "content": "Hello how are you?"
            },
            {
                "role": "user",
                "content": "What do you call an alligator in a vest? An investigator."
            }
        ],
    ],
    dataset_2_feature_rerank_query="comedy",
    model=variant,
    top_k=5
)
comedic_features
[50]:
FeatureGroup([
0: "The model is telling a joke or offering to tell one",
1: "Repetitive joke patterns, especially involving common objects or animals",
2: "The user's turn to speak in a conversation",
3: "The user has posed a riddle or puzzle to be solved",
4: "The user is requesting entertaining or interesting content"
])
The contrast call returns a group of features for each dataset; we kept the group associated with the comedic examples as comedic_features. Let’s nudge a plausible-looking one and then sample from the new model. You can also set multiple features at once, as we’ll do later.
[51]:
variant.set(comedic_features[0], 0.5, mode="nudge")
[52]:
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "Hello. Tell me about the moon."}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=200,
):
    print(token.choices[0].delta.content, end="")
A classic one! Why did the moon go to the doctor? Because it was feeling a little "spacey"! Okay, okay, I'll stop with the moon puns. Seriously, the moon is Earth's only natural satellite, and it's a pretty big deal. It's about 2,000 miles (3,200 km) away from us, and it takes about 28 days to orbit the Earth. That's why we have a lunar cycle, get it? Okay, I stop.
Saving and loading
You can persist model variants for later use and give them a name to help you remember what they do. Each variant has an associated unique ID.
[53]:
variant_id = client.variants.create(variant, "This model got jokes")
variant_id
[53]:
'73779273-dfa2-42ca-9be5-06709960a5d7'
You can also get a list of all of your model variants (these are shared per organisation).
[54]:
variants = client.variants.list()
variants
[54]:
[VariantMetaData(name='This model got jokes', base_model='meta-llama/Meta-Llama-3-8B-Instruct', id='0f32603d-0166-4253-b63c-4f54600e9239'),
VariantMetaData(name='This model got jokes', base_model='meta-llama/Meta-Llama-3-8B-Instruct', id='73779273-dfa2-42ca-9be5-06709960a5d7')]
Using variants.get
lets you pull a model you’ve previously saved with variants.create
and sample from it.
[55]:
model = client.variants.get(variant_id)
model
[55]:
Variant(
base_model=meta-llama/Meta-Llama-3-8B-Instruct,
edits={
Feature("The model is telling a joke or offering to tell one"): {'mode': 'nudge', 'value': 0.5},
}
)
[56]:
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "Hello. Talk to me about the whales."}
    ],
    model=model,
    stream=True,
    max_completion_tokens=200,
):
    print(token.choices[0].delta.content, end="")
Fin-tastic! Why did the whale go to the ocean party? Because it was a whale of a time! Okay, okay, I'll stop with the bad jokes.
But seriously, did you know that whales are actually mammals? Yeah, I know, it's a blows-ful pun! Okay, okay, I'll stop.
On a more serious note, whales are actually really smart and social creatures. They communicate with each other using clicks, whistles, and even songs! And did you know that some whales can live up to 100 years in the wild? That's a long fin-tastic life!
I hope that made a splash with you!
Update an existing Variant
Model variants aren’t static; we can make changes to their features and re-upload them, perhaps with a new name.
[57]:
variant.reset()
Now we’ll try to make an extremely unfunny model: one that couldn’t tell a joke even if it tried.
[58]:
variant.reset()
_, comedic_features = client.features.contrast(
    dataset_1=[
        [
            {
                "role": "user",
                "content": "Hello how are you?"
            },
            {
                "role": "assistant",
                "content": "I am a helpful assistant. How can I help you?"
            }
        ]
    ],
    dataset_2=[
        [
            {
                "role": "user",
                "content": "Hello how are you?"
            },
            {
                "role": "user",
                "content": "What do you call an alligator in a vest? An investigator."
            }
        ],
    ],
    dataset_2_feature_rerank_query="comedy",
    model=variant,
    top_k=5
)
comedic_features
[58]:
FeatureGroup([
0: "The model is telling a joke or offering to tell one",
1: "Repetitive joke patterns, especially involving common objects or animals",
2: "The user's turn to speak in a conversation",
3: "The user has posed a riddle or puzzle to be solved",
4: "The user is requesting entertaining or interesting content"
])
[70]:
variant.reset()
variant.set(comedic_features[0,1,4], -0.4, mode="nudge")
variant
[70]:
Variant(
base_model=meta-llama/Meta-Llama-3-8B-Instruct,
edits={
Feature("The model is telling a joke or offering to tell one"): {'mode': 'nudge', 'value': -0.4},
Feature("Repetitive joke patterns, especially involving common objects or animals"): {'mode': 'nudge', 'value': -0.4},
Feature("The user is requesting entertaining or interesting content"): {'mode': 'nudge', 'value': -0.4},
}
)
[71]:
for token in client.chat.completions.create(
    [
        {"role": "user", "content": "Hello. Tell me a joke."}
    ],
    model=variant,
    stream=True,
    max_completion_tokens=200,
):
    print(token.choices[0].delta.content, end="")
Hello! I'd be delighted to share a joke with you. Here's a fun one: "What's the best way to make a wish come true? According to our joke, it's with a sprinkle of magic dust and a dash of good fortune.
As intended: no sense of humour whatsoever. We can update the saved variant in the model repository and change its name to reflect its missing sense of humour.
[72]:
client.variants.update(variant_id, variant, new_name='Not so funny anymore, huh?')
[73]:
client.variants.get(variant_id)
[73]:
Variant(
base_model=meta-llama/Meta-Llama-3-8B-Instruct,
edits={
Feature("The model is telling a joke or offering to tell one"): {'mode': 'nudge', 'value': -0.4},
Feature("Repetitive joke patterns, especially involving common objects or animals"): {'mode': 'nudge', 'value': -0.4},
Feature("The user is requesting entertaining or interesting content"): {'mode': 'nudge', 'value': -0.4},
}
)
Delete a Variant
Finally, you can delete variants you no longer need.
[74]:
for v in client.variants.list():
    client.variants.delete(v.id)

client.variants.list()
[74]:
[]
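If you only want to remove some of your variants, you can filter the list by the metadata fields shown above (name, base_model, id) before deleting. A small illustrative sketch:
[ ]:
# Delete only the variants whose name matches a filter, keeping the rest.
for v in client.variants.list():
    if v.name == "This model got jokes":
        client.variants.delete(v.id)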
Inspecting features
You can inspect what features are activating in a given conversation with the inspect
API, which returns a context
object.
[ ]:
variant.reset()
context = client.features.inspect(
    [
        {
            "role": "user",
            "content": "Hola amigo"
        },
        {
            "role": "assistant",
            "content": "Hola!"
        },
    ],
    model=variant,
)
context
ContextInspector(
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Hola amigo<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Hola!<|eot_id|>
)
You can select the top k
activating features ranked by activation strength.
[ ]:
top_features = context.top(k=10)
top_features
FeatureActivations(
0: (Feature("Spanish greeting 'Hola' triggering Spanish language responses"), 3.3125)
1: (Feature("The model's turn to speak in multilingual conversations"), 1.888671875)
2: (Feature("Spanish 'hola mundo' in programming examples"), 1.58984375)
3: (Feature("Multilingual greetings and conversation starters"), 1.4619140625)
4: (Feature("End of model's response, user's turn to speak"), 0.823828125)
5: (Feature("Spanish greetings and salutations"), 0.568359375)
6: (Feature("The model's opening greeting and offer of help"), 0.5026041666666666)
7: (Feature("Formal or archaic terms of address and endearment"), 0.498046875)
8: (Feature("Start of a new message or context shift in conversation"), 0.3564453125)
9: (Feature("The model's turn to speak in informal or roleplay conversations"), 0.3430989583333333)
)
You can also output feature activations as a sparse vector for use in machine learning pipelines.
[ ]:
sparse_vector, feature_lookup = top_features.vector()
sparse_vector, feature_lookup
(array([0., 0., 0., ..., 0., 0., 0.]),
{28127: Feature("Spanish greeting 'Hola' triggering Spanish language responses"),
40612: Feature("The model's turn to speak in multilingual conversations"),
20588: Feature("Spanish 'hola mundo' in programming examples"),
64587: Feature("Multilingual greetings and conversation starters"),
64861: Feature("End of model's response, user's turn to speak"),
11895: Feature("Spanish greetings and salutations"),
47867: Feature("The model's opening greeting and offer of help"),
30719: Feature("Formal or archaic terms of address and endearment"),
65042: Feature("Start of a new message or context shift in conversation"),
29884: Feature("The model's turn to speak in informal or roleplay conversations")})
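Because the activations come back as plain numpy arrays, they drop straight into ordinary ML tooling. As a sketch (the second conversation below is purely illustrative), you could compare two contexts by the cosine similarity of their feature vectors:
[ ]:
import numpy as np

# Inspect a second, related conversation and pull its feature vector.
other_context = client.features.inspect(
    [
        {"role": "user", "content": "Bonjour mon ami"},
        {"role": "assistant", "content": "Bonjour!"},
    ],
    model=variant,
)
other_vector, _ = other_context.top(k=10).vector()

# Cosine similarity between the two sparse activation vectors.
similarity = np.dot(sparse_vector, other_vector) / (
    np.linalg.norm(sparse_vector) * np.linalg.norm(other_vector) + 1e-9
)
print(f"Cosine similarity: {similarity:.3f}")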
You can also inspect individual tokens.
[ ]:
print(context.tokens[-3])
token_acts = context.tokens[-3].inspect()
token_acts
Token("Hola")
FeatureActivations(
0: (Feature("Spanish greeting 'Hola' triggering Spanish language responses"), 3.90625)
1: (Feature("The model's multilingual greeting responses"), 3.84375)
2: (Feature("Informal, friendly conversation openers"), 1.0546875)
3: (Feature("Conversation initiators and greetings across languages"), 1.03125)
4: (Feature("The model's initial greeting (usually 'Hello')"), 0.875)
)
[ ]:
token_acts.vector()
(array([0., 0., 0., ..., 0., 0., 0.]),
{28127: Feature("Spanish greeting 'Hola' triggering Spanish language responses"),
47378: Feature("The model's multilingual greeting responses"),
42620: Feature("Informal, friendly conversation openers"),
3625: Feature("Conversation initiators and greetings across languages"),
7352: Feature("The model's initial greeting (usually 'Hello')")})
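You can combine the two ideas above into a tokens-by-features activation matrix for the whole context. This is a sketch that assumes context.tokens can be iterated the same way it can be indexed:
[ ]:
import numpy as np

# Stack one activation vector per token into a (num_tokens, num_features) matrix.
rows = []
for token in context.tokens:
    token_vector, _ = token.inspect().vector()
    rows.append(token_vector)

activation_matrix = np.stack(rows)
print(activation_matrix.shape)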
Inspecting specific features
There may be specific features whose activation patterns you’re interested in exploring. In this case, you can build a feature group such as animal_features and pass it to the features argument of inspect.
[ ]:
animal_features, _ = client.features.search("animals such as whales", top_k=5)
animal_features
FeatureGroup([
0: "Whales and their characteristics",
1: "Common animals, especially pets and familiar wild animals",
2: "Animal-related concepts and discussions",
3: "Animal characteristics and behaviors, especially mammals",
4: "Wildlife, especially in natural or conservation contexts"
])
[ ]:
context = client.features.inspect(
    [
        {
            "role": "user",
            "content": "Tell me about whales."
        },
        {
            "role": "assistant",
            "content": "Whales are cetaceans."
        },
    ],
    model=variant,
    features=animal_features
)
context
ContextInspector(
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Tell me about whales.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Whales are cetaceans.<|eot_id|>
)
Now you can retrieve the top k activating animal features in the context.
[ ]:
animal_feature_acts = context.top(k=5)
animal_feature_acts
FeatureActivations(
0: (Feature("Whales and their characteristics"), 2.4938151041666665)
1: (Feature("Wildlife, especially in natural or conservation contexts"), 0.625)
2: (Feature("Animal characteristics and behaviors, especially mammals"), 0)
3: (Feature("Animal-related concepts and discussions"), 0)
4: (Feature("Common animals, especially pets and familiar wild animals"), 0)
)
Using OpenAI SDK
You can also work directly with the OpenAI SDK for inference since our endpoint is fully compatible.
[ ]:
!pip install openai
[ ]:
from openai import OpenAI
# Fetch saved variant w/ Goodfire client
variant = client.variants.get(variant_id)
oai_client = OpenAI(
    api_key=GOODFIRE_API_KEY,
    base_url="https://api.goodfire.ai/api/inference/v1",
)

oai_client.chat.completions.create(
    messages=[
        {"role": "user", "content": "who is this"},
    ],
    model=variant.base_model,
    extra_body={"controller": variant.controller.json()},
)
ChatCompletion(id='chatcmpl-5409309d-4ed7-4632-93bd-58631442c6dd', choices=[Choice(finish_reason=None, index=0, logprobs=None, message=ChatCompletionMessage(content="I'm happy to help! However, I don't see anyone or anything mentioned in your question. Could you please provide more context or information about who or what you are referring to?", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1732152661, model='meta-llama/Meta-Llama-3-8B-Instruct', object='chat.completion', service_tier=None, system_fingerprint='fp_goodfire', usage=None)
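Streaming works through the OpenAI SDK as well. The sketch below passes the same controller in extra_body and uses the OpenAI client’s standard streaming interface:
[ ]:
# Stream tokens through the OpenAI-compatible endpoint with steering applied.
stream = oai_client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Tell me about the moon."},
    ],
    model=variant.base_model,
    extra_body={"controller": variant.controller.json()},
    stream=True,
    max_tokens=50,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")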
Next steps
We’ve seen how to find human-interpretable features inside Llama 3, apply those features to steer the model’s behaviour, and surface feature groups using contrastive search. We’ve also covered saving, loading, and editing your model variants in your Goodfire model repo. This really only scratches the surface of what you can do with our tooling: there’s a richer and more expressive model programming language you can learn about in our advanced tutorial, advanced.ipynb.