Goodfire SDK documentation

API Reference

class goodfire.api.client.Client(api_key: str, base_url: str = 'https://api.goodfire.ai')[source]

Client for interacting with the Goodfire API.

features

Interface for feature operations

Type:

FeaturesAPI

chat

Interface for chat operations

Type:

ChatAPI

variants

Interface for variant operations

Type:

VariantsAPI

class goodfire.api.features.client.FeaturesAPI(goodfire_api_key: str, base_url: str = 'https://api.goodfire.ai')[source]

A class for accessing interpretable SAE features of AI models.

contrast(dataset_1: list[list[ChatMessage]], dataset_2: list[list[ChatMessage]], model: str | VariantInterface = 'meta-llama/Meta-Llama-3-8B-Instruct', dataset_1_feature_rerank_query: str | None = None, dataset_2_feature_rerank_query: str | None = None, top_k: int = 5)[source]

Identify features that differentiate between two conversation datasets.

Parameters:
  • dataset_1 – First conversation dataset

  • dataset_2 – Second conversation dataset

  • model – Model identifier or variant interface

  • dataset_1_feature_rerank_query – Optional query to rerank dataset_1 features

  • dataset_2_feature_rerank_query – Optional query to rerank dataset_2 features

  • top_k – Number of top features to return (default: 5)

Returns:

Two FeatureGroups containing:
  • Features steering towards dataset_1

  • Features steering towards dataset_2

Each Feature has properties:
  • uuid: Unique feature identifier

  • label: Human-readable feature description

  • max_activation_strength: Maximum activation strength of the feature in the training dataset

  • index_in_sae: Index in sparse autoencoder

Return type:

tuple

Raises:

ValueError – If datasets are empty or have different lengths

Example

>>> dataset_1 = [[
...     {"role": "user", "content": "Hi how are you?"},
...     {"role": "assistant", "content": "I'm doing well..."}
... ]]
>>> dataset_2 = [[
...     {"role": "user", "content": "Hi how are you?"},
...     {"role": "assistant", "content": "Arr my spirits be high..."}
... ]]
>>> features_1, features_2 = client.features.contrast(
...     dataset_1=dataset_1,
...     dataset_2=dataset_2,
...     model=model,
...     dataset_2_feature_rerank_query="pirate",
...     top_k=5
... )
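
The documented ValueError conditions (empty datasets, or datasets of different lengths) can be checked locally before making a request. The following is a minimal sketch, not part of the SDK: check_contrast_datasets is a hypothetical helper, and the ChatMessage stand-in assumes messages are plain dicts with "role" and "content" keys, as in the example above.

```python
from typing import TypedDict


class ChatMessage(TypedDict):
    """Stand-in for the SDK's ChatMessage type (assumed shape: role + content)."""
    role: str
    content: str


def check_contrast_datasets(
    dataset_1: list[list[ChatMessage]],
    dataset_2: list[list[ChatMessage]],
) -> None:
    """Hypothetical pre-flight check mirroring the conditions documented for contrast()."""
    if not dataset_1 or not dataset_2:
        raise ValueError("Both datasets must contain at least one conversation.")
    if len(dataset_1) != len(dataset_2):
        raise ValueError(
            f"Datasets must have the same length, got {len(dataset_1)} and {len(dataset_2)}."
        )


dataset_1 = [[
    {"role": "user", "content": "Hi how are you?"},
    {"role": "assistant", "content": "I'm doing well..."},
]]
dataset_2 = [[
    {"role": "user", "content": "Hi how are you?"},
    {"role": "assistant", "content": "Arr my spirits be high..."},
]]

# One conversation in each dataset, so the check passes without raising.
check_contrast_datasets(dataset_1, dataset_2)
```

Running the check first surfaces a mismatched pair of datasets without a round trip to the API; whether the SDK performs the same validation client-side is not stated here.
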

inspect(messages: list[ChatMessage], model: str | VariantInterface = 'meta-llama/Meta-Llama-3-8B-Instruct', features: Feature | FeatureGroup | None = None)[source]

Retrieve feature activations for a set of messages.

list(ids: list[str])[source]

Get features by their IDs.

rerank(features: FeatureGroup, query: str, model: str | VariantInterface = 'meta-llama/Meta-Llama-3-8B-Instruct', top_k: int = 10)[source]

Rerank a set of features based on a query.

search(query: str, model: str | VariantInterface = 'meta-llama/Meta-Llama-3-8B-Instruct', top_k: int = 10)[source]

Search for features based on a query.

class goodfire.api.chat.client.ChatAPI(api_key: str, base_url: str = 'https://api.goodfire.ai')[source]

OpenAI-compatible chat API.

Example

>>> for token in client.chat.completions.create(
...     [
...         {"role": "user", "content": "hello"}
...     ],
...     model="meta-llama/Meta-Llama-3-8B-Instruct",
...     stream=True,
...     max_completion_tokens=50,
... ):
...     print(token.choices[0].delta.content, end="")
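
When the full reply is needed as a string rather than printed token by token, the streamed chunks can be accumulated. This is a minimal sketch under two assumptions: each chunk exposes choices[0].delta.content as in the example above, and some chunks may carry None or an empty string there (common in OpenAI-style streams, but not confirmed for this API). collect_stream and fake_chunk are hypothetical names; fake_chunk only imitates the chunk shape so that the sketch runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the delta.content of every chunk, skipping empty deltas."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:  # assumption: a chunk may carry None or "" instead of text
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that imitates the shape of one streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream: two text chunks followed by an empty final chunk.
demo_stream = [fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)]
reply = collect_stream(demo_stream)
```

With a real client, the iterable returned by client.chat.completions.create(..., stream=True) would take the place of demo_stream.
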

class goodfire.api.variants.client.VariantsAPI(api_key: str, base_url: str = 'https://api.goodfire.ai')[source]

Client for interacting with the Goodfire Variants API.

create(variant: VariantInterface, name: str)[source]

Create a new model variant with the specified name.

delete(id: str)[source]

Delete a model variant by ID.

get(variant_id: str, fast_variant: Literal[True] = True) -> Variant[source]
get(variant_id: str, fast_variant: Literal[False] = False) -> ProgrammableVariant

Get a model variant by ID.
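
The two signatures above are overloads: the value of fast_variant selects the return type. The sketch below illustrates that pattern with typing.overload, using toy classes. ToyVariant, ToyProgrammableVariant, and get_variant are hypothetical names, and the runtime body is an assumption made for illustration, not the SDK's implementation.

```python
from typing import Literal, overload


class ToyVariant:
    """Toy stand-in for Variant."""

    def __init__(self, variant_id: str):
        self.variant_id = variant_id


class ToyProgrammableVariant:
    """Toy stand-in for ProgrammableVariant."""

    def __init__(self, variant_id: str):
        self.variant_id = variant_id


@overload
def get_variant(variant_id: str, fast_variant: Literal[True] = True) -> ToyVariant: ...
@overload
def get_variant(variant_id: str, fast_variant: Literal[False]) -> ToyProgrammableVariant: ...


def get_variant(variant_id, fast_variant=True):
    # The overloads above only inform type checkers; this body runs at runtime.
    cls = ToyVariant if fast_variant else ToyProgrammableVariant
    return cls(variant_id)


fast = get_variant("abc")                             # typed as ToyVariant
programmable = get_variant("abc", fast_variant=False)  # typed as ToyProgrammableVariant
```
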

list()[source]

List all model variants.

update(id: str, variant: VariantInterface, new_name: str | None = None)[source]

Update an existing model variant.

class goodfire.features.features.Feature(uuid: UUID, label: str, max_activation_strength: float, index_in_sae: int)[source]

A class representing a single feature, that is, a conceptual unit of the sparse autoencoder (SAE).

Handles individual feature operations and comparisons. Features can be combined into groups and compared using standard operators.

uuid

Unique identifier for the feature

Type:

UUID

label

Human-readable label describing the feature

Type:

str

max_activation_strength

Maximum activation strength of the feature in the training dataset

Type:

float

index_in_sae

Index position in the SAE

Type:

int

class goodfire.features.features.FeatureGroup(features: list[Feature] | None = None)[source]

A collection of Feature instances with group operations.

Provides functionality for managing and operating on groups of features, including union and intersection operations, indexing, and comparison operations.

Example

>>> feature_group = FeatureGroup([feature1, feature2, feature3, feature4])
>>> # Access single feature by index
>>> first_feature = feature_group[0]  # Returns Feature
>>>
>>> # Slice features
>>> first_two = feature_group[0:2]  # Returns FeatureGroup with features 0,1
>>> last_two = feature_group[-2:]   # Returns FeatureGroup with last 2 features
>>>
>>> # Multiple indexes using list or tuple
>>> selected = feature_group[[0, 2]]  # Returns FeatureGroup with features 0,2
>>> selected = feature_group[0, 3]    # Returns FeatureGroup with features 0,3
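
The indexing rules in the example above (an int returns a single item; a slice, list, or tuple returns a new group) can be reproduced with a small __getitem__ dispatch. The following is a toy sketch only: MiniGroup is a hypothetical stand-in that holds plain strings, and it does not claim to match how FeatureGroup is implemented.

```python
class MiniGroup:
    """Toy stand-in for FeatureGroup; not the SDK implementation."""

    def __init__(self, items=None):
        self.items = list(items or [])

    def __len__(self):
        return len(self.items)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.items[key]                        # single item
        if isinstance(key, slice):
            return MiniGroup(self.items[key])             # contiguous sub-group
        if isinstance(key, (list, tuple)):
            return MiniGroup([self.items[i] for i in key])  # hand-picked sub-group
        raise TypeError(f"Unsupported index type: {type(key).__name__}")


group = MiniGroup(["f0", "f1", "f2", "f3"])
first = group[0]             # a single item
first_two = group[0:2]       # MiniGroup with items 0 and 1
last_two = group[-2:]        # MiniGroup with the last two items
picked_list = group[[0, 2]]  # MiniGroup with items 0 and 2
picked_tuple = group[0, 3]   # group[0, 3] passes the tuple (0, 3) as the key
```
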

add(feature: Feature)[source]

Add a feature to the group.

Parameters:

feature – Feature instance to add to the group

intersection(feature_group: FeatureGroup)[source]

Create a new group with features common to both groups.

Parameters:

feature_group – Another FeatureGroup to intersect with

Returns:

New group containing only features present in both groups

Return type:

FeatureGroup

pick(feature_indexes: list[int])[source]

Create a new FeatureGroup with selected features.

Parameters:

feature_indexes – List of indexes to select

Returns:

New group containing only the selected features

Return type:

FeatureGroup

pop(index: int)[source]

Remove and return a feature at the specified index.

Parameters:

index – Index of the feature to remove

Returns:

The removed feature

Return type:

Feature

union(feature_group: FeatureGroup)[source]

Combine this group with another feature group.

Parameters:

feature_group – Another FeatureGroup to combine with

Returns:

New group containing features from both groups

Return type:

FeatureGroup
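
The set-like semantics of union and intersection can be illustrated on plain lists. This is a toy sketch under stated assumptions: ToyFeature is a hypothetical stand-in for Feature, features are treated as equal when their uuid matches, and union drops duplicates while preserving first-seen order. The reference above does not specify how the SDK handles duplicates or ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFeature:
    """Toy stand-in for Feature with only the fields needed here."""
    uuid: str
    label: str


def union(a, b):
    """Features from both lists; duplicates dropped by uuid (an assumption)."""
    seen = {}
    for feature in list(a) + list(b):
        seen.setdefault(feature.uuid, feature)  # keep the first occurrence
    return list(seen.values())


def intersection(a, b):
    """Features of `a` whose uuid also appears in `b`."""
    b_ids = {feature.uuid for feature in b}
    return [feature for feature in a if feature.uuid in b_ids]


formal = ToyFeature("u1", "formal tone")
pirate = ToyFeature("u2", "pirate speech")
greeting = ToyFeature("u3", "greetings")

group_a = [formal, greeting]
group_b = [pirate, greeting]

both = union(group_a, group_b)           # formal, greeting, pirate
shared = intersection(group_a, group_b)  # greeting only
```
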

class goodfire.variants.fast.Variant(base_model: str)[source]

A class representing a variant of a base model with feature modifications.

This class allows for creating variants of a base model by applying feature modifications through either nudging or pinning values.

Parameters:

base_model (str) – Identifier of the base model to create variants from

base_model

The base model identifier

Type:

str

edits

Collection of feature modifications

Type:

FeatureEdits

clear(feature: Feature | FeatureGroup)[source]

Remove modifications for specified feature(s).

Parameters:

feature (Union[Feature, FeatureGroup]) – Feature(s) to clear modifications for

property controller: Controller

Get a controller instance with the variant’s modifications applied.

Returns:

Controller instance with feature modifications

Return type:

Controller

json()[source]

Convert the variant to a JSON-compatible dictionary.

Returns:

Dictionary containing base model and feature configurations

Return type:

dict

reset()[source]

Remove all feature modifications.

set(feature: Feature | FeatureGroup, value: float | None, mode: Literal['nudge'] = 'nudge') -> None[source]
set(feature: Feature | FeatureGroup, value: float | bool | None, mode: Literal['pin'] = 'pin') -> None

Set or modify feature values in the variant.

Parameters:
  • feature (Union[Feature, FeatureGroup]) – Feature(s) to modify

  • value (Union[float, bool, None]) – Value to apply:
      - float: For numerical adjustments
      - bool: For binary states (pin mode only)
      - None: To clear the modification

  • mode (Literal["nudge", "pin"], optional) – Modification mode:
      - "nudge": Bias the feature strength
      - "pin": Set the feature strength to a fixed value

    Defaults to "nudge".
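
The difference between the two modes can be pictured with a toy numeric model. The sketch below assumes that "nudge" adds the value to a feature's current activation and that "pin" replaces the activation with the value. The additive reading of "bias" is an assumption, and apply_edit is a hypothetical function, not part of the SDK. Boolean values are left out because their numeric meaning in pin mode is not documented here.

```python
from typing import Literal, Optional


def apply_edit(
    activation: float,
    value: Optional[float],
    mode: Literal["nudge", "pin"] = "nudge",
) -> float:
    """Toy model of one feature edit; not how the SDK applies edits."""
    if value is None:
        return activation          # None: no modification in effect
    if mode == "nudge":
        return activation + value  # assumption: nudge is an additive bias
    if mode == "pin":
        return float(value)        # pin: fixed value, current activation ignored
    raise ValueError(f"Unknown mode: {mode!r}")


nudged = apply_edit(0.25, 0.5)               # 0.25 shifted by 0.5
pinned = apply_edit(0.25, 0.5, mode="pin")   # fixed at 0.5
untouched = apply_edit(0.25, None)           # unchanged
```

Under this reading, a nudged feature still varies with the input, while a pinned feature stays at the same strength regardless of the input.
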

Utils

goodfire.utils.comparison.StreamingMultiplexer(streaming_generators: list[Generator[StreamingChatCompletionChunk, Any, Any]], boxes_per_row: int = 2)[source]

Display multiple streaming outputs simultaneously in a grid layout.

Uses ThreadPoolExecutor to concurrently process multiple streaming generators and display their outputs in a grid of text areas. Each stream gets its own text area that updates in real-time.

Parameters:
  • streaming_generators – List of generators yielding StreamingChatCompletionChunks

  • boxes_per_row – Number of text areas to display per row. Defaults to 2.

Example

>>> generators = [chat.stream() for chat in chats]
>>> StreamingMultiplexer(generators, boxes_per_row=3)
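
The concurrency pattern described above (one worker per stream, outputs arranged boxes_per_row to a row) can be sketched without any display code. This is an illustration only: drain, drain_all, layout_rows, and fake_stream are hypothetical names, the streams yield plain strings rather than StreamingChatCompletionChunks, and the real utility updates text areas incrementally instead of returning final strings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator


def drain(stream: Iterator[str]) -> str:
    """Consume one stream to the end and return its concatenated text."""
    return "".join(stream)


def drain_all(streams: list) -> list:
    """Drain every stream in its own worker thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(streams))) as pool:
        return list(pool.map(drain, streams))


def layout_rows(n_streams: int, boxes_per_row: int = 2) -> list:
    """Group stream indexes into rows of at most boxes_per_row."""
    indexes = list(range(n_streams))
    return [indexes[i:i + boxes_per_row] for i in range(0, n_streams, boxes_per_row)]


def fake_stream(words):
    """Hypothetical stand-in for a streaming generator."""
    for word in words:
        yield word


outputs = drain_all([
    fake_stream(["a", "b"]),
    fake_stream(["c"]),
    fake_stream(["d", "e", "f"]),
])
rows = layout_rows(3, boxes_per_row=2)
```
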

Experimental Features

class goodfire.utils._experimental.LatentExplorer(client: Client, model: str | VariantInterface)[source]

Interactive visualization tool for exploring feature relationships in latent space.

Uses PCA dimensionality reduction and interactive plotting to visualize feature neighborhoods and relationships. Supports clicking features to explore their local neighborhoods.

Parameters:
  • client (Client) – Client instance for API communication

  • model (Union[str, VariantInterface]) – Model or variant to explore

class goodfire.variants._experimental.ProgrammableVariant(base_model: str)[source]

A programmable variant that takes a controller object. For example usage, see the conditional feature interventions section of the advanced notebook.