Conditionals - Goodfire SDK

Conditionals let you define dynamic feature interventions that are applied based on the activation patterns of other features during model inference. This enables more sophisticated steering behaviors that respond to the content being generated. Before using the Conditionals API, you'll need a model variant and the features you want to intervene on.

Examples

Basic Conditional Intervention

Apply pirate-themed features only when whale-related content is detected:

variant.reset()
# Find relevant features
whale_feature = client.features.search(
    "whales", model=variant, top_k=1
)

pirate_features = client.features.search(
    "talk like a pirate", model=variant, top_k=5
)

# Set up conditional intervention
variant.set_when(whale_feature > 0.75, {
    pirate_features[0]: 0.4
})

# The model will now talk like a pirate when discussing whales
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me about whales."}],
    model=variant
)

print(response.choices[0].message["content"])

Aborting Generation

Stop generation if certain content is detected:

# Abort if whale features are too strong
variant.abort_when(whale_feature > 0.75)

try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Tell me about whales."}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to whale content")

Auto-Generated Conditionals

Use natural language to automatically generate conditional statements:

# Create a variant
variant = goodfire.Variant("meta-llama/Llama-3.3-70B-Instruct")
# Generate a conditional from a description; this creates conditions for both whales and penguins being present
conditional = client.features.AutoConditional(
    "when the model talks about whales and penguins",
    model=variant
)
# Get the pirate feature
pirate_feature = client.features.search(
    "talk like a pirate", model=variant, top_k=1
)
# Make the model talk like a pirate when it talks about both whales and penguins
variant.set_when(conditional, {
    pirate_feature[0]: 0.9
})

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Tell me about whales and penguins!"}],
    model=variant
)

print(response.choices[0].message["content"])

Creating Conditionals

Comparison Operators

You can create conditionals by comparing features or feature groups with numeric values or other features using standard comparison operators. This creates a Conditional object that can be used in steering behaviors.

# Compare feature to numeric value
condition = feature > 0.75

# Compare feature group to numeric value
condition = feature_group >= 0.5

# Compare features to each other
condition = feature1 < feature2

Supported operators:

== (equal)
!= (not equal)
< (less than)
<= (less than or equal)
> (greater than)
>= (greater than or equal)
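The comparison syntax relies on Python's operator overloading: comparing a feature returns a description of the comparison instead of a boolean. The following is a minimal sketch of that idea. `ToyFeature` and `ToyConditional` are hypothetical stand-ins written for illustration, not the SDK's classes, and the real implementation may differ.

```python
class ToyConditional:
    """Hypothetical record of one comparison: left operand, operator, right operand."""

    def __init__(self, left_hand, operator, right_hand):
        self.left_hand = left_hand
        self.operator = operator
        self.right_hand = right_hand

    def __repr__(self):
        return f"ToyConditional({self.left_hand!r} {self.operator} {self.right_hand!r})"


class ToyFeature:
    """Hypothetical stand-in for a feature (not the SDK's Feature class)."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"ToyFeature({self.label!r})"

    # Each comparison builds a ToyConditional instead of returning True/False.
    def __eq__(self, other):
        return ToyConditional(self, "==", other)

    def __ne__(self, other):
        return ToyConditional(self, "!=", other)

    def __lt__(self, other):
        return ToyConditional(self, "<", other)

    def __le__(self, other):
        return ToyConditional(self, "<=", other)

    def __gt__(self, other):
        return ToyConditional(self, ">", other)

    def __ge__(self, other):
        return ToyConditional(self, ">=", other)

    # Overriding __eq__ disables default hashing; restore it so a feature
    # can still be used as a dict key.
    __hash__ = object.__hash__


whale = ToyFeature("whales")
penguin = ToyFeature("penguins")

threshold_condition = whale > 0.75   # feature compared to a number
feature_condition = whale < penguin  # feature compared to another feature

print(threshold_condition)
print(feature_condition)
```

One consequence of this design is that a condition object is always truthy in a plain `if` statement, so conditions are meant to be passed to methods such as set_when() rather than tested directly.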

Logical Operators

Multiple conditions can be combined using logical operators to create a ConditionalGroup:

# AND operator
condition = (feature1 > 0.5) & (feature2 < 0.3)

# OR operator
condition = (feature1 > 0.5) | (feature2 > 0.5)
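The parentheses in these expressions are required. In Python, `&` and `|` bind more tightly than comparison operators, so dropping the parentheses changes how the expression is parsed. The sketch below uses only the standard library `ast` module to show the difference; `top_level_node` is a helper name introduced here for illustration.

```python
import ast


def top_level_node(expr: str) -> str:
    """Return the class name of the outermost AST node of a Python expression."""
    return type(ast.parse(expr, mode="eval").body).__name__


# With parentheses: a bitwise AND whose operands are the two comparisons.
print(top_level_node("(feature1 > 0.5) & (feature2 < 0.3)"))  # BinOp

# Without parentheses: a single chained comparison,
# parsed as feature1 > (0.5 & feature2) < 0.3.
print(top_level_node("feature1 > 0.5 & feature2 < 0.3"))  # Compare
```

Python's `and` and `or` keywords cannot be overloaded, which is why libraries that build condition objects typically use `&` and `|` instead.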

Using Conditionals

set_when()

Apply feature interventions when a condition is met.

Parameters:

condition (ConditionalGroup, required): The ConditionalGroup that triggers the intervention
values (Union[FeatureEdits, dict[Union[Feature, FeatureGroup], float]], required): Feature edits to apply when the condition is met

Returns: None

Example:

# Set pirate features when whale features are detected
variant.set_when(whale_feature > 0.75, {
    pirate_features[0]: 0.5
})

abort_when()

Abort inference when a condition is met by raising an InferenceAbortedException.

Parameters:

condition (ConditionalGroup, required): The ConditionalGroup that triggers the abort

Returns: None

Example:

# Abort if whale features are too strong
variant.abort_when(whale_feature > 0.75)

try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Tell me about whales."}],
        model=variant
    )
except goodfire.exceptions.InferenceAbortedException:
    print("Generation aborted due to whale content")
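In an application you will usually want to return something to the user instead of only printing a message. One possible pattern is a small wrapper that converts an abort into a fallback value. This is a sketch, not part of the SDK: `complete_or_fallback` is a hypothetical helper, and `FakeAbort` stands in for `goodfire.exceptions.InferenceAbortedException` so the example runs without network access.

```python
from typing import Callable, Type


def complete_or_fallback(
    create: Callable[[], object],
    abort_exception: Type[BaseException],
    fallback: object = None,
):
    """Call `create`; if it raises `abort_exception`, return `fallback` instead."""
    try:
        return create()
    except abort_exception:
        return fallback


# Stand-in for the SDK's abort exception so the sketch is self-contained.
class FakeAbort(Exception):
    pass


def fake_create():
    # Simulates a completion call that is aborted by a condition.
    raise FakeAbort("condition met")


result = complete_or_fallback(
    fake_create, FakeAbort, fallback="Sorry, I can't discuss that topic."
)
print(result)
```

With the SDK, `create` could be a lambda wrapping `client.chat.completions.create(...)` and `abort_exception` would be `goodfire.exceptions.InferenceAbortedException`. Catching only that exception type lets unrelated errors propagate as usual.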

handle_when()

Run a custom handler function when a condition is met.

Parameters:

condition (ConditionalGroup, required): The ConditionalGroup that triggers the handler
handler (Callable[[InferenceContext], None], required): Function that takes an InferenceContext and returns None

Returns: None

Example:

def custom_handler(context: InferenceContext):
    # Custom handling logic
    pass

variant.handle_when(whale_feature > 0.5, custom_handler)
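A handler that does some work might record a summary of the context each time the condition fires. The sketch below assumes only what the Classes section documents: that the context has a `tokens` list and a `matrix` of feature activations. The `SimpleNamespace` object is a stand-in used to exercise the handler offline, its nested-list matrix is a placeholder for the real NDArray, and `make_logging_handler` is a hypothetical helper name.

```python
from types import SimpleNamespace


def make_logging_handler(log: list):
    """Build a handler that appends a small summary of each context to `log`."""

    def handler(context) -> None:
        log.append(
            {
                "num_tokens": len(context.tokens),
                # len() gives the size of the first axis for lists and arrays alike.
                "matrix_rows": len(context.matrix),
            }
        )

    return handler


events = []
handler = make_logging_handler(events)

# Stand-in context for local testing; the SDK supplies a real InferenceContext.
fake_context = SimpleNamespace(
    tokens=["Tell", "me", "about", "whales"],
    matrix=[[0.0, 0.9], [0.1, 0.8]],
)
handler(fake_context)
print(events)
```

The resulting function can then be registered in the same way as above, for example `variant.handle_when(whale_feature > 0.5, handler)`.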

AutoConditional

The AutoConditional utility generates conditional statements from natural language descriptions.

Parameters:

specification (str, required): Natural language description of the desired condition
model (Union[str, Variant], required): Model to use for generating conditions

Returns:

conditional (ConditionalGroup): Generated ConditionalGroup

Example:

# Generate a conditional from a description; this creates conditions for both whales and penguins being present
conditional = client.features.AutoConditional(
    "when the model talks about whales and penguins",
    model=variant
)
# Get the pirate feature
pirate_feature = client.features.search(
    "talk like a pirate", model=variant, top_k=1
)
# Make the model talk like a pirate when it talks about both whales and penguins
variant.set_when(conditional, {
    pirate_feature[0]: 0.9
})

Best Practices

Use conditional interventions to create context-aware steering behaviors
Combine multiple conditions with logical operators for more precise control
Handle aborted inferences gracefully in your application
Test conditions thoroughly to ensure desired behavior
Consider using AutoConditional for quick prototyping

Classes

ConditionalGroup

A group of conditions combined with logical operators.

Properties:

conditionals (list[Conditional]): List of individual Conditional objects in the group
operator (JOIN_OPERATOR): Logical operator (“AND” or “OR”) used to join conditions
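To make the role of the join operator concrete, here is a toy evaluation of a group of conditions against a set of activation values. It is a sketch under stated assumptions: each feature is reduced to a single scalar activation, conditions are plain tuples, and `evaluate_group` is a hypothetical function. The source does not describe how the SDK evaluates conditions internally.

```python
import operator as op

# Map the documented comparison operators to Python functions.
COMPARATORS = {
    "==": op.eq, "!=": op.ne,
    "<": op.lt, "<=": op.le,
    ">": op.gt, ">=": op.ge,
}


def evaluate_group(conditionals, join_operator, activations):
    """Evaluate (left, comparator, right) tuples joined by "AND" or "OR".

    `left` is a feature name; `right` is a number or another feature name.
    `activations` maps feature names to scalar activation values.
    """
    if join_operator not in ("AND", "OR"):
        raise ValueError(f"unknown join operator: {join_operator!r}")

    results = []
    for left, comparator, right in conditionals:
        right_value = activations[right] if isinstance(right, str) else right
        results.append(COMPARATORS[comparator](activations[left], right_value))

    return all(results) if join_operator == "AND" else any(results)


activations = {"whales": 0.8, "penguins": 0.2}
conditions = [("whales", ">", 0.75), ("penguins", ">", 0.75)]

print(evaluate_group(conditions, "AND", activations))  # False
print(evaluate_group(conditions, "OR", activations))   # True
```

With "AND", every condition in the group must hold; with "OR", a single satisfied condition is enough.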

Conditional

A single conditional expression comparing features.

Properties:

left_hand (FeatureGroup): Left side of the comparison
right_hand (Union[Feature, FeatureGroup, float]): Right side of the comparison
operator (CONDITIONAL_OPERATOR): Comparison operator used

InferenceContext

Context object containing information about the current inference state.

Properties:

tokens (list[Token]): List of tokens in the current context
matrix (NDArray): Feature activation matrix
