Features
Documentation for the Goodfire Features API
The Features API provides methods for working with interpretable features of language models. Features represent learned patterns in model behavior that can be analyzed and modified.
Accessing Features
Before using the Features API, you’ll need a model variant. You can then access features through the client’s features interface, for example to search for features or to inspect feature activations in text.
Methods
neighbors()
Get the nearest neighbors of a feature or group of features.
Parameters:
Feature or group of features to find neighbors for
Model identifier or variant interface
Number of neighbors to return
Returns: FeatureGroup
Example:
search()
Search for features based on semantic similarity to a query string.
Parameters:
Search string to compare against feature labels
Model identifier or variant interface
Number of features to return
Returns: FeatureGroup
- Collection of matching features
Example:
inspect()
Analyze how features activate across the input messages.
Parameters:
Messages to analyze
Model identifier or variant interface
Optional specific features to analyze. If None, inspects all features.
Method to aggregate feature activations across tokens:
- “frequency”: Count of tokens where the feature is active
- “mean”: Mean activation value across tokens
- “max”: Maximum activation value across tokens
- “sum”: Sum of activation values across tokens
Returns: ContextInspector
- An inspector object that provides methods for analyzing and visualizing how features are activated in the given context.
Example:
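To make the four aggregation modes concrete, here is a minimal, dependency-free sketch of what each one computes for a single feature's per-token activations. It does not call the Goodfire SDK: the `aggregate` helper is hypothetical, and treating "active" as "activation greater than zero" is an assumption the source does not spell out.

```python
from typing import Sequence


def aggregate(token_activations: Sequence[float], method: str = "frequency") -> float:
    """Collapse one feature's per-token activations into a single score.

    Hypothetical helper for illustration only; not part of the Goodfire SDK.
    """
    if not token_activations:
        return 0.0
    if method == "frequency":
        # Assumption: a feature counts as "active" when its activation is > 0.
        return float(sum(1 for a in token_activations if a > 0))
    if method == "mean":
        # Mean over all tokens, including tokens where the feature is inactive.
        return sum(token_activations) / len(token_activations)
    if method == "max":
        return float(max(token_activations))
    if method == "sum":
        return float(sum(token_activations))
    raise ValueError(f"unknown aggregation method: {method!r}")


# One feature observed over four tokens; it fires on the second and fourth.
acts = [0.0, 2.0, 0.0, 4.0]
scores = {m: aggregate(acts, m) for m in ("frequency", "mean", "max", "sum")}
```

The choice of mode changes which features rank highest: "frequency" favors features that fire on many tokens, while "max" favors features with a single strong spike.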
contrast()
Identify features that differentiate between two conversation datasets.
Parameters:
First dataset of conversations
Second dataset of conversations
Model identifier or variant interface
Number of top features to return for each dataset
Returns: tuple[FeatureGroup, FeatureGroup]
- Two FeatureGroups containing:
  - Features steering towards dataset_1
  - Features steering towards dataset_2
Example:
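The source does not describe how contrast() ranks features, so the following is only a sketch of one plausible approach: compare each feature's mean score in the two datasets and keep the features with the largest difference in each direction. The `contrast_scores` function and its input format (one list of per-feature scores per conversation) are hypothetical and are not the Goodfire implementation.

```python
from typing import List, Sequence, Tuple


def contrast_scores(
    dataset_1: Sequence[Sequence[float]],
    dataset_2: Sequence[Sequence[float]],
    top_k: int = 1,
) -> Tuple[List[int], List[int]]:
    """Return (feature indices leaning towards dataset_1, towards dataset_2).

    Hypothetical illustration: each dataset is a list of conversations, and
    each conversation is a list with one aggregated score per feature.
    """
    n_features = len(dataset_1[0])

    def mean_vector(dataset: Sequence[Sequence[float]]) -> List[float]:
        return [sum(conv[j] for conv in dataset) / len(dataset) for j in range(n_features)]

    mean_1, mean_2 = mean_vector(dataset_1), mean_vector(dataset_2)
    diff = [a - b for a, b in zip(mean_1, mean_2)]

    # Sort feature indices from "most dataset_1" to "most dataset_2".
    order = sorted(range(n_features), key=lambda j: diff[j], reverse=True)
    toward_1 = [j for j in order if diff[j] > 0][:top_k]
    toward_2 = [j for j in reversed(order) if diff[j] < 0][:top_k]
    return toward_1, toward_2


# Feature 0 is strong only in dataset_1, feature 1 only in dataset_2,
# and feature 2 is identical in both, so it appears in neither result.
dataset_1 = [[1.0, 0.0, 0.2], [0.8, 0.0, 0.2]]
dataset_2 = [[0.0, 0.9, 0.2], [0.0, 0.7, 0.2]]
toward_1, toward_2 = contrast_scores(dataset_1, dataset_2, top_k=1)
```

Returning two separate lists mirrors the documented return type, a pair of groups with one per dataset.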
rerank()
Rerank a set of features based on a query.
Parameters:
Features to rerank
Query to rerank features by
Model identifier or variant interface
Number of top features to return
Returns: FeatureGroup
Example:
activations()
Retrieve feature activation values for each token in the input messages.
Parameters:
Messages to analyze
Model identifier or variant interface
Optional specific features to analyze. If None, analyzes all features.
Returns: NDArray[np.float64]
- Sparse activation matrix of shape [n_tokens, n_features] where each element represents the activation strength of a feature at a specific token. Most values are zero due to sparsity.
Example:
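As an illustration of the [n_tokens, n_features] layout, here is a small sketch of how such a matrix can be read once it has been returned. Plain nested lists stand in for the NumPy array so the sketch has no dependencies; the values and the helper names (`shape`, `active_features`, `sparsity`) are made up for illustration and are not part of the Goodfire SDK.

```python
from typing import List, Tuple

# Rows are tokens, columns are features (3 tokens x 4 features, made-up values).
activations: List[List[float]] = [
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.7, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
]


def shape(matrix: List[List[float]]) -> Tuple[int, int]:
    """Return (n_tokens, n_features)."""
    return (len(matrix), len(matrix[0]) if matrix else 0)


def active_features(matrix: List[List[float]], token_index: int) -> List[Tuple[int, float]]:
    """Return (feature_index, activation) pairs that are nonzero at one token."""
    return [(j, v) for j, v in enumerate(matrix[token_index]) if v != 0.0]


def sparsity(matrix: List[List[float]]) -> float:
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0.0)
    return zeros / total if total else 0.0


n_tokens, n_features = shape(activations)
first_token = active_features(activations, 0)
zero_fraction = sparsity(activations)
```

Because most entries are zero, scanning a row for its nonzero columns is usually the quickest way to see which features fired on a given token.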
lookup()
Retrieves details for a list of features by their indices.
Parameters:
List of feature indices to fetch
Model identifier or variant interface
Returns: dict[int, Feature]
- Mapping of feature index to Feature object
list()
Retrieves details for a list of features by their UUIDs.
Parameters:
List of feature UUIDs to fetch
Returns: FeatureGroup
- Collection of Feature objects
Classes
Feature
Represents a human-interpretable “feature”: a conceptual neural unit of the model. Features can be combined into groups and compared using standard operators.
FeatureGroup
A collection of Feature instances with group operations.
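The idea of combining features into groups with standard operators can be sketched with ordinary Python operator overloading. The classes below are toy stand-ins (`ToyFeature`, `ToyFeatureGroup`), not the SDK's Feature and FeatureGroup, and the use of `|` for grouping is an assumption: the source only says "standard operators".

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Union


@dataclass(frozen=True)
class ToyFeature:
    """Toy stand-in for a feature: an index plus a human-readable label."""
    index: int
    label: str

    def __or__(self, other: Union["ToyFeature", "ToyFeatureGroup"]) -> "ToyFeatureGroup":
        # feature | feature (or feature | group) produces a group.
        return ToyFeatureGroup([self]) | other


class ToyFeatureGroup:
    """Toy stand-in for an ordered, duplicate-free collection of features."""

    def __init__(self, features: Iterable[ToyFeature] = ()) -> None:
        self._features: List[ToyFeature] = list(features)

    def __or__(self, other: Union[ToyFeature, "ToyFeatureGroup"]) -> "ToyFeatureGroup":
        extra = [other] if isinstance(other, ToyFeature) else list(other)
        merged = list(self._features)
        for feature in extra:
            if feature not in merged:  # keep first occurrence, skip duplicates
                merged.append(feature)
        return ToyFeatureGroup(merged)

    def __iter__(self) -> Iterator[ToyFeature]:
        return iter(self._features)

    def __len__(self) -> int:
        return len(self._features)

    def __getitem__(self, position: int) -> ToyFeature:
        return self._features[position]


pirate = ToyFeature(1, "pirate speech")
formal = ToyFeature(2, "formal tone")
group = pirate | formal | pirate  # the repeated feature is not added twice
```

Overloading an operator keeps group construction compact at the call site, at the cost of hiding a small amount of deduplication logic behind the symbol.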
ConditionalGroup
Groups multiple conditions with logical operators.
FeatureActivation
Represents the activation of a feature.
ContextInspector
Analyzes feature activations in text.