Description
Problem
Current Situation
Simplexity currently generates metric names for activation analyses that follow this pattern:
[{user_prefix}/]{analysis_type}/[{layer_spec}_{analysis_subtype}_{factor_spec}/]{metric_type}
where strings in brackets [ ] are optional. For metric keys composed of only an {analysis_type} and {metric_type}, the names tend to be quite manageable. However, the current naming structure becomes unwieldy quickly for layer/factor-specific metrics.
Examples:
- loss/ema
- eval/loss/ma
- model/max_params_distance
- activations/reg/blocks.2.hook_resid_post_factor_0/r2
- activations/pca/blocks.0.hook_resid_pre/n_components_95pct
- activations/reg/blocks.0.hook_resid_pre_orthogonality_0_1/subspace_overlap
How MLflow Organizes Metrics
MLflow's UI organizes metrics visually, grouping them based on their metric keys. Specifically, it looks for the last forward slash (/) in the key name and splits on it:
- Everything before the last / becomes the folder prefix
- Everything after the last / becomes the metric name
Any metric keys that share the same prefix will be organized under the same folder. For example, if you have two metric keys a/b/c and a/b/d, MLflow will create a folder a/b with metrics c and d within it.
This results in the current naming convention displaying as:
📁 loss
├─ ema
├─ ma
📁 activations/reg/blocks.2.hook_resid_post_factor_0
├─ r2
├─ rmse
├─ mae
📁 activations/reg/blocks.0.hook_resid_pre_orthogonality_0_1
├─ subspace_overlap
├─ max_singular_value
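The grouping rule described above can be sketched in a few lines of Python. This is only a model of the behavior as described in this issue (split on the last /), not MLflow's actual UI code; `split_key` and `group_by_folder` are hypothetical helper names.

```python
from collections import defaultdict


def split_key(key: str) -> tuple[str, str]:
    """Split a metric key on its last '/' into (folder, metric_name).

    Keys without any '/' get an empty folder.
    """
    folder, _, name = key.rpartition("/")
    return folder, name


def group_by_folder(keys):
    """Group metric names under the folder prefix they share."""
    folders = defaultdict(list)
    for key in keys:
        folder, name = split_key(key)
        folders[folder].append(name)
    return dict(folders)


keys = [
    "loss/ema",
    "loss/ma",
    "activations/reg/blocks.2.hook_resid_post_factor_0/r2",
    "activations/reg/blocks.2.hook_resid_post_factor_0/rmse",
]

for folder, names in group_by_folder(keys).items():
    print(folder, names)
```

Every distinct layer/factor combination produces its own folder under this rule, which is what drives the long folder names shown in the tree above.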
Issues with Current Convention
MLflow's search bar is effective when you know what you're looking for. When you don't know or have forgotten a metric's name, or simply don't want to rely on searching, a sensible convention for folder prefixes and metric names becomes more important.
Current problems:
Some existing metric keys are already clean and intuitive (e.g., loss/ema, loss/min). However, many layer/factor-specific metrics suffer from:
- Very long folder names (40+ characters) that are difficult to scan visually
  a. Verbose component names contribute to this (e.g., orthogonality_0_1, participation_ratio, blocks.0.hook_resid_pre)
- Poor information hierarchy - the current structure prioritizes analysis method, layer location, and factor information over the actual metric being measured, making it
  a. Difficult to find metrics when browsing - if you want to see all rmse metrics, they're scattered across many different folders
  b. Hard to compare the same metric across layers (and factors) - related metrics (e.g., all rmse for factor_0 across layers) are in separate folders
- Folder explosion - the number of folders scales with layers × factors × analysis_subtypes, making the folder tree unwieldy as model architecture or factor count changes
Proposed Direction
When I am looking through MLflow, I tend to search by the metric I am interested in (e.g., rmse, r2, subspace_overlap) rather than by layer or factor. The current naming convention works against this workflow by burying the metric name at the very end of the path.
For layer/factor-specific metrics, I propose a convention that would keep metric-related information in the prefix while keeping layer/factor-related information in what MLflow treats as the "metric name" (i.e., everything after the last /). This means the number of folders scales only with analysis_subtypes × metric_types, independent of model depth or factor count.
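To make the scaling difference concrete, here is a rough sketch that counts folders under both conventions. The layer, factor, subtype, and metric-type counts are made-up illustrative numbers, not figures from an actual Simplexity run, and the function names are hypothetical.

```python
def folders_current(n_layers: int, n_factors: int, n_subtypes: int) -> int:
    """Current convention: one folder per layer/factor/subtype combination."""
    return n_layers * n_factors * n_subtypes


def folders_proposed(n_subtypes: int, n_metric_types: int) -> int:
    """Proposed convention: one folder per subtype/metric-type combination."""
    return n_subtypes * n_metric_types


# Hypothetical setup: 3 factors, 2 analysis subtypes, 3 metric types.
N_FACTORS, N_SUBTYPES, N_METRIC_TYPES = 3, 2, 3

for n_layers in (4, 12, 48):
    current = folders_current(n_layers, N_FACTORS, N_SUBTYPES)
    proposed = folders_proposed(N_SUBTYPES, N_METRIC_TYPES)
    print(f"layers={n_layers}: current={current} folders, proposed={proposed} folders")
```

The proposed count stays constant as the model gets deeper, while the current count grows linearly with the number of layers and with the number of factors.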
For metrics without layer/factor dimensions (like loss/ema), the metric type itself remains the metric name. Since there is typically only one instance of each such metric, the searchability and folder-explosion issues described above are much less of a concern.
Full metric key structure (proposed):
{prefix}/{analysis_type}[/{subtype}...]/{metric_type}[/{layer_spec}-{factor_spec}]
The {layer_spec}-{factor_spec} suffix is only included for metrics that are layer/factor-specific.
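A minimal sketch of a key builder following this structure is shown below. `build_metric_key` is a hypothetical helper, not an existing Simplexity function. Two behaviors are assumptions on my part: the prefix is treated as optional (so that loss/ema still works), and when only one of the layer or factor spec is given, the suffix is just that one spec with no hyphen.

```python
from typing import Optional, Sequence


def build_metric_key(
    analysis_type: str,
    metric_type: str,
    *,
    prefix: Optional[str] = None,
    subtypes: Sequence[str] = (),
    layer_spec: Optional[str] = None,
    factor_spec: Optional[str] = None,
) -> str:
    """Assemble {prefix}/{analysis_type}[/{subtype}...]/{metric_type}[/{layer_spec}-{factor_spec}]."""
    parts = []
    if prefix:  # assumption: prefix may be omitted
        parts.append(prefix)
    parts.append(analysis_type)
    parts.extend(subtypes)
    parts.append(metric_type)

    # Assumption: if only one spec is present, emit it alone (no hyphen).
    suffix = "-".join(spec for spec in (layer_spec, factor_spec) if spec)
    if suffix:
        parts.append(suffix)
    return "/".join(parts)


print(build_metric_key("loss", "ema"))
print(build_metric_key("reg", "r2", prefix="acts",
                       layer_spec="L1.resid.post", factor_spec="F0"))
print(build_metric_key("reg", "overlap", prefix="acts", subtypes=("orth",),
                       layer_spec="L2.resid.pre", factor_spec="F0,1"))
```

Because the layer/factor suffix is the final path component, it lands in what MLflow treats as the metric name, and everything before it becomes the folder.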
Layer and Factor Spec Convention
Layer Spec:
- Single hook: L{n}.{hook_path} (e.g., L1.resid.pre, L1.attn.q, L1.mlp.out)
- Non-block hooks: bare name (e.g., embed, pos_embed, ln_final)
- Range: L{n}:{m} (e.g., L1:4)
- Concatenated: Lcat
Factor Spec:
- Single: F{n} (e.g., F0)
- Tuple: F{n},{m},... (e.g., F0,1 or F0,1,2)
- Range: F{n}:{m} (e.g., F0:3)
- Concatenated: Fcat
- All pairs: Fpairs
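As a rough illustration of the spec conventions above, the sketch below converts existing hook names and factor indices into the proposed short forms. `to_layer_spec` and `to_factor_spec` are hypothetical helpers. The conversion rules are my assumptions: strip the `hook_` marker, turn underscores into dots for block hooks only, and leave non-block names otherwise unchanged. Range specs are left out because the issue does not say whether L1:4 and F0:3 are inclusive.

```python
import re

# Matches block-level hook names such as "blocks.0.hook_resid_pre".
_BLOCK_HOOK = re.compile(r"^blocks\.(\d+)\.(.+)$")


def to_layer_spec(hook_name: str) -> str:
    """Convert a hook name into the proposed layer spec."""
    match = _BLOCK_HOOK.match(hook_name)
    if match is None:
        # Non-block hooks keep their bare name (assumed: drop a leading "hook_").
        return hook_name.removeprefix("hook_")
    index, path = match.groups()
    # Assumed rule: drop the "hook_" marker and use dots instead of underscores.
    path = path.replace("hook_", "").replace("_", ".")
    return f"L{index}.{path}"


def to_factor_spec(factors) -> str:
    """Convert an int, a keyword ("cat", "pairs"), or a sequence of ints."""
    if isinstance(factors, (int, str)):
        return f"F{factors}"
    return "F" + ",".join(str(f) for f in factors)


print(to_layer_spec("blocks.0.hook_resid_pre"))
print(to_layer_spec("hook_pos_embed"))
print(to_factor_spec(0), to_factor_spec((0, 1)), to_factor_spec("cat"))
```

Note that `str.removeprefix` requires Python 3.9 or later.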
Abbreviation Conventions
In addition to restructuring the metric keys, we should shorten some verbose metric and analysis names. Proposed abbreviations:
Analysis Types:
| Current | Proposed |
|---|---|
| orthogonality | orth |
Metric Types:
| Current | Proposed |
|---|---|
| participation_ratio | p_ratio |
| max_singular_value | sv_max |
| min_singular_value | sv_min |
| subspace_overlap | overlap |
| effective_rank | eff_rank |
| variance_explained | var_exp |
| n_components_95pct | nc_95 |
Factor/Layer Specs (covered above):
| Current | Proposed |
|---|---|
| factor_0 | F0 |
| concat | Fcat / Lcat |
| orthogonality_0_1 | F0,1 |
| blocks.0.hook_resid_pre | L0.resid.pre |
These abbreviations prioritize brevity while remaining recognizable. The goal is to keep the full metric key scannable at a glance.
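One simple way to apply the analysis-type and metric-type abbreviations is a lookup table that falls back to the original name. The mapping below is copied from the tables above; `ABBREVIATIONS` and `abbreviate` are hypothetical names, and names not listed (such as r2 or rmse) are assumed to pass through unchanged.

```python
# Abbreviations taken from the Analysis Types and Metric Types tables.
ABBREVIATIONS = {
    "orthogonality": "orth",
    "participation_ratio": "p_ratio",
    "max_singular_value": "sv_max",
    "min_singular_value": "sv_min",
    "subspace_overlap": "overlap",
    "effective_rank": "eff_rank",
    "variance_explained": "var_exp",
    "n_components_95pct": "nc_95",
}


def abbreviate(name: str) -> str:
    """Return the proposed short form, or the name itself if none is defined."""
    return ABBREVIATIONS.get(name, name)


for name in ("subspace_overlap", "max_singular_value", "rmse"):
    print(name, "->", abbreviate(name))
```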
Examples of Proposed Convention
📁 loss
├─ ema
├─ ma
📁 acts/reg/r2
├─ L1.resid.post-F0
├─ L1.resid.post-F1
📁 acts/reg/orth/overlap
├─ L2.resid.pre-F0,1
├─ L2.resid.pre-F0,2
Alternative Direction
Have the folder name contain layer- and factor-related information while the “metric name” contains metric-related information. I'm not sure what we would do for metrics that aren't layer/factor-specific (e.g., loss/ma), but maybe we just keep those as is. As we scale, I think this would also suffer from worse folder explosion than the proposal above, since the folder count would again grow with the number of layers and factors. Ultimately, though, this comes down to what feels most natural to most people (finding metrics by type vs. finding metrics by location), and feedback is welcome on whether this convention (or something else entirely!) would feel more natural.