gev_fit_metric

Fit a Generalized Extreme Value (GEV) distribution to calculate a cutoff value.

extract_peaks_and_statistics_gev(mip, scaled_mip, best_psi, best_theta, best_phi, best_defocus, correlation_average, correlation_variance, total_correlation_positions, false_positives=1.0, mask_radius=5.0)

Returns peak locations, heights, and pose stats from match template results.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mip | Tensor | Maximum intensity projection of the match template results. | required |
| scaled_mip | Tensor | Scaled maximum intensity projection of the match template results. | required |
| best_psi | Tensor | Best psi angles for each pixel. | required |
| best_theta | Tensor | Best theta angles for each pixel. | required |
| best_phi | Tensor | Best phi angles for each pixel. | required |
| best_defocus | Tensor | Best relative defocus values for each pixel. | required |
| correlation_average | Tensor | Average correlation value for each pixel. | required |
| correlation_variance | Tensor | Variance of the correlation values for each pixel. | required |
| total_correlation_positions | int | Total number of correlation positions calculated during template matching. Stored unchanged in the returned peaks as total_correlations. | required |
| num_bins | int | Number of bins to use for the histogram when fitting the GEV distribution. Documented default is 128. Note that this parameter does not appear in the function signature above. | not in signature |
| false_positives | float | Number of false positives to allow in the image (over all pixels). A value of 1.0 corresponds to a single false positive. | 1.0 |
| mask_radius | float | Radius of the mask to apply around each peak, in pixels. | 5.0 |

Returns:

Type Description
MatchTemplatePeaks

Named tuple containing the peak locations, heights, and pose statistics.
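
To illustrate how a z-score cutoff and mask_radius can interact during peak picking, here is a minimal pure-Python sketch. It repeatedly accepts the highest remaining pixel above the cutoff and suppresses every pixel within mask_radius of it. The helper name `greedy_peaks` and the toy image are hypothetical, and this is not necessarily how find_peaks_from_zscore is implemented; it only conveys the general idea.

```python
def greedy_peaks(image, cutoff, mask_radius):
    """Pick peaks above ``cutoff``, suppressing neighbors within ``mask_radius``.

    Hypothetical helper for illustration; ``image`` is a list of rows.
    """
    height, width = len(image), len(image[0])
    # Work on a copy so the caller's data is left untouched
    work = [list(row) for row in image]
    peaks = []
    while True:
        # Locate the largest remaining value and its (y, x) position
        best_val, best_y, best_x = max(
            (work[y][x], y, x) for y in range(height) for x in range(width)
        )
        if best_val <= cutoff:
            break
        peaks.append((best_y, best_x))
        # Mask out a disk around the accepted peak so it is not picked again
        for y in range(height):
            for x in range(width):
                if (y - best_y) ** 2 + (x - best_x) ** 2 <= mask_radius**2:
                    work[y][x] = float("-inf")
    return peaks


# Toy "scaled MIP" with two well-separated peaks and one adjacent high pixel
image = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 9.0, 8.5, 0.1, 0.2],
    [0.1, 0.3, 0.2, 0.1, 7.9],
    [0.0, 0.1, 0.2, 0.3, 0.1],
]
peaks = greedy_peaks(image, cutoff=7.5, mask_radius=1.5)
```

In this toy case the 8.5 pixel sits next to the 9.0 peak and is suppressed by the mask, so only two peaks are reported even though three pixels exceed the cutoff.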

Source code in src/leopard_em/analysis/gev_fit_metric.py
def extract_peaks_and_statistics_gev(
    mip: torch.Tensor,
    scaled_mip: torch.Tensor,
    best_psi: torch.Tensor,
    best_theta: torch.Tensor,
    best_phi: torch.Tensor,
    best_defocus: torch.Tensor,
    correlation_average: torch.Tensor,
    correlation_variance: torch.Tensor,
    total_correlation_positions: int,
    false_positives: float = 1.0,
    mask_radius: float = 5.0,
) -> MatchTemplatePeaks:
    """Returns peak locations, heights, and pose stats from match template results.

    Parameters
    ----------
    mip : torch.Tensor
        Maximum intensity projection of the match template results.
    scaled_mip : torch.Tensor
        Scaled maximum intensity projection of the match template results.
    best_psi : torch.Tensor
        Best psi angles for each pixel.
    best_theta : torch.Tensor
        Best theta angles for each pixel.
    best_phi : torch.Tensor
        Best phi angles for each pixel.
    best_defocus : torch.Tensor
        Best relative defocus values for each pixel.
    correlation_average : torch.Tensor
        Average correlation value for each pixel.
    correlation_variance : torch.Tensor
        Variance of the correlation values for each pixel.
    total_correlation_positions : int
        Total number of correlation positions calculated during template matching.
        Stored unchanged in the returned peaks as `total_correlations`.
    num_bins : int, optional
        Number of bins to use for histogram when fitting GEV distribution. Default is
        128. NOTE: This parameter is not part of the current function signature.
    false_positives : float, optional
        Number of false positives to allow in the image (over all pixels). Default is
        1.0 which corresponds to a single false-positive.
    mask_radius : float, optional
        Radius of the mask to apply around the peak, in units of pixels. Default is 5.0.

    Returns
    -------
    MatchTemplatePeaks
        Named tuple containing the peak locations, heights, and pose statistics.
    """
    z_score_cutoff = gev_zscore_cutoff(scaled_mip, false_positives=false_positives)

    # Find the peak locations only in the scaled MIP
    pos_y, pos_x = find_peaks_from_zscore(scaled_mip, z_score_cutoff, mask_radius)

    # Raise warning if no peaks are found
    if len(pos_y) == 0:
        warnings.warn("No peaks found using the fitted GEV distribution.", stacklevel=2)

    # Raise warning if a very large number of peaks are found
    if len(pos_y) > LARGE_PEAK_WARNING_VALUE:
        warnings.warn(
            f"Found {len(pos_y)} peaks using the fitted GEV distribution. This is a "
            "lot and could indicate a poor fit to the data. You should inspect the fit "
            "before using these results. See the online documentation for details.",
            stacklevel=2,
        )

    # Extract peak heights, orientations, etc. from other maps
    return MatchTemplatePeaks(
        pos_y=pos_y,
        pos_x=pos_x,
        mip=mip[pos_y, pos_x],
        scaled_mip=scaled_mip[pos_y, pos_x],
        psi=best_psi[pos_y, pos_x],
        theta=best_theta[pos_y, pos_x],
        phi=best_phi[pos_y, pos_x],
        relative_defocus=best_defocus[pos_y, pos_x],
        correlation_mean=correlation_average[pos_y, pos_x],
        correlation_variance=correlation_variance[pos_y, pos_x],
        total_correlations=total_correlation_positions,
    )

fit_gev_to_zscore(zscore_map, min_zscore_value=None, max_zscore_value=8.5, num_samples=1000000)

Helper function to fit a GEV distribution to the z-score map.

See gev_zscore_cutoff for more details.
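
The data selection that happens before the fit (restricting values to a z-score window, then randomly subsampling without replacement) can be sketched in pure Python as follows. The function name `select_fit_samples` and its `seed` argument are hypothetical additions for illustration; the actual helper operates on NumPy arrays and passes the selected data to SciPy's `genextreme.fit`.

```python
import random


def select_fit_samples(
    values, min_value=None, max_value=8.5, num_samples=1_000_000, seed=None
):
    """Filter values to [min_value, max_value] and randomly subsample them.

    Hypothetical pure-Python analogue of the data selection step; the
    ``seed`` argument is an addition for reproducibility.
    """
    # Fall back to the data range when a bound is not given
    if min_value is None:
        min_value = min(values)
    if max_value is None:
        max_value = max(values)
    if num_samples is None or num_samples > len(values):
        num_samples = len(values)

    # Keep only values inside the fitting window
    kept = [v for v in values if min_value <= v <= max_value]

    # Subsample without replacement only when there is more data than requested
    if len(kept) > num_samples:
        rng = random.Random(seed)
        kept = rng.sample(kept, num_samples)
    return kept


values = [0.5 * i for i in range(30)]  # 0.0, 0.5, ..., 14.5
subset = select_fit_samples(values, max_value=8.5, num_samples=10, seed=0)
```

Capping the window at max_zscore_value keeps strong true-positive peaks out of the fit, so the fitted distribution describes the background rather than the signal.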

Source code in src/leopard_em/analysis/gev_fit_metric.py
def fit_gev_to_zscore(
    zscore_map: torch.Tensor,
    min_zscore_value: Optional[float] = None,
    max_zscore_value: Optional[float] = 8.5,
    num_samples: Optional[int] = 1_000_000,
) -> tuple[rv_frozen, tuple[float, float, float]]:
    """Helper function to fit a GEV distribution to the z-score map.

    See `gev_zscore_cutoff` for more details.
    """
    if isinstance(zscore_map, torch.Tensor):
        zscore_map = zscore_map.cpu().numpy()

    # Logic for handling optional parameters
    if min_zscore_value is None:
        min_zscore_value = zscore_map.min().item()
    if max_zscore_value is None:
        max_zscore_value = zscore_map.max().item()
    if num_samples is None or num_samples > zscore_map.size:
        num_samples = zscore_map.size

    # Get flattened and filtered data to fit the GEV distribution
    data = zscore_map.flatten()
    data = data[(data >= min_zscore_value) & (data <= max_zscore_value)]
    if len(data) > num_samples:  # type: ignore
        data = np.random.choice(data, num_samples, replace=False)

    # Fit the parameters of the GEV distribution
    shape, loc, scale = genextreme.fit(data)

    return genextreme(shape, loc=loc, scale=scale), (shape, loc, scale)

gev_zscore_cutoff(zscore_map, false_positives=1.0, min_zscore_value=None, max_zscore_value=8.5, num_samples=1000000)

Calculate the z-score cutoff value by fitting a GEV distribution to the z-score map.

NOTE: This function can take tens to hundreds of seconds to run when the z-score map contains a large number of pixels. The num_samples parameter can be set so that only a random subset of the z-score map is used for the fit.

NOTE: Fitting with ~1,000,000 points seems to capture the GEV behavior sufficiently well. Fit results may vary depending on the data; inspecting the quality of your fit is recommended.

NOTE: The max_zscore_value parameter defaults to 8.5, which performs well for a full orientation search (1.5 degrees in-plane and 2.5 degrees out-of-plane). If you change the search space parameters, this value will likely need to be adjusted from its default.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| zscore_map | Tensor | The z-score map to fit the GEV distribution to. | required |
| false_positives | Optional[float] | The number of false positives to allow in the image (over all pixels). A value of 1.0 corresponds to a single false positive. | 1.0 |
| min_zscore_value | Optional[float] | The minimum z-score value to consider for fitting the GEV distribution. If None, the minimum value in the z-score map is used. | None |
| max_zscore_value | Optional[float] | The maximum z-score value to consider for fitting the GEV distribution; all values above it are ignored. If None, the maximum value in the z-score map is used. | 8.5 |
| num_samples | Optional[int] | The maximum number of samples to use for fitting the GEV distribution. If None, or if larger than the number of pixels in the z-score map, it is set to the number of pixels. With the default, at most 1,000,000 values are randomly sampled (without replacement) from the values that fall within the min/max window. | 1000000 |

Returns:

Type Description
float

The z-score cutoff value for the GEV distribution.
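
The cutoff is the z-score at which the fitted distribution's survival function equals false_positives divided by the number of pixels. As a minimal sketch of that step, the inverse survival function of a GEV can be written in closed form using the shape-parameter convention of SciPy's `genextreme`. The function names and the parameter values below are hypothetical and chosen only for illustration; in the library the inversion is done by calling `isf` on the fitted SciPy distribution.

```python
import math


def gev_isf(q, shape, loc, scale):
    """Inverse survival function of a GEV in SciPy's ``genextreme`` convention.

    Solves 1 - F(x) = q, where F(x) = exp(-(1 - shape * z) ** (1 / shape))
    and z = (x - loc) / scale (Gumbel limit when shape == 0).
    """
    # t = -log(1 - q), computed stably for very small q
    t = -math.log1p(-q)
    if shape == 0.0:
        z = -math.log(t)
    else:
        z = (1.0 - t**shape) / shape
    return loc + scale * z


def cutoff_from_false_positives(false_positives, num_pixels, shape, loc, scale):
    """Z-score cutoff that admits ``false_positives`` expected hits over all pixels."""
    # Per-pixel probability of exceeding the cutoff by chance
    per_pixel_probability = false_positives / num_pixels
    return gev_isf(per_pixel_probability, shape, loc, scale)


# Made-up GEV parameters and a 4096 x 4096 map, purely for illustration
cutoff = cutoff_from_false_positives(
    1.0, 4096 * 4096, shape=0.0, loc=5.0, scale=0.3
)
```

Allowing more false positives raises the per-pixel probability and therefore lowers the cutoff, while a larger image lowers the probability and raises it.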

Source code in src/leopard_em/analysis/gev_fit_metric.py
def gev_zscore_cutoff(
    zscore_map: torch.Tensor,
    false_positives: Optional[float] = 1.0,
    min_zscore_value: Optional[float] = None,
    max_zscore_value: Optional[float] = 8.5,
    num_samples: Optional[int] = 1_000_000,
) -> float:
    """Calculate the z-score cutoff value by fitting a GEV distn to the z-score map.

    NOTE: This function can take on the order of 10s to 100s of seconds to run when
    there are a large number of pixels in the z-score map. The 'num_samples' parameter
    can be set to fit only using a random subset of the z-score map.

    NOTE: Fitting with ~1,000,000 points seems to sufficiently capture the GEV behavior.
    Your fit results may vary depending on the data; inspecting the quality of your fit
    is recommended.

    NOTE: The 'max_zscore_value' parameter is set to 8.5 by default which performs well
    for a full orientation search (1.5 degrees in-plane and 2.5 degrees out-of-plane).
    Adjusting the search space parameters will require adjustment from the default
    value.

    Parameters
    ----------
    zscore_map: torch.Tensor
        The z-score map to fit the GEV distribution to.
    false_positives: float, optional
        The number of false positives to allow in the image (over all pixels). Default
        is 1.0 which corresponds to a single false-positive.
    min_zscore_value: float, optional
        The minimum z-score value to consider for fitting the GEV distribution. If
        None, the minimum value in the z-score map is used.
    max_zscore_value: float, optional
        The maximum z-score value to consider for fitting the GEV distribution. If
        None, the maximum value in the z-score map is used. Default is 8.5 and all
        values above this are ignored.
    num_samples: int, optional
        The number of samples to use for fitting the GEV distribution. If None, the
        number of samples is set to the number of pixels in the z-score map. The default
        is 1,000,000, and 1 million random pixels are sampled from the z-score map.

    Returns
    -------
    float
        The z-score cutoff value for the GEV distribution.
    """
    if isinstance(zscore_map, torch.Tensor):
        zscore_map = zscore_map.cpu().numpy()

    gev_opt, _ = fit_gev_to_zscore(
        zscore_map,
        min_zscore_value=min_zscore_value,
        max_zscore_value=max_zscore_value,
        num_samples=num_samples,
    )

    # Per-pixel false-positive probability; inverting the survival function at this
    # probability gives the z-score cutoff
    false_positive_density = false_positives / zscore_map.size
    cutoff = gev_opt.isf(false_positive_density)

    return float(cutoff)