report_metrics_by_site
- uniharmony.metrics.report_metrics_by_site(y_true: ndarray[tuple[Any, ...], dtype[_ScalarT]], y_pred: ndarray[tuple[Any, ...], dtype[_ScalarT]], sites: ndarray[tuple[Any, ...], dtype[_ScalarT]], metrics: Callable | list[Callable], metric_kwargs: dict[str, Any] | list[dict[str, Any]] | None = None, overall_performance: bool = True, skip_empty_sites: bool = True) -> dict[str, dict[str | int, float]]
Compute one or more metrics stratified by site.
Accepts either a single metric function or a sequence of metrics. Each metric can receive its own set of keyword arguments via metric_kwargs. If y_pred contains continuous scores but a metric requires discrete predictions, the scores are automatically binarized using the threshold keyword argument for that metric (default: 0.5).
- Parameters:
- y_true : np.ndarray
Ground-truth (correct) target values.
- y_pred : np.ndarray
Estimated targets as returned by a classifier, or probability estimates / decision function outputs.
- sites : np.ndarray
Site identifiers for stratification. Can be strings or integers.
- metrics : callable or list of callable
Metric function or list of metric functions to compute (e.g., from sklearn.metrics). Pass a single callable for one metric, or a sequence for multiple metrics.
- metric_kwargs : dict or list of dict or None, optional (default None)
Keyword arguments for each metric. If a single dict, it is passed to all metrics. If a list, metric_kwargs[i] is passed to metrics[i], and the list must have the same length as metrics. Include threshold (default: 0.5) for metrics that require discrete predictions when y_pred contains continuous scores.
- overall_performance : bool, optional (default True)
If True, include an "overall" key for each metric, computed across all sites.
- skip_empty_sites : bool, optional (default True)
If True, skip sites with no samples.
- Returns:
- dict
Dictionary mapping metric names to site-wise results. Each inner dictionary maps site identifiers to metric values. When a single metric is passed, the result contains one top-level key (the metric's __name__).
- Raises:
- TypeError
If inputs have incorrect types.
- ValueError
If the length of metric_kwargs does not match the length of metrics, or if the input arrays have mismatched lengths.
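The binarize-then-stratify behaviour described above can be illustrated with a minimal sketch. This is not uniharmony's implementation: per_site_metric and accuracy are hypothetical stand-ins written only for this illustration, the data is made up, and the use of `>=` when applying the threshold is an assumption (the library may compare with `>`).

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of matching labels (stand-in for sklearn.metrics.accuracy_score)."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def per_site_metric(y_true, y_pred, sites, metric, threshold=None, overall=True):
    """Hypothetical helper: evaluate `metric` once per site.

    Illustration only, not the library's code. When `threshold` is given,
    scores are binarized with `>=` (an assumption about the comparison).
    """
    y_true, y_pred, sites = map(np.asarray, (y_true, y_pred, sites))
    if not (len(y_true) == len(y_pred) == len(sites)):
        raise ValueError("y_true, y_pred and sites must have the same length")
    if threshold is not None:
        # Turn continuous scores into 0/1 predictions.
        y_pred = (y_pred >= threshold).astype(int)

    results = {}
    if overall:
        # Metric computed on all samples, regardless of site.
        results["overall"] = metric(y_true, y_pred)
    for site in np.unique(sites):
        mask = sites == site  # boolean mask selecting one site's samples
        results[site.item()] = metric(y_true[mask], y_pred[mask])
    return {metric.__name__: results}


# Made-up data: four samples from site "A", four from site "B".
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_scores = np.array([0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.55])
sites = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

report = per_site_metric(y_true, y_scores, sites, accuracy, threshold=0.5)
print(report)  # {'accuracy': {'overall': 0.625, 'A': 0.75, 'B': 0.5}}
```

The boolean mask keeps the three arrays aligned, so each site's metric only ever sees that site's samples. np.unique also guarantees that every site it returns has at least one sample, so this sketch never encounters an empty site.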
Examples
Single metric:
>>> import numpy as np
>>> from sklearn.metrics import accuracy_score
>>> y_true = np.array([0, 1, 0, 1, 0, 1])
>>> y_scores = np.array([0.1, 0.9, 0.2, 0.4, 0.3, 0.8])
>>> sites = np.array(["A", "A", "B", "B", "A", "B"])
>>> report_metrics_by_site(y_true, y_scores, sites, accuracy_score)
{'accuracy_score': {'A': 1.0, 'B': 0.5}}
Single metric with custom threshold:
>>> report_metrics_by_site(
...     y_true, y_scores, sites, accuracy_score, metric_kwargs={"threshold": 0.3}
... )
{'accuracy_score': {'A': 1.0, 'B': 0.5}}
Multiple metrics:
>>> from sklearn.metrics import roc_auc_score, f1_score
>>> report_metrics_by_site(
...     y_true,
...     y_scores,
...     sites,
...     metrics=[roc_auc_score, accuracy_score, f1_score],
...     metric_kwargs=[
...         {},
...         {"threshold": 0.5},
...         {"threshold": 0.5, "average": "macro"},
...     ],
...     overall_performance=True,
... )
{'roc_auc_score': {'overall': 0.833, 'A': 1.0, 'B': 0.5},
 'accuracy_score': {'overall': 0.833, 'A': 1.0, 'B': 0.5},
 'f1_score': {'overall': 0.833, 'A': 1.0, 'B': 0.5}}
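The multi-metric call above pairs metric_kwargs[i] with metrics[i]. One plausible way to express that pairing, along with the single-dict broadcast and the length check from the Raises section, is sketched below. pair_metrics_with_kwargs, metric_a and metric_b are hypothetical names used only here. Separating threshold from the remaining keyword arguments is an assumption about how the option could be kept away from the metric function itself, not a description of the library's internals.

```python
def pair_metrics_with_kwargs(metrics, metric_kwargs=None):
    """Hypothetical helper: return a list of (metric, kwargs, threshold) triples."""
    if callable(metrics):
        metrics = [metrics]  # a single metric becomes a one-element list
    metrics = list(metrics)

    if metric_kwargs is None:
        metric_kwargs = {}
    if isinstance(metric_kwargs, dict):
        # One dict is shared by every metric; copy so later edits stay local.
        kwargs_list = [dict(metric_kwargs) for _ in metrics]
    else:
        kwargs_list = [dict(kw) for kw in metric_kwargs]
        if len(kwargs_list) != len(metrics):
            raise ValueError(
                f"got {len(kwargs_list)} kwargs dicts for {len(metrics)} metrics"
            )

    plan = []
    for metric, kwargs in zip(metrics, kwargs_list):
        # Assumption: "threshold" is consumed here and not forwarded to the metric.
        threshold = kwargs.pop("threshold", 0.5)
        plan.append((metric, kwargs, threshold))
    return plan


def metric_a(y_true, y_pred):
    """Placeholder metric used only to demonstrate the pairing."""
    return 0.0


def metric_b(y_true, y_pred, average="binary"):
    """Placeholder metric that accepts an extra keyword argument."""
    return 0.0


plan = pair_metrics_with_kwargs(
    [metric_a, metric_b],
    [{}, {"threshold": 0.3, "average": "macro"}],
)
for metric, kwargs, threshold in plan:
    print(metric.__name__, kwargs, threshold)
# metric_a {} 0.5
# metric_b {'average': 'macro'} 0.3
```

Copying each dict before popping threshold leaves the caller's metric_kwargs untouched, which matters when the same dict is reused across calls.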