Encoding Quality Metrics¶
quprep.metrics
¶
Data-driven circuit quality metrics for QuPrep encodings.
All metrics operate on the actual output states produced by encoding data through a circuit. No quantum hardware or external framework is required: a lightweight numpy statevector simulator handles all computation.
Simulation is limited to circuits with n_qubits ≤ metrics.MAX_QUBITS (default 12). For larger circuits, expressibility, entanglement_capability, and kernel_alignment return None.
Functions:
| Name | Description |
|---|---|
| expressibility | KL divergence of the output-state fidelity distribution from the Haar measure. Lower = more expressive. |
| entanglement_capability | Average Meyer-Wallach entanglement measure over sampled data points. Higher = more entangled. 0 for product-state encodings. |
| kernel_alignment | Normalised Frobenius alignment of the quantum kernel with class labels. Higher = better class separation. Requires labelled data. |
| score_encoding | Compute all three metrics and return an |
Classes¶
BarrenPlateauReport(encoding, n_qubits, circuit_depth, gradient_variance, risk_level, mitigations=list())
dataclass
¶
Barren plateau risk report for a quantum encoding.
Attributes:
| Name | Type | Description |
|---|---|---|
| encoding | str | Encoder name (lower-case, without "Encoder" suffix). |
| n_qubits | int | Number of qubits determined by cost estimation. |
| circuit_depth | int | Estimated circuit depth. |
| gradient_variance | float | Analytical upper bound on the gradient variance, derived from the formula for the specified cost_type. No simulation is performed. |
| risk_level | str | One of |
| mitigations | list[str] | Suggested mitigation strategies (empty when risk is "none"). |
EncoderMetrics(encoding, expressibility, entanglement_capability, kernel_alignment, n_qubits)
dataclass
¶
Data-driven quality metrics for a parameterized encoding on a dataset.
Attributes:
| Name | Type | Description |
|---|---|---|
| encoding | str | Encoder name (e.g. |
| expressibility | float or None | KL divergence from the Haar distribution. Lower = more expressive. |
| entanglement_capability | float or None | Average Meyer-Wallach entanglement measure ∈ [0, 1]. Higher = more entangled. 0 for product-state encodings. |
| kernel_alignment | float or None | Normalised Frobenius alignment of the quantum kernel with class labels, ∈ [−1, 1]. Higher = better class separation. |
| n_qubits | int | Qubit count used for simulation. |
SensitivityResult(feature_names, scores, epsilon, n_samples)
dataclass
¶
Per-feature sensitivity scores for an encoding.
Attributes:
| Name | Type | Description |
|---|---|---|
| feature_names | list[str] | |
| scores | ndarray | Sensitivity score per feature: mean state infidelity |
| epsilon | float | Perturbation magnitude used. |
| n_samples | int | Number of dataset samples evaluated. |
Functions¶
detect_barren_plateau(encoder, dataset, *, cost_type='global')
¶
Analytically estimate barren plateau risk for a quantum encoding.
No circuit simulation is performed. Risk is derived from qubit count using the theoretical gradient variance bounds:
- Global cost (McClean et al. 2018): Var[∂C/∂θ] ≤ 2^(1−n), exponential decay with qubit count.
- Local cost (Cerezo et al. 2021): Var[∂C/∂θ] ≈ 1/n², polynomial decay; strongly preferred for large circuits.
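To get a feel for how quickly the two bounds diverge, the formulas above can be evaluated directly. The sketch below is illustrative only: `gradient_variance_bound` is a hypothetical helper, not part of QuPrep, and it does not reproduce how QuPrep maps the value to a risk level.

```python
def gradient_variance_bound(n_qubits: int, cost_type: str = "global") -> float:
    """Evaluate the gradient-variance formula quoted above (hypothetical helper)."""
    if n_qubits < 1:
        raise ValueError("n_qubits must be a positive integer")
    if cost_type == "global":
        return 2.0 ** (1 - n_qubits)   # exponential decay with qubit count
    if cost_type == "local":
        return 1.0 / n_qubits ** 2     # polynomial decay with qubit count
    raise ValueError(f"unknown cost_type: {cost_type!r}")


# Compare both bounds for a few circuit sizes.
for n in (4, 8, 16):
    g = gradient_variance_bound(n, "global")
    loc = gradient_variance_bound(n, "local")
    print(f"n={n:2d}  global={g:.3e}  local={loc:.3e}")
```

At 16 qubits the global bound is already roughly two orders of magnitude below the local estimate, which is why the local cost is the preferred choice as circuits grow.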
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | BaseEncoder | A QuPrep encoder. Does not need to be fitted. | required |
| dataset | Dataset | Used only to determine qubit count and circuit depth via cost estimation. | required |
| cost_type | ('global', 'local') | Cost function type used during training. | "global" |
Returns:
| Type | Description |
|---|---|
| BarrenPlateauReport | |
Examples:
>>> import numpy as np
>>> import quprep as qd
>>> from quprep.core.dataset import Dataset
>>> ds = Dataset(data=np.random.default_rng(0).uniform(0, 1, (50, 8)))
>>> report = qd.detect_barren_plateau(qd.IQPEncoder(), ds)
>>> print(report.risk_level)
mild
References
McClean J.R. et al. "Barren plateaus in quantum neural network training landscapes." Nature Communications 9, 4812 (2018).
Cerezo M. et al. "Cost function dependent barren plateaus in shallow parametrized quantum circuits." Nature Communications 12, 1791 (2021).
encoding_sensitivity(encoder, dataset, epsilon=0.01, n_samples=20, seed=42)
¶
Measure how much each feature influences the encoded quantum state.
Perturbs each feature independently by epsilon and measures the mean
state infidelity (1 − |⟨ψ|ψ'⟩|²) across n_samples data points.
Features with higher scores have more influence on the circuit output —
useful for debugging encodings and identifying which features the quantum
model is most sensitive to.
Only works for encodings supported by the numpy statevector simulator
(n_qubits ≤ 12). Returns zero scores for unsupported encodings or
when simulation fails.
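The perturb-and-compare procedure described above can be sketched without QuPrep. This is a minimal illustration under stated assumptions: `toy_angle_state` is a hypothetical stand-in encoder that applies RY(scale_i · x_i) to qubit i, and the per-feature `scales` are invented so that the features differ in influence. It is not the library's simulator.

```python
import math


def toy_angle_state(x, scales):
    """Product state of RY(scale_i * x_i)|0> per qubit, as a flat list of real amplitudes."""
    state = [1.0]
    for xi, si in zip(x, scales):
        theta = si * xi
        qubit = (math.cos(theta / 2), math.sin(theta / 2))
        state = [a * b for a in state for b in qubit]  # Kronecker product
    return state


def infidelity(psi, phi):
    """1 - |<psi|phi>|^2 for real-valued state vectors."""
    overlap = sum(a * b for a, b in zip(psi, phi))
    return 1.0 - overlap ** 2


def toy_sensitivity(rows, scales, epsilon=0.01):
    """Mean infidelity per feature after shifting that feature by epsilon."""
    scores = []
    for j in range(len(scales)):
        total = 0.0
        for x in rows:
            shifted = list(x)
            shifted[j] += epsilon  # perturb one feature, leave the rest untouched
            total += infidelity(toy_angle_state(x, scales),
                                toy_angle_state(shifted, scales))
        scores.append(total / len(rows))
    return scores


rows = [[0.1, 0.5, 0.9], [0.3, 0.2, 0.7]]
scores = toy_sensitivity(rows, scales=[1.0, 2.0, 4.0])
print(scores)
```

For this toy encoder the score of feature j works out to sin²(scale_j · ε / 2), so features that rotate their qubit faster score higher.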
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | BaseEncoder | Fitted encoder instance. | required |
| dataset | Dataset | Dataset to sample from. | required |
| epsilon | float | Perturbation magnitude (absolute, in the feature's current scale). Default 0.01. | 0.01 |
| n_samples | int | Number of dataset samples to average over. Default 20. | 20 |
| seed | int | Random seed for sample selection. Default 42. | 42 |
Returns:
| Type | Description |
|---|---|
| SensitivityResult | |
entanglement_capability(encoder, dataset, *, n_samples=200, seed=None)
¶
Estimate the entanglement capability of an encoding.
Returns the average Meyer-Wallach measure over randomly sampled data points. Ranges from 0 (product state, e.g. plain angle encoding) to 1 (maximally entangled).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | encoder instance | | required |
| dataset | Dataset | | required |
| n_samples | int | Default 200. | 200 |
| seed | int | | None |
Returns:
| Type | Description |
|---|---|
| float or None | Average MW measure ∈ [0, 1]. |
References
Sim et al. (2019) https://doi.org/10.1002/qute.201900070
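For a single pure state on n qubits, the Meyer-Wallach measure can be written as Q = 2 · (1 − (1/n) Σ_k Tr ρ_k²), where ρ_k is the reduced density matrix of qubit k. The sketch below computes it for one statevector using only the standard library. `meyer_wallach` is a hypothetical helper written for illustration, not QuPrep's implementation; the function documented above additionally averages over sampled data points.

```python
import math


def meyer_wallach(state, n_qubits):
    """Meyer-Wallach measure Q = 2 * (1 - mean single-qubit purity) of a pure state."""
    if len(state) != 2 ** n_qubits:
        raise ValueError("state length must equal 2 ** n_qubits")
    purity_sum = 0.0
    for k in range(n_qubits):
        mask = 1 << k
        r00 = r11 = 0.0
        r01 = 0j
        for idx in range(len(state)):
            if idx & mask:
                continue  # pair each index having bit k = 0 with its bit k = 1 partner
            a0 = complex(state[idx])
            a1 = complex(state[idx | mask])
            r00 += abs(a0) ** 2
            r11 += abs(a1) ** 2
            r01 += a0 * a1.conjugate()
        # Tr(rho_k^2) for a 2x2 Hermitian matrix
        purity_sum += r00 ** 2 + r11 ** 2 + 2 * abs(r01) ** 2
    return 2.0 * (1.0 - purity_sum / n_qubits)


s = 1 / math.sqrt(2)
q_product = meyer_wallach([1, 0, 0, 0], 2)   # |00>
q_bell = meyer_wallach([s, 0, 0, s], 2)      # (|00> + |11>) / sqrt(2)
print(q_product, q_bell)
```

The product state gives 0 and the Bell state gives 1 (up to floating-point rounding), matching the two ends of the range described above.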
kernel_alignment(encoder, dataset, *, max_samples=300, seed=None)
¶
Compute the normalised kernel alignment between the quantum kernel and labels.
Measures how well the encoding separates classes by comparing the quantum kernel matrix K (where K[i,j] = |⟨ψ(xᵢ)|ψ(xⱼ)⟩|²) to the ideal label kernel K_y (where K_y[i,j] = yᵢ·yⱼ).
The alignment is:
$$A(K, K_y) = \frac{\langle K, K_y \rangle_F}{\|K\|_F \, \|K_y\|_F}$$
Higher values indicate the encoding separates classes better.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | encoder instance | A fitted QuPrep encoder. | required |
| dataset | Dataset | Must have | required |
| max_samples | int | Subsample the dataset to at most this many points for efficiency. Default 300. | 300 |
| seed | int | | None |
Returns:
| Type | Description |
|---|---|
| float or None | Alignment score ∈ [−1, 1]. |
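The alignment formula can be checked by hand on a small kernel matrix. The following sketch assumes binary labels encoded as ±1, which the page does not specify, and takes a precomputed kernel matrix as input. `alignment_from_matrix` is a hypothetical helper, not the QuPrep function.

```python
import math


def alignment_from_matrix(K, y):
    """Normalised Frobenius alignment between K and the label kernel y_i * y_j."""
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    ky_norm = math.sqrt(sum((y[i] * y[j]) ** 2 for i in range(n) for j in range(n)))
    if k_norm == 0.0 or ky_norm == 0.0:
        raise ValueError("alignment is undefined for an all-zero kernel")
    return inner / (k_norm * ky_norm)


y = [1, 1, -1, -1]  # assumed +/-1 label encoding

# Fidelity 1 within a class, 0 across classes.
K_block = [[1, 1, 0, 0],
           [1, 1, 0, 0],
           [0, 0, 1, 1],
           [0, 0, 1, 1]]

# Every state orthogonal to every other: the kernel carries no class structure.
K_identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

a_block = alignment_from_matrix(K_block, y)
a_identity = alignment_from_matrix(K_identity, y)
print(a_block, a_identity)
```

Because fidelities are non-negative, even the perfectly separating block kernel stays below 1 against ±1 labels (about 0.707 here), while the identity kernel scores 0.5.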
score_encoding(encoder, dataset, *, n_samples=200, seed=None)
¶
Compute all data-driven quality metrics for one encoder on a dataset.
Encoders that require fitting (e.g. RandomFourierEncoder) are
automatically fitted on the dataset before metric computation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | encoder instance | | required |
| dataset | Dataset | | required |
| n_samples | int | Samples used for expressibility and entanglement. Default 200. | 200 |
| seed | int | | None |
Returns:
| Type | Description |
|---|---|
| EncoderMetrics | |
Expressibility and entanglement capability metrics (Sim et al. 2019).
References
Sim S. et al. "Expressibility and Entangling Capability of Parameterized Quantum Circuits for Hybrid Quantum-Classical Algorithms." Advanced Quantum Technologies 2(12), 2019. https://doi.org/10.1002/qute.201900070
Functions¶
expressibility(encoder, dataset, *, n_samples=500, n_bins=75, seed=None)
¶
Estimate the expressibility of an encoding as KL divergence from Haar.
A lower value indicates a more expressive circuit (closer to the uniformly-random Haar distribution over the Hilbert space).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| encoder | encoder instance | A fitted QuPrep encoder (e.g. | required |
| dataset | Dataset | Source data; rows are sampled to build the fidelity distribution. | required |
| n_samples | int | Number of data rows to sample for fidelity estimation. Default 500. | 500 |
| n_bins | int | Number of histogram bins for the KL divergence estimate. Default 75. | 75 |
| seed | int | Random seed for reproducibility. | None |
Returns:
| Type | Description |
|---|---|
| float or None | KL divergence ≥ 0. |
References
Sim et al. (2019) https://doi.org/10.1002/qute.201900070
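In Sim et al., the reference distribution is the Haar fidelity density P(F) = (N − 1)(1 − F)^(N − 2) with N = 2^n, and expressibility is the KL divergence of a histogram of sampled fidelities from it. The sketch below shows that last step for a list of fidelities that has already been computed. `kl_from_haar` and `haar_bin_probs` are hypothetical helpers, and how QuPrep pairs data rows to obtain the fidelities is not shown here.

```python
import math


def haar_bin_probs(n_qubits, n_bins):
    """Exact Haar probability mass per fidelity bin: integral of (N-1)(1-F)^(N-2)."""
    dim = 2 ** n_qubits
    edges = [i / n_bins for i in range(n_bins + 1)]
    return [(1 - a) ** (dim - 1) - (1 - b) ** (dim - 1)
            for a, b in zip(edges, edges[1:])]


def kl_from_haar(fidelities, n_qubits, n_bins=75):
    """KL(empirical || Haar) over a histogram of fidelities in [0, 1]."""
    counts = [0] * n_bins
    for f in fidelities:
        counts[min(int(f * n_bins), n_bins - 1)] += 1  # F = 1 goes in the last bin
    total = len(fidelities)
    kl = 0.0
    for c, q in zip(counts, haar_bin_probs(n_qubits, n_bins)):
        if c == 0:
            continue  # empty bins contribute nothing
        p = c / total
        kl += p * math.log(p / max(q, 1e-300))  # floor guards against underflow in q
    return kl


# One qubit: the Haar fidelity distribution is uniform on [0, 1].
uniform = [(i + 0.5) / 75 for i in range(75)]
kl_uniform = kl_from_haar(uniform, n_qubits=1)

# An encoding that always outputs the same state has fidelity 1 for every pair.
kl_constant = kl_from_haar([1.0] * 100, n_qubits=1)
print(kl_uniform, kl_constant)
```

The uniform sample matches the one-qubit Haar distribution and scores about 0, while the constant encoding puts all mass in one bin and scores ln 75 ≈ 4.32, the least expressive case for this bin count.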
Source code in quprep/metrics/expressibility.py
Quantum kernel alignment and composite encoder quality metrics.
Source code in quprep/metrics/kernel.py
Encoding sensitivity analysis — per-feature influence on the quantum state.
Classes¶
SensitivityResult(feature_names, scores, epsilon, n_samples)
dataclass
¶
Per-feature sensitivity scores for an encoding.
Attributes:
| Name | Type | Description |
|---|---|---|
| feature_names | list[str] | |
| scores | ndarray | Sensitivity score per feature: mean state infidelity |
| epsilon | float | Perturbation magnitude used. |
| n_samples | int | Number of dataset samples evaluated. |
Functions¶
most_sensitive(n=5)
¶
Return the top-n most sensitive features as (name, score) pairs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of features to return. Default 5. | 5 |
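The ranking this method describes amounts to sorting the (name, score) pairs by score and keeping the first n. A minimal sketch of equivalent behaviour follows. `top_n_features` is a hypothetical free function, and tie-breaking in the actual method may differ.

```python
def top_n_features(feature_names, scores, n=5):
    """Return the n highest-scoring (name, score) pairs, highest score first."""
    ranked = sorted(zip(feature_names, scores),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


pairs = top_n_features(["f0", "f1", "f2", "f3"], [0.02, 0.30, 0.11, 0.07], n=2)
print(pairs)
```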
Source code in quprep/metrics/sensitivity.py
Examples¶
Identify sensitive features¶
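import numpy as np
import quprep as qd
from quprep.core.dataset import Dataset

# Toy dataset: 50 samples, 8 features drawn uniformly from [0, 1)
dataset = Dataset(data=np.random.default_rng(0).uniform(0, 1, (50, 8)))
enc = qd.AngleEncoder(rotation="ry")
result = qd.encoding_sensitivity(enc, dataset, n_samples=20, seed=42)
print(result.scores)         # array of per-feature infidelity scores
print(result.feature_names)  # ["f0", "f1", ...]
# Show the three features with the largest influence on the encoded state
for name, score in result.most_sensitive(n=3):
    print(f"{name}: {score:.4f}")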