
Comparison API

API reference for quprep.compare — encoding comparison and side-by-side cost analysis.

All symbols are available on the top-level namespace:

import quprep as qd

result = qd.compare_encodings("data.csv", task="classification")
best   = result.best(prefer="nisq")

compare_encodings

quprep.compare.compare_encodings(source, *, include=None, exclude=None, task=None, qubits=None)

Compare all (or selected) encoding methods on source and return side-by-side stats.

No circuits are generated — costs are estimated analytically from the dataset shape, so this is fast even for large datasets.

Parameters:

Name Type Description Default
source str, numpy.ndarray, pandas.DataFrame, or Dataset

Input data, in the same formats accepted by quprep.Pipeline.

required
include list[str] or None

Encoder names to include. If None, all 12 encoders are compared. Valid names: "angle", "amplitude", "basis", "iqp", "reupload", "entangled_angle", "hamiltonian", "qaoa_problem", "zz_feature_map", "pauli_feature_map", "random_fourier", "tensor_product".

None
exclude list[str] or None

Encoder names to exclude. Applied after include.

None
task str or None

If provided, the recommended encoder for this task is starred in the output table. Passed to quprep.recommend. Valid values: "classification", "regression", "qaoa", "kernel", "simulation".

None
qubits int or None

Maximum qubit budget. Encoders requiring more qubits have nisq_safe set to False and a budget warning added to their row.

None

Returns:

Type Description
ComparisonResult

Examples:

Compare all encoders on a CSV, highlight the best for classification:

>>> import quprep as qd
>>> result = qd.compare_encodings("data.csv", task="classification", qubits=8)
>>> print(result)
>>> result.best(prefer="nisq")

Compare a subset:

>>> result = qd.compare_encodings(X, include=["angle", "iqp", "amplitude"])
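
The include/exclude rule described in the parameters can be illustrated without quprep installed. The sketch below is an assumption about how such filtering might behave, not quprep's implementation: select_encoders is a hypothetical helper, and keeping the encoders in the documented order is a choice made here for illustration.

```python
# Hypothetical helper: NOT part of quprep. It only illustrates the documented
# rule that `exclude` is applied after `include`.
ALL_ENCODERS = [
    "angle", "amplitude", "basis", "iqp", "reupload", "entangled_angle",
    "hamiltonian", "qaoa_problem", "zz_feature_map", "pauli_feature_map",
    "random_fourier", "tensor_product",
]


def select_encoders(include=None, exclude=None):
    """Return the encoder names to compare, in the documented order."""
    if include is None:
        chosen = list(ALL_ENCODERS)
    else:
        wanted = set(include)
        chosen = [name for name in ALL_ENCODERS if name in wanted]
    # `exclude` is applied last, so it wins over `include`.
    dropped = set(exclude or [])
    return [name for name in chosen if name not in dropped]


selected = select_encoders(include=["angle", "iqp", "amplitude"], exclude=["iqp"])
print(selected)
```

Because the exclusion step runs last, a name that appears in both lists is dropped.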

ComparisonResult

quprep.compare.ComparisonResult(rows, recommended=None) dataclass

Side-by-side cost estimates for multiple encoding methods.

Attributes:

Name Type Description
rows list[CostEstimate]

One quprep.CostEstimate per encoding method, in the order they were evaluated.

recommended str or None

Encoding name highlighted as the best match for the specified task/qubit budget. None if no task was passed to compare_encodings.

Examples:

>>> import quprep as qd
>>> result = qd.compare_encodings("data.csv", task="classification")
>>> print(result)
>>> best = result.best(prefer="nisq")

Functions

best(*, prefer='nisq')

Return the best row according to prefer.

Parameters:

Name Type Description Default
prefer ('nisq', 'depth', 'gates', 'qubits')

"nisq" (default) — prefer NISQ-safe encodings, then minimise depth. "depth" — minimise circuit depth globally. "gates" — minimise total gate count. "qubits" — minimise qubit count.

"nisq"

Returns:

Type Description
CostEstimate
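
The four prefer options can be read as four sort keys. The following is a minimal sketch under stated assumptions: Row is a hypothetical stand-in for CostEstimate, its field names are invented for illustration, and the numbers are made up rather than produced by quprep.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for CostEstimate; field names are assumptions.
    encoding: str
    n_qubits: int
    depth: int
    gate_count: int
    nisq_safe: bool


def pick_best(rows, prefer="nisq"):
    """Pick one row according to the documented `prefer` options."""
    keys = {
        # `not nisq_safe` is False for safe rows, and False sorts first,
        # so NISQ-safe rows win; depth breaks ties among them.
        "nisq": lambda r: (not r.nisq_safe, r.depth),
        "depth": lambda r: r.depth,
        "gates": lambda r: r.gate_count,
        "qubits": lambda r: r.n_qubits,
    }
    if prefer not in keys:
        raise ValueError(f"unknown prefer value: {prefer!r}")
    return min(rows, key=keys[prefer])


# Made-up rows for illustration only.
rows = [
    Row("enc_a", n_qubits=4, depth=3, gate_count=10, nisq_safe=True),
    Row("enc_b", n_qubits=2, depth=1, gate_count=30, nisq_safe=False),
    Row("enc_c", n_qubits=6, depth=5, gate_count=8, nisq_safe=True),
]

print(pick_best(rows, prefer="nisq").encoding)
print(pick_best(rows, prefer="depth").encoding)
```

With these rows, "nisq" and "depth" disagree: the shallowest row is not NISQ-safe, so "nisq" falls back to the shallowest safe row.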

to_dict()

Return all rows as a list of plain dicts (JSON-serialisable).
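
As a rough illustration of what "plain dicts (JSON-serialisable)" means in practice, the sketch below converts dataclass rows with the standard library. CostRow and rows_to_dicts are hypothetical names, and the fields are assumptions rather than the real CostEstimate layout.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CostRow:
    # Hypothetical stand-in for CostEstimate; field names are assumptions.
    encoding: str
    n_qubits: int
    nisq_safe: bool


def rows_to_dicts(rows):
    """Convert dataclass rows to plain dicts that json.dumps accepts."""
    return [asdict(row) for row in rows]


payload = json.dumps(rows_to_dicts([CostRow("enc_a", 4, True)]))
print(payload)
```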


EncodingRecommendation

quprep.core.recommender.EncodingRecommendation(method, qubits, depth, nisq_safe, reason, score, alternatives=list()) dataclass

Result of the encoding recommendation engine.

Attributes:

Name Type Description
method str

Recommended encoding method (e.g. 'iqp').

qubits int

Number of qubits required for the dataset's feature count.

depth str

Asymptotic circuit depth expression.

nisq_safe bool

Whether the encoding is suitable for current NISQ devices.

reason str

Human-readable explanation of the recommendation.

score float

Internal score (higher is better). For comparison only.

alternatives list[EncodingRecommendation]

Runner-up options ranked by score, without nested alternatives.

Functions

apply(source, *, framework='qasm', **kwargs)

Apply this recommendation to source data and export circuits.

Parameters:

Name Type Description Default
source str, Path, np.ndarray, or pd.DataFrame

Input data.

required
framework str

Export target. Default 'qasm'.

'qasm'
**kwargs

Forwarded to quprep.prepare.

{}

Returns:

Type Description
PipelineResult

recommend

quprep.core.recommender.recommend(source, *, task='classification', qubits=None, use_metrics=False, **kwargs)

Recommend the best encoding for a dataset and task.

Scores all encodings against the dataset profile (feature count, binary/continuous fraction, missing rate, sparsity, correlations, sample count) and the target task, then returns the highest-scoring option with ranked alternatives.

Parameters:

Name Type Description Default
source str, Path, np.ndarray, pd.DataFrame, or Dataset

Input data. Accepts anything the pipeline ingester accepts.

required
task str

Target task: 'classification', 'regression', 'qaoa', 'kernel', or 'simulation'. Default 'classification'.

'classification'
qubits int or None

Maximum qubit budget. Encodings that exceed this are heavily penalised.

None
use_metrics bool

When True and n_features ≤ 12, augment heuristic scores with data-driven circuit metrics (expressibility, entanglement capability, and — for labelled datasets — kernel alignment). Adds a few seconds of simulation time; disabled by default.

False
**kwargs

Reserved for future use (e.g. backend='ibm_brisbane').

{}

Returns:

Type Description
EncodingRecommendation

Top recommendation with alternatives list.

Raises:

Type Description
ValueError

If task is not one of the supported values.
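
The shape of the return value (one top pick plus flat, score-ranked alternatives) can be sketched as follows. This is an illustration only: Rec is a hypothetical, trimmed-down stand-in for EncodingRecommendation, rank is not a quprep function, and the scores are made up. How quprep actually computes scores is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Rec:
    # Hypothetical, trimmed-down stand-in for EncodingRecommendation.
    method: str
    score: float
    alternatives: list = field(default_factory=list)


def rank(scores):
    """Turn {method: score} into a top pick plus flat runner-ups."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top_method, top_score), rest = ordered[0], ordered[1:]
    # Runner-ups carry no alternatives of their own, so the result stays flat.
    runners_up = [Rec(method, score) for method, score in rest]
    return Rec(top_method, top_score, runners_up)


# Made-up scores for illustration only.
top = rank({"angle": 0.62, "iqp": 0.81, "amplitude": 0.40})
print(top.method, [alt.method for alt in top.alternatives])
```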


fingerprint_pipeline

quprep.core.fingerprint.fingerprint_pipeline(pipeline)

Compute a reproducibility fingerprint for pipeline.

The fingerprint captures the class name and constructor parameters of every configured stage (ingester, preprocessor, cleaner, reducer, normalizer, encoder, exporter, schema, drift_detector) plus the installed versions of key dependencies. The resulting SHA-256 hash is deterministic: the same configuration always produces the same hash regardless of when or where the pipeline runs.

Parameters:

Name Type Description Default
pipeline Pipeline

A Pipeline instance (fitted or unfitted).

required

Returns:

Type Description
FingerprintResult

Contains config (serialisable dict) and hash (SHA-256 hex string).

Examples:

>>> import quprep as qd
>>> pipeline = qd.Pipeline(encoder=qd.AngleEncoder(), exporter=qd.QASMExporter())
>>> fp = qd.fingerprint_pipeline(pipeline)
>>> print(fp.hash)
>>> fp.save("experiment.json")

FingerprintResult

quprep.core.fingerprint.FingerprintResult(config, hash_hex)

Output of fingerprint_pipeline.

Attributes:

Name Type Description
config dict

Full pipeline configuration (stages + dependency versions). This is the dict that was hashed — no timestamp, fully deterministic.

hash str

SHA-256 hex digest of the canonical JSON serialisation of config.
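
One common way to obtain a deterministic digest from a dict is to hash a canonical JSON string. The sketch below shows that technique with the standard library only. It is an assumption about the general approach, not quprep's code: config_hash is a hypothetical helper and the config layout is invented for illustration.

```python
import hashlib
import json


def config_hash(config):
    """SHA-256 hex digest of a canonical JSON form of `config`."""
    # sort_keys and fixed separators make the byte string independent of
    # dict insertion order and whitespace choices.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical config layout; the real structure of `config` may differ.
a = {"stages": {"encoder": {"class": "AngleEncoder", "params": {}}},
     "versions": {"numpy": "1.26.4"}}
b = {"versions": {"numpy": "1.26.4"},
     "stages": {"encoder": {"params": {}, "class": "AngleEncoder"}}}

print(config_hash(a) == config_hash(b))  # True: key order does not matter
```

Since dependency versions are part of the hashed dict, changing a version string changes the digest even when every stage is configured identically.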

Attributes

hash = hash_hex instance-attribute

config = config instance-attribute

Functions

to_json(indent=2)

Return a JSON string containing the hash, a timestamp, and config. The timestamp sits alongside config and is not part of the hashed data.

save(path, format='json')

Write the fingerprint to a file.

Parameters:

Name Type Description Default
path str

Destination file path.

required
format ('json', 'yaml')

Output format.

"json"