Pipeline & Qubit Suggestion


suggest_pipeline

quprep.core.recommender.suggest_pipeline(source, *, task='classification', qubits=None)

Suggest a full pipeline configuration for a dataset and task.

Analyses the dataset to recommend which preprocessing stages to include and which encoder to use. Extends recommend() from encoder-only recommendation to a complete pipeline suggestion.

Parameters:

Name Type Description Default
source str, Path, np.ndarray, pd.DataFrame, or Dataset

Input data. Accepts anything the pipeline ingester accepts.

required
task str

Target task: 'classification', 'regression', 'qaoa', 'kernel', or 'simulation'. Default 'classification'.

'classification'
qubits int

Maximum qubit budget. Influences reducer component count and encoder scoring.

None

Returns:

Type Description
PipelineSuggestion

Fully specified suggestion with a PipelineSuggestion.build() method that instantiates the recommended Pipeline.

Raises:

Type Description
ValueError

If task is not one of the supported values.


PipelineSuggestion

quprep.core.recommender.PipelineSuggestion(imputer, outlier_handler, reducer, reducer_n_components, normalizer, encoder, reason) dataclass

Auto-suggested full pipeline configuration for a dataset and task.

Attributes:

Name Type Description
imputer str or None

Suggested imputation strategy ('mean', 'median', None).

outlier_handler str or None

Suggested outlier method ('iqr', None).

reducer str or None

Suggested reducer type ('pca', 'lda', None).

reducer_n_components int or None

Suggested component count for the reducer.

normalizer str

Suggested quprep.normalize.scalers.Scaler strategy.

encoder str

Suggested encoding method (matches EncodingRecommendation.method).

reason str

Human-readable explanation of the choices made.

Functions

build()

Instantiate a quprep.core.pipeline.Pipeline from this suggestion.

Returns:

Type Description
Pipeline
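
Before calling build(), it can help to see which optional stages a suggestion actually enables. The sketch below is a minimal illustration that relies only on the attribute names documented above. The helper active_stages and the example values are hypothetical, and a SimpleNamespace stands in for a real PipelineSuggestion so the snippet runs without quprep installed. The list follows the documented attribute order, which is not necessarily the order in which the built Pipeline runs its stages.

```python
from types import SimpleNamespace


def active_stages(suggestion):
    """List the stages a suggestion enables (hypothetical helper, not part of quprep).

    Optional stages (imputer, outlier_handler, reducer) are skipped when None;
    normalizer and encoder are documented as always being strings.
    """
    stages = []
    if suggestion.imputer is not None:
        stages.append(f"impute:{suggestion.imputer}")
    if suggestion.outlier_handler is not None:
        stages.append(f"outliers:{suggestion.outlier_handler}")
    if suggestion.reducer is not None:
        stages.append(
            f"reduce:{suggestion.reducer}({suggestion.reducer_n_components})"
        )
    stages.append(f"normalize:{suggestion.normalizer}")
    stages.append(f"encode:{suggestion.encoder}")
    return stages


# Stand-in object with illustrative values only; a real PipelineSuggestion
# returned by suggest_pipeline exposes the same attribute names.
example = SimpleNamespace(
    imputer="median",
    outlier_handler=None,
    reducer="pca",
    reducer_n_components=8,
    normalizer="minmax_2pi",
    encoder="iqp",
    reason="illustrative values only",
)

print(active_stages(example))
```

The same helper should work on a real suggestion object, since it reads nothing beyond the documented attributes.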

preprocessing_report

quprep.ingest.profiler.preprocessing_report(dataset, *, encoder=None, qubit_budget=None)

Produce actionable preprocessing recommendations for a dataset.

Extends profile() from statistics to concrete action items: which columns need imputation, whether outlier removal is advisable, whether dimensionality reduction is needed for the qubit budget, encoder value-range compatibility issues, and class imbalance warnings.

Parameters:

Name Type Description Default
dataset Dataset
required
encoder BaseEncoder

If provided, check value-range compatibility and suggest the correct normalizer.

None
qubit_budget int

Maximum qubit count. The report flags the dataset when n_features exceeds this value.

None

Returns:

Type Description
PreprocessingReport

PreprocessingReport

quprep.ingest.profiler.PreprocessingReport(recommendations=list(), n_issues=0) dataclass

Actionable preprocessing recommendations for a dataset.

Produced by preprocessing_report(). Each entry in recommendations is a concrete action the user should take.

Attributes:

Name Type Description
recommendations list[str]

Ordered list of actionable recommendations.

n_issues int

Number of recommendations (0 = dataset is ready to encode).
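
Because n_issues is 0 when the dataset is ready to encode, a report can act as a simple gate in front of the encoding step. The sketch below illustrates that pattern under stated assumptions: ReportSketch is a hypothetical stand-in that mirrors only the two documented fields, the helper names are invented for this example, and the recommendation strings are placeholders rather than real quprep output.

```python
from dataclasses import dataclass, field


@dataclass
class ReportSketch:
    """Hypothetical stand-in with the two fields documented for PreprocessingReport."""

    recommendations: list[str] = field(default_factory=list)
    n_issues: int = 0


def ready_to_encode(report) -> bool:
    """True when the report lists no outstanding recommendations."""
    return report.n_issues == 0


def format_report(report) -> str:
    """Render the ordered recommendations as a numbered plain-text list."""
    if ready_to_encode(report):
        return "No issues: dataset is ready to encode."
    lines = [f"{report.n_issues} issue(s) found:"]
    for i, rec in enumerate(report.recommendations, start=1):
        lines.append(f"  {i}. {rec}")
    return "\n".join(lines)


# Placeholder recommendation text, made up for illustration.
report = ReportSketch(
    recommendations=["placeholder: impute column A", "placeholder: reduce features"],
    n_issues=2,
)

print(format_report(ReportSketch()))
print(format_report(report))
```

Both helpers read only recommendations and n_issues, so they should accept a real PreprocessingReport as well.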


suggest_qubits

quprep.core.qubit_suggestion.suggest_qubits(source, *, task='classification', max_qubits=None)

Suggest an appropriate qubit budget for a dataset.

Analyses the dataset's feature count and sample count to recommend a qubit count that is practical on NISQ hardware. For datasets with more features than the budget allows, a dimensionality reduction step is recommended.

Parameters:

Name Type Description Default
source str, Path, np.ndarray, pd.DataFrame, or Dataset

Input data.

required
task str

Target task: 'classification', 'regression', 'qaoa', 'kernel', or 'simulation'. Influences the encoding hint.

'classification'
max_qubits int

Hard upper bound on the suggestion. When None, the ceiling defaults to 20 (practical NISQ ceiling).

None

Returns:

Type Description
QubitSuggestion
Source code in quprep/core/qubit_suggestion.py
def suggest_qubits(
    source,
    *,
    task: str = "classification",
    max_qubits: int | None = None,
) -> QubitSuggestion:
    """
    Suggest an appropriate qubit budget for a dataset.

    Analyses the dataset's feature count and sample count to recommend a
    qubit count that is practical on NISQ hardware. For datasets with more
    features than the budget allows, a dimensionality reduction step is
    recommended.

    Parameters
    ----------
    source : str, Path, np.ndarray, pd.DataFrame, or Dataset
        Input data.
    task : str
        Target task: 'classification', 'regression', 'qaoa', 'kernel',
        'simulation'. Influences the encoding hint.
    max_qubits : int, optional
        Hard upper bound on the suggestion. Defaults to 20 (practical NISQ
        ceiling).

    Returns
    -------
    QubitSuggestion
    """
    _valid_tasks = {"classification", "regression", "qaoa", "kernel", "simulation"}
    if task not in _valid_tasks:
        raise ValueError(
            f"Unknown task '{task}'. Choose from: {sorted(_valid_tasks)}"
        )

    dataset = _ingest(source)
    d = dataset.n_features
    n = dataset.n_samples
    ceiling = max_qubits if max_qubits is not None else _NISQ_CEILING

    # --- Qubit count and size reasoning ---
    if d <= ceiling:
        n_qubits = d
        warning = None
        size_reason = (
            f"dataset has {d} feature(s) — one qubit per feature fits within "
            f"the {'specified' if max_qubits is not None else 'NISQ'} budget of {ceiling}"
        )
    else:
        n_qubits = ceiling
        warning = (
            f"Dataset has {d} features but qubit budget is {ceiling}. "
            f"Apply a reducer (e.g. PCAReducer(n_components={ceiling})) "
            f"before encoding to avoid information loss."
        )
        size_reason = (
            f"dataset has {d} features; capped at {ceiling} qubits — "
            f"apply dimensionality reduction first"
        )

    # --- Encoding hint ---
    amp_qubits = max(1, math.ceil(math.log2(max(d, 2))))

    if task == "qaoa":
        hint = "qaoa_problem"
        hint_reason = "qaoa_problem encoding directly maps data onto QAOA cost Hamiltonian angles"
    elif task == "kernel":
        if n_qubits <= 8:
            hint = "iqp"
            hint_reason = (
                "IQP encoding is ideal for kernel methods at this qubit count"
            )
        else:
            hint = "angle"
            hint_reason = (
                "angle encoding preferred for kernel tasks with many qubits "
                "(IQP depth grows as O(d²))"
            )
    elif task == "simulation":
        hint = "hamiltonian"
        hint_reason = "Hamiltonian encoding directly represents physical time evolution"
    elif n > 500:
        hint = "angle"
        hint_reason = (
            "angle encoding scales to large sample counts; "
            "amplitude encoding requires per-sample state preparation"
        )
    elif n_qubits <= 4 and n <= 100 and amp_qubits <= n_qubits:
        hint = "amplitude"
        hint_reason = (
            "amplitude encoding is feasible for small qubit counts and sample sizes"
        )
    else:
        hint = "angle"
        hint_reason = "angle encoding is NISQ-safe and widely applicable"

    nisq_safe = n_qubits <= _NISQ_CEILING
    reasoning = f"{size_reason}; {hint_reason}"

    return QubitSuggestion(
        n_qubits=n_qubits,
        n_features=d,
        nisq_safe=nisq_safe,
        encoding_hint=hint,
        reasoning=reasoning,
        warning=warning,
    )

QubitSuggestion

quprep.core.qubit_suggestion.QubitSuggestion(n_qubits, n_features, nisq_safe, encoding_hint, reasoning, warning) dataclass

Qubit budget recommendation for a dataset.

Attributes:

Name Type Description
n_qubits int

Recommended qubit count.

n_features int

Number of features in the dataset (before any reduction).

nisq_safe bool

True if n_qubits <= 20 (practical NISQ ceiling).

encoding_hint str

Encoding that works well at this qubit count and task.

reasoning str

Human-readable explanation of the recommendation.

warning str or None

Warning set when the dataset has more features than the qubit budget, recommending a reducer before encoding; None otherwise.
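
To make the decision rules concrete, the sketch below re-traces the branching of suggest_qubits on plain integers, so the outcome for a given feature count d and sample count n can be checked without loading a dataset. It is a standalone approximation, not the library function. The name sketch_suggestion and the needs_reduction key are hypothetical, and NISQ_CEILING = 20 is assumed from the docstring, since the value of the private constant _NISQ_CEILING is not shown.

```python
import math

NISQ_CEILING = 20  # assumed from the docstring ("Defaults to 20")
VALID_TASKS = {"classification", "regression", "qaoa", "kernel", "simulation"}


def sketch_suggestion(d, n, task="classification", max_qubits=None):
    """Re-trace the suggest_qubits rules for d features and n samples."""
    if task not in VALID_TASKS:
        raise ValueError(f"Unknown task '{task}'. Choose from: {sorted(VALID_TASKS)}")

    ceiling = max_qubits if max_qubits is not None else NISQ_CEILING
    n_qubits = min(d, ceiling)  # one qubit per feature, capped at the ceiling
    # Qubits amplitude encoding would need: ceil(log2(d)), at least 1.
    amp_qubits = max(1, math.ceil(math.log2(max(d, 2))))

    if task == "qaoa":
        hint = "qaoa_problem"
    elif task == "kernel":
        hint = "iqp" if n_qubits <= 8 else "angle"
    elif task == "simulation":
        hint = "hamiltonian"
    elif n > 500:
        hint = "angle"
    elif n_qubits <= 4 and n <= 100 and amp_qubits <= n_qubits:
        hint = "amplitude"
    else:
        hint = "angle"

    return {
        "n_qubits": n_qubits,
        "nisq_safe": n_qubits <= NISQ_CEILING,
        "encoding_hint": hint,
        "needs_reduction": d > ceiling,  # the case in which a warning is set
    }


for args in [(4, 50), (30, 1000), (6, 200, "kernel"), (25, 50, "classification", 30)]:
    print(args, sketch_suggestion(*args))
```

The last case shows a subtlety in the source: passing max_qubits above 20 raises the cap, but nisq_safe is still compared against the fixed ceiling, so a 25-qubit suggestion is reported as not NISQ-safe.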


Examples

Auto-suggest and build a pipeline

import quprep as qd

# `dataset` is any input suggest_pipeline accepts: a path, np.ndarray, pd.DataFrame, or Dataset.
suggestion = qd.suggest_pipeline(dataset, task="classification", qubits=8)
print(suggestion)            # PipelineSuggestion(encoder='iqp', normalizer='minmax_2pi', ...)

pipeline = suggestion.build()
result = pipeline.fit_transform(dataset)

Preprocessing report before encoding

import quprep as qd

# `dataset` is a Dataset here; preprocessing_report is documented as taking a Dataset.
report = qd.preprocessing_report(dataset, encoder=qd.AngleEncoder(), qubit_budget=8)
print(f"{report.n_issues} issues found")
for rec in report.recommendations:
    print(" •", rec)