Pipeline & Qubit Suggestion

suggest_pipeline

`quprep.core.recommender.suggest_pipeline(source, *, task='classification', qubits=None)`

Suggest a full pipeline configuration for a dataset and task.

Analyses the dataset to recommend which preprocessing stages to include and which encoder to use. Extends `recommend` from encoder-only recommendation to a complete pipeline suggestion.

Parameters:

Name	Type	Description	Default
`source`	`str, Path, np.ndarray, pd.DataFrame, or Dataset`	Input data. Accepts anything the pipeline ingester accepts.	required
`task`	`str`	Target task: `'classification'`, `'regression'`, `'qaoa'`, `'kernel'`, or `'simulation'`.	`'classification'`
`qubits`	`int`	Maximum qubit budget. Influences reducer component count and encoder scoring.	`None`

Returns:

Type	Description
`PipelineSuggestion`	Fully specified suggestion with a `PipelineSuggestion.build` method to instantiate the recommended Pipeline.

Raises:

Type	Description
`ValueError`	If `task` is not one of the supported values.

PipelineSuggestion

`quprep.core.recommender.PipelineSuggestion(imputer, outlier_handler, reducer, reducer_n_components, normalizer, encoder, reason)` `dataclass`

Auto-suggested full pipeline configuration for a dataset and task.

Attributes:

Name	Type	Description
`imputer`	`str or None`	Suggested imputation strategy (`'mean'`, `'median'`, `None`).
`outlier_handler`	`str or None`	Suggested outlier method (`'iqr'`, `None`).
`reducer`	`str or None`	Suggested reducer type (`'pca'`, `'lda'`, `None`).
`reducer_n_components`	`int or None`	Suggested component count for the reducer.
`normalizer`	`str`	Suggested `quprep.normalize.scalers.Scaler` strategy.
`encoder`	`str`	Suggested encoding method (matches `EncodingRecommendation.method`).
`reason`	`str`	Human-readable explanation of the choices made.

Functions

`build()`

Instantiate a `quprep.core.pipeline.Pipeline` from this suggestion.

Returns:

Type	Description
`Pipeline`
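
To make the role of the `None`-valued attributes concrete, the sketch below uses a stand-in dataclass with the same seven fields and lists which stages a suggestion would include. `SuggestionSketch` and `stage_plan` are hypothetical names, not part of quprep, and the stage ordering (impute, handle outliers, reduce, normalize, encode) is an assumption taken from the field order rather than confirmed library behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SuggestionSketch:
    """Stand-in with the documented PipelineSuggestion fields (not the quprep class)."""

    imputer: Optional[str]
    outlier_handler: Optional[str]
    reducer: Optional[str]
    reducer_n_components: Optional[int]
    normalizer: str
    encoder: str
    reason: str

    def stage_plan(self) -> List[str]:
        """List the stages to include, skipping optional ones set to None.

        The ordering is assumed from the field order, not taken from quprep.
        """
        plan = []
        if self.imputer is not None:
            plan.append(f"impute:{self.imputer}")
        if self.outlier_handler is not None:
            plan.append(f"outliers:{self.outlier_handler}")
        if self.reducer is not None:
            plan.append(
                f"reduce:{self.reducer}(n_components={self.reducer_n_components})"
            )
        # normalizer and encoder are typed as plain `str`, so they are always present
        plan.append(f"normalize:{self.normalizer}")
        plan.append(f"encode:{self.encoder}")
        return plan


# Illustrative values only; a real suggestion comes from suggest_pipeline().
sketch = SuggestionSketch(
    imputer="median",
    outlier_handler=None,
    reducer="pca",
    reducer_n_components=8,
    normalizer="minmax_2pi",
    encoder="angle",
    reason="illustrative values only",
)
plan = sketch.stage_plan()
print(plan)
```

In this sketch the outlier stage is skipped because `outlier_handler` is `None`, while the normalizer and encoder always appear.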

preprocessing_report

`quprep.ingest.profiler.preprocessing_report(dataset, *, encoder=None, qubit_budget=None)`

Produce actionable preprocessing recommendations for a dataset.

Extends `profile` from statistics to concrete action items: which columns need imputation, whether outlier removal is advisable, whether dimensionality reduction is needed for the qubit budget, encoder value-range compatibility issues, and class imbalance warnings.

Parameters:

Name	Type	Description	Default
`dataset`	`Dataset`		required
`encoder`	`BaseEncoder`	If provided, check value-range compatibility and suggest the correct normalizer.	`None`
`qubit_budget`	`int`	Maximum qubit count. Flag when `n_features` exceeds this value.	`None`

Returns:

Type	Description
`PreprocessingReport`
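
The value-range compatibility check can be pictured with a small standalone sketch. It assumes, purely for illustration, that an angle-style encoder expects inputs in [0, 2π] (the normalizer name `'minmax_2pi'` used elsewhere on this page hints at such a range, but the ranges quprep actually checks are not documented here). `out_of_range_columns` and `minmax_to_range` are hypothetical helpers, not quprep functions.

```python
import math


def out_of_range_columns(rows, low=0.0, high=2 * math.pi):
    """Return (column_index, min, max) for every column with values outside [low, high]."""
    flagged = []
    for j, col in enumerate(zip(*rows)):  # zip(*rows) iterates column-wise
        lo, hi = min(col), max(col)
        if lo < low or hi > high:
            flagged.append((j, lo, hi))
    return flagged


def minmax_to_range(column, low=0.0, high=2 * math.pi):
    """Min-max rescale one column to [low, high]; constant columns map to `low`."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [low for _ in column]
    scale = (high - low) / (hi - lo)
    return [low + (x - lo) * scale for x in column]


# Column 0 already lies in [0, 2*pi]; column 1 does not.
rows = [[0.5, 120.0], [1.0, 80.0], [3.0, 100.0]]
flagged = out_of_range_columns(rows)
rescaled = minmax_to_range([row[1] for row in rows])
print(flagged)
print(rescaled)
```

A report built along these lines would flag only the second column and recommend a normalizer that maps it into the encoder's range.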

PreprocessingReport

`quprep.ingest.profiler.PreprocessingReport(recommendations=list(), n_issues=0)` `dataclass`

Actionable preprocessing recommendations for a dataset.

Produced by `preprocessing_report`. Each entry in `recommendations` is a concrete action the user should take.

Attributes:

Name	Type	Description
`recommendations`	`list[str]`	Ordered list of actionable recommendations.
`n_issues`	`int`	Number of recommendations (0 = dataset is ready to encode).
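
The relationship between the two attributes can be sketched with a stand-in dataclass and two of the checks described above (missing values and the qubit budget). `ReportSketch` and `build_report_sketch` are hypothetical, and the message wording is invented for illustration; the real profiler's checks and messages may differ.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReportSketch:
    """Stand-in with the documented PreprocessingReport fields (not the quprep class)."""

    recommendations: List[str] = field(default_factory=list)
    n_issues: int = 0


def _is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def build_report_sketch(rows, qubit_budget=None) -> ReportSketch:
    """Collect recommendations for missing values and an exceeded qubit budget."""
    recs = []
    n_features = len(rows[0]) if rows else 0
    for j in range(n_features):
        missing = sum(1 for row in rows if _is_missing(row[j]))
        if missing:
            recs.append(f"column {j}: impute {missing} missing value(s)")
    if qubit_budget is not None and n_features > qubit_budget:
        recs.append(
            f"{n_features} features exceed the qubit budget of {qubit_budget}: "
            "apply dimensionality reduction"
        )
    # n_issues is simply the number of recommendations
    return ReportSketch(recommendations=recs, n_issues=len(recs))


rows = [[1.0, float("nan"), 3.0], [2.0, 5.0, None], [4.0, 6.0, 7.0]]
report = build_report_sketch(rows, qubit_budget=2)
clean = build_report_sketch([[1.0, 2.0]], qubit_budget=4)
print(report.n_issues, report.recommendations)
```

Here `clean.n_issues` is 0, which corresponds to the "ready to encode" case in the table above.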

suggest_qubits

`quprep.core.qubit_suggestion.suggest_qubits(source, *, task='classification', max_qubits=None)`

Suggest an appropriate qubit budget for a dataset.

Analyses the dataset's feature count and sample count to recommend a qubit count that is practical on NISQ hardware. For datasets with more features than the budget allows, a dimensionality reduction step is recommended.

Parameters:

Name	Type	Description	Default
`source`	`str, Path, np.ndarray, pd.DataFrame, or Dataset`	Input data.	required
`task`	`str`	Target task: `'classification'`, `'regression'`, `'qaoa'`, `'kernel'`, or `'simulation'`. Influences the encoding hint.	`'classification'`
`max_qubits`	`int`	Hard upper bound on the suggestion. If `None`, defaults to 20 (practical NISQ ceiling).	`None`

Returns:

Type	Description
`QubitSuggestion`
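
The qubit arithmetic is easiest to see with a few numbers. The sketch below recomputes, outside of quprep, the two counts that appear in the source listing that follows: one qubit per feature capped at the ceiling, and the `ceil(log2(d))` count used when checking whether amplitude encoding fits. `qubit_counts` is a hypothetical helper, and the ceiling of 20 is taken from the parameter description above.

```python
import math

NISQ_CEILING = 20  # default ceiling stated in the max_qubits description


def qubit_counts(n_features, max_qubits=None):
    """Return (one_qubit_per_feature, amplitude_qubits, needs_reduction)."""
    ceiling = max_qubits if max_qubits is not None else NISQ_CEILING
    # One qubit per feature, capped at the budget
    one_per_feature = min(n_features, ceiling)
    # Amplitude encoding packs d features into ceil(log2(d)) qubits (at least 1)
    amplitude = max(1, math.ceil(math.log2(max(n_features, 2))))
    needs_reduction = n_features > ceiling
    return one_per_feature, amplitude, needs_reduction


for d in (1, 4, 16, 20, 64):
    print(d, qubit_counts(d))

tight = qubit_counts(16, max_qubits=8)
print(tight)
```

With the default ceiling, 64 features are capped at 20 qubits and flagged for reduction, while the amplitude count stays at 6 because it is computed from the original feature count.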

Source code in quprep/core/qubit_suggestion.py

def suggest_qubits(
    source,
    *,
    task: str = "classification",
    max_qubits: int | None = None,
) -> QubitSuggestion:
    """
    Suggest an appropriate qubit budget for a dataset.

    Analyses the dataset's feature count and sample count to recommend a
    qubit count that is practical on NISQ hardware. For datasets with more
    features than the budget allows, a dimensionality reduction step is
    recommended.

    Parameters
    ----------
    source : str, Path, np.ndarray, pd.DataFrame, or Dataset
        Input data.
    task : str
        Target task: 'classification', 'regression', 'qaoa', 'kernel',
        'simulation'. Influences the encoding hint.
    max_qubits : int, optional
        Hard upper bound on the suggestion. Defaults to 20 (practical NISQ
        ceiling).

    Returns
    -------
    QubitSuggestion
    """
    _valid_tasks = {"classification", "regression", "qaoa", "kernel", "simulation"}
    if task not in _valid_tasks:
        raise ValueError(
            f"Unknown task '{task}'. Choose from: {sorted(_valid_tasks)}"
        )

    dataset = _ingest(source)
    d = dataset.n_features
    n = dataset.n_samples
    ceiling = max_qubits if max_qubits is not None else _NISQ_CEILING

    # --- Qubit count and size reasoning ---
    if d <= ceiling:
        n_qubits = d
        warning = None
        size_reason = (
            f"dataset has {d} feature(s) — one qubit per feature fits within "
            f"the {'specified' if max_qubits is not None else 'NISQ'} budget of {ceiling}"
        )
    else:
        n_qubits = ceiling
        warning = (
            f"Dataset has {d} features but qubit budget is {ceiling}. "
            f"Apply a reducer (e.g. PCAReducer(n_components={ceiling})) "
            f"before encoding to avoid information loss."
        )
        size_reason = (
            f"dataset has {d} features; capped at {ceiling} qubits — "
            f"apply dimensionality reduction first"
        )

    # --- Encoding hint ---
    amp_qubits = max(1, math.ceil(math.log2(max(d, 2))))

    if task == "qaoa":
        hint = "qaoa_problem"
        hint_reason = "qaoa_problem encoding directly maps data onto QAOA cost Hamiltonian angles"
    elif task == "kernel":
        if n_qubits <= 8:
            hint = "iqp"
            hint_reason = (
                "IQP encoding is ideal for kernel methods at this qubit count"
            )
        else:
            hint = "angle"
            hint_reason = (
                "angle encoding preferred for kernel tasks with many qubits "
                "(IQP depth grows as O(d²))"
            )
    elif task == "simulation":
        hint = "hamiltonian"
        hint_reason = "Hamiltonian encoding directly represents physical time evolution"
    elif n > 500:
        hint = "angle"
        hint_reason = (
            "angle encoding scales to large sample counts; "
            "amplitude encoding requires per-sample state preparation"
        )
    elif n_qubits <= 4 and n <= 100 and amp_qubits <= n_qubits:
        hint = "amplitude"
        hint_reason = (
            "amplitude encoding is feasible for small qubit counts and sample sizes"
        )
    else:
        hint = "angle"
        hint_reason = "angle encoding is NISQ-safe and widely applicable"

    nisq_safe = n_qubits <= _NISQ_CEILING
    reasoning = f"{size_reason}; {hint_reason}"

    return QubitSuggestion(
        n_qubits=n_qubits,
        n_features=d,
        nisq_safe=nisq_safe,
        encoding_hint=hint,
        reasoning=reasoning,
        warning=warning,
    )

QubitSuggestion

`quprep.core.qubit_suggestion.QubitSuggestion(n_qubits, n_features, nisq_safe, encoding_hint, reasoning, warning)` `dataclass`

Qubit budget recommendation for a dataset.

Attributes:

Name	Type	Description
`n_qubits`	`int`	Recommended qubit count.
`n_features`	`int`	Number of features in the dataset (before any reduction).
`nisq_safe`	`bool`	`True` if `n_qubits <= 20` (practical NISQ ceiling).
`encoding_hint`	`str`	Encoding that works well at this qubit count and task.
`reasoning`	`str`	Human-readable explanation of the recommendation.
`warning`	`str or None`	Warning if dimensionality reduction is strongly recommended.
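
One way to act on a suggestion is to treat a non-`None` `warning` as the signal to add a reducer sized to `n_qubits`, which matches the `PCAReducer(n_components=...)` hint in the warning text of the source listing. The sketch uses a stand-in dataclass with the documented fields; `QubitSuggestionSketch` and `reducer_components` are hypothetical names, and the field values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QubitSuggestionSketch:
    """Stand-in with the documented QubitSuggestion fields (not the quprep class)."""

    n_qubits: int
    n_features: int
    nisq_safe: bool
    encoding_hint: str
    reasoning: str
    warning: Optional[str]


def reducer_components(suggestion: QubitSuggestionSketch) -> Optional[int]:
    """Return the component count to reduce to, or None if no reduction is advised."""
    if suggestion.warning is not None:
        # The capped qubit count doubles as the target feature count
        return suggestion.n_qubits
    return None


wide = QubitSuggestionSketch(
    n_qubits=20,
    n_features=64,
    nisq_safe=True,
    encoding_hint="angle",
    reasoning="illustrative values only",
    warning="Dataset has 64 features but qubit budget is 20.",
)
narrow = QubitSuggestionSketch(
    n_qubits=4,
    n_features=4,
    nisq_safe=True,
    encoding_hint="amplitude",
    reasoning="illustrative values only",
    warning=None,
)
print(reducer_components(wide), reducer_components(narrow))
```

Note that, per the source listing, `nisq_safe` is compared against the fixed NISQ ceiling rather than against `max_qubits`, so a suggestion can respect a larger user-supplied budget and still report `nisq_safe=False`.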

Examples

Auto-suggest and build a pipeline

import quprep as qd

# `dataset` can be any source the ingester accepts: a path, np.ndarray, pd.DataFrame, or Dataset
suggestion = qd.suggest_pipeline(dataset, task="classification", qubits=8)
print(suggestion)            # PipelineSuggestion(encoder='iqp', normalizer='minmax_2pi', ...)

pipeline = suggestion.build()
result = pipeline.fit_transform(dataset)

Preprocessing report before encoding

import quprep as qd

# `dataset` is a Dataset instance, as required by preprocessing_report
report = qd.preprocessing_report(dataset, encoder=qd.AngleEncoder(), qubit_budget=8)
print(f"{report.n_issues} issues found")
for rec in report.recommendations:
    print(" •", rec)
