
QuPrep

The missing preprocessing layer between classical datasets and quantum computing frameworks.

CSV / DataFrame / NumPy  →  QuPrep  →  circuit-ready output

QuPrep converts classical datasets into a quantum-circuit-ready format. It is not a quantum computing framework, simulator, or training tool; it is the preprocessing step that feeds Qiskit, PennyLane, Cirq, TKET, and other quantum workflows.

  • Install in seconds


    pip install quprep
    

    No quantum framework required for the core install.

    Installation guide

  • Zero to circuit in one line


    import quprep
    result = quprep.prepare("data.csv", encoding="angle")
    print(result.circuit)
    

    Quickstart

  • Not sure which encoding?


    import quprep
    rec = quprep.recommend("data.csv", task="classification", qubits=8)
    result = rec.apply("data.csv")
    

    Encoding guide

  • QUBO & quantum optimization


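    import numpy as np
    from quprep.qubo import max_cut, qaoa_circuit
    # Assumption: adj is an adjacency matrix; here, a 3-node triangle graph
    adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
    q = max_cut(adj)
    qasm = qaoa_circuit(q, p=2)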
    

    QUBO guide
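
To make the `encoding="angle"` option in the cards above more concrete, here is a minimal, library-independent sketch of what angle encoding generally means: each feature is rescaled to a rotation angle and applied as a single-qubit rotation, one qubit per feature. This is not QuPrep's implementation. The helper names `minmax_to_angles` and `angle_encode_qasm`, the choice of `ry` gates, and the target range [0, π] are all assumptions made for illustration.

```python
import math

def minmax_to_angles(row, lo, hi):
    """Rescale each feature from [lo[i], hi[i]] to an angle in [0, pi]."""
    angles = []
    for x, a, b in zip(row, lo, hi):
        span = b - a
        # A constant column carries no information; map it to angle 0.
        angles.append(0.0 if span == 0 else math.pi * (x - a) / span)
    return angles

def angle_encode_qasm(angles):
    """Emit an OpenQASM 3.0 program with one ry rotation per feature."""
    lines = [
        "OPENQASM 3.0;",
        'include "stdgates.inc";',
        f"qubit[{len(angles)}] q;",
    ]
    for i, theta in enumerate(angles):
        lines.append(f"ry({theta:.6f}) q[{i}];")
    return "\n".join(lines)

# Toy dataset (hypothetical): two samples, three features each.
data = [[0.0, 5.0, 10.0], [2.0, 5.0, 20.0]]
lo = [min(col) for col in zip(*data)]
hi = [max(col) for col in zip(*data)]

# Encode the second sample as a circuit description.
qasm = angle_encode_qasm(minmax_to_angles(data[1], lo, hi))
print(qasm)
```

The sketch covers only the normalize, encode, and export steps for a single sample. The library's own pipeline may choose different gates, ranges, or qubit layouts.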


Pipeline stages

| Stage | Since | Description |
| --- | --- | --- |
| Ingest | v0.1.0 | CSV, TSV, NumPy arrays, Pandas DataFrames |
| Clean | v0.1.0 | Missing values, outliers, categoricals, feature selection |
| Normalize | v0.1.0 | Auto-selected per encoding (L2, MinMax, Z-score, binary) |
| Encode | v0.1.0 | Angle, Amplitude, Basis |
| Export | v0.1.0 | OpenQASM 3.0, Qiskit |
| Reduce | v0.2.0 | PCA, LDA, DFT, t-SNE, UMAP, hardware-aware |
| Encode+ | v0.2.0 | IQP, Entangled Angle, Data re-uploading, Hamiltonian |
| Export+ | v0.2.0 | PennyLane, Cirq, TKET, ASCII + matplotlib visualization |
| Recommend | v0.2.0 | Automatic encoding selection for your dataset and task |
| QUBO | v0.3.0 | QUBO/Ising, 7 problem formulations, solvers, QAOA, D-Wave export |
| Validate | v0.4.0 | Input validation, schema enforcement, cost estimation, sklearn fit/transform, import quprep as qd |
| Intelligence | v0.5.0 | Qubit suggestion, encoding comparison, data drift detection, pipeline save/load, batch QASM export |
| Encode++ | v0.6.0 | ZZFeatureMap, PauliFeatureMap, RandomFourier, TensorProduct, QAOAProblem encoders |
| Export++ | v0.6.0 | Amazon Braket, Q# (Azure Quantum), IQM native format |
| Plugins | v0.6.0 | register_encoder / register_exporter — custom encoders/exporters via prepare() |
| Modalities | v0.7.0 | Time series, sparse matrices, multi-label, image, text (TF-IDF + sentence-transformers), graph (lossy feature extraction + lossless graph state encoding) |
| Connectors | v0.8.0 | HuggingFace datasets, OpenML, Kaggle — load any public dataset in one line |
| CLI tools | v0.8.0 | quprep inspect (dataset profile), quprep benchmark (encoder comparison table) |
| Reproducibility | v0.8.0 | fingerprint_pipeline() — deterministic SHA-256 hash of pipeline config for paper methods sections |
| Noise-aware preprocessing | v0.9.0 | Assign high-variance features to least-noisy qubits; minimise SWAP count given hardware topology; remap angles away from 0/π poles |
| Encoding quality metrics | v0.9.0 | Simulation-based expressibility, entanglement capability, and kernel alignment scores; use_metrics=True in recommend() for data-driven re-ranking |
| Class imbalance | v0.9.0 | ImbalanceHandler — random oversample/undersample, SMOTE, ADASYN as a clean/ stage |
| Barren plateau detection | v0.9.0 | detect_barren_plateau() — analytical gradient variance bound before training; risk levels + mitigation suggestions |
| Streaming ingestion | v0.9.0 | CSVIngester.stream(), NumpyIngester.stream(), Pipeline.stream() — process datasets larger than RAM in chunks |
| API polish | v0.10.0 | Scaler.inverse_transform(), OutlierHandler.outlier_mask_, FeatureSelector.get_feature_names_out(), LDAReducer.explained_variance_ratio_, CategoricalEncoder high-cardinality grouping, PipelineResult.stages per-step snapshots |
| Quantum preprocessing | v0.10.0 | check_compatibility(), verify_encoding(), encoding_sensitivity(), suggest_pipeline(), preprocessing_report(), inspect_encoding() — quantum-aware dataset audit and circuit inspection |
| New encoders | v0.10.0 | DenseAngleEncoder (2 features/qubit via Ry+Rz), DiscretizedEncoder (continuous → binary, QUBO-ready) |
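
The Reproducibility row describes a deterministic SHA-256 hash of the pipeline configuration. As a rough illustration of the general idea only, the sketch below hashes a canonical JSON serialization of a plain dict. The helper `fingerprint_config` and the config keys are hypothetical; how `fingerprint_pipeline()` actually serializes a pipeline is not shown in this page.

```python
import hashlib
import json

def fingerprint_config(config):
    """Return a SHA-256 hex digest of a JSON-serializable config dict.

    Hypothetical helper for illustration; not part of QuPrep.
    """
    # sort_keys and fixed separators yield one canonical byte string, so the
    # digest does not depend on dict insertion order or whitespace.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two configs with the same content but different key order (made-up keys).
config_a = {"encoding": "angle", "normalize": "minmax", "qubits": 8}
config_b = {"qubits": 8, "normalize": "minmax", "encoding": "angle"}

fp_a = fingerprint_config(config_a)
fp_b = fingerprint_config(config_b)
print(fp_a)
print(fp_a == fp_b)  # same content, same digest
```

Canonicalizing before hashing is the key design choice: without it, two equivalent configurations could produce different digests and the fingerprint would not be usable as a reproducibility reference.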

Supported frameworks

| Framework | Install | Output type |
| --- | --- | --- |
| OpenQASM 3.0 | (no extra deps) | str |
| Qiskit | quprep[qiskit] | QuantumCircuit |
| PennyLane | quprep[pennylane] | qml.QNode |
| Cirq | quprep[cirq] | cirq.Circuit |
| TKET | quprep[tket] | pytket.Circuit |
| Amazon Braket | quprep[braket] | braket.circuits.Circuit |
| Q# / Azure Quantum | quprep[qsharp] | str (Q# 1.0 source) |
| IQM | quprep[iqm] | dict (PRX+CZ JSON) |
| D-Wave Ocean | (via .to_dwave()) | BQM dict |

What QuPrep does NOT do

QuPrep is intentionally narrow in scope. It does not:

  • Train quantum machine learning models
  • Simulate quantum circuits
  • Execute on quantum hardware
  • Optimize variational parameters
  • Replace Qiskit, PennyLane, Cirq, or any other framework

It prepares your data. Everything else is your framework's job.


CLI

# Profile a dataset (shape, types, missing, sparsity, recommendation)
quprep inspect data.csv
quprep inspect data.csv --task kernel --qubits 8

# Benchmark all encoders (gate count, depth, timing)
quprep benchmark data.csv --task classification
quprep benchmark data.csv --include angle,iqp,amplitude --output bench.json

# Encode a CSV to OpenQASM 3.0
quprep convert data.csv --encoding angle

# Get an encoding recommendation
quprep recommend data.csv --task classification --qubits 8

# QUBO problems
quprep qubo maxcut --adjacency "0,1,1;1,0,1;1,1,0" --solve
quprep qubo qaoa maxcut --adjacency "0,1,1;1,0,1;1,1,0" --p 2 --output circuit.qasm
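
The `--adjacency` strings above appear to list matrix rows separated by semicolons and entries separated by commas; that reading is an inference from the examples, not a documented format. Under that assumption, the sketch below parses the string, builds a textbook Max-Cut QUBO, and solves the 3-node triangle by brute force. The helpers `parse_adjacency`, `max_cut_qubo`, and `energy` are hypothetical and independent of QuPrep; the matrix that `quprep qubo maxcut` produces may be scaled or laid out differently.

```python
from itertools import product

def parse_adjacency(text):
    """Parse 'a,b,c;d,e,f;...' into a list of rows of floats (assumed format)."""
    return [[float(v) for v in row.split(",")] for row in text.split(";")]

def max_cut_qubo(adj):
    """Build an upper-triangular Q so that minimising x^T Q x maximises the cut."""
    n = len(adj)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = adj[i][j]
            # Edge (i, j) is cut when x_i != x_j, contributing
            # w * (x_i + x_j - 2 * x_i * x_j) to the cut; negate to minimise.
            Q[i][i] -= w
            Q[j][j] -= w
            Q[i][j] += 2 * w
    return Q

def energy(Q, x):
    """Evaluate x^T Q x for a binary assignment x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

adj = parse_adjacency("0,1,1;1,0,1;1,1,0")
Q = max_cut_qubo(adj)

# Brute force is fine for 3 nodes (8 assignments); it does not scale.
best = min(product([0, 1], repeat=len(adj)), key=lambda x: energy(Q, x))
cut_value = -energy(Q, best)
print(best, cut_value)
```

For the triangle graph, any partition that separates one node from the other two cuts two edges, which is the maximum, so the brute-force search reports a cut value of 2.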