TODO
Project-wide checklist derived from PLAN.md. Items are grouped by milestones; all are initially unchecked.
Implementation Milestones
M0: Scaffolding
Repository layout: crates, Python package skeleton, CI with maturin
Basic CSR struct in Rust and initial PyO3 binding stub
Docs site scaffold (Sphinx + MyST)
M1: CSR Core
Kernels: SpMV, SpMM, reductions (sum, row/col sums), transpose
Indexing/slicing: rows/cols (read-only)
Cleanup ops: prune(eps), eliminate_zeros
Python OOP façade; NumPy interop; release GIL for compute
Parallel + SIMD implementations (Rayon + std::simd) for all v0.1 kernels
Implement and optimize the basic arithmetic kernels
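As a reference point for the M1 kernels, the sketch below shows what SpMV and prune(eps) compute on a CSR matrix, in pure Python. It is not the lacuna implementation: the function names `csr_spmv` and `csr_prune`, the array names `indptr`/`indices`/`data`, and the rule that prune drops entries with `abs(value) <= eps` are all assumptions made for illustration.

```python
def csr_spmv(indptr, indices, data, x):
    """Reference y = A @ x for a CSR matrix (hypothetical helper)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Stored entries of this row occupy the half-open range
        # [indptr[row], indptr[row + 1]).
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def csr_prune(indptr, indices, data, eps=0.0):
    """Drop stored entries with abs(value) <= eps (assumed threshold rule).

    With eps=0.0 this removes explicit zeros only.
    """
    new_indptr, new_indices, new_data = [0], [], []
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            if abs(data[k]) > eps:
                new_indices.append(indices[k])
                new_data.append(data[k])
        # Row pointer is the running count of kept entries.
        new_indptr.append(len(new_data))
    return new_indptr, new_indices, new_data


# 3x3 example: [[1, 0, 2], [0, 0, 0], [0, 3, 1e-12]]
indptr, indices, data = [0, 2, 2, 4], [0, 2, 1, 2], [1.0, 2.0, 3.0, 1e-12]
p_indptr, p_indices, p_data = csr_prune(indptr, indices, data, eps=1e-9)
y = csr_spmv(p_indptr, p_indices, p_data, [1.0, 2.0, 3.0])
```

A scalar reference like this can serve as the oracle that parallel and SIMD kernels are compared against in parity tests.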
M2: Conversions and Formats
Public COO and CSC types
Conversions: CSR <-> COO <-> CSC
Arithmetic: A + B, Hadamard A.multiply(B)
IO: Matrix Market (.mtx) and NPZ save/load
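The CSR <-> COO conversion in M2 can be sketched in pure Python as follows. This is an illustrative sketch, not the project's code: the helper names `coo_to_csr` and `csr_to_coo` are hypothetical, and summing duplicate coordinates (as SciPy does) is an assumption about the intended semantics.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets to CSR, summing duplicates (assumed policy)."""
    # Visit triplets in (row, col) order so rows are contiguous
    # and column indices ascend within each row.
    order = sorted(range(len(vals)), key=lambda k: (rows[k], cols[k]))
    indptr = [0] * (n_rows + 1)
    indices, data = [], []
    prev = None
    for k in order:
        key = (rows[k], cols[k])
        if key == prev:
            data[-1] += vals[k]  # duplicate coordinate: accumulate
        else:
            indices.append(cols[k])
            data.append(vals[k])
            indptr[rows[k] + 1] += 1  # per-row count, shifted by one
            prev = key
    # Prefix sum turns per-row counts into row pointers.
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]
    return indptr, indices, data


def csr_to_coo(indptr, indices, data):
    """Expand row pointers back into an explicit row index per entry."""
    rows = []
    for r in range(len(indptr) - 1):
        rows.extend([r] * (indptr[r + 1] - indptr[r]))
    return rows, list(indices), list(data)


csr = coo_to_csr(3, [2, 0, 0, 2, 0], [1, 2, 0, 1, 2], [3.0, 2.0, 1.0, 4.0, 5.0])
coo = csr_to_coo(*csr)
```

Sorting costs O(nnz log nnz); a counting-sort pass over rows is a common alternative when only row grouping, not column order, is required.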
M3: Performance and Stability
Blocked/tiling strategies; cache/NUMA tuning
Benchmark suite and performance regression gates
API polish and error taxonomy
Documentation/tutorials pass
M4: Advanced
BSR format
Iterative solvers (CG/GMRES) and basic preconditioners
Plug-in kernel strategy
Optional f32 kernels and dtype growth
Cross-cutting
Formats & Data Model
CSR default: values f64, indices i64; plan for f32/i32 as feature flags
ND baseline: COO-ND representation and invariants (v0.2)
COO-ND storage and invariants
Axis reductions: sum over axes
Axis permutation
ND→2D conversions: mode/axes unfold to CSR/CSC
Broadcasting elementwise ops (Hadamard)
mean and reshape
Future CSF for ND advanced ops (v0.4+)
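The mode unfold behind the ND→2D conversions can be illustrated with a small pure-Python sketch. The function name `mode_unfold` is hypothetical, and the column ordering (row-major linearization of the remaining axes) is an assumption; the project may choose a different convention. Values are untouched by the unfold, so only coordinates are mapped.

```python
def mode_unfold(shape, coords, mode):
    """Map ND coordinates to 2D (row, col) pairs for a mode unfold.

    row = index along `mode`; col = row-major linearization of the
    remaining axes (assumed convention).
    """
    other_axes = [ax for ax in range(len(shape)) if ax != mode]
    n_cols = 1
    for ax in other_axes:
        n_cols *= shape[ax]
    rows, cols = [], []
    for idx in coords:
        col = 0
        for ax in other_axes:
            # Horner-style accumulation over the remaining axes.
            col = col * shape[ax] + idx[ax]
        rows.append(idx[mode])
        cols.append(col)
    return (shape[mode], n_cols), rows, cols


# Three stored entries of a 2x3x4 tensor, unfolded along axis 1.
shape_2d, rows, cols = mode_unfold(
    (2, 3, 4), [(0, 0, 0), (1, 2, 3), (0, 1, 2)], mode=1
)
```

The resulting (row, col) pairs, together with the original values, form a 2D COO matrix that can then be converted to CSR or CSC.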
Python API
`lacuna.sparse` classes: SparseArray/SparseMatrix, CSR/CSC/COO/COOND
Construction and conversion APIs; SciPy/NumPy bridges
Ops surface: matmul, add, multiply, transpose, sum; slicing semantics
Rust Design
Core traits: SparseFormat, SparseND, and op traits (SpMV/SpMM/Add/MulElem/Transpose/…)
Deterministic reductions where required; careful `unsafe` only in hot paths
Kernel Implementation Checklist (by crate)
crates/lacuna-kernels (Rust optimized kernels, f64/i64, parallel+SIMD)
SpMV
Feature done
Tests done
SpMM
Feature done
Tests done
Reductions: sum, row_sums, col_sums
Feature done
Tests done
Transpose (CSR -> CSR)
Feature done
Tests done
Cleanup: eliminate_zeros, prune(eps)
Feature done
Tests done
Arithmetic: add_csr (A+B), mul_scalar (alpha*A)
Feature done
Tests done
Utilities / Refactors
Centralize reusable kernel utilities in `util.rs` (constants, helpers, `UsizeF64Map`)
Replace HashMap-based sparse accumulators with `UsizeF64Map` in `reduce.rs` and `spmv.rs`
Improve reduction paths: parallel small-dimension branches; SIMD stripe merge for column sums
ND COO
Sum / Permute axes / Reduce sum over axes (COO-ND kernels)
SpMV/SpMM along mode axis
ND→2D conversions (mode/axes unfold) in `convert.rs`
crates/lacuna-py (PyO3 bindings)
SpMV / SpMM
Feature done (`Csr64.spmv/spmm` and `*_from_parts`)
Tests done (indirectly covered via Python tests)
Reductions / Transpose / Cleanup / Arithmetic
Feature done (`sum/row_sums/col_sums`, `transpose`, `prune/eliminate_zeros`, `add/mul_scalar` bindings)
Tests done (indirectly covered via Python tests)
Bindings structure
Split monolithic `src/lib.rs` into modules: `csr.rs`, `csc.rs`, `coo.rs`, `functions.rs`; keep `lib.rs` as aggregator (no Python API changes)
ND bindings
Export ND wrappers: `coond_sum_from_parts`, `coond_mean_from_parts`, `coond_reduce_sum_axes_from_parts`, `coond_reduce_mean_axes_from_parts`, `coond_permute_axes_from_parts`, `coond_reshape_from_parts`, `coond_hadamard_broadcast_from_parts`, `coond_mode_to_{csr,csc}_from_parts`, `coond_axes_to_{csr,csc}_from_parts`
Registered in `lib.rs`
python/lacuna (High-level Python API: CSR facade)
SpMV
Feature done (`__matmul__` 1D)
Tests done (`python/tests/test_ops.py`)
Benchmarks done (`python/benchmarks/benchmark_spmv.py`)
Docs done
SpMM
Feature done (`__matmul__` 2D)
Tests done (`python/tests/test_ops.py`)
Benchmarks done (`python/benchmarks/benchmark_spmm.py`)
Docs done
Reductions: sum, row_sums, col_sums
Feature done (`CSR.sum` supports `None`/`0`/`1`)
Tests done (`python/tests/test_ops.py`)
Benchmarks done
Docs done
Transpose
Feature done (`CSR.T`)
Tests done (`python/tests/test_ops.py` / `test_more_ops.py`)
Benchmarks done
Docs done
Cleanup: prune, eliminate_zeros
Feature done
Tests done (`python/tests/test_ops.py`)
Benchmarks done
Docs done
Arithmetic: add, mul_scalar, sub, hadamard
Feature done (`__add__`, `__mul__`/`__rmul__`)
Tests done (`python/tests/test_ops.py` / `test_more_ops.py`)
Benchmarks done
Docs done
python/lacuna (High-level Python API: ND COO facade)
COOND
Feature done (`sum`, `mean`, `reduce_sum_axes`, `reduce_mean_axes`, `permute_axes`, `reshape`, `hadamard_broadcast`, `mode_unfold_to_{csr,csc}`, `axes_unfold_to_{csr,csc}`)
Tests done (`python/tests/test_nd.py`)
Planned (per PLAN.md milestones)
Arithmetic: subtraction (A - B)
Arithmetic: Hadamard elementwise multiply `A.multiply(B)`
Format conversions: CSR <-> CSC, CSR <-> COO
ND baseline (COO-ND): elementwise ops with broadcasting (Hadamard), `sum`/`mean` over axes, `transpose`/`permute`, `reshape`
Reordering: CSR reorder (cache locality)
Cache-aware/blocked SpMM improvements
Block formats: BSR kernels
Dtype/index variants: f32 values, i32 indices (feature-gated)
ND advanced (CSF): kernels for `tensordot`/mode-n product; masked ops
Packaging & CI
Build wheels with maturin for Win/macOS/Linux and Python 3.10–3.13
GitHub Actions matrix for wheels + sdist
Versioning (SemVer) and licensing check (Apache-2.0)
Testing & Benchmarks
Rust unit/property tests
Python pytest parity tests vs NumPy/SciPy; randomized matrices
pytest-benchmark scenarios; SuiteSparse/synthetic datasets
Benchmarks import `lacuna` only from installed environment (no local path injection)
Documentation
User guides (MyST), API docs (autodoc/napoleon), and design notes
Tutorials: build CSR from COO; SpMV at scale; convert to SciPy