hipscatalog_gen.io.input.compute_column_report_sample
- compute_column_report_sample(ddf_like, sample_rows=200000)
Build a small column summary from a sample.
Uses sampling to keep the computation fast and scalable. Works with Dask DataFrames and LSDB catalogs.
- Parameters:
ddf_like (Any) – Dask-like collection or LSDB catalog.
sample_rows (int) – Approximate maximum number of rows to materialize.
- Returns:
Nested dict with basic column statistics and examples.
- Return type:
Dict
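The sample-then-summarize idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the hipscatalog_gen implementation. The helper name `sketch_column_report`, the `max_examples` parameter, and the keys of the returned dict (`types`, `non_null`, `examples`, `sampled_rows`) are all assumptions made for illustration, and the input is a plain iterable of row dicts rather than a Dask DataFrame or LSDB catalog. The sketch takes the first `sample_rows` rows, which may differ from how the real function draws its sample.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Mapping


def sketch_column_report(
    rows: Iterable[Mapping[str, Any]],
    sample_rows: int = 200_000,
    max_examples: int = 3,
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: summarize columns from at most `sample_rows` rows."""
    report: Dict[str, Dict[str, Any]] = {}
    n_sampled = 0
    # islice stops after `sample_rows` items, so the rest of the input is never read.
    for row in islice(rows, sample_rows):
        n_sampled += 1
        for name, value in row.items():
            stats = report.setdefault(
                name, {"types": set(), "non_null": 0, "examples": []}
            )
            if value is None:
                continue  # nulls count toward the sample but not toward non_null
            stats["non_null"] += 1
            stats["types"].add(type(value).__name__)
            # Keep a few distinct example values per column.
            if len(stats["examples"]) < max_examples and value not in stats["examples"]:
                stats["examples"].append(value)
    for stats in report.values():
        stats["types"] = sorted(stats["types"])  # make the type set deterministic
        stats["sampled_rows"] = n_sampled
    return report


# Illustrative data only; the column names are made up.
rows = [
    {"ra": 10.5, "dec": -3.2, "band": "g"},
    {"ra": 11.0, "dec": None, "band": "r"},
    {"ra": 12.25, "dec": 4.0, "band": "g"},
]
report = sketch_column_report(rows, sample_rows=2)
print(report["dec"])
```

Capping the number of rows read bounds the cost of the summary regardless of the input size. The trade-off is that statistics such as the non-null count describe only the sample, not the full dataset.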