Photo-z Server

Tutorial Notebook 1 - Spec-z Catalogs

Last verified run: 2024-Jul-22

Introduction

Welcome to the PZ Server tutorials. If you are reading this notebook for the first time, we recommend starting with the introduction notebook, 0_introduction.ipynb, which is available in this same repository.

Imports and Setup

from pzserver import PzServer
import matplotlib.pyplot as plt
%reload_ext autoreload
%autoreload 2

# pz_server = PzServer(token="<your token>", host="pz-dev") # "pz-dev" is the temporary host for test phase

For convenience, the token can be saved in a file named token.txt (which is already listed in this repository's .gitignore file).

with open('token.txt', 'r') as file:
    token = file.read().strip()  # remove the trailing newline that most editors add
pz_server = PzServer(token=token, host="pz-dev") # "pz-dev" is the temporary host for test phase
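As a variation on the cell above, the token lookup can be wrapped in a small helper that also falls back to an environment variable. This is only a sketch: the function name load_token and the variable name PZ_SERVER_TOKEN are hypothetical and are not part of the pzserver library.

```python
import os
from pathlib import Path


def load_token(path="token.txt", env_var="PZ_SERVER_TOKEN"):
    """Return the token stored in `path`, else the one in the environment.

    Both `load_token` and the default `env_var` name are illustrative
    assumptions, not part of the pzserver API.
    """
    token_file = Path(path)
    if token_file.is_file():
        # strip() drops the trailing newline usually left by text editors
        return token_file.read_text().strip()
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"No token found in {path} or in ${env_var}")
    return token


# Usage (needs a valid token and network access, so it is left commented out):
# pz_server = PzServer(token=load_token(), host="pz-dev")
```

Keeping the lookup in one place means the rest of the notebook does not change if the token is later moved out of the working directory.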

Product types

The PZ Server API provides Python classes with useful methods to handle particular product types. Let’s recap the product types available:

pz_server.display_product_types()

Product type	Description
Spec-z Catalog	Catalog of spectroscopic redshifts and positions (usually equatorial coordinates).
Training Set	Training set for photo-z algorithms (tabular data). It usually contains magnitudes, errors, and true redshifts.
Validation Results	Results of a photo-z validation procedure (free format). Usually contains photo-z estimates (single estimates and/or pdf) of a validation set and photo-z validation metrics.
Photo-z Table	Results of a photo-z estimation procedure. If the data is larger than the file upload limit (200MB), the product entry stores only the metadata (instructions on accessing the data should be provided in the description field).

Spec-z Catalogs

In the context of the PZ Server, Spec-z Catalogs are defined as any catalog containing spherical equatorial coordinates and spectroscopic redshift measurements (or, analogously, true redshifts from simulations). A Spec-z Catalog can include data from a single spectroscopic survey or a combination of data from several sources. To be considered a single Spec-z Catalog, the data should be provided as a single file to the PZ Server’s upload tool. For multi-survey catalogs, it is recommended to add the survey name or identification as an extra column.

Mandatory columns:

* Right ascension [degrees] - float
* Declination [degrees] - float
* Spectroscopic or true redshift - float

Recommended columns:

* Spectroscopic redshift error - float
* Quality flag - integer, float, or string
* Survey name (recommended for compilations of data from different surveys)
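Before uploading a file, it can help to check the mandatory columns locally. The sketch below is not part of the pzserver library: the function check_specz_rows is hypothetical, and the column names RA, DEC, and Z are an assumption borrowed from the GAMA example used later in this notebook (the server may accept other names). It only checks that each mandatory value is present, parses as a float, and, for declination, lies within [-90, 90] degrees.

```python
import csv
import io

# Assumed column names; adjust to match the catalog being checked.
MANDATORY = ("RA", "DEC", "Z")


def check_specz_rows(rows, mandatory=MANDATORY):
    """Return a list of human-readable problems found in the catalog rows."""
    problems = []
    for i, row in enumerate(rows):
        for col in mandatory:
            value = row.get(col)
            if value is None or value == "":
                problems.append(f"row {i}: missing {col}")
                continue
            try:
                number = float(value)
            except ValueError:
                problems.append(f"row {i}: {col} is not a float")
                continue
            # Declination in degrees must lie between -90 and +90.
            if col == "DEC" and not -90.0 <= number <= 90.0:
                problems.append(f"row {i}: DEC out of range")
    return problems


# A tiny made-up CSV with two deliberate mistakes.
sample = """RA,DEC,Z
154.5,-1.10,0.2248
180.1,abc,0.2178
215.8,1.17,
"""
rows = list(csv.DictReader(io.StringIO(sample)))
problems = check_specz_rows(rows)
for line in problems:
    print(line)
```

The same function can be pointed at a real file by passing csv.DictReader(open(path, newline="")) instead of the in-memory sample.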

Spec-z Catalogs can be uploaded by users on the PZ Server website or via the pzserver library. They can also be created by combining other Spec-z Catalogs previously registered in the system, using the PZ Server’s pipeline “Combine Spec-z Catalogs” (under development). Any catalog built by the pipeline is automatically registered as a regular user-generated data product and is no different from an uploaded one.
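To illustrate what a multi-survey compilation with a survey-name column looks like, here is a minimal pandas sketch. It is not the “Combine Spec-z Catalogs” pipeline, whose internals are not described here: it simply stacks catalogs row-wise and makes no attempt to match or remove duplicate objects. The function combine_catalogs, the column name SURVEY, and the data values are all made up for illustration.

```python
import pandas as pd

# Two tiny made-up catalogs standing in for previously registered products.
survey_a = pd.DataFrame({"RA": [10.0, 11.0], "DEC": [-1.0, -2.0], "Z": [0.10, 0.20]})
survey_b = pd.DataFrame({"RA": [12.0], "DEC": [0.5], "Z": [0.35]})


def combine_catalogs(catalogs):
    """Stack catalogs row-wise, tagging each row with its survey name.

    `catalogs` maps a survey name to its DataFrame. Hypothetical helper,
    not part of pzserver.
    """
    tagged = [df.assign(SURVEY=name) for name, df in catalogs.items()]
    return pd.concat(tagged, ignore_index=True)


combined = combine_catalogs({"survey_a": survey_a, "survey_b": survey_b})
print(combined)
```

Saving the result with combined.to_csv("combined.csv", index=False) would produce a single file, which is the form the upload tool expects for one Spec-z Catalog.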

Let’s see an example of a Spec-z Catalog:

gama = pz_server.get_product(14)

Connecting to PZ Server...
Done!

gama.display_metadata()

key	value
id	14
release	None
product_type	Spec-z Catalog
uploaded_by	gschwend
internal_name	14_gama_specz_subsample
product_name	GAMA spec-z subsample
official_product	False
pz_code
description	A small subsample of the GAMA DR3 spec-z catalog (Baldry et al. 2018) as an example of a typical spec-z catalog from the literature.
created_at	2023-03-29T20:02:45.223568Z
main_file	specz_subsample_gama_example.csv

Display basic statistics

gama.data.describe()

	ID	RA	DEC	Z	ERR_Z	FLAG_DES
count	2.576000e+03	2576.000000	2576.000000	2576.000000	2576.0	2576.000000
mean	1.105526e+06	154.526343	-1.101865	0.224811	99.0	3.949534
std	4.006668e+04	70.783868	2.995036	0.102571	0.0	0.218947
...	...	...	...	...	...	...
50%	1.103558e+06	180.140145	-0.480830	0.217804	99.0	4.000000
75%	1.140619e+06	215.836583	1.170363	0.291810	99.0	4.000000
max	1.176440e+06	223.497080	2.998180	0.728717	99.0	4.000000

8 rows × 6 columns

The spec-z catalog object has a very basic plot method for quick visualization of catalog properties. For advanced interactive data visualization tips, we recommend the notebook DP02_06b_Interactive_Catalog_Visualization.ipynb from Rubin Observatory’s DP0.2 tutorial-notebooks repository.

gama.plot()

The attribute data, which is a DataFrame, preserves the plot method from Pandas.

gama.data.plot(x="RA", y="DEC", kind="scatter")

<Axes: xlabel='RA', ylabel='DEC'>
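Because data is a regular DataFrame, the usual pandas selections should apply to it as well. The sketch below uses a few made-up rows in place of gama.data so that it runs without a server connection; the column names follow the describe() output above, and the redshift bounds are arbitrary.

```python
import pandas as pd

# Made-up rows standing in for gama.data.
data = pd.DataFrame({
    "RA": [154.5, 180.1, 215.8, 223.4],
    "DEC": [-1.10, -0.48, 1.17, 2.99],
    "Z": [0.05, 0.22, 0.29, 0.73],
})

# Keep only objects inside a chosen redshift interval (bounds are inclusive).
selected = data[data["Z"].between(0.1, 0.3)]

print(len(selected))         # number of rows kept
print(selected["Z"].mean())  # mean redshift of the selection
```

With a live connection, replacing data with gama.data should give the same kind of selection on the real catalog.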

User feedback

Is something important missing? Please open an issue in the PZ Server library repository on GitHub.