Data for QZO: A Catalog of 5 Million Quasars from the Zwicky Transient Facility
Dataset   Open access

S. J. Nakoneczny, M. J. Graham, D. Stern, G. Helou, S. G. Djorgovski, E. C. Bellm, T. X. Chen, R. Dekany, A. Drake, A. A. Mahabal, …
24 Jul 2025
https://doi.org/10.5281/zenodo.16410987

Abstract

QZO.csv
The QZO catalog, which contains 4,849,574 objects and the columns described below, except for the duplicate-objects flag. The classifications are based on XGB models trained on the ZTF g-band median magnitude, the light-curve classification from a transformer model, and WISE W[1-4] magnitudes and colors. The photo-zs are based on the ZTF g-band magnitude and WISE magnitudes and colors. We remove duplicated ZTF light curves by discarding objects that, within the full ZTF catalog, have at least one neighbour within 1 arcsec with more ZTF observation epochs. The final quasar sample was obtained with cuts on magnitude, number of observation epochs, and minimum quasar classification probability: g < n_obs / 80 + 20.375, where n_obs is the number of ZTF observation epochs per light curve, and p_(QSO) > 0.9, where p_(QSO) is the XGB classification probability for the QSO class. The photo-zs are available for 35% of these objects, depending on the availability of WISE observations.

ZTF_all_QSO.csv
This file provides all the columns for 78,078,450 objects classified as QSOs by at least one of the two XGB models, with and without the WISE features. No cuts are applied and no duplicates are removed. 26% of the objects are marked with the duplicates flag.

train.csv
The predictions for the training data. This file contains 2,588,221 records and does not include the ZTF ID or the duplicates flag. ZTF duplicates were removed by selecting the longest ZTF light curve for each non-duplicated SDSS object.
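As an illustration of the QZO.csv selection cuts described above, the following minimal sketch applies the magnitude/epoch cut and the probability cut to a few made-up rows. The function name `passes_qzo_cuts`, the example values, and the choice of the `mag_median`, `n_obs`, and `p_QSO` columns as inputs are assumptions made for this example; the record does not state which probability column was used for the published cut, and the sketch does not reproduce the duplicate removal.

```python
def passes_qzo_cuts(mag_median, n_obs, p_qso, p_min=0.9):
    """Return True if an object satisfies both cuts quoted in the abstract.

    mag_median: ZTF g-band median magnitude
    n_obs:      number of ZTF observation epochs in the light curve
    p_qso:      XGB classification probability for the QSO class
    """
    mag_limit = n_obs / 80 + 20.375  # fainter limit for better-sampled light curves
    return mag_median < mag_limit and p_qso > p_min


# Made-up rows for illustration only (not taken from the catalog).
rows = [
    {"ID": "a", "mag_median": 20.0, "n_obs": 40, "p_QSO": 0.95},
    {"ID": "b", "mag_median": 21.0, "n_obs": 40, "p_QSO": 0.95},
    {"ID": "c", "mag_median": 21.0, "n_obs": 80, "p_QSO": 0.95},
    {"ID": "d", "mag_median": 20.0, "n_obs": 40, "p_QSO": 0.90},
]

selected = [
    r["ID"]
    for r in rows
    if passes_qzo_cuts(r["mag_median"], r["n_obs"], r["p_QSO"])
]
```

With these rows, object "b" fails the magnitude cut (the limit for 40 epochs is 20.875), while "c" passes at the same magnitude because 80 epochs raise the limit to 21.375; "d" fails because the probability cut is strict.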
Catalog columns

ID                            ZTF identifier
ra                            right ascension
dec                           declination
n_obs                         number of ZTF observation epochs
is_duplicate                  flag indicating duplicated light curves
mag_median                    ZTF g-band median magnitude
p_[galaxy, QSO, star]         classification probabilities
p_WISE_[galaxy, QSO, star]    classification probabilities with added WISE data
redshift                      redshift estimate

ANN_clf.[data-00000-of-00001, index]
ANN model for the classification of ZTF g-band light curves. The ANN model is trained on ZTF g-band data with at least 20 observation epochs per light curve. Input light curves do not need to be scaled beforehand; scaling is done separately for each light curve as part of the transformer model. An example of how to load and use the ANN can be found in the script “run_inference.py” in the GitHub repository.

XGB_clf__ZTF_[PS, WISE, GAIA, PS_WISE, PS_GAIA, WISE_GAIA, PS_WISE_GAIA].pickle
XGB_z__ZTF_WISE.pickle
XGB classification and redshift models for different combinations of input surveys. The XGB classification models are trained on all ZTF data with an available ANN classification, so that they learn to classify objects with missing features. The XGB redshift model does not include the ANN classification as features. An example of how to load and use the XGB models can be found in the script “run_inference_XGB.py” in the GitHub repository.

Features order

ZTF     g_mag_median, p_ANN_galaxy, p_ANN_QSO, p_ANN_star
PS      g, r, i, z, g - r, g - i, g - z, r - i, r - z, i - z
WISE    W1, W2, W3, W4, W1 - W2, W1 - W3, W1 - W4, W2 - W3, W2 - W4, W3 - W4
GAIA    g_mean_mag, parallax, pmra, pmdec, bp_mean_mag, rp_mean_mag, bp_rp_excess_factor

The exact column names can be found in the script “features.py” in the GitHub repository.
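The PS and WISE feature orders listed above follow a regular pattern: the magnitudes in band order, then every pairwise color in the same order. The sketch below rebuilds that ordering with the standard library, as a way to check a feature vector before passing it to a model. The helper names, the labels used for the features, the example magnitudes, and the assumption that a combined model such as ZTF_WISE takes the ZTF features followed by the WISE features are all illustrative; the authoritative column names and order are those in “features.py”.

```python
from itertools import combinations

# Feature labels as written in the list above (not the exact column names).
ZTF_FEATURES = ["g_mag_median", "p_ANN_galaxy", "p_ANN_QSO", "p_ANN_star"]
PS_BANDS = ["g", "r", "i", "z"]
WISE_BANDS = ["W1", "W2", "W3", "W4"]


def mags_then_colors(bands):
    """Labels: magnitudes in band order, then all pairwise colors."""
    return list(bands) + [f"{a} - {b}" for a, b in combinations(bands, 2)]


def mags_then_color_values(mags):
    """Values in the same order; NaN magnitudes propagate into their colors."""
    return list(mags) + [a - b for a, b in combinations(mags, 2)]


ps_names = mags_then_colors(PS_BANDS)
wise_names = mags_then_colors(WISE_BANDS)

# Assumption: a combined model concatenates survey blocks in file-name order.
ztf_wise_names = ZTF_FEATURES + wise_names

# Made-up WISE magnitudes with W3 and W4 missing (NaN).
nan = float("nan")
wise_values = mags_then_color_values([15.2, 14.1, nan, nan])
```

Here `wise_values` has ten entries: the four magnitudes, then W1 - W2 computed from the two available bands, with every color involving W3 or W4 left as NaN.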
