DESI Early Data Release (EDR) Catalog
Data
The Dark Energy Spectroscopic Instrument Early Data Release (DESI-EDR; DESI Collaboration, 2024) catalog contains positions and information extracted from the spectra (object types and redshifts) for over 2 million sources selected from the Survey Validation observations. The final sample integrated into NED is a subset of 1,852,883 sources that correspond to unique science targets; it excludes sky and standard star fibers as well as known faulty fibers (hardware or observing issues); see Section 2.4 of DESI Collaboration, 2024 for more details. We note that roughly 73k of these objects have duplicate names with nearly or exactly identical positions, and we have reported these findings to the DESI team. These duplicate spectra are likely additional observations of effectively the same object, and their redshifts are attached to the same NED object as independent measurements. The combined sky area of DESI-EDR is roughly 1390 square degrees, and the objects within it comprise 66% galaxies, 29% stars, and 5% QSOs according to "SPECTYPE". Please refer to the DESI-EDR page for more details on the observations and data products, including the Caveats and Limitations.
Nomenclature
Format: DESI JDDD.dddd+/-DD.dddd
Example: DESI J002.4135+09.8203
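As a minimal sketch of the name format above, the following Python function builds a name from ICRS coordinates in decimal degrees. The function name `desi_name` is hypothetical, and the use of rounding (rather than truncation) to four decimal places is an assumption; the catalog documentation should be consulted for the exact convention.

```python
def desi_name(ra_deg: float, dec_deg: float) -> str:
    """Format a name as 'DESI JDDD.dddd+/-DD.dddd' from ICRS degrees.

    Assumption: coordinates are rounded (not truncated) to 4 decimals.
    """
    if not (0.0 <= ra_deg < 360.0 and -90.0 <= dec_deg <= 90.0):
        raise ValueError("coordinates out of range")
    # 08.4f  -> zero-padded to 8 characters with 4 decimals, e.g. 002.4135
    # +08.4f -> explicit sign, zero-padded to 8 characters, e.g. +09.8203
    return f"DESI J{ra_deg:08.4f}{dec_deg:+08.4f}"


print(desi_name(2.4135, 9.8203))  # DESI J002.4135+09.8203
```

Note that with rounding, a right ascension within 0.00005 degrees of 360 would print as 360.0000; a production implementation would need to handle that edge case.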
Processing Notes
Nomenclature, astrometry, redshifts, and redshift quality flags for DESI-EDR were integrated into NED using the latest data ingestion and cross-matching pipeline (see Ogle et al. 2015 for a description of the cross-matching algorithm MatchEx). The equatorial coordinates from the catalog are the best estimates of the ICRS celestial position of each source. We adopted a global position uncertainty for DESI-EDR derived from a combination of the uncertainties in the source position from imaging (0.03 arcsec from the DESI Legacy Survey) and the spectroscopic fiber position (0.14 arcsec; 2024AJ....168...35S), which combine in quadrature to 0.143 arcsec (1-sigma).
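The quoted total of 0.143 arcsec is consistent with adding the two uncertainty terms in quadrature (a linear sum would give 0.17 arcsec). A short check, with variable names chosen here for illustration:

```python
import math

imaging_sigma = 0.03  # arcsec, source position uncertainty from imaging
fiber_sigma = 0.14    # arcsec, spectroscopic fiber position uncertainty

# Quadrature sum: sqrt(imaging_sigma**2 + fiber_sigma**2)
total_sigma = math.hypot(imaging_sigma, fiber_sigma)
print(f"{total_sigma:.3f} arcsec")  # 0.143 arcsec
```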
Cross-matching with NED Objects
Of the 1,852,883 DESI-EDR sources, 70% were cross-matched with existing objects in NED, while 30% became new NED objects. Below we show the distribution of NED objects matched with DESI-EDR sources (Left) and those not matched (Right), in Galactic coordinates. In addition, 91% of the DESI-EDR objects had no previous redshift measurements in NED; this includes new objects and those matched with objects in NED that did not have a redshift. The ingestion of DESI-EDR redshifts increased the number of objects with redshifts in NED by 19%, bringing the total number to 10.7 M.
Left: Sky plot of matches in Galactic coordinates. Right: Sky plot of non-matches in Galactic coordinates.
Validation of Matches
The matching of 1.85M DESI-EDR sources to the objects in NED was validated in several ways: examining property distributions (separations and redshift differences; dz) for matches and non-matches, examining object type agreement for matched sources, examining the frequency of match decision nodes (the number of sources with match decisions that occurred in a given decision tree node) in our matching algorithm, and inspecting all matches and possible matches in a set of test patches across the sky that capture the range of match situations.
Of the 70% of DESI-EDR sources that matched to a NED object, we find that: 90% have small separations (<0.56 arcsec), 87% of those with available dz measurements have dz < 10%, the majority have object type pairs that are compatible (e.g., galaxy-to-galaxy and star-to-star), and 99% of the match decisions were selected based on a single compatible candidate.
Of the 30% of DESI-EDR sources not matched to a NED object, we find that: 90% have large separations (>1.74 arcsec) from their closest NED candidate, the dz distribution has a decreased peak near zero and a larger tail of offset dz values, and 99% of the non-match decisions were made because no compatible candidates were available, either because none were in the vicinity or because none remained after incompatible candidates were removed.
Expert vetting and subsequent matches of all sources in a set of test patches showed that only 4.7% of these matches required intervention, where 3.0% were false negatives (the algorithm resulted in non-matches that should have been matched) and 1.6% were false positives (the algorithm resulted in matches that should not have been made). Assuming the test regions are representative of the entire catalog, we estimate an overall accuracy of 95.3%.
DESI Sources Matched to NED objects
Of the 1.85 M DESI-EDR sources, 70.1% were matched to a NED object. To assess the accuracy of these matches we investigate several property distributions, the object types of the matches, and the match node frequency for the matches.
The figure above shows the separation between the DESI-NED matched sources, where we find that 90% of the matches have separations less than 0.56 arcsec. These small offsets give confidence that the DESI and NED sources are the same object.
The figure above shows the redshift differences of the DESI-NED matched sources, where 12.7% (N=165k) of the matched sources had an existing NED redshift from which to compute a redshift difference (dz). The dz distribution is highly peaked around zero, and we find that 86.9% of the matches with a dz showed agreement (dz < 10%) between the DESI and NED values, indicating good matches.
The frequency of matched DESI and NED object type pairs shows that 83% of matches are compatible type matches (e.g., galaxy-to-galaxy, star-to-star, or galaxy/star-to-IR source). We also find that ∼6% of the matches showed seemingly incompatible type pairs: QSO-to-star and galaxy-to-star. However, upon further inspection of these matches, we find that 90% of them have separations of less than 0.5 arcsec and only 1.5% have other sources within a few arcsec in the Legacy Survey images. Thus, we conclude that the vast majority of these seemingly incompatible matches are due to "stars" not yet identified as QSOs or to incorrect object type classifications. We note that in the latter case, the DESI-EDR ingestion will help NED to update these object types.
We can also assess the validity of these matches by examining the frequency of match decision tree nodes in our MatchEx algorithm. After the initial node of removing incompatible match candidates (statistical metrics based on positions and uncertainties, redshift differences greater than 30%, and object type mismatches across hierarchies), we find that a match selection was made 98.9% of the time to the only candidate left after these cuts. The very high fraction of matches selected based on only a single compatible candidate gives further confidence in our matches.
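The first decision node described above can be illustrated with a simplified sketch. This is not the MatchEx implementation: the `Candidate` class, the `first_node` function, the fixed separation threshold (standing in for the statistical position metrics), the definition of fractional dz as |z_in - z_cand| / (1 + z_in), and the single `type_family` label (standing in for the object type hierarchies) are all assumptions made here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    name: str
    sep_arcsec: float     # separation from the incoming source
    z: Optional[float]    # existing redshift, if any
    type_family: str      # hypothetical coarse type-hierarchy label


def frac_dz(z_in: float, z_cand: float) -> float:
    # Assumed definition of the fractional redshift difference.
    return abs(z_in - z_cand) / (1.0 + z_in)


def first_node(
    z_in: float,
    family_in: str,
    candidates: List[Candidate],
    max_sep_arcsec: float = 2.0,  # hypothetical stand-in for position metrics
    max_dz: float = 0.30,         # 30% redshift cut quoted in the text
) -> Tuple[str, Optional[Candidate]]:
    """Remove incompatible candidates, then decide if exactly one remains."""
    compatible = []
    for c in candidates:
        if c.sep_arcsec > max_sep_arcsec:
            continue  # positionally incompatible
        if c.z is not None and frac_dz(z_in, c.z) > max_dz:
            continue  # redshift difference too large
        if c.type_family != family_in:
            continue  # object types in different hierarchies
        compatible.append(c)
    if len(compatible) == 1:
        return "match", compatible[0]
    if not compatible:
        return "new_object", None
    return "needs_further_nodes", None  # later tree nodes are not modeled


cands = [
    Candidate("A", 0.3, 0.101, "extragalactic"),
    Candidate("B", 0.8, 0.900, "extragalactic"),
]
decision, chosen = first_node(0.100, "extragalactic", cands)
print(decision, chosen.name if chosen else None)  # match A
```

In this toy case candidate B is removed by the redshift cut, leaving a single compatible candidate, which corresponds to the situation reported for 98.9% of the matches.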
DESI Sources Not Matched to NED objects (new objects)
Of the 1.85 M DESI-EDR sources, 29.9% were not matched to a NED object, which consequently became new objects in the NED database. To assess the accuracy of these non-matches we investigate several property distributions and the decision tree match node frequency.
The figure above shows the separation between the DESI-EDR sources and their closest NED match candidate, where we find that 90% of the unmatched sources have separations greater than 1.74 arcsec; we note that this is 12 times the uncertainty of the incoming source positions. These larger offsets give confidence that the unmatched DESI sources and closest NED candidate are distinctly different objects.
The figure above shows the redshift difference between DESI-EDR sources and their closest NED match candidate, where 2.9% (N=16k) of the unmatched sources have a closest NED candidate that also has a redshift from which to compute a redshift difference (dz). The dz distribution does exhibit a peak near zero, although at greatly reduced absolute numbers, and a greater fraction (∼65%) of the sources lie in a tail of larger dz offsets. The greater fraction of large dz offsets between DESI-EDR sources and their closest NED candidate gives confidence that these are distinctly different objects. However, if we assume that the ∼6000 DESI-EDR sources in the peak that have NED candidates with low dz offset are missed matches, then we can use these sources to estimate the false-negative rate of matches for the entire catalog. Taking the subset of DESI-EDR sources that have either a dz from a match or a dz from the closest NED candidate (a total of 165,076 + 16,148), we estimate a false-negative match rate of 3.3%. This rate is in agreement with the estimate found by inspecting all matches in test regions (see next section).
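The false-negative estimate above follows directly from the quoted counts. A short check, using the approximate peak count of 6000 stated in the text (variable names are illustrative):

```python
n_peak_unmatched = 6000   # approximate number of unmatched sources with low dz
n_dz_matched = 165_076    # matched sources with a dz
n_dz_unmatched = 16_148   # unmatched sources with a dz to the closest candidate

false_negative_rate = n_peak_unmatched / (n_dz_matched + n_dz_unmatched)
print(f"{100 * false_negative_rate:.1f}%")  # 3.3%
```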
We can also assess the validity of these non-matches by examining the frequency of match decision nodes in our MatchEx algorithm. After the initial node of removing incompatible match candidates, we find that no selection was made 99.5% of the time, either because there were no NED candidates in the search radius or because no candidates passed the initial cuts. The very high fraction of non-match decisions based on N=0 compatible candidates gives further confidence in the match decision for these unmatched DESI-EDR sources.
Test Regions
We selected four test regions (total of 10,423 incoming DESI sources) designed to capture a range of DESI-NED match situations, and visually vetted the automated match results. During the vetting of these matches, a NED expert also provided their match assessments to serve as "ground truth" from which we can estimate the overall accuracy of the fully automated cross-matching. The four test regions are listed and graphically displayed below:
- an area inside the SDSS footprint, where NED will have redshifts from SDSS.
- an area outside the SDSS footprint, where NED will have fewer known redshifts.
- an area overlapping M31, where parts of galaxies (star clusters, HII regions, etc.) will be present in both NED and DESI.
- an area near the Galactic plane, where most of DESI's spectroscopically classified stars are likely to be located.
Of the 10k sources in the test regions, 65% were matched to a NED object and 35% were not. These fractions roughly agree with those from the entire DESI ingestion. The expert vetting of these matches resulted in 485 (4.7%) changes relative to the MatchEx selection, which suggests an overall match accuracy of 95.3%. Of the 4.7% changed by an expert, 317 (3.0%) are missed matches (false negatives) and 168 (1.6%) are incorrect matches (false positives). The low false-negative and false-positive rates for the matches in these test regions, as well as the high overall accuracy, give confidence in the matching process performed during the ingestion of the DESI-EDR catalog.
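The rates quoted for the test regions can be reproduced from the raw counts. The individual percentages are rounded independently, which is why 3.0% and 1.6% do not sum exactly to 4.7%. Variable names below are illustrative:

```python
n_sources = 10_423       # incoming DESI sources in the four test regions
n_false_negative = 317   # missed matches found by the expert
n_false_positive = 168   # incorrect matches found by the expert

n_changed = n_false_negative + n_false_positive  # 485
changed_pct = 100 * n_changed / n_sources
fn_pct = 100 * n_false_negative / n_sources
fp_pct = 100 * n_false_positive / n_sources
accuracy_pct = 100 - changed_pct

print(f"changed:  {changed_pct:.1f}%")   # 4.7%
print(f"FN:       {fn_pct:.1f}%")        # 3.0%
print(f"FP:       {fp_pct:.1f}%")        # 1.6%
print(f"accuracy: {accuracy_pct:.1f}%")  # 95.3%
```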
We thank David Schlegel and Anthony Kremin (Lawrence Berkeley National Lab) from the DESI Team for their assistance in providing the DESI-EDR catalog data with the content and format needed to streamline integration into NED.