Astrobase¶
Astrobase is a Python package for analyzing light curves and finding variable stars. It includes implementations of several period-finding algorithms, batch work drivers for working on large collections of light curves, and a web-app useful for reviewing and classifying light curves by stellar variability type. This package was spun out of a bunch of Python modules I wrote and maintain for my work with the HAT Exoplanet Surveys. It’s applicable to many other astronomical time-series observations, and includes support for the light curves produced by Kepler and TESS in particular.
Most functions in this package that deal with light curves (e.g. in the modules astrobase.lcfit, astrobase.lcmath, astrobase.periodbase, astrobase.plotbase, astrobase.checkplot) usually require three Numpy ndarrays as input: times, mags, and errs, so they should work with any time-series data that can be represented in this form. If you have flux time series measurements, most functions also take a magsarefluxes keyword argument that makes them handle flux light curves correctly.
The astrobase.lcproc subpackage implements drivers for working on large collections of light curve files, and includes functions to register your own light curve format so that it gets recognized and can be worked on by other Astrobase functions transparently.
- Guides for specific tasks are available as Jupyter notebooks at GitHub: astrobase-notebooks.
- The full API documentation generated automatically from the docstrings by Sphinx is available.
- The code for Astrobase is maintained at GitHub.
Install Astrobase from PyPI using pip:
# preferably in a virtualenv
# install Numpy to compile Fortran dependencies
$ pip install numpy
# install astrobase
$ pip install astrobase
Package contents¶
astrobase.astrokep module¶
Contains various useful tools for analyzing Kepler light curves.
-
astrobase.astrokep.
keplerflux_to_keplermag
(keplerflux, f12=174000.0)[source]¶ This converts the Kepler flux in electrons/sec to Kepler magnitude.
The kepler mag/flux relation is:
fkep = (10.0**(-0.4*(kepmag - 12.0)))*f12
f12 = 1.74e5 # electrons/sec
Parameters: - keplerflux (float or array-like) – The flux value(s) to convert to magnitudes.
- f12 (float) – The flux value in the Kepler band corresponding to Kepler mag = 12.0.
Returns: Magnitudes in the Kepler band corresponding to the input keplerflux flux value(s).
Return type: np.array
-
astrobase.astrokep.
keplermag_to_keplerflux
(keplermag, f12=174000.0)[source]¶ This converts the Kepler mag back to Kepler flux.
Parameters: - keplermag (float or array-like) – The Kepler magnitude value(s) to convert to fluxes.
- f12 (float) – The flux value in the Kepler band corresponding to Kepler mag = 12.0.
Returns: Fluxes in the Kepler band corresponding to the input keplermag magnitude value(s).
Return type: np.array
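The relation quoted above can be inverted by hand to move between the two quantities. The following is a minimal sketch in plain Python, not the package's implementation; the names `mag_to_flux` and `flux_to_mag` are hypothetical, and only the formula and the default f12 value come from the text above.

```python
import math

F12 = 1.74e5  # electrons/sec at Kepler mag = 12.0 (value quoted above)

def mag_to_flux(kepmag, f12=F12):
    """Kepler magnitude -> flux in electrons/sec, using the relation above."""
    return (10.0 ** (-0.4 * (kepmag - 12.0))) * f12

def flux_to_mag(kepflux, f12=F12):
    """Inverse of mag_to_flux: the same relation solved for kepmag."""
    return 12.0 - 2.5 * math.log10(kepflux / f12)

flux = mag_to_flux(14.5)        # fainter than mag 12, so flux < f12
roundtrip = flux_to_mag(flux)   # should recover the input magnitude
```

A star at exactly Kepler mag 12.0 maps to f12 itself, which is a quick sanity check on any custom f12 value.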
-
astrobase.astrokep.
keplermag_to_sdssr
(keplermag, kic_sdssg, kic_sdssr)[source]¶ Converts magnitude measurements in Kepler band to SDSS r band.
Parameters: - keplermag (float or array-like) – The Kepler magnitude value(s) to convert to SDSS r band magnitudes.
- kic_sdssg,kic_sdssr (float or array-like) – The SDSS g and r magnitudes of the object(s) from the Kepler Input Catalog. The .llc.fits MAST light curve file for a Kepler object contains these values in the FITS extension 0 header.
Returns: SDSS r band magnitude(s) converted from the Kepler band magnitude.
Return type: float or array-like
-
astrobase.astrokep.
flux_ppm_to_magnitudes
(ppm)[source]¶ This converts Kepler’s flux parts-per-million to magnitudes.
Mostly useful for turning PPMs reported by Kepler or TESS into millimag values to compare with ground-based surveys.
Parameters: ppm (float or array-like) – Kepler flux measurement errors or RMS values in parts-per-million.
Returns: Measurement errors or RMS values expressed in magnitudes.
Return type: float or array-like
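As a rough illustration of the conversion, the sketch below applies the usual -2.5 log10 relation to a fractional flux decrease. This is an assumption about the form of the conversion, not a copy of the package's code, and `ppm_to_mag` is a hypothetical name.

```python
import math

def ppm_to_mag(ppm):
    """Magnitude change for a fractional flux decrease of `ppm` parts-per-million.

    Assumes the standard relation dmag = -2.5*log10(1 - ppm/1e6).
    """
    return -2.5 * math.log10(1.0 - ppm / 1.0e6)

# 1000 ppm expressed in millimagnitudes, for comparison with ground-based RMS
rms_mmag = 1000.0 * ppm_to_mag(1000.0)
```

For small values the result is close to 1.086e-6 mag per ppm, so 1000 ppm is a little over 1 millimag.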
-
astrobase.astrokep.
read_kepler_fitslc
(lcfits, headerkeys=['TIMESYS', 'BJDREFI', 'BJDREFF', 'OBJECT', 'KEPLERID', 'RA_OBJ', 'DEC_OBJ', 'EQUINOX', 'EXPOSURE', 'CDPP3_0', 'CDPP6_0', 'CDPP12_0', 'PDCVAR', 'PDCMETHD', 'CROWDSAP', 'FLFRCSAP'], datakeys=['TIME', 'TIMECORR', 'CADENCENO', 'SAP_QUALITY', 'PSF_CENTR1', 'PSF_CENTR1_ERR', 'PSF_CENTR2', 'PSF_CENTR2_ERR', 'MOM_CENTR1', 'MOM_CENTR1_ERR', 'MOM_CENTR2', 'MOM_CENTR2_ERR'], sapkeys=['SAP_FLUX', 'SAP_FLUX_ERR', 'SAP_BKG', 'SAP_BKG_ERR'], pdckeys=['PDCSAP_FLUX', 'PDCSAP_FLUX_ERR'], topkeys=['CHANNEL', 'SKYGROUP', 'MODULE', 'OUTPUT', 'QUARTER', 'SEASON', 'CAMPAIGN', 'DATA_REL', 'OBSMODE', 'PMRA', 'PMDEC', 'PMTOTAL', 'PARALLAX', 'GLON', 'GLAT', 'GMAG', 'RMAG', 'IMAG', 'ZMAG', 'D51MAG', 'JMAG', 'HMAG', 'KMAG', 'KEPMAG', 'GRCOLOR', 'JKCOLOR', 'GKCOLOR', 'TEFF', 'LOGG', 'FEH', 'EBMINUSV', 'AV', 'RADIUS', 'TMINDEX'], apkeys=['NPIXSAP', 'NPIXMISS', 'CDELT1', 'CDELT2'], appendto=None, normalize=False)[source]¶ This extracts the light curve from a single Kepler or K2 LC FITS file.
This works on the light curves available at MAST:
- kplr{kepid}-{somedatething}_llc.fits files from the Kepler mission
- ktwo{epicid}-c{campaign}_llc.fits files from the K2 mission
Parameters: - lcfits (str) – The filename of a MAST Kepler/K2 light curve FITS file.
- headerkeys (list) – A list of FITS header keys that will be extracted from the FITS light curve file. These describe the observations. The default value for this is given in LCHEADERKEYS above.
- datakeys (list) – A list of FITS column names that correspond to the auxiliary measurements in the light curve. The default is LCDATAKEYS above.
- sapkeys (list) – A list of FITS column names that correspond to the SAP flux measurements in the light curve. The default is LCSAPKEYS above.
- pdckeys (list) – A list of FITS column names that correspond to the PDC flux measurements in the light curve. The default is LCPDCKEYS above.
- topkeys (list) – A list of FITS header keys that describe the object in the light curve. The default is LCTOPKEYS above.
- apkeys (list) – A list of FITS header keys that describe the flux measurement apertures used by the Kepler/K2 pipeline. The default is LCAPERTUREKEYS above.
- appendto (lcdict or None) – If appendto is an lcdict, will append measurements of this lcdict to that lcdict. This is used for consolidating light curves for the same object across different files (quarters). The appending does not care about the time order. To consolidate light curves in time order, use consolidate_kepler_fitslc below.
- normalize (bool) – If True, then each component light curve’s SAP_FLUX and PDCSAP_FLUX measurements will be normalized to 1.0 by dividing out the median flux for the component light curve.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
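The normalize option described above divides each component light curve by its median flux. A minimal sketch of that step in plain Python follows; the name `normalize_by_median` is hypothetical, and skipping NaN values when computing the median is an assumption rather than documented behavior.

```python
import math
from statistics import median

def normalize_by_median(fluxes):
    """Divide fluxes by the median of their finite values.

    NaN entries are ignored for the median and stay NaN in the output.
    """
    finite = [f for f in fluxes if not math.isnan(f)]
    med = median(finite)
    return [f / med for f in fluxes]

sap_flux = [1020.0, 1000.0, float("nan"), 980.0]
normalized = normalize_by_median(sap_flux)  # median of finite values is 1000.0
```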
-
astrobase.astrokep.
consolidate_kepler_fitslc
(keplerid, lcfitsdir, normalize=True, headerkeys=['TIMESYS', 'BJDREFI', 'BJDREFF', 'OBJECT', 'KEPLERID', 'RA_OBJ', 'DEC_OBJ', 'EQUINOX', 'EXPOSURE', 'CDPP3_0', 'CDPP6_0', 'CDPP12_0', 'PDCVAR', 'PDCMETHD', 'CROWDSAP', 'FLFRCSAP'], datakeys=['TIME', 'TIMECORR', 'CADENCENO', 'SAP_QUALITY', 'PSF_CENTR1', 'PSF_CENTR1_ERR', 'PSF_CENTR2', 'PSF_CENTR2_ERR', 'MOM_CENTR1', 'MOM_CENTR1_ERR', 'MOM_CENTR2', 'MOM_CENTR2_ERR'], sapkeys=['SAP_FLUX', 'SAP_FLUX_ERR', 'SAP_BKG', 'SAP_BKG_ERR'], pdckeys=['PDCSAP_FLUX', 'PDCSAP_FLUX_ERR'], topkeys=['CHANNEL', 'SKYGROUP', 'MODULE', 'OUTPUT', 'QUARTER', 'SEASON', 'CAMPAIGN', 'DATA_REL', 'OBSMODE', 'PMRA', 'PMDEC', 'PMTOTAL', 'PARALLAX', 'GLON', 'GLAT', 'GMAG', 'RMAG', 'IMAG', 'ZMAG', 'D51MAG', 'JMAG', 'HMAG', 'KMAG', 'KEPMAG', 'GRCOLOR', 'JKCOLOR', 'GKCOLOR', 'TEFF', 'LOGG', 'FEH', 'EBMINUSV', 'AV', 'RADIUS', 'TMINDEX'], apkeys=['NPIXSAP', 'NPIXMISS', 'CDELT1', 'CDELT2'])[source]¶ This gets all Kepler/K2 light curves for the given keplerid in lcfitsdir.
Searches recursively in lcfitsdir for all of the files belonging to the specified keplerid. Sorts the light curves by time. Returns an lcdict. This is meant to be used to consolidate light curves for a single object across Kepler quarters.
NOTE: keplerid is an integer (without the leading zeros). This is usually the KIC ID.
NOTE: if light curve time arrays contain nans, these and their associated measurements will be sorted to the end of the final combined arrays.
Parameters: - keplerid (int) – The Kepler ID of the object to consolidate LCs for, as an integer without any leading zeros. This is usually the KIC or EPIC ID.
- lcfitsdir (str) – The directory to look in for LCs of the specified object.
- normalize (bool) – If True, then each component light curve’s SAP_FLUX and PDCSAP_FLUX measurements will be normalized to 1.0 by dividing out the median flux for the component light curve.
- headerkeys (list) – A list of FITS header keys that will be extracted from the FITS light curve file. These describe the observations. The default value for this is given in LCHEADERKEYS above.
- datakeys (list) – A list of FITS column names that correspond to the auxiliary measurements in the light curve. The default is LCDATAKEYS above.
- sapkeys (list) – A list of FITS column names that correspond to the SAP flux measurements in the light curve. The default is LCSAPKEYS above.
- pdckeys (list) – A list of FITS column names that correspond to the PDC flux measurements in the light curve. The default is LCPDCKEYS above.
- topkeys (list) – A list of FITS header keys that describe the object in the light curve. The default is LCTOPKEYS above.
- apkeys (list) – A list of FITS header keys that describe the flux measurement apertures used by the Kepler/K2 pipeline. The default is LCAPERTUREKEYS above.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
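The note about NaN times being sorted to the end of the combined arrays can be pictured with a small sketch. This is plain Python on lists rather than the package's array code, and `sort_by_time_nans_last` is a hypothetical helper.

```python
import math

def sort_by_time_nans_last(times, fluxes):
    """Sort paired measurements by time, moving NaN times to the end."""
    pairs = sorted(
        zip(times, fluxes),
        # False sorts before True, so finite times come first;
        # NaN times get a dummy sort value because NaN cannot be compared.
        key=lambda p: (math.isnan(p[0]), 0.0 if math.isnan(p[0]) else p[0]),
    )
    return [p[0] for p in pairs], [p[1] for p in pairs]

times = [5.0, float("nan"), 1.0, 3.0]
fluxes = [0.5, 0.9, 0.1, 0.3]
sorted_times, sorted_fluxes = sort_by_time_nans_last(times, fluxes)
```

Each flux stays attached to its time, so the measurement paired with the NaN time ends up last as well.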
-
astrobase.astrokep.
read_k2sff_lightcurve
(lcfits)[source]¶ This reads a K2 SFF (Vanderburg+ 2014) light curve into an lcdict.
Use this with the light curves from the K2 SFF project at MAST.
Parameters: lcfits (str) – The filename of the FITS light curve file downloaded from MAST.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
-
astrobase.astrokep.
kepler_lcdict_to_pkl
(lcdict, outfile=None)[source]¶ This writes the lcdict to a Python pickle.
Parameters: - lcdict (lcdict) – This is the input lcdict to write to a pickle.
- outfile (str or None) – If this is None, the object’s Kepler ID/EPIC ID will be determined from the lcdict and used to form the filename of the output pickle file. If this is a str, the provided filename will be used.
Returns: The absolute path to the written pickle file.
Return type: str
-
astrobase.astrokep.
read_kepler_pklc
(picklefile)[source]¶ This turns the pickled lightcurve file back into an lcdict.
Parameters: picklefile (str) – The path to a previously written Kepler LC picklefile generated by kepler_lcdict_to_pkl above.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
-
astrobase.astrokep.
stitch_kepler_lcdict
(lcdict)[source]¶ This stitches Kepler light curves together across quarters.
FIXME: implement this.
Parameters: lcdict (lcdict) – An lcdict produced by consolidate_kepler_fitslc. The flux measurements between quarters will be stitched together.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing). The flux measurements will have been shifted to form a seamless light curve across quarters suitable for long-term variability investigation.
Return type: lcdict
-
astrobase.astrokep.
filter_kepler_lcdict
(lcdict, filterflags=True, nanfilter='sap, pdc', timestoignore=None)[source]¶ This filters the Kepler lcdict, removing nans and bad observations.
By default, this function removes points in the Kepler LC that have ANY quality flags set.
Parameters: - lcdict (lcdict) – An lcdict produced by consolidate_kepler_fitslc or read_kepler_fitslc.
- filterflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- nanfilter ({'sap','pdc','sap,pdc'}) – Indicates the flux measurement type(s) to apply the filtering to.
- timestoignore (list of tuples or None) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing). The lcdict is filtered IN PLACE!
Return type: lcdict
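The timestoignore format can be illustrated with a small masking sketch. This is not the package's code: `mask_time_windows` is a hypothetical helper, it returns new lists rather than filtering in place, and treating the window endpoints as inclusive is an assumption.

```python
def mask_time_windows(times, values, timestoignore):
    """Drop points whose time falls inside any (start, end) window."""
    kept_times, kept_values = [], []
    for t, v in zip(times, values):
        # Endpoints are treated as inclusive here (an assumption)
        inside = any(start <= t <= end for start, end in timestoignore)
        if not inside:
            kept_times.append(t)
            kept_values.append(v)
    return kept_times, kept_values

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
fluxes = [1.0, 1.1, 0.2, 0.3, 1.0, 0.9]
kept_t, kept_f = mask_time_windows(times, fluxes, [(1.5, 3.5)])
```

Several windows can be passed at once, for example `[(1.5, 3.5), (4.5, 5.5)]`, to mask more than one bad interval.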
-
astrobase.astrokep.
epd_kepler_lightcurve
(lcdict, xccol='mom_centr1', yccol='mom_centr2', timestoignore=None, filterflags=True, writetodict=True, epdsmooth=5)[source]¶ This runs EPD on the Kepler light curve.
Following Huang et al. 2015, we fit the following EPD function to a smoothed light curve, and then subtract it to obtain EPD corrected magnitudes:
f = c0 + c1*sin(2*pi*x) + c2*cos(2*pi*x) + c3*sin(2*pi*y) + c4*cos(2*pi*y) + c5*sin(4*pi*x) + c6*cos(4*pi*x) + c7*sin(4*pi*y) + c8*cos(4*pi*y) + c9*bgv + c10*bge
By default, this function removes points in the Kepler LC that have ANY quality flags set.
Parameters: - lcdict (lcdict) – An lcdict produced by consolidate_kepler_fitslc or read_kepler_fitslc.
- xccol,yccol (str) – Indicates the x and y coordinate column names to use from the Kepler LC in the EPD fit.
- timestoignore (list of tuples) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
- filterflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- writetodict (bool) –
If writetodict is True, adds the following columns to the lcdict:
epd_time = time array
epd_sapflux = uncorrected flux before EPD
epd_epdsapflux = corrected flux after EPD
epd_epdsapcorr = EPD flux corrections
epd_bkg = background array
epd_bkg_err = background errors array
epd_xcc = xcoord array
epd_ycc = ycoord array
epd_quality = quality flag array
and updates the ‘columns’ list in the lcdict as well.
- epdsmooth (int) – Sets the number of light curve points to smooth over when generating the EPD fit function.
Returns: Returns a tuple of the form: (times, epdfluxes, fitcoeffs, epdfit)
Return type: tuple
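To make the 11-term EPD function above concrete, here is a minimal sketch that evaluates it for one measurement. It only evaluates the model and does not perform the fit; `epd_model` is a hypothetical name, and reading bgv and bge as the background value and background error is an assumption based on the parameter descriptions in this module.

```python
import math

def epd_model(coeffs, x, y, bgv, bge):
    """Evaluate f = c0 + c1*sin(2*pi*x) + ... + c9*bgv + c10*bge."""
    c = coeffs
    two_pi, four_pi = 2.0 * math.pi, 4.0 * math.pi
    return (c[0]
            + c[1] * math.sin(two_pi * x) + c[2] * math.cos(two_pi * x)
            + c[3] * math.sin(two_pi * y) + c[4] * math.cos(two_pi * y)
            + c[5] * math.sin(four_pi * x) + c[6] * math.cos(four_pi * x)
            + c[7] * math.sin(four_pi * y) + c[8] * math.cos(four_pi * y)
            + c[9] * bgv + c[10] * bge)

# Illustrative coefficients: constant term, one x-centroid term, one background term
coeffs = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.01, 0.0]
value = epd_model(coeffs, x=0.25, y=0.0, bgv=10.0, bge=1.0)
```

Because the model is linear in the coefficients c0 through c10, a fit of this form can be carried out with ordinary linear least squares.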
-
astrobase.astrokep.
rfepd_kepler_lightcurve
(lcdict, xccol='mom_centr1', yccol='mom_centr2', timestoignore=None, filterflags=True, writetodict=True, epdsmooth=23, decorr='xcc, ycc', nrftrees=200)[source]¶ This uses a RandomForestRegressor to fit and decorrelate Kepler light curves.
Fits the X and Y positions, the background, and background error.
By default, this function removes points in the Kepler LC that have ANY quality flags set.
Parameters: - lcdict (lcdict) – An lcdict produced by consolidate_kepler_fitslc or read_kepler_fitslc.
- xccol,yccol (str) – Indicates the x and y coordinate column names to use from the Kepler LC in the EPD fit.
- timestoignore (list of tuples) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
- filterflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- writetodict (bool) –
If writetodict is True, adds the following columns to the lcdict:
rfepd_time = time array
rfepd_sapflux = uncorrected flux before EPD
rfepd_epdsapflux = corrected flux after EPD
rfepd_epdsapcorr = EPD flux corrections
rfepd_bkg = background array
rfepd_bkg_err = background errors array
rfepd_xcc = xcoord array
rfepd_ycc = ycoord array
rfepd_quality = quality flag array
and updates the ‘columns’ list in the lcdict as well.
- epdsmooth (int) – Sets the number of light curve points to smooth over when generating the EPD fit function.
- decorr ({'xcc,ycc','bgv,bge','xcc,ycc,bgv,bge'}) – Indicates whether to use the x,y coords alone; the background value and error alone; or the x,y coords together with the background value and error as the features to train the RandomForestRegressor on and perform the fit.
- nrftrees (int) – The number of trees to use in the RandomForestRegressor.
Returns: Returns a tuple of the form: (times, corrected_fluxes, flux_corrections)
Return type: tuple
-
astrobase.astrokep.
detrend_centroid
(lcd, detrend='legendre', sigclip=None, mingap=0.5)[source]¶ Detrends the x and y coordinate centroids for a Kepler light curve.
Given an lcdict for a single quarter of Kepler data, returned by read_kepler_fitslc, this function returns this same dictionary, appending detrended centroid_x and centroid_y values.
Here “detrended” means “finite, SAP quality flag set to 0, sigma clipped, timegroups selected based on mingap day gaps, then fit vs time by a legendre polynomial of lowish degree”.
Parameters: - lcd (lcdict) – An lcdict generated by the read_kepler_fitslc function.
- detrend ({'legendre'}) – Method by which to detrend the LC. ‘legendre’ is the only thing implemented at the moment.
- sigclip (None or float or int or sequence of floats/ints) – Determines the type and amount of sigma-clipping done on the light curve to remove outliers. If None, no sigma-clipping is performed. If a two element sequence of floats/ints, the first element corresponds to the fainter sigma-clip limit, and the second element corresponds to the brighter sigma-clip limit.
- mingap (float) – Number of days by which to define “timegroups” (for fitting each timegroup individually, and to eliminate “burn-in” of the Kepler spacecraft). For long cadence data, 0.5 days is typical.
Returns: This is of the form (lcd, errflag), where:
lcd : an lcdict with the new key lcd[‘centroids’], containing the detrended times, (centroid_x, centroid_y) values, and their errors.
errflag : boolean error flag, could be raised at various points.
Return type: tuple
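The “timegroups” idea used here, splitting a light curve wherever consecutive points are separated by more than mingap days, can be sketched as follows. This is an illustration only, assuming a non-empty, time-sorted list; `find_timegroups` is a hypothetical helper and not the package's routine.

```python
def find_timegroups(times, mingap=0.5):
    """Split sorted times into groups separated by gaps larger than mingap."""
    groups = []
    current = [times[0]]
    for prev, curr in zip(times, times[1:]):
        if curr - prev > mingap:
            # Gap is large enough: close the current group and start a new one
            groups.append(current)
            current = []
        current.append(curr)
    groups.append(current)
    return groups

times = [0.00, 0.02, 0.04, 1.00, 1.02, 3.00]
groups = find_timegroups(times, mingap=0.5)
```

Each resulting group would then be fit separately, which keeps a polynomial from having to bridge a data gap.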
-
astrobase.astrokep.
get_centroid_offsets
(lcd, t_ing_egr, oot_buffer_time=0.1, sample_factor=3)[source]¶ After running detrend_centroid, this gets positions of centroids during transits, and outside of transits.
These positions can then be used in a false positive analysis.
This routine requires knowing the ingress and egress times for every transit of interest within the quarter this routine is being called for. There is currently no astrobase routine that automates this for periodic transits (it must be done in a calling routine).
To get out of transit centroids, this routine takes points outside of the “buffer” set by oot_buffer_time, sampling 3x as many points on either side of the transit as are in the transit (or however many are specified by sample_factor).
Parameters: - lcd (lcdict) – An lcdict generated by the read_kepler_fitslc function. We assume that the detrend_centroid function has been run on this lcdict.
- t_ing_egr (list of tuples) –
This is of the form:
[(ingress time of i^th transit, egress time of i^th transit)]
for i the transit number index in this quarter (starts at zero at the beginning of every quarter). Assumes units of BJD.
- oot_buffer_time (float) – Number of days away from ingress and egress times to begin sampling “out of transit” centroid points. The number of out of transit points to take per transit is 3x the number of points in transit.
- sample_factor (float) – The size of out of transit window from which to sample.
Returns: This is a dictionary keyed by transit number (i.e., the same index as t_ing_egr), where each key contains the following value:
{'ctd_x_in_tra':ctd_x_in_tra, 'ctd_y_in_tra':ctd_y_in_tra, 'ctd_x_oot':ctd_x_oot, 'ctd_y_oot':ctd_y_oot, 'npts_in_tra':len(ctd_x_in_tra), 'npts_oot':len(ctd_x_oot), 'in_tra_times':in_tra_times, 'oot_times':oot_times}
Return type: dict
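The in-transit and out-of-transit selection described above can be sketched for a single transit as follows. This is one reading of the description, not the package's algorithm: `split_transit_points` is a hypothetical helper, times are assumed to be sorted, and taking the points nearest the transit on each side is an assumption.

```python
def split_transit_points(times, t_ing, t_egr, oot_buffer_time=0.1, sample_factor=3):
    """Return (in_transit, out_of_transit) times for one transit."""
    in_tra = [t for t in times if t_ing <= t <= t_egr]
    # Number of out-of-transit points to take on EACH side of the transit
    n_oot = int(sample_factor * len(in_tra))
    before = [t for t in times if t < t_ing - oot_buffer_time]
    after = [t for t in times if t > t_egr + oot_buffer_time]
    # Keep the points closest to the transit, outside the buffer
    n_before = min(n_oot, len(before))
    oot = before[len(before) - n_before:] + after[:n_oot]
    return in_tra, oot

times = [round(0.05 * i, 2) for i in range(41)]  # 0.00 to 2.00 days
in_tra, oot = split_transit_points(times, t_ing=0.95, t_egr=1.05)
```

Looping this over every (ingress, egress) tuple in t_ing_egr gives one in-transit and one out-of-transit sample per transit, matching the shape of the returned dictionary.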
astrobase.astrotess module¶
Contains various tools for analyzing TESS light curves.
-
astrobase.astrotess.
normalized_flux_to_mag
(lcdict, columns=('sap.sap_flux', 'sap.sap_flux_err', 'sap.sap_bkg', 'sap.sap_bkg_err', 'pdc.pdcsap_flux', 'pdc.pdcsap_flux_err'))[source]¶ This converts the normalized fluxes in the TESS lcdicts to TESS mags.
Uses the object’s TESS mag stored in lcdict[‘objectinfo’][‘tessmag’]:
mag - object_tess_mag = -2.5 log (flux/median_flux)
Parameters: - lcdict (lcdict) – An lcdict produced by read_tess_fitslc or consolidate_tess_fitslc. This must have normalized fluxes in its measurement columns (use the normalize kwarg for these functions).
- columns (sequence of str) – The column keys of the normalized flux and background measurements in the lcdict to operate on and convert to magnitudes in TESS band (T).
Returns: The returned lcdict will contain extra columns corresponding to magnitudes for each input normalized flux/background column.
Return type: lcdict
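For a flux that is already normalized by its median, the relation above reduces to a one-line conversion. The sketch below is plain Python on scalars; `normalized_flux_to_tess_mag` is a hypothetical name, and 10.5 is an example magnitude chosen for illustration.

```python
import math

def normalized_flux_to_tess_mag(norm_flux, object_tess_mag):
    """mag = object_tess_mag - 2.5*log10(flux / median_flux).

    norm_flux is assumed to already be flux / median_flux.
    """
    return object_tess_mag - 2.5 * math.log10(norm_flux)

mag_at_median = normalized_flux_to_tess_mag(1.0, 10.5)  # equals the catalog mag
mag_in_dip = normalized_flux_to_tess_mag(0.99, 10.5)    # 1% dip is fainter
```

A 1% drop in flux corresponds to roughly 0.011 mag, and the magnitude increases because fainter objects have larger magnitudes.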
-
astrobase.astrotess.
read_tess_fitslc
(lcfits, headerkeys=['EXPOSURE', 'TIMEREF', 'TASSIGN', 'TIMESYS', 'BJDREFI', 'BJDREFF', 'TELAPSE', 'LIVETIME', 'INT_TIME', 'NUM_FRM', 'TIMEDEL', 'BACKAPP', 'DEADAPP', 'VIGNAPP', 'GAINA', 'GAINB', 'GAINC', 'GAIND', 'READNOIA', 'READNOIB', 'READNOIC', 'READNOID', 'CDPP0_5', 'CDPP1_0', 'CDPP2_0', 'PDCVAR', 'PDCMETHD', 'CROWDSAP', 'FLFRCSAP', 'NSPSDDET', 'NSPSDCOR'], datakeys=['TIME', 'TIMECORR', 'CADENCENO', 'QUALITY', 'PSF_CENTR1', 'PSF_CENTR1_ERR', 'PSF_CENTR2', 'PSF_CENTR2_ERR', 'MOM_CENTR1', 'MOM_CENTR1_ERR', 'MOM_CENTR2', 'MOM_CENTR2_ERR', 'POS_CORR1', 'POS_CORR2'], sapkeys=['SAP_FLUX', 'SAP_FLUX_ERR', 'SAP_BKG', 'SAP_BKG_ERR'], pdckeys=['PDCSAP_FLUX', 'PDCSAP_FLUX_ERR'], topkeys=['DATE-OBS', 'DATE-END', 'PROCVER', 'ORIGIN', 'DATA_REL', 'TIMVERSN', 'OBJECT', 'TICID', 'SECTOR', 'CAMERA', 'CCD', 'PXTABLE', 'RADESYS', 'RA_OBJ', 'DEC_OBJ', 'EQUINOX', 'PMRA', 'PMDEC', 'PMTOTAL', 'TESSMAG', 'TEFF', 'LOGG', 'MH', 'RADIUS', 'TICVER', 'CRMITEN', 'CRBLKSZ', 'CRSPOC'], apkeys=['NPIXSAP', 'NPIXMISS', 'CDELT1', 'CDELT2'], normalize=False, appendto=None, filterqualityflags=False, nanfilter=None, timestoignore=None)[source]¶ This extracts the light curve from a single TESS .lc.fits file.
This works on the light curves available at MAST.
TODO: look at:
https://archive.stsci.edu/missions/tess/doc/EXP-TESS-ARC-ICD-TM-0014.pdf
for details on the column descriptions and to fill in any other info we need.
Parameters: - lcfits (str) – The filename of a MAST TESS light curve FITS file.
- headerkeys (list) – A list of FITS header keys that will be extracted from the FITS light curve file. These describe the observations. The default value for this is given in LCHEADERKEYS above.
- datakeys (list) – A list of FITS column names that correspond to the auxiliary measurements in the light curve. The default is LCDATAKEYS above.
- sapkeys (list) – A list of FITS column names that correspond to the SAP flux measurements in the light curve. The default is LCSAPKEYS above.
- pdckeys (list) – A list of FITS column names that correspond to the PDC flux measurements in the light curve. The default is LCPDCKEYS above.
- topkeys (list) – A list of FITS header keys that describe the object in the light curve. The default is LCTOPKEYS above.
- apkeys (list) – A list of FITS header keys that describe the flux measurement apertures used by the TESS pipeline. The default is LCAPERTUREKEYS above.
- normalize (bool) – If True, then the light curve’s SAP_FLUX and PDCSAP_FLUX measurements will be normalized to 1.0 by dividing out the median flux for the component light curve.
- appendto (lcdict or None) – If appendto is an lcdict, will append measurements of this lcdict to that lcdict. This is used for consolidating light curves for the same object across different files (sectors/cameras/CCDs?). The appending does not care about the time order. To consolidate light curves in time order, use consolidate_tess_fitslc below.
- filterqualityflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- nanfilter ({'sap','pdc','sap,pdc'} or None) – Indicates the flux measurement type(s) to apply the filtering to.
- timestoignore (list of tuples or None) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
-
astrobase.astrotess.
consolidate_tess_fitslc
(lclist, normalize=True, filterqualityflags=False, nanfilter=None, timestoignore=None, headerkeys=['EXPOSURE', 'TIMEREF', 'TASSIGN', 'TIMESYS', 'BJDREFI', 'BJDREFF', 'TELAPSE', 'LIVETIME', 'INT_TIME', 'NUM_FRM', 'TIMEDEL', 'BACKAPP', 'DEADAPP', 'VIGNAPP', 'GAINA', 'GAINB', 'GAINC', 'GAIND', 'READNOIA', 'READNOIB', 'READNOIC', 'READNOID', 'CDPP0_5', 'CDPP1_0', 'CDPP2_0', 'PDCVAR', 'PDCMETHD', 'CROWDSAP', 'FLFRCSAP', 'NSPSDDET', 'NSPSDCOR'], datakeys=['TIME', 'TIMECORR', 'CADENCENO', 'QUALITY', 'PSF_CENTR1', 'PSF_CENTR1_ERR', 'PSF_CENTR2', 'PSF_CENTR2_ERR', 'MOM_CENTR1', 'MOM_CENTR1_ERR', 'MOM_CENTR2', 'MOM_CENTR2_ERR', 'POS_CORR1', 'POS_CORR2'], sapkeys=['SAP_FLUX', 'SAP_FLUX_ERR', 'SAP_BKG', 'SAP_BKG_ERR'], pdckeys=['PDCSAP_FLUX', 'PDCSAP_FLUX_ERR'], topkeys=['DATE-OBS', 'DATE-END', 'PROCVER', 'ORIGIN', 'DATA_REL', 'TIMVERSN', 'OBJECT', 'TICID', 'SECTOR', 'CAMERA', 'CCD', 'PXTABLE', 'RADESYS', 'RA_OBJ', 'DEC_OBJ', 'EQUINOX', 'PMRA', 'PMDEC', 'PMTOTAL', 'TESSMAG', 'TEFF', 'LOGG', 'MH', 'RADIUS', 'TICVER', 'CRMITEN', 'CRBLKSZ', 'CRSPOC'], apkeys=['NPIXSAP', 'NPIXMISS', 'CDELT1', 'CDELT2'])[source]¶ This consolidates a list of LCs for a single TIC object.
NOTE: if light curve time arrays contain nans, these and their associated measurements will be sorted to the end of the final combined arrays.
Parameters: - lclist (list of str, or str) – lclist is either a list of actual light curve files or a string that is valid for glob.glob to search for and generate a light curve list based on the file glob. This is useful for consolidating LC FITS files across different TESS sectors for a single TIC ID using a glob like *<TICID>*_lc.fits.
- normalize (bool) – If True, then the light curve’s SAP_FLUX and PDCSAP_FLUX measurements will be normalized to 1.0 by dividing out the median flux for the component light curve.
- filterqualityflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- nanfilter ({'sap','pdc','sap,pdc'} or None) – Indicates the flux measurement type(s) to apply the filtering to.
- timestoignore (list of tuples or None) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
- headerkeys (list) – A list of FITS header keys that will be extracted from the FITS light curve file. These describe the observations. The default value for this is given in LCHEADERKEYS above.
- datakeys (list) – A list of FITS column names that correspond to the auxiliary measurements in the light curve. The default is LCDATAKEYS above.
- sapkeys (list) – A list of FITS column names that correspond to the SAP flux measurements in the light curve. The default is LCSAPKEYS above.
- pdckeys (list) – A list of FITS column names that correspond to the PDC flux measurements in the light curve. The default is LCPDCKEYS above.
- topkeys (list) – A list of FITS header keys that describe the object in the light curve. The default is LCTOPKEYS above.
- apkeys (list) – A list of FITS header keys that describe the flux measurement apertures used by the TESS pipeline. The default is LCAPERTUREKEYS above.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
-
astrobase.astrotess.
tess_lcdict_to_pkl
(lcdict, outfile=None)[source]¶ This writes the lcdict to a Python pickle.
Parameters: - lcdict (lcdict) – This is the input lcdict to write to a pickle.
- outfile (str or None) – If this is None, the object’s TIC ID will be determined from the lcdict and used to form the filename of the output pickle file. If this is a str, the provided filename will be used.
Returns: The absolute path to the written pickle file.
Return type: str
-
astrobase.astrotess.
read_tess_pklc
(picklefile)[source]¶ This turns the pickled lightcurve file back into an lcdict.
Parameters: picklefile (str) – The path to a previously written TESS LC picklefile generated by tess_lcdict_to_pkl above.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing).
Return type: lcdict
-
astrobase.astrotess.
filter_tess_lcdict
(lcdict, filterqualityflags=True, nanfilter='sap, pdc, time', timestoignore=None, quiet=False)[source]¶ This filters the provided TESS lcdict, removing nans and bad observations.
By default, this function removes points in the TESS LC that have ANY quality flags set.
Parameters: - lcdict (lcdict) – An lcdict produced by consolidate_tess_fitslc or read_tess_fitslc.
- filterqualityflags (bool) – If True, will remove any measurements that have non-zero quality flags present. This usually indicates an issue with the instrument or spacecraft.
- nanfilter ({'sap','pdc','sap,pdc'}) – Indicates the flux measurement type(s) to apply the filtering to.
- timestoignore (list of tuples or None) –
This is of the form:
[(time1_start, time1_end), (time2_start, time2_end), ...]
and indicates the start and end times to mask out of the final lcdict. Use this to remove anything that wasn’t caught by the quality flags.
Returns: Returns an lcdict (this is useable by most astrobase functions for LC processing). The lcdict is filtered IN PLACE!
Return type: lcdict
astrobase.hatsurveys package¶
Submodules¶
astrobase.hatsurveys.hatlc module¶
This contains functions to read HAT sqlite (“sqlitecurves”) and CSV light curves generated by the new HAT data server.
The most useful functions in this module are:
read_csvlc(lcfile):
This reads a CSV light curve produced by the HAT data server into an
lcdict.
lcfile is the HAT gzipped CSV LC (with a .hatlc.csv.gz extension)
And:
read_and_filter_sqlitecurve(lcfile, columns=None, sqlfilters=None,
raiseonfail=False, forcerecompress=False):
This reads a sqlitecurve file and optionally filters it, returns an
lcdict.
Returns columns requested in columns. If None, then returns all columns
present in the latest columnlist in the lightcurve. See COLUMNDEFS for
the full list of HAT LC columns.
If sqlfilters is not None, it must be a list of text SQL filters that
apply to the columns in the lightcurve.
This returns an lcdict with an added 'lcfiltersql' key that indicates
what the parsed SQL filter string was.
If forcerecompress = True, will recompress the un-gzipped sqlitecurve
even if the gzipped form exists on disk already.
Finally:
describe(lcdict):
This describes the metadata of the light curve.
Command line usage¶
You can call this module directly from the command line:
If you just have this file alone:
$ chmod +x hatlc.py
$ ./hatlc.py --help
If astrobase is installed with pip, etc., this will be on your path already:
$ hatlc --help
These should give you the following:
usage: hatlc.py [-h] [--describe] hatlcfile
read a HAT LC of any format and output to stdout
positional arguments:
hatlcfile path to the light curve you want to read and pipe to stdout
optional arguments:
-h, --help show this help message and exit
--describe don't dump the columns, show only object info and LC metadata
Either one will dump any HAT LC recognized to stdout (or just dump the description if requested).
Other useful functions¶
Two other functions that might be useful:
normalize_lcdict(lcdict, timecol='rjd', magcols='all', mingap=4.0,
normto='sdssr', debugmode=False):
This normalizes magnitude columns (specified in the magcols keyword
argument) in an lcdict obtained from reading a HAT light curve. This
normalization is done by finding 'timegroups' in each magnitude column,
assuming that these belong to different 'eras' separated by a specified
gap in the mingap keyword argument, and thus may be offset vertically
from one another. Measurements within a timegroup are normalized to zero
using the median magnitude of the timegroup. Once all timegroups have
been processed this way, the whole time series is then re-normalized to
the specified value in the normto keyword argument.
And:
normalize_lcdict_byinst(lcdict, magcols='all', normto='sdssr',
normkeylist=('stf','ccd','flt','fld','prj','exp'),
debugmode=False)
This normalizes magnitude columns (specified in the magcols keyword
argument) in an lcdict obtained from reading a HAT light curve. This
normalization is done by generating a normalization key using columns in
the lcdict that specify various instrument properties. The default
normalization key (specified in the normkeylist kwarg) is a combination
of:
- HAT station IDs ('stf')
- camera position ID ('ccd'; useful for HATSouth observations)
- camera filters ('flt')
- observed HAT field names ('fld')
- HAT project IDs ('prj')
- camera exposure times ('exp')
with the assumption that measurements with identical normalization keys
belong to a single 'era'. Measurements within an era are normalized to
zero using the median magnitude of the era. Once all eras have been
processed this way, the whole time series is then re-normalized to the
specified value in the normto keyword argument.
There’s an IPython notebook describing the use of this module and accompanying modules from the astrobase package at:
https://github.com/waqasbhatti/astrobase-notebooks/blob/master/lightcurve-work.ipynb
-
astrobase.hatsurveys.hatlc.
read_and_filter_sqlitecurve
(lcfile, columns=None, sqlfilters=None, raiseonfail=False, returnarrays=True, forcerecompress=False, quiet=True)[source]¶ This reads a HAT sqlitecurve and optionally filters it.
Parameters: - lcfile (str) – The path to the HAT sqlitecurve file.
- columns (list) – A list of columns to extract from the light curve file. If None, then returns all columns present in the latest columnlist in the light curve.
- sqlfilters (list of str) – If not None, it must be a list of text SQL filters that apply to the columns in the lightcurve.
- raiseonfail (bool) – If this is True, an Exception when reading the LC will crash the function instead of failing silently and returning None as the result.
- returnarrays (bool) – If this is True, the output lcdict contains columns as np.arrays instead of lists. You generally want this to be True.
- forcerecompress (bool) – If True, the sqlitecurve will be recompressed even if a compressed version of it is found. This usually happens when sqlitecurve opening is interrupted by the OS for some reason, leaving behind a gzipped and un-gzipped copy. By default, this function refuses to overwrite the existing gzipped version so if the un-gzipped version is corrupt but that one isn’t, it can be safely recovered.
- quiet (bool) – If True, will not warn about any problems, even if the light curve reading fails (the only clue then will be the return value of None). Useful for batch processing of many many light curves.
Returns: A two-element tuple is returned, with the first element being the lcdict and the second a status message.
Return type: tuple of (lcdict, status_message)
-
astrobase.hatsurveys.hatlc.
describe
(lcdict, returndesc=False, offsetwith=None)[source]¶ This describes the light curve object and columns present.
Parameters: - lcdict (dict) – The input lcdict to parse for column and metadata info.
- returndesc (bool) – If True, returns the description string as an str instead of just printing it to stdout.
- offsetwith (str) – This is a character to offset the output description lines by. This is useful to add comment characters like ‘#’ to the output description lines.
Returns: If returndesc is True, returns the description lines as a str, otherwise returns nothing.
Return type: str or None
-
astrobase.hatsurveys.hatlc.
read_lcc_csvlc
(lcfile)[source]¶ This reads a CSV LC produced by an LCC-Server instance.
Parameters: lcfile (str) – The LC file to read.
Returns: Returns an lcdict that’s readable by most astrobase functions for further processing.
Return type: dict
-
astrobase.hatsurveys.hatlc.
describe_lcc_csv
(lcdict, returndesc=False)[source]¶ This describes the LCC CSV format light curve file.
Parameters: - lcdict (dict) – The input lcdict to parse for column and metadata info.
- returndesc (bool) – If True, returns the description string as an str instead of just printing it to stdout.
Returns: If returndesc is True, returns the description lines as a str, otherwise returns nothing.
Return type: str or None
-
astrobase.hatsurveys.hatlc.
read_csvlc
(lcfile)[source]¶ This reads a HAT data server or LCC-Server produced CSV light curve into an lcdict.
This will automatically figure out the format of the file provided. Currently, it can read:
- legacy HAT data server CSV LCs (e.g. from https://hatsouth.org/planets/lightcurves.html) with an extension of the form: .hatlc.csv.gz.
- all LCC-Server produced LCC-CSV-V1 LCs (e.g. from https://data.hatsurveys.org) with an extension of the form: -csvlc.gz.
Parameters: lcfile (str) – The light curve file to read.
Returns: Returns an lcdict that can be read and used by many astrobase processing functions.
Return type: dict
-
astrobase.hatsurveys.hatlc.
find_lc_timegroups
(lctimes, mingap=4.0)[source]¶ This finds the time gaps in the light curve, so we can figure out which times are for consecutive observations and which represent gaps between seasons.
Parameters: - lctimes (np.array) – This is the input array of times, assumed to be in some form of JD.
- mingap (float) – The minimum difference between consecutive measurement times required to consider them as parts of different timegroups. By default it is set to 4.0 days.
Returns: A tuple of the form below is returned, containing the number of time groups found and Python slice objects for each group:
(ngroups, [slice(start_ind_1, end_ind_1), ...])
Return type: tuple
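The time-group search described above can be illustrated with a minimal pure-Python sketch. This is not the astrobase implementation (which works on Numpy arrays); the function name find_timegroups and the example times are hypothetical, and the slices here use exclusive end indices, which is an assumption about the convention.

```python
def find_timegroups(times, mingap=4.0):
    """Group sorted times into runs separated by gaps larger than mingap."""
    if len(times) == 0:
        return 0, []
    # a new group starts wherever the gap to the previous time exceeds mingap
    starts = [0] + [i for i in range(1, len(times))
                    if times[i] - times[i - 1] > mingap]
    # each group ends where the next one starts (end index is exclusive)
    ends = starts[1:] + [len(times)]
    groups = [slice(start, end) for start, end in zip(starts, ends)]
    return len(groups), groups


# hypothetical observation times in days: three runs separated by long gaps
example_times = [1.0, 1.5, 2.0, 10.0, 10.5, 30.0]
ngroups, groups = find_timegroups(example_times, mingap=4.0)
print(ngroups, groups)
```

Each returned slice can be applied directly to the time and magnitude sequences to pull out one group, e.g. example_times[groups[1]].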
-
astrobase.hatsurveys.hatlc.
normalize_lcdict
(lcdict, timecol='rjd', magcols='all', mingap=4.0, normto='sdssr', debugmode=False, quiet=False)[source]¶ This normalizes magcols in lcdict using timecol to find timegroups.
Parameters: - lcdict (dict) – The input lcdict to process.
- timecol (str) – The key in the lcdict that is to be used to extract the time column.
- magcols ('all' or list of str) – If this is ‘all’, all of the columns in the lcdict that are indicated to be magnitude measurement columns are normalized. If this is a list of str, must contain the keys of the lcdict specifying which magnitude columns will be normalized.
- mingap (float) – The minimum difference between consecutive measurement times required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- normto ({'globalmedian', 'zero', 'jmag', 'hmag', 'kmag', 'bmag', 'vmag', 'sdssg', 'sdssr', 'sdssi'}) – This indicates which column will be the normalization target. If this is ‘globalmedian’, the normalization will be to the global median of each LC column. If this is ‘zero’, will normalize to 0.0 for each LC column. Otherwise, will normalize to the value of one of the other keys in the lcdict[‘objectinfo’][magkey], meaning the normalization will be to some form of catalog magnitude.
- debugmode (bool) – If True, will indicate progress as time-groups are found and processed.
- quiet (bool) – If True, will not emit any messages when processing.
Returns: Returns the lcdict with the magnitude measurements normalized as specified. The normalization happens IN PLACE.
Return type: dict
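The two-stage normalization described above (zero each timegroup on its own median, then shift the whole series to a target level) might look like the following sketch. It is a simplified pure-Python illustration under stated assumptions: normalize_by_timegroups and its target argument are hypothetical names, the target stands in for whatever value normto resolves to, and unlike the real function it returns a new list instead of modifying an lcdict in place.

```python
from statistics import median


def normalize_by_timegroups(times, mags, mingap=4.0, target=0.0):
    """Zero each time group on its own median, then shift the series to target."""
    normalized = list(mags)
    start = 0
    for i in range(1, len(times) + 1):
        # close the current group at the end of the data or at a gap > mingap
        if i == len(times) or times[i] - times[i - 1] > mingap:
            group_median = median(mags[start:i])
            for j in range(start, i):
                normalized[j] = mags[j] - group_median
            start = i
    # after per-group zeroing, move the whole series to the requested level
    return [m + target for m in normalized]


# hypothetical light curve: two seasons offset vertically by 0.5 mag
example_times = [1.0, 2.0, 3.0, 20.0, 21.0, 22.0]
example_mags = [10.0, 10.1, 10.2, 10.5, 10.6, 10.7]
normalized_mags = normalize_by_timegroups(example_times, example_mags,
                                          mingap=4.0, target=12.0)
print(normalized_mags)
```

After normalization the vertical offset between the two seasons is gone and both are centered on the target value.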
-
astrobase.hatsurveys.hatlc.
normalize_lcdict_byinst
(lcdict, magcols='all', normto='sdssr', normkeylist=('stf', 'ccd', 'flt', 'fld', 'prj', 'exp'), debugmode=False, quiet=False)[source]¶ This is a function to normalize light curves across all instrument combinations present.
Use this to normalize a light curve containing a variety of:
- HAT station IDs (‘stf’)
- camera IDs (‘ccd’)
- filters (‘flt’)
- observed field names (‘fld’)
- HAT project IDs (‘prj’)
- exposure times (‘exp’)
Parameters: - lcdict (dict) – The input lcdict to process.
- magcols ('all' or list of str) – If this is ‘all’, all of the columns in the lcdict that are indicated to be magnitude measurement columns are normalized. If this is a list of str, must contain the keys of the lcdict specifying which magnitude columns will be normalized.
- normto ({'zero', 'jmag', 'hmag', 'kmag', 'bmag', 'vmag', 'sdssg', 'sdssr', 'sdssi'}) – This indicates which column will be the normalization target. If this is ‘zero’, will normalize to 0.0 for each LC column. Otherwise, will normalize to the value of one of the other keys in the lcdict[‘objectinfo’][magkey], meaning the normalization will be to some form of catalog magnitude.
- normkeylist (list of str) – These are the column keys to use to form the normalization index. Measurements in the specified magcols with identical normalization index values will be considered as part of a single measurement ‘era’, and will be normalized to zero. Once all eras have been normalized this way, the final light curve will be re-normalized as specified in normto.
- debugmode (bool) – If True, will indicate progress as time-groups are found and processed.
- quiet (bool) – If True, will not emit any messages when processing.
Returns: Returns the lcdict with the magnitude measurements normalized as specified. The normalization happens IN PLACE.
Return type: dict
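As a rough sketch of the idea above, measurements can be grouped into eras by joining the instrument columns into a key, and each era can then be zeroed on its own median. The function name normalize_by_instrument, the 'mag' column, the key separator, and the example values are all hypothetical; the real function operates on the lcdict in place and resolves the target from normto.

```python
from collections import defaultdict
from statistics import median


def normalize_by_instrument(lcdict, magcol, normkeylist=("stf", "ccd", "flt"),
                            target=0.0):
    """Zero each instrument 'era' on its own median, then shift to target."""
    ndet = len(lcdict[magcol])
    # build one normalization key per measurement from the instrument columns
    keys = ["_".join(str(lcdict[col][i]) for col in normkeylist)
            for i in range(ndet)]
    eras = defaultdict(list)
    for i, key in enumerate(keys):
        eras[key].append(i)
    normalized = [0.0] * ndet
    for indices in eras.values():
        era_median = median(lcdict[magcol][i] for i in indices)
        for i in indices:
            normalized[i] = lcdict[magcol][i] - era_median + target
    return normalized, dict(eras)


# hypothetical lcdict fragment: two stations observing the same object
example_lcdict = {
    "stf": [5, 5, 7, 7],
    "ccd": [1, 1, 1, 1],
    "flt": ["r", "r", "r", "r"],
    "mag": [10.0, 10.2, 11.0, 11.4],
}
normalized_mags, eras = normalize_by_instrument(example_lcdict, "mag")
print(eras)
print(normalized_mags)
```

Measurements with the same key end up in the same era, so a change in any one of the listed instrument properties starts a new era.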
-
astrobase.hatsurveys.hatlc.
main
()[source]¶ This is called when we’re executed from the commandline.
The current usage from the command-line is described below:
usage: hatlc [-h] [--describe] hatlcfile
read a HAT LC of any format and output to stdout
positional arguments:
hatlcfile path to the light curve you want to read and pipe to stdout
optional arguments:
-h, --help show this help message and exit
--describe don't dump the columns, show only object info and LC metadata
astrobase.hatsurveys.k2hat module¶
This contains functions for reading K2 CSV light-curves produced by the HAT Project into a Python dictionary. Requires numpy.
The only external function here is:
read_csv_lightcurve(lcfile)
Example:
Reading the best aperture LC for EPIC201183188 = UCAC4-428-055298 (see http://k2.hatsurveys.org to search for this object and download the light curve):
>>> from astrobase.hatsurveys import k2hat
>>> lcdict = k2hat.read_csv_lightcurve('UCAC4-428-055298-75d3f4357b314ff5ac458e917e6dfeb964877b60affe9193d4f65088-k2lc.csv.gz')
The Python dict lcdict contains the metadata and all columns.
>>> lcdict.keys()
['decl', 'objectid', 'bjdoffset', 'qualflag', 'fovchannel', 'BGV',
'aperpixradius', 'IM04', 'TF17', 'EP01', 'CF01', 'ra', 'fovmodule', 'columns',
'k2campaign', 'EQ01', 'fovccd', 'FRN', 'IE04', 'kepid', 'YCC', 'XCC', 'BJD',
'napertures', 'ucac4id', 'IQ04', 'kepmag', 'ndet','kernelspec']
The columns for the light curve are stored in the columns key of the dict. To get a list of the columns:
>>> lcdict['columns']
['BJD', 'BGV', 'FRN', 'XCC', 'YCC', 'IM04', 'IE04', 'IQ04', 'EP01', 'EQ01',
'TF17', 'CF01']
To get columns:
>>> bjd, epdmags = lcdict['BJD'], lcdict['EP01']
>>> bjd
array([ 2456808.1787283, 2456808.1991608, 2456808.2195932, ...,
2456890.2535691, 2456890.274001 , 2456890.2944328])
>>> epdmags
array([ 16.03474, 16.02773, 16.01826, ..., 15.76997, 15.76577,
15.76263])
- astrobase.astrokep: contains functions for dealing with Kepler and K2 Mission light curves from STScI MAST (reading the FITS files, consolidating light curves for objects over quarters), and some basic operations (converting fluxes to mags, decorrelation of light curves, filtering light curves, and fitting object centroids for eclipse analysis, etc.)
- astrobase.astrotess: contains functions for dealing with TESS 2-minute cadence light curves from STScI MAST (reading the FITS files, consolidating light curves for objects over sectors), and some basic operations (converting fluxes to mags, filtering light curves, etc.)
- astrobase.hatsurveys: modules to read, filter, and normalize light curves from various HAT surveys.
astrobase.periodbase.abls module¶
Contains the Kovacs, et al. (2002) Box-Least-squared-Search period-search algorithm implementation for periodbase. This uses the implementation in Astropy 3.1, so requires that version.
-
astrobase.periodbase.abls.
bls_serial_pfind
(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0005, mintransitduration=0.01, maxtransitduration=0.4, ndurations=100, autofreq=True, blsobjective='likelihood', blsmethod='fast', blsoversample=10, blsmintransits=3, blsfreqfactor=10.0, periodepsilon=0.1, nbestpeaks=5, sigclip=10.0, endp_timebase_check=True, verbose=True, raiseonfail=False)[source]¶ Runs the Box Least Squares Fitting Search for transit-shaped signals.
Based on the version of BLS in Astropy 3.1: astropy.stats.BoxLeastSquares. If you don’t have Astropy 3.1, this module will fail to import. Note that by default, this implementation of bls_serial_pfind doesn’t use the .autoperiod() function from BoxLeastSquares but uses the same auto frequency-grid generation as the functions in periodbase.kbls. If you want to use Astropy’s implementation, set the value of autofreq kwarg to ‘astropy’.
The dict returned from this function contains a blsmodel key, which is the generated model from Astropy’s BLS. Use the .compute_stats() method to calculate the required stats like SNR, depth, duration, etc.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float) – The minimum and maximum periods to consider for the transit search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
- ndurations (int) – The number of transit durations to use in the period-search.
- autofreq (bool or str) –
If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:
nphasebins = int(ceil(2.0/mintransitduration))
if nphasebins > 3000:
    nphasebins = 3000
stepsize = 0.25*mintransitduration/(times.max()-times.min())
minfreq = 1.0/endp
maxfreq = 1.0/startp
nfreq = int(ceil((maxfreq - minfreq)/stepsize))
If this is False, you must set startp, endp, and stepsize as appropriate.
If this is str == ‘astropy’, will use the astropy.stats.BoxLeastSquares.autoperiod() function to calculate the frequency grid instead of the kbls method.
- blsobjective ({'likelihood','snr'}) – Sets the type of objective to optimize in the BoxLeastSquares.power() function.
- blsmethod ({'fast','slow'}) – Sets the type of method to use in the BoxLeastSquares.power() function.
- blsoversample (int) – Sets the oversample kwarg for the BoxLeastSquares.power() function.
- blsmintransits (int) – Sets the min_n_transits kwarg for the BoxLeastSquares.autoperiod() function.
- blsfreqfactor (float) – Sets the frequency_factor kwarg for the BoxLeastSquares.autoperiod() function.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
- raiseonfail (bool) – If True, raises an exception if something goes wrong. Otherwise, returns None.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'durations': the array of durations used to run BLS,
 'blsresult': Astropy BLS result object (BoxLeastSquaresResult),
 'blsmodel': Astropy BLS BoxLeastSquares object used for work,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
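The asymmetric sigclip behavior described in the parameters above depends on whether the values are magnitudes or fluxes, which the following sketch tries to make concrete. It is an illustration only: asymmetric_sigclip is a hypothetical helper, and the use of the median with 1.483 times the median absolute deviation as the spread estimate is an assumption of this sketch, not a statement about how astrobase computes it.

```python
from statistics import median


def asymmetric_sigclip(values, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Return the indices of the points kept by an asymmetric sigma-clip."""
    dim_sigma, bright_sigma = sigclip
    center = median(values)
    # robust spread estimate: 1.483 * median absolute deviation (assumed here)
    spread = 1.483 * median(abs(v - center) for v in values)
    if spread == 0.0:
        return list(range(len(values)))
    keep = []
    for i, value in enumerate(values):
        # positive 'dimming' always means the object got fainter:
        # magnitudes increase when fainter, fluxes decrease
        dimming = (center - value) if magsarefluxes else (value - center)
        if -bright_sigma * spread <= dimming <= dim_sigma * spread:
            keep.append(i)
    return keep


# hypothetical series with one high outlier (index 6) and one low outlier (index 7)
example_values = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.6, 9.4]
kept_as_mags = asymmetric_sigclip(example_values, sigclip=(10.0, 3.0))
kept_as_fluxes = asymmetric_sigclip(example_values, sigclip=(10.0, 3.0),
                                    magsarefluxes=True)
print(kept_as_mags, kept_as_fluxes)
```

With the same numbers, the low value is clipped as a brightening when the series is treated as magnitudes, while the high value is clipped when it is treated as fluxes. This is why magsarefluxes must match the data.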
-
astrobase.periodbase.abls.
bls_parallel_pfind
(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0001, mintransitduration=0.01, maxtransitduration=0.4, ndurations=100, autofreq=True, blsobjective='likelihood', blsmethod='fast', blsoversample=5, blsmintransits=3, blsfreqfactor=10.0, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, endp_timebase_check=True, verbose=True, nworkers=None)[source]¶ Runs the Box Least Squares Fitting Search for transit-shaped signals.
Breaks up the full frequency space into chunks and passes them to parallel BLS workers.
Based on the version of BLS in Astropy 3.1: astropy.stats.BoxLeastSquares. If you don’t have Astropy 3.1, this module will fail to import. Note that by default, this implementation of bls_parallel_pfind doesn’t use the .autoperiod() function from BoxLeastSquares but uses the same auto frequency-grid generation as the functions in periodbase.kbls. If you want to use Astropy’s implementation, set the value of autofreq kwarg to ‘astropy’. The generated period array will then be broken up into chunks and sent to the individual workers.
NOTE: the combined BLS spectrum produced by this function is not identical to that produced by running BLS in one shot for the entire frequency space. There are differences on the order of 1.0e-3 or so in the respective peak values, but peaks appear at the same frequencies for both methods. This is likely due to different aliasing caused by smaller chunks of the frequency space used by the parallel workers in this function. When in doubt, confirm results for this parallel implementation by comparing to those from the serial implementation above.
In particular, when you want to get reliable estimates of the SNR, transit depth, duration, etc. that Astropy’s BLS gives you, rerun bls_serial_pfind with startp and endp close to the best period you want to characterize the transit at. The dict returned from that function contains a blsmodel key, which is the generated model from Astropy’s BLS. Use the .compute_stats() method to calculate the required stats.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float) – The minimum and maximum periods to consider for the transit search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
- ndurations (int) – The number of transit durations to use in the period-search.
- autofreq (bool or str) –
If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:
nphasebins = int(ceil(2.0/mintransitduration))
if nphasebins > 3000:
    nphasebins = 3000
stepsize = 0.25*mintransitduration/(times.max()-times.min())
minfreq = 1.0/endp
maxfreq = 1.0/startp
nfreq = int(ceil((maxfreq - minfreq)/stepsize))
If this is False, you must set startp, endp, and stepsize as appropriate.
If this is str == ‘astropy’, will use the astropy.stats.BoxLeastSquares.autoperiod() function to calculate the frequency grid instead of the kbls method.
- blsobjective ({'likelihood','snr'}) – Sets the type of objective to optimize in the BoxLeastSquares.power() function.
- blsmethod ({'fast','slow'}) – Sets the type of method to use in the BoxLeastSquares.power() function.
- blsoversample (int) – Sets the oversample kwarg for the BoxLeastSquares.power() function.
- blsmintransits (int) – Sets the min_n_transits kwarg for the BoxLeastSquares.autoperiod() function.
- blsfreqfactor (float) – Sets the frequency_factor kwarg for the BoxLeastSquares.autoperiod() function.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
- nworkers (int or None) – The number of parallel workers to launch for period-search. If None, nworkers = NCPUS.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'durations': the array of durations used to run BLS,
 'blsresult': Astropy BLS result object (BoxLeastSquaresResult),
 'blsmodel': Astropy BLS BoxLeastSquares object used for work,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
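The role of periodepsilon and nbestpeaks in picking independent 'best' periods can be sketched as below. This is a simplified illustration, not the selection code used by astrobase: pick_best_peaks is a hypothetical helper, and comparing each candidate against every already accepted period (as a fraction of the candidate period) is an assumption made to keep the sketch short.

```python
def pick_best_peaks(periods, powers, nbestpeaks=5, periodepsilon=0.1):
    """Pick up to nbestpeaks periods that are mutually well separated."""
    # visit periodogram points from the highest power to the lowest
    order = sorted(range(len(periods)), key=lambda i: powers[i], reverse=True)
    best = []
    for i in order:
        candidate = periods[i]
        # accept only if the fractional difference to every accepted period
        # is larger than periodepsilon, i.e. it is not part of the same peak
        if all(abs(candidate - accepted) / candidate > periodepsilon
               for accepted, _ in best):
            best.append((candidate, powers[i]))
        if len(best) == nbestpeaks:
            break
    return best


# hypothetical periodogram: two broad peaks near 1 and 2 days, one weak at 3.5
example_periods = [1.00, 1.02, 2.00, 2.05, 3.50]
example_powers = [0.90, 0.85, 0.60, 0.70, 0.30]
best_peaks = pick_best_peaks(example_periods, example_powers,
                             nbestpeaks=3, periodepsilon=0.1)
print(best_peaks)
```

The points at 1.02 and 2.00 days are skipped because they lie within 10 percent of a stronger, already accepted period, so the three returned periods come from three distinct peaks.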
astrobase.periodbase.kbls module¶
Contains the Kovacs, et al. (2002) Box-Least-squared-Search period-search algorithm implementation for periodbase.
-
astrobase.periodbase.kbls.
bls_serial_pfind
(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0005, mintransitduration=0.01, maxtransitduration=0.4, nphasebins=200, autofreq=True, periodepsilon=0.1, nbestpeaks=5, sigclip=10.0, endp_timebase_check=True, verbose=True, get_stats=True)[source]¶ Runs the Box Least Squares Fitting Search for transit-shaped signals.
Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey et al. 2015. This is the serial version (which is good enough in most cases because BLS in Fortran is fairly fast). If nfreq > 5e5, this will take a while.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float) – The minimum and maximum periods to consider for the transit search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
- nphasebins (int) – The number of phase bins to use in the period search.
- autofreq (bool) –
If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:
nphasebins = int(ceil(2.0/mintransitduration))
if nphasebins > 3000:
    nphasebins = 3000
stepsize = 0.25*mintransitduration/(times.max()-times.min())
minfreq = 1.0/endp
maxfreq = 1.0/startp
nfreq = int(ceil((maxfreq - minfreq)/stepsize))
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
- get_stats (bool) –
If True, runs bls_stats_singleperiod() for each of the best periods in the output and injects the output into the output dict so you only have to run this function to get the periods and their stats.
The output dict from this function will then contain a ‘stats’ key containing a list of dicts with statistics for each period in resultdict['nbestperiods']. These dicts will contain fit values of transit parameters after a trapezoid transit model is fit to the phased light curve at each period in resultdict['nbestperiods'], i.e. fit values for period, epoch, transit depth, duration, ingress duration, and the SNR of the transit.
NOTE: make sure to check the ‘fit_status’ key for each resultdict['stats'] item to confirm that the trapezoid transit model fit succeeded and that the stats calculated are valid.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'stats': BLS stats for each best period,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'blsresult': the result dict from the eebls.f wrapper function,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'nphasebins': the actual nphasebins used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
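The autofreq relations given in the parameters above can be worked through with a small example. This sketch uses only the standard library instead of Numpy arrays; the function name auto_frequency_grid is hypothetical, and building the grid as minfreq plus integer multiples of stepsize is an assumption about how the frequencies are laid out. The example values are chosen so the arithmetic is exact.

```python
from math import ceil


def auto_frequency_grid(times, startp=0.1, endp=100.0, mintransitduration=0.01):
    """Apply the autofreq relations to get nphasebins, stepsize, and the grid."""
    nphasebins = int(ceil(2.0 / mintransitduration))
    if nphasebins > 3000:
        nphasebins = 3000
    # the frequency step shrinks as the observing time-base grows
    stepsize = 0.25 * mintransitduration / (max(times) - min(times))
    minfreq = 1.0 / endp
    maxfreq = 1.0 / startp
    nfreq = int(ceil((maxfreq - minfreq) / stepsize))
    # assumed layout: evenly spaced frequencies starting at minfreq
    frequencies = [minfreq + i * stepsize for i in range(nfreq)]
    return {"nphasebins": nphasebins, "stepsize": stepsize,
            "minfreq": minfreq, "maxfreq": maxfreq,
            "nfreq": nfreq, "frequencies": frequencies}


# hypothetical 64-day time-base, searching periods between 0.5 and 8 days
example_times = [0.0, 16.0, 32.0, 64.0]
grid = auto_frequency_grid(example_times, startp=0.5, endp=8.0,
                           mintransitduration=0.0625)
print(grid["nphasebins"], grid["stepsize"], grid["nfreq"])
```

For this input the relations give 32 phase bins, a frequency step of 2**-12 per day, and 7680 frequencies. Since nfreq scales with the time-base and inversely with mintransitduration, long light curves searched for short transits produce large grids.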
-
astrobase.periodbase.kbls.
bls_parallel_pfind
(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0001, mintransitduration=0.01, maxtransitduration=0.4, nphasebins=200, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, endp_timebase_check=True, verbose=True, nworkers=None, get_stats=True)[source]¶ Runs the Box Least Squares Fitting Search for transit-shaped signals.
Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey et al. 2015. Breaks up the full frequency space into chunks and passes them to parallel BLS workers.
NOTE: the combined BLS spectrum produced by this function is not identical to that produced by running BLS in one shot for the entire frequency space. There are differences on the order of 1.0e-3 or so in the respective peak values, but peaks appear at the same frequencies for both methods. This is likely due to different aliasing caused by smaller chunks of the frequency space used by the parallel workers in this function. When in doubt, confirm results for this parallel implementation by comparing to those from the serial implementation above.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float) – The minimum and maximum periods to consider for the transit search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
- nphasebins (int) – The number of phase bins to use in the period search.
- autofreq (bool) –
If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:
nphasebins = int(ceil(2.0/mintransitduration))
if nphasebins > 3000:
    nphasebins = 3000
stepsize = 0.25*mintransitduration/(times.max()-times.min())
minfreq = 1.0/endp
maxfreq = 1.0/startp
nfreq = int(ceil((maxfreq - minfreq)/stepsize))
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
- nworkers (int or None) – The number of parallel workers to launch for period-search. If None, nworkers = NCPUS.
- get_stats (bool) –
If True, runs bls_stats_singleperiod() for each of the best periods in the output and injects the output into the output dict so you only have to run this function to get the periods and their stats.
The output dict from this function will then contain a ‘stats’ key containing a list of dicts with statistics for each period in resultdict['nbestperiods']. These dicts will contain fit values of transit parameters after a trapezoid transit model is fit to the phased light curve at each period in resultdict['nbestperiods'], i.e. fit values for period, epoch, transit depth, duration, ingress duration, and the SNR of the transit.
NOTE: make sure to check the ‘fit_status’ key for each resultdict['stats'] item to confirm that the trapezoid transit model fit succeeded and that the stats calculated are valid.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'stats': list of stats dicts returned for each best period,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'blsresult': list of result dicts from eebls.f wrapper functions,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'nphasebins': the actual nphasebins used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
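As a rough illustration of the autofreq relations listed for this function, the sketch below evaluates them with plain Python lists instead of Numpy arrays. The helper name `bls_autofreq_grid` is hypothetical and is not part of astrobase.

```python
from math import ceil


def bls_autofreq_grid(times, startp, endp, mintransitduration):
    """Evaluate the documented autofreq relations (illustrative helper)."""
    nphasebins = int(ceil(2.0 / mintransitduration))
    if nphasebins > 3000:
        nphasebins = 3000
    # max()/min() stand in for times.max()/times.min() on an ndarray
    stepsize = 0.25 * mintransitduration / (max(times) - min(times))
    minfreq = 1.0 / endp
    maxfreq = 1.0 / startp
    nfreq = int(ceil((maxfreq - minfreq) / stepsize))
    return {
        "nphasebins": nphasebins,
        "stepsize": stepsize,
        "minfreq": minfreq,
        "maxfreq": maxfreq,
        "nfreq": nfreq,
    }


# a 64-day time-base searched between 0.5 and 4 days
grid = bls_autofreq_grid([0.0, 32.0, 64.0], startp=0.5, endp=4.0,
                         mintransitduration=0.25)
print(grid)
```

Shorter values of mintransitduration shrink the frequency step and raise nfreq, which is why the grid size grows quickly for long time-bases.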
-
astrobase.periodbase.kbls.
bls_stats_singleperiod
(times, mags, errs, period, magsarefluxes=False, sigclip=10.0, perioddeltapercent=10, nphasebins=200, mintransitduration=0.01, maxtransitduration=0.4, ingressdurationfraction=0.1, verbose=True)[source]¶ This calculates the SNR, depth, duration, a refit period, and time of center-transit for a single period.
The equation used for SNR is:
SNR = (transit model depth / RMS of LC with transit model subtracted) * sqrt(number of points in transit)
NOTE: you should set the kwargs sigclip, nphasebins, mintransitduration, maxtransitduration to what you used for an initial BLS run to detect transits in the input light curve to match those input conditions.
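A minimal sketch of the SNR equation above, assuming the RMS is taken as the root-mean-square of the residuals that remain after subtracting the transit model. The function name `transit_snr` and the exact RMS estimator are assumptions, not astrobase's code.

```python
from math import sqrt


def transit_snr(residuals, transit_depth, npoints_in_transit):
    """SNR = (depth / RMS of model-subtracted LC) * sqrt(points in transit)."""
    rms = sqrt(sum(r * r for r in residuals) / len(residuals))
    return (transit_depth / rms) * sqrt(npoints_in_transit)


# residuals: light curve with the transit model already subtracted
residuals = [0.001, -0.001, 0.001, -0.001]
snr = transit_snr(residuals, transit_depth=0.01, npoints_in_transit=25)
print(snr)
```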
Parameters: - times,mags,errs (np.array) – These contain the magnitude/flux time-series and any associated errors.
- period (float) – The period to search around and refit the transits. This will be used to calculate the start and end periods of a rerun of BLS to calculate the stats.
- magsarefluxes (bool) – Set to True if the input measurements in mags are actually fluxes and not magnitudes.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- perioddeltapercent (float) –
The percentage of the provided period to use when searching around this value. The period range searched will then be:
[period - (perioddeltapercent/100.0)*period, period + (perioddeltapercent/100.0)*period]
- nphasebins (int) – The number of phase bins to use in the BLS run.
- mintransitduration (float) – The minimum transit duration in phase to consider.
- maxtransitduration (float) – The maximum transit duration in phase to consider.
- ingressdurationfraction (float) – The fraction of the transit duration to use to generate an initial value of the transit ingress duration for the BLS model refit. This will be fit by this function.
- verbose (bool) – If True, will indicate progress and any problems encountered.
Returns: A dict of the following form is returned:
{'period': the refit best period,
 'epoch': the refit epoch (i.e. mid-transit time),
 'snr':the SNR of the transit,
 'transitdepth':the depth of the transit,
 'transitduration':the duration of the transit,
 'ingressduration':if trapezoid fit OK, is the ingress duration,
 'npoints_in_transit':the number of LC points in transit,
 'fit_status': 'ok' or 'trapezoid model fit failed,...',
 'nphasebins':the input value of nphasebins,
 'transingressbin':the phase bin containing transit ingress,
 'transegressbin':the phase bin containing transit egress,
 'blsmodel':the full BLS model used along with its parameters,
 'subtractedmags':BLS model - phased light curve,
 'phasedmags':the phase light curve,
 'phases': the phase values}
You should check the ‘fit_status’ key in this returned dict for a value of ‘ok’. If it is ‘trapezoid model fit failed, using box model’, you may not want to trust the transit period and epoch found.
Return type: dict
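The two points above that most often matter in practice are the period window implied by perioddeltapercent and the ‘fit_status’ check. The sketch below shows both with hypothetical helper names (`refit_period_range`, `trusted_stats`) and hand-written example dicts; it does not call astrobase.

```python
def refit_period_range(period, perioddeltapercent=10):
    """Return the (start, end) periods searched around `period`."""
    delta = (perioddeltapercent / 100.0) * period
    return period - delta, period + delta


def trusted_stats(stats_list):
    """Keep only the stats dicts whose trapezoid model fit succeeded."""
    return [s for s in stats_list if s.get("fit_status") == "ok"]


startp, endp = refit_period_range(2.0, perioddeltapercent=10)

# example stats dicts, written by hand for illustration
example_stats = [
    {"period": 2.01, "fit_status": "ok"},
    {"period": 4.70,
     "fit_status": "trapezoid model fit failed, using box model"},
]
good = trusted_stats(example_stats)
print(startp, endp, [s["period"] for s in good])
```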
-
astrobase.periodbase.kbls.
bls_snr
(blsdict, times, mags, errs, assumeserialbls=False, magsarefluxes=False, sigclip=10.0, npeaks=None, perioddeltapercent=10, ingressdurationfraction=0.1, verbose=True)[source]¶ Calculates the signal to noise ratio for each best peak in the BLS periodogram, along with transit depth, duration, and refit period and epoch.
The following equation is used for SNR:
SNR = (transit model depth / RMS of LC with transit model subtracted) * sqrt(number of points in transit)
Parameters: - blsdict (dict) –
This is an lspinfo dict produced by either bls_parallel_pfind or bls_serial_pfind in this module, or by your own BLS function. If you provide results in a dict from an external BLS function, make sure this matches the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'blsresult': list of result dicts from eebls.f wrapper functions,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'nphasebins': the actual nphasebins used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
- times,mags,errs (np.array) – These contain the magnitude/flux time-series and any associated errors.
- assumeserialbls (bool) – If this is True, this function will not rerun BLS around each best peak in the input lspinfo dict to refit the periods and epochs. The rerun is usually required for results from bls_parallel_pfind, so set this to False if you use results from that function. The parallel method breaks up the frequency space into chunks for speed, and the results may not exactly match those from a regular BLS run.
- magsarefluxes (bool) – Set to True if the input measurements in mags are actually fluxes and not magnitudes.
- npeaks (int or None) – This controls how many of the periods in blsdict['nbestperiods'] to find the SNR for. If it’s None, then this will calculate the SNR for all of them. If it’s an integer between 1 and len(blsdict['nbestperiods']), will calculate for only the specified number of peak periods, starting from the best period.
- perioddeltapercent (float) –
The percentage of the provided period to use when searching around this value. The period range searched will then be:
[period - (perioddeltapercent/100.0)*period, period + (perioddeltapercent/100.0)*period]
- ingressdurationfraction (float) – The fraction of the transit duration to use to generate an initial value of the transit ingress duration for the BLS model refit. This will be fit by this function.
- verbose (bool) – If True, will indicate progress and any problems encountered.
Returns: A dict of the following form is returned:
{'npeaks': the number of periodogram peaks requested to get SNR for,
 'period': list of refit best periods for each requested peak,
 'epoch': list of refit epochs (i.e. mid-transit times),
 'snr':list of SNRs of the transit for each requested peak,
 'transitdepth':list of depths of the transits,
 'transitduration':list of durations of the transits,
 'nphasebins':the input value of nphasebins,
 'transingressbin':the phase bin containing transit ingress,
 'transegressbin':the phase bin containing transit egress,
 'allblsmodels':the full BLS models used along with their parameters,
 'allsubtractedmags':BLS models - phased light curves,
 'allphasedmags':the phase light curves,
 'allphases': the phase values}
Return type: dict
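If the blsdict comes from an external BLS function, a quick key check before calling bls_snr can catch mismatches early. The sketch below is only a convenience under that assumption: the helper `missing_blsdict_keys` is hypothetical, and the key set is copied from the dict form documented for the blsdict parameter.

```python
# keys listed in the documented form of the blsdict parameter
REQUIRED_BLSDICT_KEYS = {
    "bestperiod", "bestlspval", "nbestpeaks", "nbestlspvals",
    "nbestperiods", "lspvals", "frequencies", "periods", "blsresult",
    "stepsize", "nfreq", "nphasebins", "mintransitduration",
    "maxtransitduration", "method", "kwargs",
}


def missing_blsdict_keys(blsdict):
    """Return a sorted list of documented keys absent from blsdict."""
    return sorted(REQUIRED_BLSDICT_KEYS - set(blsdict))


# an incomplete dict from a hypothetical external BLS routine
external = {"bestperiod": 3.2, "nbestperiods": [3.2, 6.4], "method": "bls"}
missing = missing_blsdict_keys(external)
print(missing)
```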
astrobase.periodbase.htls module¶
Contains the Hippke & Heller (2019) transit least squares period-search algorithm implementation for periodbase. This depends on the external package written by Hippke & Heller, https://github.com/hippke/tls.
-
astrobase.periodbase.htls.
tls_parallel_pfind
(times, mags, errs, magsarefluxes=None, startp=0.1, endp=None, tls_oversample=5, tls_mintransits=3, tls_transit_template='default', tls_rstar_min=0.13, tls_rstar_max=3.5, tls_mstar_min=0.1, tls_mstar_max=2.0, periodepsilon=0.1, nbestpeaks=5, sigclip=10.0, verbose=True, nworkers=None)[source]¶ Wrapper to Hippke & Heller (2019)’s “transit least squares”, which is BLS, but with a slightly better template (and niceties in the implementation).
A few comments:
The time series must be in units of days.
The frequency sampling Hippke & Heller (2019) advocate for is cubic in frequencies, instead of linear. Ofir (2014) found that the linear-in-frequency sampling (which is correct for sinusoidal signal detection) isn’t optimal for a Keplerian box signal. He gave an equation for “optimal” sampling. tls_oversample is the factor by which to oversample over that. The grid can be imported independently via:
from transitleastsquares import period_grid
The spacing equations are given here: https://transitleastsquares.readthedocs.io/en/latest/Python%20interface.html#period-grid
The boundaries of the period search are by default 0.1 day to 99% of the baseline of times.
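The default search boundaries can be sketched as follows. The helper name `default_tls_period_bounds` is hypothetical; it only mirrors the stated rule that endp falls back to 99% of the baseline when it is None.

```python
def default_tls_period_bounds(times, startp=0.1, endp=None):
    """Return (startp, endp) in days, filling in the documented default."""
    baseline = max(times) - min(times)
    if endp is None:
        endp = 0.99 * baseline
    return startp, endp


# times must be in days
bounds = default_tls_period_bounds([2458000.0, 2458050.0, 2458100.0])
print(bounds)
```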
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
- magsarefluxes (bool) – transitleastsquares requires fluxes. Therefore, if magsarefluxes is set to False, the passed mags are converted to fluxes. All output dictionary vectors include fluxes, not mags.
- startp,endp (float) – The minimum and maximum periods to consider for the transit search.
- tls_oversample (int) – Factor by which to oversample the frequency grid.
- tls_mintransits (int) – Sets the min_n_transits kwarg for the BoxLeastSquares.autoperiod() function.
- tls_transit_template (str) – default, grazing, or box.
- tls_rstar_min,tls_rstar_max (float) – The range of stellar radii to consider when generating a frequency grid. In units of Rsun.
- tls_mstar_min,tls_mstar_max (float) – The range of stellar masses to consider when generating a frequency grid. In units of Msun.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – Kept for consistency with periodbase functions.
- nworkers (int or None) – The number of parallel workers to launch for period-search. If None, nworkers = NCPUS.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. The format is similar to the other astrobase period-finders – it contains the nbestpeaks, which is the most important thing. (But isn’t entirely standardized.)
Crucially, it also contains “tlsresult”, which is a dictionary with transitleastsquares spectra (used to get the SDE as defined in the TLS paper), statistics, transit period, mid-time, duration, depth, SNR, and the “odd_even_mismatch” statistic. The full key list is:
dict_keys(['SDE', 'SDE_raw', 'chi2_min', 'chi2red_min', 'period', 'period_uncertainty', 'T0', 'duration', 'depth', 'depth_mean', 'depth_mean_even', 'depth_mean_odd', 'transit_depths', 'transit_depths_uncertainties', 'rp_rs', 'snr', 'snr_per_transit', 'snr_pink_per_transit', 'odd_even_mismatch', 'transit_times', 'per_transit_count', 'transit_count', 'distinct_transit_count', 'empty_transit_count', 'FAP', 'in_transit_count', 'after_transit_count', 'before_transit_count', 'periods', 'power', 'power_raw', 'SR', 'chi2', 'chi2red', 'model_lightcurve_time', 'model_lightcurve_model', 'model_folded_phase', 'folded_y', 'folded_dy', 'folded_phase', 'model_folded_model'])
The descriptions are here:
https://transitleastsquares.readthedocs.io/en/latest/Python%20interface.html#return-values
The remaining resultdict is:
resultdict = {
    'tlsresult':tlsresult,
    'bestperiod': the best period value in the periodogram,
    'bestlspval': the peak associated with the best period,
    'nbestpeaks': the input value of nbestpeaks,
    'nbestlspvals': nbestpeaks-size list of best period peak values,
    'nbestperiods': nbestpeaks-size list of best periods,
    'lspvals': the full array of periodogram powers,
    'periods': the full array of periods considered,
    'tlsresult': Astropy tls result object (BoxLeastSquaresResult),
    'tlsmodel': Astropy tls BoxLeastSquares object used for work,
    'method':'tls' -> the name of the period-finder method,
    'kwargs':{ dict of all of the input kwargs for record-keeping}
}
Return type: dict
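The asymmetric sigclip behaviour described for the period-finders can be illustrated with a small pure-Python sketch. It assumes clipping around the median with a scatter estimated from the median absolute deviation; the function name `asymmetric_sigclip` and that estimator are assumptions and may differ from what astrobase does internally.

```python
from math import isfinite
from statistics import median


def asymmetric_sigclip(times, mags, errs, sigclip=(10.0, 3.0),
                       magsarefluxes=False):
    """Clip dimmings beyond sigclip[0] and brightenings beyond sigclip[1]."""
    # drop non-finite points first
    points = [(t, m, e) for t, m, e in zip(times, mags, errs)
              if isfinite(t) and isfinite(m) and isfinite(e)]
    values = [m for _, m, _ in points]
    center = median(values)
    # assumption: robust scatter from the median absolute deviation
    robust_std = 1.4826 * median([abs(v - center) for v in values])
    dim_sigma, bright_sigma = sigclip
    kept = []
    for t, m, e in points:
        deviation = m - center
        if magsarefluxes:
            # lower flux means fainter
            dimming, brightening = -deviation, deviation
        else:
            # larger magnitude means fainter
            dimming, brightening = deviation, -deviation
        if (dimming <= dim_sigma * robust_std
                and brightening <= bright_sigma * robust_std):
            kept.append((t, m, e))
    return kept


mags = [10.0, 10.1, 9.9, 10.1, 9.9, 10.0, 10.1, 9.9, 10.5, 9.5]
times = [float(i) for i in range(len(mags))]
errs = [0.01] * len(mags)
kept_mags = [m for _, m, _ in asymmetric_sigclip(times, mags, errs)]
kept_flux = [m for _, m, _ in
             asymmetric_sigclip(times, mags, errs, magsarefluxes=True)]
print(kept_mags, kept_flux)
```

With the same numbers, the point at 10.5 survives when the values are magnitudes (it is a modest dimming) but is removed when they are fluxes (it is then a brightening), which is why magsarefluxes must be set correctly.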
astrobase.periodbase.spdm module¶
Contains the Stellingwerf (1978) phase-dispersion minimization period-search algorithm implementation for periodbase.
-
astrobase.periodbase.spdm.
stellingwerf_pdm_theta
(times, mags, errs, frequency, binsize=0.05, minbin=9)[source]¶ This calculates the Stellingwerf PDM theta value at a test frequency.
Parameters: - times,mags,errs (np.array) – The input time-series and associated errors.
- frequency (float) – The test frequency to calculate the theta statistic at.
- binsize (float) – The phase bin size to use.
- minbin (int) – The minimum number of items in a phase bin to consider in the calculation of the statistic.
Returns: theta_pdm – The value of the theta statistic at the specified frequency.
Return type: float
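For orientation, the Stellingwerf theta statistic is the pooled within-bin variance of the phase-folded series divided by the overall variance, so it drops well below 1 near a true period. The sketch below is a simplified stdlib-only illustration; the name `pdm_theta_sketch`, the binning details, and the omission of errs are assumptions, and it is not the astrobase implementation.

```python
from math import pi, sin


def pdm_theta_sketch(times, mags, frequency, binsize=0.05, minbin=9):
    """Pooled within-bin variance divided by total variance."""
    t0 = min(times)
    nbins = int(round(1.0 / binsize))
    bins = [[] for _ in range(nbins)]
    for t, m in zip(times, mags):
        phase = ((t - t0) * frequency) % 1.0
        bins[min(int(phase / binsize), nbins - 1)].append(m)

    n = len(mags)
    mean = sum(mags) / n
    total_var = sum((m - mean) ** 2 for m in mags) / (n - 1)

    sum_sq, npoints, ngood = 0.0, 0, 0
    for b in bins:
        if len(b) >= minbin:  # skip sparsely populated bins
            bmean = sum(b) / len(b)
            sum_sq += sum((m - bmean) ** 2 for m in b)
            npoints += len(b)
            ngood += 1
    if npoints <= ngood:
        return float("nan")
    return (sum_sq / (npoints - ngood)) / total_var


# synthetic sinusoid with a 2-day period (true frequency 0.5 per day)
times = [i * 0.37 for i in range(400)]
mags = [sin(pi * t) for t in times]
theta_true = pdm_theta_sketch(times, mags, 0.5)
theta_wrong = pdm_theta_sketch(times, mags, 0.37)
print(theta_true, theta_wrong)
```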
-
astrobase.periodbase.spdm.
stellingwerf_pdm
(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=False, phasebinsize=0.05, mindetperbin=9, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)[source]¶ This runs a parallelized Stellingwerf phase-dispersion minimization (PDM) period search.
Parameters: - times,mags,errs (np.array) – The mag/flux time-series with associated measurement errors to run the period-finding on.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float or None) – The minimum and maximum periods to consider for the period search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- normalize (bool) – This sets if the input time-series is normalized to 0.0 and rescaled such that its variance = 1.0. This is the recommended procedure by Schwarzenberg-Czerny 1996.
- phasebinsize (float) – The bin size in phase to use when calculating the PDM theta statistic at a test frequency.
- mindetperbin (int) – The minimum number of elements in a phase bin to consider it valid when calculating the PDM theta statistic at a test frequency.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'periods': the full array of periods considered,
 'method':'pdm' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
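The normalize option can be pictured as centering the series on zero and rescaling it to unit variance. The sketch below uses the mean and the population standard deviation; which estimators astrobase actually uses is not stated here, so treat both choices and the name `normalize_series` as assumptions.

```python
from statistics import fmean, pstdev


def normalize_series(mags):
    """Shift to zero mean and rescale so the variance is 1.0."""
    center = fmean(mags)
    scale = pstdev(mags)
    return [(m - center) / scale for m in mags]


normalized = normalize_series([12.1, 12.3, 12.0, 12.4, 12.2])
print(fmean(normalized), pstdev(normalized))
```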
-
astrobase.periodbase.spdm.
analytic_false_alarm_probability
(lspinfo, times, conservative_nfreq_eff=True, peakvals=None, inplace=True)[source]¶ This returns the analytic false alarm probabilities for periodogram peak values.
FIXME: this doesn’t actually work. Fix later.
The calculation follows that on page 3 of Zechmeister & Kurster (2009):
FAP = 1 − [1 − Prob(z > z0)]**M
where:
M is the number of independent frequencies
Prob(z > z0) is the probability of peak with value > z0
z0 is the peak value we're evaluating
For PDM, the Prob(z > z0) is described by the beta distribution, according to:
- Schwarzenberg-Czerny (1997; https://ui.adsabs.harvard.edu/#abs/1997ApJ...489..941S)
- Zalian, Chadid, and Stellingwerf (2013; http://adsabs.harvard.edu/abs/2014MNRAS.440...68Z)
This is given by:
beta( (N-B)/2, (B-1)/2; ((N-B)/(B-1))*theta_pdm )
Where:
N = number of observations
B = number of phase bins
This translates to a scipy.stats call to the beta distribution CDF:
x = ((N-B)/(B-1))*theta_pdm_best
prob_exceeds_val = scipy.stats.beta.cdf(x, (N-B)/2.0, (B-1.0)/2.0)
Which we can then plug into the false alarm prob eqn above with the calculation of M.
Parameters: - lspinfo (dict) – The dict returned by the stellingwerf_pdm() function.
- times (np.array) – The times for which the periodogram result in lspinfo was calculated.
- conservative_nfreq_eff (bool) –
If True, will follow the prescription given in Schwarzenberg-Czerny (2003):
http://adsabs.harvard.edu/abs/2003ASPC..292..383S
and estimate the effective number of independent frequencies M_eff as:
min(N_obs, N_freq, DELTA_f/delta_f)
- peakvals (sequence or None) – The peak values for which to evaluate the false-alarm probability. If None, will calculate this for each of the peak values in the nbestpeaks key of the lspinfo dict.
- inplace (bool) – If True, puts the results of the FAP calculation into the lspinfo dict as a list available as lspinfo['falsealarmprob'].
Returns: The calculated false alarm probabilities for each of the peak values in peakvals.
Return type: list
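The false-alarm relation and the conservative M_eff estimate can be written out in a few lines. In this sketch DELTA_f is taken as the searched frequency range and delta_f as 1/time-base; both interpretations, and the helper names, are assumptions. The per-peak probability Prob(z > z0) is passed in directly rather than computed from a distribution.

```python
def effective_nfreq(nobs, frequencies, times):
    """M_eff = min(N_obs, N_freq, DELTA_f/delta_f)."""
    big_delta_f = max(frequencies) - min(frequencies)   # assumed: search range
    small_delta_f = 1.0 / (max(times) - min(times))     # assumed: 1/time-base
    return min(nobs, len(frequencies), big_delta_f / small_delta_f)


def false_alarm_probability(prob_exceeds, m_eff):
    """FAP = 1 - [1 - Prob(z > z0)]**M."""
    return 1.0 - (1.0 - prob_exceeds) ** m_eff


m_eff = effective_nfreq(nobs=50,
                        frequencies=[0.0, 0.5, 1.0, 1.5, 2.0],
                        times=[0.0, 10.0])
fap = false_alarm_probability(0.5, 2)
print(m_eff, fap)
```

Even a small single-frequency probability grows quickly with M, which is why the choice of M_eff matters.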
astrobase.periodbase.saov module¶
Contains the Schwarzenberg-Czerny Analysis of Variance period-search algorithm implementation for periodbase.
-
astrobase.periodbase.saov.
aov_theta
(times, mags, errs, frequency, binsize=0.05, minbin=9)[source]¶ Calculates the Schwarzenberg-Czerny AoV statistic at a test frequency.
Parameters: - times,mags,errs (np.array) – The input time-series and associated errors.
- frequency (float) – The test frequency to calculate the theta statistic at.
- binsize (float) – The phase bin size to use.
- minbin (int) – The minimum number of items in a phase bin to consider in the calculation of the statistic.
Returns: theta_aov – The value of the AoV statistic at the specified frequency.
Return type: float
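As a point of reference, the classical AoV statistic compares the variance between phase-bin means to the variance within bins, so it becomes large near a true period. The sketch below is a simplified stdlib-only version; the name `aov_theta_sketch`, the binning details, and ignoring errs are assumptions rather than astrobase's implementation.

```python
from math import pi, sin


def aov_theta_sketch(times, mags, frequency, binsize=0.05, minbin=9):
    """Between-bin variance over within-bin variance of the folded series."""
    t0 = min(times)
    nbins = int(round(1.0 / binsize))
    bins = [[] for _ in range(nbins)]
    for t, m in zip(times, mags):
        phase = ((t - t0) * frequency) % 1.0
        bins[min(int(phase / binsize), nbins - 1)].append(m)

    good = [b for b in bins if len(b) >= minbin]  # skip sparse bins
    npoints = sum(len(b) for b in good)
    ngood = len(good)
    if ngood < 2 or npoints <= ngood:
        return float("nan")

    grand_mean = sum(sum(b) for b in good) / npoints
    between = sum(len(b) * (sum(b) / len(b) - grand_mean) ** 2 for b in good)
    within = sum(sum((m - sum(b) / len(b)) ** 2 for m in b) for b in good)
    if within == 0.0:
        return float("inf")
    return ((npoints - ngood) / (ngood - 1)) * (between / within)


# synthetic sinusoid with a 2-day period (true frequency 0.5 per day)
times = [i * 0.37 for i in range(400)]
mags = [sin(pi * t) for t in times]
theta_true = aov_theta_sketch(times, mags, 0.5)
theta_wrong = aov_theta_sketch(times, mags, 0.37)
print(theta_true, theta_wrong)
```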
-
astrobase.periodbase.saov.
aov_periodfind
(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=True, phasebinsize=0.05, mindetperbin=9, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)[source]¶ This runs a parallelized Analysis-of-Variance (AoV) period search.
NOTE: normalize = True here as recommended by Schwarzenberg-Czerny 1996, i.e. mags will be normalized to zero and rescaled so their variance = 1.0.
Parameters: - times,mags,errs (np.array) – The mag/flux time-series with associated measurement errors to run the period-finding on.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float or None) – The minimum and maximum periods to consider for the period search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- normalize (bool) – This sets if the input time-series is normalized to 0.0 and rescaled such that its variance = 1.0. This is the recommended procedure by Schwarzenberg-Czerny 1996.
- phasebinsize (float) – The bin size in phase to use when calculating the AoV theta statistic at a test frequency.
- mindetperbin (int) – The minimum number of elements in a phase bin to consider it valid when calculating the AoV theta statistic at a test frequency.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'periods': the full array of periods considered,
 'method':'aov' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
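The role of periodepsilon and nbestpeaks can be illustrated with a simplified peak picker: walk the periodogram from the highest power down and accept a period only if it differs fractionally from every period already accepted. The helper `pick_best_peaks` is hypothetical and the exact comparison rule is an assumption, not astrobase's code.

```python
def pick_best_peaks(periods, powers, nbestpeaks=5, periodepsilon=0.1):
    """Return up to nbestpeaks (period, power) pairs that are well separated."""
    order = sorted(range(len(powers)), key=lambda i: powers[i], reverse=True)
    best = []
    for i in order:
        candidate = periods[i]
        # reject candidates sitting on an already-accepted peak
        separated = all(abs(candidate - p) / p > periodepsilon
                        for p, _ in best)
        if separated:
            best.append((candidate, powers[i]))
        if len(best) == nbestpeaks:
            break
    return best


periods = [1.0, 1.01, 2.0, 2.05, 5.0]
powers = [10.0, 9.0, 8.0, 7.0, 6.0]
best = pick_best_peaks(periods, powers, nbestpeaks=3)
print(best)
```

Without the separation test, the second and fourth entries would be reported as independent periods even though they belong to the same broad peaks.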
-
astrobase.periodbase.saov.
analytic_false_alarm_probability
(lspinfo, times, conservative_nfreq_eff=True, peakvals=None, inplace=True)[source]¶ This returns the analytic false alarm probabilities for periodogram peak values.
FIXME: this doesn’t actually work. Fix later.
The calculation follows that on page 3 of Zechmeister & Kurster (2009):
FAP = 1 − [1 − Prob(z > z0)]**M
where:
M is the number of independent frequencies
Prob(z > z0) is the probability of peak with value > z0
z0 is the peak value we're evaluating
For AoV and AoV-harmonic, the Prob(z > z0) is described by the F distribution, according to:
- Schwarzenberg-Czerny (1997; https://ui.adsabs.harvard.edu/#abs/1997ApJ...489..941S)
This is given by:
F( (B-1), (N-B); theta_aov )
Where:
N = number of observations
B = number of phase bins
This translates to a scipy.stats call to the F distribution CDF:
x = theta_aov_best
prob_exceeds_val = scipy.stats.f.cdf(x, (B-1.0), (N-B))
Which we can then plug into the false alarm prob eqn above with the calculation of M.
Parameters: - lspinfo (dict) – The dict returned by the aov_periodfind() function.
- times (np.array) – The times for which the periodogram result in lspinfo was calculated.
- conservative_nfreq_eff (bool) –
If True, will follow the prescription given in Schwarzenberg-Czerny (2003):
http://adsabs.harvard.edu/abs/2003ASPC..292..383S
and estimate the effective number of independent frequencies M_eff as:
min(N_obs, N_freq, DELTA_f/delta_f)
- peakvals (sequence or None) – The peak values for which to evaluate the false-alarm probability. If None, will calculate this for each of the peak values in the nbestpeaks key of the lspinfo dict.
- inplace (bool) – If True, puts the results of the FAP calculation into the lspinfo dict as a list available as lspinfo['falsealarmprob'].
Returns: The calculated false alarm probabilities for each of the peak values in peakvals.
Return type: list
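As a rough illustration of how the pieces above combine, the sketch below estimates M_eff = min(N_obs, N_freq, DELTA_f/delta_f) and then applies FAP = 1 − [1 − Prob(z > z0)]**M to a single-trial tail probability. The helper names effective_nfreq and false_alarm_probability are hypothetical and not part of astrobase. Taking DELTA_f as the width of the searched frequency range and delta_f as 1/(time baseline) is an assumption made here, not something the docstring states.

```python
import numpy as np


def effective_nfreq(times, frequencies):
    """Estimate M_eff = min(N_obs, N_freq, DELTA_f/delta_f).

    Assumption: DELTA_f is the width of the searched frequency range and
    delta_f = 1/(time baseline) is the natural frequency resolution.
    """
    times = np.asarray(times, dtype=float)
    frequencies = np.asarray(frequencies, dtype=float)
    delta_f = 1.0 / (times.max() - times.min())
    big_delta_f = frequencies.max() - frequencies.min()
    return min(times.size, frequencies.size, big_delta_f / delta_f)


def false_alarm_probability(prob_exceeds, m_eff):
    """FAP = 1 - [1 - Prob(z > z0)]**M for a single-trial tail probability."""
    return 1.0 - (1.0 - prob_exceeds) ** m_eff


# toy example: 200 observations over 100 days, 5000 trial frequencies
times = np.linspace(0.0, 100.0, 200)
frequencies = np.linspace(0.01, 10.0, 5000)
m_eff = effective_nfreq(times, frequencies)
fap = false_alarm_probability(1.0e-6, m_eff)
```

In this toy case the number of observations is the smallest of the three terms, so it sets M_eff, and for small tail probabilities the FAP is close to M_eff times the single-trial probability.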
astrobase.periodbase.smav module¶
Contains the Schwarzenberg-Czerny Analysis of Variance period-search algorithm implementation for periodbase. This uses the multi-harmonic version presented in Schwarzenberg-Czerny (1996).
-
astrobase.periodbase.smav.
aovhm_theta
(times, mags, errs, frequency, nharmonics, magvariance)[source]¶ This calculates the harmonic AoV theta statistic for a frequency.
This is a mostly faithful translation of the inner loop in aovper.f90. See the following for details:
- http://users.camk.edu.pl/alex/
- Schwarzenberg-Czerny (1996)
Schwarzenberg-Czerny (1996) equation 11:
theta_prefactor = (K - 2N - 1)/(2N)
theta_top = sum(c_n*c_n) (from n=0 to n=2N)
theta_bot = variance(timeseries) - sum(c_n*c_n) (from n=0 to n=2N)
theta = theta_prefactor * (theta_top/theta_bot)
N = number of harmonics (nharmonics)
K = length of time series (times.size)
Parameters: - times,mags,errs (np.array) – The input time-series to calculate the test statistic for. These should all be free of nans/infs and be normalized to zero.
- frequency (float) – The test frequency to calculate the statistic for.
- nharmonics (int) – The number of harmonics to calculate up to. The recommended range is 4 to 8.
- magvariance (float) – This is the (weighted by errors) variance of the magnitude time series. We provide it as a pre-calculated value here so we don’t have to re-calculate it for every worker.
Returns: aov_harmonic_theta – The value of the harmonic AoV theta for the specified test frequency.
Return type: float
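The final step of equation 11 above, combining the projection coefficients into theta, can be sketched as follows. This assumes the coefficients c_n (n = 0 to 2N) have already been computed; obtaining them is the substantive part of the algorithm and is not shown. The function name harmonic_aov_theta and the toy numbers are hypothetical.

```python
import numpy as np


def harmonic_aov_theta(coeffs, magvariance, ntimes, nharmonics):
    """Combine projection coefficients c_n into the harmonic AoV theta.

    Assumption: `coeffs` already holds the c_n for n = 0..2N; computing
    them is not shown here.
    """
    coeffs = np.asarray(coeffs)
    theta_top = np.sum(np.abs(coeffs) ** 2)  # equals sum(c_n*c_n) for real c_n
    theta_bot = magvariance - theta_top
    prefactor = (ntimes - 2.0 * nharmonics - 1.0) / (2.0 * nharmonics)
    return prefactor * theta_top / theta_bot


# toy numbers: K = 101 points, N = 2 harmonics, so 2N + 1 = 5 coefficients
toy_coeffs = np.array([0.5, 0.3, 0.3, 0.2, 0.2])
theta = harmonic_aov_theta(toy_coeffs, magvariance=1.0, ntimes=101, nharmonics=2)
```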
-
astrobase.periodbase.smav.
aovhm_periodfind
(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, normalize=True, nharmonics=6, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, verbose=True)[source]¶ This runs a parallelized harmonic Analysis-of-Variance (AoV) period search.
NOTE: normalize = True here as recommended by Schwarzenberg-Czerny 1996, i.e. mags will be normalized to zero and rescaled so their variance = 1.0.
Parameters: - times,mags,errs (np.array) – The mag/flux time-series with associated measurement errors to run the period-finding on.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float or None) – The minimum and maximum periods to consider for the period search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- normalize (bool) – This sets whether the input time-series is normalized to 0.0 and rescaled such that its variance = 1.0. This is the procedure recommended by Schwarzenberg-Czerny 1996.
- nharmonics (int) – The number of harmonics to use when calculating the AoV theta value at a test frequency. This should be between 4 and 8 in most cases.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'periods': the full array of periods considered,
 'method':'mav' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
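The role of periodepsilon described in the parameter list above can be sketched as a greedy selection: walk through the periodogram in order of decreasing power and accept a period only if it differs fractionally from every already accepted period by more than periodepsilon. This is a minimal illustration under that assumption, not astrobase's implementation; pick_best_peaks and the toy arrays are hypothetical.

```python
import numpy as np


def pick_best_peaks(periods, lspvals, nbestpeaks=5, periodepsilon=0.1):
    """Pick up to nbestpeaks periods that differ fractionally by > periodepsilon."""
    periods = np.asarray(periods, dtype=float)
    lspvals = np.asarray(lspvals, dtype=float)
    order = np.argsort(lspvals)[::-1]  # indices sorted by descending power
    best_periods, best_vals = [], []
    for idx in order:
        candidate = periods[idx]
        # keep the candidate only if it is far enough from every accepted period
        if all(abs(candidate - p) / candidate > periodepsilon
               for p in best_periods):
            best_periods.append(candidate)
            best_vals.append(lspvals[idx])
        if len(best_periods) == nbestpeaks:
            break
    return best_periods, best_vals


# toy periodogram with a broad peak near 2.0 and another near 5.0
periods = np.array([1.9, 2.0, 2.1, 3.0, 4.9, 5.0, 5.1])
lspvals = np.array([8.0, 10.0, 9.0, 1.0, 5.0, 6.0, 4.0])
best_periods, best_vals = pick_best_peaks(periods, lspvals, nbestpeaks=3)
```

Here the neighbours of the two broad peaks (1.9, 2.1, 4.9, 5.1) are skipped even though they have higher power than the isolated point at 3.0, which is the behaviour the periodepsilon description is aiming for.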
-
astrobase.periodbase.smav.
analytic_false_alarm_probability
(lspinfo, times, conservative_nfreq_eff=True, peakvals=None, inplace=True)[source]¶ This returns the analytic false alarm probabilities for periodogram peak values.
FIXME: this doesn’t actually work. Fix later.
The calculation follows that on page 3 of Zechmeister & Kurster (2009):
FAP = 1 − [1 − Prob(z > z0)]**M
where:
M is the number of independent frequencies
Prob(z > z0) is the probability of peak with value > z0
z0 is the peak value we're evaluating
For AoV and AoV-harmonic, the Prob(z > z0) is described by the F distribution, according to:
- Schwarzenberg-Czerny (1997; https://ui.adsabs.harvard.edu/#abs/1997ApJ...489..941S)
- Schwarzenberg-Czerny (1996; http://adsabs.harvard.edu/abs/1996ApJ...460L.107S)
This is given by:
F( 2N, K - 2N - 1; theta_aov )
Where:
N = number of harmonics used for AOV_harmonic
K = number of observations
This translates to a scipy.stats call to the F distribution CDF:
x = theta_aov_best
prob_exceeds_val = scipy.stats.f.cdf(x, 2N, K - 2N - 1)
Which we can then plug into the false alarm prob eqn above with the calculation of M.
Parameters: - lspinfo (dict) – The dict returned by the aovhm_periodfind() function.
- times (np.array) – The times for which the periodogram result in lspinfo was calculated.
- conservative_nfreq_eff (bool) –
If True, will follow the prescription given in Schwarzenberg-Czerny (2003):
http://adsabs.harvard.edu/abs/2003ASPC..292..383S
and estimate the effective number of independent frequencies M_eff as:
min(N_obs, N_freq, DELTA_f/delta_f)
- peakvals (sequence or None) – The peak values for which to evaluate the false-alarm probability. If None, will calculate this for each of the peak values in the nbestpeaks key of the lspinfo dict.
- inplace (bool) – If True, puts the results of the FAP calculation into the lspinfo dict as a list available as lspinfo['falsealarmprob'].
Returns: The calculated false alarm probabilities for each of the peak values in peakvals.
Return type: list
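For reference, the upper-tail probability of any distribution is 1 − CDF, which scipy.stats exposes as the survival function sf. The sketch below evaluates that tail for the F(2N, K − 2N − 1) distribution quoted above. It is a stand-alone illustration, not the function documented here (which is flagged as not working), and the helper name aovhm_tail_probability and the toy numbers are hypothetical.

```python
from scipy import stats


def aovhm_tail_probability(theta, nharmonics, nobs):
    """Prob(z > theta) for F(2N, K - 2N - 1), using the survival function."""
    dfn = 2.0 * nharmonics
    dfd = nobs - 2.0 * nharmonics - 1.0
    return stats.f.sf(theta, dfn, dfd)


# toy numbers: theta = 10 from 6 harmonics on 500 observations
p_single = aovhm_tail_probability(10.0, nharmonics=6, nobs=500)
p_weak = aovhm_tail_probability(1.0, nharmonics=6, nobs=500)
```

Using sf directly rather than 1 − cdf avoids losing precision when the tail probability is very small, which is the usual case for a strong periodogram peak.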
astrobase.periodbase.zgls module¶
Contains the Zechmeister & Kurster (2009) Generalized Lomb-Scargle period-search algorithm implementation for periodbase.
-
astrobase.periodbase.zgls.
generalized_lsp_value
(times, mags, errs, omega)[source]¶ Generalized LSP value for a single omega.
The relations used are:
P(w) = (1/YY) * (YC*YC/CC + YS*YS/SS)
where: YC, YS, CC, and SS are all calculated at T
and where: tan 2omegaT = 2*CS/(CC - SS)
and where:
Y = sum( w_i*y_i )
C = sum( w_i*cos(wT_i) )
S = sum( w_i*sin(wT_i) )
YY = sum( w_i*y_i*y_i ) - Y*Y
YC = sum( w_i*y_i*cos(wT_i) ) - Y*C
YS = sum( w_i*y_i*sin(wT_i) ) - Y*S
CpC = sum( w_i*cos(w_T_i)*cos(w_T_i) )
CC = CpC - C*C
SS = (1 - CpC) - S*S
CS = sum( w_i*cos(w_T_i)*sin(w_T_i) ) - C*S
Parameters: - times,mags,errs (np.array) – The time-series to calculate the periodogram value for.
- omega (float) – The frequency to calculate the periodogram value at.
Returns: periodogramvalue – The normalized periodogram at the specified test frequency omega.
Return type: float
-
astrobase.periodbase.zgls.
generalized_lsp_value_withtau
(times, mags, errs, omega)[source]¶ Generalized LSP value for a single omega.
This uses tau to provide an arbitrary time-reference point.
The relations used are:
P(w) = (1/YY) * (YC*YC/CC + YS*YS/SS)
where: YC, YS, CC, and SS are all calculated at T
and where: tan 2omegaT = 2*CS/(CC - SS)
and where:
Y = sum( w_i*y_i )
C = sum( w_i*cos(wT_i) )
S = sum( w_i*sin(wT_i) )
YY = sum( w_i*y_i*y_i ) - Y*Y
YC = sum( w_i*y_i*cos(wT_i) ) - Y*C
YS = sum( w_i*y_i*sin(wT_i) ) - Y*S
CpC = sum( w_i*cos(w_T_i)*cos(w_T_i) )
CC = CpC - C*C
SS = (1 - CpC) - S*S
CS = sum( w_i*cos(w_T_i)*sin(w_T_i) ) - C*S
Parameters: - times,mags,errs (np.array) – The time-series to calculate the periodogram value for.
- omega (float) – The frequency to calculate the periodogram value at.
Returns: periodogramvalue – The normalized periodogram at the specified test frequency omega.
Return type: float
-
astrobase.periodbase.zgls.
generalized_lsp_value_notau
(times, mags, errs, omega)[source]¶ This is the simplified version not using tau.
The relations used are:
W = sum (1.0/(errs*errs) )
w_i = (1/W)*(1/(errs*errs))
Y = sum( w_i*y_i )
C = sum( w_i*cos(wt_i) )
S = sum( w_i*sin(wt_i) )
YY = sum( w_i*y_i*y_i ) - Y*Y
YC = sum( w_i*y_i*cos(wt_i) ) - Y*C
YS = sum( w_i*y_i*sin(wt_i) ) - Y*S
CpC = sum( w_i*cos(w_t_i)*cos(w_t_i) )
CC = CpC - C*C
SS = (1 - CpC) - S*S
CS = sum( w_i*cos(w_t_i)*sin(w_t_i) ) - C*S
D(omega) = CC*SS - CS*CS
P(omega) = (SS*YC*YC + CC*YS*YS - 2.0*CS*YC*YS)/(YY*D)
Parameters: - times,mags,errs (np.array) – The time-series to calculate the periodogram value for.
- omega (float) – The frequency to calculate the periodogram value at.
Returns: periodogramvalue – The normalized periodogram at the specified test frequency omega.
Return type: float
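The tau-free relations above translate almost line by line into NumPy. The sketch below is an independent transcription for illustration, not the astrobase source. It treats omega as an angular frequency (as the cos(w*t_i) terms suggest), and the function name gls_value_notau and the toy data are hypothetical.

```python
import numpy as np


def gls_value_notau(times, mags, errs, omega):
    """Evaluate the tau-free GLS relations at a single angular frequency."""
    weights = 1.0 / (errs * errs)
    w = weights / weights.sum()  # normalized weights, sum(w) = 1
    cosx, sinx = np.cos(omega * times), np.sin(omega * times)

    Y = np.sum(w * mags)
    C = np.sum(w * cosx)
    S = np.sum(w * sinx)
    YY = np.sum(w * mags * mags) - Y * Y
    YC = np.sum(w * mags * cosx) - Y * C
    YS = np.sum(w * mags * sinx) - Y * S
    CpC = np.sum(w * cosx * cosx)
    CC = CpC - C * C
    SS = (1.0 - CpC) - S * S
    CS = np.sum(w * cosx * sinx) - C * S
    D = CC * SS - CS * CS
    return (SS * YC * YC + CC * YS * YS - 2.0 * CS * YC * YS) / (YY * D)


# noiseless sinusoid plus offset: the power at the true frequency should be ~1
rng = np.random.default_rng(42)
t = np.sort(rng.uniform(0.0, 30.0, 300))
true_omega = 2.0 * np.pi / 3.7
y = 12.0 + 0.05 * np.sin(true_omega * t + 0.4)
e = np.full_like(t, 0.01)
p_true = gls_value_notau(t, y, e, true_omega)
p_off = gls_value_notau(t, y, e, 1.37 * true_omega)
```

Because the generalized periodogram fits a constant offset together with the sinusoid, the non-zero mean of 12.0 in the toy data does not reduce the power at the true frequency.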
-
astrobase.periodbase.zgls.
specwindow_lsp_value
(times, mags, errs, omega)[source]¶ This calculates the peak associated with the spectral window function for times and at the specified omega.
NOTE: this is classical Lomb-Scargle, not the Generalized Lomb-Scargle. mags and errs are silently ignored since we’re calculating the periodogram of the observing window function. These are kept to present a consistent external API so the pgen_lsp function below can call this transparently.
Parameters: - times,mags,errs (np.array) – The time-series to calculate the periodogram value for.
- omega (float) – The frequency to calculate the periodogram value at.
Returns: periodogramvalue – The normalized periodogram at the specified test frequency omega.
Return type: float
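To give a feel for what a spectral window measures, the sketch below uses one common definition, the squared modulus of the mean of exp(i*omega*t) over the observation times. This is an assumption made for illustration and is not necessarily the normalization that specwindow_lsp_value uses; the function name spectral_window_power is hypothetical.

```python
import numpy as np


def spectral_window_power(times, omega):
    """|mean(exp(i*omega*t))|**2, which equals 1 at omega = 0."""
    phases = omega * (np.asarray(times, dtype=float) - np.min(times))
    return np.mean(np.cos(phases)) ** 2 + np.mean(np.sin(phases)) ** 2


# evenly sampled times every 0.5 d alias perfectly at omega = 2*pi/0.5
t_even = np.arange(0.0, 50.0, 0.5)
w_alias = spectral_window_power(t_even, 2.0 * np.pi / 0.5)
w_between = spectral_window_power(t_even, 2.0 * np.pi / 0.8)
```

Only the times enter the calculation, which matches the note above that mags and errs are ignored when computing the window function.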
-
astrobase.periodbase.zgls.
pgen_lsp
(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, workchunksize=None, glspfunc=<function _glsp_worker_withtau>, verbose=True)[source]¶ This calculates the generalized Lomb-Scargle periodogram.
Uses the algorithm from Zechmeister and Kurster (2009).
Parameters: - times,mags,errs (np.array) – The mag/flux time-series with associated measurement errors to run the period-finding on.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float or None) – The minimum and maximum periods to consider for the period search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- workchunksize (None or int) – If this is an int, will use chunks of the given size to break up the work for the parallel workers. If None, the chunk size is set to 1.
- glspfunc (Python function) – The worker function to use to calculate the periodogram. This can be used to make this function calculate the time-series sampling window function instead of the time-series measurements’ GLS periodogram by passing in _glsp_worker_specwindow instead of the default _glsp_worker_withtau function.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'periods': the full array of periods considered,
 'method':'gls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
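The asymmetric sigclip behaviour described in the parameter list above, and why it depends on magsarefluxes, can be sketched as follows. Using the median as the center and 1.483 × MAD as the sigma estimate is an assumption of this sketch, and asymmetric_sigclip is a hypothetical helper rather than the astrobase routine.

```python
import numpy as np


def asymmetric_sigclip(times, mags, errs, sigclip=(10.0, 3.0),
                       magsarefluxes=False):
    """Drop points dimmer than sigclip[0] sigma or brighter than sigclip[1] sigma.

    Assumption: the center is the median and sigma is 1.483 * MAD.
    """
    dim_sigma, bright_sigma = sigclip
    median = np.median(mags)
    sigma = 1.483 * np.median(np.abs(mags - median))
    deviation = mags - median
    if magsarefluxes:
        # fluxes: a dimming is a negative deviation
        keep = ((deviation > -dim_sigma * sigma) &
                (deviation < bright_sigma * sigma))
    else:
        # magnitudes: a dimming is a positive deviation (larger mag = fainter)
        keep = ((deviation < dim_sigma * sigma) &
                (deviation > -bright_sigma * sigma))
    return times[keep], mags[keep], errs[keep]


# toy magnitude series with one ~5-sigma brightening and one ~5-sigma dimming
rng = np.random.default_rng(0)
t = np.arange(200.0)
m = 12.0 + rng.normal(0.0, 0.01, t.size)
m[50] = 12.0 - 0.05   # brighter by ~5 sigma: clipped at the 3-sigma bright limit
m[120] = 12.0 + 0.05  # dimmer by ~5 sigma: kept under the 10-sigma dim limit
e = np.full_like(t, 0.01)
ct, cm, ce = asymmetric_sigclip(t, m, e, sigclip=(10.0, 3.0))
```

With sigclip=[10., 3.] the deep dimming survives while the brightening is removed, which is the configuration one would typically want when searching for eclipses or transits.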
-
astrobase.periodbase.zgls.
specwindow_lsp
(times, mags, errs, magsarefluxes=False, startp=None, endp=None, stepsize=0.0001, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, nworkers=None, glspfunc=<function _glsp_worker_specwindow>, verbose=True)[source]¶ This calculates the spectral window function.
Wraps the pgen_lsp function above to use the specific worker for calculating the window-function.
Parameters: - times,mags,errs (np.array) – The mag/flux time-series with associated measurement errors to run the period-finding on.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- startp,endp (float or None) – The minimum and maximum periods to consider for the period search.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- glspfunc (Python function) – The worker function to use to calculate the periodogram. This is used to make the pgen_lsp function calculate the time-series sampling window function instead of the time-series measurements’ GLS periodogram by passing in _glsp_worker_specwindow instead of the default _glsp_worker_withtau function.
- verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
Returns: This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:
{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'lspvals': the full array of periodogram powers,
 'periods': the full array of periods considered,
 'method':'win' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}
Return type: dict
-
astrobase.periodbase.zgls.
probability_peak_exceeds_value
(times, peakval)[source]¶ This calculates the probability that periodogram values exceed the given peak value.
This is from page 3 of Zechmeister and Kurster (2009):
Prob(p > p_best) = (1 − p_best)**((N−3)/2)
where:
p_best is the peak value in consideration
N is the number of times
Note that this is for the default normalization of the periodogram, e.g. P_normalized = P(omega), such that P represents the sample variance (see Table 1).
Parameters: - times (np.array) – The times of the time-series that the periodogram was calculated for. Their number sets N in the relation above.
- peakval (float) – A single peak value to calculate the probability for.
Returns: prob – The probability value.
Return type: float
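The relation above is a one-liner in practice. The sketch below is a direct transcription of Prob(p > p_best) = (1 − p_best)**((N−3)/2) with N taken as the number of time points; the function name gls_peak_tail_probability and the toy numbers are hypothetical.

```python
import numpy as np


def gls_peak_tail_probability(times, peakval):
    """Prob(p > p_best) = (1 - p_best)**((N - 3)/2), with N = number of times."""
    nobs = np.asarray(times).size
    return (1.0 - peakval) ** ((nobs - 3.0) / 2.0)


# toy numbers: a normalized peak of 0.2 from 103 observations
prob = gls_peak_tail_probability(np.arange(103.0), 0.2)
```

For a fixed peak height the probability falls off quickly as the number of observations grows, so a modest normalized peak can still be significant in a long time series.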
-
astrobase.periodbase.zgls.
analytic_false_alarm_probability
(lspinfo, times, conservative_nfreq_eff=True, peakvals=None, inplace=True)[source]¶ This returns the analytic false alarm probabilities for periodogram peak values.
The calculation follows that on page 3 of Zechmeister & Kurster (2009):
FAP = 1 − [1 − Prob(z > z0)]**M
where:
M is the number of independent frequencies
Prob(z > z0) is the probability of peak with value > z0
z0 is the peak value we're evaluating
Parameters: - lspinfo (dict) – The dict returned by the pgen_lsp() function.
- times (np.array) – The times for which the periodogram result in lspinfo was calculated.
- conservative_nfreq_eff (bool) –
If True, will follow the prescription given in Schwarzenberg-Czerny (2003):
http://adsabs.harvard.edu/abs/2003ASPC..292..383S
and estimate the effective number of independent frequencies M_eff as:
min(N_obs, N_freq, DELTA_f/delta_f)
- peakvals (sequence or None) – The peak values for which to evaluate the false-alarm probability. If None, will calculate this for each of the peak values in the nbestpeaks key of the lspinfo dict.
- inplace (bool) – If True, puts the results of the FAP calculation into the lspinfo dict as a list available as lspinfo['falsealarmprob'].
Returns: The calculated false alarm probabilities for each of the peak values in peakvals.
Return type: list
astrobase.periodbase.macf module¶
This contains the ACF period-finding algorithm from McQuillan+ 2013a and McQuillan+ 2014.
-
astrobase.periodbase.macf.
plot_acf_results
(acfp, outfile, maxlags=5000, yrange=(-0.4, 0.4))[source]¶ This plots the unsmoothed/smoothed ACF vs lag.
Parameters: - acfp (dict) – This is the dict returned from macf_period_find below.
- outfile (str) – The output file the plot will be written to.
- maxlags (int) – The maximum number of lags to include in the plot.
- yrange (sequence of two floats) – The y-range of the ACF vs. lag plot to use.
-
astrobase.periodbase.macf.
macf_period_find
(times, mags, errs, fillgaps=0.0, filterwindow=11, forcetimebin=None, maxlags=None, maxacfpeaks=10, smoothacf=21, smoothfunc=<function _smooth_acf_savgol>, smoothfunckwargs=None, magsarefluxes=False, sigclip=3.0, verbose=True, periodepsilon=0.1, nworkers=None, startp=None, endp=None, autofreq=None, stepsize=None)[source]¶ This finds periods using the McQuillan+ (2013a, 2014) ACF method.
The kwargs from periodepsilon to stepsize don’t do anything but are used to present a consistent API for all periodbase period-finders to an outside driver (e.g. the one in the checkplotserver).
Parameters: - times,mags,errs (np.array) – The input magnitude/flux time-series to run the period-finding for.
- fillgaps ('noiselevel' or float) – This sets what to use to fill in gaps in the time series. If this is ‘noiselevel’, will smooth the light curve using a window size of filterwindow points (this should be an odd integer), subtract the smoothed LC from the actual LC, and estimate the RMS of the residuals. This RMS will be used to fill in the gaps. Other useful values here are 0.0 and np.nan.
- filterwindow (int) – The light curve’s smoothing filter window size to use if fillgaps=’noiselevel’.
- forcetimebin (None or float) – This is used to force a particular cadence in the light curve other than the automatically determined cadence. This effectively rebins the light curve to this cadence. This should be in the same time units as times.
- maxlags (None or int) – This is the maximum number of lags to calculate. If None, will calculate all lags.
- maxacfpeaks (int) – This is the maximum number of ACF peaks to use when finding the highest peak and obtaining a fit period.
- smoothacf (int) –
This is the number of points to use as the window size when smoothing the ACF with the smoothfunc. This should be an odd integer value. If this is None, will not smooth the ACF, but this will probably lead to finding spurious peaks in a generally noisy ACF.
For Kepler, a value between 21 and 51 seems to work fine. For ground based data, much larger values may be necessary: between 1001 and 2001 seem to work best for the HAT surveys. This is dependent on cadence, RMS of the light curve, the periods of the objects you’re looking for, and finally, any correlated noise in the light curve. Make a plot of the smoothed/unsmoothed ACF vs. lag using the result dict of this function and the plot_acf_results function above to see the identified ACF peaks and what kind of smoothing might be needed.
The value of smoothacf will also be used to figure out the interval to use when searching for local peaks in the ACF: this interval is 1/2 of the smoothacf value.
- smoothfunc (Python function) – This is the function that will be used to smooth the ACF. This should take at least one kwarg: ‘windowsize’. Other kwargs can be passed in using a dict provided in smoothfunckwargs. By default, this uses a Savitzky-Golay filter; a Gaussian filter is also provided but not used. Another good option would be an actual low-pass filter (generated using scipy.signal?) to remove all high frequency noise from the ACF.
- smoothfunckwargs (dict or None) – The dict of optional kwargs to pass in to the smoothfunc.
- magsarefluxes (bool) – If your input measurements in mags are actually fluxes instead of mags, set this to True.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate progress and report errors.
Returns: Returns a dict with results. dict[‘bestperiod’] is the estimated best period and dict[‘fitperiodrms’] is its estimated error. Other interesting things in the output include:
- dict[‘acfresults’]: all results from calculating the ACF. In particular, the unsmoothed ACF might be of interest: dict[‘acfresults’][‘acf’] and dict[‘acfresults’][‘lags’].
- dict[‘lags’] and dict[‘acf’] contain the ACF after smoothing was applied.
- dict[‘periods’] and dict[‘lspvals’] can be used to construct a pseudo-periodogram.
- dict[‘naivebestperiod’] is obtained by multiplying the lag at the highest ACF peak with the cadence. This is usually close to the fit period (dict[‘fitbestperiod’]), which is calculated by doing a fit to the lags vs. peak index relation as in McQuillan+ 2014.
Return type: dict
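The "naive" period estimate mentioned above (the lag at the highest ACF peak multiplied by the cadence) can be sketched with plain NumPy. This minimal version assumes evenly sampled data with no gaps and does no smoothing and no fit to successive peaks, so it is far simpler than the McQuillan+ procedure; acf_period and the toy light curve are hypothetical.

```python
import numpy as np


def acf_period(mags, cadence):
    """Estimate a period from the highest non-zero-lag peak of the ACF.

    Assumption: evenly sampled data with no gaps; no smoothing or fit to
    successive peaks is done here.
    """
    x = np.asarray(mags, dtype=float)
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    acf = full[x.size - 1:]      # keep lags 0, 1, 2, ...
    acf = acf / acf[0]           # normalize so ACF(0) = 1
    # local maxima: larger than both neighbours
    peaks = np.flatnonzero((acf[1:-1] > acf[:-2]) & (acf[1:-1] > acf[2:])) + 1
    best_lag = peaks[np.argmax(acf[peaks])]
    return best_lag * cadence, acf


# toy light curve: 1000 points at a 0.02 d cadence with a 1.0 d period
cadence = 0.02
t = np.arange(1000) * cadence
mags = 0.01 * np.sin(2.0 * np.pi * t / 1.0)
naive_period, acf = acf_period(mags, cadence)
```

On noisy data the raw ACF usually has many spurious local maxima, which is why the smoothacf window discussed above matters in practice.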
This package contains parallelized implementations of several period-finding algorithms.
astrobase.lcfit package¶
Fitting routines for light curves. Includes:
- astrobase.lcfit.sinusoidal.fourier_fit_magseries(): fit an arbitrary order Fourier series to a magnitude/flux time series.
- astrobase.lcfit.nonphysical.spline_fit_magseries(): fit a univariate cubic spline to a magnitude/flux time series with a specified spline knot fraction.
- astrobase.lcfit.nonphysical.savgol_fit_magseries(): apply a Savitzky-Golay smoothing filter to a magnitude/flux time series, returning the resulting smoothed function as a “fit”.
- astrobase.lcfit.nonphysical.legendre_fit_magseries(): fit a Legendre function of the specified order to the magnitude/flux time series.
- astrobase.lcfit.eclipses.gaussianeb_fit_magseries(): fit a double inverted gaussian eclipsing binary model to the magnitude/flux time series.
- astrobase.lcfit.transits.traptransit_fit_magseries(): fit a trapezoid-shaped transit signal to the magnitude/flux time series.
- astrobase.lcfit.transits.mandelagol_fit_magseries(): fit a Mandel & Agol (2002) planet transit model to the flux time series.
- astrobase.lcfit.transits.mandelagol_and_line_fit_magseries(): fit a Mandel & Agol 2002 model, + a local line to the flux time series.
- astrobase.lcfit.transits.fivetransitparam_fit_magseries(): fit out a line around each transit window in the given light curve, and then fit the light curve for t0, period, a/Rstar, Rp/Rstar, and inclination.
Submodules¶
astrobase.lcfit.eclipses module¶
Light curve fitting routines for eclipsing binaries:
- astrobase.lcfit.eclipses.gaussianeb_fit_magseries(): fit a double inverted gaussian eclipsing binary model to the magnitude/flux time series.
-
astrobase.lcfit.eclipses.
gaussianeb_fit_magseries
(times, mags, errs, ebparams, param_bounds=None, scale_errs_redchisq_unity=True, sigclip=10.0, plotfit=False, magsarefluxes=False, verbose=True, curve_fit_kwargs=None)[source]¶ This fits a double inverted gaussian EB model to a magnitude time series.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit the EB model to.
- period (float) – The period to use for EB fit.
- ebparams (list of float) –
This is a list containing the eclipsing binary parameters:
ebparams = [period (time), epoch (time), pdepth (mags), pduration (phase), psdepthratio, secondaryphase]
period is the period in days.
epoch is the time of primary minimum in JD.
pdepth is the depth of the primary eclipse:
- for magnitudes -> pdepth should be < 0
- for fluxes -> pdepth should be > 0
pduration is the length of the primary eclipse in phase.
psdepthratio is the ratio of the secondary eclipse depth to that of the primary eclipse.
secondaryphase is the phase at which the minimum of the secondary eclipse is located. This effectively parameterizes eccentricity.
If epoch is None, this function will do an initial spline fit to find an approximate minimum of the phased light curve using the given period.
The pdepth provided is checked against the value of magsarefluxes. If magsarefluxes = True, pdepth is forced to be > 0; if magsarefluxes = False, pdepth is forced to be < 0.
- param_bounds (dict or None) –
This is a dict of the upper and lower bounds on each fit parameter. Should be of the form:
{'period': (lower_bound_period, upper_bound_period), 'epoch': (lower_bound_epoch, upper_bound_epoch), 'pdepth': (lower_bound_pdepth, upper_bound_pdepth), 'pduration': (lower_bound_pduration, upper_bound_pduration), 'psdepthratio': (lower_bound_psdepthratio, upper_bound_psdepthratio), 'secondaryphase': (lower_bound_secondaryphase, upper_bound_secondaryphase)}
- To indicate that a parameter is fixed, use ‘fixed’ instead of a tuple of its lower and upper bounds.
- To indicate that a parameter has no bounds, don’t include it in the param_bounds dict.
If this is None, the default value of this kwarg will be:
{'period':(0.0,np.inf),         # period is between 0 and inf
 'epoch':(0.0, np.inf),         # epoch is between 0 and inf
 'pdepth':(-np.inf,np.inf),     # pdepth is between -np.inf and np.inf
 'pduration':(0.0,1.0),         # pduration is between 0.0 and 1.0
 'psdepthratio':(0.0,1.0),      # psdepthratio is between 0.0 and 1.0
 'secondaryphase':(0.0,1.0)}    # secondaryphase is between 0.0 and 1.0
- scale_errs_redchisq_unity (bool) – If True, the standard errors on the fit parameters will be scaled to make the reduced chi-sq = 1.0. This sets the absolute_sigma kwarg for the scipy.optimize.curve_fit function to False.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
- curve_fit_kwargs (dict or None) – If not None, this should be a dict containing extra kwargs to pass to the scipy.optimize.curve_fit function.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'gaussianeb',
        'fitinfo':{
            'initialparams':the initial EB params provided,
            'finalparams':the final model fit EB params,
            'finalparamerrs':formal errors in the params,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
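The asymmetric form of sigclip described above can be illustrated with a short, self-contained sketch. This is not the Astrobase implementation: the helper name asymmetric_sigclip is hypothetical, and the use of the median with 1.483 times the median absolute deviation as the scatter estimate is an assumption made for this example.

```python
import numpy as np

def asymmetric_sigclip(times, mags, errs, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Sketch: sigclip = (sigma for dimmings, sigma for brightenings)."""
    times, mags, errs = (np.asarray(a, dtype=float) for a in (times, mags, errs))

    # drop non-finite elements first
    finite = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)
    times, mags, errs = times[finite], mags[finite], errs[finite]

    # robust centre and scatter (assumption: median and 1.483 * MAD)
    median = np.median(mags)
    stddev = 1.483 * np.median(np.abs(mags - median))

    dimming_sigma, brightening_sigma = sigclip
    deviation = mags - median
    # in magnitudes a fainter point has a LARGER value; in fluxes a SMALLER one
    dimming = -deviation if magsarefluxes else deviation

    keep = (dimming < dimming_sigma * stddev) & (-dimming < brightening_sigma * stddev)
    return times[keep], mags[keep], errs[keep]
```

With sigclip=[10., 3.], a point about 5 sigma fainter than the median survives, while a point about 5 sigma brighter is removed; flipping magsarefluxes swaps which of the two is treated as the dimming.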
astrobase.lcfit.nonphysical module¶
Light curve fitting routines for ‘non-physical’ models:
- astrobase.lcfit.nonphysical.spline_fit_magseries(): fit a univariate cubic spline to a magnitude/flux time series with a specified spline knot fraction.
- astrobase.lcfit.nonphysical.savgol_fit_magseries(): apply a Savitzky-Golay smoothing filter to a magnitude/flux time series, returning the resulting smoothed function as a “fit”.
- astrobase.lcfit.nonphysical.legendre_fit_magseries(): fit a Legendre function of the specified order to the magnitude/flux time series.
astrobase.lcfit.nonphysical.spline_fit_magseries(times, mags, errs, period, knotfraction=0.01, maxknots=30, sigclip=30.0, plotfit=False, ignoreinitfail=False, magsarefluxes=False, verbose=True)
This fits a univariate cubic spline to the phased light curve.
This fit may be better than the Fourier fit for sharply variable objects, like EBs, so can be used to distinguish them from other types of variables.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit a spline to.
- period (float) – The period to use for the spline fit.
- knotfraction (float) – The knot fraction sets the number of internal knots to use for the spline, expressed as a fraction of the total number of non-nan observations. A value of 0.01 (or 1%) appears to work quite well without over-fitting. maxknots controls the maximum number of knots that will be allowed.
- maxknots (int) – The maximum number of knots that will be used, even if knotfraction gives a larger value. This helps to avoid over-fitting to short time-scale variations.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'spline',
        'fitinfo':{
            'nknots': the number of knots used for the fit,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
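As a rough illustration of how knotfraction and maxknots interact, the sketch below phase-folds a time series and works out the knot count. The helper names phase_fold and number_of_knots are hypothetical, and the rule used (floor of knotfraction times the number of observations, capped at maxknots) is an assumption based on the parameter descriptions above rather than a copy of the Astrobase code.

```python
import numpy as np

def phase_fold(times, period, epoch):
    """Return phases in [0, 1) sorted in increasing order, plus the sort index."""
    times = np.asarray(times, dtype=float)
    phase = ((times - epoch) / period) % 1.0
    order = np.argsort(phase)
    return phase[order], order

def number_of_knots(n_obs, knotfraction=0.01, maxknots=30):
    """Assumed rule: floor(knotfraction * n_obs), never more than maxknots."""
    return min(int(np.floor(knotfraction * n_obs)), maxknots)
```

With the defaults, a light curve of 1050 points would get 10 internal knots, while one with 100000 points would be capped at 30.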
astrobase.lcfit.nonphysical.savgol_fit_magseries(times, mags, errs, period, windowlength=None, polydeg=2, sigclip=30.0, plotfit=False, magsarefluxes=False, verbose=True)
Fit a Savitzky-Golay filter to the magnitude/flux time series.
SG fits successive sub-sets (windows) of adjacent data points with a low-order polynomial via least squares. At each point (magnitude), it returns the value of the polynomial at that magnitude’s time. This is made significantly cheaper than actually performing least squares for each window through linear algebra tricks that are possible when specifying the window size and polynomial order beforehand. Numerical Recipes Ch. 14.8 gives an overview; Eq. 14.8.6 is what Scipy has implemented.
The idea behind Savitzky-Golay is to preserve higher moments (>=2) of the input data series than would be done by a simple moving window average.
Note that the filter assumes evenly spaced data, which magnitude time series are not. By pretending the data points are evenly spaced, we introduce an additional noise source in the function values. This is a relatively small noise source provided that the changes in the magnitude values across the full width of the N=windowlength point window are < sqrt(N/2) times the measurement noise on a single point.
TODO: - Find correct dof for reduced chi squared in savgol_fit_magseries
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit the Savitzky-Golay model to.
- period (float) – The period to use for the model fit.
- windowlength (None or int) – The length of the filter window (the number of coefficients). Must be either positive and odd, or None. (The window is the number of points to the left, and to the right, of whatever point is having a polynomial fit to it locally). Bigger windows at fixed polynomial order risk lowering the amplitude of sharp features. If None, this routine (arbitrarily) sets the windowlength for phased LCs to be either the number of finite data points divided by 300, or polydeg+3, whichever is bigger.
- polydeg (int) – This is the order of the polynomial used to fit the samples. Must be less than windowlength. “Higher-order filters do better at preserving feature heights and widths, but do less smoothing on broader features.” (Numerical Recipes).
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'savgol',
        'fitinfo':{
            'windowlength': the window length used for the fit,
            'polydeg':the polynomial degree used for the fit,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
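The smoothing step itself can be tried directly with scipy.signal.savgol_filter on a synthetic phased light curve. This is only a sketch: the helper default_windowlength is hypothetical, and its rule (the larger of the number of points divided by 300 and polydeg+3, bumped up to the next odd number) is an assumption read off the windowlength description above. The wrap mode is chosen here because a phased light curve is periodic.

```python
import numpy as np
from scipy.signal import savgol_filter

def default_windowlength(n_points, polydeg=2):
    # assumed default: larger of n_points/300 and polydeg+3, forced to be odd
    windowlength = max(polydeg + 3, int(n_points / 300))
    if windowlength % 2 == 0:
        windowlength += 1
    return windowlength

# synthetic phased light curve: a cosine plus Gaussian noise
rng = np.random.default_rng(42)
phase = np.linspace(0.0, 1.0, 3000, endpoint=False)
true_mags = 12.0 + 0.1 * np.cos(2.0 * np.pi * phase)
noisy_mags = true_mags + rng.normal(0.0, 0.02, size=phase.size)

polydeg = 2
windowlength = default_windowlength(phase.size, polydeg=polydeg)

# smooth in phase order; 'wrap' treats the phased curve as periodic
smoothed = savgol_filter(noisy_mags, windowlength, polydeg, mode="wrap")
```

The smoothed series has the same length as the input, and its scatter about the underlying cosine should be smaller than that of the noisy input.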
astrobase.lcfit.nonphysical.legendre_fit_magseries(times, mags, errs, period, legendredeg=10, sigclip=30.0, plotfit=False, magsarefluxes=False, verbose=True)
Fit an arbitrary-order Legendre series, via least squares, to the magnitude/flux time series.
This is a series of the form:
p(x) = c_0*L_0(x) + c_1*L_1(x) + c_2*L_2(x) + ... + c_n*L_n(x)
where L_i’s are Legendre polynomials (also called “Legendre functions of the first kind”) and c_i’s are the coefficients being fit.
This function is mainly just a wrapper to numpy.polynomial.legendre.Legendre.fit.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit a Legendre series polynomial to.
- period (float) – The period to use for the Legendre fit.
- legendredeg (int) – This is n in the equation above, e.g. if you give n=5, you will get 6 coefficients. This number should be much less than the number of data points you are fitting.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'legendre',
        'fitinfo':{
            'legendredeg': the Legendre polynomial degree used,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
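Since this function is described as a wrapper around numpy.polynomial.legendre.Legendre.fit, the underlying call can be sketched on synthetic phased data. The data, the error values and the degrees-of-freedom convention used for the reduced chi-sq (number of points minus number of coefficients minus one) are assumptions made for this example only.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

# synthetic phased light curve with constant errors
phase = np.linspace(0.0, 1.0, 200, endpoint=False)
mags = 12.0 + 0.1 * np.cos(2.0 * np.pi * phase)
errs = np.full_like(mags, 0.01)

legendredeg = 10

# least-squares fit of a degree-10 Legendre series: 11 coefficients c_0..c_10
series = Legendre.fit(phase, mags, legendredeg)
fitmags = series(phase)

# chi-sq and reduced chi-sq (assumed dof convention, see lead-in)
fitchisq = np.sum(((fitmags - mags) / errs) ** 2)
n_params = legendredeg + 1
fitredchisq = fitchisq / (len(mags) - n_params - 1)
```

Note that Legendre.fit maps the input phases onto the interval [-1, 1] internally, so the fitted series should be evaluated by calling it on the original phase values, as done above.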
astrobase.lcfit.sinusoidal module¶
Light curve fitting routines for sinusoidal models:
- astrobase.lcfit.sinusoidal.fourier_fit_magseries(): fit an arbitrary order Fourier series to a magnitude/flux time series.
astrobase.lcfit.sinusoidal.fourier_fit_magseries(times, mags, errs, period, fourierorder=None, fourierparams=None, fix_period=True, scale_errs_redchisq_unity=True, sigclip=3.0, magsarefluxes=False, plotfit=False, ignoreinitfail=True, verbose=True, curve_fit_kwargs=None)
This fits a Fourier series to a mag/flux time series.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit a Fourier cosine series to.
- period (float) – The period to use for the Fourier fit.
- fourierorder (None or int) – If this is an int, it will be interpreted as the Fourier order of the series to fit to the input mag/flux time-series. If this is None and fourierparams is specified, fourierparams will be used directly to generate the fit Fourier series. If fourierparams is also None, this function will try to fit a Fourier cosine series of order 3 to the mag/flux time-series.
- fourierparams (list of floats or None) –
If this is specified as a list of floats, it must be of the form below:
[fourier_amp1, fourier_amp2, fourier_amp3,...,fourier_ampN, fourier_phase1, fourier_phase2, fourier_phase3,...,fourier_phaseN]
to specify a Fourier cosine series of order N. If this is None and fourierorder is specified, the Fourier order specified there will be used to construct the Fourier cosine series used to fit the input mag/flux time-series. If both are None, this function will try to fit a Fourier cosine series of order 3 to the input mag/flux time-series.
- fix_period (bool) – If True, will fix the period when fitting the sinusoidal function to the phased light curve.
- scale_errs_redchisq_unity (bool) – If True, the standard errors on the fit parameters will be scaled to make the reduced chi-sq = 1.0. This sets the absolute_sigma kwarg for the scipy.optimize.curve_fit function to False.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
- curve_fit_kwargs (dict or None) – If not None, this should be a dict containing extra kwargs to pass to the scipy.optimize.curve_fit function.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'fourier',
        'fitinfo':{
            'finalparams': the list of final model fit params,
            'finalparamerrs': list of errs for each model fit param,
            'fitmags': the model fit mags,
            'fitperiod': the fit period if this wasn't set to fixed,
            'fitepoch': this is times.min() for this fit type,
            'actual_fitepoch': time of minimum light from fit model
            ... other fit function specific keys ...
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
NOTE: the returned value of ‘fitepoch’ in the ‘fitinfo’ dict returned by this function is the time value of the first observation since this is where the LC is folded for the fit procedure. To get the actual epoch of minimum light as calculated by a spline fit to the phased LC, use the key ‘actual_fitepoch’ in the ‘fitinfo’ dict.
Return type: dict
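To make the layout of fourierparams concrete, the sketch below evaluates a Fourier cosine series from a list of N amplitudes followed by N phases. The function name fourier_cosine_series is hypothetical, and two details are assumptions made for this example rather than statements about the Astrobase model: the harmonics are numbered 1 to N, and a constant baseline (for instance the median magnitude) is added to the series.

```python
import numpy as np

def fourier_cosine_series(fourierparams, phase, baseline):
    """Evaluate baseline + sum_k amp_k * cos(2*pi*k*phase + phi_k), k = 1..N."""
    params = np.asarray(fourierparams, dtype=float)
    phase = np.asarray(phase, dtype=float)

    # first half of the list: amplitudes; second half: phase offsets
    order = params.size // 2
    amps, phis = params[:order], params[order:]

    model = np.full_like(phase, baseline)
    for k in range(1, order + 1):
        model += amps[k - 1] * np.cos(2.0 * np.pi * k * phase + phis[k - 1])
    return model
```

For an order-2 series, fourierparams therefore has four entries: [fourier_amp1, fourier_amp2, fourier_phase1, fourier_phase2].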
astrobase.lcfit.transits module¶
Fitting routines for planetary transits:
- astrobase.lcfit.transits.traptransit_fit_magseries(): fit a trapezoid-shaped transit signal to the magnitude/flux time series.
- astrobase.lcfit.transits.mandelagol_fit_magseries(): fit a Mandel & Agol (2002) planet transit model to the flux time series, fixing some parameters (e.g., eccentricity) and varying other parameters (e.g., t0, period, a/Rstar). Priors must be passed by user.
- astrobase.lcfit.transits.mandelagol_and_line_fit_magseries(): fit a Mandel & Agol 2002 model, + a local line to the flux time series. Priors must be passed by user.
- astrobase.lcfit.transits.fivetransitparam_fit_magseries(): fit out a line around each transit window in the given light curve, and then fit all the transits in the light curve for t0, period, a/Rstar, Rp/Rstar, and inclination. Fixes e to 0, and uses theoretical quadratic limb darkening coefficients in the bandpass given by the user. Figures out the priors, user only needs to pass stellar parameters instead.
astrobase.lcfit.transits.traptransit_fit_magseries(times, mags, errs, transitparams, param_bounds=None, scale_errs_redchisq_unity=True, sigclip=10.0, plotfit=False, magsarefluxes=False, verbose=True, curve_fit_kwargs=None)
This fits a trapezoid transit model to a magnitude time series.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to fit a trapezoid planet-transit model to.
- period (float) – The period to use for the model fit.
- transitparams (list of floats) –
These are initial parameters for the transit model fit. A list of the following form is required:
transitparams = [transit_period (time), transit_epoch (time), transit_depth (flux or mags), transit_duration (phase), ingress_duration (phase)]
- for magnitudes -> transit_depth should be < 0
- for fluxes -> transit_depth should be > 0
If transit_epoch is None, this function will do an initial spline fit to find an approximate minimum of the phased light curve using the given period.
The transit_depth provided is checked against the value of magsarefluxes. If magsarefluxes = True, transit_depth is forced to be > 0; if magsarefluxes = False, transit_depth is forced to be < 0.
- param_bounds (dict or None) –
This is a dict of the upper and lower bounds on each fit parameter. Should be of the form:
    {'period': (lower_bound_period, upper_bound_period),
     'epoch': (lower_bound_epoch, upper_bound_epoch),
     'depth': (lower_bound_depth, upper_bound_depth),
     'duration': (lower_bound_duration, upper_bound_duration),
     'ingressduration': (lower_bound_ingressduration, upper_bound_ingressduration)}
- To indicate that a parameter is fixed, use the string ‘fixed’ in place of the tuple of its lower and upper bounds.
- To indicate that a parameter has no bounds, don’t include it in the param_bounds dict.
If this is None, the default value of this kwarg will be:
    {'period': (0.0, np.inf),          # period is between 0 and inf
     'epoch': (0.0, np.inf),           # epoch is between 0 and inf
     'depth': (-np.inf, np.inf),       # depth is between -np.inf and np.inf
     'duration': (0.0, 1.0),           # duration is between 0.0 and 1.0
     'ingressduration': (0.0, 0.5)}    # ingress duration between 0.0 and 0.5
- scale_errs_redchisq_unity (bool) – If True, the standard errors on the fit parameters will be scaled to make the reduced chi-sq = 1.0. This sets the absolute_sigma kwarg for the scipy.optimize.curve_fit function to False.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or False) – If this is a string, this function will make a plot for the fit to the mag/flux time-series and writes the plot to the path specified here.
- ignoreinitfail (bool) – If this is True, ignores the initial failure to find a set of optimized Fourier parameters using the global optimization function and proceeds to do a least-squares fit anyway.
- verbose (bool) – If True, will indicate progress and warn of any problems.
- curve_fit_kwargs (dict or None) – If not None, this should be a dict containing extra kwargs to pass to the scipy.optimize.curve_fit function.
Returns: This function returns a dict containing the model fit parameters, the minimized chi-sq value and the reduced chi-sq value. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'traptransit',
        'fitinfo':{
            'initialparams':the initial transit params provided,
            'finalparams':the final model fit transit params,
            'finalparamerrs':formal errors in the params,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
            'ntransitpoints': the number of LC points in transit phase
        },
        'fitchisq': the minimized value of the fit's chi-sq,
        'fitredchisq':the reduced chi-sq value,
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
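The param_bounds conventions described above (a tuple for bounded parameters, the string ‘fixed’ for fixed ones, and a missing key for unbounded ones) can be turned into the lower and upper bound sequences that the bounds argument of scipy.optimize.curve_fit accepts. The sketch below shows one way to do that; the helper split_bounds and the numerical values are hypothetical and do not reproduce the internal Astrobase logic.

```python
import math

# order of the entries in transitparams, as documented above
PARAM_ORDER = ("period", "epoch", "depth", "duration", "ingressduration")

def split_bounds(initial_params, param_bounds):
    """Separate free and fixed parameters and collect bounds for the free ones."""
    free_names, free_values, lower, upper, fixed = [], [], [], [], {}
    for name, value in zip(PARAM_ORDER, initial_params):
        # a missing key means the parameter is unbounded
        bound = param_bounds.get(name, (-math.inf, math.inf))
        if bound == "fixed":
            fixed[name] = value
        else:
            free_names.append(name)
            free_values.append(value)
            lower.append(bound[0])
            upper.append(bound[1])
    return free_names, free_values, lower, upper, fixed

# hypothetical initial guesses for a magnitude light curve (depth < 0)
transitparams = [3.5, 2455000.1, -0.01, 0.05, 0.01]
param_bounds = {
    "period": "fixed",
    "epoch": (2455000.0, 2455000.2),
    "duration": (0.0, 1.0),
    "ingressduration": (0.0, 0.5),
}   # 'depth' is omitted, so it has no bounds

free_names, free_values, lower, upper, fixed = split_bounds(transitparams, param_bounds)
```

Here the period stays at its initial value, the depth is fit without bounds, and the pair (lower, upper) could be passed as bounds to a curve_fit call over the remaining free parameters.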
astrobase.lcfit.transits.log_posterior_transit(theta, params, model, t, flux, err_flux, priorbounds)
Evaluate posterior probability given proposed model parameters and the observed flux timeseries.
astrobase.lcfit.transits.log_posterior_transit_plus_line(theta, params, model, t, flux, err_flux, priorbounds)
Evaluate posterior probability given proposed model parameters and the observed flux timeseries.
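For readers unfamiliar with the quantity being evaluated, a generic log-posterior with uniform priors and independent Gaussian errors can be sketched as follows. This is not the Astrobase code and does not use its signatures: all function names here are hypothetical, and a straight line stands in for the transit model so that the example stays self-contained.

```python
import numpy as np

def log_uniform_prior(theta, bounds):
    """0 inside the prior box, -inf as soon as one parameter leaves its bounds."""
    for value, (low, high) in zip(theta, bounds):
        if not (low < value < high):
            return -np.inf
    return 0.0

def log_gaussian_likelihood(model_flux, flux, err_flux):
    """Log-likelihood for independent Gaussian errors on each flux point."""
    resid = (flux - model_flux) / err_flux
    return -0.5 * np.sum(resid**2 + np.log(2.0 * np.pi * err_flux**2))

def line_model(theta, t):
    # hypothetical stand-in for a transit model: flux = intercept + slope * t
    intercept, slope = theta
    return intercept + slope * t

def log_posterior(theta, bounds, model_func, t, flux, err_flux):
    """Log prior plus log likelihood; skips the model if the prior is zero."""
    lp = log_uniform_prior(theta, bounds)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_gaussian_likelihood(model_func(theta, t), flux, err_flux)
```

A sampler such as emcee calls a function of this kind at every proposed position; returning -inf outside the prior bounds makes the sampler reject those proposals.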
astrobase.lcfit.transits.mandelagol_fit_magseries(times, mags, errs, fitparams, priorbounds, fixedparams, trueparams=None, burninpercent=0.3, plotcorner=False, samplesavpath=False, n_walkers=50, n_mcmc_steps=400, exp_time_minutes=2, eps=0.0001, skipsampling=False, overwriteexistingsamples=False, mcmcprogressbar=False, plotfit=False, magsarefluxes=False, sigclip=10.0, verbose=True, nworkers=4)
This fits a Mandel & Agol (2002) planetary transit model to a flux time series. You can fit and fix whatever parameters you want.
It relies on Kreidberg (2015)’s BATMAN implementation for the transit model, emcee to sample the posterior (Foreman-Mackey et al 2013), corner to plot it, and h5py to save the samples. See e.g., Claret’s work for good guesses of star-appropriate limb-darkening parameters.
NOTE: this only works for flux time-series at the moment.
NOTE: Between the fitparams, priorbounds, and fixedparams dicts, you must specify all of the planetary transit parameters required by BATMAN: [‘t0’, ‘rp’, ‘sma’, ‘incl’, ‘u’, ‘ecc’, ‘omega’, ‘period’], or the BATMAN model will fail to initialize.
Parameters: - times,mags,errs (np.array) – The input flux time-series to fit the transit model to.
- fitparams (dict) –
This is the initial parameter guesses for MCMC, found e.g., by BLS. The key string format must not be changed, but any parameter can be either “fit” or “fixed”. If it is “fit”, it must have a corresponding prior. For example:
fitparams = {'t0':1325.9, 'rp':np.sqrt(fitd['transitdepth']), 'sma':6.17, 'incl':85, 'u':[0.3, 0.2]}
where ‘u’ is a list of the limb darkening parameters, linear first, then quadratic. Quadratic limb darkening is the only form implemented.
- priorbounds (dict) –
This sets the lower & upper bounds on uniform prior, e.g.:
priorbounds = {'rp':(0.135, 0.145), 'u_linear':(0.3-1, 0.3+1), 'u_quad':(0.2-1, 0.2+1), 't0':(np.min(time), np.max(time)), 'sma':(6,6.4), 'incl':(80,90)}
- fixedparams (dict) –
This sets which parameters are fixed, and their values. For example:
fixedparams = {'ecc':0., 'omega':90., 'limb_dark':'quadratic', 'period':fitd['period'] }
limb_dark must be “quadratic”. It’s “fixed”, because once you choose your limb-darkening model, it’s fixed.
- trueparams (list of floats) – The true parameter values you’re fitting for, if they’re known (e.g., a known planet, or fake data). Only for plotting purposes.
- burninpercent (float) – The percent of MCMC samples to discard as burn-in.
- plotcorner (str or False) – If this is a str, points to the path of output corner plot that will be generated for this MCMC run.
- samplesavpath (str) – This must be provided so emcee can save its MCMC samples to disk as HDF5 files. This sets the path of the output HDF5 file that is written.
- n_walkers (int) – The number of MCMC walkers to use.
- n_mcmc_steps (int) – The number of MCMC steps to take.
- exp_time_minutes (int) – Exposure time, in minutes, passed to transit model to smear observations.
- eps (float) – The radius of the n_walkers-dimensional Gaussian ball used to initialize the MCMC.
- skipsampling (bool) – If you’ve already collected MCMC samples, and you do not want any more sampling (e.g., just make the plots), set this to be True.
- overwriteexistingsamples (bool) – If you’ve collected samples, but you want to overwrite them, set this to True. Usually, it should be False, which appends samples to samplesavpath HDF5 file.
- mcmcprogressbar (bool) – If True, will show a progress bar for the MCMC process.
- plotfit (str or bool) – If a str, indicates the path of the output fit plot file. If False, no fit plot will be made.
- magsarefluxes (bool) – This indicates if the input measurements in mags are actually fluxes.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate MCMC progress.
- nworkers (int) – The number of parallel workers to launch for MCMC.
Returns: This function returns a dict containing the model fit parameters and other fit information. The form of this dict is mostly standardized across all functions in this module:
    {
        'fittype':'mandelagol',
        'fitinfo':{
            'initialparams':the initial transit params provided,
            'fixedparams':the fixed transit params provided,
            'finalparams':the final model fit transit params,
            'finalparamerrs':formal errors in the params,
            'fitmags': the model fit mags,
            'fitepoch': the epoch of minimum light for the fit,
            'acceptancefraction': fraction of MCMC ensemble. low=bad.
            'autocorrtime': if autocorrtime ~= n_mcmc_steps, not good.
        },
        'fitplotfile': the output fit plot if fitplot is not None,
        'magseries':{
            'times':input times in phase order of the model,
            'phase':the phases of the model mags,
            'mags':input mags/fluxes in the phase order of the model,
            'errs':errs in the phase order of the model,
            'magsarefluxes':input value of magsarefluxes kwarg
        }
    }
Return type: dict
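Because the three input dicts must jointly cover every transit parameter, and every fit parameter needs a prior, it can help to check them before starting a long MCMC run. The sketch below does this with two hypothetical helpers; it is not part of Astrobase. The numerical values are placeholders: 0.14 stands in for np.sqrt(fitd['transitdepth']), 3.5 for fitd['period'], and the t0 bounds for np.min(time) and np.max(time). The mapping of ‘u’ onto the prior keys ‘u_linear’ and ‘u_quad’ follows the priorbounds example above.

```python
# transit parameters that must appear in fitparams or fixedparams
REQUIRED = ("t0", "rp", "sma", "incl", "u", "ecc", "omega", "period")

def missing_transit_params(fitparams, fixedparams):
    """Names from REQUIRED that neither dict supplies."""
    supplied = set(fitparams) | set(fixedparams)
    return [name for name in REQUIRED if name not in supplied]

def fit_params_without_priors(fitparams, priorbounds):
    """Prior keys that are needed by fitparams but absent from priorbounds."""
    missing = []
    for name in fitparams:
        # 'u' holds two coefficients, each with its own prior key
        needed = ("u_linear", "u_quad") if name == "u" else (name,)
        missing.extend(key for key in needed if key not in priorbounds)
    return missing

# placeholder values, see the lead-in for what they stand in for
fitparams = {"t0": 1325.9, "rp": 0.14, "sma": 6.17, "incl": 85, "u": [0.3, 0.2]}
priorbounds = {"rp": (0.135, 0.145), "u_linear": (0.3 - 1, 0.3 + 1),
               "u_quad": (0.2 - 1, 0.2 + 1), "t0": (1325.0, 1327.0),
               "sma": (6, 6.4), "incl": (80, 90)}
fixedparams = {"ecc": 0.0, "omega": 90.0, "limb_dark": "quadratic", "period": 3.5}

missing_params = missing_transit_params(fitparams, fixedparams)
missing_priors = fit_params_without_priors(fitparams, priorbounds)
```

Both lists are empty when the dicts are complete; a non-empty list names the entries that still need to be added.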
astrobase.lcfit.transits.mandelagol_and_line_fit_magseries(times, mags, errs, fitparams, priorbounds, fixedparams, trueparams=None, burninpercent=0.3, plotcorner=False, timeoffset=0, samplesavpath=False, n_walkers=50, n_mcmc_steps=400, exp_time_minutes=2, eps=0.0001, skipsampling=False, overwriteexistingsamples=False, mcmcprogressbar=False, plotfit=False, scatterxdata=None, scatteryaxes=None, magsarefluxes=True, sigclip=10.0, verbose=True, nworkers=4)
The model fit by this function is: a Mandel & Agol (2002) transit, PLUS a line. You can fit and fix whatever parameters you want.
Typical use case: you want to measure transit times of individual SNR >~ 50 transits. You fix all the transit parameters except for the mid-time, and also fit for a line locally.
NOTE: this only works for flux time-series at the moment.
NOTE: Between the fitparams, priorbounds, and fixedparams dicts, you must specify all of the planetary transit parameters required by BATMAN and the parameters for the line fit: [‘t0’, ‘rp’, ‘sma’, ‘incl’, ‘u’, ‘ecc’, ‘omega’, ‘period’, ‘poly_order0’, ‘poly_order1’], or the BATMAN model will fail to initialize.
Parameters: - times,mags,errs (np.array) – The input flux time-series to fit the transit-plus-line model to.
- fitparams (dict) –
These are the initial parameter guesses for MCMC, found e.g., by BLS. The key string format must not be changed, but any parameter can be either “fit” or “fixed”. If it is “fit”, it must have a corresponding prior. For example:
fitparams = {'t0':1325.9, 'poly_order0':1, 'poly_order1':0.}
where t0 is the time of transit-center for a reference transit. poly_order0 corresponds to the intercept of the line, poly_order1 is the slope.
- priorbounds (dict) –
This sets the lower & upper bounds on uniform prior, e.g.:
priorbounds = {'t0':(np.min(time), np.max(time)), 'poly_order0':(0.5,1.5), 'poly_order1':(-0.5,0.5) }
- fixedparams (dict) –
This sets which parameters are fixed, and their values. For example:
fixedparams = {'ecc':0., 'omega':90., 'limb_dark':'quadratic', 'period':fitd['period'], 'rp':np.sqrt(fitd['transitdepth']), 'sma':6.17, 'incl':85, 'u':[0.3, 0.2]}
limb_dark must be “quadratic”. It’s “fixed”, because once you choose your limb-darkening model, it’s fixed.
- trueparams (list of floats) – The true parameter values you’re fitting for, if they’re known (e.g., a known planet, or fake data). Only for plotting purposes.
- burninpercent (float) – The percent of MCMC samples to discard as burn-in.
- plotcorner (str or False) – If this is a str, points to the path of output corner plot that will be generated for this MCMC run.
- timeoffset (float) – If input times are offset by some constant, and you want saved pickles to fix that.
- samplesavpath (str) – This must be provided so emcee can save its MCMC samples to disk as HDF5 files. This sets the path of the output HDF5 file.
- n_walkers (int) – The number of MCMC walkers to use.
- n_mcmc_steps (int) – The number of MCMC steps to take.
- exp_time_minutes (int) – Exposure time, in minutes, passed to transit model to smear observations.
- eps (float) – The radius of the n_walkers-dimensional Gaussian ball used to initialize the MCMC.
- skipsampling (bool) – If you’ve already collected MCMC samples, and you do not want any more sampling (e.g., just make the plots), set this to be True.
- overwriteexistingsamples (bool) – If you’ve collected samples, but you want to overwrite them, set this to True. Usually, it should be False, which appends samples to samplesavpath HDF5 file.
- mcmcprogressbar (bool) – If True, will show a progress bar for the MCMC process.
- plotfit (str or bool) – If a str, indicates the path of the output fit plot file. If False, no fit plot will be made.
- scatterxdata (np.array or None) – Use this to overplot x,y scatter points on the output model/data lightcurve (e.g., to highlight bad data, or to indicate an ephemeris), this can take a np.ndarray with the same units as times.
- scatteryaxes (np.array or None) – Use this to provide the y-values for scatterxdata, in units of fraction of an axis.
- magsarefluxes (bool) – This indicates if the input measurements in mags are actually fluxes.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate MCMC progress.
- nworkers (int) – The number of parallel workers to launch for MCMC.
Returns: This function returns a dict containing the model fit parameters and other fit information. The form of this dict is mostly standardized across all functions in this module:
{
    'fittype':'mandelagol_and_line',
    'fitinfo':{
        'initialparams':the initial transit params provided,
        'fixedparams':the fixed transit params provided,
        'finalparams':the final model fit transit params,
        'finalparamerrs':formal errors in the params,
        'fitmags': the model fit mags,
        'fitepoch': the epoch of minimum light for the fit,
        'acceptancefraction': fraction of MCMC ensemble. low=bad.
        'autocorrtime': if autocorrtime ~= n_mcmc_steps, not good.
    },
    'fitplotfile': the output fit plot if plotfit is a str,
    'magseries':{
        'times':input times in phase order of the model,
        'phase':the phases of the model mags,
        'mags':input mags/fluxes in the phase order of the model,
        'errs':errs in the phase order of the model,
        'magsarefluxes':input value of magsarefluxes kwarg
    }
}
Return type: dict
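As a rough illustration of the NOTE above, the three dicts can be checked for completeness before the fitter is called. This is a minimal sketch in plain Python, not part of astrobase: the helper name check_param_coverage, the inclusion of 'limb_dark' in the required set, and the numeric values for the period, transit depth, and time range are all assumptions made for the example.

```python
import math

# Parameter names listed in the NOTE above. 'limb_dark' is added here as an
# assumption, because the fixedparams example always carries it.
REQUIRED = {'t0', 'rp', 'sma', 'incl', 'u', 'ecc', 'omega', 'period',
            'poly_order0', 'poly_order1', 'limb_dark'}


def check_param_coverage(fitparams, priorbounds, fixedparams):
    """Hypothetical helper: return (missing, unbounded, overlapping) sets."""
    # parameters that are neither fit nor fixed
    missing = REQUIRED - (set(fitparams) | set(fixedparams))
    # fit parameters that have no prior bounds
    unbounded = set(fitparams) - set(priorbounds)
    # parameters that are declared both fit and fixed
    overlapping = set(fitparams) & set(fixedparams)
    return missing, unbounded, overlapping


# Illustrative values only (e.g. as might come from a BLS run)
period, transitdepth = 3.5, 0.01
tmin, tmax = 1325.0, 1327.0

fitparams = {'t0': 1325.9, 'poly_order0': 1.0, 'poly_order1': 0.0}
priorbounds = {'t0': (tmin, tmax),
               'poly_order0': (0.5, 1.5),
               'poly_order1': (-0.5, 0.5)}
fixedparams = {'ecc': 0.0, 'omega': 90.0, 'limb_dark': 'quadratic',
               'period': period, 'rp': math.sqrt(transitdepth),
               'sma': 6.17, 'incl': 85, 'u': [0.3, 0.2]}

missing, unbounded, overlapping = check_param_coverage(
    fitparams, priorbounds, fixedparams
)
```

If all three returned sets are empty, every required parameter is either fit (with a prior) or fixed, and none is declared twice.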
-
astrobase.lcfit.transits.
fivetransitparam_fit_magseries
(times, mags, errs, teff, rstar, logg, identifier, fit_savdir, chain_savdir, n_mcmc_steps=1, overwriteexistingsamples=False, burninpercent=0.3, n_transit_durations=5, make_tlsfit_plot=True, exp_time_minutes=30, bandpass='tess', magsarefluxes=True, nworkers=32)[source]¶ Wrapper to mandelagol_fit_magseries that fits out a line around each transit window in the given light curve, and then fits the entire light curve for (t0, period, a/Rstar, Rp/Rstar, inclination). Fixes e to 0, and uses theoretical quadratic limb darkening coefficients in the bandpass given by the user, as found with the stellar parameters. Figures out the priors for you.
Typical use case: you have a light curve with >=2 transits in it. You want to fit the entire light curve for the parameters noted above, but you don’t want to need to manually determine all the priors.
Parameters: - times,mags,errs (np.array) – The input flux time-series to fit.
- teff,rstar,logg (float) – Stellar parameters [K, Rsun, cgs] used to get limb darkening coefficients.
- identifier (str) – String that goes into file names to identify the object being fit. E.g., fit CSV file will be at {fit_savdir}/{identifier}_fivetransitparam_fitresults.csv
- fit_savdir (str) – Path to directory where CSV results of fits, fit status files, and diagnostic plots are saved. If it doesn’t exist, this function tries to make it.
- chain_savdir (str) – Path to directory where MCMC chains are saved.
- n_mcmc_steps (int) – Number of steps to run MCMC. (Note: convergence not guaranteed).
- overwriteexistingsamples (bool) – If False, and a pickle file with saved parameters is found (in fit_savdir), no additional MCMC sampling is done.
- exp_time_minutes (int) – Exposure time in minutes. Used for the model fitting.
- n_transit_durations (int) – The points used in the fit are only those within +/- N transit durations of each transit mid-point. This is to prevent excessive out-of-transit data being used in the fit (these points do not inform the model’s parameters).
Returns: (mafr, tlsr, is_converged) – The tuple returned contains the following items:
- mafr: the Mandel-Agol fit result dictionary, which contains the same information as the one returned from mandelagol_and_line_fit_magseries. Fit parameters are accessed like maf_empc_errs['fitinfo']['finalparams']['sma'].
- tlsr: the TLS result dictionary, containing keys documented in periodbase/htls.tls_parallel_pfind.
- is_converged: boolean for whether the fitting converged, according to the chain autocorrelation time.
Return type: tuple
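The n_transit_durations selection described above can be sketched with Numpy alone. This is not the astrobase implementation: the function name select_near_transits is hypothetical, and the ephemeris (t0, period) and transit duration are assumed to be known already, in the same units as times.

```python
import numpy as np


def select_near_transits(times, t0, period, duration, n_transit_durations=5):
    """Boolean mask of points within +/- N transit durations of a mid-transit."""
    # offset of each point from its nearest transit mid-point, in time units
    cycles = (times - t0) / period
    dt = (cycles - np.round(cycles)) * period
    return np.abs(dt) <= n_transit_durations * duration


# Illustrative values only
times = np.arange(0.0, 10.0, 0.01)
mask = select_near_transits(times, t0=1.0, period=4.0, duration=0.1)
fit_times = times[mask]
```

Only the points selected by mask would then be passed on to the model fit, which keeps long stretches of out-of-transit data from dominating the likelihood evaluation.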
astrobase.lcfit.utils module¶
This contains utilities for fitting routines in the rest of this subpackage.
-
astrobase.lcfit.utils.
get_phased_quantities
(stimes, smags, serrs, period)[source]¶ Does phase-folding for the mag/flux time-series given a period.
Given finite and sigma-clipped times, magnitudes, and errors, along with the period at which to phase-fold the data, perform the phase-folding and return the phase-folded values.
Parameters: - stimes,smags,serrs (np.array) – The sigma-clipped and finite input mag/flux time-series arrays to operate on.
- period (float) – The period to phase the mag/flux time-series at. stimes.min() is used as the epoch value to fold the times-series around.
Returns: (phase, pmags, perrs, ptimes, mintime) – The tuple returned contains the following items:
- phase: phase-sorted values of phase at each of stimes
- pmags: phase-sorted magnitudes at each phase
- perrs: phase-sorted errors
- ptimes: phase-sorted times
- mintime: earliest time in stimes.
Return type: tuple
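The phase-folding described above, with stimes.min() as the epoch, can be sketched as follows. This is a Numpy-only illustration that mirrors the documented return tuple; the function name phase_fold_from_mintime is hypothetical and the input values are made up.

```python
import numpy as np


def phase_fold_from_mintime(stimes, smags, serrs, period):
    """Fold around the earliest time and return phase-sorted arrays."""
    mintime = stimes.min()
    # fractional part of the number of cycles elapsed since mintime
    cycles = (stimes - mintime) / period
    phase = cycles - np.floor(cycles)
    order = np.argsort(phase)
    return phase[order], smags[order], serrs[order], stimes[order], mintime


stimes = np.array([10.0, 10.7, 11.5, 12.2])
smags = np.array([12.0, 12.1, 12.3, 12.2])
serrs = np.full(4, 0.01)

phase, pmags, perrs, ptimes, mintime = phase_fold_from_mintime(
    stimes, smags, serrs, period=1.0
)
```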
-
astrobase.lcfit.utils.
make_fit_plot
(phase, pmags, perrs, fitmags, period, mintime, magseriesepoch, plotfit, magsarefluxes=False, wrap=False, model_over_lc=True, fitphase=None)[source]¶ This makes a plot of the LC model fit.
Parameters: - phase,pmags,perrs (np.array) – The actual mag/flux time-series.
- fitmags (np.array) – The model fit time-series.
- period (float) – The period at which the phased LC was generated.
- mintime (float) – The minimum time value.
- magseriesepoch (float) – The value of time around which the phased LC was folded.
- plotfit (str) – The name of a file to write the plot to.
- magsarefluxes (bool) – Set this to True if the values in pmags and fitmags are actually fluxes.
- wrap (bool) – If True, will wrap the phased LC around 0.0 to make some phased LCs easier to look at.
- model_over_lc (bool) – Usually, this function will plot the actual LC over the model LC. Set this to True to plot the model over the actual LC; this is most useful when you have a very dense light curve and want to be able to see how it follows the model.
- fitphase (optional np.array) – If passed, use this as x values for fitmags
Returns: Nothing.
-
astrobase.lcfit.utils.
iterative_fit
(data_x, data_y, init_coeffs, objective_func, objective_args=None, objective_kwargs=None, optimizer_func=<function least_squares>, optimizer_kwargs=None, optimizer_needs_scalar=False, objective_residualarr_func=None, fit_iterations=5, fit_reject_sigma=3.0, verbose=True, full_output=False)[source]¶ This is a function to run iterative fitting based on repeated sigma-clipping of fit outliers.
Parameters: - data_x (np.array) – Array of the independent variable.
- data_y (np.array) – Array of the dependent variable.
- init_coeffs – The initial values of the fit function coefficients.
- objective_func (Python function) –
A function that is used to calculate residuals between the model and the data_y array. This should have a signature similar to:
def objective_func(fit_coeffs, data_x, data_y, *objective_args, **objective_kwargs)
and return an array of residuals or a scalar value indicating some sort of sum of residuals (depending on what the optimizer function requires).
If this function returns a scalar value, you must set optimizer_needs_scalar to True, and provide a Python function in objective_residualarr_func that returns an array of residuals for each value of data_x and data_y given an array of fit coefficients.
- objective_args (tuple or None) – A tuple of arguments to pass into the objective_func.
- objective_kwargs (dict or None) – A dict of keyword arguments to pass into the objective_func.
- optimizer_func (Python function) –
The function that minimizes the residual between the model and the data_y array using the objective_func. This should have a signature similar to one of the optimizer functions in scipy.optimize, i.e.:
def optimizer_func(objective_func, initial_coeffs, args=(), kwargs={}, ...)
and return a scipy.optimize.OptimizeResult. We’ll rely on the .success attribute to determine if the fit was successful, and the .x attribute to get the values of the fit coefficients.
- optimizer_kwargs (dict or None) – A dict of kwargs to pass into the optimizer_func function.
- optimizer_needs_scalar (bool) – If True, this indicates that the optimizer requires a scalar value to be returned from the objective_func. This is the case for scipy.optimize.minimize. If this is True, you must also provide a function in objective_residualarr_func.
- objective_residualarr_func (Python function) –
This is used in conjunction with optimizer_needs_scalar. The function provided here must return an array of residuals for each value of data_x and data_y given an array of fit coefficients. This is then used to calculate which points are outliers after a fit iteration. The function here must have the following signature:
def objective_residualarr_func(coeffs, data_x, data_y, *objective_args, **objective_kwargs)
- fit_iterations (int) – The number of iterations of the fit to perform while throwing out outliers to the fit.
- fit_reject_sigma (float) – The maximum deviation allowed to consider a data_y item as an outlier to the fit and to remove it from consideration in a successive iteration of the fit.
- verbose (bool) – If True, reports per iteration on the cost function value and the number of items remaining in data_x and data_y after sigma-clipping outliers.
- full_output (bool) – If True, returns the full output from the optimizer_func along with the resulting fit function coefficients.
Returns: result – If full_output was True, will return the fit coefficients np.array as the first element and the optimizer function fit output from the last iteration as the second element of a tuple. If full_output was False, will only return the final fit coefficients as an np.array.
Return type: np.array or tuple
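The fit, reject, refit loop behind this function can be sketched as below. To stay self-contained, the sketch swaps the generic objective_func/optimizer_func pair for np.polyfit and uses the standard deviation of the residuals as the deviation measure. Both choices are assumptions made for illustration, and iterative_polyfit is a hypothetical name.

```python
import numpy as np


def iterative_polyfit(data_x, data_y, order=1,
                      fit_iterations=5, fit_reject_sigma=3.0):
    """Fit a polynomial, reject outliers to the fit, and refit."""
    x = np.asarray(data_x, dtype=float)
    y = np.asarray(data_y, dtype=float)
    coeffs = np.polyfit(x, y, order)
    for _ in range(fit_iterations):
        residuals = y - np.polyval(coeffs, x)
        sigma = np.std(residuals)
        keep = np.abs(residuals) <= fit_reject_sigma * sigma
        if keep.all():
            break  # nothing left to reject
        # drop the outliers and fit again on the surviving points
        x, y = x[keep], y[keep]
        coeffs = np.polyfit(x, y, order)
    return coeffs, x.size


rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, x.size)
y[[20, 90, 150]] += 25.0  # inject three gross outliers

coeffs, nkept = iterative_polyfit(x, y)
```

After the first pass the three injected outliers inflate the residual scatter but still lie far outside the rejection threshold, so they are dropped and the later fits converge on the underlying line.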
astrobase.lcmath module¶
Contains various useful tools for calculating various things related to lightcurves (like phasing, sigma-clipping, finding and filling gaps, etc.)
-
astrobase.lcmath.
find_lc_timegroups
(lctimes, mingap=4.0)[source]¶ Finds gaps in the provided time-series and indexes them into groups.
This finds the gaps in the provided lctimes array, so we can figure out which times are for consecutive observations and which represent gaps between seasons or observing eras.
Parameters: - lctimes (array-like) – This contains the times to analyze for gaps; assumed to be some form of Julian date.
- mingap (float) – The minimum difference between consecutive measurement times required to consider them as parts of different timegroups. By default it is set to 4.0 days.
Returns: A tuple of the form: (ngroups, [slice(start_ind_1, end_ind_1), …]) is returned. This contains the number of groups as the first element, and a list of Python slice objects for each time-group found. These can be used directly to index into the array of times to quickly get measurements associated with each group.
Return type: tuple
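A minimal sketch of this kind of gap finding, returning the same (ngroups, list of slices) shape as described above. It assumes the times are already sorted in increasing order; find_timegroups is a hypothetical stand-in and not the astrobase function itself.

```python
import numpy as np


def find_timegroups(lctimes, mingap=4.0):
    """Split a sorted time array into groups separated by gaps > mingap."""
    lctimes = np.asarray(lctimes, dtype=float)
    # indices i where times[i+1] - times[i] exceeds mingap
    gap_after = np.where(np.diff(lctimes) > mingap)[0]
    starts = np.concatenate(([0], gap_after + 1))
    ends = np.concatenate((gap_after + 1, [lctimes.size]))
    groups = [slice(int(s), int(e)) for s, e in zip(starts, ends)]
    return len(groups), groups


times = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 30.0])
ngroups, groups = find_timegroups(times)

# each slice indexes directly into the times array
second_group_times = times[groups[1]]
```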
-
astrobase.lcmath.
normalize_magseries
(times, mags, mingap=4.0, normto='globalmedian', magsarefluxes=False, debugmode=False)[source]¶ This normalizes the magnitude time-series to a specified value.
This is used to normalize time series measurements that may have large time gaps and vertical offsets in mag/flux measurement between these ‘timegroups’, either due to instrument changes or different filters.
NOTE: this works in-place! The mags array will be replaced with normalized mags when this function finishes.
Parameters: - times,mags (array-like) – The times (assumed to be some form of JD) and mags (or flux) measurements to be normalized.
- mingap (float) – The minimum difference between consecutive measurement times required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- normto ({'globalmedian', 'zero'} or a float) –
Specifies the normalization type:
'globalmedian' -> norms each mag to the global median of the LC column
'zero' -> norms each mag to zero
a float -> norms each mag to this specified float value.
- magsarefluxes (bool) –
Indicates if the input mags array is actually an array of flux measurements instead of magnitude measurements. If this is set to True, then:
- if normto is ‘zero’, then each observation’s flux value is divided by the median flux to yield normalized fluxes with 1.0 as the global median.
- if normto is ‘globalmedian’, then the global median flux value across the entire time series is multiplied with each measurement.
- if normto is set to a float, then this number is multiplied with the flux value for each measurement.
- debugmode (bool) – If this is True, will print out verbose info on each timegroup found.
Returns: times,normalized_mags – Normalized magnitude values after normalization. If normalization fails for some reason, times and normalized_mags will both be None.
Return type: np.arrays
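The per-timegroup normalization for normto='globalmedian' can be sketched as follows. This is an illustration under stated assumptions rather than the library code: it returns a copy instead of working in-place, it assumes sorted times, and it assumes that magnitudes are shifted additively while fluxes are scaled multiplicatively. The name normalize_by_timegroup is hypothetical.

```python
import numpy as np


def normalize_by_timegroup(times, mags, mingap=4.0, magsarefluxes=False):
    """Return a copy of mags with each timegroup moved to the global median."""
    times = np.asarray(times, dtype=float)
    out = np.array(mags, dtype=float)  # copy, so the input is left untouched
    global_median = np.median(out)
    # split the indices wherever consecutive times are > mingap apart
    breaks = np.where(np.diff(times) > mingap)[0] + 1
    for idx in np.split(np.arange(times.size), breaks):
        group_median = np.median(out[idx])
        if magsarefluxes:
            out[idx] = out[idx] / group_median * global_median
        else:
            out[idx] = out[idx] - group_median + global_median
    return out


times = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
mags = np.array([10.0, 10.1, 9.9, 10.5, 10.6, 10.4])  # 0.5 mag offset
normalized = normalize_by_timegroup(times, mags)
```

In this example the two observing groups differ by a 0.5 mag offset, which is removed so that both share the global median.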
-
astrobase.lcmath.
sigclip_magseries
(times, mags, errs, sigclip=None, iterative=False, niterations=None, meanormedian='median', magsarefluxes=False)[source]¶ Sigma-clips a magnitude or flux time-series.
Selects the finite times, magnitudes (or fluxes), and errors from the passed values, and applies symmetric or asymmetric sigma clipping to them.
Parameters: - times,mags,errs (np.array) –
The magnitude or flux time-series arrays to sigma-clip. The values are not assumed to be all finite, nor to be all positive or all negative. All of these arrays will have their non-finite elements removed, and then will be sigma-clipped based on the arguments to this function.
errs is optional. Set it to None if you don’t have values for these. A ‘faked’ errs array will be generated if necessary, which can be ignored in the output as well.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- iterative (bool) – If this is set to True, will perform iterative sigma-clipping. If niterations is not set and this is True, sigma-clipping is iterated until no more points are removed.
- niterations (int) – The maximum number of iterations to perform for sigma-clipping. If None, the iterative arg takes precedence, and iterative=True will sigma-clip until no more points are removed. If niterations is not None and iterative is False, niterations takes precedence and iteration will occur for the specified number of iterations.
- meanormedian ({'mean', 'median'}) – Use ‘mean’ for sigma-clipping based on the mean value, or ‘median’ for sigma-clipping based on the median value. Default is ‘median’.
- magsarefluxes (bool) – True if your “mags” are in fact fluxes, i.e. if “fainter” corresponds to mags getting smaller.
Returns: (stimes, smags, serrs) – The sigma-clipped and nan-stripped time-series.
Return type: tuple
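The asymmetric case is the one most easily gotten wrong, so here is a small Numpy sketch of it. It is not the astrobase implementation: the name asymmetric_sigclip is hypothetical, and the scatter estimate (1.483 times the median absolute deviation around the median) is an assumption chosen for robustness.

```python
import numpy as np


def asymmetric_sigclip(times, mags, errs, sigclip=(10.0, 3.0),
                       magsarefluxes=False):
    """Clip dimmings and brightenings at different sigma thresholds."""
    finite = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)
    t, m, e = times[finite], mags[finite], errs[finite]
    dim_sigma, bright_sigma = sigclip
    median = np.median(m)
    stdev = 1.483 * np.median(np.abs(m - median))  # robust scatter (assumed)
    dev = m - median
    if magsarefluxes:
        # fluxes: fainter means smaller values
        keep = (dev >= -dim_sigma * stdev) & (dev <= bright_sigma * stdev)
    else:
        # magnitudes: fainter means larger values
        keep = (dev <= dim_sigma * stdev) & (dev >= -bright_sigma * stdev)
    return t[keep], m[keep], e[keep]


idx = np.arange(200)
times = idx.astype(float)
flux = 1.0 + 0.01 * np.sin(idx)
errs = np.full(200, 0.005)
flux[10] = np.nan    # non-finite point, always removed
flux[50] += 0.05     # brightening beyond 3 sigma, removed
flux[120] -= 0.05    # dimming well within 10 sigma, kept

stimes, sflux, serrs = asymmetric_sigclip(
    times, flux, errs, sigclip=(10.0, 3.0), magsarefluxes=True
)
```

With magsarefluxes=False the same sigclip pair would instead treat larger values as dimmings, which is why the kwarg has to match the data.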
-
astrobase.lcmath.
sigclip_magseries_with_extparams
(times, mags, errs, extparams, sigclip=None, iterative=False, magsarefluxes=False)[source]¶ Sigma-clips a magnitude or flux time-series and associated measurement arrays.
Selects the finite times, magnitudes (or fluxes), and errors from the passed values, and applies symmetric or asymmetric sigma clipping to them. Uses the same array indices as these values to filter out the values of all arrays in the extparams list. This can be useful for simultaneously sigma-clipping a magnitude/flux time-series along with their associated values of external parameters, such as telescope hour angle, zenith distance, temperature, moon phase, etc.
Parameters: - times,mags,errs (np.array) –
The magnitude or flux time-series arrays to sigma-clip. The values are not assumed to be all finite, nor to be all positive or all negative. All of these arrays will have their non-finite elements removed, and then will be sigma-clipped based on the arguments to this function.
errs is optional. Set it to None if you don’t have values for these. A ‘faked’ errs array will be generated if necessary, which can be ignored in the output as well.
- extparams (list of np.array) – This is a list of all external parameter arrays to simultaneously filter along with the magnitude/flux time-series. All of these arrays should have the same length as the times, mags, and errs arrays.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- iterative (bool) – If this is set to True, will perform iterative sigma-clipping until no more points are removed.
- magsarefluxes (bool) – True if your “mags” are in fact fluxes, i.e. if “fainter” corresponds to mags getting smaller.
Returns: (stimes, smags, serrs, sextparams) – The sigma-clipped and nan-stripped time-series in stimes, smags, serrs and the associated values of the extparams in sextparams.
Return type: tuple
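The key idea here is that one index mask, derived from the mags alone, is reused for every external-parameter array. A minimal sketch of that idea, using a symmetric clip with a MAD-based scatter estimate (an assumption) and a hypothetical function name:

```python
import numpy as np


def clip_with_extparams(times, mags, errs, extparams, sigclip=3.0):
    """Symmetric sigma-clip that filters extparams with the same mask."""
    finite = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)
    median = np.median(mags[finite])
    stdev = 1.483 * np.median(np.abs(mags[finite] - median))
    # one boolean mask, built from the mags only, reused for every array
    with np.errstate(invalid='ignore'):
        keep = finite & (np.abs(mags - median) <= sigclip * stdev)
    sextparams = [np.asarray(x)[keep] for x in extparams]
    return times[keep], mags[keep], errs[keep], sextparams


times = np.arange(8.0)
mags = np.array([12.0, 12.01, 11.99, 12.02, 13.5, 11.98, 12.0, np.nan])
errs = np.full(8, 0.01)
airmass = np.linspace(1.0, 1.7, 8)  # illustrative external parameter

stimes, smags, serrs, sextparams = clip_with_extparams(
    times, mags, errs, [airmass]
)
```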
-
astrobase.lcmath.
phase_magseries
(times, mags, period, epoch, wrap=True, sort=True)[source]¶ Phases a magnitude/flux time-series using a given period and epoch.
The equation used is:
phase = (times - epoch)/period - floor((times - epoch)/period)
This phases the given magnitude timeseries using the given period and epoch. If wrap is True, wraps the result around 0.0 (and returns an array that has twice the number of the original elements). If sort is True, returns the magnitude timeseries in phase sorted order.
Parameters: - times,mags (np.array) – The magnitude/flux time-series values to phase using the provided period and epoch. Non-finite values will be removed.
- period (float) – The period to use to phase the time-series.
- epoch (float) – The epoch to phase the time-series. This is usually the time-of-minimum or time-of-maximum of some periodic light curve phenomenon. Alternatively, one can use the minimum time value in times.
- wrap (bool) – If this is True, the returned phased time-series will be wrapped around phase 0.0, which is useful for plotting purposes. The arrays returned will have twice the number of input elements because of this wrapping.
- sort (bool) – If this is True, the returned phased time-series will be sorted in increasing phase order.
Returns: A dict of the following form is returned:
{'phase': the phase values,
 'mags': the mags/flux values at each phase,
 'period': the input `period` used to phase the time-series,
 'epoch': the input `epoch` used to phase the time-series}
Return type: dict
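The phasing equation and the effect of wrap can be sketched with Numpy. The wrapping convention used here (prepending a copy of the phases shifted by one cycle, so the output spans [-1, 1)) is an assumption consistent with the doubled output length described above. phase_series is a hypothetical name, not the astrobase function.

```python
import numpy as np


def phase_series(times, mags, period, epoch, wrap=True, sort=True):
    """Phase a time-series: phase = frac((times - epoch)/period)."""
    finite = np.isfinite(times) & np.isfinite(mags)
    t, m = times[finite], mags[finite]
    cycles = (t - epoch) / period
    phase = cycles - np.floor(cycles)
    if sort:
        order = np.argsort(phase)
        phase, m = phase[order], m[order]
    if wrap:
        # prepend a copy shifted back by one cycle (assumed convention)
        phase = np.concatenate((phase - 1.0, phase))
        m = np.concatenate((m, m))
    return {'phase': phase, 'mags': m, 'period': period, 'epoch': epoch}


times = np.array([0.0, 0.25, 1.5, 2.75, np.nan])
mags = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
phased = phase_series(times, mags, period=1.0, epoch=0.0)
```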
-
astrobase.lcmath.
phase_magseries_with_errs
(times, mags, errs, period, epoch, wrap=True, sort=True)[source]¶ Phases a magnitude/flux time-series using a given period and epoch.
The equation used is:
phase = (times - epoch)/period - floor((times - epoch)/period)
This phases the given magnitude timeseries using the given period and epoch. If wrap is True, wraps the result around 0.0 (and returns an array that has twice the number of the original elements). If sort is True, returns the magnitude timeseries in phase sorted order.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series values and associated measurement errors to phase using the provided period and epoch. Non-finite values will be removed.
- period (float) – The period to use to phase the time-series.
- epoch (float) – The epoch to phase the time-series. This is usually the time-of-minimum or time-of-maximum of some periodic light curve phenomenon. Alternatively, one can use the minimum time value in times.
- wrap (bool) – If this is True, the returned phased time-series will be wrapped around phase 0.0, which is useful for plotting purposes. The arrays returned will have twice the number of input elements because of this wrapping.
- sort (bool) – If this is True, the returned phased time-series will be sorted in increasing phase order.
Returns: A dict of the following form is returned:
{'phase': the phase values,
 'mags': the mags/flux values at each phase,
 'errs': the err values at each phase,
 'period': the input `period` used to phase the time-series,
 'epoch': the input `epoch` used to phase the time-series}
Return type: dict
-
astrobase.lcmath.
time_bin_magseries
(times, mags, binsize=540.0, minbinelems=7)[source]¶ Bins the given mag/flux time-series in time using the bin size given.
Parameters: - times,mags (np.array) – The magnitude/flux time-series to bin in time. Non-finite elements will be removed from these arrays. At least 10 elements in each array are required for this function to operate.
- binsize (float) – The bin size to use to group together measurements closer than this amount in time. This is in seconds.
- minbinelems (int) – The minimum number of elements required per bin to include it in the output.
Returns: A dict of the following form is returned:
{'jdbin_indices': a list of the index arrays into the nan-filtered input arrays per each bin,
 'jdbins': list of bin boundaries for each bin,
 'nbins': the number of bins generated,
 'binnedtimes': the time values associated with each time bin; this is the median of the times in each bin,
 'binnedmags': the mag/flux values associated with each time bin; this is the median of the mags/fluxes in each bin}
Return type: dict
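A simplified sketch of median time-binning follows. It assumes the times are in days (e.g. some form of JD) while binsize is in seconds, and it assigns points to fixed-width bins counted from the earliest time. The binning scheme and the name time_bin are assumptions for illustration and may differ from what astrobase does internally.

```python
import numpy as np


def time_bin(times, mags, binsize=540.0, minbinelems=7):
    """Median-bin a time-series; times in days, binsize in seconds."""
    finite = np.isfinite(times) & np.isfinite(mags)
    t, m = times[finite], mags[finite]
    binsize_days = binsize / 86400.0
    # integer bin index of every point, counted from the earliest time
    binidx = np.floor((t - t.min()) / binsize_days).astype(int)
    binnedtimes, binnedmags = [], []
    for b in np.unique(binidx):
        inbin = binidx == b
        if inbin.sum() >= minbinelems:  # skip sparsely populated bins
            binnedtimes.append(np.median(t[inbin]))
            binnedmags.append(np.median(m[inbin]))
    return np.array(binnedtimes), np.array(binnedmags)


# 90 points at 60-second cadence, plus 3 isolated later points
cadence_days = 60.0 / 86400.0
times = np.concatenate((np.arange(90), [200, 201, 202])) * cadence_days
mags = np.ones(times.size)

binnedtimes, binnedmags = time_bin(times, mags)
```

The three isolated points land in a bin with fewer than minbinelems members, so that bin is left out of the output.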
-
astrobase.lcmath.
time_bin_magseries_with_errs
(times, mags, errs, binsize=540.0, minbinelems=7)[source]¶ Bins the given mag/flux time-series in time using the bin size given.
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series and associated measurement errors to bin in time. Non-finite elements will be removed from these arrays. At least 10 elements in each array are required for this function to operate.
- binsize (float) – The bin size to use to group together measurements closer than this amount in time. This is in seconds.
- minbinelems (int) – The minimum number of elements required per bin to include it in the output.
Returns: A dict of the following form is returned:
{'jdbin_indices': a list of the index arrays into the nan-filtered input arrays per each bin,
 'jdbins': list of bin boundaries for each bin,
 'nbins': the number of bins generated,
 'binnedtimes': the time values associated with each time bin; this is the median of the times in each bin,
 'binnedmags': the mag/flux values associated with each time bin; this is the median of the mags/fluxes in each bin,
 'binnederrs': the err values associated with each time bin; this is the median of the errs in each bin}
Return type: dict
-
astrobase.lcmath.
phase_bin_magseries
(phases, mags, binsize=0.005, minbinelems=7)[source]¶ Bins a phased magnitude/flux time-series using the bin size provided.
Parameters: - phases,mags (np.array) – The phased magnitude/flux time-series to bin in phase. Non-finite elements will be removed from these arrays. At least 10 elements in each array are required for this function to operate.
- binsize (float) – The bin size to use to group together measurements closer than this amount in phase. This is in units of phase.
- minbinelems (int) – The minimum number of elements required per bin to include it in the output.
Returns: A dict of the following form is returned:
{'phasebin_indices': a list of the index arrays into the nan-filtered input arrays per each bin,
 'phasebins': list of bin boundaries for each bin,
 'nbins': the number of bins generated,
 'binnedphases': the phase values associated with each phase bin; this is the median of the phase value in each bin,
 'binnedmags': the mag/flux values associated with each phase bin; this is the median of the mags/fluxes in each bin}
Return type: dict
-
astrobase.lcmath.
phase_bin_magseries_with_errs
(phases, mags, errs, binsize=0.005, minbinelems=7, weights=None)[source]¶ Bins a phased magnitude/flux time-series using the bin size provided.
Parameters: - phases,mags,errs (np.array) – The phased magnitude/flux time-series and associated errs to bin in phase. Non-finite elements will be removed from these arrays. At least 10 elements in each array are required for this function to operate.
- binsize (float) – The bin size to use to group together measurements closer than this amount in phase. This is in units of phase.
- minbinelems (int) – The minimum number of elements required per bin to include it in the output.
- weights (np.array or None) – Optional weight vector to be applied during binning. If it is passed, np.average is used to bin, rather than np.median. A good choice would be to pass weights=1/errs**2, to weight by the inverse variance.
Returns: A dict of the following form is returned:
{'phasebin_indices': a list of the index arrays into the nan-filtered input arrays per each bin,
 'phasebins': list of bin boundaries for each bin,
 'nbins': the number of bins generated,
 'binnedphases': the phase values associated with each phase bin; this is the median of the phase value in each bin,
 'binnedmags': the mag/flux values associated with each phase bin; this is the median of the mags/fluxes in each bin,
 'binnederrs': the err values associated with each phase bin; this is the median of the errs in each bin}
Return type: dict
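The difference between median binning and inverse-variance weighted binning shows up clearly in a single bin that contains one noisy measurement. The sketch below only handles the mags within one bin; the helper name combine_bin and the numbers are illustrative.

```python
import numpy as np


def combine_bin(mags, weights=None):
    """Combine the measurements that fall in a single phase bin."""
    if weights is None:
        return np.median(mags)
    return np.average(mags, weights=weights)


mags = np.array([10.0, 10.2, 11.0])
errs = np.array([0.01, 0.01, 0.10])  # the last point is much noisier

unweighted = combine_bin(mags)
weighted = combine_bin(mags, weights=1.0 / errs**2)
```

The weighted value stays close to the two precise measurements, because the noisy point carries a weight 100 times smaller than the others.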
-
astrobase.lcmath.
fill_magseries_gaps
(times, mags, errs, fillgaps=0.0, sigclip=3.0, magsarefluxes=False, filterwindow=11, forcetimebin=None, verbose=True)[source]¶ This fills in gaps in a light curve.
This is mainly intended for use in ACF period-finding, but may be useful otherwise (i.e. when we figure out ARMA stuff for LCs). The main steps here are:
- normalize the light curve to zero
- remove giant outliers
- interpolate gaps in the light curve (since ACF requires evenly spaced sampling)
From McQuillan+ 2013a (https://doi.org/10.1093/mnras/stt536):
“The ACF calculation requires the light curves to be regularly sampled and normalized to zero. We divided the flux in each quarter by its median and subtracted unity. Gaps in the light curve longer than the Kepler long cadence were filled using linear interpolation with added white Gaussian noise. This noise level was estimated using the variance of the residuals following subtraction of a smoothed version of the flux. To smooth the flux, we applied an iterative non-linear filter which consists of a median filter followed by a boxcar filter, both with 11-point windows, with iterative 3σ clipping of outliers.”
Parameters: - times,mags,errs (np.array) – The magnitude/flux time-series and associated measurement errors to operate on. Non-finite elements will be removed from these arrays. At least 10 elements in each array are required for this function to operate.
- fillgaps ({'noiselevel', 'nan'} or float) – If fillgaps=’noiselevel’, fills the gaps with the noise level obtained via the procedure above. If fillgaps=’nan’, fills the gaps with np.nan. Otherwise, if fillgaps is a float, will use that value to fill the gaps. The default is to fill the gaps with 0.0 (as in McQuillan+ 2014) to “…prevent them contributing to the ACF”.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – True if your “mags” are in fact fluxes, i.e. if “fainter” corresponds to mags getting smaller.
- filterwindow (int) – The number of time-series points to include in the Savitzky-Golay filter operation when smoothing the light curve. This should be an odd integer.
- forcetimebin (float or None) – If forcetimebin is a float, this value will be used to generate the interpolated time series, effectively binning the light curve to this cadence. If forcetimebin is None, the mode of the gaps (the forward difference between successive time values in times) in the provided light curve will be used as the effective cadence. NOTE: forcetimebin must be in the same units as times, e.g. if times are JD then forcetimebin must be in days as well.
- verbose (bool) – If this is True, will indicate progress at various stages in the operation.
Returns: A dict of the following form is returned:
{'itimes': the interpolated time values after gap-filling,
 'imags': the interpolated mag/flux values after gap-filling,
 'ierrs': the interpolated mag/flux errors after gap-filling,
 'cadence': the cadence of the output mag/flux time-series}
Return type: dict
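The steps listed above (normalize to zero, find the cadence, fill gaps on an evenly spaced grid) can be sketched with NumPy alone. This is a minimal illustration and not the astrobase implementation: the name fill_gaps_sketch is hypothetical, times are assumed to be sorted, and sigma-clipping, smoothing, and error handling are left out.

```python
import numpy as np

def fill_gaps_sketch(times, mags, fillvalue=0.0):
    """Illustrative gap-filling on a regular grid (not astrobase's code)."""
    # keep only finite measurements
    finite = np.isfinite(times) & np.isfinite(mags)
    times, mags = times[finite], mags[finite]

    # normalize to zero by subtracting the median
    nmags = mags - np.median(mags)

    # cadence = most common forward difference between successive times
    diffs = np.round(np.diff(times), 6)
    values, counts = np.unique(diffs, return_counts=True)
    cadence = values[np.argmax(counts)]

    # build an evenly spaced grid and drop each measurement into its slot;
    # slots with no measurement keep the fill value
    nslots = int(np.round((times[-1] - times[0]) / cadence)) + 1
    itimes = times[0] + cadence * np.arange(nslots)
    imags = np.full(nslots, fillvalue, dtype=float)
    slots = np.round((times - times[0]) / cadence).astype(int)
    imags[slots] = nmags

    return {'itimes': itimes, 'imags': imags, 'cadence': cadence}

# hypothetical light curve with three missing points between t=3 and t=7
times = np.array([0.0, 1.0, 2.0, 3.0, 7.0, 8.0, 9.0])
mags = np.array([10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0])
filled = fill_gaps_sketch(times, mags)
```

Filling with 0.0 after normalization means the filled points add nothing to the sums that make up the ACF, which is the motivation quoted for the default.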
astrobase.lcmodels package¶
This contains various light curve models for variable stars. Useful for first order fits to distinguish between variable types, and for generating these variables’ light curves for a recovery simulation.
- astrobase.lcmodels.transits: trapezoid-shaped planetary transit light curves.
- astrobase.lcmodels.eclipses: double inverted-gaussian shaped eclipsing binary light curves.
- astrobase.lcmodels.flares: stellar flare model from Pitkin+ 2014.
- astrobase.lcmodels.sinusoidal: sinusoidal light curve generation for pulsating variables.
Submodules¶
astrobase.lcmodels.eclipses module¶
This contains a double gaussian model for first order modeling of eclipsing binaries.
-
astrobase.lcmodels.eclipses.
invgauss_eclipses_func
(ebparams, times, mags, errs)[source]¶ This returns a double eclipse shaped function.
Suitable for first order modeling of eclipsing binaries.
Parameters: - ebparams (list of float) –
This contains the parameters for the eclipsing binary:
ebparams = [period (time), epoch (time), pdepth: primary eclipse depth (mags), pduration: primary eclipse duration (phase), psdepthratio: primary-secondary eclipse depth ratio, secondaryphase: center phase of the secondary eclipse]
period is the period in days.
epoch is the time of minimum in JD.
pdepth is the depth of the primary eclipse.
- for magnitudes -> pdepth should be < 0
- for fluxes -> pdepth should be > 0
pduration is the length of the primary eclipse in phase.
psdepthratio is the ratio in the eclipse depths: depth_secondary/depth_primary. This is generally the same as the ratio of the T_effs of the two stars.
secondaryphase is the phase at which the minimum of the secondary eclipse is located. This effectively parameterizes eccentricity.
All of these will then have fitted values after the fit is done.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the eclipse model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: (modelmags, phase, ptimes, pmags, perrs) – Returns the model mags and phase values. Also returns the input times, mags, and errs sorted by the model’s phase.
Return type: tuple
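To make the parameterization concrete, here is a rough sketch of a double inverted-Gaussian eclipse curve evaluated in phase. It is not the library's code: double_invgauss_sketch is a hypothetical name, and the choice of one Gaussian sigma as a fifth of the eclipse duration is an assumption made for this example.

```python
import numpy as np

def double_invgauss_sketch(times, period, epoch, pdepth, pduration,
                           psdepthratio, secondaryphase, zerolevel=0.0):
    """Illustrative two-Gaussian eclipse model (not astrobase's code)."""
    # phase in [0, 1), with the primary eclipse centered on phase 0
    phase = ((times - epoch) / period) % 1.0

    # assumption: the eclipse duration spans about five Gaussian sigmas
    std = pduration / 5.0

    def dip(center, depth):
        # wrapped phase distance, so a dip at phase 0 also covers phase ~1
        dphase = np.abs(phase - center)
        dphase = np.minimum(dphase, 1.0 - dphase)
        return depth * np.exp(-0.5 * (dphase / std) ** 2)

    # subtract both dips from the out-of-eclipse level; a positive pdepth
    # lowers fluxes, a negative pdepth raises (i.e. dims) magnitudes
    return (zerolevel
            - dip(0.0, pdepth)
            - dip(secondaryphase, pdepth * psdepthratio))

times = np.linspace(0.0, 2.0, 201)   # two cycles of a 1-day period
model = double_invgauss_sketch(times, period=1.0, epoch=0.0, pdepth=0.2,
                               pduration=0.1, psdepthratio=0.5,
                               secondaryphase=0.5, zerolevel=1.0)
```

Moving secondaryphase away from 0.5 shifts the secondary dip relative to the primary, which is how the model mimics an eccentric orbit.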
-
astrobase.lcmodels.eclipses.
invgauss_eclipses_curvefit_func
(times, period, epoch, pdepth, pduration, psdepthratio, secondaryphase, zerolevel=0.0, fixed_params=None)[source]¶ This is the inv-gauss eclipses function used with scipy.optimize.curve_fit.
Parameters: - times (np.array) – The array of times at which the model will be evaluated.
- period (float) – The period of the eclipsing binary.
- epoch (float) – The mid eclipse time of the primary eclipse. In the same units as times.
- pdepth (float) – The depth of the primary eclipse.
- pduration (float) – The duration of the primary eclipse. In units of phase.
- psdepthratio (float) – The ratio between the depths of the primary and secondary eclipse.
- secondaryphase (float) – The phase of the secondary eclipse.
- zerolevel (float) – The out of eclipse value of the model.
- fixed_params (dict or None) –
If this is provided, must be a dict containing the parameters to fix and their values. Should be of the form below:
{'period': fixed value, 'epoch': fixed value, 'pdepth': fixed value, 'pduration': fixed value, 'psdepthratio': fixed value, 'secondaryphase': fixed value}
Any parameter in the dict provided will have its parameter fixed to the provided value. This is best done with an application of functools.partial before passing the function to the scipy.optimize.curve_fit function, e.g.:
curvefit_func = functools.partial(
    eclipses.invgauss_eclipses_curvefit_func,
    zerolevel=np.median(mags),
    fixed_params={'secondaryphase':0.5}
)
fit_params, fit_cov = scipy.optimize.curve_fit(
    curvefit_func, times, mags,
    p0=initial_params, sigma=errs, ...
)
Returns: model – Returns the eclipse model as an np.array. This is in the same order as the times input array.
Return type: np.array
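The fixed_params pattern can be illustrated without astrobase or scipy. In the sketch below, toy_curvefit_func and its parameters are hypothetical stand-ins for a model function; the point is only that functools.partial binds the keyword arguments once and leaves the (times, *params) call signature that an optimizer such as curve_fit would use.

```python
import functools
import numpy as np

def toy_curvefit_func(times, period, amplitude, zerolevel=0.0,
                      fixed_params=None):
    """Hypothetical model that honours a fixed_params dict."""
    # values in fixed_params win over whatever the optimizer proposes
    if fixed_params is not None:
        period = fixed_params.get('period', period)
        amplitude = fixed_params.get('amplitude', amplitude)
    return zerolevel + amplitude * np.cos(2.0 * np.pi * times / period)

# bind the keyword arguments once; positional parameters stay free
func = functools.partial(toy_curvefit_func, zerolevel=1.0,
                         fixed_params={'period': 2.0})

times = np.array([0.0, 0.5, 1.0])
# the period argument (5.0) is ignored because the period is fixed at 2.0
model = func(times, 5.0, 0.1)
```

A fixed parameter still occupies a slot in the initial-guess list; the optimizer may vary it, but the model ignores the proposed value.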
-
astrobase.lcmodels.eclipses.
invgauss_eclipses_residual
(ebparams, times, mags, errs)[source]¶ This returns the residual between the modelmags and the actual mags.
Parameters: - ebparams (list of float) –
This contains the parameters for the eclipsing binary:
ebparams = [period (time), epoch (time), pdepth: primary eclipse depth (mags), pduration: primary eclipse duration (phase), psdepthratio: primary-secondary eclipse depth ratio, secondaryphase: center phase of the secondary eclipse]
period is the period in days.
epoch is the time of minimum in JD.
pdepth is the depth of the primary eclipse.
- for magnitudes -> pdepth should be < 0
- for fluxes -> pdepth should be > 0
pduration is the length of the primary eclipse in phase.
psdepthratio is the ratio in the eclipse depths: depth_secondary/depth_primary. This is generally the same as the ratio of the T_effs of the two stars.
secondaryphase is the phase at which the minimum of the secondary eclipse is located. This effectively parameterizes eccentricity.
All of these will then have fitted values after the fit is done.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the eclipse model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: The residuals between the input mags and generated modelmags, weighted by the measurement errors in errs.
Return type: np.array
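An error-weighted residual of the kind described here is commonly computed as (data - model) / error, so that the sum of its squares is the chi-squared of the fit. The sketch below assumes that form; weighted_residual_sketch is a hypothetical name and the arrays are made up for illustration.

```python
import numpy as np

def weighted_residual_sketch(mags, modelmags, errs):
    """Residuals in units of the per-point measurement error."""
    return (mags - modelmags) / errs

mags = np.array([10.0, 10.5, 10.0])
modelmags = np.array([10.0, 10.3, 10.1])
errs = np.array([0.1, 0.1, 0.05])

resid = weighted_residual_sketch(mags, modelmags, errs)
# the sum of squared weighted residuals is the chi-squared of the model
chisq = np.sum(resid * resid)
```

Least-squares routines that take a residual function minimize this sum of squares.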
astrobase.lcmodels.flares module¶
This contains a stellar flare model from Pitkin+ 2014.
http://adsabs.harvard.edu/abs/2014MNRAS.445.2268P
-
astrobase.lcmodels.flares.
flare_model
(flareparams, times, mags, errs)[source]¶ This is a flare model function, similar to Kowalski+ 2011.
From the paper by Pitkin+ 2014: http://adsabs.harvard.edu/abs/2014MNRAS.445.2268P
Parameters: - flareparams (list of float) –
This defines the flare model:
[amplitude, flare_peak_time, rise_gaussian_stdev, decay_time_constant]
where:
amplitude: the maximum flare amplitude in mags or flux. If flux, then amplitude should be positive. If mags, amplitude should be negative.
flare_peak_time: time at which the flare maximum happens.
rise_gaussian_stdev: the stdev of the gaussian describing the rise of the flare.
decay_time_constant: the time constant of the exponential fall of the flare.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags.
Returns: (modelmags, times, mags, errs) – Returns the model mags evaluated at the input time values. Also returns the input times, mags, and errs.
Return type: tuple
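The four flare parameters describe a Gaussian rise up to the peak time followed by an exponential decay. The following sketch shows one way to write such a profile. It is an illustration under that assumption, not the library's code, and flare_sketch and its zerolevel argument are hypothetical.

```python
import numpy as np

def flare_sketch(times, amplitude, flare_peak_time,
                 rise_gaussian_stdev, decay_time_constant, zerolevel=0.0):
    """Gaussian rise before the peak, exponential decay after it."""
    dt = times - flare_peak_time

    # Gaussian rise, used for times before the peak
    rise = np.exp(-0.5 * (dt / rise_gaussian_stdev) ** 2)

    # exponential decay, used from the peak onward; clamp dt at zero so
    # the unused pre-peak branch cannot overflow
    decay = np.exp(-np.maximum(dt, 0.0) / decay_time_constant)

    shape = np.where(dt < 0.0, rise, decay)
    return zerolevel + amplitude * shape

times = np.array([-1.0, 0.0, 1.0, 2.0])
# positive amplitude because these hypothetical values are fluxes
model = flare_sketch(times, amplitude=0.5, flare_peak_time=0.0,
                     rise_gaussian_stdev=0.5, decay_time_constant=1.0,
                     zerolevel=1.0)
```

For magnitudes the same function applies with a negative amplitude, since a flare makes the star brighter and therefore the magnitude smaller.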
-
astrobase.lcmodels.flares.
flare_model_residual
(flareparams, times, mags, errs)[source]¶ This returns the residual between model mags and the actual mags.
Parameters: - flareparams (list of float) –
This defines the flare model:
[amplitude, flare_peak_time, rise_gaussian_stdev, decay_time_constant]
where:
amplitude: the maximum flare amplitude in mags or flux. If flux, then amplitude should be positive. If mags, amplitude should be negative.
flare_peak_time: time at which the flare maximum happens.
rise_gaussian_stdev: the stdev of the gaussian describing the rise of the flare.
decay_time_constant: the time constant of the exponential fall of the flare.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags.
Returns: The residuals between the input mags and generated modelmags, weighted by the measurement errors in errs.
Return type: np.array
astrobase.lcmodels.sinusoidal module¶
This contains models for sinusoidal light curves generated using Fourier expansion.
-
astrobase.lcmodels.sinusoidal.
fourier_sinusoidal_func
(fourierparams, times, mags, errs)[source]¶ This generates a sinusoidal light curve using a Fourier cosine series.
Parameters: - fourierparams (list) –
This MUST be a list of the following form like so:
[period, epoch, [amplitude_1, amplitude_2, amplitude_3, ..., amplitude_X], [phase_1, phase_2, phase_3, ..., phase_X]]
where X is the Fourier order.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: (modelmags, phase, ptimes, pmags, perrs) – Returns the model mags and phase values. Also returns the input times, mags, and errs sorted by the model’s phase.
Return type: tuple
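As a rough illustration of the fourierparams layout, the sketch below evaluates a Fourier cosine series in phase and returns the result sorted by phase. The function name, the zerolevel argument, and the choice to number harmonics from 1 are assumptions for this example and may differ from the library's conventions.

```python
import numpy as np

def fourier_cosine_sketch(fourierparams, times, zerolevel=0.0):
    """Sum of cosine harmonics evaluated at the phase of each time."""
    period, epoch, amplitudes, phases = fourierparams

    # phase of each observation in [0, 1)
    phase = ((times - epoch) / period) % 1.0

    # add one cosine term per Fourier order (assumed to start at k = 1)
    model = np.full(times.shape, zerolevel, dtype=float)
    for k, (amp, pha) in enumerate(zip(amplitudes, phases), start=1):
        model += amp * np.cos(2.0 * np.pi * k * phase + pha)

    # return the model and phases sorted by phase
    order = np.argsort(phase)
    return model[order], phase[order]

times = np.array([0.75, 0.0, 0.5, 0.25])
fourierparams = [1.0, 0.0, [1.0], [0.0]]   # period, epoch, amps, phases
modelmags, phase = fourier_cosine_sketch(fourierparams, times)
```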
-
astrobase.lcmodels.sinusoidal.
fourier_curvefit_func
(times, period, *fourier_coeffs, zerolevel=0.0, epoch=None, fixed_period=None)[source]¶ This is a function to be used with scipy.optimize.curve_fit.
Parameters: - times (np.array) – An array of times at which the model will be evaluated.
- period (float) – The period of the sinusoidal variability.
- fourier_coeffs (float) – These should be the amplitudes and phases of the sinusoidal series sum. 2N coefficients are required for Fourier order = N. The first N coefficients will be used as the amplitudes and the second N coefficients will be used as the phases.
- zerolevel (float) – The base level of the model.
- epoch (float or None) – The epoch to use to generate the phased light curve. If None, the minimum value of the times array will be used.
- fixed_period (float or None) – If not None, will indicate that the period is to be held fixed at the provided value.
Returns: model – Returns the sinusoidal series sum model evaluated at each value of times.
Return type: np.array
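The layout of fourier_coeffs (N amplitudes followed by N phases) can be shown with a small helper. split_fourier_coeffs is a hypothetical name used only for this sketch.

```python
import numpy as np

def split_fourier_coeffs(fourier_coeffs):
    """Split a flat 2N-element sequence into N amplitudes and N phases."""
    coeffs = np.asarray(fourier_coeffs, dtype=float)
    if coeffs.size % 2 != 0:
        raise ValueError("need 2N coefficients for Fourier order N")
    order = coeffs.size // 2
    # first half: amplitudes, second half: phases
    return order, coeffs[:order], coeffs[order:]

# six coefficients describe a Fourier series of order 3
order, amps, phases = split_fourier_coeffs([0.3, 0.1, 0.05, 0.0, 1.5, 3.0])
```

A flat layout like this is what an optimizer that passes parameters positionally needs, which is why the amplitudes and phases are not nested lists here.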
-
astrobase.lcmodels.sinusoidal.
fourier_sinusoidal_residual
(fourierparams, times, mags, errs)[source]¶ This returns the residual between the model mags and the actual mags.
Parameters: - fourierparams (list) –
This MUST be a list of the following form like so:
[period, epoch, [amplitude_1, amplitude_2, amplitude_3, ..., amplitude_X], [phase_1, phase_2, phase_3, ..., phase_X]]
where X is the Fourier order.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: The residuals between the input mags and generated modelmags, weighted by the measurement errors in errs.
Return type: np.array
-
astrobase.lcmodels.sinusoidal.
sine_series_sum
(fourierparams, times, mags, errs)[source]¶ This generates a sinusoidal light curve using a Fourier sine series.
Parameters: - fourierparams (list) –
This MUST be a list of the following form like so:
[period, epoch, [amplitude_1, amplitude_2, amplitude_3, ..., amplitude_X], [phase_1, phase_2, phase_3, ..., phase_X]]
where X is the Fourier order.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: (modelmags, phase, ptimes, pmags, perrs) – Returns the model mags and phase values. Also returns the input times, mags, and errs sorted by the model’s phase.
Return type: tuple
astrobase.lcmodels.transits module¶
This contains a trapezoid model for first order modeling of planetary transit light curves.
-
astrobase.lcmodels.transits.
trapezoid_transit_func
(transitparams, times, mags, errs, get_ntransitpoints=False)[source]¶ This returns a trapezoid transit-shaped function.
Suitable for first order modeling of transit signals.
Parameters: - transitparams (list of float) –
This contains the transiting planet trapezoid model:
transitparams = [transitperiod (time), transitepoch (time), transitdepth (flux or mags), transitduration (phase), ingressduration (phase)]
All of these will then have fitted values after the fit is done.
- for magnitudes -> transitdepth should be < 0
- for fluxes -> transitdepth should be > 0
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the transit model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: (modelmags, phase, ptimes, pmags, perrs) – Returns the model mags and phase values. Also returns the input times, mags, and errs sorted by the model’s phase.
Return type: tuple
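A trapezoid transit has a flat bottom, linear ingress and egress ramps, and a flat out-of-transit level. The sketch below builds that shape from the documented parameters. It is an illustration only: trapezoid_sketch and zerolevel are hypothetical names, and the clipping trick is one of several equivalent ways to write the piecewise function.

```python
import numpy as np

def trapezoid_sketch(times, period, epoch, depth, duration,
                     ingressduration, zerolevel=0.0):
    """Trapezoid transit evaluated in phase units."""
    # phase in [0, 1), with mid-transit at phase 0
    phase = ((times - epoch) / period) % 1.0

    # distance from mid-transit, wrapping around phase 1 -> 0
    dist = np.minimum(phase, 1.0 - phase)

    # fraction of the full depth reached at each point:
    # 0 outside the transit, 1 on the flat bottom, linear on the ramps
    frac = np.clip((duration / 2.0 - dist) / ingressduration, 0.0, 1.0)

    # a positive depth lowers fluxes; a negative depth dims magnitudes
    return zerolevel - depth * frac

times = np.array([0.0, 0.04, 0.05, 0.5, 0.97])
model = trapezoid_sketch(times, period=1.0, epoch=0.0, depth=0.01,
                         duration=0.1, ingressduration=0.02, zerolevel=1.0)
```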
-
astrobase.lcmodels.transits.
trapezoid_transit_curvefit_func
(times, period, epoch, depth, duration, ingressduration, zerolevel=0.0, fixed_params=None)[source]¶ This is the function used for scipy.optimize.curve_fit.
Parameters: - times (np.array) – The array of times used to construct the transit model.
- period (float) – The period of the transit.
- epoch (float) – The time of mid-transit (phase 0.0). Must be in the same units as times.
- depth (float) – The depth of the transit.
- duration (float) – The duration of the transit in phase units.
- ingressduration (float) – The ingress duration of the transit in phase units.
- zerolevel (float) – The level of the measurements outside transit.
- fixed_params (dict or None) –
If this is provided, must be a dict containing the parameters to fix and their values. Should be of the form below:
{'period': fixed value, 'epoch': fixed value, 'depth': fixed value, 'duration': fixed value, 'ingressduration': fixed value}
Any parameter in the dict provided will have its parameter fixed to the provided value. This is best done with an application of functools.partial before passing the function to the scipy.optimize.curve_fit function, e.g.:
curvefit_func = functools.partial(
    transits.trapezoid_transit_curvefit_func,
    zerolevel=np.median(mags),
    fixed_params={'ingressduration':0.05}
)
fit_params, fit_cov = scipy.optimize.curve_fit(
    curvefit_func, times, mags,
    p0=initial_params, sigma=errs, ...
)
Returns: model – Returns the transit model as an np.array. This is in the same order as the times input array.
Return type: np.array
-
astrobase.lcmodels.transits.
trapezoid_transit_residual
(transitparams, times, mags, errs)[source]¶ This returns the residual between the modelmags and the actual mags.
Parameters: - transitparams (list of float) –
This contains the transiting planet trapezoid model:
transitparams = [transitperiod (time), transitepoch (time), transitdepth (flux or mags), transitduration (phase), ingressduration (phase)]
All of these will then have fitted values after the fit is done.
- for magnitudes -> transitdepth should be < 0
- for fluxes -> transitdepth should be > 0
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the transit model will be generated. The times will be used to generate model mags, and the input times, mags, and errs will be resorted by model phase and returned.
Returns: The residuals between the input mags and generated modelmags, weighted by the measurement errors in errs.
Return type: np.array
astrobase.varbase package¶
Contains functions to deal with light curve variability, fitting functions, masking signals, autocorrelation, etc.
- astrobase.varbase.autocorr: calculating the autocorrelation function of light curves.
- astrobase.varbase.signals: masking periodic signals, pre-whitening light curves.
- astrobase.varbase.transits: light curve tools specifically for planetary transits.
- astrobase.varbase.trends: tools for running External Parameter Decorrelation (EPD) on light curves.
FIXME: finish up the astrobase.varbase.flares module to find flares in LCs.
Submodules¶
astrobase.varbase.autocorr module¶
Calculates the autocorrelation for magnitude time series.
-
astrobase.varbase.autocorr.
_autocorr_func1
(mags, lag, maglen, magmed, magstd)[source]¶ Calculates the autocorr of mag series for specific lag.
This version of the function is taken from: Kim et al. (2011)
Parameters: - mags (np.array) – This is the magnitudes array. MUST NOT have any nans.
- lag (float) – The specific lag value to calculate the auto-correlation for. This MUST be less than total number of observations in mags.
- maglen (int) – The number of elements in the mags array.
- magmed (float) – The median of the mags array.
- magstd (float) – The standard deviation of the mags array.
Returns: The auto-correlation at this specific lag value.
Return type: float
-
astrobase.varbase.autocorr.
_autocorr_func2
(mags, lag, maglen, magmed, magstd)[source]¶ This is an alternative function to calculate the autocorrelation.
This version is from (first definition):
https://en.wikipedia.org/wiki/Correlogram#Estimation_of_autocorrelations
Parameters: - mags (np.array) – This is the magnitudes array. MUST NOT have any nans.
- lag (float) – The specific lag value to calculate the auto-correlation for. This MUST be less than total number of observations in mags.
- maglen (int) – The number of elements in the mags array.
- magmed (float) – The median of the mags array.
- magstd (float) – The standard deviation of the mags array.
Returns: The auto-correlation at this specific lag value.
Return type: float
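For reference, a single-lag estimator in the spirit of the first definition on the linked Wikipedia page can be written as below. This is a sketch, not the library's code: the function name is hypothetical, the argument order simply mirrors the signature documented above, and lag is assumed to be a non-negative integer number of samples.

```python
import numpy as np

def autocorr_at_lag_sketch(mags, lag, maglen, magmed, magstd):
    """Autocorrelation at one integer lag (illustrative estimator)."""
    # pair each point with the point `lag` samples later
    head = mags[:maglen - lag] - magmed
    tail = mags[lag:] - magmed
    # average the products and scale by the variance
    return np.sum(head * tail) / ((maglen - lag) * magstd * magstd)

# evenly sampled cosine with a period of 10 samples
t = np.arange(100)
mags = np.cos(2.0 * np.pi * t / 10.0)
args = (mags.size, np.median(mags), np.std(mags))

acf_full_period = autocorr_at_lag_sketch(mags, 10, *args)
acf_half_period = autocorr_at_lag_sketch(mags, 5, *args)
```

The value is close to +1 at a lag of one full period and close to -1 at half a period, which is the behaviour ACF period-finding relies on.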
-
astrobase.varbase.autocorr.
_autocorr_func3
(mags, lag, maglen, magmed, magstd)[source]¶ This is yet another alternative to calculate the autocorrelation.
Taken from: Bayesian Methods for Hackers by Cameron Davidson-Pilon
(This should be the fastest method to calculate ACFs.)
Parameters: - mags (np.array) – This is the magnitudes array. MUST NOT have any nans.
- lag (float) – The specific lag value to calculate the auto-correlation for. This MUST be less than total number of observations in mags.
- maglen (int) – The number of elements in the mags array.
- magmed (float) – The median of the mags array.
- magstd (float) – The standard deviation of the mags array.
Returns: The auto-correlation at this specific lag value.
Return type: float
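A common fast way to get the ACF at all lags in one call is np.correlate in 'full' mode, keeping only the non-negative lags. The sketch below shows that approach under the assumption of an evenly sampled series with no NaNs. acf_all_lags_sketch is a hypothetical name and the normalization by the zero-lag value is a choice made for this example.

```python
import numpy as np

def acf_all_lags_sketch(mags):
    """ACF at lags 0..N-1, normalized so that the zero-lag value is 1."""
    centered = mags - np.median(mags)
    # 'full' mode returns lags -(N-1)..(N-1); the middle element is lag 0
    full = np.correlate(centered, centered, mode='full')
    acf = full[full.size // 2:]
    return acf / acf[0]

# evenly sampled cosine with a period of 10 samples
t = np.arange(100)
mags = np.cos(2.0 * np.pi * t / 10.0)
acf = acf_all_lags_sketch(mags)
```

Because this estimator divides every lag by the same zero-lag sum, the peaks shrink as the lag grows and fewer point pairs overlap.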
-
astrobase.varbase.autocorr.
autocorr_magseries
(times, mags, errs, maxlags=1000, func=<function _autocorr_func3>, fillgaps=0.0, filterwindow=11, forcetimebin=None, sigclip=3.0, magsarefluxes=False, verbose=True)[source]¶ This calculates the ACF of a light curve.
This will pre-process the light curve to fill in all the gaps and normalize everything to zero. If fillgaps = ‘noiselevel’, fills the gaps with the noise level estimated from the smoothed light curve (see the fillgaps parameter below). If fillgaps = ‘nan’, fills the gaps with np.nan.
Parameters: - times,mags,errs (np.array) – The measurement time-series and associated errors.
- maxlags (int) – The maximum number of lags to calculate.
- func (Python function) – This is a function to calculate the lags.
- fillgaps ('noiselevel' or float) – This sets what to use to fill in gaps in the time series. If this is ‘noiselevel’, will smooth the light curve using a point window size of filterwindow (this should be an odd integer), subtract the smoothed LC from the actual LC and estimate the RMS. This RMS will be used to fill in the gaps. Other useful values here are 0.0, and np.nan.
- filterwindow (int) – The light curve’s smoothing filter window size to use if fillgaps=’noiselevel’.
- forcetimebin (None or float) – This is used to force a particular cadence in the light curve other than the automatically determined cadence. This effectively rebins the light curve to this cadence. This should be in the same time units as times.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If your input measurements in mags are actually fluxes instead of mags, set this to True.
- verbose (bool) – If True, will indicate progress and report errors.
Returns: A dict of the following form is returned:
{'itimes': the interpolated time values after gap-filling,
 'imags': the interpolated mag/flux values after gap-filling,
 'ierrs': the interpolated mag/flux errors after gap-filling,
 'cadence': the cadence of the output mag/flux time-series,
 'minitime': the minimum value of the interpolated times array,
 'lags': the lags used to calculate the auto-correlation function,
 'acf': the value of the ACF at each lag used}
Return type: dict
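The symmetric and asymmetric sigclip options described above can be sketched as follows. This is an illustration rather than the library's routine: sigclip_sketch is a hypothetical name, it makes a single pass, and using the median with a MAD-based standard deviation (1.483 x MAD) is an assumption of this example.

```python
import numpy as np

def sigclip_sketch(times, mags, errs, sigclip=3.0, magsarefluxes=False):
    """One-pass symmetric or asymmetric sigma-clip (illustrative)."""
    # always drop non-finite elements first
    finite = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)
    times, mags, errs = times[finite], mags[finite], errs[finite]
    if sigclip is None:
        return times, mags, errs

    # a single number clips both sides; a pair is (dimming, brightening)
    if isinstance(sigclip, (int, float)):
        dimsigma = brightsigma = float(sigclip)
    else:
        dimsigma, brightsigma = sigclip

    median = np.median(mags)
    stdev = 1.483 * np.median(np.abs(mags - median))

    # signed deviation, positive when the star is physically brighter:
    # larger flux is brighter, but a larger magnitude is fainter
    brighter = (mags - median) if magsarefluxes else (median - mags)

    keep = ((brighter <= brightsigma * stdev) &
            (brighter >= -dimsigma * stdev))
    return times[keep], mags[keep], errs[keep]

# hypothetical fluxes with one deep dimming (0.90) and one brightening (1.05)
fluxes = np.array([1.00, 1.01, 0.99, 1.00, 1.02, 0.98, 1.00, 0.90, 1.05])
times = np.arange(fluxes.size, dtype=float)
errs = np.full(fluxes.size, 0.01)
ctimes, cfluxes, cerrs = sigclip_sketch(times, fluxes, errs,
                                        sigclip=[10., 3.],
                                        magsarefluxes=True)
```

With sigclip=[10., 3.], the dimming survives because it is under 10 sigma, while the brightening is removed because it exceeds 3 sigma. This is the usual setting when eclipses or transits should be kept and flares or cosmic-ray hits discarded.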
astrobase.varbase.flares module¶
Contains functions to deal with finding stellar flares in time series.
FIXME: finish this module.
-
astrobase.varbase.flares.
add_flare_model
(flareparams, times, mags, errs)[source]¶ This adds a flare model function to the input magnitude/flux time-series.
Parameters: - flareparams (list of float) –
This defines the flare model:
[amplitude, flare_peak_time, rise_gaussian_stdev, decay_time_constant]
where:
amplitude: the maximum flare amplitude in mags or flux. If flux, then amplitude should be positive. If mags, amplitude should be negative.
flare_peak_time: time at which the flare maximum happens.
rise_gaussian_stdev: the stdev of the gaussian describing the rise of the flare.
decay_time_constant: the time constant of the exponential fall of the flare.
- times,mags,errs (np.array) – The input time-series of measurements and associated errors for which the model will be generated. The times will be used to generate model mags.
- magsarefluxes (bool) – Sets the correct direction of the flare amplitude (+ve) for fluxes if True and for mags (-ve) if False.
Returns: A dict of the form below is returned:
{'times': the original times array,
 'mags': the original mags + the flare model mags evaluated at times,
 'errs': the original errs array,
 'flareparams': the input list of flare params}
Return type: dict
-
astrobase.varbase.flares.
simple_flare_find
(times, mags, errs, smoothbinsize=97, flare_minsigma=4.0, flare_maxcadencediff=1, flare_mincadencepoints=3, magsarefluxes=False, savgol_polyorder=2, **savgol_kwargs)[source]¶ This finds flares in time series using the method in Walkowicz+ 2011.
FIXME: finish this.
Parameters: - times,mags,errs (np.array) – The input time-series to find flares in.
- smoothbinsize (int) – The number of consecutive light curve points to smooth over in the time series using a Savitzky-Golay filter. The smoothed light curve is then subtracted from the actual light curve to remove trends that potentially last smoothbinsize light curve points. The default value is chosen as ~6.5 hours (97 x 4 minute cadence for HATNet/HATSouth).
- flare_minsigma (float) – The minimum sigma above the median LC level to designate points as belonging to possible flares.
- flare_maxcadencediff (int) – The maximum number of light curve points apart each possible flare event measurement is allowed to be. If this is 1, then we’ll look for consecutive measurements.
- flare_mincadencepoints (int) – The minimum number of light curve points (each flare_maxcadencediff points apart) required that are at least flare_minsigma above the median light curve level to call an event a flare.
- magsarefluxes (bool) – If True, indicates that mags is actually an array of fluxes.
- savgol_polyorder (int) – The polynomial order of the function used by the Savitzky-Golay filter.
- savgol_kwargs (extra kwargs) – Any remaining keyword arguments are passed directly to the savgol_filter function from scipy.signal.
Returns: (nflares, flare_indices) – Returns the total number of flares found and their time-indices (start, end) as tuples.
Return type: tuple
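The roles of flare_minsigma, flare_maxcadencediff, and flare_mincadencepoints can be illustrated with a simplified detector that works on an already detrended flux series. This sketch skips the Savitzky-Golay smoothing step entirely and is not the library's implementation: find_runs_above_sketch is a hypothetical name and the MAD-based noise estimate is an assumption.

```python
import numpy as np

def find_runs_above_sketch(fluxes, minsigma=4.0, maxcadencediff=1,
                           mincadencepoints=3):
    """Find runs of points that sit well above the median flux level."""
    median = np.median(fluxes)
    stdev = 1.483 * np.median(np.abs(fluxes - median))

    # indices of points brighter than median + minsigma * stdev
    candidates = np.where(fluxes > median + minsigma * stdev)[0]
    if candidates.size == 0:
        return 0, []

    # start a new group wherever neighbouring candidate indices are
    # further apart than maxcadencediff
    breaks = np.where(np.diff(candidates) > maxcadencediff)[0] + 1
    groups = np.split(candidates, breaks)

    # keep groups with enough points; report (start, end) indices
    flares = [(int(g[0]), int(g[-1])) for g in groups
              if g.size >= mincadencepoints]
    return len(flares), flares

# hypothetical flux series: small wiggles, one 4-point flare, one lone spike
fluxes = 1.0 + 0.01 * np.sin(np.arange(50))
fluxes[20:24] += np.array([0.5, 0.4, 0.3, 0.2])
fluxes[40] += 0.5
nflares, flare_indices = find_runs_above_sketch(fluxes)
```

The lone spike at index 40 is rejected because a single point does not satisfy the minimum number of cadence points.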
astrobase.varbase.signals module¶
Contains functions to deal with masking and removing periodic signals in light curves.
-
astrobase.varbase.signals.
prewhiten_magseries
(times, mags, errs, whitenperiod, whitenparams, sigclip=3.0, magsarefluxes=False, plotfit=None, plotfitphasedlconly=True, rescaletomedian=True)[source]¶ Removes a periodic sinusoidal signal generated using whitenparams from the input magnitude time series.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to prewhiten.
- whitenperiod (float) – The period of the sinusoidal signal to remove.
- whitenparams (list of floats) –
This contains the Fourier amplitude and phase coefficients of the sinusoidal signal to remove:
[ampl_1, ampl_2, ampl_3, ..., ampl_X, pha_1, pha_2, pha_3, ..., pha_X]
where X is the Fourier order. These are usually the output of a previous Fourier fit to the light curve (from astrobase.lcfit.sinusoidal.fourier_fit_magseries() for example).
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, will treat the input values of mags as fluxes for purposes of plotting the fit and sig-clipping.
- plotfit (str or None) – If this is a string, this function will make a plot showing the effect of the pre-whitening on the mag/flux time-series and write the plot to the path specified here.
- plotfitphasedlconly (bool) – If True, will plot only the phased LC for showing the effect of pre-whitening, and skip plotting the unphased LC.
- rescaletomedian (bool) – If this is True, then we add back the constant median term of the magnitudes to the final pre-whitened mag series.
Returns: Returns a dict of the form:
{'wtimes':times array after pre-whitening, 'wphase':phase array after pre-whitening, 'wmags':mags array after pre-whitening, 'werrs':errs array after pre-whitening, 'whitenparams':the input pre-whitening params used, 'whitenperiod':the input pre-whitening period used, 'fitplotfile':the output plot file if plotfit was set}
Return type: dict
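Pre-whitening amounts to evaluating the Fourier model described by whitenperiod and whitenparams and subtracting it from the data. The sketch below illustrates the idea with NumPy only. It is not the library's code: the function name is hypothetical, the epoch is assumed to be the earliest time, harmonics are assumed to start at order 1, and sigma-clipping and plotting are omitted.

```python
import numpy as np

def prewhiten_sketch(times, mags, whitenperiod, whitenparams,
                     rescaletomedian=True):
    """Subtract a Fourier cosine-series signal from a light curve."""
    coeffs = np.asarray(whitenparams, dtype=float)
    order = coeffs.size // 2
    amps, phas = coeffs[:order], coeffs[order:]

    # assumption: phase is measured from the earliest time
    phase = ((times - times.min()) / whitenperiod) % 1.0

    # model = median level + sum of cosine harmonics
    median = np.median(mags)
    model = np.full(mags.shape, median, dtype=float)
    for k in range(1, order + 1):
        model += amps[k - 1] * np.cos(2.0 * np.pi * k * phase + phas[k - 1])

    # remove the model, then optionally restore the constant median term
    wmags = mags - model
    if rescaletomedian:
        wmags = wmags + median
    return {'wtimes': times, 'wphase': phase, 'wmags': wmags}

# hypothetical light curve: constant 12.0 mag plus a 2.5-day sinusoid
times = 0.1 * np.arange(100)
mags = 12.0 + 0.3 * np.cos(2.0 * np.pi * times / 2.5)
whitened = prewhiten_sketch(times, mags, 2.5, [0.3, 0.0])
```

With a perfect model the whitened series is flat at the original level. On real data the residuals are what a second period search would run on.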
-
astrobase.varbase.signals.
gls_prewhiten
(times, mags, errs, fourierorder=3, initfparams=None, startp_gls=None, endp_gls=None, stepsize=0.0001, autofreq=True, sigclip=30.0, magsarefluxes=False, nbestpeaks=5, nworkers=4, plotfits=None)[source]¶ Iterative pre-whitening of a magnitude series using the L-S periodogram.
This finds the best period, fits a fourier series with the best period, then whitens the time series with the best period, and repeats until nbestpeaks are done.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to iteratively pre-whiten.
- fourierorder (int) – The Fourier order of the sinusoidal signal to fit to the time-series and iteratively remove.
- initfparams (list or None) –
If this is provided, should be a list of Fourier amplitudes and phases in the following format:
[ampl_1, ampl_2, ampl_3, ..., ampl_X, pha_1, pha_2, pha_3, ..., pha_X]
where X is the Fourier order. These are usually the output of a previous Fourier fit to the light curve (from astrobase.lcfit.sinusoidal.fourier_fit_magseries() for example). You MUST provide ONE of fourierorder and initfparams, but not both. If both are provided or both are None, a sinusoidal signal of Fourier order 3 will be used by default.
- startp_gls,endp_gls – If these are provided, will serve as input to the Generalized Lomb-Scargle function that will attempt to find the best nbestpeaks periods in the time-series. These set the minimum and maximum period to search for in the time-series.
- stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
- autofreq (bool) – If this is True, the value of stepsize will be ignored and the astrobase.periodbase.get_frequency_grid() function will be used to generate a frequency grid based on startp and endp. If these are None as well, startp will be set to 0.1 and endp will be set to times.max() - times.min().
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
- nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
- nworkers (int) – The number of parallel workers to use when calculating the periodogram.
- plotfits (None or str) – If this is a str, should indicate the file to which a plot of the successive iterations of pre-whitening will be written to. This will contain a row of plots indicating the before/after states of the light curves for each round of pre-whitening.
Returns: (bestperiods, plotfile) – This returns a list of the best periods (with the “highest” peak in the periodogram) after each round of pre-whitening is done. If plotfits is a str, will also return the path to the generated plot file.
Return type: tuple
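The asymmetric sigclip convention described above can be illustrated with a short sketch. This is not Astrobase's own implementation: the helper name asymmetric_sigclip and the use of a median/MAD-based scatter estimate are assumptions made for illustration only.

```python
import numpy as np

def asymmetric_sigclip(mags, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Return a boolean mask that is True for points kept by the clip.

    Hypothetical helper: sigclip[0] applies to dimmings and sigclip[1]
    to brightenings, as in the parameter description.
    """
    mags = np.asarray(mags, dtype=float)
    finite = np.isfinite(mags)
    median = np.median(mags[finite])
    # assumed robust scatter estimate: 1.483 * MAD approximates a Gaussian sigma
    sigma = 1.483 * np.median(np.abs(mags[finite] - median))
    dim_sigma, bright_sigma = sigclip
    deviation = mags - median
    if magsarefluxes:
        # fluxes: a dimming is a value BELOW the median
        keep = (deviation > -dim_sigma * sigma) & (deviation < bright_sigma * sigma)
    else:
        # magnitudes: a dimming is a value ABOVE the median
        keep = (deviation < dim_sigma * sigma) & (deviation > -bright_sigma * sigma)
    return finite & keep

# the same numbers are clipped differently depending on magsarefluxes
values = np.array([1.0, 1.01, 0.99, 1.0, 1.02, 0.98, 1.0, 1.1, 0.9])
kept_as_flux = values[asymmetric_sigclip(values, magsarefluxes=True)]
kept_as_mags = values[asymmetric_sigclip(values, magsarefluxes=False)]
```

The point of the sketch is that the direction of "fainter" flips between magnitudes and fluxes, which is why magsarefluxes must match the input data.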
-
astrobase.varbase.signals.
mask_signal
(times, mags, errs, signalperiod, signalepoch, magsarefluxes=False, maskphases=(0, 0, 0.5, 1.0), maskphaselength=0.1, plotfit=None, plotfitphasedlconly=True, sigclip=30.0)[source]¶ This removes repeating signals in the magnitude time series.
Useful for masking planetary transit signals in light curves to search for other variability.
A small worked example of using this and prewhiten_magseries above:
https://github.com/waqasbhatti/astrobase/issues/77#issuecomment-463803558
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to run the masking on.
- signalperiod (float) – The period of the signal to mask.
- signalepoch (float) – The epoch of the signal to mask.
- magsarefluxes (bool) – Set to True if mags is actually an array of fluxes.
- maskphases (sequence of floats) – This defines which phase values will be masked. For each item in this sequence, this function will mask a length of phase given by maskphaselength centered on each maskphases value, and remove all LC points in these regions from the light curve.
- maskphaselength (float) – The length in phase to mask for each phase value provided in maskphases.
- plotfit (str or None) – If provided as a str, indicates the output plot file.
- plotfitphasedlconly (bool) – If True, will only plot the effect of masking the signal as requested on the phased LC. If False, will also plot the unphased LC.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
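The interaction between maskphases and maskphaselength can be sketched with plain Numpy. The function phase_mask below is a hypothetical helper written for this illustration, not the code used by mask_signal; it assumes phases are computed as ((times - epoch)/period) mod 1.

```python
import numpy as np

def phase_mask(times, period, epoch, maskphases=(0.0, 0.5, 1.0), maskphaselength=0.1):
    """Return a boolean array that is True for points to KEEP (hypothetical helper)."""
    times = np.asarray(times, dtype=float)
    # fold the times onto phase in [0, 1)
    phases = ((times - epoch) / period) % 1.0
    keep = np.ones(times.size, dtype=bool)
    half = maskphaselength / 2.0
    for center in maskphases:
        # drop everything within half a mask length of this phase value
        keep &= np.abs(phases - center) > half
    return keep

times = np.array([0.0, 0.25, 0.49, 0.75, 0.97])
keep = phase_mask(times, period=1.0, epoch=0.0)
masked_times = times[keep]
```

Under this folding convention, masking both phase 0 and phase 1.0 removes points on either side of the epoch, since points just before the epoch fold to phases just below 1.0.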
astrobase.varbase.transits module¶
Contains tools for analyzing transits.
-
astrobase.varbase.transits.
transit_duration_range
(period, min_radius_hint, max_radius_hint)[source]¶ This figures out the minimum and max transit duration (q) given a period and min/max stellar radius hints.
One can get stellar radii from various places:
- GAIA distances and luminosities
- the TESS input catalog
- isochrone fits
The equation used is:
q ~ 0.076 x R**(2/3) x P**(-2/3)
P = period in days
R = stellar radius in solar radii
Parameters: - period (float) – The orbital period of the transiting planet.
- min_radius_hint,max_radius_hint (float) – The minimum and maximum radii of the star the planet is orbiting around.
Returns: (min_transit_duration, max_transit_duration) – The returned tuple contains the minimum and maximum transit durations allowed for the orbital geometry of this planetary system. These can be used with the BLS period-search functions in
astrobase.periodbase.kbls
or astrobase.periodbase.abls
to refine the period-search to only physically possible transit durations.
Return type: tuple
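As a rough numerical check of the relation q ~ 0.076 x R**(2/3) x P**(-2/3), the sketch below evaluates it directly. The helper names are hypothetical, and treating q as the transit duration expressed as a fraction of the period (as BLS does) is an assumption of this example.

```python
def transit_duration_fraction(period, radius):
    """Evaluate q ~ 0.076 * R**(2/3) * P**(-2/3).

    period is in days, radius is in solar radii (hypothetical helper).
    """
    return 0.076 * radius ** (2.0 / 3.0) * period ** (-2.0 / 3.0)

def duration_range(period, min_radius_hint, max_radius_hint):
    """Return (min_q, max_q) for the given range of stellar radii."""
    return (transit_duration_fraction(period, min_radius_hint),
            transit_duration_fraction(period, max_radius_hint))

# a 3-day period around a star between 0.8 and 1.2 solar radii
min_q, max_q = duration_range(3.0, 0.8, 1.2)
```

Because q grows with stellar radius and shrinks with period, the smaller radius hint always gives the lower bound.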
-
astrobase.varbase.transits.
get_snr_of_dip
(times, mags, modeltimes, modelmags, atol_normalization=1e-08, indsforrms=None, magsarefluxes=False, verbose=True, transitdepth=None, npoints_in_transit=None)[source]¶ Calculate the total SNR of a transit assuming gaussian uncertainties.
modelmags gets interpolated onto the cadence of mags. The noise is calculated as the 1-sigma std deviation of the residual (see below).
Following Carter et al. 2009:
Q = sqrt( Γ T ) * δ / σ
for Q the total SNR of the transit in the r->0 limit, where:
r = Rp/Rstar,
T = transit duration,
δ = transit depth,
σ = RMS of the lightcurve in transit,
Γ = sampling rate
Thus Γ * T is roughly the number of points obtained during transit. (This doesn’t correctly account for the SNR during ingress/egress, but this is a second-order correction).
Note this is the same total SNR as described by e.g., Kovacs et al. 2002, their Equation 11.
NOTE: this only works with fluxes at the moment.
Parameters: - times,mags (np.array) – The input flux time-series to process.
- modeltimes,modelmags (np.array) – A transiting planet model, either from BLS, a trapezoid model, or a Mandel-Agol model.
- atol_normalization (float) – The absolute tolerance to which the median of the passed model fluxes must be equal to 1.
- indsforrms (np.array) – An array of bools of len(mags) used to select points for the RMS measurement. If not passed, the RMS of the entire passed timeseries is used as an approximation. Generally, it’s best to use out of transit points, so the RMS measurement is not model-dependent.
- magsarefluxes (bool) – Currently forced to be True because this function only works with fluxes.
- verbose (bool) – If True, indicates progress and warns about problems.
- transitdepth (float or None) – If the transit depth is known, pass it in here. Otherwise, it is calculated assuming OOT flux is 1.
- npoints_in_transit (int or None) – If the number of points in transit is known, pass it in here. Otherwise, the function will guess at this value.
Returns: (snr, transit_depth, noise) – The returned tuple contains the calculated SNR, transit depth, and noise of the residual lightcurve calculated using the relation described above.
Return type: tuple
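The relation Q = sqrt(Γ T) * δ / σ can be sketched as below, replacing Γ * T by the number of in-transit points. This is a simplified illustration, not the function's actual code: transit_snr is a hypothetical helper, it takes a ready-made boolean in-transit mask instead of a model, and it measures the noise from the out-of-transit points only.

```python
import numpy as np

def transit_snr(flux, in_transit, depth=None):
    """Return (snr, depth, noise) for a normalized flux series (hypothetical helper)."""
    flux = np.asarray(flux, dtype=float)
    in_transit = np.asarray(in_transit, dtype=bool)
    if depth is None:
        # assumes the out-of-transit flux is normalized to 1
        depth = 1.0 - np.median(flux[in_transit])
    # noise from out-of-transit points, so it does not depend on the model
    noise = np.std(flux[~in_transit])
    # Gamma * T is approximated by the number of in-transit points
    npoints = in_transit.sum()
    snr = np.sqrt(npoints) * depth / noise
    return snr, depth, noise

flux = np.concatenate([np.tile([1.001, 0.999], 50), np.full(25, 0.99)])
in_transit = np.concatenate([np.zeros(100, dtype=bool), np.ones(25, dtype=bool)])
snr, depth, noise = transit_snr(flux, in_transit)
```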
-
astrobase.varbase.transits.
estimate_achievable_tmid_precision
(snr, t_ingress_min=10, t_duration_hr=2.14)[source]¶ Using Carter et al. 2009’s estimate, calculate the theoretical optimal precision on mid-transit time measurement possible given a transit of a particular SNR.
The relation used is:
sigma_tc = Q^{-1} * T * sqrt(θ/2)
Q = SNR of the transit.
T = transit duration, which is 2.14 hours from discovery paper.
θ = τ/T = ratio of ingress to total duration ~= (few minutes [guess]) / 2.14 hours
Parameters: - snr (float) – The measured signal-to-noise of the transit, e.g. from
astrobase.periodbase.kbls.bls_stats_singleperiod()
or from running the .compute_stats() method on an Astropy BoxLeastSquares object. - t_ingress_min (float) – The ingress duration in minutes. This is t_I to t_II in Winn (2010) nomenclature.
- t_duration_hr (float) – The transit duration in hours. This is t_I to t_IV in Winn (2010) nomenclature.
Returns: Returns the precision achievable for transit-center time as calculated from the relation above. This is in days.
Return type: float
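A minimal sketch of the relation sigma_tc = Q^{-1} * T * sqrt(θ/2), with the unit conversions written out. The helper name tmid_precision_days is hypothetical; it only assumes what the parameter descriptions state, namely that the ingress time is given in minutes, the duration in hours, and the result in days.

```python
import math

def tmid_precision_days(snr, t_ingress_min=10.0, t_duration_hr=2.14):
    """Evaluate sigma_tc = (1/Q) * T * sqrt(theta/2), returned in days."""
    t_ingress_day = t_ingress_min / 60.0 / 24.0   # minutes -> days
    t_duration_day = t_duration_hr / 24.0         # hours -> days
    theta = t_ingress_day / t_duration_day        # ratio of ingress to total duration
    return (1.0 / snr) * t_duration_day * math.sqrt(theta / 2.0)

# precision in minutes for an SNR = 20 transit with the default durations
precision_minutes = tmid_precision_days(20.0) * 24.0 * 60.0
```

The precision scales inversely with SNR, so doubling the SNR halves the achievable uncertainty.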
-
astrobase.varbase.transits.
get_transit_times
(blsd, time, extra_maskfrac, trapd=None, nperiodint=1000)[source]¶ Given a BLS period, epoch, and transit ingress/egress points (usually from
astrobase.periodbase.kbls.bls_stats_singleperiod()
), return the times within transit durations + extra_maskfrac of each transit.
Optionally, can use the (more accurate) trapezoidal fit period and epoch, if it’s passed. Useful for inspecting individual transits, and masking them out if desired.
Parameters: - blsd (dict) – This is the dict returned by
astrobase.periodbase.kbls.bls_stats_singleperiod()
. - time (np.array) – The times from the time-series of transit observations used to calculate the initial period.
- extra_maskfrac (float) – This is the separation from in-transit points you desire, in units of the transit duration. extra_maskfrac = 0 if you just want points inside transit (see below).
- trapd (dict) – This is a dict returned by
astrobase.lcfit.transits.traptransit_fit_magseries()
containing the trapezoid transit model. - nperiodint (int) – This indicates how many periods backwards/forwards to try and identify transits from the epochs reported in blsd or trapd.
Returns: (tmids_obsd, t_starts, t_ends) –
The returned items are:
tmids_obsd (np.ndarray): best guess of transit midtimes in lightcurve. Has length number of transits in lightcurve.
t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit first contact point.
t_ends (np.ndarray): t_IVs + extra_maskfrac*tdur, for t_IVs transit fourth contact point.
Return type: tuple of np.array
-
astrobase.varbase.transits.
given_lc_get_transit_tmids_tstarts_tends
(time, flux, err_flux, blsfit_savpath=None, trapfit_savpath=None, magsarefluxes=True, nworkers=1, sigclip=None, extra_maskfrac=0.03)[source]¶ Gets the transit start, middle, and end times for transits in a given time-series of observations.
Parameters: - time,flux,err_flux (np.array) – The input flux time-series measurements and their associated measurement errors
- blsfit_savpath (str or None) – If provided as a str, indicates the path of the fit plot to make for a simple BLS model fit to the transit using the obtained period and epoch.
- trapfit_savpath (str or None) – If provided as a str, indicates the path of the fit plot to make for a trapezoidal transit model fit to the transit using the obtained period and epoch.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – This is by default True for this function, since it works on fluxes only at the moment.
- nworkers (int) – The number of parallel BLS period-finder workers to use.
- extra_maskfrac (float) –
This is the separation (N) from in-transit points you desire, in units of the transit duration. extra_maskfrac = 0 if you just want points inside transit, otherwise:
t_starts = t_Is - N*tdur, t_ends = t_IVs + N*tdur
Thus setting N=0.03 masks slightly more than the guessed transit duration.
Returns: (tmids_obsd, t_starts, t_ends) –
The returned items are:
tmids_obsd (np.ndarray): best guess of transit midtimes in lightcurve. Has length number of transits in lightcurve.
t_starts (np.ndarray): t_Is - extra_maskfrac*tdur, for t_Is transit first contact point.
t_ends (np.ndarray): t_IVs + extra_maskfrac*tdur, for t_IVs transit fourth contact point.
Return type: tuple
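The relations t_starts = t_Is - N*tdur and t_ends = t_IVs + N*tdur given for extra_maskfrac can be sketched as follows. The helper transit_windows is hypothetical, and it assumes the first and fourth contact points sit half a transit duration on either side of each mid-transit time.

```python
import numpy as np

def transit_windows(tmids, tdur, extra_maskfrac=0.03):
    """Return (t_starts, t_ends) for each mid-transit time (hypothetical helper)."""
    tmids = np.asarray(tmids, dtype=float)
    # assumed contact points: symmetric about each mid-transit time
    t_Is = tmids - tdur / 2.0
    t_IVs = tmids + tdur / 2.0
    # pad each window by extra_maskfrac transit durations on both sides
    t_starts = t_Is - extra_maskfrac * tdur
    t_ends = t_IVs + extra_maskfrac * tdur
    return t_starts, t_ends

t_starts, t_ends = transit_windows([10.0, 20.0], tdur=0.2, extra_maskfrac=0.03)
```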
-
astrobase.varbase.transits.
given_lc_get_out_of_transit_points
(time, flux, err_flux, blsfit_savpath=None, trapfit_savpath=None, in_out_transit_savpath=None, sigclip=None, magsarefluxes=True, nworkers=1, extra_maskfrac=0.03)[source]¶ This gets the out-of-transit light curve points.
Relevant during iterative masking of transits for multiple planet system search.
Parameters: - time,flux,err_flux (np.array) – The input flux time-series measurements and their associated measurement errors
- blsfit_savpath (str or None) – If provided as a str, indicates the path of the fit plot to make for a simple BLS model fit to the transit using the obtained period and epoch.
- trapfit_savpath (str or None) – If provided as a str, indicates the path of the fit plot to make for a trapezoidal transit model fit to the transit using the obtained period and epoch.
- in_out_transit_savpath (str or None) – If provided as a str, indicates the path of the plot file that will be made for a plot showing the in-transit points and out-of-transit points tagged separately.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – This is by default True for this function, since it works on fluxes only at the moment.
- nworkers (int) – The number of parallel BLS period-finder workers to use.
- extra_maskfrac (float) –
This is the separation (N) from in-transit points you desire, in units of the transit duration. extra_maskfrac = 0 if you just want points inside transit, otherwise:
t_starts = t_Is - N*tdur, t_ends = t_IVs + N*tdur
Thus setting N=0.03 masks slightly more than the guessed transit duration.
Returns: (times_oot, fluxes_oot, errs_oot) – The times, flux, err_flux values from the input at the time values out-of-transit are returned.
Return type: tuple of np.array
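Given transit window edges such as the t_starts and t_ends described above, selecting the out-of-transit points amounts to building a boolean mask. The sketch below uses a hypothetical helper, out_of_transit_mask, and is not the function's actual implementation.

```python
import numpy as np

def out_of_transit_mask(times, t_starts, t_ends):
    """Return a boolean array that is True for out-of-transit points."""
    times = np.asarray(times, dtype=float)
    in_transit = np.zeros(times.size, dtype=bool)
    for t_start, t_end in zip(t_starts, t_ends):
        # flag every point that falls inside this (padded) transit window
        in_transit |= (times >= t_start) & (times <= t_end)
    return ~in_transit

times = np.arange(10.0)
oot = out_of_transit_mask(times, t_starts=[2.5, 6.5], t_ends=[4.5, 7.5])
times_oot = times[oot]
```

The same mask would be applied to the flux and error arrays so that all three stay aligned.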
astrobase.varbase.trends module¶
Contains light curve trend-removal tools, such as external parameter decorrelation (EPD) and smoothing.
-
astrobase.varbase.trends.
smooth_magseries_ndimage_medfilt
(mags, windowsize)[source]¶ This smooths the magseries with a median filter that reflects the array at the boundary.
See https://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html for details.
Parameters: - mags (np.array) – The input mags/flux time-series to smooth.
- windowsize (int) – This is an odd integer containing the smoothing window size.
Returns: The smoothed mag/flux time-series array.
Return type: np.array
-
astrobase.varbase.trends.
smooth_magseries_signal_medfilt
(mags, windowsize)[source]¶ This smooths the magseries with a simple median filter.
This function pads with zeros near the boundary, see:
https://stackoverflow.com/questions/24585706/scipy-medfilt-wrong-result
Typically this is bad.
Parameters: - mags (np.array) – The input mags/flux time-series to smooth.
- windowsize (int) – This is an odd integer containing the smoothing window size.
Returns: The smoothed mag/flux time-series array.
Return type: np.array
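The difference between reflecting the array at the boundary and padding it with zeros can be seen in a small Numpy-only sketch. The helper median_filter_1d is hypothetical and does not call either SciPy filter; it assumes an odd windowsize, and uses Numpy's 'symmetric' padding, which repeats the edge sample in the same way as the 'reflect' boundary mode of scipy.ndimage.

```python
import numpy as np

def median_filter_1d(mags, windowsize, mode="reflect"):
    """Running median with either reflected or zero-padded boundaries."""
    mags = np.asarray(mags, dtype=float)
    half = windowsize // 2
    if mode == "reflect":
        # mirror the series about its edges, repeating the edge sample
        padded = np.pad(mags, half, mode="symmetric")
    else:
        # pad with zeros, which drags the edge values towards zero
        padded = np.pad(mags, half, mode="constant", constant_values=0.0)
    return np.array([np.median(padded[i:i + windowsize])
                     for i in range(mags.size)])

mags = [5, 4, 3, 4, 5]
reflected = median_filter_1d(mags, 3, mode="reflect")
zero_padded = median_filter_1d(mags, 3, mode="zero")
```

With zero padding the first and last smoothed values are biased low, which is the behavior the warning above refers to.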
-
astrobase.varbase.trends.
smooth_magseries_gaussfilt
(mags, windowsize, windowfwhm=7)[source]¶ This smooths the magseries with a Gaussian kernel.
Parameters: - mags (np.array) – The input mags/flux time-series to smooth.
- windowsize (int) – This is an odd integer containing the smoothing window size.
- windowfwhm (int) – This is an odd integer containing the FWHM of the applied Gaussian window function.
Returns: The smoothed mag/flux time-series array.
Return type: np.array
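For illustration, Gaussian-kernel smoothing with a window size and FWHM can be sketched in plain Numpy as below. This is an assumption-laden stand-in, not the library's implementation: gaussian_smooth is a hypothetical helper, the FWHM is converted to a standard deviation with the usual factor 2*sqrt(2*ln 2), and the series is edge-padded so the output has the same length as the input.

```python
import numpy as np

def gaussian_smooth(mags, windowsize, windowfwhm=7):
    """Smooth mags with a normalized Gaussian kernel (hypothetical helper)."""
    mags = np.asarray(mags, dtype=float)
    # convert the FWHM to a Gaussian standard deviation
    sigma = windowfwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = windowsize // 2
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()  # unit area, so a constant series is unchanged
    # repeat the edge values so the output length matches the input
    padded = np.pad(mags, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

spike = np.zeros(21)
spike[10] = 1.0
smoothed = gaussian_smooth(spike, windowsize=11, windowfwhm=7)
```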
-
astrobase.varbase.trends.
smooth_magseries_savgol
(mags, windowsize, polyorder=2)[source]¶ This smooths the magseries with a Savitzky-Golay filter.
Parameters: - mags (np.array) – The input mags/flux time-series to smooth.
- windowsize (int) – This is an odd integer containing the smoothing window size.
- polyorder (int) – This is an integer containing the polynomial degree order to use when generating the Savitzky-Golay filter.
Returns: The smoothed mag/flux time-series array.
Return type: np.array
-
astrobase.varbase.trends.
epd_magseries
(times, mags, errs, fsv, fdv, fkv, xcc, ycc, bgv, bge, iha, izd, magsarefluxes=False, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None)[source]¶ Detrends a magnitude series using External Parameter Decorrelation.
Requires a set of external parameters similar to those present in HAT light curves. At the moment, the HAT light-curve-specific external parameters are:
- S: the ‘fsv’ column in light curves,
- D: the ‘fdv’ column in light curves,
- K: the ‘fkv’ column in light curves,
- x coords: the ‘xcc’ column in light curves,
- y coords: the ‘ycc’ column in light curves,
- background value: the ‘bgv’ column in light curves,
- background error: the ‘bge’ column in light curves,
- hour angle: the ‘iha’ column in light curves,
- zenith distance: the ‘izd’ column in light curves
S, D, and K are defined as follows:
- S -> measure of PSF sharpness (~1/sigma^2 so smaller S = wider PSF)
- D -> measure of PSF ellipticity in xy direction
- K -> measure of PSF ellipticity in cross direction
S, D, K are related to the PSF’s variance and covariance, see eqn 30-33 in A. Pal’s thesis: https://arxiv.org/abs/0906.3486
NOTE: The errs are completely ignored and returned unchanged (except for sigclip and finite filtering).
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to detrend.
- fsv (np.array) – Array containing the external parameter S of the same length as times.
- fdv (np.array) – Array containing the external parameter D of the same length as times.
- fkv (np.array) – Array containing the external parameter K of the same length as times.
- xcc (np.array) – Array containing the external parameter x-coords of the same length as times.
- ycc (np.array) – Array containing the external parameter y-coords of the same length as times.
- bgv (np.array) – Array containing the external parameter background value of the same length as times.
- bge (np.array) – Array containing the external parameter background error of the same length as times.
- iha (np.array) – Array containing the external parameter hour angle of the same length as times.
- izd (np.array) – Array containing the external parameter zenith distance of the same length as times.
- magsarefluxes (bool) – Set this to True if mags actually contains fluxes.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before fitting the EPD function to it.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
Returns: Returns a dict of the following form:
{'times':the input times after non-finite elems removed,
 'mags':the EPD detrended mag values (the EPD mags),
 'errs':the errs after non-finite elems removed,
 'fitcoeffs':EPD fit coefficient values,
 'fitinfo':the full tuple returned by scipy.optimize.leastsq,
 'fitmags':the EPD fit function evaluated at times,
 'mags_median': this is the median of the EPD mags,
 'mags_mad': this is the MAD of EPD mags}
Return type: dict
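A user-supplied smoothing function only needs to follow the smoothfunc(mags_array, window_size, **extraparams) signature and return an array of the same size. The sketch below is one hypothetical example, a running mean; the name smooth_running_mean and its pad_mode extra parameter are inventions for this illustration, and an odd window_size is assumed.

```python
import numpy as np

def smooth_running_mean(mags_array, window_size, **extraparams):
    """Running-mean smoother matching the required smoothfunc signature."""
    # hypothetical extra parameter: how to pad the series at its edges
    pad_mode = extraparams.get("pad_mode", "edge")
    half = window_size // 2
    padded = np.pad(np.asarray(mags_array, dtype=float), half, mode=pad_mode)
    kernel = np.ones(window_size) / window_size
    # 'valid' convolution of the padded series returns the input length
    return np.convolve(padded, kernel, mode="valid")

smoothed = smooth_running_mean(np.arange(5.0), 3, pad_mode="reflect")
```

Such a function would then be passed as epdsmooth_func, with any extra options supplied through the epdsmooth_extraparams dict.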
-
astrobase.varbase.trends.
epd_magseries_extparams
(times, mags, errs, externalparam_arrs, initial_coeff_guess, magsarefluxes=False, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, objective_func=<function _epd_residual2>, objective_kwargs=None, optimizer_func=<function least_squares>, optimizer_kwargs=None)[source]¶ This does EPD on a mag-series with arbitrary external parameters.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to run EPD on.
- externalparam_arrs (list of np.arrays) – This is a list of ndarrays of external parameters to decorrelate against. These should all be the same size as times, mags, errs.
- initial_coeff_guess (np.array) – An array of initial fit coefficients to pass into the objective function.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before smoothing it and fitting the EPD function to it. The actual LC will not be sigma-clipped.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
- objective_func (Python function) –
The function that calculates residuals between the model and the smoothed mag-series. This must have the following signature:
def objective_func(fit_coeffs, times, mags, errs, *external_params, **objective_kwargs)
where times, mags, errs are arrays of the sigma-clipped and smoothed time-series, fit_coeffs is an array of EPD fit coefficients, external_params is a tuple of the passed in external parameter arrays, and objective_kwargs is a dict of any optional kwargs to pass into the objective function.
This should return the value of the residual based on evaluating the model function (and any weights based on errs or times).
- objective_kwargs (dict or None) – A dict of kwargs to pass into the objective_func function.
- optimizer_func (Python function) –
The function that minimizes the residual between the model and the smoothed mag-series using the objective_func. This should have a signature similar to one of the optimizer functions in scipy.optimize, i.e.:
def optimizer_func(objective_func, initial_coeffs, args=(), ...)
and return a scipy.optimize.OptimizeResult. We’ll rely on the
.success
attribute to determine if the EPD fit was successful, and the .x
attribute to get the values of the fit coefficients. - optimizer_kwargs (dict or None) – A dict of kwargs to pass into the optimizer_func function.
Returns: Returns a dict of the following form:
{'times':the input times after non-finite elems removed,
 'mags':the EPD detrended mag values (the EPD mags),
 'errs':the errs after non-finite elems removed,
 'fitcoeffs':EPD fit coefficient values,
 'fitinfo':the result returned by the optimizer function,
 'mags_median': this is the median of the EPD mags,
 'mags_mad': this is the MAD of EPD mags}
Return type: dict
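As an illustration of the objective_func signature described above, the sketch below defines a simple linear model in the external parameters and returns its residuals. It is not the default objective function: the name linear_epd_residual, the linear model itself, and the weight_by_errs keyword are assumptions made for this example.

```python
import numpy as np

def linear_epd_residual(fit_coeffs, times, mags, errs,
                        *external_params, **objective_kwargs):
    """Residuals of mags against a linear model in the external parameters.

    Hypothetical model: mags ~ c0 + c1*p1 + c2*p2 + ...
    so fit_coeffs needs one more element than there are external parameters.
    """
    model = np.full_like(mags, fit_coeffs[0], dtype=float)
    for coeff, param in zip(fit_coeffs[1:], external_params):
        model += coeff * param
    residual = mags - model
    # hypothetical kwarg: weight the residuals by the measurement errors
    if objective_kwargs.get("weight_by_errs", False):
        residual = residual / errs
    return residual

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 0.0, 1.0, 0.0])
mags = 10.0 + 0.5 * x - 0.2 * y
residual = linear_epd_residual(np.array([10.0, 0.5, -0.2]),
                               np.arange(4.0), mags, np.full(4, 0.1), x, y)
```

Returning the residual vector, not a single scalar, suits optimizers such as scipy.optimize.least_squares, which minimize the sum of squares themselves. With this model, initial_coeff_guess would need len(externalparam_arrs) + 1 elements.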
-
astrobase.varbase.trends.
rfepd_magseries
(times, mags, errs, externalparam_arrs, magsarefluxes=False, epdsmooth=True, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, rf_subsample=1.0, rf_ntrees=300, rf_extraparams={'criterion': 'mse', 'n_jobs': -1, 'oob_score': False})[source]¶ This uses a RandomForestRegressor to de-correlate the given magseries.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to run EPD on.
- externalparam_arrs (list of np.arrays) – This is a list of ndarrays of external parameters to decorrelate against. These should all be the same size as times, mags, errs.
- epdsmooth (bool) – If True, sets the training LC for the RandomForestRegressor to be a smoothed version of the sigma-clipped light curve provided in times, mags, errs.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before smoothing it and fitting the EPD function to it. The actual LC will not be sigma-clipped.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
- rf_subsample (float) – Defines the fraction of the size of the mags array to use for training the random forest regressor.
- rf_ntrees (int) – This is the number of trees to use for the RandomForestRegressor.
- rf_extraparams (dict) – This is a dict of any extra kwargs to provide to the RandomForestRegressor instance used.
Returns: Returns a dict with decorrelated mags and the usual info from the RandomForestRegressor: variable importances, etc.
Return type: dict
astrobase.plotbase module¶
Contains various useful functions for plotting light curves and associated data.
-
astrobase.plotbase.
plot_magseries
(times, mags, magsarefluxes=False, errs=None, out=None, sigclip=30.0, normto='globalmedian', normmingap=4.0, timebin=None, yrange=None, segmentmingap=100.0, plotdpi=100)[source]¶ This plots a magnitude/flux time-series.
Parameters: - times,mags (np.array) – The mag/flux time-series to plot as a function of time.
- magsarefluxes (bool) –
Indicates if the input mags array is actually an array of flux measurements instead of magnitude measurements. If this is set to True, then the plot y-axis will be set as appropriate for mag or fluxes. In addition:
- if normto is ‘zero’, then the median flux is divided from each observation’s flux value to yield normalized fluxes with 1.0 as the global median.
- if normto is ‘globalmedian’, then the global median flux value across the entire time series is multiplied with each measurement.
- if normto is set to a float, then this number is multiplied with the flux value for each measurement.
- errs (np.array or None) – If this is provided, contains the measurement errors associated with each measurement of flux/mag in time-series. Providing this kwarg will add errbars to the output plot.
- out (str or StringIO/BytesIO object or None) –
Sets the output type and target:
- If out is a string, will save the plot to the specified file name.
- If out is a StringIO/BytesIO object, will save the plot to that file handle. This can be useful to carry out additional operations on the output binary stream, or convert it to base64 text for embedding in HTML pages.
- If out is None, will save the plot to a file called ‘magseries-plot.png’ in the current working directory.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- normto ({'globalmedian', 'zero'} or a float) –
Sets the normalization target:
'globalmedian' -> norms each mag to the global median of the LC column
'zero' -> norms each mag to zero
a float -> norms each mag to this specified float value.
- normmingap (float) – This defines how much the difference between consecutive measurements is allowed to be to consider them as parts of different timegroups. By default it is set to 4.0 days.
- timebin (float or None) – The bin size to use to group together measurements closer than this amount in time. This is in seconds. If this is None, no time-binning will be performed.
- yrange (list of two floats or None) – This is used to provide a custom y-axis range to the plot. If None, will automatically determine y-axis range.
- segmentmingap (float or None) – This controls the minimum length of time (in days) required to consider a timegroup in the light curve as a separate segment. This is useful when the light curve consists of measurements taken over several seasons, so there’s lots of dead space in the plot that can be cut out to zoom in on the interesting stuff. If segmentmingap is not None, the magseries plot will be cut in this way and the x-axis will show these breaks.
- plotdpi (int) – Sets the resolution in DPI for PNG plots (default = 100).
Returns: Returns based on the input:
- If out is a str or None, the path to the generated plot file is returned.
- If out is a StringIO/BytesIO object, will return the StringIO/BytesIO object to which the plot was written.
Return type: str or BytesIO/StringIO object
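The asymmetric sigclip convention described above can be illustrated with a small, dependency-free sketch. This is not the astrobase implementation: the helper name asymmetric_sigclip is hypothetical, plain lists stand in for Numpy arrays, and estimating sigma as 1.483 times the median absolute deviation is an assumption made here for robustness.

```python
from statistics import median


def asymmetric_sigclip(mags, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Return the values that survive an asymmetric sigma-clip.

    sigclip[0] is the cut for dimmings and sigclip[1] is the cut for
    brightenings. Hypothetical helper, not part of astrobase.
    """
    dim_sigma, bright_sigma = sigclip
    center = median(mags)
    # robust sigma estimate from the median absolute deviation (assumption)
    sigma = 1.483 * median(abs(m - center) for m in mags)

    kept = []
    for m in mags:
        deviation = m - center
        # in magnitudes larger values are fainter; in fluxes smaller ones are
        dimming = -deviation if magsarefluxes else deviation
        if -bright_sigma * sigma < dimming < dim_sigma * sigma:
            kept.append(m)
    return kept


series = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.1, 9.9, 10.6, 9.4]
as_mags = asymmetric_sigclip(series, sigclip=(10.0, 3.0))
as_fluxes = asymmetric_sigclip(series, sigclip=(10.0, 3.0), magsarefluxes=True)
```

The same two outliers are treated differently depending on magsarefluxes: read as magnitudes, 10.6 is a dimming and 9.4 is a brightening; read as fluxes, the roles swap. This is why the kwarg has to match the data.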
-
astrobase.plotbase.
plot_phased_magseries
(times, mags, period, epoch='min', fitknotfrac=0.01, errs=None, magsarefluxes=False, normto='globalmedian', normmingap=4.0, sigclip=30.0, phasewrap=True, phasesort=True, phasebin=None, plotphaselim=(-0.8, 0.8), yrange=None, xtimenotphase=False, xaxlabel='phase', yaxlabel=None, modelmags=None, modeltimes=None, modelerrs=None, outfile=None, plotdpi=100)[source]¶ Plots a phased magnitude/flux time-series using the period provided.
Parameters: - times,mags (np.array) – The mag/flux time-series to plot as a function of phase given period.
- period (float) – The period to use to phase-fold the time-series. Should be the same unit as times (usually in days)
- epoch ('min' or float or None) –
This indicates how to get the epoch to use for phasing the light curve:
- If None, uses the min(times) as the epoch for phasing.
- If epoch is the string ‘min’, then fits a cubic spline to the phased light curve using min(times) as the initial epoch, finds the magnitude/flux minimum of this phased light curve fit, and finally uses that time value as the epoch. This is useful for plotting planetary transits and eclipsing binary phased light curves so that phase 0.0 corresponds to the mid-time of primary eclipse (or transit).
- If epoch is a float, then uses that directly to phase the light curve and as the epoch of the phased mag series plot.
- fitknotfrac (float) – If epoch=’min’, this function will attempt to fit a cubic spline to the phased light curve to find a time of light minimum as phase 0.0. This kwarg sets the number of knots to generate the spline as a fraction of the total number of measurements in the input time-series. By default, this is set so that 100 knots are used to generate a spline for fitting the phased light curve consisting of 10000 measurements.
- errs (np.array or None) – If this is provided, contains the measurement errors associated with each measurement of flux/mag in time-series. Providing this kwarg will add errbars to the output plot.
- magsarefluxes (bool) – Indicates if the input mags array is actually an array of flux measurements instead of magnitude measurements. If this is set to True, then the plot y-axis will be set as appropriate for mag or fluxes.
- normto ({'globalmedian', 'zero'} or a float) –
Sets the normalization target:
'globalmedian' -> norms each mag to the global median of the LC column
'zero' -> norms each mag to zero
a float -> norms each mag to this specified float value.
- normmingap (float) – This defines the minimum gap in time between consecutive measurements required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If this is True, the phased time-series will be sorted in phase.
- phasebin (float or None) – If this is provided, indicates the bin size to use to group together measurements closer than this amount in phase. This is in units of phase. The binned phased light curve will be overplotted on top of the phased light curve. Useful for when one has many measurement points and needs to pick out a small trend in an otherwise noisy phased light curve.
- plotphaselim (sequence of two floats or None) – The x-axis limits to use when making the phased light curve plot. By default, this is (-0.8, 0.8), which places phase 0.0 at the center of the plot and covers approximately two cycles in phase to make any trends clear.
- yrange (list of two floats or None) – This is used to provide a custom y-axis range to the plot. If None, will automatically determine y-axis range.
- xtimenotphase (bool) – If True, the x-axis gets units of time (multiplies phase by period).
- xaxlabel (str) – Sets the label for the x-axis.
- yaxlabel (str or None) – Sets the label for the y-axis. If this is None, the appropriate label will be used based on the value of the magsarefluxes kwarg.
- modeltimes,modelmags,modelerrs (np.array or None) – If all of these are provided, then this function will overplot the values of modeltimes and modelmags on top of the actual phased light curve. This is useful for plotting variability models on top of the light curve (e.g. plotting a Mandel-Agol transit model over the actual phased light curve). These arrays will be phased using the already provided period and epoch.
- outfile (str or StringIO/BytesIO or matplotlib.axes.Axes or None) –
- a string filename for the file where the plot will be written.
- a StringIO/BytesIO object to where the plot will be written.
- a matplotlib.axes.Axes object to where the plot will be written.
- if None, plots to ‘magseries-phased-plot.png’ in current dir.
- plotdpi (int) – Sets the resolution in DPI for PNG plots (default = 100).
Returns: This returns based on the input:
- If outfile is a str or None, the path to the generated plot file is returned.
- If outfile is a StringIO/BytesIO object, will return the StringIO/BytesIO object to which the plot was written.
- If outfile is a matplotlib.axes.Axes object, will return the Axes object with the plot elements added to it. One can then directly include this Axes object in some other Figure.
Return type: str or StringIO/BytesIO or matplotlib.axes.Axes
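The phasing that this function performs before plotting can be sketched in a few lines. This is a minimal illustration and not the astrobase code: the helper name phase_fold is hypothetical, plain lists stand in for Numpy arrays, and wrapping phases into the interval (-0.5, 0.5] is an assumed convention chosen so that phase 0.0 sits in the middle.

```python
def phase_fold(times, mags, period, epoch, wrap=True, sort=True):
    """Phase-fold a time-series; returns (phases, mags) as two lists."""
    phased = []
    for t, m in zip(times, mags):
        # fractional number of cycles since the epoch, in [0, 1)
        phase = ((t - epoch) / period) % 1.0
        if wrap and phase > 0.5:
            # move the second half of the cycle to negative phases
            phase -= 1.0
        phased.append((phase, m))
    if sort:
        phased.sort(key=lambda pair: pair[0])
    return [p for p, _ in phased], [m for _, m in phased]


times = [10.0, 10.25, 10.5, 11.125, 11.75]
mags = [1.0, 2.0, 3.0, 4.0, 5.0]
phases, folded = phase_fold(times, mags, period=1.0, epoch=10.0)
```

Multiplying the returned phases by period gives the time units that xtimenotphase=True refers to.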
-
astrobase.plotbase.
skyview_stamp
(ra, decl, survey='DSS2 Red', scaling='Linear', sizepix=300, flip=True, convolvewith=None, forcefetch=False, cachedir='~/.astrobase/stamp-cache', timeout=45.0, retry_failed=False, savewcsheader=True, verbose=False)[source]¶ This downloads a DSS FITS stamp centered on the coordinates specified.
This wraps the function
astrobase.services.skyview.get_stamp()
, which downloads Digitized Sky Survey stamps in FITS format from the NASA SkyView service: https://skyview.gsfc.nasa.gov/current/cgi/query.pl
Also adds some useful operations on top of the FITS file returned.
Parameters: - ra,decl (float) – The center coordinates for the stamp in decimal degrees.
- survey (str) – The survey name to get the stamp from. This is one of the values in the ‘SkyView Surveys’ option boxes on the SkyView webpage. Currently, we’ve only tested using ‘DSS2 Red’ as the value for this kwarg, but the other ones should work in principle.
- scaling (str) – This is the pixel value scaling function to use. Can be any of the strings (“Log”, “Linear”, “Sqrt”, “HistEq”).
- sizepix (int) – Size of the requested stamp, in pixels. (DSS scale is ~1arcsec/px).
- flip (bool) – Will flip the downloaded image top to bottom. This should usually be True because matplotlib and FITS have different image coord origin conventions. Alternatively, set this to False and use the origin=’lower’ in any call to matplotlib.pyplot.imshow when plotting this image.
- convolvewith (astropy.convolution Kernel object or None) –
If convolvewith is an astropy.convolution Kernel object from:
http://docs.astropy.org/en/stable/convolution/kernels.html
then, this function will return the stamp convolved with that kernel. This can be useful to see effects of wide-field telescopes (like the HATNet and HATSouth lenses) degrading the nominal 1 arcsec/px of DSS, causing blending of targets and any variability.
- forcefetch (bool) – If True, will disregard any existing cached copies of the stamp already downloaded corresponding to the requested center coordinates and redownload the FITS from the SkyView service.
- cachedir (str) – This is the path to the astrobase cache directory. All downloaded FITS stamps are stored here as .fits.gz files so we can immediately respond with the cached copy when a request is made for a coordinate center that’s already been downloaded.
- timeout (float) – Sets the timeout in seconds to wait for a response from the NASA SkyView service.
- retry_failed (bool) – If the initial request to SkyView fails, and this is True, will retry until it succeeds.
- savewcsheader (bool) – If this is True, also returns the WCS header of the downloaded FITS stamp in addition to the FITS image itself. Useful for projecting object coordinates onto image xy coordinates for visualization.
- verbose (bool) – If True, indicates progress.
Returns: This returns based on the value of savewcsheader:
- If savewcsheader=True, returns a tuple: (FITS stamp image as a numpy array, FITS header)
- If savewcsheader=False, returns only the FITS stamp image as numpy array.
- If the stamp retrieval fails, returns None.
Return type: tuple or array or None
-
astrobase.plotbase.
fits_finder_chart
(fitsfile, outfile, fitsext=0, wcsfrom=None, scale=<astropy.visualization.interval.ZScaleInterval object>, stretch=<astropy.visualization.stretch.LinearStretch object>, colormap=<matplotlib.colors.LinearSegmentedColormap object>, findersize=None, finder_coordlimits=None, overlay_ra=None, overlay_decl=None, overlay_pltopts={'marker': 'o', 'markeredgecolor': 'red', 'markeredgewidth': 2.0, 'markerfacecolor': 'none', 'markersize': 10.0}, overlay_zoomcontain=False, grid=False, gridcolor='k')[source]¶ This makes a finder chart for a given FITS with an optional object position overlay.
Parameters: - fitsfile (str) – fitsfile is the FITS file to use to make the finder chart.
- outfile (str) – outfile is the name of the output file. This can be a png or pdf or whatever else matplotlib can write given a filename and extension.
- fitsext (int) – Sets the FITS extension in fitsfile to use to extract the image array from.
- wcsfrom (str or None) – If wcsfrom is None, the WCS to transform the RA/Dec to pixel x/y will be taken from the FITS header of fitsfile. If this is not None, it must be a FITS or similar file that contains a WCS header in its first extension.
- scale (astropy.visualization.Interval object) – scale sets the normalization for the FITS pixel values. This is an astropy.visualization Interval object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- stretch (astropy.visualization.Stretch object) – stretch sets the stretch function for mapping FITS pixel values to output pixel values. This is an astropy.visualization Stretch object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- colormap (matplotlib Colormap object) – colormap is a matplotlib color map object to use for the output image.
- findersize (None or tuple of two ints) – If findersize is None, the output image size will be set by the NAXIS1 and NAXIS2 keywords in the input fitsfile FITS header. Otherwise, findersize must be a tuple with the intended x and y size of the image in inches (all output images will use a DPI = 100).
- finder_coordlimits (list of four floats or None) – If not None, finder_coordlimits sets x and y limits for the plot, effectively zooming it in if these are smaller than the dimensions of the FITS image. This should be a list of the form: [minra, maxra, mindecl, maxdecl] all in decimal degrees.
- overlay_ra,overlay_decl (np.array or None) – overlay_ra and overlay_decl are ndarrays containing the RA and Dec values to overplot on the image as an overlay. If these are both None, then no overlay will be plotted.
- overlay_pltopts (dict) – overlay_pltopts controls how the overlay points will be plotted. This is a dict with standard matplotlib marker, etc. kwargs as key-val pairs, e.g. ‘markersize’, ‘markerfacecolor’, etc. The default options make red outline circles at the location of each object in the overlay.
- overlay_zoomcontain (bool) – overlay_zoomcontain controls if the finder chart will be zoomed to just contain the overlaid points. Everything outside the footprint of these points will be discarded.
- grid (bool) – grid sets if a grid will be made on the output image.
- gridcolor (str) – gridcolor sets the color of the grid lines. This is a usual matplotlib color spec string.
Returns: The filename of the generated output image if successful. None otherwise.
Return type: str or None
-
astrobase.plotbase.
plot_periodbase_lsp
(lspinfo, outfile=None, plotdpi=100)[source]¶ Makes a plot of periodograms obtained from periodbase functions.
This takes the output dict produced by any astrobase.periodbase period-finder function or a pickle filename containing such a dict and makes a periodogram plot.
Parameters: - lspinfo (dict or str) –
If lspinfo is a dict, it must be a dict produced by an astrobase.periodbase period-finder function or a dict from your own period-finder function or routine that is of the form below with at least these keys:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must be one of the keys in the `METHODLABELS` dict above,
 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above}
If lspinfo is a str, then it must be a path to a pickle file that contains a dict of the form described above.
- outfile (str or None) – If this is a str, will write the periodogram plot to the file specified by this string. If this is None, will write to a file called ‘lsp-plot.png’ in the current working directory.
- plotdpi (int) – Sets the resolution in DPI of the output periodogram plot PNG file.
Returns: Absolute path to the periodogram plot file created.
Return type: str
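If the lspinfo dict comes from your own period-finder, it can help to check it against the required keys before plotting. The sketch below is illustrative only: check_lspinfo is a hypothetical helper, plain lists stand in for the np.array values, and the method code 'gls' is assumed (not confirmed here) to be one of the METHODLABELS keys.

```python
REQUIRED_KEYS = ("periods", "lspvals", "bestperiod", "method",
                 "nbestperiods", "nbestlspvals")


def check_lspinfo(lspinfo):
    """Raise ValueError if lspinfo does not have the documented form."""
    missing = [key for key in REQUIRED_KEYS if key not in lspinfo]
    if missing:
        raise ValueError("lspinfo is missing keys: %s" % ", ".join(missing))
    # one periodogram power value is expected for each period searched
    if len(lspinfo["periods"]) != len(lspinfo["lspvals"]):
        raise ValueError("periods and lspvals must have the same length")
    # the annotated peaks need one power value per period
    if len(lspinfo["nbestperiods"]) != len(lspinfo["nbestlspvals"]):
        raise ValueError("nbestperiods and nbestlspvals must have the same length")
    return True


lspinfo = {
    "periods": [0.5, 1.0, 2.0, 4.0],
    "lspvals": [0.10, 0.85, 0.40, 0.05],
    "bestperiod": 1.0,
    "method": "gls",  # assumed to be a valid METHODLABELS key
    "nbestperiods": [1.0, 2.0],
    "nbestlspvals": [0.85, 0.40],
}
is_valid = check_lspinfo(lspinfo)
```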
astrobase.lcproc package¶
This package contains functions that help drive large batch-processing jobs for light curves.
This top level module contains functions to import custom light curve formats. Once you have your own LC format registered with lcproc, all of the submodules in this package can be used to process these LCs:
- astrobase.lcproc.awsrun: contains driver functions that run batch-processing of light curve period-finding and checkplot making using resources from Amazon AWS: EC2 for processing, S3 for storage, and SQS for queuing work.
- astrobase.lcproc.catalogs: contains functions that generate catalogs from collections of light curves, make KD-Trees for fast spatial matching, and augment these catalogs from the rich object information contained in checkplot pickles.
- astrobase.lcproc.checkplotgen: contains functions that drive batch-jobs to make checkplot pickles for a large collection of light curves (and optional period-finding results).
- astrobase.lcproc.checkplotproc: contains functions that add extra information to checkplot pickles, including color-magnitude diagrams, updating neighbor light curves, and cross-matches to external catalogs.
- astrobase.lcproc.epd: contains functions that drive batch-jobs for External Parameter Decorrelation on collections of light curves.
- astrobase.lcproc.lcbin: contains functions that drive batch-jobs for time-binning collections of light curves to a specified cadence.
- astrobase.lcproc.lcpfeatures: contains functions that drive batch-jobs to calculate features of phased light curves, if period-finding results for these are available. These periodic light curve features can be used later to do variable star classification.
- astrobase.lcproc.lcsfeatures: contains functions that drive batch-jobs to calculate color, coordinate, and neighbor proximity features for a collection of light curves. These can be used later to do variable star classification.
- astrobase.lcproc.lcvfeatures: contains functions that drive batch-jobs to calculate non-periodic features of unphased light curves (e.g. time-series moments and variability indices). These can be used later to do variable star classification.
- astrobase.lcproc.periodsearch: contains functions that drive batch-jobs to run period-finding using any of the methods in astrobase.periodbase on collections of light curves. These produce period-finder result pickles that can be used transparently by the functions in astrobase.lcproc.checkplotgen and astrobase.lcproc.checkplotproc to generate and update checkplot pickles.
- astrobase.lcproc.tfa: contains functions that drive the application of the Trend Filtering Algorithm (TFA) to large collections of light curves.
- astrobase.lcproc.varthreshold: contains functions that help decide where to place thresholds on several variability indices for a collection of light curves to maximize recovery of actual variable stars.
-
astrobase.lcproc.
register_lcformat
(formatkey, fileglob, timecols, magcols, errcols, readerfunc_module, readerfunc, readerfunc_kwargs=None, normfunc_module=None, normfunc=None, normfunc_kwargs=None, magsarefluxes=False, overwrite_existing=False, lcformat_dir='~/.astrobase/lcformat-jsons')[source]¶ This adds a new LC format to the astrobase LC format registry.
Allows handling of custom format light curves for astrobase lcproc drivers. Once the format is successfully registered, light curves should work transparently with all of the functions in this module, by simply calling them with the formatkey in the lcformat keyword argument.
LC format specifications are generated as JSON files. astrobase comes with several of these in <astrobase install path>/data/lcformats. LC formats you add by using this function will have their specifiers written to the ~/.astrobase/lcformat-jsons directory in your home directory.
Parameters: - formatkey (str) – A str used as the unique ID of this LC format for all lcproc functions. It can be used to look the format up later and import the correct functions needed to support it for lcproc operations. For example, we use ‘kep-fits’ as the specifier for Kepler FITS light curves, which can be read by the astrobase.astrokep.read_kepler_fitslc function as specified by the <astrobase install path>/data/lcformats/kep-fits.json LC format specification JSON produced by register_lcformat.
- fileglob (str) – The default UNIX fileglob to use to search for light curve files in this LC format. This is a string like ‘-whatever-???-.*??-.lc’.
- timecols,magcols,errcols (list of str) –
These are all lists of strings indicating which keys in the lcdict produced by your readerfunc will be extracted and used by lcproc functions for processing. The lists must all have the same length, e.g. if timecols = [‘timecol1’, ‘timecol2’], then magcols must be something like [‘magcol1’, ‘magcol2’] and errcols must be something like [‘errcol1’, ‘errcol2’]. This allows you to process multiple apertures or multiple types of measurements in one go.
Each element in these lists can be a simple key, e.g. ‘time’ (which would correspond to lcdict[‘time’]), or a composite key, e.g. ‘aperture1.times.rjd’ (which would correspond to lcdict[‘aperture1’][‘times’][‘rjd’]). See the examples in the lcformat specification JSON files in <astrobase install path>/data/lcformats.
- readerfunc_module (str) –
This is either:
- a Python module import path, e.g. ‘astrobase.lcproc.catalogs’ or
- a path to a Python file, e.g. ‘/astrobase/hatsurveys/hatlc.py’
In either case, this module must contain the functions used to open (and optionally normalize) a custom LC format that’s not natively supported by astrobase.
- readerfunc (str) –
This is the function name in readerfunc_module to use to read light curves in the custom format. This MUST always return a dictionary (the ‘lcdict’) with the following signature (the keys listed below are required, but others are allowed):
{'objectid': this object's identifier as a string,
 'objectinfo': {'ra': this object's right ascension in decimal deg,
                'decl': this object's declination in decimal deg,
                'ndet': the number of observations in this LC,
                'objectid': the object ID again for legacy reasons},
 ...other time columns, mag columns go in as their own keys}
- readerfunc_kwargs (dict or None) – This is a dictionary containing any kwargs to pass through to the light curve reader function.
- normfunc_module (str or None) –
This is either:
- a Python module import path, e.g. ‘astrobase.lcproc.catalogs’ or
- a path to a Python file, e.g. ‘/astrobase/hatsurveys/hatlc.py’
- None, in which case we’ll use default normalization
If this is not None, the module must contain the functions used to normalize a custom LC format that’s not natively supported by astrobase.
- normfunc (str or None) –
This is the function name in normfunc_module to use to normalize light curves in the custom format. If None, the default normalization method used by lcproc is to find gaps in the time-series, normalize measurements grouped by these gaps to zero, then normalize the entire magnitude time series to global time series median using the astrobase.lcmath.normalize_magseries function.
If this is provided, the normalization function should take and return an lcdict of the same form as that produced by readerfunc above. For an example of a specific normalization function, see normalize_lcdict_by_inst in the astrobase.hatsurveys.hatlc module.
- normfunc_kwargs (dict or None) – This is a dictionary containing any kwargs to pass through to the light curve normalization function.
- magsarefluxes (bool) – If this is True, then all lcproc functions will treat the measurement columns in the lcdict produced by your readerfunc as flux instead of mags, so things like default normalization and sigma-clipping will be done correctly. If this is False, magnitudes will be treated as magnitudes.
- overwrite_existing (bool) – If this is True, this function will overwrite any existing LC format specification JSON with the same name as that provided in the formatkey arg. This can be used to update LC format specifications while keeping the formatkey the same.
- lcformat_dir (str) – This specifies the directory where the LC format specification JSON produced by this function will be written. By default, this goes to the .astrobase/lcformat-jsons directory in your home directory.
Returns: Returns the file path to the generated LC format specification JSON file.
Return type: str
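To make the readerfunc contract and the composite-key convention concrete, here is a minimal sketch under stated assumptions. The reader read_csv_lc, its three-column CSV layout, the placeholder object ID and coordinates, and the helper resolve_key are all hypothetical; a real reader would normally return Numpy arrays rather than lists.

```python
import csv
import os
import tempfile
from functools import reduce


def read_csv_lc(lcfile):
    """Hypothetical readerfunc: reads 'rjd,mag,err' rows into an lcdict."""
    rjd, mag, err = [], [], []
    with open(lcfile, newline="") as infd:
        for row in csv.DictReader(infd):
            rjd.append(float(row["rjd"]))
            mag.append(float(row["mag"]))
            err.append(float(row["err"]))
    objectid = "example-object"  # placeholder; normally parsed from the file
    return {
        "objectid": objectid,
        "objectinfo": {"ra": 0.0, "decl": 0.0,  # placeholder coordinates
                       "ndet": len(rjd), "objectid": objectid},
        # measurement columns live under their own (possibly nested) keys
        "aperture1": {"times": {"rjd": rjd}, "mags": mag, "errs": err},
    }


def resolve_key(lcdict, key):
    """Follow a composite key like 'aperture1.times.rjd' into the lcdict."""
    return reduce(lambda node, part: node[part], key.split("."), lcdict)


# write a tiny light curve file and read it back
with tempfile.TemporaryDirectory() as tmpdir:
    lcfile = os.path.join(tmpdir, "example.csv")
    with open(lcfile, "w") as outfd:
        outfd.write("rjd,mag,err\n55000.1,12.3,0.01\n55000.2,12.4,0.02\n")
    lcdict = read_csv_lc(lcfile)

times = resolve_key(lcdict, "aperture1.times.rjd")
```

For a reader like this one, the matching arguments would be timecols=['aperture1.times.rjd'], magcols=['aperture1.mags'], and errcols=['aperture1.errs'].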
-
astrobase.lcproc.
get_lcformat
(formatkey, use_lcformat_dir=None)[source]¶ This loads an LC format description from a previously-saved JSON file.
Parameters: - formatkey (str) – The key used to refer to the LC format. This is part of the JSON file’s name, e.g. the format key ‘hat-csv’ maps to the format JSON file: ‘<astrobase install path>/data/lcformats/hat-csv.json’.
- use_lcformat_dir (str or None) –
If provided, must be the path to a directory that contains the corresponding lcformat JSON file for formatkey. This function looks for the lcformat JSON file corresponding to the given formatkey in the following order:
- first, in the directory specified in this kwarg (if it is not None),
- if not found there, in the home directory: ~/.astrobase/lcformat-jsons
- if not found there, in: <astrobase install path>/data/lcformats
Returns: A tuple of the following form is returned:
(fileglob : the file glob of the associated LC files,
 readerfunc_in : the imported Python function for reading LCs,
 timecols : list of time col keys to get from the lcdict,
 magcols : list of mag col keys to get from the lcdict,
 errcols : list of err col keys to get from the lcdict,
 magsarefluxes : True if the measurements are fluxes not mags,
 normfunc_in : the imported Python function for normalizing LCs)
All astrobase.lcproc functions can then use this tuple to dynamically import your LC reader and normalization functions to work with your LC format transparently.
Return type: tuple
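The directory search order for the format JSON can be pictured with a short sketch. This is an illustration and not the astrobase code: find_lcformat_json is a hypothetical helper, and the caller supplies the candidate directories, since the install path depends on your environment.

```python
import tempfile
from pathlib import Path


def find_lcformat_json(formatkey, search_dirs):
    """Return the first '<formatkey>.json' found in search_dirs, else None."""
    for directory in search_dirs:
        if directory is None:
            continue  # e.g. use_lcformat_dir was not provided
        candidate = Path(directory).expanduser() / (formatkey + ".json")
        if candidate.is_file():
            return candidate
    return None


# two temporary directories stand in for the user and built-in locations
with tempfile.TemporaryDirectory() as user_dir, \
        tempfile.TemporaryDirectory() as builtin_dir:
    (Path(builtin_dir) / "hat-csv.json").write_text("{}")
    search_order = [None, user_dir, builtin_dir]
    found = find_lcformat_json("hat-csv", search_order)
    found_in_builtin = found is not None and found.parent == Path(builtin_dir)
    missing = find_lcformat_json("no-such-format", search_order)
```

Because the first match wins, a JSON placed in an earlier directory shadows one with the same formatkey in a later directory.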
Submodules¶
astrobase.lcproc.awsrun module¶
This contains lcproc worker loops useful for AWS processing of light curves.
The basic workflow is:
LCs from S3 -> SQS -> worker loop -> products back to S3 | result JSON to SQS
All functions here assume AWS credentials have been loaded already using awscli as described at:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html
General recommendations:
- use t3.medium or t3.micro instances for runcp_consumer_loop. Checkplot making isn’t really a CPU intensive activity, so using these will be cheaper.
- use c5.2xlarge or above instances for runpf_consumer_loop. Period-finders require a decent number of fast cores, so a spot fleet of these instances should be cost-effective.
- you may want a t3.micro instance running in the same region and VPC as your worker node instances to serve as a head node driving the producer_loop functions. This can be done from a machine outside AWS, but you’ll incur (probably tiny) charges for network egress from the output queues.
- It’s best not to download results from S3 as soon as they’re produced. Leave them on S3 until everything is done, then use rclone (https://rclone.org) to sync them back to your machines using --transfers <large number>.
The user_data and instance_user_data kwargs for the make_ec2_nodes and make_spot_fleet_cluster functions can be used to start processing loops as soon as EC2 brings up the VM instance. This is especially useful for spot fleets set to maintain a target capacity, since worker nodes will be terminated and automatically replaced. Bringing up the processing loop at instance start up makes it easy to continue processing light curves exactly where you left off without having to manually intervene.
Example script for user_data bringing up a checkplot-making loop on instance creation (assuming we’re using Amazon Linux 2):
#!/bin/bash
cat << 'EOF' > /home/ec2-user/launch-runcp.sh
#!/bin/bash
sudo yum -y install python3-devel gcc-gfortran jq htop emacs-nox git
# create the virtualenv
python3 -m venv /home/ec2-user/py3
# get astrobase
cd /home/ec2-user
git clone https://github.com/waqasbhatti/astrobase
# install it
cd /home/ec2-user/astrobase
/home/ec2-user/py3/bin/pip install pip setuptools numpy -U
/home/ec2-user/py3/bin/pip install -e .[aws]
# make the work dir
mkdir /home/ec2-user/work
cd /home/ec2-user/work
# wait a bit for the instance info to be populated
sleep 5
# set some environ vars for boto3 and the processing loop
export AWS_DEFAULT_REGION=`curl --silent http://169.254.169.254/latest/dynamic/instance-identity/document/ | jq '.region' | tr -d '"'`
export NCPUS=`lscpu -J | jq ".lscpu[3].data|tonumber"`
# launch the processor loops
for s in `seq $NCPUS`; do nohup /home/ec2-user/py3/bin/python3 -u -c "from astrobase.lcproc import awsrun as lcp; lcp.runcp_consumer_loop('https://queue-url','.','s3://path/to/lclist.pkl')" > runcp-$s-loop.out & done
EOF
# run the script we just created as ec2-user
chown ec2-user /home/ec2-user/launch-runcp.sh
su ec2-user -c 'bash /home/ec2-user/launch-runcp.sh'
Here’s a similar script for a runpf consumer loop. We launch only a single instance of the loop because runpf will use all CPUs by default for its period-finder parallelized functions:
#!/bin/bash
cat << 'EOF' > /home/ec2-user/launch-runpf.sh
#!/bin/bash
sudo yum -y install python3-devel gcc-gfortran jq htop emacs-nox git
python3 -m venv /home/ec2-user/py3
cd /home/ec2-user
git clone https://github.com/waqasbhatti/astrobase
cd /home/ec2-user/astrobase
/home/ec2-user/py3/bin/pip install pip setuptools numpy -U
/home/ec2-user/py3/bin/pip install -e .[aws]
mkdir /home/ec2-user/work
cd /home/ec2-user/work
# wait a bit for the instance info to be populated
sleep 5
export AWS_DEFAULT_REGION=`curl --silent http://169.254.169.254/latest/dynamic/instance-identity/document/ | jq '.region' | tr -d '"'`
export NCPUS=`lscpu -J | jq ".lscpu[3].data|tonumber"`
# launch the processes
nohup /home/ec2-user/py3/bin/python3 -u -c "from astrobase.lcproc import awsrun as lcp; lcp.runpf_consumer_loop('https://input-queue-url','.')" > runpf-loop.out &
EOF
chown ec2-user /home/ec2-user/launch-runpf.sh
su ec2-user -c 'bash /home/ec2-user/launch-runpf.sh'
-
astrobase.lcproc.awsrun.
kill_handler
(sig, frame)[source]¶ This raises a KeyboardInterrupt when a SIGKILL comes in.
This is a handler for use with the Python signal.signal function.
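A handler of this kind is attached with signal.signal and turns an incoming signal into an exception that a processing loop can catch. Note that SIGKILL itself cannot be trapped by any process, so the sketch below assumes the handler is attached to SIGTERM; which signals astrobase actually registers is not stated here. The wrapper run_with_handler is hypothetical, and signal.raise_signal requires Python 3.8 or later.

```python
import signal


def kill_handler(sig, frame):
    """Turn a termination signal into a KeyboardInterrupt."""
    raise KeyboardInterrupt


def run_with_handler(work):
    """Run work() and report whether a signal interrupted it."""
    # install the handler for SIGTERM and remember the previous one
    previous = signal.signal(signal.SIGTERM, kill_handler)
    try:
        work()
        return "finished"
    except KeyboardInterrupt:
        # this is where a worker loop would clean up and exit
        return "interrupted"
    finally:
        signal.signal(signal.SIGTERM, previous)


# simulate an external `kill <pid>` by raising SIGTERM in this process
status = run_with_handler(lambda: signal.raise_signal(signal.SIGTERM))
```

Handlers can only be installed from the main thread of the main interpreter, so this pattern belongs at the top of a worker script.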
-
astrobase.lcproc.awsrun.
cache_clean_handler
(min_age_hours=1)[source]¶ This periodically cleans up the ~/.astrobase cache to save us from disk-space doom.
Parameters: min_age_hours (int) – Files older than this number of hours from the current time will be deleted.
Returns: Nothing.
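An age-based cleanup of this kind can be sketched with the standard library. This is an assumption-laden illustration, not the astrobase implementation: clean_old_files is a hypothetical helper, and using each file's modification time as its age is an assumption.

```python
import os
import tempfile
import time


def clean_old_files(cachedir, min_age_hours=1):
    """Delete files under cachedir modified more than min_age_hours ago."""
    cutoff = time.time() - min_age_hours * 3600.0
    removed = []
    for root, _dirs, files in os.walk(os.path.expanduser(cachedir)):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed


# demonstrate on a throwaway directory instead of the real cache
with tempfile.TemporaryDirectory() as cachedir:
    old_file = os.path.join(cachedir, "old.fits.gz")
    new_file = os.path.join(cachedir, "new.fits.gz")
    for path in (old_file, new_file):
        open(path, "w").close()
    # backdate one file so it looks two hours old
    two_hours_ago = time.time() - 2 * 3600.0
    os.utime(old_file, (two_hours_ago, two_hours_ago))
    removed = [os.path.basename(p) for p in clean_old_files(cachedir)]
    remaining = sorted(os.listdir(cachedir))
```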
-
astrobase.lcproc.awsrun.
shutdown_check_handler
()[source]¶ This checks the AWS instance data URL to see if there’s a pending shutdown for the instance.
This is useful for AWS spot instances. If there is a pending shutdown posted to the instance data URL, we’ll use the result of this function to break out of the processing loop and shut everything down ASAP before the instance dies.
Returns: - True if the instance is going to die soon.
- False if the instance is still safe.
Return type: bool
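A check like this can be sketched with urllib. AWS publishes a pending spot interruption at the spot/instance-action path of the instance metadata service, which returns a 404 while nothing is scheduled; whether astrobase polls exactly this URL is an assumption. The helper shutdown_pending and its injectable opener argument are hypothetical, and instances that enforce IMDSv2 would additionally need a session token header, which this sketch omits.

```python
import urllib.error
import urllib.request

# assumed URL for the spot interruption notice
SPOT_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def shutdown_pending(url=SPOT_ACTION_URL, timeout=0.5,
                     opener=urllib.request.urlopen):
    """Return True if the metadata service reports a pending interruption."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # covers HTTP errors such as 404, timeouts, and unreachable hosts
        return False


class _FakeResponse:
    """Stand-in for an HTTP 200 response, used only in this demo."""
    status = 200

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False


def _pending(url, timeout):
    return _FakeResponse()


def _not_pending(url, timeout):
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)


# stub openers make the demo runnable away from EC2
dying_soon = shutdown_pending(opener=_pending)
still_safe = not shutdown_pending(opener=_not_pending)
```

Inside a worker loop, a True result would be the cue to stop taking new work items and exit cleanly.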
-
astrobase.lcproc.awsrun.
runcp_producer_loop
(lightcurve_list, input_queue, input_bucket, result_queue, result_bucket, pfresult_list=None, runcp_kwargs=None, process_list_slice=None, purge_queues_when_done=False, delete_queues_when_done=False, download_when_done=True, save_state_when_done=True, s3_client=None, sqs_client=None)[source]¶ This sends checkplot making tasks to the input queue and monitors the result queue for task completion.
Parameters: - lightcurve_list (str or list of str) – This is either a string pointing to a file containing a list of light curve filenames to process or the list itself. The names must correspond to the full filenames of files stored on S3, including all prefixes, but not include the ‘s3://<bucket name>/’ bit (these will be added automatically).
- input_queue (str) – This is the name of the SQS queue which will receive processing tasks generated by this function. The queue URL will automatically be obtained from AWS.
- input_bucket (str) – The name of the S3 bucket containing the light curve files to process.
- result_queue (str) – This is the name of the SQS queue that this function will listen to for messages from the workers as they complete processing on their input elements. This function will attempt to match input sent to the input_queue with results coming into the result_queue so it knows how many objects have been successfully processed. If this function receives task results that aren’t in its own input queue, it will acknowledge them so they complete successfully, but not download them automatically. This handles leftover tasks completing from a previous run of this function.
- result_bucket (str) – The name of the S3 bucket which will receive the results from the workers.
- pfresult_list (list of str or None) – This is a list of periodfinder result pickle S3 URLs associated with each light curve. If provided, this will be used to add in phased light curve plots to each checkplot pickle. If this is None, the worker loop will produce checkplot pickles that only contain object information, neighbor information, and unphased light curves.
- runcp_kwargs (dict) – This is a dict used to pass any extra keyword arguments to the lcproc.checkplotgen.runcp function that will be run by the worker loop.
- process_list_slice (list) –
This is used to index into the input light curve list so a subset of the full list can be processed in this specific run of this function.
Use None for a slice index elem to emulate single slice spec behavior:
process_list_slice = [10, None] -> lightcurve_list[10:]
process_list_slice = [None, 500] -> lightcurve_list[:500]
- purge_queues_when_done (bool) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding elements in the input/output queues that have not yet been acknowledged by workers or by this function will be purged. This effectively cancels all outstanding work.
- delete_queues_when_done (bool) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding work items will be purged from the input/output queues and the queues themselves will be deleted.
- download_when_done (bool) – If this is True, the generated checkplot pickle for each input work item will be downloaded immediately to the current working directory when the worker functions report they’re done with it.
- save_state_when_done (bool) – If this is True, will save the current state of the work item queue and the work items acknowledged as completed to a pickle in the current working directory. Call the runcp_producer_loop_savedstate function below to resume processing from this saved state later.
- s3_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its S3 download operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- sqs_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its SQS operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Returns the current work state as a dict or str path to the generated work state pickle depending on if save_state_when_done is True.
Return type: dict or str
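The process_list_slice convention described above maps directly onto Python slicing. A minimal sketch, assuming a two-element [start, stop] list; apply_list_slice is a hypothetical helper name, not an astrobase function:

```python
def apply_list_slice(lightcurve_list, process_list_slice=None):
    """Return the subset of lightcurve_list picked by a [start, stop] pair.

    A None element leaves that end of the slice open, so [10, None] behaves
    like lightcurve_list[10:] and [None, 500] like lightcurve_list[:500].
    Passing None for the whole argument selects the full list.
    """
    if process_list_slice is None:
        return list(lightcurve_list)
    start, stop = process_list_slice
    return list(lightcurve_list[start:stop])


# made-up filenames, for illustration only
example_lcs = ['lc-%03d.sqlite.gz' % i for i in range(20)]
tail = apply_list_slice(example_lcs, [10, None])
head = apply_list_slice(example_lcs, [None, 5])
```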
-
astrobase.lcproc.awsrun.
runcp_producer_loop_savedstate
(use_saved_state=None, lightcurve_list=None, input_queue=None, input_bucket=None, result_queue=None, result_bucket=None, pfresult_list=None, runcp_kwargs=None, process_list_slice=None, download_when_done=True, purge_queues_when_done=True, save_state_when_done=True, delete_queues_when_done=False, s3_client=None, sqs_client=None)[source]¶ This wraps the function above to allow for loading previous state from a file.
Parameters: - use_saved_state (str or None) – This is the path to the saved state pickle file produced by a previous run of runcp_producer_loop. Will get all of the arguments to run another instance of the loop from that pickle file. If this is None, you MUST provide all of the appropriate arguments to that function.
- lightcurve_list (str or list of str or None) – This is either a string pointing to a file containing a list of light curve filenames to process or the list itself. The names must correspond to the full filenames of files stored on S3, including all prefixes, but must not include the ‘s3://<bucket name>/’ prefix (this will be added automatically).
- input_queue (str or None) – This is the name of the SQS queue which will receive processing tasks generated by this function. The queue URL will automatically be obtained from AWS.
- input_bucket (str or None) – The name of the S3 bucket containing the light curve files to process.
- result_queue (str or None) – This is the name of the SQS queue that this function will listen to for messages from the workers as they complete processing on their input elements. This function will attempt to match input sent to the input_queue with results coming into the result_queue so it knows how many objects have been successfully processed. If this function receives task results that aren’t in its own input queue, it will acknowledge them so they complete successfully, but not download them automatically. This handles leftover tasks completing from a previous run of this function.
- result_bucket (str or None) – The name of the S3 bucket which will receive the results from the workers.
- pfresult_list (list of str or None) – This is a list of periodfinder result pickle S3 URLs associated with each light curve. If provided, this will be used to add in phased light curve plots to each checkplot pickle. If this is None, the worker loop will produce checkplot pickles that only contain object information, neighbor information, and unphased light curves.
- runcp_kwargs (dict or None) – This is a dict used to pass any extra keyword arguments to the lcproc.checkplotgen.runcp function that will be run by the worker loop.
- process_list_slice (list or None) –
This is used to index into the input light curve list so a subset of the full list can be processed in this specific run of this function.
Use None for a slice index elem to emulate single slice spec behavior:
process_list_slice = [10, None] -> lightcurve_list[10:]
process_list_slice = [None, 500] -> lightcurve_list[:500]
- purge_queues_when_done (bool or None) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding elements in the input/output queues that have not yet been acknowledged by workers or by this function will be purged. This effectively cancels all outstanding work.
- delete_queues_when_done (bool or None) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding work items will be purged from the input/output queues and the queues themselves will be deleted.
- download_when_done (bool or None) – If this is True, the generated checkplot pickle for each input work item will be downloaded immediately to the current working directory when the worker functions report they’re done with it.
- save_state_when_done (bool or None) – If this is True, will save the current state of the work item queue and the work items acknowledged as completed to a pickle in the current working directory. Call the runcp_producer_loop_savedstate function below to resume processing from this saved state later.
- s3_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its S3 download operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- sqs_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its SQS operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Returns the current work state as a dict or str path to the generated work state pickle depending on if save_state_when_done is True.
Return type: dict or str
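The rule for use_saved_state (load everything from the pickle, otherwise require the arguments explicitly) can be sketched as below. This is only an illustration: resolve_loop_kwargs is a hypothetical helper, the state pickle is assumed to be a plain dict of keyword arguments (the real pickle layout may differ), and treating exactly these five arguments as required is also an assumption.

```python
import os
import pickle
import tempfile

# Assumed set of arguments that must be given when there is no saved state.
REQUIRED_ARGS = ('lightcurve_list', 'input_queue', 'input_bucket',
                 'result_queue', 'result_bucket')


def resolve_loop_kwargs(use_saved_state=None, **kwargs):
    """Return loop kwargs from a saved-state pickle or from explicit args.

    Hypothetical helper: assumes the pickle holds a dict of keyword
    arguments, which may not match the actual astrobase state pickle.
    """
    if use_saved_state is not None:
        with open(use_saved_state, 'rb') as infd:
            return dict(pickle.load(infd))

    missing = [name for name in REQUIRED_ARGS if kwargs.get(name) is None]
    if missing:
        raise ValueError('missing required arguments: %s' % ', '.join(missing))
    return kwargs


# Round-trip a made-up state dict through a temporary pickle file.
state = {name: 'example-%s' % name for name in REQUIRED_ARGS}
state_path = os.path.join(tempfile.mkdtemp(), 'example-state.pkl')
with open(state_path, 'wb') as outfd:
    pickle.dump(state, outfd)

restored = resolve_loop_kwargs(use_saved_state=state_path)
```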
-
astrobase.lcproc.awsrun.
runcp_consumer_loop
(in_queue_url, workdir, lclist_pkl_s3url, lc_altexts=('', ), wait_time_seconds=5, cache_clean_timer_seconds=3600.0, shutdown_check_timer_seconds=60.0, sqs_client=None, s3_client=None)[source]¶ This runs checkplot pickle making in a loop until interrupted.
Consumes work task items from an input queue set up by runcp_producer_loop above. For the moment, we don’t generate neighbor light curves since this would require a lot more S3 calls.
Parameters: - in_queue_url (str) – The SQS URL of the input queue to listen to for work assignment messages. The task orders will include the input and output S3 bucket names, as well as the URL of the output queue to which this function will report its work-complete or work-failed status.
- workdir (str) – The directory on the local machine where this worker loop will download the input light curves and associated period-finder results (if any), process them, and produce its output checkplot pickles. These will then be uploaded to the specified S3 output bucket and then deleted from the workdir when the upload is confirmed to make it safely to S3.
- lclist_pkl_s3url (str) – S3 URL of a catalog pickle generated by lcproc.catalogs.make_lclist that contains objectids and coordinates, as well as a kdtree for all of the objects in the current light curve collection being processed. This is used to look up neighbors for each object being processed.
- lc_altexts (sequence of str) – If not None, this is a sequence of alternate extensions to try for the input light curve file other than the one provided in the input task order. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use lc_altexts=[''] to strip the .gz.
- wait_time_seconds (int) – The amount of time to wait in the input SQS queue for an input task order. If this timeout expires and no task has been received, this function goes back to the top of the work loop.
- cache_clean_timer_seconds (float) – The amount of time in seconds to wait before periodically removing old files (such as finder chart FITS, external service result pickles) from the astrobase cache directory. These accumulate as the work items are processed, and take up significant space, so must be removed periodically.
- shutdown_check_timer_seconds (float) – The amount of time to wait before checking for a pending EC2 shutdown message for the instance this worker loop is operating on. If a shutdown is noticed, the worker loop is cancelled in preparation for instance shutdown.
- sqs_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its SQS operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- s3_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its S3 operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Nothing.
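One plausible reading of lc_altexts is that each alternate extension replaces the final extension of the requested filename, so that '' turns .sqlite.gz into .sqlite. The sketch below lists the filenames that would be tried under that assumption; candidate_filenames is a hypothetical helper, and the actual lookup logic in astrobase may differ.

```python
import os.path


def candidate_filenames(filename, lc_altexts=('',)):
    """List filename, then variants with its last extension swapped out.

    Assumes each entry in lc_altexts replaces the final extension, e.g.
    '' strips a trailing '.gz'. This mirrors the documented example only.
    """
    candidates = [filename]
    if lc_altexts is not None:
        base, _last_ext = os.path.splitext(filename)
        for altext in lc_altexts:
            alternate = base + altext
            if alternate not in candidates:
                candidates.append(alternate)
    return candidates


# made-up filename, for illustration only
to_try = candidate_filenames('HAT-123.sqlite.gz')
```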
-
astrobase.lcproc.awsrun.
runpf_producer_loop
(lightcurve_list, input_queue, input_bucket, result_queue, result_bucket, pfmethods=('gls', 'pdm', 'mav', 'bls', 'win'), pfkwargs=({}, {}, {}, {}, {}), extra_runpf_kwargs={'getblssnr': True}, process_list_slice=None, purge_queues_when_done=False, delete_queues_when_done=False, download_when_done=True, save_state_when_done=True, s3_client=None, sqs_client=None)[source]¶ This queues up work for period-finders using SQS.
Parameters: - lightcurve_list (str or list of str) – This is either a string pointing to a file containing a list of light curve filenames to process or the list itself. The names must correspond to the full filenames of files stored on S3, including all prefixes, but must not include the ‘s3://<bucket name>/’ prefix (this will be added automatically).
- input_queue (str) – This is the name of the SQS queue which will receive processing tasks generated by this function. The queue URL will automatically be obtained from AWS.
- input_bucket (str) – The name of the S3 bucket containing the light curve files to process.
- result_queue (str) – This is the name of the SQS queue that this function will listen to for messages from the workers as they complete processing on their input elements. This function will attempt to match input sent to the input_queue with results coming into the result_queue so it knows how many objects have been successfully processed. If this function receives task results that aren’t in its own input queue, it will acknowledge them so they complete successfully, but not download them automatically. This handles leftover tasks completing from a previous run of this function.
- result_bucket (str) – The name of the S3 bucket which will receive the results from the workers.
- pfmethods (sequence of str) – This is a list of period-finder method short names as listed in the lcproc.periodfinding.PFMETHODS dict. This is used to tell the worker loop which period-finders to run on the input light curve.
- pfkwargs (sequence of dicts) – This contains optional kwargs as dicts to be supplied to all of the period-finder functions listed in pfmethods. This should be the same length as that sequence.
- extra_runpf_kwargs (dict) – This is a dict of kwargs to be supplied to runpf driver function itself.
- process_list_slice (list) –
This is used to index into the input light curve list so a subset of the full list can be processed in this specific run of this function.
Use None for a slice index elem to emulate single slice spec behavior:
process_list_slice = [10, None] -> lightcurve_list[10:]
process_list_slice = [None, 500] -> lightcurve_list[:500]
- purge_queues_when_done (bool) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding elements in the input/output queues that have not yet been acknowledged by workers or by this function will be purged. This effectively cancels all outstanding work.
- delete_queues_when_done (bool) – If this is True, and this function exits (either when all done, or when it is interrupted with a Ctrl+C), all outstanding work items will be purged from the input/output queues and the queues themselves will be deleted.
- download_when_done (bool) – If this is True, the generated periodfinding result pickle for each input work item will be downloaded immediately to the current working directory when the worker functions report they’re done with it.
- save_state_when_done (bool) – If this is True, will save the current state of the work item queue and the work items acknowledged as completed to a pickle in the current working directory. Call the runcp_producer_loop_savedstate function below to resume processing from this saved state later.
- s3_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its S3 download operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- sqs_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its SQS operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Returns the current work state as a dict or str path to the generated work state pickle depending on if save_state_when_done is True.
Return type: dict or str
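Because pfkwargs must line up one-to-one with pfmethods, it can help to validate the pairing before queueing any work. A minimal sketch; pair_pfmethods_with_kwargs is a hypothetical helper, and the 'startp' key in the example is an illustrative assumption about what a period-finder might accept.

```python
def pair_pfmethods_with_kwargs(pfmethods, pfkwargs):
    """Pair each period-finder short name with its own kwargs dict.

    Raises ValueError if the two sequences differ in length, since the
    documentation requires pfkwargs to match pfmethods element for element.
    """
    if len(pfmethods) != len(pfkwargs):
        raise ValueError(
            'pfmethods has %d entries but pfkwargs has %d'
            % (len(pfmethods), len(pfkwargs))
        )
    # copy each dict so later edits don't leak back into the caller's input
    return [(method, dict(kwargs))
            for method, kwargs in zip(pfmethods, pfkwargs)]


# 'startp' is an assumed keyword name used purely for illustration
pairs = pair_pfmethods_with_kwargs(('gls', 'bls'), ({}, {'startp': 1.0}))
```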
-
astrobase.lcproc.awsrun.
runpf_consumer_loop
(in_queue_url, workdir, lc_altexts=('', ), wait_time_seconds=5, shutdown_check_timer_seconds=60.0, sqs_client=None, s3_client=None)[source]¶ This runs period-finding in a loop until interrupted.
Consumes work task items from an input queue set up by runpf_producer_loop above.
Parameters: - in_queue_url (str) – The SQS URL of the input queue to listen to for work assignment messages. The task orders will include the input and output S3 bucket names, as well as the URL of the output queue to which this function will report its work-complete or work-failed status.
- workdir (str) – The directory on the local machine where this worker loop will download the input light curves, process them, and produce its output periodfinding result pickles. These will then be uploaded to the specified S3 output bucket, and then deleted from the local disk.
- lc_altexts (sequence of str) – If not None, this is a sequence of alternate extensions to try for the input light curve file other than the one provided in the input task order. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use lc_altexts=[''] to strip the .gz.
- wait_time_seconds (int) – The amount of time to wait in the input SQS queue for an input task order. If this timeout expires and no task has been received, this function goes back to the top of the work loop.
- shutdown_check_timer_seconds (float) – The amount of time to wait before checking for a pending EC2 shutdown message for the instance this worker loop is operating on. If a shutdown is noticed, the worker loop is cancelled in preparation for instance shutdown.
- sqs_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its SQS operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- s3_client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its S3 operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Nothing.
astrobase.lcproc.catalogs module¶
This contains functions to generate light curve catalogs from collections of light curves.
-
astrobase.lcproc.catalogs.
make_lclist
(basedir, outfile, use_list_of_filenames=None, lcformat='hat-sql', lcformatdir=None, fileglob=None, recursive=True, columns=('objectid', 'objectinfo.ra', 'objectinfo.decl', 'objectinfo.ndet'), makecoordindex=('objectinfo.ra', 'objectinfo.decl'), field_fitsfile=None, field_wcsfrom=None, field_scale=<astropy.visualization.interval.ZScaleInterval object>, field_stretch=<astropy.visualization.stretch.LinearStretch object>, field_colormap=<matplotlib.colors.LinearSegmentedColormap object>, field_findersize=None, field_pltopts={'marker': 'o', 'markeredgecolor': 'red', 'markeredgewidth': 2.0, 'markerfacecolor': 'none', 'markersize': 10.0}, field_grid=False, field_gridcolor='k', field_zoomcontain=True, maxlcs=None, nworkers=2)[source]¶ This generates a light curve catalog for all light curves in a directory.
Given a base directory where all the files are, and a light curve format, this will find all light curves, pull out the keys in each lcdict requested in the columns kwarg for each object, and write them to the requested output pickle file. These keys should be pointers to scalar values (i.e. something like objectinfo.ra is OK, but something like ‘times’ won’t work because it’s a vector).
Generally, this works with light curve reading functions that produce lcdicts as detailed in the docstring for lcproc.register_lcformat. Once you’ve registered your light curve reader functions using the lcproc.register_lcformat function, pass in the formatkey associated with your light curve format, and this function will be able to read all light curves in that format as well as the object information stored in their objectinfo dict.
Parameters: - basedir (str or list of str) –
If this is a str, points to a single directory to search for light curves. If this is a list of str, it must be a list of directories to search for light curves. All of these will be searched to find light curve files matching either your light curve format’s default fileglob (when you registered your LC format), or a specific fileglob that you can pass in using the fileglob kwarg here. If the recursive kwarg is set, the provided directories will be searched recursively.
If use_list_of_filenames is not None, it will override this argument and the function will take those light curves as the list of files it must process instead of whatever is specified in basedir.
- outfile (str) – This is the name of the output file to write. This will be a pickle file, so a good convention to use for this name is something like ‘my-lightcurve-catalog.pkl’.
- use_list_of_filenames (list of str or None) – Use this kwarg to override whatever is provided in basedir and directly pass in a list of light curve files to process. This can speed up this function by a lot because no searches on disk will be performed to find light curve files matching basedir and fileglob.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- fileglob (str or None) – If provided, is a string that is a valid UNIX filename glob. Used to override the default fileglob for this LC format when searching for light curve files in basedir.
- recursive (bool) – If True, the directories specified in basedir will be searched recursively for all light curve files that match the default fileglob for this LC format or a specific one provided in fileglob.
- columns (list of str) –
This is a list of keys in the lcdict produced by your light curve reader function that contain object information, which will be extracted and put into the output light curve catalog. It’s highly recommended that your LC reader function produce a lcdict that contains at least the default keys shown here.
The lcdict keys to extract are specified by using an address scheme:
- First level dict keys can be specified directly: e.g., ‘objectid’ will extract lcdict[‘objectid’]
- Keys at other levels can be specified by using a period to indicate
the level:
- e.g., ‘objectinfo.ra’ will extract lcdict[‘objectinfo’][‘ra’]
- e.g., ‘objectinfo.varinfo.features.stetsonj’ will extract lcdict[‘objectinfo’][‘varinfo’][‘features’][‘stetsonj’]
- makecoordindex (list of two str or None) – This is used to specify which lcdict keys contain the right ascension and declination coordinates for this object. If these are provided, the output light curve catalog will have a kdtree built on all object coordinates, which enables fast spatial searches and cross-matching to external catalogs by checkplot and lcproc functions.
- field_fitsfile (str or None) – If this is not None, it should be the path to a FITS image containing the objects these light curves are for. If this is provided, make_lclist will use the WCS information in the FITS itself if field_wcsfrom is None (or from a WCS header file pointed to by field_wcsfrom) to obtain x and y pixel coordinates for all of the objects in the field. A finder chart will also be made using astrobase.plotbase.fits_finder_chart using the corresponding field_scale, _stretch, _colormap, _findersize, _pltopts, _grid, and _gridcolor kwargs for that function, reproduced here to enable customization of the finder chart plot.
- field_wcsfrom (str or None) – If wcsfrom is None, the WCS to transform the RA/Dec to pixel x/y will be taken from the FITS header of fitsfile. If this is not None, it must be a FITS or similar file that contains a WCS header in its first extension.
- field_scale (astropy.visualization.Interval object) – scale sets the normalization for the FITS pixel values. This is an astropy.visualization Interval object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- field_stretch (astropy.visualization.Stretch object) – stretch sets the stretch function for mapping FITS pixel values to output pixel values. This is an astropy.visualization Stretch object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- field_colormap (matplotlib Colormap object) – colormap is a matplotlib color map object to use for the output image.
- field_findersize (None or tuple of two ints) – If findersize is None, the output image size will be set by the NAXIS1 and NAXIS2 keywords in the input fitsfile FITS header. Otherwise, findersize must be a tuple with the intended x and y size of the image in inches (all output images will use a DPI = 100).
- field_pltopts (dict) – field_pltopts controls how the overlay points will be plotted. This is a dict with standard matplotlib marker, etc. kwargs as key-val pairs, e.g. ‘markersize’, ‘markerfacecolor’, etc. The default options make red outline circles at the location of each object in the overlay.
- field_grid (bool) – grid sets if a grid will be made on the output image.
- field_gridcolor (str) – gridcolor sets the color of the grid lines. This is a usual matplotlib color spec string.
- field_zoomcontain (bool) – field_zoomcontain controls if the finder chart will be zoomed to just contain the overlaid points. Everything outside the footprint of these points will be discarded.
- maxlcs (int or None) – This sets how many light curves to process in the input LC list generated by searching for LCs in basedir or in the list provided as use_list_of_filenames.
- nworkers (int) – This sets the number of parallel workers to launch to collect information from the light curves.
Returns: Returns the path to the generated light curve catalog pickle file.
Return type: str
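The dotted address scheme used by the columns kwarg can be sketched as a walk through nested dicts. get_lcdict_value is a hypothetical helper and the lcdict below holds made-up values; this is not necessarily how make_lclist resolves the keys internally.

```python
def get_lcdict_value(lcdict, dotted_key):
    """Resolve a key like 'objectinfo.ra' to lcdict['objectinfo']['ra'].

    Each period in dotted_key descends one level into the nested dict.
    A KeyError is raised if any level is missing.
    """
    value = lcdict
    for key in dotted_key.split('.'):
        value = value[key]
    return value


# made-up lcdict, for illustration only
example_lcdict = {
    'objectid': 'EXAMPLE-0001',
    'objectinfo': {
        'ra': 290.5,
        'decl': 45.25,
        'ndet': 1200,
        'varinfo': {'features': {'stetsonj': 0.5}},
    },
}

columns = ('objectid', 'objectinfo.ra', 'objectinfo.decl', 'objectinfo.ndet')
row = [get_lcdict_value(example_lcdict, col) for col in columns]
```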
-
astrobase.lcproc.catalogs.
filter_lclist
(lc_catalog, objectidcol='objectid', racol='ra', declcol='decl', xmatchexternal=None, xmatchdistarcsec=3.0, externalcolnums=(0, 1, 2), externalcolnames=('objectid', 'ra', 'decl'), externalcoldtypes='U20, f8, f8', externalcolsep=None, externalcommentchar='#', conesearch=None, conesearchworkers=1, columnfilters=None, field_fitsfile=None, field_wcsfrom=None, field_scale=<astropy.visualization.interval.ZScaleInterval object>, field_stretch=<astropy.visualization.stretch.LinearStretch object>, field_colormap=<matplotlib.colors.LinearSegmentedColormap object>, field_findersize=None, field_pltopts={'marker': 'o', 'markeredgecolor': 'red', 'markeredgewidth': 2.0, 'markerfacecolor': 'none', 'markersize': 10.0}, field_grid=False, field_gridcolor='k', field_zoomcontain=True, copylcsto=None)[source]¶ This is used to perform cone-search, cross-match, and column-filter operations on a light curve catalog generated by make_lclist.
Uses the output of make_lclist above. This function returns a list of light curves matching various criteria specified by the xmatchexternal, conesearch, and columnfilters kwargs. Use this function to generate input lists for other lcproc functions, e.g. lcproc.lcvfeatures.parallel_varfeatures, lcproc.periodfinding.parallel_pf, and lcproc.lcbin.parallel_timebin, among others.
The operations are applied in this order if more than one is specified: xmatchexternal -> conesearch -> columnfilters. All results from these operations are joined using a logical AND operation.
Parameters: - objectidcol (str) – This is the name of the object ID column in the light curve catalog.
- racol (str) – This is the name of the RA column in the light curve catalog.
- declcol (str) – This is the name of the Dec column in the light curve catalog.
- xmatchexternal (str or None) – If provided, this is the filename of a text file containing objectids, RAs, and Decs. Objects in the light curve catalog will be cross-matched against the entries in this file by position.
- xmatchdistarcsec (float) – This is the distance in arcseconds to use when cross-matching to the external catalog in xmatchexternal.
- externalcolnums (sequence of int) – This is a list of the zero-indexed column numbers of columns to extract from the external catalog file.
- externalcolnames (sequence of str) – This is a list of names of columns that will be extracted from the external catalog file. This is the same length as externalcolnums. These must contain the names provided as the objectid, ra, and decl column names so this function knows which column numbers correspond to those columns and can use them to set up the cross-match.
- externalcoldtypes (str) – This is a CSV string containing numpy dtype definitions for all columns listed to extract from the external catalog file. The number of dtype definitions should be equal to the number of columns to extract.
- externalcolsep (str or None) – The column separator to use when extracting columns from the external catalog file. If None, any whitespace between columns is used as the separator.
- externalcommentchar (str) – The character indicating that a line in the external catalog file is to be ignored.
- conesearch (list of float) –
This is used to specify cone-search parameters. It should be a three element list:
[center_ra_deg, center_decl_deg, search_radius_deg]
- conesearchworkers (int) – The number of parallel workers to launch for the cone-search operation.
- columnfilters (list of str) –
This is a list of strings indicating any filters to apply on each column in the light curve catalog. All column filters are applied in the specified sequence and are combined with a logical AND operator. The format of each filter string should be:
’<lc_catalog column>|<operator>|<operand>’
where:
- <lc_catalog column> is a column in the lc_catalog pickle file
- <operator> is one of: ‘lt’, ‘gt’, ‘le’, ‘ge’, ‘eq’, ‘ne’, which correspond to the usual operators: <, >, <=, >=, ==, != respectively.
- <operand> is a float, int, or string.
- field_fitsfile (str or None) – If this is not None, it should be the path to a FITS image containing the objects these light curves are for. If this is provided, make_lclist will use the WCS information in the FITS itself if field_wcsfrom is None (or from a WCS header file pointed to by field_wcsfrom) to obtain x and y pixel coordinates for all of the objects in the field. A finder chart will also be made using astrobase.plotbase.fits_finder_chart using the corresponding field_scale, _stretch, _colormap, _findersize, _pltopts, _grid, and _gridcolor kwargs for that function, reproduced here to enable customization of the finder chart plot.
- field_wcsfrom (str or None) – If wcsfrom is None, the WCS to transform the RA/Dec to pixel x/y will be taken from the FITS header of fitsfile. If this is not None, it must be a FITS or similar file that contains a WCS header in its first extension.
- field_scale (astropy.visualization.Interval object) – scale sets the normalization for the FITS pixel values. This is an astropy.visualization Interval object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- field_stretch (astropy.visualization.Stretch object) – stretch sets the stretch function for mapping FITS pixel values to output pixel values. This is an astropy.visualization Stretch object. See http://docs.astropy.org/en/stable/visualization/normalization.html for details on scale and stretch objects.
- field_colormap (matplotlib Colormap object) – colormap is a matplotlib color map object to use for the output image.
- field_findersize (None or tuple of two ints) – If findersize is None, the output image size will be set by the NAXIS1 and NAXIS2 keywords in the input fitsfile FITS header. Otherwise, findersize must be a tuple with the intended x and y size of the image in inches (all output images will use a DPI = 100).
- field_pltopts (dict) – field_pltopts controls how the overlay points will be plotted. This is a dict with standard matplotlib marker, etc. kwargs as key-val pairs, e.g. ‘markersize’, ‘markerfacecolor’, etc. The default options make red outline circles at the location of each object in the overlay.
- field_grid (bool) – grid sets if a grid will be made on the output image.
- field_gridcolor (str) – gridcolor sets the color of the grid lines. This is a usual matplotlib color spec string.
- field_zoomcontain (bool) – field_zoomcontain controls if the finder chart will be zoomed to just contain the overlaid points. Everything outside the footprint of these points will be discarded.
- copylcsto (str) – If this is provided, it is interpreted as a directory target to copy all the light curves that match the specified conditions.
Returns: Returns a two elem tuple: (matching_object_lcfiles, matching_objectids) if conesearch and/or column filters are used. If xmatchexternal is also used, a three-elem tuple is returned: (matching_object_lcfiles, matching_objectids, extcat_matched_objectids).
Return type: tuple
-
astrobase.lcproc.catalogs.
add_cpinfo_to_lclist
(checkplots, initial_lc_catalog, magcol, outfile, checkplotglob='checkplot*.pkl*', infokeys=[('comments', <class 'numpy.str_'>, False, True, '', ''), ('objectinfo.objecttags', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.twomassid', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.bmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.vmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.rmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.imag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.jmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.hmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.kmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.sdssu', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.sdssg', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.sdssr', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.sdssi', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.sdssz', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_bmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_vmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_rmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_imag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_jmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_hmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_kmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_sdssu', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_sdssg', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_sdssr', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_sdssi', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.dered_sdssz', <class 'numpy.float64'>, True, True, nan, 
nan), ('objectinfo.extinction_bmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_vmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_rmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_imag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_jmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_hmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_kmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_sdssu', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_sdssg', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_sdssr', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_sdssi', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.extinction_sdssz', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.color_classes', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.pmra', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.pmdecl', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.propermotion', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.rpmj', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gl', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gb', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gaia_status', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.gaia_ids.0', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.gaiamag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gaia_parallax', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gaia_parallax_err', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.gaia_absmag', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.simbad_best_mainid', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.simbad_best_objtype', 
<class 'numpy.str_'>, True, True, '', ''), ('objectinfo.simbad_best_allids', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.simbad_best_distarcsec', <class 'numpy.float64'>, True, True, nan, nan), ('objectinfo.ticid', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.tic_version', <class 'numpy.str_'>, True, True, '', ''), ('objectinfo.tessmag', <class 'numpy.float64'>, True, True, nan, nan), ('varinfo.vartags', <class 'numpy.str_'>, False, True, '', ''), ('varinfo.varperiod', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.varepoch', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.varisperiodic', <class 'numpy.int64'>, False, True, 0, 0), ('varinfo.objectisvar', <class 'numpy.int64'>, False, True, 0, 0), ('varinfo.features.median', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.mad', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.stdev', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.mag_iqr', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.skew', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.kurtosis', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.stetsonj', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.stetsonk', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.eta_normal', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.linear_fit_slope', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.magnitude_ratio', <class 'numpy.float64'>, False, True, nan, nan), ('varinfo.features.beyond1std', <class 'numpy.float64'>, False, True, nan, nan)], nworkers=2)[source]¶ This adds checkplot info to the initial light curve catalogs generated by make_lclist.
This is used to incorporate all the extra info checkplots can have for objects back into columns in the light curve catalog produced by make_lclist. Objects are matched between the checkplots and the light curve catalog using their objectid. This then allows one to search this ‘augmented’ light curve catalog by these extra columns. The ‘augmented’ light curve catalog also forms the basis for the search interface provided by the LCC-Server.
The default list of keys that will be extracted from a checkplot and added as columns in the initial light curve catalog is listed above in the CPINFO_DEFAULTKEYS list.
Parameters: - checkplots (str or list) – If this is a str, it is interpreted as a directory which will be searched for checkplot pickle files using checkplotglob. If this is a list, it will be interpreted as a list of checkplot pickle files to process.
- initial_lc_catalog (str) – This is the path to the light curve catalog pickle made by make_lclist.
- magcol (str) – This is used to indicate the light curve magnitude column to extract magnitude column specific information. For example, Stetson variability indices can be generated using magnitude measurements in separate photometric apertures, which appear in separate magcols in the checkplot. To associate each such feature of the object with its specific magcol, pass that magcol in here. This magcol will then be added as a prefix to the resulting column in the ‘augmented’ LC catalog, e.g. Stetson J will appear as magcol1_stetsonj and magcol2_stetsonj for two separate magcols.
- outfile (str) – This is the file name of the output ‘augmented’ light curve catalog pickle file that will be written.
- infokeys (list of tuples) –
This is a list of keys to extract from the checkplot and some info on how this extraction is to be done. Each key entry is a six-element tuple of the following form:
- key name in the checkplot
- numpy dtype of the value of this key
- False if key is associated with a magcol or True otherwise
- False if subsequent updates to the same column name will append to existing key values in the output augmented light curve catalog or True if these will overwrite the existing key value
- character to use to substitute a None value of the key in the checkplot in the output light curve catalog column
- character to use to substitute a nan value of the key in the checkplot in the output light curve catalog column
See the CPINFO_DEFAULTKEYS list above for examples.
- nworkers (int) – The number of parallel workers to launch to extract checkplot information.
Returns: Returns the path to the generated ‘augmented’ light curve catalog pickle file.
Return type: str
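The six-element infokeys tuples are easier to read in a short custom list than in the long default above. The following is a minimal sketch of such a list. The helper sketch_column_name is hypothetical and not part of astrobase; the naming rule it applies (last dotted component of the key, prefixed with the magcol for magcol-specific keys) is an assumption based on the magcol1_stetsonj example above.

```python
import numpy as np

# Each entry follows the six-element layout described above:
# (key, dtype, True if NOT magcol-specific, True to overwrite / False to
#  append, substitute for None, substitute for nan)
custom_infokeys = [
    ('objectinfo.gaiamag', np.float64, True, True, np.nan, np.nan),
    ('varinfo.varperiod', np.float64, False, True, np.nan, np.nan),
    ('varinfo.features.stetsonj', np.float64, False, True, np.nan, np.nan),
]


def sketch_column_name(infokey, magcol):
    """Hypothetical helper: guess the output column name for an infokey."""
    key, _dtype, not_magcol_specific = infokey[:3]
    basename = key.split('.')[-1]
    if not_magcol_specific:
        return basename
    # magcol-specific keys get the magcol as a prefix (assumed rule)
    return '%s_%s' % (magcol, basename)


column_names = [sketch_column_name(k, 'magcol1') for k in custom_infokeys]
print(column_names)
```

A list like custom_infokeys would be passed as the infokeys kwarg of add_cpinfo_to_lclist in place of the default.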
astrobase.lcproc.checkplotgen module¶
This contains functions to generate checkplot pickles from a collection of light curves (optionally including period-finding results).
-
astrobase.lcproc.checkplotgen.
update_checkplotdict_nbrlcs
(checkplotdict, timecol, magcol, errcol, lcformat='hat-sql', lcformatdir=None, verbose=True)[source]¶ For all neighbors in a checkplotdict, make LCs and phased LCs.
Parameters: - checkplotdict (dict) – This is the checkplot to process. The light curves for the neighbors to the object here will be extracted from the stored file paths, and this function will make plots of these time-series. If the object has ‘best’ periods and epochs generated by period-finder functions in this checkplotdict, phased light curve plots of each neighbor will be made using these to check the effects of blending.
- timecol,magcol,errcol (str) – The timecol, magcol, and errcol keys used to generate this object’s checkplot. This is used to extract the correct time-series from the neighbors’ light curves.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns: The input checkplotdict is returned with the neighbor light curve plots added in.
Return type: dict
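To illustrate why phasing a neighbor at the target’s ‘best’ period helps check for blending, here is a minimal sketch using only Numpy. The phase_fold function and the synthetic neighbor light curve are illustrative assumptions; this is not the phasing routine astrobase itself uses.

```python
import numpy as np


def phase_fold(times, mags, period, epoch):
    """Fold a time-series at the given period and epoch, sorted by phase."""
    phases = ((times - epoch) / period) % 1.0
    order = np.argsort(phases)
    return phases[order], mags[order]


# Synthetic neighbor light curve that varies at the target's period,
# as could happen if the variability signal is blended between the two.
target_period, target_epoch = 2.5, 100.0
nbr_times = np.linspace(100.0, 130.0, 200)
nbr_mags = 12.0 + 0.05 * np.sin(
    2.0 * np.pi * (nbr_times - target_epoch) / target_period
)

nbr_phases, nbr_phased_mags = phase_fold(
    nbr_times, nbr_mags, target_period, target_epoch
)
```

If the phased neighbor light curve shows coherent structure at the target’s period, as it does here by construction, the two objects may be blended.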
-
astrobase.lcproc.checkplotgen.
runcp
(pfpickle, outdir, lcbasedir, lcfname=None, cprenorm=False, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, makeneighborlcs=True, fast_mode=False, gaia_max_timeout=60.0, gaia_mirror=None, xmatchinfo=None, xmatchradiusarcsec=3.0, minobservations=99, sigclip=10.0, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, skipdone=False, done_callback=None, done_callback_args=None, done_callback_kwargs=None)[source]¶ This makes a checkplot pickle for the given period-finding result pickle produced by lcproc.periodfinding.runpf.
Parameters: - pfpickle (str or None) – This is the filename of the period-finding result pickle file created by lcproc.periodfinding.runpf. If this is None, the checkplot will be made anyway, but no phased LC information will be collected into the output checkplot pickle. This can be useful for just collecting GAIA and other external information and making LC plots for an object.
- outdir (str) – This is the directory to which the output checkplot pickle will be written.
- lcbasedir (str) – The base directory where this function will look for the light curve file associated with the object in the input period-finding result pickle file.
- lcfname (str or None) –
This is usually None because we’ll get the path to the light curve associated with this period-finding pickle from the pickle itself. If pfpickle is None, however, this function will use lcfname to look up the light curve file instead. If both are provided, the value of lcfname takes precedence.
Providing the light curve file name in this kwarg is useful when you’re making checkplots directly from light curve files and not including period-finder results (perhaps because period-finding takes a long time for large collections of LCs).
- cprenorm (bool) – Set this to True if the light curves should be renormalized by checkplot.checkplot_pickle. This is set to False by default because we do our own normalization in this function using the light curve’s registered normalization function and pass the normalized times, mags, errs to the checkplot.checkplot_pickle function.
- lclistpkl (str or dict) – This is either the filename of a pickle or the actual dict produced by lcproc.make_lclist. This is used to gather neighbor information.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for neighbors that could cause confusion and blending of variability amplitude because of their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- makeneighborlcs (bool) – If True, will make light curve and phased light curve plots for all neighbors to the current object found in the catalog passed in using lclistpkl.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 10.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str or None) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function load_xmatch_external_catalogs above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- minobservations (int) – The minimum number of observations the input object’s mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has fewer than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.] will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the lcdict in generating this checkplot.
- magcols (list of str or None) – The magcol keys to use from the lcdict in generating this checkplot.
- errcols (list of str or None) – The errcol keys to use from the lcdict in generating this checkplot.
- skipdone (bool) – If True, this function will skip creating checkplots that already exist for the current objectid and magcol.
- done_callback (Python function or None) –
This is used to provide a function to execute after the checkplot pickles are generated. This is useful if you want to stream the results of checkplot making to some other process, e.g. directly running an ingestion into an LCC-Server collection. The function will always get the list of the generated checkplot pickles as its first arg, and all of the kwargs for runcp in the kwargs dict. Additional args and kwargs can be provided by giving a list in the done_callback_args kwarg and a dict in the done_callback_kwargs kwarg.
NOTE: the function you pass in here should be pickleable by normal Python if you want to use it with the parallel_cp and parallel_cp_pfdir functions below.
- done_callback_args (tuple or None) – If not None, contains any args to pass into the done_callback function.
- done_callback_kwargs (dict or None) – If not None, contains any kwargs to pass into the done_callback function.
Returns: This returns a list of checkplot pickle filenames with one element for each (timecol, magcol, errcol) combination provided in the default lcformat config or in the timecols, magcols, errcols kwargs.
Return type: list of str
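The two fast_mode cases described above can be summarized as a small lookup. The following is a sketch only: resolve_fast_mode is a hypothetical helper, not an astrobase function, and the behavior for fast_mode=False (leaving gaia_max_timeout as passed in) is an assumption, since the normal defaults for the other settings are not listed here.

```python
def resolve_fast_mode(fast_mode, gaia_max_timeout=60.0):
    """Return the external-request settings implied by fast_mode (sketch)."""
    if fast_mode is True:
        return {
            'skyview_lookup': False,
            'skyview_timeout': 10.0,
            'skyview_retry_failed': False,
            'dust_timeout': 10.0,
            'gaia_submit_timeout': 7.0,
            'gaia_max_timeout': 10.0,
            'gaia_submit_tries': 2,
            'complete_query_later': False,
            'search_simbad': False,
        }

    is_number = (isinstance(fast_mode, (int, float)) and
                 not isinstance(fast_mode, bool))
    if is_number and fast_mode > 0:
        # a float fast_mode is used directly as the timeout in seconds
        return {
            'skyview_lookup': True,
            'skyview_timeout': fast_mode,
            'skyview_retry_failed': False,
            'dust_timeout': fast_mode,
            'gaia_submit_timeout': 0.66 * fast_mode,
            'gaia_max_timeout': fast_mode,
            'gaia_submit_tries': 2,
            'complete_query_later': False,
            'search_simbad': False,
        }

    # fast_mode is False: only the user-provided GAIA timeout applies (assumed)
    return {'gaia_max_timeout': gaia_max_timeout}


print(resolve_fast_mode(5.0)['gaia_submit_timeout'])
```

Note that in both fast cases the gaia_max_timeout kwarg is overridden, which matches the statement above that it is ignored if fast_mode is set.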
-
astrobase.lcproc.checkplotgen.
runcp_worker
(task)[source]¶ This is the worker for running checkplots.
Parameters: task (tuple) – This is of the form: (pfpickle, outdir, lcbasedir, kwargs).
Returns: The list of checkplot pickles returned by the runcp function.
Return type: list of str
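The task tuple format can be sketched as follows. Everything here is illustrative: fake_runcp is a stand-in that only builds a placeholder filename (it is not astrobase’s runcp and the filename pattern is made up), sketch_worker mimics the unpacking a worker would do, and the input file names are hypothetical. The callback is defined at module level because a function passed to parallel workers needs to be pickleable.

```python
seen = []


def record_checkplots(cpfs, *args, **kwargs):
    """Hypothetical done_callback: gets the checkplot list as its first arg."""
    seen.extend(cpfs)


def fake_runcp(pfpickle, outdir, lcbasedir, done_callback=None, **kwargs):
    """Stand-in for runcp: returns a placeholder checkplot filename."""
    cpfs = ['%s/checkplot-for-%s' % (outdir, pfpickle)]
    if done_callback is not None:
        done_callback(cpfs, **kwargs)
    return cpfs


def sketch_worker(task):
    """Unpack a (pfpickle, outdir, lcbasedir, kwargs) task and run it."""
    pfpickle, outdir, lcbasedir, kwargs = task
    return fake_runcp(pfpickle, outdir, lcbasedir, **kwargs)


# hypothetical inputs; these paths are placeholders
pfpicklelist = ['periodfinding-obj1.pkl', 'periodfinding-obj2.pkl']
kwargs = {'sigclip': 10.0, 'done_callback': record_checkplots}

# one task tuple per period-finding pickle
tasks = [(pf, 'checkplots', 'lightcurves', kwargs) for pf in pfpicklelist]
results = [sketch_worker(task) for task in tasks]
```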
-
astrobase.lcproc.checkplotgen.
parallel_cp
(pfpicklelist, outdir, lcbasedir, fast_mode=False, lcfnamelist=None, cprenorm=False, lclistpkl=None, gaia_max_timeout=60.0, gaia_mirror=None, nbrradiusarcsec=60.0, maxnumneighbors=5, makeneighborlcs=True, xmatchinfo=None, xmatchradiusarcsec=3.0, sigclip=10.0, minobservations=99, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, skipdone=False, done_callback=None, done_callback_args=None, done_callback_kwargs=None, liststartindex=None, maxobjects=None, nworkers=2)[source]¶ This drives the parallel execution of runcp for a list of periodfinding result pickles.
Parameters: - pfpicklelist (list of str or list of Nones) – This is the list of the filenames of the period-finding result pickles to process. To make checkplots using the light curves directly, set this to a list of Nones with the same length as the list of light curve files that you provide in lcfnamelist.
- outdir (str) – The directory the checkplot pickles will be written to.
- lcbasedir (str) – The base directory that this function will look in to find the light curves pointed to by the period-finding result files. If you’re using lcfnamelist to provide a list of light curve filenames directly, this arg is ignored.
- lcfnamelist (list of str or None) – If this is provided, it must be a list of the input light curve filenames to process. These can either be associated with each input period-finder result pickle, or can be provided standalone to make checkplots without phased LC plots in them. In the second case, you must set pfpicklelist to a list of Nones that matches the length of lcfnamelist.
- cprenorm (bool) – Set this to True if the light curves should be renormalized by checkplot.checkplot_pickle. This is set to False by default because we do our own normalization in this function using the light curve’s registered normalization function and pass the normalized times, mags, errs to the checkplot.checkplot_pickle function.
- lclistpkl (str or dict) – This is either the filename of a pickle or the actual dict produced by lcproc.make_lclist. This is used to gather neighbor information.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for neighbors that could cause confusion and blending of variability amplitude because of their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- makeneighborlcs (bool) – If True, will make light curve and phased light curve plots for all neighbors found in the object collection for each input object.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 10.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str or None) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function load_xmatch_external_catalogs above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- minobservations (int) – The minimum number of observations the input object’s mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has fewer than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.] will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the lcdict in generating this checkplot.
- magcols (list of str or None) – The magcol keys to use from the lcdict in generating this checkplot.
- errcols (list of str or None) – The errcol keys to use from the lcdict in generating this checkplot.
- skipdone (bool) – If True, this function will skip creating checkplots that already exist for the current objectid and magcol.
- done_callback (Python function or None) –
This is used to provide a function to execute after the checkplot pickles are generated. This is useful if you want to stream the results of checkplot making to some other process, e.g. directly running an ingestion into an LCC-Server collection. The function will always get the list of the generated checkplot pickles as its first arg, and all of the kwargs for runcp in the kwargs dict. Additional args and kwargs can be provided by giving a list in the done_callback_args kwarg and a dict in the done_callback_kwargs kwarg.
NOTE: the function you pass in here should be pickleable by normal Python if you want to use it with the parallel_cp and parallel_cp_pfdir functions.
- done_callback_args (tuple or None) – If not None, contains any args to pass into the done_callback function.
- done_callback_kwargs (dict or None) – If not None, contains any kwargs to pass into the done_callback function.
- liststartindex (int) – The index of the pfpicklelist (and lcfnamelist if provided) to start working at.
- maxobjects (int) – The maximum number of objects to process in this run. Use this with liststartindex to effectively distribute working on a large list of input period-finding result pickles (and light curves if lcfnamelist is also provided) over several sessions or machines.
- nworkers (int) – The number of parallel workers that will work on the checkplot generation process.
Returns: This returns a dict with keys = input period-finding pickles and vals = list of the corresponding checkplot pickles produced.
Return type: dict
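The way liststartindex and maxobjects can split a long input list over several sessions is sketched below, together with the list-of-Nones convention for making checkplots directly from light curves. The select_chunk helper and the light curve file names are hypothetical, and the slicing shown is an assumption about how these two kwargs select a window of the input lists.

```python
def select_chunk(pfpicklelist, lcfnamelist=None,
                 liststartindex=None, maxobjects=None):
    """Pick the window of inputs that one session would work on (sketch)."""
    start = liststartindex or 0
    stop = None if maxobjects is None else start + maxobjects
    pf_chunk = pfpicklelist[start:stop]
    lc_chunk = None if lcfnamelist is None else lcfnamelist[start:stop]
    return pf_chunk, lc_chunk


# placeholder light curve names; no period-finding results are available,
# so pfpicklelist is a list of Nones with the same length
lcfnamelist = ['lc-%03d.sqlite.gz' % i for i in range(10)]
pfpicklelist = [None] * len(lcfnamelist)

# three sessions (or machines), each handling at most four objects
chunks = [
    select_chunk(pfpicklelist, lcfnamelist,
                 liststartindex=start, maxobjects=4)
    for start in range(0, len(lcfnamelist), 4)
]
```

Each session would then call parallel_cp with the full lists and its own liststartindex and maxobjects values.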
-
astrobase.lcproc.checkplotgen.
parallel_cp_pfdir
(pfpickledir, outdir, lcbasedir, pfpickleglob='periodfinding-*.pkl*', lclistpkl=None, cprenorm=False, nbrradiusarcsec=60.0, maxnumneighbors=5, makeneighborlcs=True, fast_mode=False, gaia_max_timeout=60.0, gaia_mirror=None, xmatchinfo=None, xmatchradiusarcsec=3.0, minobservations=99, sigclip=10.0, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, skipdone=False, done_callback=None, done_callback_args=None, done_callback_kwargs=None, maxobjects=None, nworkers=32)[source]¶ This drives the parallel execution of runcp for a directory of periodfinding pickles.
Parameters: - pfpickledir (str) – This is the directory containing all of the period-finding pickles to process.
- outdir (str) – The directory the checkplot pickles will be written to.
- lcbasedir (str) – The base directory that this function will look in to find the light curves pointed to by the period-finding result files. If you’re using lcfnamelist to provide a list of light curve filenames directly, this arg is ignored.
- pfpickleglob (str) – This is a UNIX file glob to select period-finding result pickles in the specified pfpickledir.
- lclistpkl (str or dict) – This is either the filename of a pickle or the actual dict produced by lcproc.make_lclist. This is used to gather neighbor information.
- cprenorm (bool) – Set this to True if the light curves should be renormalized by checkplot.checkplot_pickle. This is set to False by default because we do our own normalization in this function using the light curve’s registered normalization function and pass the normalized times, mags, errs to the checkplot.checkplot_pickle function.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for neighbors that could cause confusion and blending of variability amplitude because of their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- makeneighborlcs (bool) – If True, will make light curve and phased light curve plots for all neighbors found in the object collection for each input object.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 10.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str or None) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function load_xmatch_external_catalogs above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- minobservations (int) – The minimum number of observations the input object’s mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has fewer than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.] will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the lcdict in generating this checkplot.
- magcols (list of str or None) – The magcol keys to use from the lcdict in generating this checkplot.
- errcols (list of str or None) – The errcol keys to use from the lcdict in generating this checkplot.
- skipdone (bool) – If this is True, this function will skip creating checkplots that already exist for the current objectid and magcol.
- done_callback (Python function or None) –
This is used to provide a function to execute after the checkplot pickles are generated. This is useful if you want to stream the results of checkplot making to some other process, e.g. directly running an ingestion into an LCC-Server collection. The function will always get the list of the generated checkplot pickles as its first arg, and all of the kwargs for runcp in the kwargs dict. Additional args and kwargs can be provided by giving a tuple in the done_callback_args kwarg and a dict in the done_callback_kwargs kwarg.
NOTE: the function you pass in here should be pickleable by normal Python if you want to use it with the parallel_cp and parallel_cp_lcdir functions below.
- done_callback_args (tuple or None) – If not None, contains any args to pass into the done_callback function.
- done_callback_kwargs (dict or None) – If not None, contains any kwargs to pass into the done_callback function.
- maxobjects (int) – The maximum number of objects to process in this run.
- nworkers (int) – The number of parallel workers that will work on the checkplot generation process.
Returns: This returns a dict with keys = input period-finding pickles and vals = list of the corresponding checkplot pickles produced.
Return type: dict
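As a minimal sketch of the done_callback contract described above: the only facts taken from the parameter docs are that the list of generated checkplot pickles arrives as the first argument and that the function should be pickleable. The name record_checkplots, the logpath keyword, and the assumption that the runcp kwargs arrive as keyword arguments are all hypothetical.

```python
import json
import os
import tempfile


def record_checkplots(cpfiles, *args, logpath=None, **runcp_kwargs):
    """Hypothetical done_callback: append the generated pickle paths to a log.

    cpfiles: list of checkplot pickle paths (always the first arg).
    args: extra positional args, e.g. those given in done_callback_args.
    logpath: hypothetical kwarg, e.g. supplied via done_callback_kwargs.
    runcp_kwargs: assumed to be the kwargs that were passed to runcp.
    """
    if logpath is None:
        logpath = os.path.join(tempfile.gettempdir(), "checkplots-done.jsonl")

    record = {
        "checkplots": list(cpfiles),
        "nkwargs": len(runcp_kwargs),
    }

    # one JSON record per line, so another process can tail the file
    with open(logpath, "a") as outfd:
        outfd.write(json.dumps(record) + "\n")

    return logpath
```

Defining the callback at module level (rather than as a lambda or nested function) is what keeps it pickleable by normal Python for the parallel drivers.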
astrobase.lcproc.checkplotproc module¶
This contains functions to post-process checkplot pickles generated from a collection of light curves beforehand (perhaps using lcproc.checkplotgen).
-
astrobase.lcproc.checkplotproc.
xmatch_cplist_external_catalogs
(cplist, xmatchpkl, xmatchradiusarcsec=2.0, updateexisting=True, resultstodir=None)[source]¶ This xmatches external catalogs to a collection of checkplots.
Parameters: - cplist (list of str) – This is the list of checkplot pickle files to process.
- xmatchpkl (str) – The filename of a pickle prepared beforehand with the checkplot.pkl_xmatch.load_xmatch_external_catalogs function, containing collected external catalogs to cross-match the objects in the input cplist against.
- xmatchradiusarcsec (float) – The match radius to use for the cross-match in arcseconds.
- updateexisting (bool) – If this is True, will only update the xmatch dict in each checkplot pickle with any new cross-matches to the external catalogs. If False, will overwrite the xmatch dict with results from the current run.
- resultstodir (str or None) – If this is provided, then it must be a directory to write the resulting checkplots to after xmatch is done. This can be used to keep the original checkplots in pristine condition for some reason.
Returns: Returns a dict with keys = input checkplot pickle filenames and vals = xmatch status dict for each checkplot pickle.
Return type: dict
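The difference between the two updateexisting settings can be pictured with plain dicts. This is only an illustration of the update-versus-overwrite behavior described above; merge_xmatch and the catalog keys are hypothetical, and the real layout of the xmatch dict is not reproduced here.

```python
def merge_xmatch(existing, new, updateexisting=True):
    """Illustrative only: combine earlier and current cross-match results."""
    if updateexisting:
        merged = dict(existing)   # keep results from earlier runs
        merged.update(new)        # add or refresh entries from this run
        return merged
    return dict(new)              # discard the earlier results entirely


old = {"catalog_a": {"found": True}}
new = {"catalog_b": {"found": False}}

kept = merge_xmatch(old, new, updateexisting=True)       # both catalogs present
replaced = merge_xmatch(old, new, updateexisting=False)  # only catalog_b present
```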
-
astrobase.lcproc.checkplotproc.
xmatch_cpdir_external_catalogs
(cpdir, xmatchpkl, cpfileglob='checkplot-*.pkl*', xmatchradiusarcsec=2.0, updateexisting=True, resultstodir=None)[source]¶ This xmatches external catalogs to all checkplots in a directory.
Parameters: - cpdir (str) – This is the directory to search in for checkplots.
- xmatchpkl (str) – The filename of a pickle prepared beforehand with the checkplot.pkl_xmatch.load_xmatch_external_catalogs function, containing collected external catalogs to cross-match the objects in the checkplots found in cpdir against.
- cpfileglob (str) – This is the UNIX fileglob to use in searching for checkplots.
- xmatchradiusarcsec (float) – The match radius to use for the cross-match in arcseconds.
- updateexisting (bool) – If this is True, will only update the xmatch dict in each checkplot pickle with any new cross-matches to the external catalogs. If False, will overwrite the xmatch dict with results from the current run.
- resultstodir (str or None) – If this is provided, then it must be a directory to write the resulting checkplots to after xmatch is done. This can be used to keep the original checkplots in pristine condition for some reason.
Returns: Returns a dict with keys = input checkplot pickle filenames and vals = xmatch status dict for each checkplot pickle.
Return type: dict
-
astrobase.lcproc.checkplotproc.
colormagdiagram_cplist
(cplist, outpkl, color_mag1=('gaiamag', 'sdssg'), color_mag2=('kmag', 'kmag'), yaxis_mag=('gaia_absmag', 'rpmj'))[source]¶ This makes color-mag diagrams for all checkplot pickles in the provided list.
Can make an arbitrary number of CMDs given lists of x-axis colors and y-axis mags to use.
Parameters: - cplist (list of str) – This is the list of checkplot pickles to process.
- outpkl (str) – The filename of the output pickle that will contain the color-mag information for all objects in the checkplots specified in cplist.
- color_mag1 (list of str) –
This is a list of the keys in each checkplot’s objectinfo dict that will be used as color_mag1 in the equation:
x-axis color = color_mag1 - color_mag2
- color_mag2 (list of str) –
This is a list of the keys in each checkplot’s objectinfo dict that will be used as color_mag2 in the equation:
x-axis color = color_mag1 - color_mag2
- yaxis_mag (list of str) – This is a list of the keys in each checkplot’s objectinfo dict that will be used as the (absolute) magnitude y-axis of the color-mag diagrams.
Returns: The path to the generated CMD pickle file for the collection of objects in the input checkplot list.
Return type: str
Notes
This can make many CMDs in one go. For example, the default kwargs for color_mag1, color_mag2, and yaxis_mag result in two CMDs generated and written to the output pickle file:
- CMD1 -> gaiamag - kmag on the x-axis vs gaia_absmag on the y-axis
- CMD2 -> sdssg - kmag on the x-axis vs rpmj (J reduced PM) on the y-axis
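The pairing of the three kwargs can be sketched as an element-wise zip: the i-th entries of color_mag1, color_mag2, and yaxis_mag together define the i-th CMD. The function cmd_points, its label format, and the magnitude values below are hypothetical and are not taken from the astrobase source.

```python
def cmd_points(objectinfo,
               color_mag1=("gaiamag", "sdssg"),
               color_mag2=("kmag", "kmag"),
               yaxis_mag=("gaia_absmag", "rpmj")):
    """Return one (label, x-axis color, y-axis mag) tuple per CMD."""
    points = []
    for c1, c2, ymag in zip(color_mag1, color_mag2, yaxis_mag):
        # x-axis color = color_mag1 - color_mag2
        xcolor = objectinfo[c1] - objectinfo[c2]
        label = "%s-%s/%s" % (c1, c2, ymag)
        points.append((label, xcolor, objectinfo[ymag]))
    return points


# made-up values for a single object
objectinfo = {"gaiamag": 12.5, "sdssg": 13.0, "kmag": 10.5,
              "gaia_absmag": 4.5, "rpmj": 6.0}
points = cmd_points(objectinfo)
```

With the defaults this yields two entries, one for gaiamag - kmag against gaia_absmag and one for sdssg - kmag against rpmj, matching CMD1 and CMD2 above.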
-
astrobase.lcproc.checkplotproc.
colormagdiagram_cpdir
(cpdir, outpkl, cpfileglob='checkplot*.pkl*', color_mag1=('gaiamag', 'sdssg'), color_mag2=('kmag', 'kmag'), yaxis_mag=('gaia_absmag', 'rpmj'))[source]¶ This makes CMDs for all checkplot pickles in the provided directory.
Can make an arbitrary number of CMDs given lists of x-axis colors and y-axis mags to use.
Parameters: - cpdir (str) – This is the directory to get the list of input checkplot pickles from.
- outpkl (str) – The filename of the output pickle that will contain the color-mag information for all objects in the checkplots found in cpdir.
- cpfileglob (str) – The UNIX fileglob to use to search for checkplot pickle files.
- color_mag1 (list of str) –
This is a list of the keys in each checkplot’s objectinfo dict that will be used as color_mag1 in the equation:
x-axis color = color_mag1 - color_mag2
- color_mag2 (list of str) –
This is a list of the keys in each checkplot’s objectinfo dict that will be used as color_mag2 in the equation:
x-axis color = color_mag1 - color_mag2
- yaxis_mag (list of str) – This is a list of the keys in each checkplot’s objectinfo dict that will be used as the (absolute) magnitude y-axis of the color-mag diagrams.
Returns: The path to the generated CMD pickle file for the collection of objects in the input checkplot directory.
Return type: str
Notes
This can make many CMDs in one go. For example, the default kwargs for color_mag1, color_mag2, and yaxis_mag result in two CMDs generated and written to the output pickle file:
- CMD1 -> gaiamag - kmag on the x-axis vs gaia_absmag on the y-axis
- CMD2 -> sdssg - kmag on the x-axis vs rpmj (J reduced PM) on the y-axis
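To see which filenames the default cpfileglob would select, the pattern can be tried against in-memory names with the standard library’s fnmatch module. This assumes the directory search uses ordinary shell-style globbing; the filenames are invented for illustration.

```python
import fnmatch

# hypothetical directory listing
filenames = [
    "checkplot-HAT-123.pkl",
    "checkplot-HAT-123.pkl.gz",
    "checkplot-HAT-123.png",
    "cmd-output.pkl",
]

# the default cpfileglob for colormagdiagram_cpdir
matched = fnmatch.filter(filenames, "checkplot*.pkl*")
```

The trailing `*` is what lets compressed pickles such as `.pkl.gz` through, while the PNG and the CMD output pickle are left out.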
-
astrobase.lcproc.checkplotproc.
add_cmd_to_checkplot
(cpx, cmdpkl, require_cmd_magcolor=True, save_cmd_pngs=False)[source]¶ This adds CMD figures to a checkplot dict or pickle.
Looks up the CMDs in cmdpkl, adds the object from cpx as a gold(-ish) star in the plot, and then saves the figure to a base64 encoded PNG, which can then be read and used by the checkplotserver.
Parameters: - cpx (str or dict) – This is the input checkplot pickle or dict to add the CMD to.
- cmdpkl (str or dict) – The CMD pickle generated by the colormagdiagram_cplist or colormagdiagram_cpdir functions above, or the dict produced by reading this pickle in.
- require_cmd_magcolor (bool) – If this is True, a CMD plot will not be made if the color and mag keys required by the CMD are not present or are nan in this checkplot’s objectinfo dict.
- save_cmd_pngs (bool) – If this is True, then will save the CMD plots that were generated and added back to the checkplotdict as PNGs to the same directory as cpx. If cpx is a dict, will save them to the current working directory.
Returns: If cpx was a str filename of checkplot pickle, this will return that filename to indicate that the CMD was added to the file. If cpx was a checkplotdict, this will return the checkplotdict with a new key called ‘colormagdiagram’ containing the base64 encoded PNG binary streams of all CMDs generated.
Return type: str or dict
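The require_cmd_magcolor condition amounts to a presence-and-nan check on the objectinfo keys that a CMD needs. A minimal sketch, with the hypothetical helper has_cmd_inputs standing in for whatever the library does internally:

```python
import math


def has_cmd_inputs(objectinfo, keys):
    """True only if every key is present in objectinfo and is not None/nan."""
    for key in keys:
        value = objectinfo.get(key)
        if value is None or math.isnan(value):
            return False
    return True


needed = ("gaiamag", "kmag", "gaia_absmag")
complete = has_cmd_inputs(
    {"gaiamag": 12.5, "kmag": 10.5, "gaia_absmag": 4.5}, needed
)
missing_key = has_cmd_inputs({"gaiamag": 12.5, "kmag": 10.5}, needed)
nan_value = has_cmd_inputs(
    {"gaiamag": float("nan"), "kmag": 10.5, "gaia_absmag": 4.5}, needed
)
```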
-
astrobase.lcproc.checkplotproc.
add_cmds_cplist
(cplist, cmdpkl, require_cmd_magcolor=True, save_cmd_pngs=False)[source]¶ This adds CMDs for each object in cplist.
Parameters: - cplist (list of str) – This is the input list of checkplot pickles to add the CMDs to.
- cmdpkl (str) – This is the filename of the CMD pickle created previously.
- require_cmd_magcolor (bool) – If this is True, a CMD plot will not be made if the color and mag keys required by the CMD are not present or are nan in each checkplot’s objectinfo dict.
- save_cmd_pngs (bool) – If this is True, then will save the CMD plots that were generated and added back to the checkplotdict as PNGs to the same directory as each checkplot pickle.
Returns: Nothing.
-
astrobase.lcproc.checkplotproc.
add_cmds_cpdir
(cpdir, cmdpkl, cpfileglob='checkplot*.pkl*', require_cmd_magcolor=True, save_cmd_pngs=False)[source]¶ This adds CMDs for each object in cpdir.
Parameters: - cpdir (str) – This is the directory to search for checkplot pickles.
- cmdpkl (str) – This is the filename of the CMD pickle created previously.
- cpfileglob (str) – The UNIX fileglob to use when searching for checkplot pickles to operate on.
- require_cmd_magcolor (bool) – If this is True, a CMD plot will not be made if the color and mag keys required by the CMD are not present or are nan in each checkplot’s objectinfo dict.
- save_cmd_pngs (bool) – If this is True, then will save the CMD plots that were generated and added back to the checkplotdict as PNGs to the same directory as each checkplot pickle.
Returns: Nothing.
-
astrobase.lcproc.checkplotproc.
cp_objectinfo_worker
(task)[source]¶ This is a parallel worker for the parallel_update_objectinfo_cplist function below.
Parameters: task (tuple) – - task[0] = checkplot pickle file
- task[1] = kwargs
Returns: The name of the checkplot file that was updated. None if the update fails for some reason.
Return type: str or None
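The worker convention above (a single task tuple in, a filename or None out) can be sketched as follows. Both function names are hypothetical, and update_one_checkplot is only a stand-in for the real per-checkplot update step.

```python
def update_one_checkplot(cpfile, **kwargs):
    """Hypothetical stand-in for the real per-checkplot update step."""
    if not cpfile.endswith(".pkl"):
        raise ValueError("not a checkplot pickle: %r" % cpfile)
    return cpfile


def objectinfo_worker_sketch(task):
    """Unpack a (cpfile, kwargs) task; return the filename, or None on failure."""
    cpfile, kwargs = task
    try:
        return update_one_checkplot(cpfile, **kwargs)
    except Exception:
        # a failed update must not take down the whole batch
        return None


tasks = [
    ("checkplot-a.pkl", {"fast_mode": True}),
    ("not-a-checkplot.txt", {"fast_mode": True}),
]

# the builtin map is used here; a multiprocessing.Pool's map() accepts the
# same single-argument worker, which is why everything is packed in one tuple
results = list(map(objectinfo_worker_sketch, tasks))
```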
-
astrobase.lcproc.checkplotproc.
parallel_update_objectinfo_cplist
(cplist, liststartindex=None, maxobjects=None, nworkers=2, fast_mode=False, findercmap='gray_r', finderconvolve=None, deredden_object=True, custom_bandpasses=None, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, complete_query_later=True, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, plotdpi=100, findercachedir='~/.astrobase/stamp-cache', verbose=True)[source]¶ This updates objectinfo for a list of checkplots.
Useful in cases where a previous round of GAIA/finderchart/external catalog acquisition failed. This will preserve the following keys in the checkplots if they exist:
comments, varinfo, objectinfo.objecttags
Parameters: - cplist (list of str) – A list of checkplot pickle file names to update.
- liststartindex (int) – The index of the input list to start working at.
- maxobjects (int) – The maximum number of objects to process in this run. Use this with liststartindex to effectively distribute working on a large list of input checkplot pickles over several sessions or machines.
- nworkers (int) – The number of parallel workers that will work on the checkplot update process.
- fast_mode (bool or float) – This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond. See the docstring for checkplot.pkl_utils._pkl_finder_objectinfo for details on how this works. If this is True, will run in “fast” mode with default timeouts (5 seconds in most cases). If this is a float, will run in “fast” mode with the provided timeout value in seconds.
- findercmap (str or matplotlib.colors.Colormap object) – The name of the colormap or the Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by the varclass.starfeatures.color_features function. See its docstring for details on the required format.
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- plotdpi (int) – The resolution in DPI of the plots to generate in this function (e.g. the finder chart, etc.)
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- verbose (bool) – If True, will indicate progress and warn about potential problems.
Returns: Paths to the updated checkplot pickle files.
Return type: list of str
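How liststartindex and maxobjects split a long list across sessions can be sketched with plain slicing. The helper select_chunk is hypothetical, and the slice semantics (start at liststartindex, take at most maxobjects items) are an assumption based on the parameter descriptions above.

```python
def select_chunk(cplist, liststartindex=None, maxobjects=None):
    """Return the part of cplist that one session would work on."""
    start = liststartindex or 0
    chunk = cplist[start:]
    if maxobjects is not None:
        chunk = chunk[:maxobjects]
    return chunk


cplist = ["checkplot-%03d.pkl" % i for i in range(10)]

# three sessions (or machines), each handling at most four checkplots
session1 = select_chunk(cplist, liststartindex=0, maxobjects=4)
session2 = select_chunk(cplist, liststartindex=4, maxobjects=4)
session3 = select_chunk(cplist, liststartindex=8, maxobjects=4)
```

Advancing liststartindex by maxobjects for each session covers the list exactly once; the last session simply gets whatever is left.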
-
astrobase.lcproc.checkplotproc.
parallel_update_objectinfo_cpdir
(cpdir, cpglob='checkplot-*.pkl*', liststartindex=None, maxobjects=None, nworkers=2, fast_mode=False, findercmap='gray_r', finderconvolve=None, deredden_object=True, custom_bandpasses=None, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, complete_query_later=True, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, plotdpi=100, findercachedir='~/.astrobase/stamp-cache', verbose=True)[source]¶ This updates the objectinfo for a directory of checkplot pickles.
Useful in cases where a previous round of GAIA/finderchart/external catalog acquisition failed. This will preserve the following keys in the checkplots if they exist:
comments, varinfo, objectinfo.objecttags
Parameters: - cpdir (str) – The directory to look for checkplot pickles in.
- cpglob (str) – The UNIX fileglob to use when searching for checkplot pickle files.
- liststartindex (int) – The index of the input list to start working at.
- maxobjects (int) – The maximum number of objects to process in this run. Use this with liststartindex to effectively distribute working on a large list of input checkplot pickles over several sessions or machines.
- nworkers (int) – The number of parallel workers that will work on the checkplot update process.
- fast_mode (bool or float) – This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond. See the docstring for checkplot.pkl_utils._pkl_finder_objectinfo for details on how this works. If this is True, will run in “fast” mode with default timeouts (5 seconds in most cases). If this is a float, will run in “fast” mode with the provided timeout value in seconds.
- findercmap (str or matplotlib.colors.Colormap object) – The name of the colormap or the Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by the varclass.starfeatures.color_features function. See its docstring for details on the required format.
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- plotdpi (int) – The resolution in DPI of the plots to generate in this function (e.g. the finder chart, etc.)
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- verbose (bool) – If True, will indicate progress and warn about potential problems.
Returns: Paths to the updated checkplot pickle files.
Return type: list of str
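Because fast_mode accepts either a bool or a float, it helps to see one way such a value could be interpreted. The helper below is a hypothetical sketch, not the library’s implementation; the 5-second default is taken from the fast_mode description above, which says it applies “in most cases”.

```python
def resolve_fast_mode(fast_mode, default_timeout=5.0):
    """Return (is_fast, timeout_in_seconds_or_None) for a fast_mode value."""
    # check for True first: bool is a subclass of int in Python
    if fast_mode is True:
        return True, default_timeout

    is_number = (isinstance(fast_mode, (int, float)) and
                 not isinstance(fast_mode, bool))
    if is_number and fast_mode > 0:
        return True, float(fast_mode)

    # False (or anything else) means the normal, slower code path
    return False, None


normal = resolve_fast_mode(False)
fast_default = resolve_fast_mode(True)
fast_custom = resolve_fast_mode(2.5)
```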
astrobase.lcproc.epd module¶
This contains functions to run External Parameter Decorrelation (EPD) on a large collection of light curves.
-
astrobase.lcproc.epd.
apply_epd_magseries
(lcfile, timecol, magcol, errcol, externalparams, lcformat='hat-sql', lcformatdir=None, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None)[source]¶ This applies external parameter decorrelation (EPD) to a light curve.
Parameters: - lcfile (str) – The filename of the light curve file to process.
- timecol,magcol,errcol (str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as input to the EPD process.
- externalparams (dict or None) –
This is a dict that indicates which keys in the lcdict obtained from the lcfile correspond to the required external parameters. As with timecol, magcol, and errcol, these can be simple keys (e.g. ‘rjd’) or compound keys (‘magaperture1.mags’). The dict should look something like:
{'fsv':'<lcdict key>' array: S values for each observation,
 'fdv':'<lcdict key>' array: D values for each observation,
 'fkv':'<lcdict key>' array: K values for each observation,
 'xcc':'<lcdict key>' array: x coords for each observation,
 'ycc':'<lcdict key>' array: y coords for each observation,
 'bgv':'<lcdict key>' array: sky background for each observation,
 'bge':'<lcdict key>' array: sky background err for each observation,
 'iha':'<lcdict key>' array: hour angle for each observation,
 'izd':'<lcdict key>' array: zenith distance for each observation}
Alternatively, if these exact keys are already present in the lcdict, indicate this by setting externalparams to None.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to read the light curve file specified in lcfile.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before fitting the EPD function to it.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
Returns: Writes the output EPD light curve to a pickle that contains the lcdict with an added lcdict[‘epd’] key, which contains the EPD times, mags/fluxes, and errs as lcdict[‘epd’][‘times’], lcdict[‘epd’][‘mags’], and lcdict[‘epd’][‘errs’]. Returns the filename of this generated EPD LC pickle file.
Return type: str
Notes
- S -> measure of PSF sharpness (~1/sigma^2, so smaller S = wider PSF)
- D -> measure of PSF ellipticity in xy direction
- K -> measure of PSF ellipticity in cross direction
S, D, K are related to the PSF’s variance and covariance, see eqn 30-33 in A. Pal’s thesis: https://arxiv.org/abs/0906.3486
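A custom smoother for epdsmooth_func only has to follow the signature given above and return an array of the same size as its input. The running-mean filter below is a hypothetical example written with NumPy; the name smooth_magseries_boxcar and the pad_mode extra parameter are illustrative and not part of astrobase.

```python
import numpy as np


def smooth_magseries_boxcar(mags_array, window_size, **extraparams):
    """Hypothetical smoother: running mean over window_size points.

    Follows the required signature and returns a numpy array of the same
    size as mags_array.
    """
    mags = np.asarray(mags_array, dtype=float)
    pad_mode = extraparams.get("pad_mode", "edge")  # hypothetical extra param

    # pad both ends so that a 'valid' convolution returns len(mags) points
    half = window_size // 2
    padded = np.pad(mags, (half, window_size - 1 - half), mode=pad_mode)

    kernel = np.ones(window_size) / window_size
    return np.convolve(padded, kernel, mode="valid")


mags = np.full(50, 12.0)
smoothed = smooth_magseries_boxcar(mags, 21)
```

Under the assumption that epdsmooth_extraparams is unpacked into the smoother’s keyword arguments, this could be supplied as epdsmooth_func=smooth_magseries_boxcar with epdsmooth_extraparams={'pad_mode': 'reflect'}.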
-
astrobase.lcproc.epd.
parallel_epd_worker
(task)[source]¶ This is a parallel worker for the function below.
Parameters: task (tuple) – - task[0] = lcfile
- task[1] = timecol
- task[2] = magcol
- task[3] = errcol
- task[4] = externalparams
- task[5] = lcformat
- task[6] = lcformatdir
- task[7] = epdsmooth_sigclip
- task[8] = epdsmooth_windowsize
- task[9] = epdsmooth_func
- task[10] = epdsmooth_extraparams
Returns: If EPD succeeds for an input LC, returns the filename of the output EPD LC pickle file. If it fails, returns None.
Return type: str or None
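The eleven-element task tuple listed above can be assembled with a small helper that fixes the element order in one place. The helper make_epd_task is hypothetical, as are the filename and the errcol key; the None default for epdsmooth_func is a placeholder here, whereas the real driver functions default to a Savitzky-Golay smoother.

```python
def make_epd_task(lcfile, timecol, magcol, errcol,
                  externalparams=None,
                  lcformat="hat-sql",
                  lcformatdir=None,
                  epdsmooth_sigclip=3.0,
                  epdsmooth_windowsize=21,
                  epdsmooth_func=None,        # placeholder, see note above
                  epdsmooth_extraparams=None):
    """Pack the arguments in the task[0]..task[10] order listed above."""
    return (lcfile, timecol, magcol, errcol,
            externalparams, lcformat, lcformatdir,
            epdsmooth_sigclip, epdsmooth_windowsize,
            epdsmooth_func, epdsmooth_extraparams)


# hypothetical light curve file and column keys
task = make_epd_task("lightcurve-0001.pkl",
                     "rjd",
                     "magaperture1.mags",
                     "magaperture1.errs")
```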
-
astrobase.lcproc.epd.
parallel_epd_lclist
(lclist, externalparams, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, nworkers=2, maxworkertasks=1000)[source]¶ This applies EPD in parallel to all LCs in the input list.
Parameters: - lclist (list of str) – This is the list of light curve files to run EPD on.
- externalparams (dict or None) –
This is a dict that indicates which keys in the lcdict obtained from the lcfile correspond to the required external parameters. As with timecol, magcol, and errcol, these can be simple keys (e.g. ‘rjd’) or compound keys (‘magaperture1.mags’). The dict should look something like:
{'fsv':'<lcdict key>' array: S values for each observation,
 'fdv':'<lcdict key>' array: D values for each observation,
 'fkv':'<lcdict key>' array: K values for each observation,
 'xcc':'<lcdict key>' array: x coords for each observation,
 'ycc':'<lcdict key>' array: y coords for each observation,
 'bgv':'<lcdict key>' array: sky background for each observation,
 'bge':'<lcdict key>' array: sky background err for each observation,
 'iha':'<lcdict key>' array: hour angle for each observation,
 'izd':'<lcdict key>' array: zenith distance for each observation}
Alternatively, if these exact keys are already present in the lcdict, indicate this by setting externalparams to None.
- timecols,magcols,errcols (lists of str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as inputs to the EPD process. If these are None, the default values for timecols, magcols, and errcols for your light curve format will be used here.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve files.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before fitting the EPD function to it.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
- nworkers (int) – The number of parallel workers to launch when processing the LCs.
- maxworkertasks (int) – The maximum number of tasks a parallel worker will complete before it is replaced with a new one (sometimes helps with memory-leaks).
Returns: Returns a dict organized by all the keys in the input magcols list, containing lists of EPD pickle light curves for that magcol.
Return type: dict
Notes
- S -> measure of PSF sharpness (~1/sigma^2, so smaller S = wider PSF)
- D -> measure of PSF ellipticity in xy direction
- K -> measure of PSF ellipticity in cross direction
S, D, K are related to the PSF’s variance and covariance, see eqn 30-33 in A. Pal’s thesis: https://arxiv.org/abs/0906.3486
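The symmetric and asymmetric clipping described for epdsmooth_sigclip can be sketched in NumPy as below. This is not the library’s implementation: the function name sigclip_sketch is hypothetical, and using the median as the center with 1.483 times the median absolute deviation as the sigma estimate is an assumption made for this example.

```python
import numpy as np


def sigclip_sketch(times, mags, errs, sigclip=None, magsarefluxes=False):
    """Drop non-finite points, then optionally sigma-clip around the median."""
    times, mags, errs = (np.asarray(x, dtype=float)
                         for x in (times, mags, errs))
    finite = np.isfinite(times) & np.isfinite(mags) & np.isfinite(errs)
    times, mags, errs = times[finite], mags[finite], errs[finite]

    if sigclip is None:
        return times, mags, errs

    median = np.median(mags)
    # assumed robust sigma estimate: 1.483 * median absolute deviation
    stddev = 1.483 * np.median(np.abs(mags - median))

    if isinstance(sigclip, (int, float)):
        dim_sigma = bright_sigma = float(sigclip)   # symmetric clip
    else:
        dim_sigma, bright_sigma = sigclip           # [dimmings, brightenings]

    deviation = mags - median
    # "dimming" is physical: larger magnitude, but smaller flux
    dimming = -deviation if magsarefluxes else deviation

    keep = (dimming <= dim_sigma * stddev) & (-dimming <= bright_sigma * stddev)
    return times[keep], mags[keep], errs[keep]


# synthetic magnitudes: small scatter plus one dimming and one brightening
times = np.arange(100.0)
mags = np.full(100, 10.0)
mags[::2] += 0.01
mags[1::2] -= 0.01
mags[10] = 10.1   # fainter by about 6.7 sigma
mags[21] = 9.9    # brighter by about 6.7 sigma
errs = np.full(100, 0.005)

_, asym_mags, _ = sigclip_sketch(times, mags, errs, sigclip=[10., 3.])
_, sym_mags, _ = sigclip_sketch(times, mags, errs, sigclip=5.0)
```

With sigclip=[10., 3.] only the brightening is removed, since the dimming lies inside the 10-sigma limit; the symmetric 5-sigma clip removes both points.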
-
astrobase.lcproc.epd.
parallel_epd_lcdir
(lcdir, externalparams, lcfileglob=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, nworkers=2, maxworkertasks=1000)[source]¶ This applies EPD in parallel to all LCs in a directory.
Parameters: - lcdir (str) – The light curve directory to process.
- externalparams (dict or None) –
This is a dict that indicates which keys in the lcdict obtained from the lcfile correspond to the required external parameters. As with timecol, magcol, and errcol, these can be simple keys (e.g. ‘rjd’) or compound keys (‘magaperture1.mags’). The dict should look something like:
{'fsv':'<lcdict key>' array: S values for each observation,
 'fdv':'<lcdict key>' array: D values for each observation,
 'fkv':'<lcdict key>' array: K values for each observation,
 'xcc':'<lcdict key>' array: x coords for each observation,
 'ycc':'<lcdict key>' array: y coords for each observation,
 'bgv':'<lcdict key>' array: sky background for each observation,
 'bge':'<lcdict key>' array: sky background err for each observation,
 'iha':'<lcdict key>' array: hour angle for each observation,
 'izd':'<lcdict key>' array: zenith distance for each observation}
- lcfileglob (str or None) – A UNIX fileglob to use to select light curve files in lcdir. If this is not None, the value provided will override the default fileglob for your light curve format.
- timecols,magcols,errcols (lists of str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as inputs to the EPD process. If these are None, the default values for timecols, magcols, and errcols for your light curve format will be used here.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –
This specifies how to sigma-clip the input LC before fitting the EPD function to it.
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
- epdsmooth_func (Python function) –
This sets the smoothing filter function to use. A Savitzky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:
def smoothfunc(mags_array, window_size, **extraparams)
and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.
- epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
- nworkers (int) – The number of parallel workers to launch when processing the LCs.
- maxworkertasks (int) – The maximum number of tasks a parallel worker will complete before it is replaced with a new one (sometimes helps with memory-leaks).
Returns: Returns a dict organized by all the keys in the input magcols list, containing lists of EPD pickle light curves for that magcol.
Return type: dict
Notes
- S -> measure of PSF sharpness (~1/sigma^2, so smaller S = wider PSF)
- D -> measure of PSF ellipticity in xy direction
- K -> measure of PSF ellipticity in cross direction
S, D, K are related to the PSF’s variance and covariance, see eqn 30-33 in A. Pal’s thesis: https://arxiv.org/abs/0906.3486
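The externalparams mapping described above can be sketched as follows. The lcdict key names on the right-hand side ('psf_s', 'xcoord', and so on) are placeholders invented for this example and must be replaced by the keys your own light curve reader produces. The call to parallel_epd_lcdir uses only keyword arguments that appear in the signature above and is wrapped in a function so the dict check can run without astrobase installed.

```python
# the nine external parameter names listed in the documentation above
REQUIRED_EPD_KEYS = ('fsv', 'fdv', 'fkv', 'xcc', 'ycc',
                     'bgv', 'bge', 'iha', 'izd')

# values are lcdict keys; every name here is a hypothetical placeholder
externalparams = {
    'fsv': 'psf_s',        # S values for each observation
    'fdv': 'psf_d',        # D values for each observation
    'fkv': 'psf_k',        # K values for each observation
    'xcc': 'xcoord',       # x coords for each observation
    'ycc': 'ycoord',       # y coords for each observation
    'bgv': 'background',   # sky background for each observation
    'bge': 'background_err',  # sky background err for each observation
    'iha': 'hour_angle',   # hour angle for each observation
    'izd': 'zenith_dist',  # zenith distance for each observation
}


def check_externalparams(params):
    """Raise KeyError if any of the documented keys is absent."""
    missing = [key for key in REQUIRED_EPD_KEYS if key not in params]
    if missing:
        raise KeyError('missing external parameter keys: %r' % missing)
    return True


def run_epd(lcdir, nworkers=2):
    """Run EPD on every LC in lcdir (requires astrobase to be installed)."""
    # imported lazily so the dict check above works without astrobase
    from astrobase.lcproc.epd import parallel_epd_lcdir

    return parallel_epd_lcdir(
        lcdir,
        externalparams,
        epdsmooth_sigclip=[10.0, 3.0],  # asymmetric clip, as described above
        epdsmooth_windowsize=21,
        nworkers=nworkers,
    )


check_externalparams(externalparams)
```

Whether all nine keys are strictly required is an assumption of this sketch; the documentation only says the dict "should look something like" the listing above.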
astrobase.lcproc.lcbin module¶
This contains parallelized functions to bin large numbers of light curves in time.
-
astrobase.lcproc.lcbin.
timebinlc
(lcfile, binsizesec, outdir=None, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, minbinelems=7)[source]¶ This bins the given light curve file in time using the specified bin size.
Parameters: - lcfile (str) – The file name to process.
- binsizesec (float) – The time bin-size in seconds.
- outdir (str or None) – If this is a str, the output LC will be written to outdir. If this is None, the output LC will be written to the same directory as lcfile.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve file.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols,magcols,errcols (lists of str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as inputs to the binning process. If these are None, the default values for timecols, magcols, and errcols for your light curve format will be used here.
- minbinelems (int) – The minimum number of time-bin elements required to accept a time-bin as valid for the output binned light curve.
Returns: The name of the output pickle file with the binned LC.
Writes the output binned light curve to a pickle that contains the lcdict with an added lcdict[‘binned’][magcol] key, which contains the binned times, mags/fluxes, and errs as lcdict[‘binned’][magcol][‘times’], lcdict[‘binned’][magcol][‘mags’], and lcdict[‘binned’][magcol][‘errs’] for each magcol provided in the input or default magcols value for this light curve format.
Return type: str
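To make the roles of binsizesec and minbinelems concrete, the following is a simplified sketch of time-binning. It is not the astrobase implementation: it assumes times are in days, uses fixed bins starting at the first observation, and summarizes each bin with a median. The function name is hypothetical.

```python
from statistics import median


def timebin_sketch(times_days, mags, binsizesec, minbinelems=7):
    """Bin a light curve in time, dropping sparsely populated bins.

    Assumes times are in days, so the bin size in seconds is converted
    to days before use.
    """
    binsize_days = binsizesec / 86400.0
    t0 = min(times_days)

    # group the observations by the index of the bin they fall into
    bins = {}
    for t, m in zip(times_days, mags):
        idx = int((t - t0) // binsize_days)
        bins.setdefault(idx, []).append((t, m))

    binned_times, binned_mags = [], []
    for idx in sorted(bins):
        members = bins[idx]
        # bins with fewer than minbinelems points are not accepted
        if len(members) < minbinelems:
            continue
        binned_times.append(median(t for t, _ in members))
        binned_mags.append(median(m for _, m in members))

    return binned_times, binned_mags


# eight closely spaced points, then three isolated points much later
times = [0.0001 * i for i in range(8)] + [0.6 + 0.0001 * i for i in range(3)]
mags = [10.0] * 7 + [10.5] + [12.0] * 3
binned_times, binned_mags = timebin_sketch(times, mags, 600.0, minbinelems=7)
```

With a 600-second bin, the first eight points share one bin and survive the minbinelems=7 cut, while the three later points are discarded.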
-
astrobase.lcproc.lcbin.
timebinlc_worker
(task)[source]¶ This is a parallel worker for the function below.
Parameters: task (tuple) – This is of the form:
task[0] = lcfile
task[1] = binsizesec
task[2] = {'outdir','lcformat','lcformatdir', 'timecols','magcols','errcols','minbinelems'}
Returns: The output pickle file with the binned LC if successful. None otherwise. Return type: str
-
astrobase.lcproc.lcbin.
parallel_timebin
(lclist, binsizesec, maxobjects=None, outdir=None, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, minbinelems=7, nworkers=2, maxworkertasks=1000)[source]¶ This time-bins all the LCs in the list using the specified bin size.
Parameters: - lclist (list of str) – The input LCs to process.
- binsizesec (float) – The time bin size to use in seconds.
- maxobjects (int or None) – If provided, LC processing will stop at lclist[maxobjects].
- outdir (str or None) – The directory where output LCs will be written. If None, will write to the same directory as the input LCs.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve file.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols,magcols,errcols (lists of str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as inputs to the binning process. If these are None, the default values for timecols, magcols, and errcols for your light curve format will be used here.
- minbinelems (int) – The minimum number of time-bin elements required to accept a time-bin as valid for the output binned light curve.
- nworkers (int) – Number of parallel workers to launch.
- maxworkertasks (int) – The maximum number of tasks a parallel worker will complete before being replaced to guard against memory leaks.
Returns: The returned dict contains keys = input LCs, vals = output LCs.
Return type: dict
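The nworkers and maxworkertasks parameters correspond naturally to a process pool whose workers are recycled after a fixed number of tasks. The sketch below shows that general pattern with the standard library; it assumes each task is a three-element tuple of (lcfile, binsizesec, kwargs dict) and is not a copy of the astrobase driver. Both helper names are hypothetical.

```python
import multiprocessing as mp


def make_timebin_tasks(lclist, binsizesec, maxobjects=None, **kwargs):
    """Build one (lcfile, binsizesec, kwargs) task tuple per light curve."""
    # stop at lclist[maxobjects] if a limit was given
    if maxobjects is not None:
        lclist = lclist[:maxobjects]
    return [(lcfile, binsizesec, kwargs) for lcfile in lclist]


def run_tasks(worker, tasks, nworkers=2, maxworkertasks=1000):
    """Map a worker over tasks and return {input LC: worker result}."""
    # maxtasksperchild replaces a worker process after it has handled
    # that many tasks, which limits the effect of slow memory leaks
    with mp.Pool(nworkers, maxtasksperchild=maxworkertasks) as pool:
        results = pool.map(worker, tasks)
    return {task[0]: result for task, result in zip(tasks, results)}


tasks = make_timebin_tasks(
    ['lc-a.pkl', 'lc-b.pkl', 'lc-c.pkl'],  # hypothetical file names
    600.0,
    maxobjects=2,
    minbinelems=7,
)
```

The worker passed to run_tasks must be defined at module top level so it can be pickled and sent to the worker processes, and the pool should be started under an `if __name__ == '__main__':` guard on platforms that spawn rather than fork.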
-
astrobase.lcproc.lcbin.
parallel_timebin_lcdir
(lcdir, binsizesec, maxobjects=None, outdir=None, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, minbinelems=7, nworkers=2, maxworkertasks=1000)[source]¶ This time bins all the light curves in the specified directory.
Parameters: - lcdir (str) – Directory containing the input LCs to process.
- binsizesec (float) – The time bin size to use in seconds.
- maxobjects (int or None) – If provided, LC processing will stop at lclist[maxobjects].
- outdir (str or None) – The directory where output LCs will be written. If None, will write to the same directory as the input LCs.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve file.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols,magcols,errcols (lists of str) – The keys in the lcdict produced by your light curve reader function that correspond to the times, mags/fluxes, and associated measurement errors that will be used as inputs to the binning process. If these are None, the default values for timecols, magcols, and errcols for your light curve format will be used here.
- minbinelems (int) – The minimum number of time-bin elements required to accept a time-bin as valid for the output binned light curve.
- nworkers (int) – Number of parallel workers to launch.
- maxworkertasks (int) – The maximum number of tasks a parallel worker will complete before being replaced to guard against memory leaks.
Returns: The returned dict contains keys = input LCs, vals = output LCs.
Return type: dict
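Since the returned dict maps input LCs to output LCs, and the worker documented above returns None when it fails, a small post-processing step can separate successes from failures. This sketch assumes that a failed worker's None is carried through as the dict value; the helper name and file names are hypothetical.

```python
def split_timebin_results(results):
    """Separate successfully binned LCs from the ones that failed."""
    done = {lc: out for lc, out in results.items() if out is not None}
    failed = sorted(lc for lc, out in results.items() if out is None)
    return done, failed


# stand-in for the dict returned by the parallel time-binning drivers
results = {
    'lcs/HAT-001.sqlitecurve.gz': 'lcs/HAT-001-binned.pkl',
    'lcs/HAT-002.sqlitecurve.gz': None,
}
done, failed = split_timebin_results(results)
```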
astrobase.lcproc.lcpfeatures module¶
This contains functions to generate periodic light curve features for later variable star classification.
-
astrobase.lcproc.lcpfeatures.
get_periodicfeatures
(pfpickle, lcbasedir, outdir, fourierorder=5, transitparams=(-0.01, 0.1, 0.1), ebparams=(-0.2, 0.3, 0.7, 0.5), pdiff_threshold=0.0001, sidereal_threshold=0.0001, sampling_peak_multiplier=5.0, sampling_startp=None, sampling_endp=None, starfeatures=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, sigclip=10.0, verbose=True, raiseonfail=False)[source]¶ This gets all periodic features for the object.
Parameters: - pfpickle (str) – The period-finding result pickle containing period-finder results to use for the calculation of LC fit, periodogram, and phased LC features.
- lcbasedir (str) – The base directory where the light curve for the current object is located.
- outdir (str) – The output directory where the results will be written.
- fourierorder (int) – The Fourier order to use to generate sinusoidal function and fit that to the phased light curve.
- transitparams (list of floats) – The transit depth, duration, and ingress duration to use to generate a trapezoid planet transit model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- ebparams (list of floats) – The primary eclipse depth, eclipse duration, the primary-secondary depth ratio, and the phase of the secondary eclipse to use to generate an eclipsing binary model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- pdiff_threshold (float) – This is the max difference between periods to consider them the same.
- sidereal_threshold (float) – This is the max difference between any of the ‘best’ periods and the sidereal day periods to consider them the same.
- sampling_peak_multiplier (float) – This is the minimum multiplicative factor of a ‘best’ period’s normalized periodogram peak over the sampling periodogram peak at the same period required to accept the ‘best’ period as possibly real.
- sampling_startp, sampling_endp (float or None) – If the pgramlist doesn’t have a time-sampling Lomb-Scargle periodogram, it will be obtained automatically. Use these kwargs to control the minimum and maximum period interval to be searched when generating this periodogram.
- starfeatures (str or None) – If not None, this should be the filename of the starfeatures-<objectid>.pkl created by astrobase.lcproc.lcsfeatures.get_starfeatures() for this object. This is used to get the neighbor’s light curve and phase it with this object’s period to see if this object is blended.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate progress while working.
- raiseonfail (bool) – If True, will raise an Exception if something goes wrong.
Returns: Returns a filename for the output pickle containing all of the periodic features for the input object’s LC.
Return type: str
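The asymmetric sigclip behavior described above depends on whether the series holds magnitudes or fluxes, because a dimming is a larger magnitude but a smaller flux. The following sketch illustrates that logic under stated assumptions: it clips once around the median and uses a MAD-based estimate of the standard deviation. It is an illustration of the documented semantics, not the astrobase routine, and the function name is hypothetical.

```python
import numpy as np


def asymmetric_sigclip_sketch(values, sigclip=(10.0, 3.0),
                              magsarefluxes=False):
    """Clip dimmings and brightenings with separate sigma limits.

    sigclip[0] applies to fainter values, sigclip[1] to brighter values.
    """
    values = np.asarray(values, dtype=float)
    values = values[np.isfinite(values)]  # drop non-finite elements

    dim_sigma, bright_sigma = sigclip
    med = np.median(values)
    # robust stdev estimate from the median absolute deviation (assumed)
    stdev = 1.483 * np.median(np.abs(values - med))

    # 'dimness' is positive when the object is fainter than the median:
    # larger magnitude, or smaller flux
    if magsarefluxes:
        dimness = med - values
    else:
        dimness = values - med

    keep = (dimness <= dim_sigma * stdev) & (-dimness <= bright_sigma * stdev)
    return values[keep]


series = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.1, 9.9, 10.05, 9.95,
          11.0,   # large excursion towards bigger values
          9.5]    # moderate excursion towards smaller values

as_mags = asymmetric_sigclip_sketch(series, (10.0, 3.0), magsarefluxes=False)
as_flux = asymmetric_sigclip_sketch(series, (10.0, 3.0), magsarefluxes=True)
```

Treated as magnitudes, the point at 11.0 is a dimming within the 10-sigma limit and is kept, while 9.5 is a brightening beyond 3 sigma and is removed. Treated as fluxes, the roles swap, which is why magsarefluxes must be set correctly.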
-
astrobase.lcproc.lcpfeatures.
serial_periodicfeatures
(pfpkl_list, lcbasedir, outdir, starfeaturesdir=None, fourierorder=5, transitparams=(-0.01, 0.1, 0.1), ebparams=(-0.2, 0.3, 0.7, 0.5), pdiff_threshold=0.0001, sidereal_threshold=0.0001, sampling_peak_multiplier=5.0, sampling_startp=None, sampling_endp=None, starfeatures=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, sigclip=10.0, verbose=False, maxobjects=None)[source]¶ This drives the periodicfeatures collection for a list of periodfinding pickles.
Parameters: - pfpkl_list (list of str) – The list of period-finding pickles to use.
- lcbasedir (str) – The base directory where the associated light curves are located.
- outdir (str) – The directory where the results will be written.
- starfeaturesdir (str or None) – The directory containing the starfeatures-<objectid>.pkl files for each object to use to calculate neighbor proximity light curve features.
- fourierorder (int) – The Fourier order to use to generate sinusoidal function and fit that to the phased light curve.
- transitparams (list of floats) – The transit depth, duration, and ingress duration to use to generate a trapezoid planet transit model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- ebparams (list of floats) – The primary eclipse depth, eclipse duration, the primary-secondary depth ratio, and the phase of the secondary eclipse to use to generate an eclipsing binary model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- pdiff_threshold (float) – This is the max difference between periods to consider them the same.
- sidereal_threshold (float) – This is the max difference between any of the ‘best’ periods and the sidereal day periods to consider them the same.
- sampling_peak_multiplier (float) – This is the minimum multiplicative factor of a ‘best’ period’s normalized periodogram peak over the sampling periodogram peak at the same period required to accept the ‘best’ period as possibly real.
- sampling_startp, sampling_endp (float or None) – If the pgramlist doesn’t have a time-sampling Lomb-Scargle periodogram, it will be obtained automatically. Use these kwargs to control the minimum and maximum period interval to be searched when generating this periodogram.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate progress while working.
- maxobjects (int) – The total number of objects to process from pfpkl_list.
Returns: Nothing.
-
astrobase.lcproc.lcpfeatures.
parallel_periodicfeatures
(pfpkl_list, lcbasedir, outdir, starfeaturesdir=None, fourierorder=5, transitparams=(-0.01, 0.1, 0.1), ebparams=(-0.2, 0.3, 0.7, 0.5), pdiff_threshold=0.0001, sidereal_threshold=0.0001, sampling_peak_multiplier=5.0, sampling_startp=None, sampling_endp=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, sigclip=10.0, verbose=False, maxobjects=None, nworkers=2)[source]¶ This runs periodic feature generation in parallel for all periodfinding pickles in the input list.
Parameters: - pfpkl_list (list of str) – The list of period-finding pickles to use.
- lcbasedir (str) – The base directory where the associated light curves are located.
- outdir (str) – The directory where the results will be written.
- starfeaturesdir (str or None) – The directory containing the starfeatures-<objectid>.pkl files for each object to use to calculate neighbor proximity light curve features.
- fourierorder (int) – The Fourier order to use to generate sinusoidal function and fit that to the phased light curve.
- transitparams (list of floats) – The transit depth, duration, and ingress duration to use to generate a trapezoid planet transit model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- ebparams (list of floats) – The primary eclipse depth, eclipse duration, the primary-secondary depth ratio, and the phase of the secondary eclipse to use to generate an eclipsing binary model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- pdiff_threshold (float) – This is the max difference between periods to consider them the same.
- sidereal_threshold (float) – This is the max difference between any of the ‘best’ periods and the sidereal day periods to consider them the same.
- sampling_peak_multiplier (float) – This is the minimum multiplicative factor of a ‘best’ period’s normalized periodogram peak over the sampling periodogram peak at the same period required to accept the ‘best’ period as possibly real.
- sampling_startp, sampling_endp (float or None) – If the pgramlist doesn’t have a time-sampling Lomb-Scargle periodogram, it will be obtained automatically. Use these kwargs to control the minimum and maximum period interval to be searched when generating this periodogram.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate progress while working.
- maxobjects (int) – The total number of objects to process from pfpkl_list.
- nworkers (int) – The number of parallel workers to launch to process the input.
Returns: A dict containing key: val pairs of the input period-finder result and the output periodic feature result pickles for each input pickle is returned.
Return type: dict
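The three threshold parameters above (pdiff_threshold, sidereal_threshold, and sampling_peak_multiplier) each describe a simple comparison. The sketch below spells those comparisons out. It assumes periods are in days, that differences are absolute rather than fractional, and that the relevant sidereal periods are a few multiples of one sidereal day; none of these details are confirmed by the documentation, and all function names are hypothetical.

```python
# mean sidereal day expressed in solar days (approximate)
SIDEREAL_DAY = 0.99726957


def periods_match(period1, period2, pdiff_threshold=1.0e-4):
    """True if two periods differ by less than pdiff_threshold."""
    return abs(period1 - period2) < pdiff_threshold


def near_sidereal_period(period, sidereal_threshold=1.0e-4,
                         multiples=(0.5, 1.0, 2.0)):
    """True if the period is close to an assumed sidereal-day multiple."""
    return any(
        abs(period - multiple * SIDEREAL_DAY) < sidereal_threshold
        for multiple in multiples
    )


def peak_beats_sampling(peak, sampling_peak, sampling_peak_multiplier=5.0):
    """True if a periodogram peak exceeds the sampling peak by the factor."""
    return peak >= sampling_peak_multiplier * sampling_peak


same = periods_match(1.23456, 1.23460)
alias = near_sidereal_period(0.99727)
real = peak_beats_sampling(0.6, 0.1)
```

A 'best' period that matches a sidereal period, or whose peak does not sufficiently exceed the time-sampling periodogram at the same period, is a candidate artifact of the observing cadence rather than of the star.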
-
astrobase.lcproc.lcpfeatures.
parallel_periodicfeatures_lcdir
(pfpkl_dir, lcbasedir, outdir, pfpkl_glob='periodfinding-*.pkl*', starfeaturesdir=None, fourierorder=5, transitparams=(-0.01, 0.1, 0.1), ebparams=(-0.2, 0.3, 0.7, 0.5), pdiff_threshold=0.0001, sidereal_threshold=0.0001, sampling_peak_multiplier=5.0, sampling_startp=None, sampling_endp=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, sigclip=10.0, verbose=False, maxobjects=None, nworkers=2, recursive=True)[source]¶ This runs parallel periodicfeature extraction for a directory of periodfinding result pickles.
Parameters: - pfpkl_dir (str) – The directory containing the pickles to process.
- lcbasedir (str) – The directory where all of the associated light curve files are located.
- outdir (str) – The directory where all the output will be written.
- pfpkl_glob (str) – The UNIX file glob to use to search for period-finder result pickles in pfpkl_dir.
- starfeaturesdir (str or None) – The directory containing the starfeatures-<objectid>.pkl files for each object to use to calculate neighbor proximity light curve features.
- fourierorder (int) – The Fourier order to use to generate sinusoidal function and fit that to the phased light curve.
- transitparams (list of floats) – The transit depth, duration, and ingress duration to use to generate a trapezoid planet transit model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- ebparams (list of floats) – The primary eclipse depth, eclipse duration, the primary-secondary depth ratio, and the phase of the secondary eclipse to use to generate an eclipsing binary model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- pdiff_threshold (float) – This is the max difference between periods to consider them the same.
- sidereal_threshold (float) – This is the max difference between any of the ‘best’ periods and the sidereal day periods to consider them the same.
- sampling_peak_multiplier (float) – This is the minimum multiplicative factor of a ‘best’ period’s normalized periodogram peak over the sampling periodogram peak at the same period required to accept the ‘best’ period as possibly real.
- sampling_startp, sampling_endp (float or None) – If the pgramlist doesn’t have a time-sampling Lomb-Scargle periodogram, it will be obtained automatically. Use these kwargs to control the minimum and maximum period interval to be searched when generating this periodogram.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- verbose (bool) – If True, will indicate progress while working.
- maxobjects (int) – The total number of objects to process from pfpkl_list.
- nworkers (int) – The number of parallel workers to launch to process the input.
Returns: A dict containing key: val pairs of the input period-finder result and the output periodic feature result pickles for each input pickle is returned.
Return type: dict
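The default pfpkl_glob of 'periodfinding-*.pkl*' ends in a trailing wildcard, so it matches both plain and compressed pickles. The short sketch below demonstrates the pattern semantics with the standard library fnmatch module; the file names are invented for the example, and how astrobase itself walks the directory is not shown here.

```python
import fnmatch

pfpkl_glob = 'periodfinding-*.pkl*'

# hypothetical directory listing
filenames = [
    'periodfinding-HAT-123.pkl',
    'periodfinding-HAT-456.pkl.gz',
    'starfeatures-HAT-123.pkl',
    'checkplot-HAT-123.pkl',
]

# keep only the names that satisfy the UNIX-style glob
matched = fnmatch.filter(filenames, pfpkl_glob)
```

Only the two periodfinding-* files are selected; star-feature and checkplot pickles in the same directory are ignored.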
astrobase.lcproc.lcsfeatures module¶
This contains functions to obtain various star magnitude and color features for large numbers of light curves. Useful later for variable star classification.
-
astrobase.lcproc.lcsfeatures.
get_starfeatures
(lcfile, outdir, kdtree, objlist, lcflist, neighbor_radius_arcsec, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None)[source]¶ This runs the functions from astrobase.varclass.starfeatures() on a single light curve file.
Parameters: - lcfile (str) – This is the LC file to extract star features for.
- outdir (str) – This is the directory to write the output pickle to.
- kdtree (scipy.spatial.cKDTree) – This is a scipy.spatial.KDTree or cKDTree used to calculate neighbor proximity features. This is for the light curve catalog this object is in.
- objlist (np.array) – This is a Numpy array of object IDs in the same order as the kdtree.data np.array. This is for the light curve catalog this object is in.
- lcflist (np.array) – This is a Numpy array of light curve filenames in the same order as kdtree.data. This is for the light curve catalog this object is in.
- neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
- deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
- custom_bandpasses (dict or None) –
This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:
{
 '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                     'label':'<band_label_1>',
                     'colors':[['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                               ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
 .
 ...
 .
 '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                     'label':'<band_label_N>',
                     'colors':[['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                               ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
}
Where:
bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band
twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):
|Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
|char       |float  |float             |float  |float           |float|
|           |microns|                  |mags   |                |mags |
 CTIO U      0.3734  4.107              0.209   4.968            0.253
 CTIO B      0.4309  3.641              0.186   4.325            0.221
 CTIO V      0.5517  2.682              0.137   3.240            0.165
 ...
The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.
band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.
The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:
['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
where the first item gives the bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:
['sdssu-sdssg','u - g']
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns: Path to the output pickle containing all of the star features for this object.
Return type: str
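As an illustration of the custom_bandpasses format described above, here is a minimal sketch that builds one entry and checks its shape. The ‘vmag’ key and ‘CTIO V’ dust key come from the text; the color definition and the check_custom_bandpasses helper are hypothetical and not part of Astrobase.

```python
# Hypothetical example entry: 'vmag' and 'CTIO V' are taken from the text above,
# the 'bmag-vmag' color definition is made up for illustration.
custom_bandpasses = {
    "vmag": {
        "dustkey": "CTIO V",   # str: DUST table key; float: fixed reddening; None: skip
        "label": "V",
        "colors": [["bmag-vmag", "B - V"]],
    },
}


def check_custom_bandpasses(bandpasses):
    """Return a list of problems found in a custom_bandpasses dict."""
    problems = []
    for key, spec in bandpasses.items():
        # every bandpass entry needs these three keys
        for required in ("dustkey", "label", "colors"):
            if required not in spec:
                problems.append(f"{key}: missing '{required}'")
        # the dust key may be a table key, a reddening value, or None
        dustkey = spec.get("dustkey")
        if not (dustkey is None or isinstance(dustkey, (str, float))):
            problems.append(f"{key}: dustkey must be str, float, or None")
        # each color is a [keys, label] pair, with keys of the form 'a-b'
        for color in spec.get("colors", []):
            if len(color) != 2 or "-" not in color[0]:
                problems.append(f"{key}: bad color definition {color!r}")
    return problems


print(check_custom_bandpasses(custom_bandpasses))
```

A check like this can catch a malformed entry before a long batch run starts.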
-
astrobase.lcproc.lcsfeatures.
serial_starfeatures
(lclist, outdir, lc_catalog_pickle, neighbor_radius_arcsec, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None)[source]¶ This drives the get_starfeatures function for a collection of LCs.
Parameters: - lclist (list of str) – The list of light curve file names to process.
- outdir (str) – The output directory where the results will be placed.
- lc_catalog_pickle (str) –
The path to a catalog pickle containing a dict with at least:
- an object ID array accessible with dict[‘objects’][‘objectid’]
- an LC filename array accessible with dict[‘objects’][‘lcfname’]
- a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]
A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().
- neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
- maxobjects (int) – The number of objects to process from lclist.
- deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
- custom_bandpasses (dict or None) –
This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:
{
    '<bandpass_key_1>': {'dustkey': '<twomass_dust_key_1>',
                         'label': '<band_label_1>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>': {'dustkey': '<twomass_dust_key_N>',
                         'label': '<band_label_N>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
}
Where:
bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band
twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):
|Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
|char       |float  |float             |float  |float           |float|
|           |microns|                  |mags   |                |mags |
 CTIO U      0.3734  4.107              0.209   4.968            0.253
 CTIO B      0.4309  3.641              0.186   4.325            0.221
 CTIO V      0.5517  2.682              0.137   3.240            0.165
 ...
The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.
band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.
The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:
['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
where the first item gives the bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:
['sdssu-sdssg','u - g']
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns: A list of all star features pickles produced.
Return type: list of str
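The catalog layout required by lc_catalog_pickle can be checked before starting a long run. The sketch below uses only the three keys listed above; the missing_catalog_keys helper and the stand-in catalog contents are hypothetical. A real catalog would hold Numpy arrays and a scipy KD-tree.

```python
import pickle


def missing_catalog_keys(catalog):
    """List the required catalog entries that are absent from the dict."""
    missing = []
    objects = catalog.get("objects", {})
    for col in ("objectid", "lcfname"):
        if col not in objects:
            missing.append(f"objects/{col}")
    if "kdtree" not in catalog:
        missing.append("kdtree")
    return missing


# Stand-in catalog used only to show the expected nesting.
catalog = {
    "objects": {
        "objectid": ["obj-1", "obj-2"],
        "lcfname": ["obj-1.lc", "obj-2.lc"],
    },
    "kdtree": None,  # placeholder for a scipy.spatial.cKDTree
}

# Round-trip through pickle, as the drivers read the catalog from disk.
roundtrip = pickle.loads(pickle.dumps(catalog))
print(missing_catalog_keys(roundtrip))
```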
-
astrobase.lcproc.lcsfeatures.
parallel_starfeatures
(lclist, outdir, lc_catalog_pickle, neighbor_radius_arcsec, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]¶ This runs get_starfeatures in parallel for all light curves in lclist.
Parameters: - lclist (list of str) – The list of light curve file names to process.
- outdir (str) – The output directory where the results will be placed.
- lc_catalog_pickle (str) –
The path to a catalog pickle containing a dict with at least:
- an object ID array accessible with dict[‘objects’][‘objectid’]
- an LC filename array accessible with dict[‘objects’][‘lcfname’]
- a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]
A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().
- neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
- maxobjects (int) – The number of objects to process from lclist.
- deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
- custom_bandpasses (dict or None) –
This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:
{
    '<bandpass_key_1>': {'dustkey': '<twomass_dust_key_1>',
                         'label': '<band_label_1>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>': {'dustkey': '<twomass_dust_key_N>',
                         'label': '<band_label_N>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
}
Where:
bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band
twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):
|Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
|char       |float  |float             |float  |float           |float|
|           |microns|                  |mags   |                |mags |
 CTIO U      0.3734  4.107              0.209   4.968            0.253
 CTIO B      0.4309  3.641              0.186   4.325            0.221
 CTIO V      0.5517  2.682              0.137   3.240            0.165
 ...
The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.
band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.
The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:
['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
where the first item gives the bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:
['sdssu-sdssg','u - g']
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- nworkers (int) – The number of parallel workers to launch.
Returns: A dict with key:val pairs of the input light curve filename and the output star features pickle for each LC processed.
Return type: dict
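The neighbor_radius_arcsec parameter is an angle, while a KD-tree is queried with a Euclidean distance. The sketch below shows one way to relate the two, assuming the catalog KD-tree was built from unit-sphere Cartesian coordinates. That layout is an assumption not stated above, and the helper names are hypothetical.

```python
import math


def radec_to_xyz(ra_deg, dec_deg):
    """Convert equatorial coordinates in degrees to a unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (
        math.cos(dec) * math.cos(ra),
        math.cos(dec) * math.sin(ra),
        math.sin(dec),
    )


def arcsec_to_chord(radius_arcsec):
    """Convert an angular radius in arcsec to a chord length on the unit sphere."""
    radius_rad = math.radians(radius_arcsec / 3600.0)
    return 2.0 * math.sin(radius_rad / 2.0)


# Two positions separated by 30 arcsec in declination.
first = radec_to_xyz(120.0, -20.0)
second = radec_to_xyz(120.0, -20.0 + 30.0 / 3600.0)

# The straight-line distance between them matches the chord for 30 arcsec.
print(math.dist(first, second), arcsec_to_chord(30.0))
```

The chord length is what would be passed as the search radius to a ball query on such a tree.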
-
astrobase.lcproc.lcsfeatures.
parallel_starfeatures_lcdir
(lcdir, outdir, lc_catalog_pickle, neighbor_radius_arcsec, fileglob=None, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None, nworkers=2, recursive=True)[source]¶ This runs parallel star feature extraction for a directory of LCs.
Parameters: - lcdir (str) – The directory to search for light curves.
- outdir (str) – The output directory where the results will be placed.
- lc_catalog_pickle (str) –
The path to a catalog pickle containing a dict with at least:
- an object ID array accessible with dict[‘objects’][‘objectid’]
- an LC filename array accessible with dict[‘objects’][‘lcfname’]
- a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]
A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().
- neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
- fileglob (str) – The UNIX file glob to use to search for the light curves in lcdir. If None, the default value for the light curve format specified will be used.
- maxobjects (int) – The number of objects to process from lclist.
- deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
- custom_bandpasses (dict or None) –
This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:
{
    '<bandpass_key_1>': {'dustkey': '<twomass_dust_key_1>',
                         'label': '<band_label_1>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>': {'dustkey': '<twomass_dust_key_N>',
                         'label': '<band_label_N>',
                         'colors': [['<bandkey1>-<bandkey2>', '<BAND1> - <BAND2>'],
                                    ['<bandkey3>-<bandkey4>', '<BAND3> - <BAND4>']]},
}
Where:
bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band
twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):
|Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
|char       |float  |float             |float  |float           |float|
|           |microns|                  |mags   |                |mags |
 CTIO U      0.3734  4.107              0.209   4.968            0.253
 CTIO B      0.4309  3.641              0.186   4.325            0.221
 CTIO V      0.5517  2.682              0.137   3.240            0.165
 ...
The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.
band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.
The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:
['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
where the first item gives the bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:
['sdssu-sdssg','u - g']
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- nworkers (int) – The number of parallel workers to launch.
Returns: A dict with key:val pairs of the input light curve filename and the output star features pickle for each LC processed.
Return type: dict
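The interaction between lcdir, fileglob, and recursive can be sketched with the standard library. This is an illustration of the idea only, not the Astrobase implementation. The find_lcfiles helper and the '*.lc' pattern are hypothetical.

```python
import glob
import os


def find_lcfiles(lcdir, fileglob, recursive=True):
    """Return a sorted list of files under lcdir that match fileglob."""
    if recursive:
        # '**' matches lcdir itself and any depth of subdirectories
        pattern = os.path.join(lcdir, "**", fileglob)
    else:
        pattern = os.path.join(lcdir, fileglob)
    return sorted(glob.glob(pattern, recursive=recursive))


if __name__ == "__main__":
    # List matching files in the current directory only.
    print(find_lcfiles(".", "*.lc", recursive=False))
```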
astrobase.lcproc.lcvfeatures module¶
This contains functions to generate variability features for large collections of light curves. Useful later for variable star classification.
-
astrobase.lcproc.lcvfeatures.
get_varfeatures
(lcfile, outdir, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None)[source]¶ This runs
astrobase.varclass.varfeatures.all_nonperiodic_features()
on a single LC file.
Parameters: - lcfile (str) – The input light curve to process.
- outdir (str) – The directory where the output variability features pickle will be written.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- mindet (int) – The minimum number of LC points required to generate variability features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns: The generated variability features pickle for the input LC, with results for each magcol in the input magcol or light curve format’s default magcol list.
Return type: str
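The timecols, magcols, and errcols lists are parallel: the entries at the same position together describe one time series. The sketch below pairs them up and applies a mindet-style cut. It assumes that only finite points count and that the threshold is inclusive, neither of which is stated above. The column names and the usable_magcols helper are hypothetical.

```python
import math


def usable_magcols(lcdict, timecols, magcols, errcols, mindet=1000):
    """Return the magcols that have at least mindet fully finite points."""
    usable = []
    for tcol, mcol, ecol in zip(timecols, magcols, errcols):
        points = zip(lcdict[tcol], lcdict[mcol], lcdict[ecol])
        # a point counts only if its time, mag, and err are all finite
        nfinite = sum(
            1 for point in points if all(math.isfinite(v) for v in point)
        )
        if nfinite >= mindet:
            usable.append(mcol)
    return usable


nan = float("nan")
lcdict = {
    "time": [1.0, 2.0, 3.0, 4.0],
    "mag_a": [10.0, nan, 10.1, 10.2],
    "err_a": [0.01, 0.01, 0.01, 0.01],
    "mag_b": [nan, nan, 11.0, 11.1],
    "err_b": [0.02, 0.02, 0.02, 0.02],
}

print(usable_magcols(lcdict, ["time", "time"], ["mag_a", "mag_b"],
                     ["err_a", "err_b"], mindet=3))
```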
-
astrobase.lcproc.lcvfeatures.
serial_varfeatures
(lclist, outdir, maxobjects=None, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None)[source]¶ This runs variability feature extraction for a list of LCs.
Parameters: - lclist (list of str) – The list of light curve file names to process.
- outdir (str) – The directory where the output varfeatures pickle files will be written.
- maxobjects (int) – The number of LCs to process from lclist.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- mindet (int) – The minimum number of LC points required to generate variability features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns: List of the generated variability features pickles for the input LCs, with results for each magcol in the input magcol or light curve format’s default magcol list.
Return type: list of str
-
astrobase.lcproc.lcvfeatures.
parallel_varfeatures
(lclist, outdir, maxobjects=None, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]¶ This runs variability feature extraction in parallel for all LCs in lclist.
Parameters: - lclist (list of str) – The list of light curve file names to process.
- outdir (str) – The directory where the output varfeatures pickle files will be written.
- maxobjects (int) – The number of LCs to process from lclist.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- mindet (int) – The minimum number of LC points required to generate variability features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- nworkers (int) – The number of parallel workers to launch.
Returns: A dict with key:val pairs of input LC file name : the generated variability features pickles for each of the input LCs, with results for each magcol in the input magcol or light curve format’s default magcol list.
Return type: dict
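The shape of the returned dict (input LC file name mapped to output pickle) can be sketched with a small worker pool. This example swaps in threads from concurrent.futures for simplicity and does not reproduce how Astrobase launches its workers. The fake_varfeatures worker and run_parallel driver are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_varfeatures(lcfile):
    """Hypothetical per-file worker: returns the path of its output pickle."""
    return lcfile + ".varfeatures.pkl"


def run_parallel(lclist, worker, nworkers=2, maxobjects=None):
    """Apply worker to each LC and map input file name to worker result."""
    if maxobjects is not None:
        lclist = lclist[:maxobjects]
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # pool.map preserves input order, so results line up with lclist
        results = list(pool.map(worker, lclist))
    return dict(zip(lclist, results))


results = run_parallel(["a.lc", "b.lc", "c.lc"], fake_varfeatures, maxobjects=2)
print(results)
```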
-
astrobase.lcproc.lcvfeatures.
parallel_varfeatures_lcdir
(lcdir, outdir, fileglob=None, maxobjects=None, timecols=None, magcols=None, errcols=None, recursive=True, mindet=1000, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]¶ This runs parallel variability feature extraction for a directory of LCs.
Parameters: - lcdir (str) – The directory of light curve files to process.
- outdir (str) – The directory where the output varfeatures pickle files will be written.
- fileglob (str or None) – The file glob to use when looking for light curve files in lcdir. If None, the default file glob associated with this LC format will be used.
- maxobjects (int) – The number of LCs to process from lclist.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- mindet (int) – The minimum number of LC points required to generate variability features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- nworkers (int) – The number of parallel workers to launch.
Returns: A dict with key:val pairs of input LC file name : the generated variability features pickles for each of the input LCs, with results for each magcol in the input magcol or light curve format’s default magcol list.
Return type: dict
astrobase.lcproc.periodsearch module¶
This contains functions to run period-finding in a parallelized manner on large collections of light curves.
-
astrobase.lcproc.periodsearch.
runpf
(lcfile, outdir, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, pfmethods=('gls', 'pdm', 'mav', 'win'), pfkwargs=({}, {}, {}, {}), sigclip=10.0, getblssnr=False, nworkers=2, minobservations=500, excludeprocessed=False, raiseonfail=False)[source]¶ This runs the period-finding for a single LC.
Parameters: - lcfile (str) – The light curve file to run period-finding on.
- outdir (str) – The output directory where the result pickle will go.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- pfmethods (list of str) – This is a list of period finding methods to run. Each element is a string matching the keys of the PFMETHODS dict above. By default, this runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram.
- pfkwargs (list of dicts) – This is used to provide any special kwargs as dicts to each period-finding method function specified in pfmethods.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- getblssnr (bool) – If this is True and BLS is one of the methods specified in pfmethods, will also calculate the stats for each best period in the BLS results: transit depth, duration, ingress duration, refit period and epoch, and the SNR of the transit.
- nworkers (int) – The number of parallel period-finding workers to launch.
- minobservations (int) – The minimum number of finite LC points required to process a light curve.
- excludeprocessed (bool) –
If this is True, light curves that have existing period-finding result pickles in outdir will not be processed.
FIXME: currently, this uses a dumb method of excluding already-processed files. A smarter way to do this is to (i) generate a SHA512 cachekey based on a repr of {‘lcfile’, ‘timecols’, ‘magcols’, ‘errcols’, ‘lcformat’, ‘pfmethods’, ‘sigclip’, ‘getblssnr’, ‘pfkwargs’}, (ii) make sure all list kwargs in the dict are sorted, (iii) check if the output file has the same cachekey in its filename (last 8 chars of cachekey should work), so the result was processed in exactly the same way as specified in the input to this function, and can therefore be ignored. Will implement this later.
- raiseonfail (bool) – If something fails and this is True, will raise an Exception instead of returning None at the end.
Returns: The path to the output period-finding result pickle.
Return type: str
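The sigclip rules described above (symmetric, asymmetric, or None) can be sketched in plain Python. This sketch assumes the clip is centered on the median and scaled by 1.483 times the median absolute deviation. The description above does not specify either choice, and the sigclip_keep_mask helper is hypothetical.

```python
import math
import statistics


def sigclip_keep_mask(values, sigclip, magsarefluxes=False):
    """Return a list of booleans marking which points survive the clip."""
    finite = [v for v in values if math.isfinite(v)]
    if sigclip is None or not finite:
        # no clipping: only non-finite elements are dropped
        return [math.isfinite(v) for v in values]

    median = statistics.median(finite)
    stdev = 1.483 * statistics.median([abs(v - median) for v in finite])

    if isinstance(sigclip, (int, float)):
        dim_sigma = bright_sigma = float(sigclip)  # symmetric clip
    else:
        dim_sigma, bright_sigma = sigclip  # [dimmings, brightenings]

    mask = []
    for v in values:
        if not math.isfinite(v):
            mask.append(False)
            continue
        deviation = v - median
        # larger magnitudes are fainter; larger fluxes are brighter
        dimming = -deviation if magsarefluxes else deviation
        if dimming > 0:
            mask.append(dimming <= dim_sigma * stdev)
        else:
            mask.append(-dimming <= bright_sigma * stdev)
    return mask


mags = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 12.0, 9.0]
# Keep dimmings up to 30 sigma, but cut brightenings beyond 3 sigma.
print(sigclip_keep_mask(mags, [30.0, 3.0]))
```

With magnitudes, the asymmetric clip here keeps the faint outlier at 12.0 and removes the bright outlier at 9.0. Passing magsarefluxes=True reverses which of the two is treated as a dimming.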
-
astrobase.lcproc.periodsearch.
parallel_pf
(lclist, outdir, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, pfmethods=('gls', 'pdm', 'mav', 'win'), pfkwargs=({}, {}, {}, {}), sigclip=10.0, getblssnr=False, nperiodworkers=2, ncontrolworkers=1, liststartindex=None, listmaxobjects=None, minobservations=500, excludeprocessed=True)[source]¶ This drives the overall parallel period processing for a list of LCs.
As a rough benchmark, 25000 HATNet light curves with up to 50000 points per LC take about 26 days in total for an invocation of this function using GLS+PDM+BLS, 10 periodworkers, and 4 controlworkers (so all 40 ‘cores’) on a 2 x Xeon E5-2660v3 machine.
Parameters: - lclist (list of str) – The list of light curve files to process.
- outdir (str) – The output directory where the period-finding result pickles will go.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- pfmethods (list of str) – This is a list of period finding methods to run. Each element is a string matching the keys of the PFMETHODS dict above. By default, this runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram.
- pfkwargs (list of dicts) – This is used to provide any special kwargs as dicts to each period-finding method function specified in pfmethods.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- getblssnr (bool) – If this is True and BLS is one of the methods specified in pfmethods, will also calculate the stats for each best period in the BLS results: transit depth, duration, ingress duration, refit period and epoch, and the SNR of the transit.
- nperiodworkers (int) – The number of parallel period-finding workers to launch per object task.
- ncontrolworkers (int) – The number of controlling processes to launch. This effectively sets how many objects from lclist will be processed in parallel.
- liststartindex (int or None) – This sets the index from where to start in lclist.
- listmaxobjects (int or None) – This sets the maximum number of objects in lclist to run period-finding for in this invocation. Together with liststartindex, listmaxobjects can be used to distribute processing over several independent machines if the number of light curves is very large.
- minobservations (int) – The minimum number of finite LC points required to process a light curve.
- excludeprocessed (bool) –
If this is True, light curves that have existing period-finding result pickles in outdir will not be processed.
FIXME: currently, this uses a dumb method of excluding already-processed files. A smarter way to do this is to (i) generate a SHA512 cachekey based on a repr of {‘lcfile’, ‘timecols’, ‘magcols’, ‘errcols’, ‘lcformat’, ‘pfmethods’, ‘sigclip’, ‘getblssnr’, ‘pfkwargs’}, (ii) make sure all list kwargs in the dict are sorted, (iii) check if the output file has the same cachekey in its filename (last 8 chars of cachekey should work), so the result was processed in exactly the same way as specified in the input to this function, and can therefore be ignored. Will implement this later.
Returns: A list of the period-finding pickles created for all of input LCs processed.
Return type: list of str
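As noted for liststartindex and listmaxobjects, the two together can split one long lclist across independent machines. The sketch below shows one way to compute the per-machine values and the slice each one selects. Both helpers are hypothetical, and the slicing semantics are an assumption based on the parameter descriptions.

```python
def select_chunk(lclist, liststartindex=None, listmaxobjects=None):
    """Return the part of lclist selected by the two list parameters."""
    start = liststartindex or 0
    if listmaxobjects is None:
        return lclist[start:]
    return lclist[start:start + listmaxobjects]


def machine_chunks(nlcs, nmachines):
    """Return one (liststartindex, listmaxobjects) pair per machine."""
    per_machine = -(-nlcs // nmachines)  # ceiling division
    return [(i * per_machine, per_machine) for i in range(nmachines)]


lclist = [f"lc-{i:02d}.pkl" for i in range(10)]
for start, maxobjects in machine_chunks(len(lclist), 3):
    print(start, maxobjects, select_chunk(lclist, start, maxobjects))
```

Each machine would then call the driver with the same lclist and its own pair of values, so that no light curve is processed twice.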
-
astrobase.lcproc.periodsearch.
parallel_pf_lcdir
(lcdir, outdir, fileglob=None, recursive=True, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, pfmethods=('gls', 'pdm', 'mav', 'win'), pfkwargs=({}, {}, {}, {}), sigclip=10.0, getblssnr=False, nperiodworkers=2, ncontrolworkers=1, liststartindex=None, listmaxobjects=None, minobservations=500, excludeprocessed=True)[source]¶ This runs parallel light curve period finding for a directory of LCs.
Parameters: - lcdir (str) – The directory containing the LCs to process.
- outdir (str) – The directory where the resulting period-finding pickles will go.
- fileglob (str or None) – The UNIX file glob to use to search for LCs in lcdir. If None, the default file glob associated with the registered LC format will be used instead.
- recursive (bool) – If True, will search recursively in lcdir for light curves to process.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If provided, this gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- pfmethods (list of str) – This is a list of period finding methods to run. Each element is a string matching the keys of the PFMETHODS dict above. By default, this runs GLS, PDM, AoVMH, and the spectral window Lomb-Scargle periodogram.
- pfkwargs (list of dicts) – This is used to provide any special kwargs as dicts to each period-finding method function specified in pfmethods.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- getblssnr (bool) – If this is True and BLS is one of the methods specified in pfmethods, will also calculate the stats for each best period in the BLS results: transit depth, duration, ingress duration, refit period and epoch, and the SNR of the transit.
- nperiodworkers (int) – The number of parallel period-finding workers to launch per object task.
- ncontrolworkers (int) – The number of controlling processes to launch. This effectively sets how many objects from lclist will be processed in parallel.
- liststartindex (int or None) – This sets the index from where to start in lclist.
- listmaxobjects (int or None) – This sets the maximum number of objects in lclist to run period-finding for in this invocation. Together with liststartindex, listmaxobjects can be used to distribute processing over several independent machines if the number of light curves is very large.
- minobservations (int) – The minimum number of finite LC points required to process a light curve.
- excludeprocessed (bool) –
If this is True, light curves that have existing period-finding result pickles in outdir will not be processed.
FIXME: currently, this uses a dumb method of excluding already-processed files. A smarter way to do this is to (i) generate a SHA512 cachekey based on a repr of {‘lcfile’, ‘timecols’, ‘magcols’, ‘errcols’, ‘lcformat’, ‘pfmethods’, ‘sigclip’, ‘getblssnr’, ‘pfkwargs’}, (ii) make sure all list kwargs in the dict are sorted, (iii) check if the output file has the same cachekey in its filename (last 8 chars of cachekey should work), so the result was processed in exactly the same way as specified in the input to this function, and can therefore be ignored. Will implement this later.
Returns: A list of the period-finding pickles created for all of input LCs processed.
Return type: list of str
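As a rough illustration of how liststartindex and listmaxobjects can split one large run across several machines, the sketch below computes one (liststartindex, listmaxobjects) pair per machine. The helper name machine_slices is hypothetical, and the sketch assumes that the driver works on the slice lclist[liststartindex:liststartindex + listmaxobjects] and that every machine sees the light curve list in the same order; check both against the driver before relying on it.

```python
def machine_slices(nobjects, nmachines):
    """Hypothetical helper: one (liststartindex, listmaxobjects) per machine."""
    if nobjects <= 0 or nmachines <= 0:
        return []
    # ceiling division, so that every object lands in some slice
    chunk = -(-nobjects // nmachines)
    return [(start, min(chunk, nobjects - start))
            for start in range(0, nobjects, chunk)]


# settings to hand to each of three machines for ten light curves
for start, maxobjects in machine_slices(10, 3):
    print(f'liststartindex={start}, listmaxobjects={maxobjects}')
```

Each pair would then be passed as the liststartindex and listmaxobjects kwargs of one invocation.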
astrobase.lcproc.tfa module¶
This contains functions to run the Trend Filtering Algorithm (TFA) in a parallelized manner on large collections of light curves.
-
astrobase.lcproc.tfa.
tfa_templates_lclist
(lclist, outfile, lcinfo_pkl=None, target_template_frac=0.1, max_target_frac_obs=0.25, min_template_number=10, max_template_number=1000, max_rms=0.15, max_mult_above_magmad=1.5, max_mult_above_mageta=1.5, mag_bandpass='sdssr', custom_bandpasses=None, mag_bright_limit=10.0, mag_faint_limit=12.0, process_template_lcs=True, template_sigclip=5.0, template_interpolate='nearest', lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None, nworkers=2, maxworkertasks=1000)[source]¶ This selects template objects for TFA.
Selection criteria for TFA template ensemble objects:
- not variable: use a poly fit to the mag-MAD relation and eta-normal variability index to get nonvar objects
- not more than 10% of the total number of objects in the field or max_tfa_templates at most and no more than max_target_frac_obs x template_ndet objects.
- allow shuffling of the templates if the target ends up in them
- nothing with less than the median number of observations in the field
- sigma-clip the input time series observations
- TODO: select randomly in xi-eta space. This doesn’t seem to make a huge difference at the moment, so removed those bits for now. This function makes plots of xi-eta for the selected template objects so the distributions can be visualized.
This also determines the effective cadence that all TFA LCs will be binned to: the timebase of the template LC with the largest number of non-nan observations is used. All template LCs will be renormed to zero.
Parameters: - lclist (list of str) – This is a list of light curves to use as input to generate the template set.
- outfile (str) – This is the pickle filename to which the TFA template list will be written.
- lcinfo_pkl (str or None) – If provided, is a file path to a pickle file created by this function on a previous run containing the LC information. This will be loaded directly instead of having to re-run LC info collection. If None, will be placed in the same directory as outfile.
- target_template_frac (float) – This is the fraction of total objects in lclist to use for the number of templates.
- max_target_frac_obs (float) – This sets the number of templates to generate if the number of observations for the light curves is smaller than the number of objects in the collection. The number of templates will be set to this fraction of the number of observations if this is the case.
- min_template_number (int) – This is the minimum number of templates to generate.
- max_template_number (int) – This is the maximum number of templates to generate. If target_template_frac times the number of objects is greater than max_template_number, only max_template_number templates will be used.
- max_rms (float) – This is the maximum light curve RMS for an object to consider it as a possible template ensemble member.
- max_mult_above_magmad (float) – This is the maximum multiplier above the mag-RMS fit to consider an object as variable and thus not part of the template ensemble.
- max_mult_above_mageta (float) – This is the maximum multiplier above the mag-eta (variable index) fit to consider an object as variable and thus not part of the template ensemble.
- mag_bandpass (str) – This sets the key in the light curve dict’s objectinfo dict to use as the canonical magnitude for the object and apply any magnitude limits to.
- custom_bandpasses (dict or None) – This can be used to provide any custom band name keys to the star feature collection function.
- mag_bright_limit (float or list of floats) – This sets the brightest mag (in the mag_bandpass filter) for a potential member of the TFA template ensemble. If this is a single float, the value will be used for all magcols. If this is a list of floats with len = len(magcols), the specific bright limits will be used for each magcol individually.
- mag_faint_limit (float or list of floats) – This sets the faintest mag (in the mag_bandpass filter) for a potential member of the TFA template ensemble. If this is a single float, the value will be used for all magcols. If this is a list of floats with len = len(magcols), the specific faint limits will be used for each magcol individually.
- process_template_lcs (bool) – If True, will reform the template light curves to the chosen time-base. If False, will only select light curves for templates but not process them. This is useful for initial exploration of how the template LCs are selected.
- template_sigclip (float or sequence of floats or None) – This sets the sigma-clip to be applied to the template light curves.
- template_interpolate (str) – This sets the kwarg to pass to scipy.interpolate.interp1d to set the kind of interpolation to use when reforming light curves to the TFA template timebase.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
- nworkers (int) – The number of parallel workers to launch.
- maxworkertasks (int) – The maximum number of tasks to run per worker before it is replaced by a fresh one.
Returns: This function returns a dict that can be passed directly to apply_tfa_magseries below. It can optionally produce a pickle with the same dict, which can also be passed to that function.
Return type: dict
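The way target_template_frac, min_template_number, max_template_number, and max_target_frac_obs interact can be sketched as below. This is one reading of the selection criteria and parameter descriptions above, not the function's actual code; the helper name n_tfa_templates and the order in which the limits are applied are assumptions.

```python
def n_tfa_templates(nobjects, nobservations,
                    target_template_frac=0.1, max_target_frac_obs=0.25,
                    min_template_number=10, max_template_number=1000):
    """Hypothetical helper: how many template LCs to aim for."""
    # start from a fraction of the number of objects in the collection
    ntemplates = int(target_template_frac * nobservations * 0 +
                     target_template_frac * nobjects)
    # keep it between the minimum and maximum template numbers
    ntemplates = max(min_template_number,
                     min(ntemplates, max_template_number))
    # never use more templates than a fraction of the observations;
    # note that this cap is allowed to go below min_template_number here
    obs_cap = int(max_target_frac_obs * nobservations)
    return min(ntemplates, obs_cap)
```

For example, with the default kwargs a field of 20000 objects with 2000 observations each would end up with 500 templates under this reading, because the observation cap is tighter than max_template_number.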
-
astrobase.lcproc.tfa.
apply_tfa_magseries
(lcfile, timecol, magcol, errcol, templateinfo, mintemplatedist_arcmin=10.0, lcformat='hat-sql', lcformatdir=None, interp='nearest', sigclip=5.0)[source]¶ This applies the TFA correction to an LC given TFA template information.
Parameters: - lcfile (str) – This is the light curve file to apply the TFA correction to.
- timecol,magcol,errcol (str) – These are the column keys in the lcdict for the LC file to apply the TFA correction to.
- templateinfo (dict or str) – This is either the dict produced by tfa_templates_lclist or the pickle produced by the same function.
- mintemplatedist_arcmin (float) – This sets the minimum distance required from the target object for objects in the TFA template ensemble. Objects closer than this distance will be removed from the ensemble.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- interp (str) – This is passed to scipy.interpolate.interp1d as the kind of interpolation to use when reforming this light curve to the timebase of the TFA templates.
- sigclip (float or sequence of two floats or None) – This is the sigma clip to apply to this light curve before running TFA on it.
Returns: This returns the filename of the light curve file generated after TFA application. This is a pickle (that can be read by lcproc.read_pklc) in the same directory as lcfile. The magcol will be encoded in the filename, so each magcol in lcfile gets its own output file.
Return type: str
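The sigclip kwarg here takes the same forms described for the period-search drivers earlier in this documentation: a single number for a symmetric clip, two numbers for an asymmetric one, or None. The sketch below illustrates the asymmetric convention on plain Python lists. It is not Astrobase's implementation: the function name sketch_sigclip is hypothetical, and using the median with a MAD-based scatter estimate (1.483 x MAD) is an assumption.

```python
import math
import statistics


def sketch_sigclip(values, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Illustrative sigma-clip; returns the values that survive the clip."""
    # non-finite elements are always dropped
    finite = [v for v in values if math.isfinite(v)]
    if sigclip is None or not finite:
        return finite

    # a single number means the same limit on both sides
    if isinstance(sigclip, (int, float)):
        dim_sigma = bright_sigma = float(sigclip)
    else:
        dim_sigma, bright_sigma = sigclip

    median = statistics.median(finite)
    mad = statistics.median([abs(v - median) for v in finite])
    stdev = 1.483 * mad

    kept = []
    for v in finite:
        # a dimming is a larger magnitude but a smaller flux
        dimming = (median - v) if magsarefluxes else (v - median)
        if dimming > dim_sigma * stdev or -dimming > bright_sigma * stdev:
            continue
        kept.append(v)
    return kept
```

With sigclip=[10., 3.] and magnitudes as input, a point 0.1 mag fainter than the median can survive while a point 0.1 mag brighter is clipped; setting magsarefluxes=True flips which side counts as a dimming.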
-
astrobase.lcproc.tfa.
parallel_tfa_lclist
(lclist, templateinfo, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, interp='nearest', sigclip=5.0, mintemplatedist_arcmin=10.0, nworkers=2, maxworkertasks=1000)[source]¶ This applies TFA in parallel to all LCs in the given list of file names.
Parameters: - lclist (list of str) – This is a list of light curve files to apply TFA correction to.
- templateinfo (dict or str) – This is either the dict produced by tfa_templates_lclist or the pickle produced by the same function.
- timecols (list of str or None) – The timecol keys to use from the lcdict in applying TFA corrections.
- magcols (list of str or None) – The magcol keys to use from the lcdict in applying TFA corrections.
- errcols (list of str or None) – The errcol keys to use from the lcdict in applying TFA corrections.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- interp (str) – This is passed to scipy.interpolate.interp1d as the kind of interpolation to use when reforming the light curves to the timebase of the TFA templates.
- sigclip (float or sequence of two floats or None) – This is the sigma clip to apply to the light curves before running TFA on them.
- mintemplatedist_arcmin (float) – This sets the minimum distance required from the target object for objects in the TFA template ensemble. Objects closer than this distance will be removed from the ensemble.
- nworkers (int) – The number of parallel workers to launch.
- maxworkertasks (int) – The maximum number of tasks per worker allowed before it’s replaced by a fresh one.
Returns: Contains the input file names and output TFA light curve filenames per input file organized by each magcol in magcols.
Return type: dict
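The effect of mintemplatedist_arcmin can be pictured with the short sketch below, which drops template objects that lie too close to the target on the sky. It assumes each object is described by a dict with 'ra' and 'decl' in decimal degrees; the helper names are hypothetical, and the haversine formula is used here as a stand-in for whatever distance calculation Astrobase performs internally.

```python
import math


def angular_sep_arcmin(ra1, decl1, ra2, decl2):
    """Great-circle separation in arcminutes (inputs in decimal degrees)."""
    ra1, decl1, ra2, decl2 = map(math.radians, (ra1, decl1, ra2, decl2))
    hav = (math.sin((decl2 - decl1) / 2.0) ** 2 +
           math.cos(decl1) * math.cos(decl2) *
           math.sin((ra2 - ra1) / 2.0) ** 2)
    # guard against rounding pushing the argument just above 1
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, hav)))) * 60.0


def templates_far_enough(target, templates, mintemplatedist_arcmin=10.0):
    """Keep only templates at least mintemplatedist_arcmin from the target."""
    return [
        tmpl for tmpl in templates
        if angular_sep_arcmin(target['ra'], target['decl'],
                              tmpl['ra'], tmpl['decl']) >= mintemplatedist_arcmin
    ]
```

For a target at (ra, decl) = (10.0, 0.0), a template at (10.05, 0.0) is 3 arcmin away and would be removed with the default of 10 arcmin, while one at (10.5, 0.0) would be kept.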
-
astrobase.lcproc.tfa.
parallel_tfa_lcdir
(lcdir, templateinfo, lcfileglob=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, interp='nearest', sigclip=5.0, mintemplatedist_arcmin=10.0, nworkers=2, maxworkertasks=1000)[source]¶ This applies TFA in parallel to all LCs in a directory.
Parameters: - lcdir (str) – This is the directory containing the light curve files to process.
- templateinfo (dict or str) – This is either the dict produced by tfa_templates_lclist or the pickle produced by the same function.
- lcfileglob (str or None) – The UNIX file glob to use when searching for light curve files in lcdir. If None, the default file glob associated with the registered LC format is used.
- timecols (list of str or None) – The timecol keys to use from the lcdict in applying TFA corrections.
- magcols (list of str or None) – The magcol keys to use from the lcdict in applying TFA corrections.
- errcols (list of str or None) – The errcol keys to use from the lcdict in applying TFA corrections.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- interp (str) – This is passed to scipy.interpolate.interp1d as the kind of interpolation to use when reforming the light curves to the timebase of the TFA templates.
- sigclip (float or sequence of two floats or None) – This is the sigma clip to apply to the light curves before running TFA on them.
- mintemplatedist_arcmin (float) – This sets the minimum distance required from the target object for objects in the TFA template ensemble. Objects closer than this distance will be removed from the ensemble.
- nworkers (int) – The number of parallel workers to launch.
- maxworkertasks (int) – The maximum number of tasks per worker allowed before it’s replaced by a fresh one.
Returns: Contains the input file names and output TFA light curve filenames per input file organized by each magcol in magcols.
Return type: dict
astrobase.lcproc.varthreshold module¶
This contains functions to investigate where to set a threshold for several variability indices to distinguish between variable and non-variable stars.
-
astrobase.lcproc.varthreshold.
variability_threshold
(featuresdir, outfile, magbins=array([ 8., 8.25, 8.5, 8.75, 9., 9.25, 9.5, 9.75, 10., 10.25, 10.5, 10.75, 11., 11.25, 11.5, 11.75, 12., 12.25, 12.5, 12.75, 13., 13.25, 13.5, 13.75, 14., 14.25, 14.5, 14.75, 15., 15.25, 15.5, 15.75, 16. ]), maxobjects=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, min_lcmad_stdev=5.0, min_stetj_stdev=2.0, min_iqr_stdev=2.0, min_inveta_stdev=2.0, verbose=True)[source]¶ This generates a list of objects with stetson J, IQR, and 1.0/eta above some threshold value to select them as potential variable stars.
Use this to pare down the objects to review and put through period-finding. This does the thresholding per magnitude bin; this should be better than one single cut through the entire magnitude range. Set the magnitude bins using the magbins kwarg.
FIXME: implement a voting classifier here. this will choose variables based on the thresholds in IQR, stetson, and inveta based on weighting carried over from the variability recovery sims.
Parameters: - featuresdir (str) – This is the directory containing variability feature pickles created by astrobase.lcproc.lcpfeatures.parallel_varfeatures() or similar.
- outfile (str) – This is the output pickle file that will contain all the threshold information.
- magbins (np.array of floats) – This sets the magnitude bins to use for calculating thresholds.
- maxobjects (int or None) – This is the number of objects to process. If None, all objects with feature pickles in featuresdir will be processed.
- timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the thresholds.
- magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the thresholds.
- errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the thresholds.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- min_lcmad_stdev,min_stetj_stdev,min_iqr_stdev,min_inveta_stdev (float or np.array) – These are the standard deviation multipliers for the distributions of the light curve median absolute deviation (MAD), the Stetson J variability index, the light curve interquartile range, and the 1/eta variability index, respectively. These multipliers set the minimum values of these measures to use for selecting variable stars. If provided as floats, the same value will be used for all magbins. If provided as np.arrays of size = magbins.size - 1, they will be used to apply possibly different sigma cuts for each magbin.
- verbose (bool) – If True, will report progress and warn about any problems.
Returns: Contains all of the variability threshold information along with indices into the array of the object IDs chosen as variables.
Return type: dict
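The per-magnitude-bin thresholding described above can be sketched as follows for a single variability index. This is an illustration on plain Python lists, not the function's actual code: the helper name select_above_threshold is hypothetical, and using the per-bin median plus a multiple of a MAD-based scatter estimate (1.483 x MAD) as the cut is an assumption.

```python
import statistics


def select_above_threshold(mags, index_vals, magbins, min_stdev=2.0):
    """Return indices of objects whose index value exceeds their bin's cut."""
    nbins = len(magbins) - 1
    # a float applies to every bin; a sequence gives one multiplier per bin
    if isinstance(min_stdev, (int, float)):
        multipliers = [float(min_stdev)] * nbins
    else:
        multipliers = list(min_stdev)
        if len(multipliers) != nbins:
            raise ValueError('need one multiplier per magnitude bin')

    selected = []
    for lo, hi, mult in zip(magbins[:-1], magbins[1:], multipliers):
        members = [i for i, mag in enumerate(mags) if lo <= mag < hi]
        if not members:
            continue
        vals = [index_vals[i] for i in members]
        median = statistics.median(vals)
        stdev = 1.483 * statistics.median([abs(v - median) for v in vals])
        cut = median + mult * stdev
        selected.extend(i for i in members if index_vals[i] > cut)
    return sorted(selected)
```

Because the cut is computed within each bin, an index value that stands out among bright stars can be unremarkable among fainter ones, which is the motivation for not using a single cut across the whole magnitude range.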
-
astrobase.lcproc.varthreshold.
plot_variability_thresholds
(varthreshpkl, xmin_lcmad_stdev=5.0, xmin_stetj_stdev=2.0, xmin_iqr_stdev=2.0, xmin_inveta_stdev=2.0, lcformat='hat-sql', lcformatdir=None, magcols=None)[source]¶ This makes plots for the variability threshold distributions.
Parameters: - varthreshpkl (str) – The pickle produced by the function above.
- xmin_lcmad_stdev,xmin_stetj_stdev,xmin_iqr_stdev,xmin_inveta_stdev (float or np.array) – Threshold values to override the ones in varthreshpkl. If provided, will plot the thresholds accordingly instead of using the ones in the input pickle directly.
- lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in basedir or use_list_of_filenames.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- magcols (list of str or None) – The magcol keys to use from the lcdict.
Returns: The file name of the threshold plot generated.
Return type: str
- astrobase.lcfit: functions for fitting various light curve models to observations, including sinusoidal, trapezoidal and full Mandel-Agol planet transits, eclipses, and splines.
- astrobase.lcmath: functions for light curve operations such as phasing, normalization, binning (in time and phase), sigma-clipping, external parameter decorrelation (EPD), etc.
- astrobase.lcmodels: modules that contain simple models for several variable star classes, including sinusoidal variables, eclipsing binaries, and transiting planets. Useful for fitting these with the functions in the astrobase.lcfit module.
- astrobase.varbase: functions for dealing with periodic signals including masking and pre-whitening them, ACF calculations, light curve detrending, and specific tools for planetary transits.
- astrobase.plotbase: functions to plot light curves, phased light curves, periodograms, and download Digitized Sky Survey cutouts from the NASA SkyView service.
- astrobase.lcproc: driver functions for running an end-to-end pipeline including: (i) object selection from a collection of light curves by position, cross-matching to external catalogs, or light curve objectinfo keys, (ii) running variability feature calculation and detection, (iii) running period-finding, and (iv) object review using the checkplotserver webapp for variability classification. This also contains an Amazon AWS-enabled lcproc implementation.
astrobase.checkplot package¶
Contains functions to make checkplots: quick views for determining periodic variability for light curves and sanity-checking results from period-finding functions (e.g., from periodbase).
The astrobase.checkplot.pkl.checkplot_pickle() function takes, for a single object, an arbitrary number of results from independent period-finding functions (e.g. BLS, PDM, AoV, GLS, etc.) in periodbase, and generates a pickle file that contains object and variability information, finder chart, mag series plot, and for each period-finding result: a periodogram and phased mag series plots for an arbitrary number of ‘best periods’.
This is intended for use with an external checkplot viewer: the Tornado webapp checkplotserver.py, but you can also use the astrobase.checkplot.pkl_png.checkplot_pickle_to_png() function to render this to a PNG that will look something like:
[ finder ] [ objectinfo ] [ variableinfo ] [ unphased LC ]
[ periodogram1 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
[ periodogram2 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
.
.
[ periodogramN ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
for N independent period-finding methods producing:
- periodogram1,2,3…N: the periodograms from each method
- phased LC P1,P2,P3: the phased lightcurves using the best 3 peaks in each periodogram
The astrobase.checkplot.png.checkplot_png() function takes a single period-finding result, makes the following 3 x 3 grid, and writes it to a PNG:
[LSP plot + objectinfo] [ unphased LC ] [ period 1 phased LC ]
[period 1 phased LC /2] [period 1 phased LC x2] [ period 2 phased LC ]
[ period 3 phased LC ] [period 4 phased LC ] [ period 5 phased LC ]
The astrobase.checkplot.png.twolsp_checkplot_png() function makes a similar plot for two independent period-finding routines and writes it to a PNG:
[ pgram1 + objectinfo ] [ pgram2 ] [ unphased LC ]
[ pgram1 P1 phased LC ] [ pgram1 P2 phased LC ] [ pgram1 P3 phased LC ]
[ pgram2 P1 phased LC ] [ pgram2 P2 phased LC ] [ pgram2 P3 phased LC ]
where:
- pgram1 is the plot for the periodogram in the lspinfo1 dict
- pgram1 P1, P2, and P3 are the best three periods from lspinfo1
- pgram2 is the plot for the periodogram in the lspinfo2 dict
- pgram2 P1, P2, and P3 are the best three periods from lspinfo2
Submodules¶
astrobase.checkplot.pkl module¶
The checkplot_pickle function takes, for a single object, an arbitrary number of results from independent period-finding functions (e.g. BLS, PDM, AoV, GLS, etc.) in periodbase, and generates a pickle file that contains object and variability information, finder chart, mag series plot, and for each period-finding result: a periodogram and phased mag series plots for an arbitrary number of ‘best periods’.
Checkplot pickles are intended for use with an external checkplot viewer: the Tornado webapp astrobase.cpserver.checkplotserver.py, but you can also use the checkplot.pkl_png.checkplot_pickle_to_png function to render checkplot pickles to PNGs that will look something like:
[ finder ] [ objectinfo ] [ variableinfo ] [ unphased LC ]
[ periodogram1 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
[ periodogram2 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
.
.
[ periodogramN ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
for N independent period-finding methods producing:
- periodogram1,2,3…N: the periodograms from each method
- phased LC P1,P2,P3: the phased lightcurves using the best 3 peaks in each periodogram
-
astrobase.checkplot.pkl.
checkplot_dict
(lspinfolist, times, mags, errs, fast_mode=False, magsarefluxes=False, nperiodstouse=3, objectinfo=None, deredden_object=True, custom_bandpasses=None, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, complete_query_later=True, varinfo=None, getvarfeatures=True, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, xmatchinfo=None, xmatchradiusarcsec=3.0, lcfitfunc=None, lcfitparams=None, externalplots=None, findercmap='gray_r', finderconvolve=None, findercachedir='~/.astrobase/stamp-cache', normto='globalmedian', normmingap=4.0, sigclip=4.0, varepoch='min', phasewrap=True, phasesort=True, phasebin=0.002, minbinelems=7, plotxlim=(-0.8, 0.8), xliminsetmode=False, plotdpi=100, bestperiodhighlight=None, xgridlines=None, mindet=99, verbose=True)[source]¶ This writes a multiple lspinfo checkplot to a dict.
This function can take input from multiple lspinfo dicts (e.g. a list of output dicts or gzipped pickles of dicts from independent runs of BLS, PDM, AoV, or GLS period-finders in periodbase).
NOTE: if lspinfolist contains more than one lspinfo object with the same lspmethod (‘pdm’,’gls’,’sls’,’aov’,’bls’), the latest one in the list will overwrite the earlier ones.
The output dict contains all the plots (magseries and phased magseries), periodograms, object information, variability information, light curves, and phased light curves. This can be written to:
- a pickle with checkplot_pickle below
- a PNG with checkplot.pkl_png.checkplot_pickle_to_png
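Results from external period-finders can be passed in as long as each dict carries the keys listed under lspinfolist in the parameters below. A small pre-flight check along the following lines can catch malformed dicts before a checkplot is built. The helper check_lspinfo is hypothetical and not part of Astrobase, and the values in example_lspinfo are toy numbers chosen only to exercise the check (real 'periods' and 'lspvals' entries would be np.arrays).

```python
REQUIRED_LSPINFO_KEYS = ('periods', 'lspvals', 'bestperiod', 'method',
                         'nbestperiods', 'nbestlspvals')


def check_lspinfo(lspinfo, nperiodstouse=3):
    """Hypothetical helper: raise if an lspinfo dict is not usable."""
    missing = [key for key in REQUIRED_LSPINFO_KEYS if key not in lspinfo]
    if missing:
        raise KeyError('lspinfo is missing keys: %s' % ', '.join(missing))

    # one periodogram power value per period searched
    if len(lspinfo['periods']) != len(lspinfo['lspvals']):
        raise ValueError('periods and lspvals must have the same length')

    # one power value per annotated peak
    if len(lspinfo['nbestperiods']) != len(lspinfo['nbestlspvals']):
        raise ValueError('nbestperiods and nbestlspvals must have the same length')

    # enough peaks for the number of phased LCs requested
    if (nperiodstouse is not None and
            len(lspinfo['nbestperiods']) < nperiodstouse):
        raise ValueError('fewer best periods than nperiodstouse')

    return True


# toy values for illustration only
example_lspinfo = {
    'periods': [0.5, 1.0, 1.5, 2.0],
    'lspvals': [0.1, 0.8, 0.3, 0.6],
    'bestperiod': 1.0,
    'method': 'gls',
    'nbestperiods': [1.0, 2.0, 1.5],
    'nbestlspvals': [0.8, 0.6, 0.3],
}
check_lspinfo(example_lspinfo, nperiodstouse=3)
```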
Parameters: - lspinfolist (list of dicts) –
This is a list of dicts containing period-finder results (‘lspinfo’ dicts). These can be from any of the period-finder methods in astrobase.periodbase. To incorporate external period-finder results into checkplots, these dicts must be of the form below, including at least the keys indicated here:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above}
nbestperiods and nbestlspvals in each lspinfo dict must have at least as many elements as the nperiodstouse kwarg to this function.
- times,mags,errs (np.arrays) – The magnitude/flux time-series to process for this checkplot along with their associated measurement errors.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 45.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
- nperiodstouse (int) – This controls how many ‘best’ periods to make phased LC plots for. By default, this is the 3 best. If this is set to None, all ‘best’ periods present in each lspinfo dict’s ‘nbestperiods’ key will be processed for this checkplot.
- objectinfo (dict or None) –
This is a dict containing information on the object whose light curve is being processed. This function will then be able to look up and download a finder chart for this object and write that to the output checkplotdict. External services such as GAIA, SIMBAD, TIC, etc. will also be used to look up this object by its coordinates, and will add in information available from those services.
This dict must be of the form and contain at least the keys described below:
{'objectid': the name of the object,
 'ra': the right ascension of the object in decimal degrees,
 'decl': the declination of the object in decimal degrees,
 'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc. and display these in the output checkplot PNG:
'pmra' -> the proper motion in mas/yr in right ascension,
'pmdecl' -> the proper motion in mas/yr in declination,
'umag' -> U mag -> colors: U-B, U-V, U-g
'bmag' -> B mag -> colors: U-B, B-V
'vmag' -> V mag -> colors: U-V, B-V, V-R, V-I, V-K
'rmag' -> R mag -> colors: V-R, R-I
'imag' -> I mag -> colors: g-I, V-I, R-I, B-I
'jmag' -> 2MASS J mag -> colors: J-H, J-K, g-J, i-J
'hmag' -> 2MASS H mag -> colors: J-H, H-K
'kmag' -> 2MASS Ks mag -> colors: g-Ks, H-Ks, J-Ks, V-Ks
'sdssu' -> SDSS u mag -> colors: u-g, u-V
'sdssg' -> SDSS g mag -> colors: g-r, g-i, g-K, u-g, U-g, g-J
'sdssr' -> SDSS r mag -> colors: r-i, g-r
'sdssi' -> SDSS i mag -> colors: r-i, i-z, g-i, i-J, i-W1
'sdssz' -> SDSS z mag -> colors: i-z, z-W2, g-z
'ujmag' -> UKIRT J mag -> colors: J-H, H-K, J-K, g-J, i-J
'uhmag' -> UKIRT H mag -> colors: J-H, H-K
'ukmag' -> UKIRT K mag -> colors: g-K, H-K, J-K, V-K
'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2
'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3
'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3
'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4
'wise1' -> WISE W1 mag -> colors: i-W1, W1-W2
'wise2' -> WISE W2 mag -> colors: W1-W2, W2-W3
'wise3' -> WISE W3 mag -> colors: W2-W3
'wise4' -> WISE W4 mag -> colors: W3-W4
If you have magnitude measurements in other bands, use the custom_bandpasses kwarg to pass these in.
If this is None, no object information will be incorporated into the checkplot, which makes it of little use for anything other than glancing at the phased light curves at various ‘best’ periods from the period-finder results.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by astrobase.varclass.starfeatures.color_features().
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str or None) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
- varinfo (dict) –
If this is None, a blank dict of the form below will be added to the checkplotdict:
{'objectisvar': None -> variability flag (None indicates unset), 'vartags': CSV str containing variability type tags from review, 'varisperiodic': None -> periodic variability flag (None -> unset), 'varperiod': the period associated with the periodic variability, 'varepoch': the epoch associated with the periodic variability}
If you provide a dict matching this format in this kwarg, this will be passed unchanged to the output checkplotdict produced.
- getvarfeatures (bool) – If this is True, several light curve variability features for this object will be calculated and added to the output checkplotdict as checkplotdict['varinfo']['features']. This uses the function varclass.varfeatures.all_nonperiodic_features, so see its docstring for the measures that are calculated (e.g. Stetson J indices, dispersion measures, etc.).
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function load_xmatch_external_catalogs above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- lcfitfunc (Python function or None) –
If provided, this should be a Python function that is used to fit a model to the light curve. This fit is then overplotted for each phased light curve in the checkplot. This function should have the following signature:
def lcfitfunc(times, mags, errs, period, **lcfitparams)
where lcfitparams encapsulates all external parameters (e.g. the number of knots for a spline function, the degree of a Legendre polynomial fit, planet transit parameters, etc.). This function should return a Python dict with the following structure (similar to the functions in astrobase.lcfit) and at least the keys below:
{'fittype':<str: name of fit method>, 'fitchisq':<float: the chi-squared value of the fit>, 'fitredchisq':<float: the reduced chi-squared value of the fit>, 'fitinfo':{'fitmags':<ndarray: model mags/fluxes from fit func>}, 'magseries':{'times':<ndarray: times where fitmags are evaluated>}}
Additional keys in the dict returned from this function can include fitdict['fitinfo']['finalparams'] for the final model fit parameters (this will be used by the checkplotserver if present), fitdict['fitinfo']['fitepoch'] for the minimum light epoch returned by the model fit, among others.
In any case, the output dict of lcfitfunc will be copied to the output checkplotdict as:
checkplotdict[lspmethod][periodind]['lcfit'][<fittype>]
for each phased light curve.
- lcfitparams (dict) – A dict containing the LC fit parameters to use when calling the function provided in lcfitfunc. This contains key-val pairs corresponding to parameter names and their respective initial values to be used by the fit function.
- externalplots (list of tuples of str) –
If provided, this is a list of 4-element tuples containing:
- path to PNG of periodogram from an external period-finding method
- path to PNG of best period phased LC from the external period-finder
- path to PNG of 2nd-best phased LC from the external period-finder
- path to PNG of 3rd-best phased LC from the external period-finder
This can be used to incorporate external period-finding method results into the output checkplot pickle or exported PNG to allow for comparison with astrobase results.
Example of externalplots:
[('/path/to/external/bls-periodogram.png', '/path/to/external/bls-phasedlc-plot-bestpeak.png', '/path/to/external/bls-phasedlc-plot-peak2.png', '/path/to/external/bls-phasedlc-plot-peak3.png'), ('/path/to/external/pdm-periodogram.png', '/path/to/external/pdm-phasedlc-plot-bestpeak.png', '/path/to/external/pdm-phasedlc-plot-peak2.png', '/path/to/external/pdm-phasedlc-plot-peak3.png'), ...]
If externalplots is provided here, these paths will be stored in the output checkplotdict. The checkplot.pkl_png.checkplot_pickle_to_png function can then automatically retrieve these plot PNGs and put them into the exported checkplot PNG.
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- normto ({'globalmedian', 'zero'} or a float) – These are specified as below:
- ‘globalmedian’ -> norms each mag to the global median of the LC column
- ‘zero’ -> norms each mag to zero
- a float -> norms each mag to this specified float value.
- normmingap (float) – The minimum gap between consecutive measurements required to treat them as belonging to different timegroups. By default, this is set to 4.0 days.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- varepoch ('min' or float or list of lists or None) – The epoch to use for this phased light curve plot tile. If this is a float, will use the provided value directly. If this is ‘min’, will automatically figure out the time-of-minimum of the phased light curve. If this is None, will use the minimum value of stimes as the epoch of the phased light curve plot. If this is a list of lists, will use the provided value of lspmethodind to look up the current period-finder method and the provided value of periodind to look up the epoch associated with that method and the current period. This is mostly only useful when twolspmode is True.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If True, will sort the phased light curve in order of increasing phase.
- phasebin (float) – The bin size to use to group together measurements closer than this amount in phase. This is in units of phase. If this is a float, a phase-binned version of the phased light curve will be overplotted on top of the regular phased light curve.
- minbinelems (int) – The minimum number of elements required per phase bin to include it in the phased LC plot.
- plotxlim (sequence of two floats or None) – The x-range (min, max) of the phased light curve plot. If None, will be determined automatically.
- xliminsetmode (bool) – If this is True, the generated phased light curve plot will use the values of plotxlim as the main plot x-axis limits (i.e. zoomed-in if plotxlim is a range smaller than the full phase range), and will show the full phased light curve plot as a smaller inset. Useful for planetary transit light curves.
- plotdpi (int) – The resolution of the output plot PNGs in dots per inch.
- bestperiodhighlight (str or None) – If not None, this is a str with a matplotlib color specification to use as the background color to highlight the phased light curve plot of the ‘best’ period and epoch combination. If None, no highlight will be applied.
- xgridlines (list of floats or None) – If this is provided, must be a list of floats giving the phase values at which to draw vertical dashed lines as highlights.
- mindet (int) – The minimum number of observations the input object’s mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has fewer than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information.
- verbose (bool) – If True, will indicate progress and warn about problems.
Returns: Returns a checkplotdict.
Return type: dict
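The asymmetric sigclip behaviour described in the parameter list above can be illustrated with a minimal, standard-library-only sketch. The helper name asymmetric_sigclip is hypothetical and is not part of Astrobase; the use of the median and a MAD-based scatter estimate is an assumption made for illustration, not a statement about how Astrobase computes its clipping bounds. The point of the sketch is only to show why magsarefluxes matters: a "dimming" is a larger value for magnitudes but a smaller value for fluxes.

```python
import statistics


def asymmetric_sigclip(times, mags, errs, sigclip=(10.0, 3.0),
                       magsarefluxes=False):
    """Hypothetical helper: keep points inside asymmetric sigma bounds.

    sigclip[0] is the limit for dimmings, sigclip[1] for brightenings.
    """
    dim_sigma, bright_sigma = sigclip

    center = statistics.median(mags)
    # Assumption: robust scatter from the median absolute deviation (MAD).
    mad = statistics.median(abs(m - center) for m in mags)
    stddev = 1.4826 * mad

    kept = []
    for t, m, e in zip(times, mags, errs):
        delta = m - center
        # A positive `dimming` always means "the object got fainter":
        # larger values for magnitudes, smaller values for fluxes.
        dimming = -delta if magsarefluxes else delta
        if -bright_sigma * stddev <= dimming <= dim_sigma * stddev:
            kept.append((t, m, e))
    return kept


# Illustrative (made-up) time-series with one high and one low outlier.
times = list(range(9))
values = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95, 10.0, 10.6, 9.4]
errs = [0.01] * 9

as_mags = asymmetric_sigclip(times, values, errs, sigclip=(10.0, 3.0))
as_fluxes = asymmetric_sigclip(times, values, errs, sigclip=(10.0, 3.0),
                               magsarefluxes=True)
```

With these inputs, the same array loses a different outlier depending on the flag: treated as magnitudes, the low value (a brightening) is clipped; treated as fluxes, the high value is.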
-
astrobase.checkplot.pkl.
checkplot_pickle
(lspinfolist, times, mags, errs, fast_mode=False, magsarefluxes=False, nperiodstouse=3, objectinfo=None, deredden_object=True, custom_bandpasses=None, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, complete_query_later=True, varinfo=None, getvarfeatures=True, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, xmatchinfo=None, xmatchradiusarcsec=3.0, lcfitfunc=None, lcfitparams=None, externalplots=None, findercmap='gray_r', finderconvolve=None, findercachedir='~/.astrobase/stamp-cache', normto='globalmedian', normmingap=4.0, sigclip=4.0, varepoch='min', phasewrap=True, phasesort=True, phasebin=0.002, minbinelems=7, plotxlim=(-0.8, 0.8), xliminsetmode=False, plotdpi=100, bestperiodhighlight=None, xgridlines=None, mindet=99, verbose=True, outfile=None, outgzip=False, pickleprotocol=None, returndict=False)[source]¶ This writes a multiple lspinfo checkplot to a (gzipped) pickle file.
This function can take input from multiple lspinfo dicts (e.g. a list of output dicts or gzipped pickles of dicts from independent runs of BLS, PDM, AoV, or GLS period-finders in periodbase).
NOTE: if lspinfolist contains more than one lspinfo object with the same lspmethod ('pdm', 'gls', 'sls', 'aov', 'bls'), the latest one in the list will overwrite the earlier ones.
The output dict contains all the plots (magseries and phased magseries), periodograms, object information, variability information, light curves, and phased light curves. This can be written to:
- a pickle with this function (checkplot_pickle)
- a PNG with checkplot.pkl_png.checkplot_pickle_to_png
Parameters: - lspinfolist (list of dicts) –
This is a list of dicts containing period-finder results (‘lspinfo’ dicts). These can be from any of the period-finder methods in astrobase.periodbase. To incorporate external period-finder results into checkplots, these dicts must be of the form below, including at least the keys indicated here:
{'periods': np.array of all periods searched by the period-finder, 'lspvals': np.array of periodogram power value for each period, 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period, 'method': a three-letter code naming the period-finder used; must be one of the keys in the `astrobase.periodbase.METHODLABELS` dict, 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually, 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above}
nbestperiods and nbestlspvals in each lspinfo dict must have at least as many elements as the nperiodstouse kwarg to this function.
- times,mags,errs (np.arrays) – The magnitude/flux time-series to process for this checkplot along with their associated measurement errors.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 45.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
- nperiodstouse (int) – This controls how many ‘best’ periods to make phased LC plots for. By default, this is the 3 best. If this is set to None, all ‘best’ periods present in each lspinfo dict’s ‘nbestperiods’ key will be processed for this checkplot.
- objectinfo (dict or None) –
If provided, this is a dict containing information on the object whose light curve is being processed. This function will then be able to look up and download a finder chart for this object and write that to the output checkplotdict. External services such as GAIA, SIMBAD, TIC@MAST, etc. will also be used to look up this object by its coordinates, and will add in information available from those services.
The objectinfo dict must be of the form and contain at least the keys described below:
{'objectid': the name of the object, 'ra': the right ascension of the object in decimal degrees, 'decl': the declination of the object in decimal degrees, 'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc. and display these in the output checkplot PNG:
'pmra' -> the proper motion in mas/yr in right ascension,
'pmdecl' -> the proper motion in mas/yr in the declination,
'umag' -> U mag -> colors: U-B, U-V, U-g
'bmag' -> B mag -> colors: U-B, B-V
'vmag' -> V mag -> colors: U-V, B-V, V-R, V-I, V-K
'rmag' -> R mag -> colors: V-R, R-I
'imag' -> I mag -> colors: g-I, V-I, R-I, B-I
'jmag' -> 2MASS J mag -> colors: J-H, J-K, g-J, i-J
'hmag' -> 2MASS H mag -> colors: J-H, H-K
'kmag' -> 2MASS Ks mag -> colors: g-Ks, H-Ks, J-Ks, V-Ks
'sdssu' -> SDSS u mag -> colors: u-g, u-V
'sdssg' -> SDSS g mag -> colors: g-r, g-i, g-K, u-g, U-g, g-J
'sdssr' -> SDSS r mag -> colors: r-i, g-r
'sdssi' -> SDSS i mag -> colors: r-i, i-z, g-i, i-J, i-W1
'sdssz' -> SDSS z mag -> colors: i-z, z-W2, g-z
'ujmag' -> UKIRT J mag -> colors: J-H, H-K, J-K, g-J, i-J
'uhmag' -> UKIRT H mag -> colors: J-H, H-K
'ukmag' -> UKIRT K mag -> colors: g-K, H-K, J-K, V-K
'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2
'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3
'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3
'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4
'wise1' -> WISE W1 mag -> colors: i-W1, W1-W2
'wise2' -> WISE W2 mag -> colors: W1-W2, W2-W3
'wise3' -> WISE W3 mag -> colors: W2-W3
'wise4' -> WISE W4 mag -> colors: W3-W4
If you have magnitude measurements in other bands, use the custom_bandpasses kwarg to pass these in.
If this is None, no object information will be incorporated into the checkplot, which makes it of little use for anything other than glancing at the phased light curves at various ‘best’ periods from the period-finder results.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by astrobase.varclass.starfeatures.color_features().
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str or None) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
- varinfo (dict) –
If this is None, a blank dict of the form below will be added to the checkplotdict:
{'objectisvar': None -> variability flag (None indicates unset), 'vartags': CSV str containing variability type tags from review, 'varisperiodic': None -> periodic variability flag (None -> unset), 'varperiod': the period associated with the periodic variability, 'varepoch': the epoch associated with the periodic variability}
If you provide a dict matching this format in this kwarg, this will be passed unchanged to the output checkplotdict produced.
- getvarfeatures (bool) – If this is True, several light curve variability features for this object will be calculated and added to the output checkplotdict as checkplotdict['varinfo']['features']. This uses the function varclass.varfeatures.all_nonperiodic_features, so see its docstring for the measures that are calculated (e.g. Stetson J indices, dispersion measures, etc.).
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function load_xmatch_external_catalogs above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- lcfitfunc (Python function or None) –
If provided, this should be a Python function that is used to fit a model to the light curve. This fit is then overplotted for each phased light curve in the checkplot. This function should have the following signature:
def lcfitfunc(times, mags, errs, period, **lcfitparams)
where lcfitparams encapsulates all external parameters (e.g. the number of knots for a spline function, the degree of a Legendre polynomial fit, planet transit parameters, etc.). This function should return a Python dict with the following structure (similar to the functions in astrobase.lcfit) and at least the keys below:
{'fittype':<str: name of fit method>, 'fitchisq':<float: the chi-squared value of the fit>, 'fitredchisq':<float: the reduced chi-squared value of the fit>, 'fitinfo':{'fitmags':<ndarray: model mags/fluxes from fit func>}, 'magseries':{'times':<ndarray: times where fitmags are evaluated>}}
Additional keys in the dict returned from this function can include fitdict['fitinfo']['finalparams'] for the final model fit parameters (this will be used by the checkplotserver if present), fitdict['fitinfo']['fitepoch'] for the minimum light epoch returned by the model fit, among others.
In any case, the output dict of lcfitfunc will be copied to the output checkplotdict as checkplotdict[lspmethod][periodind]['lcfit'][<fittype>] for each phased light curve.
- lcfitparams (dict) – A dict containing the LC fit parameters to use when calling the function provided in lcfitfunc. This contains key-val pairs corresponding to parameter names and their respective initial values to be used by the fit function.
- externalplots (list of tuples of str) –
If provided, this is a list of 4-element tuples containing:
- path to PNG of periodogram from an external period-finding method
- path to PNG of best period phased LC from the external period-finder
- path to PNG of 2nd-best phased LC from the external period-finder
- path to PNG of 3rd-best phased LC from the external period-finder
This can be used to incorporate external period-finding method results into the output checkplot pickle or exported PNG to allow for comparison with astrobase results. Example of externalplots:
[('/path/to/external/bls-periodogram.png', '/path/to/external/bls-phasedlc-plot-bestpeak.png', '/path/to/external/bls-phasedlc-plot-peak2.png', '/path/to/external/bls-phasedlc-plot-peak3.png'), ('/path/to/external/pdm-periodogram.png', '/path/to/external/pdm-phasedlc-plot-bestpeak.png', '/path/to/external/pdm-phasedlc-plot-peak2.png', '/path/to/external/pdm-phasedlc-plot-peak3.png'), ...]
If externalplots is provided here, these paths will be stored in the output checkplotdict. The checkplot.pkl_png.checkplot_pickle_to_png function can then automatically retrieve these plot PNGs and put them into the exported checkplot PNG.
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- normto ({'globalmedian', 'zero'} or a float) –
This specifies the normalization target:
'globalmedian' -> norms each mag to the global median of the LC column
'zero' -> norms each mag to zero
a float -> norms each mag to this specified float value.
- normmingap (float) – The minimum gap between consecutive measurements required to treat them as belonging to different timegroups. By default, this is set to 4.0 days.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- varepoch ('min' or float or list of lists or None) – The epoch to use for this phased light curve plot tile. If this is a float, will use the provided value directly. If this is ‘min’, will automatically figure out the time-of-minimum of the phased light curve. If this is None, will use the minimum value of stimes as the epoch of the phased light curve plot. If this is a list of lists, will use the provided value of lspmethodind to look up the current period-finder method and the provided value of periodind to look up the epoch associated with that method and the current period. This is mostly only useful when twolspmode is True.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If True, will sort the phased light curve in order of increasing phase.
- phasebin (float) – The bin size to use to group together measurements closer than this amount in phase. This is in units of phase. If this is a float, a phase-binned version of the phased light curve will be overplotted on top of the regular phased light curve.
- minbinelems (int) – The minimum number of elements required per phase bin to include it in the phased LC plot.
- plotxlim (sequence of two floats or None) – The x-range (min, max) of the phased light curve plot. If None, will be determined automatically.
- xliminsetmode (bool) – If this is True, the generated phased light curve plot will use the values of plotxlim as the main plot x-axis limits (i.e. zoomed-in if plotxlim is a range smaller than the full phase range), and will show the full phased light curve plot as a smaller inset. Useful for planetary transit light curves.
- plotdpi (int) – The resolution of the output plot PNGs in dots per inch.
- bestperiodhighlight (str or None) – If not None, this is a str with a matplotlib color specification to use as the background color to highlight the phased light curve plot of the ‘best’ period and epoch combination. If None, no highlight will be applied.
- xgridlines (list of floats or None) – If this is provided, must be a list of floats giving the phase values at which to draw vertical dashed lines as highlights.
- mindet (int) – The minimum number of observations the input object’s mag/flux time-series must have for this function to plot its light curve and phased light curve. If the object has fewer than this number, no light curves will be plotted, but the checkplotdict will still contain all of the other information.
- verbose (bool) – If True, will indicate progress and warn about problems.
- outfile (str or None) – The name of the output checkplot pickle file. If this is None, will write the checkplot pickle to a file called ‘checkplot.pkl’ in the current working directory.
- outgzip (bool) – This controls whether to gzip the output pickle. It turns out that this is the slowest bit in the output process, so if you’re after speed, best not to use this. This is False by default since it turns out that gzip actually doesn’t save that much space (29 MB vs. 35 MB for the average checkplot pickle).
- pickleprotocol (int or None) –
This sets the pickle file protocol to use when writing the pickle:
If None, will choose a protocol using the following rules:
- 4 -> default in Python >= 3.4 - fast but incompatible with Python 2
- 3 -> default in Python 3.0-3.3 - mildly fast
- 2 -> default in Python 2 - very slow, but compatible with Python 2/3
The default value of this kwarg is None, which makes an automatic choice of the pickle protocol best suited to the version of Python in use. Note that this will make pickles generated by Py3 incompatible with Py2.
- returndict (bool) – If this is True, will return the checkplotdict instead of returning the filename of the output checkplot pickle.
Returns: If returndict is False, will return the path to the generated checkplot pickle file. If returndict is True, will return the checkplotdict instead.
Return type: dict or str
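Because checkplot_pickle expects specific keys in each lspinfo dict and in objectinfo, it can help to check inputs before starting a long run. The following is a minimal sketch of such a pre-flight check based only on the requirements listed in the parameter descriptions above. The function check_checkplot_inputs and the two key tuples are hypothetical names, not part of Astrobase, and plain lists stand in for the np.arrays a real lspinfo dict would carry.

```python
REQUIRED_LSPINFO_KEYS = ('periods', 'lspvals', 'bestperiod', 'method',
                         'nbestperiods', 'nbestlspvals')
REQUIRED_OBJECTINFO_KEYS = ('objectid', 'ra', 'decl', 'ndet')


def check_checkplot_inputs(lspinfolist, objectinfo=None, nperiodstouse=3):
    """Hypothetical pre-flight check; returns a list of problem strings."""
    problems = []
    seen_methods = set()

    for ind, lspinfo in enumerate(lspinfolist):
        missing = [k for k in REQUIRED_LSPINFO_KEYS if k not in lspinfo]
        if missing:
            problems.append('lspinfo %d is missing keys: %s'
                            % (ind, ', '.join(missing)))
            continue

        # A repeated method means the later dict overwrites the earlier one.
        if lspinfo['method'] in seen_methods:
            problems.append('lspinfo %d repeats method %r'
                            % (ind, lspinfo['method']))
        seen_methods.add(lspinfo['method'])

        npeaks = len(lspinfo['nbestperiods'])
        if npeaks != len(lspinfo['nbestlspvals']):
            problems.append('lspinfo %d: nbestperiods and nbestlspvals '
                            'differ in length' % ind)
        if nperiodstouse is not None and npeaks < nperiodstouse:
            problems.append('lspinfo %d has %d peaks, fewer than '
                            'nperiodstouse=%d' % (ind, npeaks, nperiodstouse))

    if objectinfo is not None:
        missing = [k for k in REQUIRED_OBJECTINFO_KEYS if k not in objectinfo]
        if missing:
            problems.append('objectinfo is missing keys: %s'
                            % ', '.join(missing))

    return problems


# Made-up values, for illustration only.
example_lspinfo = {
    'periods': [1.0, 2.0, 3.0, 4.0],
    'lspvals': [0.1, 0.9, 0.4, 0.3],
    'bestperiod': 2.0,
    'method': 'gls',
    'nbestperiods': [2.0, 3.0, 4.0],
    'nbestlspvals': [0.9, 0.4, 0.3],
}
example_objectinfo = {'objectid': 'obj-1', 'ra': 290.0, 'decl': 45.0,
                      'ndet': 1200}

ok_problems = check_checkplot_inputs([example_lspinfo], example_objectinfo)
bad_problems = check_checkplot_inputs([example_lspinfo, example_lspinfo],
                                      {'objectid': 'obj-1'},
                                      nperiodstouse=5)
```

An empty list means the inputs satisfy the documented minimum; it does not guarantee that the values themselves are sensible.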
-
astrobase.checkplot.pkl.
checkplot_pickle_update
(currentcp, updatedcp, outfile=None, outgzip=False, pickleprotocol=None, verbose=True)[source]¶ This updates the current checkplotdict with updated values provided.
Parameters: - currentcp (dict or str) – This is either a checkplotdict produced by checkplot_pickle above or a checkplot pickle file produced by the same function. This checkplot will be updated from the updatedcp checkplot.
- updatedcp (dict or str) – This is either a checkplotdict produced by checkplot_pickle above or a checkplot pickle file produced by the same function. This checkplot will be the source of the update to the currentcp checkplot.
- outfile (str or None) – The name of the output checkplot pickle file. The function will output the new checkplot gzipped pickle file to outfile if outfile is a filename. If currentcp is a file and outfile is None, this will be set to that filename, so the function updates it in place.
- outgzip (bool) – This controls whether to gzip the output pickle. It turns out that this is the slowest bit in the output process, so if you’re after speed, best not to use this. This is False by default since it turns out that gzip actually doesn’t save that much space (29 MB vs. 35 MB for the average checkplot pickle).
- pickleprotocol (int or None) –
This sets the pickle file protocol to use when writing the pickle:
If None, will choose a protocol using the following rules:
- 4 -> default in Python >= 3.4 - fast but incompatible with Python 2
- 3 -> default in Python 3.0-3.3 - mildly fast
- 2 -> default in Python 2 - very slow, but compatible with Python 2/3
The default protocol kwarg is None; this makes an automatic choice of the pickle protocol best suited to the version of Python in use. Note that this will make pickles generated by Py3 incompatible with Py2.
- verbose (bool) – If True, will indicate progress and warn about problems.
Returns: The path to the updated checkplot pickle file. If outfile was None and currentcp was a filename, this will return currentcp to indicate that the checkplot pickle file was updated in place.
Return type: str
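The protocol rules listed for pickleprotocol can be sketched as follows. This is a minimal illustration, not the astrobase implementation; the helper name choose_pickle_protocol and the example payload are hypothetical.

```python
import pickle
import sys


def choose_pickle_protocol(pickleprotocol=None):
    """Pick a pickle protocol following the rules listed above.

    Hypothetical helper for illustration; not part of astrobase.
    """
    if pickleprotocol is not None:
        return pickleprotocol      # an explicit choice always wins
    if sys.version_info >= (3, 4):
        return 4                   # fast, but unreadable from Python 2
    if sys.version_info >= (3, 0):
        return 3
    return 2                       # readable from both Python 2 and 3


# hypothetical payload standing in for a checkplotdict
payload = {"objectid": "example-object", "comments": "looks periodic"}

# automatic choice for the running interpreter
blob = pickle.dumps(payload, protocol=choose_pickle_protocol())

# forcing protocol 2 keeps the pickle readable from Python 2 as well
portable_blob = pickle.dumps(payload, protocol=choose_pickle_protocol(2))
```

Passing pickleprotocol=2 explicitly trades speed for compatibility with Python 2 readers.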
astrobase.checkplot.pkl_io module¶
This contains utility functions that support the checkplot.pkl input/output functionality.
-
astrobase.checkplot.pkl_io.
_base64_to_file
(b64str, outfpath, writetostrio=False)[source]¶ This converts the base64 encoded string to a file.
Parameters: - b64str (str) – A base64 encoded string that is the output of base64.b64encode.
- outfpath (str) – The path to where the file will be written. This should include an appropriate extension for the file (e.g. a base64 encoded string that represents a PNG should have its outfpath end in a ‘.png’) so the OS can open these files correctly.
- writetostrio (bool) – If this is True, will return a StringIO object with the binary stream decoded from the base64-encoded input string b64str. This can be useful to embed these into other files without having to write them to disk.
Returns: If writetostrio is False, will return the output file’s path as a str. If it is True, will return a StringIO object directly. If writing the file fails in either case, will return None.
Return type: str or StringIO object
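A rough sketch of the behavior described above, using only the standard library. The function below is an illustrative stand-in rather than the astrobase code, and it assumes io.BytesIO as the in-memory container for the decoded binary data on Python 3.

```python
import base64
import io
import os
import tempfile


def base64_to_file(b64str, outfpath, writetostrio=False):
    """Decode a base64 string to a file or to an in-memory stream.

    Illustrative stand-in; not the astrobase implementation.
    """
    try:
        filebytes = base64.b64decode(b64str)
    except (ValueError, TypeError):
        return None

    if writetostrio:
        # assumption: BytesIO holds the decoded binary stream on Python 3
        return io.BytesIO(filebytes)

    try:
        with open(outfpath, "wb") as outfd:
            outfd.write(filebytes)
    except OSError:
        return None
    return outfpath


# the 8-byte PNG signature stands in for a real encoded plot
encoded = base64.b64encode(b"\x89PNG\r\n\x1a\n")
outpath = os.path.join(tempfile.mkdtemp(), "example.png")

written = base64_to_file(encoded, outpath)
stream = base64_to_file(encoded, outpath, writetostrio=True)
```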
-
astrobase.checkplot.pkl_io.
_read_checkplot_picklefile
(checkplotpickle)[source]¶ This reads a checkplot gzipped pickle file back into a dict.
NOTE: the try-except is for Python 2 pickles that have numpy arrays in them. Apparently, these aren’t compatible with Python 3. See here:
http://stackoverflow.com/q/11305790
The workaround is noted in this answer:
http://stackoverflow.com/a/41366785
Parameters: checkplotpickle (str) – The path to a checkplot pickle file. This can be a gzipped file (in which case the file extension should end in ‘.gz’).
Returns: This returns a checkplotdict.
Return type: dict
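The try-except pattern mentioned in the note can be sketched like this. It assumes the workaround is the encoding='latin1' argument to pickle.load; the function name read_pickle and the example file are hypothetical.

```python
import gzip
import os
import pickle
import tempfile


def read_pickle(path):
    """Read a (possibly gzipped) pickle, retrying with latin1 decoding."""
    opener = gzip.open if path.endswith(".gz") else open
    try:
        with opener(path, "rb") as infd:
            return pickle.load(infd)
    except UnicodeDecodeError:
        # Python 2 pickles holding numpy arrays need an explicit encoding
        with opener(path, "rb") as infd:
            return pickle.load(infd, encoding="latin1")


# write a small gzipped pickle so the reader has something to load
example_path = os.path.join(tempfile.mkdtemp(), "checkplot-example.pkl.gz")
with gzip.open(example_path, "wb") as outfd:
    pickle.dump({"objectid": "example"}, outfd, protocol=2)

loaded = read_pickle(example_path)
```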
-
astrobase.checkplot.pkl_io.
_write_checkplot_picklefile
(checkplotdict, outfile=None, protocol=None, outgzip=False)[source]¶ This writes the checkplotdict to a (gzipped) pickle file.
Parameters: - checkplotdict (dict) – This is the checkplotdict to write to the pickle file.
- outfile (None or str) –
The path to the output pickle file to write. If outfile is None, writes a (gzipped) pickle file of the form:
checkplot-{objectid}.pkl(.gz)
to the current directory.
- protocol (int) –
This sets the pickle file protocol to use when writing the pickle:
If None, will choose a protocol using the following rules:
- 4 -> default in Python >= 3.4 - fast but incompatible with Python 2
- 3 -> default in Python 3.0-3.3 - mildly fast
- 2 -> default in Python 2 - very slow, but compatible with Python 2/3
The default protocol kwarg is None; this makes an automatic choice of the pickle protocol best suited to the version of Python in use. Note that this will make pickles generated by Py3 incompatible with Py2.
- outgzip (bool) – If this is True, will gzip the output file. Note that if the outfile str ends in a gzip extension (‘.gz’), this will be automatically turned on.
Returns: The absolute path to the written checkplot pickle file. None if writing fails.
Return type: str
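The output-naming and gzip rules above might look roughly like the sketch below. It is not the astrobase implementation: the function name write_pickle is hypothetical, the objectid is assumed to live at the top level of the dict, and protocol 4 is assumed as the default (Python >= 3.4).

```python
import gzip
import os
import pickle
import tempfile


def write_pickle(cpdict, outfile=None, protocol=None, outgzip=False):
    """Write a dict to a (gzipped) pickle and return its absolute path."""
    if outfile is None:
        # assumption: the object ID is a top-level key of the dict
        suffix = ".pkl.gz" if outgzip else ".pkl"
        outfile = "checkplot-{}{}".format(cpdict["objectid"], suffix)

    if outfile.endswith(".gz"):
        outgzip = True             # a gzip extension turns compression on

    if protocol is None:
        protocol = 4               # assumes Python >= 3.4

    opener = gzip.open if outgzip else open
    try:
        with opener(outfile, "wb") as outfd:
            pickle.dump(cpdict, outfd, protocol=protocol)
    except OSError:
        return None
    return os.path.abspath(outfile)


target = os.path.join(tempfile.mkdtemp(), "checkplot-example.pkl.gz")
written_path = write_pickle({"objectid": "example"}, outfile=target)
```

Here outgzip is left at False, but the '.gz' extension on the output name still produces a gzipped file.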
astrobase.checkplot.pkl_png module¶
This contains utility functions that support the checkplot pickle to PNG export functionality.
-
astrobase.checkplot.pkl_png.
checkplot_pickle_to_png
(checkplotin, outfile, extrarows=None)[source]¶ This reads the checkplot pickle or dict provided, and writes out a PNG.
The output PNG contains most of the information in the input checkplot pickle/dict, and can be used to quickly glance through the highlights instead of having to review the checkplot with the checkplotserver webapp. This is useful for exporting read-only views of finalized checkplots from the checkplotserver as well, to share them with other people.
The PNG has 4 x N tiles:
[ finder ] [ objectinfo ] [ varinfo/comments ] [ unphased LC ]
[ periodogram1 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
[ periodogram2 ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
.
.
[ periodogramN ] [ phased LC P1 ] [ phased LC P2 ] [ phased LC P3 ]
for N independent period-finding methods producing:
- periodogram1,2,3…N: the periodograms from each method
- phased LC P1,P2,P3: the phased lightcurves using the best 3 peaks in each periodogram
Parameters: - checkplotin (dict or str) – This is either a checkplotdict produced by astrobase.checkplot.pkl.checkplot_dict() or a checkplot pickle file produced by astrobase.checkplot.pkl.checkplot_pickle().
- outfile (str) – The filename of the output PNG file to create.
- extrarows (list of tuples) –
This is a list of 4-element tuples containing paths to PNG files that will be added to the end of the rows generated from the checkplotin pickle/dict. Each tuple represents a row in the final output PNG file. If there are fewer than 4 elements per tuple, the missing elements will be filled in with white-space. If there are more than 4 elements per tuple, only the first four will be used.
The purpose of this kwarg is to incorporate periodograms and phased LC plots (in the form of PNGs) generated from an external period-finding function or program (like VARTOOLS) to allow for comparison with astrobase results.
NOTE: the PNG files specified in extrarows here will be added to those already present in the input checkplotdict[‘externalplots’] if that is not None because you passed in a similar list of external plots to the astrobase.checkplot.pkl.checkplot_pickle() function earlier. In this case, extrarows can be used to add even more external plots if desired.
Each external plot PNG will be resized to 750 x 480 pixels to fit into an output image cell.
By convention, each 4-element tuple should contain:
- a periodogram PNG
- phased LC PNG with 1st best peak period from periodogram
- phased LC PNG with 2nd best peak period from periodogram
- phased LC PNG with 3rd best peak period from periodogram
Example of extrarows:
[('/path/to/external/bls-periodogram.png',
  '/path/to/external/bls-phasedlc-plot-bestpeak.png',
  '/path/to/external/bls-phasedlc-plot-peak2.png',
  '/path/to/external/bls-phasedlc-plot-peak3.png'),
 ('/path/to/external/pdm-periodogram.png',
  '/path/to/external/pdm-phasedlc-plot-bestpeak.png',
  '/path/to/external/pdm-phasedlc-plot-peak2.png',
  '/path/to/external/pdm-phasedlc-plot-peak3.png'),
 ...]
Returns: The absolute path to the generated checkplot PNG.
Return type: str
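The padding and truncation rule for extrarows described above can be illustrated with a small sketch. The helper normalize_extrarows is hypothetical, and None is used here as a placeholder for a white-space cell; the real function may represent blanks differently.

```python
def normalize_extrarows(extrarows, ncols=4, filler=None):
    """Pad short rows and truncate long rows to exactly ncols cells.

    Hypothetical helper; None marks a cell to be left blank.
    """
    rows = []
    for row in extrarows or []:
        cells = list(row)[:ncols]                  # keep only the first four
        cells += [filler] * (ncols - len(cells))   # pad the missing cells
        rows.append(tuple(cells))
    return rows


extrarows = [
    # a short row: only a periodogram and one phased LC plot
    ("/path/to/external/bls-periodogram.png",
     "/path/to/external/bls-phasedlc-plot-bestpeak.png"),
    # a long row: the fifth element will be dropped
    ("a.png", "b.png", "c.png", "d.png", "e.png"),
]

normalized = normalize_extrarows(extrarows)
```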
-
astrobase.checkplot.pkl_png.
cp2png
(checkplotin, extrarows=None)[source]¶ This is just a shortened form of the function above for convenience.
This only handles pickle files as input.
Parameters: - checkplotin (str) – File name of a checkplot pickle file to convert to a PNG.
- extrarows (list of tuples) –
This is a list of 4-element tuples containing paths to PNG files that will be added to the end of the rows generated from the checkplotin pickle/dict. Each tuple represents a row in the final output PNG file. If there are fewer than 4 elements per tuple, the missing elements will be filled in with white-space. If there are more than 4 elements per tuple, only the first four will be used.
The purpose of this kwarg is to incorporate periodograms and phased LC plots (in the form of PNGs) generated from an external period-finding function or program (like VARTOOLS) to allow for comparison with astrobase results.
NOTE: the PNG files specified in extrarows here will be added to those already present in the input checkplotdict[‘externalplots’] if that is not None because you passed in a similar list of external plots to the astrobase.checkplot.pkl.checkplot_pickle() function earlier. In this case, extrarows can be used to add even more external plots if desired.
Each external plot PNG will be resized to 750 x 480 pixels to fit into an output image cell.
By convention, each 4-element tuple should contain:
- a periodogram PNG
- phased LC PNG with 1st best peak period from periodogram
- phased LC PNG with 2nd best peak period from periodogram
- phased LC PNG with 3rd best peak period from periodogram
Example of extrarows:
[('/path/to/external/bls-periodogram.png',
  '/path/to/external/bls-phasedlc-plot-bestpeak.png',
  '/path/to/external/bls-phasedlc-plot-peak2.png',
  '/path/to/external/bls-phasedlc-plot-peak3.png'),
 ('/path/to/external/pdm-periodogram.png',
  '/path/to/external/pdm-phasedlc-plot-bestpeak.png',
  '/path/to/external/pdm-phasedlc-plot-peak2.png',
  '/path/to/external/pdm-phasedlc-plot-peak3.png'),
 ...]
Returns: The absolute path to the generated checkplot PNG.
Return type: str
astrobase.checkplot.pkl_postproc module¶
This contains utility functions that support the checkplot pickle post-processing functionality.
-
astrobase.checkplot.pkl_postproc.
update_checkplot_objectinfo
(cpf, fast_mode=False, findercmap='gray_r', finderconvolve=None, deredden_object=True, custom_bandpasses=None, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, complete_query_later=True, lclistpkl=None, nbrradiusarcsec=60.0, maxnumneighbors=5, plotdpi=100, findercachedir='~/.astrobase/stamp-cache', verbose=True)[source]¶ This updates a checkplot objectinfo dict.
Useful in cases where a previous round of GAIA/finderchart/external catalog acquisition failed. This will preserve the following keys in the checkplot if they exist:
comments
varinfo
objectinfo.objecttags
Parameters: - cpf (str) – The path to the checkplot pickle to update.
- fast_mode (bool or float) – This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond. See the docstring for astrobase.checkplot.pkl_utils._pkl_finder_objectinfo() for details on how this works. If this is True, will run in “fast” mode with default timeouts (5 seconds in most cases). If this is a float, will run in “fast” mode with the provided timeout value in seconds.
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by the varclass.starfeatures.color_features function. See its docstring for details on the required format.
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str) – This sets the GAIA mirror to use. This is a key in the astrobase.services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- plotdpi (int) – The resolution in DPI of the plots to generate in this function (e.g. the finder chart, etc.)
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- verbose (bool) – If True, will indicate progress and warn about potential problems.
Returns: Path to the updated checkplot pickle file.
Return type: str
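The key-preservation behavior described above (comments, varinfo, and objectinfo.objecttags surviving an objectinfo refresh) can be sketched as below. This is only an illustration of the idea: refresh_objectinfo is a hypothetical helper, and the dict contents are made up.

```python
import copy


def refresh_objectinfo(checkplotdict, new_objectinfo):
    """Replace objectinfo while keeping reviewer-entered keys intact.

    Hypothetical helper; 'comments' and 'varinfo' survive because only
    the 'objectinfo' entry is replaced.
    """
    updated = copy.deepcopy(checkplotdict)
    old_objectinfo = updated.get("objectinfo") or {}
    objecttags = old_objectinfo.get("objecttags")

    updated["objectinfo"] = dict(new_objectinfo)
    if objecttags is not None:
        # carry the existing object tags over into the refreshed dict
        updated["objectinfo"]["objecttags"] = objecttags
    return updated


# made-up checkplot contents for illustration
current = {
    "comments": "reviewed once",
    "varinfo": {"objectisvar": None},
    "objectinfo": {"objectid": "example", "objecttags": "candidate"},
}
fresh = {"objectid": "example", "ra": 120.0, "decl": -30.0}

refreshed = refresh_objectinfo(current, fresh)
```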
-
astrobase.checkplot.pkl_postproc.
finalize_checkplot
(cpx, outdir, all_lclistpkl, objfits=None)[source]¶ This is used to prevent any further changes to the checkplot.
TODO: finish this.
Use this function after all variable classification, period-finding, and object xmatches are done. This function will add a ‘final’ key to the checkplot, which will contain:
- a phased LC plot with the period and epoch set after review using the times, mags, errs after any appropriate filtering and sigclip was done in the checkplotserver UI
- The unphased LC using the times, mags, errs after any appropriate filtering and sigclip was done in the checkplotserver UI
- the same plots for any LC collection neighbors
- the survey cutout for the object if objfits is provided and checks out
- a redone neighbor search using GAIA and all light curves in the collection even if they don’t have at least 1000 observations.
These items will be shown in a special ‘Final’ tab in the checkplotserver webapp (this should be run in readonly mode as well). The final tab will also contain downloadable links for the checkplot pickle in pkl and PNG format, as well as the final times, mags, errs as a gzipped CSV with a header containing all of this info.
Parameters: - cpx (dict or str) – This is the path to the checkplot dict or pickle file to process.
- outdir (str) – This is the directory to where the final pickle will be written. If this is set to the same dir as cpx and cpx is a pickle, the function will return a failure. This is meant to keep the in-process checkplots separate from the finalized versions.
- all_lclistpkl (str or dict) – This is the path to the pickle or the dict created by astrobase.lcproc.catalogs.make_lclist() with no restrictions on the number of observations (so ALL light curves in the collection). This is used to make sure all neighbors of this object in the light curve collection have had their proximity to this object noted.
- objfits (str or None) – If this is not None, should be a file path to a FITS file containing a WCS header and this object from the instrument that was used to observe it. This will be used to make a stamp cutout of the object using the actual image it was detected on. This will be a useful comparison to the usual DSS POSS-RED2 image used by the checkplots.
Returns: The path to the updated checkplot pickle file with a ‘final’ key added to it, as described above.
Return type: str
-
astrobase.checkplot.pkl_postproc.
parallel_finalize_cplist
(cplist, outdir, objfits=None)[source]¶ This is a parallel driver for finalize_checkplot, operating on a list of checkplots.
Parameters: - cplist (list of str) – This is a list of paths of all checkplot pickles to process and run through the finalization process.
- outdir (str) – The directory to where the finalized checkplot pickles will be written.
- objfits (str) – Path to the FITS file containing a WCS header and detections of all objects observed by the actual instrument that obtained the light curves. This should generally be a high quality ‘reference’ frame so that all of the objects whose checkplots we’re finalizing (in cplist) can be seen on the frame.
Returns: Dict indicating the success/failure (as True/False) of the checkplot finalize operations for each checkplot pickle provided in cplist.
Return type: dict
-
astrobase.checkplot.pkl_postproc.
parallel_finalize_cpdir
(cpdir, outdir, cpfileglob='checkplot-*.pkl*', objfits=None)[source]¶ This is a parallel driver for finalize_checkplot, operating on a directory of checkplots.
Parameters: - cpdir (str) – This is the path to the directory containing all the checkplot pickles to process and run through the finalization process.
- outdir (str) – The directory to where the finalized checkplot pickles will be written.
- objfits (str) – Path to the FITS file containing a WCS header and detections of all objects observed by the actual instrument that obtained the light curves. This should generally be a high quality ‘reference’ frame so that all of the objects whose checkplots we’re finalizing (in cpdir) can be seen on the frame.
Returns: Dict indicating the success/failure (as True/False) of the checkplot finalize operations for each checkplot pickle found in cpdir.
Return type: dict
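The directory driver pattern above (glob for checkplot pickles, run a worker on each, report True/False per file) can be sketched serially as follows. The real drivers run in parallel; the helper names and the require_nonempty worker are hypothetical stand-ins, since finalize_checkplot itself is marked as unfinished.

```python
import glob
import os
import tempfile


def collect_checkplots(cpdir, cpfileglob="checkplot-*.pkl*"):
    """Return the sorted checkplot pickle paths matching the glob."""
    return sorted(glob.glob(os.path.join(cpdir, cpfileglob)))


def run_over_checkplots(cplist, worker):
    """Apply worker to each path and record success/failure as True/False."""
    results = {}
    for cpf in cplist:
        try:
            worker(cpf)
            results[cpf] = True
        except Exception:
            results[cpf] = False
    return results


def require_nonempty(path):
    # hypothetical stand-in for the real per-checkplot work
    if os.path.getsize(path) == 0:
        raise ValueError("empty file: " + path)


# build a throwaway directory with two checkplot files and one other file
cpdir = tempfile.mkdtemp()
for name, content in [("checkplot-a.pkl", b"data"),
                      ("checkplot-b.pkl.gz", b""),
                      ("notes.txt", b"x")]:
    with open(os.path.join(cpdir, name), "wb") as fd:
        fd.write(content)

cplist = collect_checkplots(cpdir)
results = run_over_checkplots(cplist, require_nonempty)
```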
astrobase.checkplot.pkl_utils module¶
This contains utility functions that support checkplot.pkl public functions.
-
astrobase.checkplot.pkl_utils.
_xyzdist_to_distarcsec
(xyzdist)[source]¶ This inverts the xyz unit vector distance -> angular distance relation.
Parameters: xyzdist (float or array-like) – This is the distance in xyz vector space generated from a transform of (RA,Dec) -> (x,y,z).
Returns: dist_arcseconds – The distance in arcseconds.
Return type: float or array-like
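For two points on the unit sphere separated by an angle θ, the straight-line (chord) distance is d = 2 sin(θ/2), so the inversion is θ = 2 arcsin(d/2). The sketch below works through this for scalars using only the math module; it assumes this chord relation is the one being inverted, and the function names are illustrative.

```python
import math


def radec_to_xyz(ra_deg, decl_deg):
    """Convert (RA, Dec) in degrees to a unit vector (x, y, z)."""
    ra, decl = math.radians(ra_deg), math.radians(decl_deg)
    return (math.cos(decl) * math.cos(ra),
            math.cos(decl) * math.sin(ra),
            math.sin(decl))


def xyzdist_to_distarcsec(xyzdist):
    """Invert chord length d = 2 sin(theta/2) and return arcseconds."""
    return math.degrees(2.0 * math.asin(xyzdist / 2.0)) * 3600.0


# two points exactly one degree apart in declination
p1 = radec_to_xyz(120.0, 0.0)
p2 = radec_to_xyz(120.0, 1.0)

# Euclidean distance between the two unit vectors
chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
separation_arcsec = xyzdist_to_distarcsec(chord)
```

The recovered separation should be close to 3600 arcseconds, i.e. one degree.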
-
astrobase.checkplot.pkl_utils.
_pkl_finder_objectinfo
(objectinfo, varinfo, findercmap, finderconvolve, sigclip, normto, normmingap, deredden_object=True, custom_bandpasses=None, lclistpkl=None, nbrradiusarcsec=30.0, maxnumneighbors=5, plotdpi=100, findercachedir='~/.astrobase/stamp-cache', verbose=True, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, gaia_data_release='dr2', fast_mode=False, complete_query_later=True)[source]¶ This returns the finder chart and object information as a dict.
Parameters: - objectinfo (dict or None) –
If provided, this is a dict containing information on the object whose light curve is being processed. This function will then be able to look up and download a finder chart for this object and write that to the output checkplotdict. External services such as GAIA, SIMBAD, TIC@MAST, etc. will also be used to look up this object by its coordinates, and will add in information available from those services.
The objectinfo dict must be of the form and contain at least the keys described below:
{'objectid': the name of the object,
 'ra': the right ascension of the object in decimal degrees,
 'decl': the declination of the object in decimal degrees,
 'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc. and display these in the output checkplot PNG:
'pmra' -> the proper motion in mas/yr in right ascension,
'pmdecl' -> the proper motion in mas/yr in declination,
'umag' -> U mag -> colors: U-B, U-V, U-g
'bmag' -> B mag -> colors: U-B, B-V
'vmag' -> V mag -> colors: U-V, B-V, V-R, V-I, V-K
'rmag' -> R mag -> colors: V-R, R-I
'imag' -> I mag -> colors: g-I, V-I, R-I, B-I
'jmag' -> 2MASS J mag -> colors: J-H, J-K, g-J, i-J
'hmag' -> 2MASS H mag -> colors: J-H, H-K
'kmag' -> 2MASS Ks mag -> colors: g-Ks, H-Ks, J-Ks, V-Ks
'sdssu' -> SDSS u mag -> colors: u-g, u-V
'sdssg' -> SDSS g mag -> colors: g-r, g-i, g-K, u-g, U-g, g-J
'sdssr' -> SDSS r mag -> colors: r-i, g-r
'sdssi' -> SDSS i mag -> colors: r-i, i-z, g-i, i-J, i-W1
'sdssz' -> SDSS z mag -> colors: i-z, z-W2, g-z
'ujmag' -> UKIRT J mag -> colors: J-H, H-K, J-K, g-J, i-J
'uhmag' -> UKIRT H mag -> colors: J-H, H-K
'ukmag' -> UKIRT K mag -> colors: g-K, H-K, J-K, V-K
'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2
'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3
'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3
'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4
'wise1' -> WISE W1 mag -> colors: i-W1, W1-W2
'wise2' -> WISE W2 mag -> colors: W1-W2, W2-W3
'wise3' -> WISE W3 mag -> colors: W2-W3
'wise4' -> WISE W4 mag -> colors: W3-W4
If you have magnitude measurements in other bands, use the custom_bandpasses kwarg to pass these in.
If this is None, no object information will be incorporated into the checkplot (kind of making it effectively useless for anything other than glancing at the phased light curves at various ‘best’ periods from the period-finder results).
- varinfo (dict or None) –
If this is None, a blank dict of the form below will be added to the checkplotdict:
{'objectisvar': None -> variability flag (None indicates unset),
 'vartags': CSV str containing variability type tags from review,
 'varisperiodic': None -> periodic variability flag (None -> unset),
 'varperiod': the period associated with the periodic variability,
 'varepoch': the epoch associated with the periodic variability}
If you provide a dict matching this format in this kwarg, this will be passed unchanged to the output checkplotdict produced.
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- normto ({'globalmedian', 'zero'} or a float) –
This is specified as below:
'globalmedian' -> norms each mag to global median of the LC column
'zero' -> norms each mag to zero
a float -> norms each mag to this specified float value.
- normmingap (float) – This defines the minimum gap between consecutive measurements required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- deredden_object (bool) – If this is True, will use the 2MASS DUST service to get extinction coefficients in various bands, and then try to deredden the magnitudes and colors of the object already present in the checkplot’s objectinfo dict.
- custom_bandpasses (dict) – This is a dict used to provide custom bandpass definitions for any magnitude measurements in the objectinfo dict that are not automatically recognized by astrobase.varclass.starfeatures.color_features().
- lclistpkl (dict or str) – If this is provided, must be a dict resulting from reading a catalog produced by the lcproc.catalogs.make_lclist function or a str path pointing to the pickle file produced by that function. This catalog is used to find neighbors of the current object in the current light curve collection. Looking at neighbors of the object within the radius specified by nbrradiusarcsec is useful for light curves produced by instruments that have a large pixel scale, so are susceptible to blending of variability and potential confusion of neighbor variability with that of the actual object being looked at. If this is None, no neighbor lookups will be performed.
- nbrradiusarcsec (float) – The radius in arcseconds to use for a search conducted around the coordinates of this object to look for any potential confusion and blending of variability amplitude caused by their proximity.
- maxnumneighbors (int) – The maximum number of neighbors that will have their light curves and magnitudes noted in this checkplot as potential blends with the target object.
- plotdpi (int) – The resolution in DPI of the plots to generate in this function (e.g. the finder chart, etc.)
- findercachedir (str) – The path to the astrobase cache directory for finder chart downloads from the NASA SkyView service.
- verbose (bool) – If True, will indicate progress and warn about potential problems.
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- gaia_data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query. This provides hints for which table to use for the GAIA mirror being queried.
- fast_mode (bool or float) –
This runs the external catalog operations in a “fast” mode, with short timeouts and not trying to hit external catalogs that take a long time to respond.
If this is set to True, the default settings for the external requests will then become:
skyview_lookup = False
skyview_timeout = 45.0
skyview_retry_failed = False
dust_timeout = 10.0
gaia_submit_timeout = 7.0
gaia_max_timeout = 10.0
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
If this is a float, will run in “fast” mode with the provided timeout value in seconds and the following settings:
skyview_lookup = True
skyview_timeout = fast_mode
skyview_retry_failed = False
dust_timeout = fast_mode
gaia_submit_timeout = 0.66*fast_mode
gaia_max_timeout = fast_mode
gaia_submit_tries = 2
complete_query_later = False
search_simbad = False
- complete_query_later (bool) – If this is True, saves the state of GAIA queries that are not yet complete when gaia_max_timeout is reached while waiting for the GAIA service to respond to our request. A later call for GAIA info on the same object will attempt to pick up the results from the existing query if it’s completed. If fast_mode is True, this is ignored.
Returns: A checkplotdict is returned containing the objectinfo and varinfo dicts, ready to use with the functions below to add in light curve plots, phased LC plots, xmatch info, etc.
Return type: dict
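The two fast_mode cases documented above can be summarized in a small sketch. The function resolve_fast_mode is a hypothetical helper that simply returns the documented settings as a dict; how astrobase applies them internally is not shown here.

```python
def resolve_fast_mode(fast_mode):
    """Return the documented external-request settings for fast_mode.

    Hypothetical helper; returns None when fast mode is off.
    """
    if fast_mode is True:
        return {
            "skyview_lookup": False,
            "skyview_timeout": 45.0,
            "skyview_retry_failed": False,
            "dust_timeout": 10.0,
            "gaia_submit_timeout": 7.0,
            "gaia_max_timeout": 10.0,
            "gaia_submit_tries": 2,
            "complete_query_later": False,
            "search_simbad": False,
        }

    # a positive number (but not a bool) is treated as the timeout in seconds
    is_number = (isinstance(fast_mode, (int, float))
                 and not isinstance(fast_mode, bool))
    if is_number and fast_mode > 0:
        return {
            "skyview_lookup": True,
            "skyview_timeout": fast_mode,
            "skyview_retry_failed": False,
            "dust_timeout": fast_mode,
            "gaia_submit_timeout": 0.66 * fast_mode,
            "gaia_max_timeout": fast_mode,
            "gaia_submit_tries": 2,
            "complete_query_later": False,
            "search_simbad": False,
        }

    return None


default_fast = resolve_fast_mode(True)
custom_fast = resolve_fast_mode(10.0)
```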
-
astrobase.checkplot.pkl_utils.
_pkl_periodogram
(lspinfo, plotdpi=100, override_pfmethod=None)[source]¶ This returns the periodogram plot PNG as base64, plus info as a dict.
Parameters: - lspinfo (dict) –
This is an lspinfo dict containing results from a period-finding function. If it’s from an astrobase period-finding function in periodbase, this will already be in the correct format. To use external period-finder results with this function, the lspinfo dict must be of the following form, with at least the keys listed below:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above}
nbestperiods and nbestlspvals must have at least 5 elements each, e.g. describing the five ‘best’ (highest power) peaks in the periodogram.
- plotdpi (int) – The resolution in DPI of the output periodogram plot to make.
- override_pfmethod (str or None) – This is used to set a custom label for this periodogram method. Normally, this is taken from the ‘method’ key in the input lspinfo dict, but if you want to override the output method name, provide this as a string here. This can be useful if you have multiple results you want to incorporate into a checkplotdict from a single period-finder (e.g. if you ran BLS over several period ranges separately).
Returns: Returns a dict that contains the following items:
{methodname: {'periods': the period array from lspinfo,
              'lspval': the periodogram power array from lspinfo,
              'bestperiod': the best period from lspinfo,
              'nbestperiods': the 'nbestperiods' list from lspinfo,
              'nbestlspvals': the 'nbestlspvals' list from lspinfo,
              'periodogram': base64 encoded string representation of the periodogram plot}}
The dict is returned in this format so it can be directly incorporated under the period-finder’s label methodname in a checkplotdict, using Python’s dict update() method.
Return type: dict
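When wrapping results from an external period-finder, it may help to check the lspinfo dict against the requirements above before passing it in. The sketch below is a hypothetical checker, not part of astrobase; plain lists stand in for the np.array values to keep it dependency-free, and 'bls' is assumed to be a valid METHODLABELS key.

```python
REQUIRED_KEYS = ("periods", "lspvals", "bestperiod", "method",
                 "nbestperiods", "nbestlspvals")


def check_lspinfo(lspinfo, minpeaks=5):
    """Return a list of problems found in an lspinfo dict (empty if OK)."""
    missing = [key for key in REQUIRED_KEYS if key not in lspinfo]
    if missing:
        return ["missing keys: " + ", ".join(missing)]

    problems = []
    if len(lspinfo["periods"]) != len(lspinfo["lspvals"]):
        problems.append("periods and lspvals differ in length")
    if len(lspinfo["nbestperiods"]) != len(lspinfo["nbestlspvals"]):
        problems.append("nbestperiods and nbestlspvals differ in length")
    if len(lspinfo["nbestperiods"]) < minpeaks:
        problems.append("fewer than %d best peaks provided" % minpeaks)
    return problems


# toy periodogram; a real period-finder would pick distinct peaks rather
# than simply the five highest values
periods = [0.5 + 0.25 * i for i in range(12)]
lspvals = [0.1, 0.3, 0.2, 0.9, 0.4, 0.6, 0.1, 0.7, 0.2, 0.5, 0.3, 0.8]
ranked = sorted(zip(lspvals, periods), reverse=True)[:5]

lspinfo = {
    "periods": periods,
    "lspvals": lspvals,
    "bestperiod": ranked[0][1],
    "method": "bls",  # assumed to be a key in METHODLABELS
    "nbestperiods": [period for _, period in ranked],
    "nbestlspvals": [power for power, _ in ranked],
}

problems = check_lspinfo(lspinfo)
```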
-
astrobase.checkplot.pkl_utils.
_pkl_magseries_plot
(stimes, smags, serrs, plotdpi=100, magsarefluxes=False)[source]¶ This returns the magseries plot PNG as base64, plus arrays as dict.
Parameters: - stimes,smags,serrs (np.array) – The mag/flux time-series arrays along with associated errors. These should all have been run through nan-stripping and sigma-clipping beforehand.
- plotdpi (int) – The resolution of the plot to make in DPI.
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
Returns: A dict of the following form is returned:
{'magseries': {'plot': base64 encoded str representation of the magnitude/flux time-series plot,
               'times': the `stimes` array,
               'mags': the `smags` array,
               'errs': the `serrs` array}}
The dict is returned in this format so it can be directly incorporated in a checkplotdict, using Python’s dict update() method.
Return type: dict
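Because these helpers return single-key dicts, merging their results into a checkplotdict is a plain dict update(). The sketch below uses made-up placeholder values (including the 'bls' method label and the base64 strings) purely to show the merge.

```python
# a made-up, minimal checkplotdict to merge results into
checkplotdict = {"objectid": "example-object"}

# placeholder result shaped like the magseries dict described above
magseries_result = {
    "magseries": {
        "plot": "aGVsbG8=",            # placeholder base64 string
        "times": [0.0, 1.0, 2.0],
        "mags": [12.1, 12.3, 12.2],
        "errs": [0.01, 0.01, 0.02],
    }
}

# placeholder result keyed by a period-finder label ('bls' is assumed)
periodogram_result = {
    "bls": {
        "bestperiod": 1.25,
        "periodogram": "aGVsbG8=",     # placeholder base64 string
    }
}

# each result lands under its own top-level key
checkplotdict.update(magseries_result)
checkplotdict.update(periodogram_result)
```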
-
astrobase.checkplot.pkl_utils.
_pkl_phased_magseries_plot
(checkplotdict, lspmethod, periodind, stimes, smags, serrs, varperiod, varepoch, lspmethodind=0, phasewrap=True, phasesort=True, phasebin=0.002, minbinelems=7, plotxlim=(-0.8, 0.8), plotdpi=100, bestperiodhighlight=None, xgridlines=None, xliminsetmode=False, magsarefluxes=False, directreturn=False, overplotfit=None, verbose=True, override_pfmethod=None)[source]¶ This returns the phased magseries plot PNG as base64 plus info as a dict.
Parameters: - checkplotdict (dict) – This is an existing checkplotdict to update. If it’s None or directreturn = True, then the generated dict result for this magseries plot will be returned directly.
- lspmethod (str) – lspmethod is a string indicating the type of period-finding algorithm that produced the period. If this is not in astrobase.plotbase.METHODSHORTLABELS, it will be used verbatim. In most cases, this will come directly from the lspinfo dict produced by a period-finder function.
- periodind (int) –
This is the index of the current periodogram period being operated on:
If == 0  -> best period and `bestperiodhighlight` is applied if not None
If > 0   -> some other peak of the periodogram
If == -1 -> special mode w/ no periodogram labels and enabled highlight
- stimes,smags,serrs (np.array) – The mag/flux time-series arrays along with associated errors. These should all have been run through nan-stripping and sigma-clipping beforehand.
- varperiod (float or None) – The period to use for this phased light curve plot tile.
- varepoch ('min' or float or list of lists or None) – The epoch to use for this phased light curve plot tile. If this is a float, will use the provided value directly. If this is ‘min’, will automatically figure out the time-of-minimum of the phased light curve. If this is None, will use the minimum value of stimes as the epoch of the phased light curve plot. If this is a list of lists, will use the provided value of lspmethodind to look up the current period-finder method and the provided value of periodind to look up the epoch associated with that method and the current period. This is mostly only useful when twolspmode is True.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If True, will sort the phased light curve in order of increasing phase.
- phasebin (float) – The bin size to use to group together measurements closer than this amount in phase. This is in units of phase. If this is a float, a phase-binned version of the phased light curve will be overplotted on top of the regular phased light curve.
- minbinelems (int) – The minimum number of elements required per phase bin to include it in the phased LC plot.
- plotxlim (sequence of two floats or None) – The x-range (min, max) of the phased light curve plot. If None, will be determined automatically.
- plotdpi (int) – The resolution of the output plot PNGs in dots per inch.
- bestperiodhighlight (str or None) – If not None, this is a str with a matplotlib color specification to use as the background color to highlight the phased light curve plot of the ‘best’ period and epoch combination. If None, no highlight will be applied.
- xgridlines (list of floats or None) – If this is provided, must be a list of floats corresponding to the phase values where to draw vertical dashed lines as a means of highlighting these.
- xliminsetmode (bool) – If this is True, the generated phased light curve plot will use the values of plotxlim as the main plot x-axis limits (i.e. zoomed-in if plotxlim is a range smaller than the full phase range), and will show the full phased light curve plot as a smaller inset. Useful for planetary transit light curves.
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
- directreturn (bool) – If this is set to True, will return only the dict corresponding to the phased LC plot for the input periodind and lspmethod and not return this result embedded in a checkplotdict.
- overplotfit (dict) –
If this is provided, it must be a dict of the form returned by one of the astrobase.lcfit.fit_XXXXX_magseries functions. This can be used to overplot a light curve model fit on top of the phased light curve plot returned by this function. The overplotfit dict has the following form, including at least the keys listed here:
{'fittype':str: name of fit method,
 'fitchisq':float: the chi-squared value of the fit,
 'fitredchisq':float: the reduced chi-squared value of the fit,
 'fitinfo':{'fitmags':array: model mags or fluxes from fit function},
 'magseries':{'times':array: times where the fitmags are evaluated}}
fitmags and times should all be of the same size. The input overplotfit dict is copied over to the checkplotdict for each specific phased LC plot to save all of this information for use later.
- verbose (bool) – If True, will indicate progress and warn about problems.
- override_pfmethod (str or None) – This is used to set a custom label for the periodogram method. Normally, this is taken from the ‘method’ key in the input lspinfo dict, but if you want to override the output method name, provide this as a string here. This can be useful if you have multiple results you want to incorporate into a checkplotdict from a single period-finder (e.g. if you ran BLS over several period ranges separately).
Returns: Returns a dict of the following form:
{lspmethod: {'plot': the phased LC plot as base64 str,
             'period': the period used for this phased LC,
             'epoch': the epoch used for this phased LC,
             'phase': phase value array,
             'phasedmags': mags/fluxes sorted in phase order,
             'binphase': array of binned phase values,
             'binphasedmags': mags/fluxes sorted in binphase order,
             'phasewrap': value of the input `phasewrap` kwarg,
             'phasesort': value of the input `phasesort` kwarg,
             'phasebin': value of the input `phasebin` kwarg,
             'minbinelems': value of the input `minbinelems` kwarg,
             'plotxlim': value of the input `plotxlim` kwarg,
             'lcfit': the provided `overplotfit` dict}}
The dict is in this form because we can use Python dicts’ update() method to update an existing checkplotdict. If directreturn is True, only the inner dict is returned.
Return type: dict
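To make the phasewrap, phasesort, phasebin, and minbinelems options more concrete, here is a minimal sketch of phase-folding and phase-binning. It uses plain Python lists so it has no dependencies. The helper names `phase_fold` and `phase_bin` are hypothetical, and the median-per-bin choice is an assumption, so this is not necessarily how Astrobase implements these steps.

```python
import math
from statistics import median


def phase_fold(times, mags, period, epoch, wrap=True, sort=True):
    """Fold a time series on `period` relative to `epoch`.

    Phases fall in [0, 1). With wrap=True, a copy of every point is also
    placed one cycle earlier, so the plot can extend to negative phases.
    """
    phases = [((t - epoch) / period) % 1.0 for t in times]
    pairs = list(zip(phases, mags))
    if wrap:
        pairs = [(p - 1.0, m) for p, m in pairs] + pairs
    if sort:
        pairs.sort(key=lambda pm: pm[0])
    return [p for p, _ in pairs], [m for _, m in pairs]


def phase_bin(phases, mags, binsize=0.002, minbinelems=7):
    """Group points into phase bins of width `binsize`.

    Bins with fewer than `minbinelems` points are dropped. Each surviving
    bin is summarized by its median phase and median mag (an assumption).
    """
    bins = {}
    for p, m in zip(phases, mags):
        bins.setdefault(math.floor(p / binsize), []).append((p, m))
    binned = []
    for key in sorted(bins):
        members = bins[key]
        if len(members) >= minbinelems:
            binned.append((median(p for p, _ in members),
                           median(m for _, m in members)))
    return binned


phases, pmags = phase_fold([0.0, 0.25, 0.5, 1.25], [10, 11, 12, 13],
                           period=1.0, epoch=0.0)
print(phases)
print(phase_bin([0.1, 0.1, 0.1], [1, 2, 3], binsize=0.05, minbinelems=3))
```

With the default plotxlim of (-0.8, 0.8), a wrapped phase array like this one is what lets the plot show phase 0.0 in the middle.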
astrobase.checkplot.pkl_xmatch module¶
This contains utility functions that support the checkplot.pkl xmatch functionality.
-
astrobase.checkplot.pkl_xmatch.
load_xmatch_external_catalogs
(xmatchto, xmatchkeys, outfile=None)[source]¶ This loads the external xmatch catalogs into a dict for use in an xmatch.
Parameters: - xmatchto (list of str) –
This is a list of paths to all the catalog text files that will be loaded.
The text files must be ‘CSVs’ that use the ‘|’ character as the separator between columns. These files should all begin with a header in JSON format on lines starting with the ‘#’ character. This header defines the catalog and contains the name of the catalog and the column definitions. Column definitions must have the column name and the numpy dtype of the columns (in the same format as that expected for the numpy.genfromtxt function). Any line that does not begin with ‘#’ is assumed to be part of the columns in the catalog. An example is shown below:
# {"name":"NSVS catalog of variable stars",
#  "columns":[
#   {"key":"objectid", "dtype":"U20", "name":"Object ID", "unit": null},
#   {"key":"ra", "dtype":"f8", "name":"RA", "unit":"deg"},
#   {"key":"decl","dtype":"f8", "name": "Declination", "unit":"deg"},
#   {"key":"sdssr","dtype":"f8","name":"SDSS r", "unit":"mag"},
#   {"key":"vartype","dtype":"U20","name":"Variable type", "unit":null}
#  ],
#  "colra":"ra",
#  "coldec":"decl",
#  "description":"Contains variable stars from the NSVS catalog"}
objectid1 | 45.0 | -20.0 | 12.0 | detached EB
objectid2 | 145.0 | 23.0 | 10.0 | RRab
objectid3 | 12.0 | 11.0 | 14.0 | Cepheid
.
.
.
- xmatchkeys (list of lists) – This is the list of lists of column names (as str) to get out of each xmatchto catalog. This should be the same length as xmatchto and each element here will apply to the respective file in xmatchto.
- outfile (str or None) –
If this is not None, set this to the name of the pickle to write the collected xmatch catalogs to. This pickle can then be loaded transparently by the astrobase.checkplot.pkl.checkplot_dict() and astrobase.checkplot.pkl.checkplot_pickle() functions to provide xmatch info to the astrobase.checkplot.pkl_xmatch.xmatch_external_catalogs() function below.
If this is None, will return the loaded xmatch catalogs directly. This will be a huge dict, so make sure you have enough RAM.
Returns: Based on the outfile kwarg, will either return the path to a collected xmatch pickle file or the collected xmatch dict.
Return type: str or dict
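The catalog file layout described above (a JSON header on ‘#’ lines followed by ‘|’-separated rows) can be read with a few lines of standard-library Python. This is a sketch under stated assumptions, not the loader Astrobase uses. The `read_catalog` helper and the sample rows are hypothetical, and only float (‘f…’) dtypes are converted, with everything else left as str.

```python
import io
import json

# a made-up catalog in the documented layout
CATALOG_TEXT = '''\
# {"name":"Example catalog",
#  "columns":[
#    {"key":"objectid", "dtype":"U20", "name":"Object ID", "unit": null},
#    {"key":"ra", "dtype":"f8", "name":"RA", "unit":"deg"},
#    {"key":"decl", "dtype":"f8", "name":"Declination", "unit":"deg"}
#  ],
#  "colra":"ra",
#  "coldec":"decl",
#  "description":"Made-up rows for illustration"}
objectid1 | 45.0 | -20.0
objectid2 | 145.0 | 23.0
'''


def read_catalog(fileobj):
    """Split a catalog file into its JSON header and a list of row dicts."""
    header_lines, rows = [], []
    for line in fileobj:
        if line.startswith('#'):
            header_lines.append(line[1:])  # drop the leading '#'
        elif line.strip():
            rows.append([col.strip() for col in line.split('|')])

    header = json.loads(''.join(header_lines))
    keys = [col['key'] for col in header['columns']]
    # assumption: dtypes starting with 'f' are floats, the rest stay as str
    converters = [float if col['dtype'].startswith('f') else str
                  for col in header['columns']]
    records = [
        {key: conv(val) for key, conv, val in zip(keys, converters, row)}
        for row in rows
    ]
    return header, records


header, records = read_catalog(io.StringIO(CATALOG_TEXT))
print(header['name'], len(records))
```

Checking a new catalog file this way before handing it to load_xmatch_external_catalogs can catch malformed JSON headers early.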
-
astrobase.checkplot.pkl_xmatch.
xmatch_external_catalogs
(checkplotdict, xmatchinfo, xmatchradiusarcsec=2.0, returndirect=False, updatexmatch=True, savepickle=None)[source]¶ This matches the current object in the checkplotdict to all of the external match catalogs specified.
Parameters: - checkplotdict (dict) –
This is a checkplotdict, generated by either the checkplot_dict function, or read in from a _read_checkplot_picklefile function. This must have a structure somewhat like the following, where the indicated keys below are required:
{'objectid': the ID assigned to this object,
 'objectinfo': {'objectid': ID assigned to this object,
                'ra': right ascension of the object in decimal deg,
                'decl': declination of the object in decimal deg}}
- xmatchinfo (str or dict) – This is either the xmatch dict produced by the function astrobase.checkplot.pkl_xmatch.load_xmatch_external_catalogs() above, or the path to the xmatch info pickle file produced by that function.
- xmatchradiusarcsec (float) – This is the cross-matching radius to use in arcseconds.
- returndirect (bool) – If this is True, will only return the xmatch results as a dict. If this is False, will return the checkplotdict with the xmatch results added in as a key-val pair.
- updatexmatch (bool) – This function will look for an existing ‘xmatch’ key in the input checkplotdict indicating that an xmatch has been performed before. If updatexmatch is set to True, the xmatch results will be added onto (e.g. when xmatching to additional catalogs after the first run). If this is set to False, the xmatch key-val pair will be completely overwritten.
- savepickle (str or None) – If this is not None, it must be a path to where the updated checkplotdict will be written to as a new checkplot pickle. If this is None, only the updated checkplotdict is returned.
Returns: If savepickle is None, this returns a checkplotdict, with the xmatch results added in. An ‘xmatch’ key will be added to this dict, with something like the following dict as the value:
{'xmatchradiusarcsec': xmatchradiusarcsec,
 'catalog1': {'name': 'Catalog of interesting things',
              'found': True,
              'distarcsec': 0.7,
              'info': {'objectid':..., 'ra':..., 'decl':..., 'desc':...}},
 'catalog2': {'name': 'Catalog of more interesting things',
              'found': False,
              'distarcsec': nan,
              'info': None},
 . . . ....}
This will contain the matches of the object in the input checkplotdict to all of the catalogs provided in xmatchinfo.
If savepickle is a path, will return the path to the saved checkplot pickle file.
Return type: dict or str
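The per-catalog result shape above ('found', 'distarcsec', 'info') follows from a nearest-neighbor search within the match radius. The sketch below shows one way to do this with the haversine formula and a brute-force loop. Both helpers are hypothetical, and the matching strategy Astrobase actually uses may well differ (e.g. a spatial index for large catalogs).

```python
import math


def angular_sep_arcsec(ra1, decl1, ra2, decl2):
    """Great-circle separation in arcsec; inputs in decimal degrees."""
    ra1, decl1, ra2, decl2 = map(math.radians, (ra1, decl1, ra2, decl2))
    sin_ddecl = math.sin((decl2 - decl1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddecl**2 + math.cos(decl1) * math.cos(decl2) * sin_dra**2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def match_one_catalog(objectinfo, catalog_rows, radius_arcsec=2.0):
    """Return the closest catalog row within `radius_arcsec`, if any."""
    best = None
    for row in catalog_rows:
        dist = angular_sep_arcsec(objectinfo['ra'], objectinfo['decl'],
                                  row['ra'], row['decl'])
        if dist <= radius_arcsec and (best is None or dist < best[0]):
            best = (dist, row)
    if best is None:
        return {'found': False, 'distarcsec': float('nan'), 'info': None}
    return {'found': True, 'distarcsec': best[0], 'info': best[1]}


objectinfo = {'objectid': 'obj1', 'ra': 45.0, 'decl': -20.0}
rows = [
    {'objectid': 'near', 'ra': 45.0, 'decl': -20.0 + 1.0 / 3600.0},
    {'objectid': 'far', 'ra': 145.0, 'decl': 23.0},
]
print(match_one_catalog(objectinfo, rows)['info']['objectid'])
```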
astrobase.checkplot.png module¶
This contains the implementation of checkplots that generate PNG files only.
The checkplot_png function takes a single period-finding result and makes the following 3 x 3 grid and writes to a PNG:
[LSP plot + objectinfo] [ unphased LC ] [ period 1 phased LC ]
[period 1 phased LC /2] [period 1 phased LC x2] [ period 2 phased LC ]
[ period 3 phased LC ] [period 4 phased LC ] [ period 5 phased LC ]
The twolsp_checkplot_png function makes a similar plot for two independent period-finding routines and writes to a PNG:
[ pgram1 + objectinfo ] [ pgram2 ] [ unphased LC ]
[ pgram1 P1 phased LC ] [ pgram1 P2 phased LC ] [ pgram1 P3 phased LC ]
[ pgram2 P1 phased LC ] [ pgram2 P2 phased LC ] [ pgram2 P3 phased LC ]
where:
- pgram1 is the plot for the periodogram in the lspinfo1 dict
- pgram1 P1, P2, and P3 are the best three periods from lspinfo1
- pgram2 is the plot for the periodogram in the lspinfo2 dict
- pgram2 P1, P2, and P3 are the best three periods from lspinfo2
-
astrobase.checkplot.png.
checkplot_png
(lspinfo, times, mags, errs, varepoch='min', magsarefluxes=False, objectinfo=None, findercmap='gray_r', finderconvolve=None, findercachedir='~/.astrobase/stamp-cache', normto='globalmedian', normmingap=4.0, sigclip=4.0, phasewrap=True, phasesort=True, phasebin=0.002, minbinelems=7, plotxlim=(-0.8, 0.8), xliminsetmode=False, bestperiodhighlight=None, circleoverlay=False, plotdpi=100, outfile=None, xticksize=None, yticksize=None, verbose=True)[source]¶ This makes a checkplot PNG using the output from a period-finder routine.
A checkplot is a 3 x 3 grid of plots like so:
[periodogram + objectinfo] [ unphased LC ]         [period 1 phased LC]
[ period 1 phased LC /2 ]  [period 1 phased LC x2] [period 2 phased LC]
[ period 3 phased LC ]     [period 4 phased LC ]   [period 5 phased LC]
This is used to sanity check the five best periods obtained from a period-finder function in astrobase.periodbase or from your own period-finder routines if their results can be turned into a dict with the format shown below.
Parameters: - lspinfo (dict or str) –
If this is a dict, it must be a dict produced by an astrobase.periodbase period-finder function or a dict from your own period-finder function or routine that is of the form below with at least these keys:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest
               peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must
           be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram
                 peaks (`nbestlspvals` below) to annotate on the
                 periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with
                 periodogram peaks to annotate on the periodogram
                 plot so they can be called out visually; should be
                 the same length as `nbestperiods` above}
nbestperiods and nbestlspvals must have at least 5 elements each, e.g. describing the five ‘best’ (highest power) peaks in the periodogram.
If lspinfo is a str, then it must be a path to a pickle file (ending with the extension ‘.pkl’ or ‘.pkl.gz’) that contains a dict of the form described above.
- times,mags,errs (np.array) – The mag/flux time-series arrays to process along with associated errors.
- varepoch ('min' or float or None or list of lists) –
This sets the time of minimum light finding strategy for the checkplot:
If `varepoch` is None -> the epoch used for all phased light curve plots will be `min(times)`.
If `varepoch='min'` -> automatic epoch finding for all periods using light curve fits.
If varepoch is a single float -> this epoch will be used for all phased light curve plots.
If varepoch is a list of floats with length = `len(nbestperiods)+2` from period-finder results -> each epoch will be applied to the phased light curve for each period specifically.
If you use a list for varepoch, it must be of length len(lspinfo[‘nbestperiods’]) + 2, because we insert half and twice the period into the best periods list to make those phased LC plots.
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
- objectinfo (dict or None) –
If provided, this is a dict containing information on the object whose light curve is being processed. This function will then be able to look up and download a finder chart for this object and write that to the output checkplot PNG image. The objectinfo dict must be of the form and contain at least the keys described below:
{'objectid': the name of the object, 'ra': the right ascension of the object in decimal degrees, 'decl': the declination of the object in decimal degrees, 'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc. and display these in the output checkplot PNG.
- SDSS mag keys: ‘sdssu’, ‘sdssg’, ‘sdssr’, ‘sdssi’, ‘sdssz’
- 2MASS mag keys: ‘jmag’, ‘hmag’, ‘kmag’
- Cousins mag keys: ‘bmag’, ‘vmag’
- GAIA specific keys: ‘gmag’, ‘teff’
- proper motion keys: ‘pmra’, ‘pmdecl’
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- findercachedir (str) – The directory where the FITS finder images are downloaded and cached.
- normto ({'globalmedian', 'zero'} or a float) –
This sets the normalization target:
'globalmedian' -> norms each mag to global median of the LC column
'zero'         -> norms each mag to zero
a float        -> norms each mag to this specified float value.
- normmingap (float) – The minimum gap between consecutive measurements required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If this is True, the phased time-series will be sorted in phase.
- phasebin (float or None) – If this is provided, indicates the bin size to use to group together measurements closer than this amount in phase. This is in units of phase. The binned phased light curve will be overplotted on top of the phased light curve. Useful for when one has many measurement points and needs to pick out a small trend in an otherwise noisy phased light curve.
- minbinelems (int) – The minimum number of elements in each phase bin.
- plotxlim (sequence of two floats or None) – The x-axis limits to use when making the phased light curve plot. By default, this is (-0.8, 0.8), which places phase 0.0 at the center of the plot and covers approximately two cycles in phase to make any trends clear.
- xliminsetmode (bool) – If this is True, the generated phased light curve plot will use the values of plotxlim as the main plot x-axis limits (i.e. zoomed-in if plotxlim is a range smaller than the full phase range), and will show the full phased light curve plot as a smaller inset. Useful for planetary transit light curves.
- bestperiodhighlight (str or None) – If not None, this is a str with a matplotlib color specification to use as the background color to highlight the phased light curve plot of the ‘best’ period and epoch combination. If None, no highlight will be applied.
- circleoverlay (False or float) – If this is a float, it gives the radius in arcseconds of a circle to overlay.
- outfile (str or None) – The file name of the file to save the checkplot to. If this is None, will write to a file called ‘checkplot.png’ in the current working directory.
- plotdpi (int) – Sets the resolution in DPI for PNG plots (default = 100).
- verbose (bool) – If False, turns off many of the informational messages. Useful for when an external function is driving lots of checkplot_png calls.
- xticksize,yticksize (int or None) – Fontsize for x and y ticklabels
Returns: The file path to the generated checkplot PNG file.
Return type: str
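The asymmetric form of sigclip described above depends on magsarefluxes, because "fainter" means larger values for magnitudes but smaller values for fluxes. The sketch below illustrates that logic only. The `asymmetric_sigclip` helper is hypothetical, and it uses the median and the population standard deviation as an assumption; the package's own clipping may use a different, more robust spread estimator.

```python
from statistics import median, pstdev


def asymmetric_sigclip(mags, sigclip=(10.0, 3.0), magsarefluxes=False):
    """Return the indices of points kept by an asymmetric sigma-clip.

    sigclip[0] is the sigma limit for dimmings and sigclip[1] is the
    sigma limit for brightenings.
    """
    dim_sigma, bright_sigma = sigclip
    center = median(mags)
    spread = pstdev(mags)  # assumption: plain standard deviation
    keep = []
    for i, m in enumerate(mags):
        deviation = m - center
        # mags: larger value = fainter; fluxes: smaller value = fainter
        dimming = -deviation if magsarefluxes else deviation
        if -bright_sigma * spread <= dimming <= dim_sigma * spread:
            keep.append(i)
    return keep


# 20 constant points, one point at +0.5 (index 20), one at -0.5 (index 21)
series = [12.0] * 20 + [12.5, 11.5]
print(asymmetric_sigclip(series, sigclip=(10.0, 3.0)))
print(asymmetric_sigclip(series, sigclip=(10.0, 3.0), magsarefluxes=True))
```

Treating the same numbers as magnitudes or as fluxes flips which of the two outliers counts as a brightening, and therefore which one the tighter 3-sigma limit removes.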
-
astrobase.checkplot.png.
twolsp_checkplot_png
(lspinfo1, lspinfo2, times, mags, errs, varepoch='min', magsarefluxes=False, objectinfo=None, findercmap='gray_r', finderconvolve=None, findercachedir='~/.astrobase/stamp-cache', normto='globalmedian', normmingap=4.0, sigclip=4.0, phasewrap=True, phasesort=True, phasebin=0.002, minbinelems=7, plotxlim=(-0.8, 0.8), trimylim=False, unphasedms=2.0, phasems=2.0, phasebinms=4.0, xliminsetmode=False, bestperiodhighlight=None, circleoverlay=False, plotdpi=100, outfile=None, figsize=(30, 24), returnfigure=False, xticksize=None, yticksize=None, verbose=True)[source]¶ This makes a checkplot using results from two independent period-finders.
Adapted from Luke Bouma’s implementation of a similar function in his work. This makes a special checkplot that uses two lspinfo dictionaries, from two independent period-finding methods. For EBs, it’s probably best to use Stellingwerf PDM or Schwarzenberg-Czerny AoV as one of these, and the Box-fitting Least Squares (BLS) method as the other one.
The checkplot layout in this case is:
[ pgram1 + objectinfo ] [ pgram2 ]              [ unphased LC ]
[ pgram1 P1 phased LC ] [ pgram1 P2 phased LC ] [ pgram1 P3 phased LC ]
[ pgram2 P1 phased LC ] [ pgram2 P2 phased LC ] [ pgram2 P3 phased LC ]
where:
- pgram1 is the plot for the periodogram in the lspinfo1 dict
- pgram1 P1, P2, and P3 are the best three periods from lspinfo1
- pgram2 is the plot for the periodogram in the lspinfo2 dict
- pgram2 P1, P2, and P3 are the best three periods from lspinfo2
Note that we take the output file name from lspinfo1 if lspinfo1 is a string filename pointing to a (gzipped) pickle containing the results dict from a period-finding routine similar to those in periodbase.
Parameters: - lspinfo1,lspinfo2 (dict or str) –
If this is a dict, it must be a dict produced by an astrobase.periodbase period-finder function or a dict from your own period-finder function or routine that is of the form below with at least these keys:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest
               peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must
           be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram
                 peaks (`nbestlspvals` below) to annotate on the
                 periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with
                 periodogram peaks to annotate on the periodogram
                 plot so they can be called out visually; should be
                 the same length as `nbestperiods` above}
nbestperiods and nbestlspvals must have at least 3 elements each, e.g. describing the three ‘best’ (highest power) peaks in the periodogram.
If lspinfo1 or lspinfo2 is a str, then it must be a path to a pickle file (ending with the extension ‘.pkl’ or ‘.pkl.gz’) that contains a dict of the form described above.
- times,mags,errs (np.array) – The mag/flux time-series arrays to process along with associated errors.
- varepoch ('min' or float or None or list of lists) –
This sets the time of minimum light finding strategy for the checkplot:
If `varepoch` is None -> the epoch used for all phased light curve plots will be `min(times)`.
If `varepoch='min'` -> automatic epoch finding for all periods using light curve fits.
If varepoch is a single float -> this epoch will be used for all phased light curve plots.
If varepoch is a list of floats with length = `len(nbestperiods)` from period-finder results -> each epoch will be applied to the phased light curve for each period specifically.
If you use a list for varepoch, it must be of length len(lspinfo[‘nbestperiods’]).
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags so the plot y-axis direction and range can be set appropriately.
- objectinfo (dict or None) –
If provided, this is a dict containing information on the object whose light curve is being processed. This function will then be able to look up and download a finder chart for this object and write that to the output checkplot PNG image. The objectinfo dict must be of the form and contain at least the keys described below:
{'objectid': the name of the object, 'ra': the right ascension of the object in decimal degrees, 'decl': the declination of the object in decimal degrees, 'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc. and display these in the output checkplot PNG.
- SDSS mag keys: ‘sdssu’, ‘sdssg’, ‘sdssr’, ‘sdssi’, ‘sdssz’
- 2MASS mag keys: ‘jmag’, ‘hmag’, ‘kmag’
- Cousins mag keys: ‘bmag’, ‘vmag’
- GAIA specific keys: ‘gmag’, ‘teff’
- proper motion keys: ‘pmra’, ‘pmdecl’
- findercmap (str or matplotlib.cm.ColorMap object) – The Colormap object to use for the finder chart image.
- finderconvolve (astropy.convolution.Kernel object or None) – If not None, the Kernel object to use for convolving the finder image.
- findercachedir (str) – The directory where the FITS finder images are downloaded and cached.
- normto ({'globalmedian', 'zero'} or a float) –
This sets the LC normalization target:
'globalmedian' -> norms each mag to global median of the LC column
'zero'         -> norms each mag to zero
a float        -> norms each mag to this specified float value.
- normmingap (float) – The minimum gap between consecutive measurements required to consider them as parts of different timegroups. By default it is set to 4.0 days.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- phasewrap (bool) – If this is True, the phased time-series will be wrapped around phase 0.0.
- phasesort (bool) – If this is True, the phased time-series will be sorted in phase.
- phasebin (float or None) – If this is provided, indicates the bin size to use to group together measurements closer than this amount in phase. This is in units of phase. The binned phased light curve will be overplotted on top of the phased light curve. Useful for when one has many measurement points and needs to pick out a small trend in an otherwise noisy phased light curve.
- minbinelems (int) – The minimum number of elements in each phase bin.
- plotxlim (sequence of two floats or None) – The x-axis limits to use when making the phased light curve plot. By default, this is (-0.8, 0.8), which places phase 0.0 at the center of the plot and covers approximately two cycles in phase to make any trends clear.
- trimylim (bool) –
Default False. If True, sets the y limits on the phase-folded light curves as [median - 1.2*(median - 5th percentile), median + 1.2*(95th percentile - median)], which can help clip flares or other outliers.
- unphasedms (float) – The marker size to use for the main unphased light curve plot symbols.
- phasems (float) – The marker size to use for the main phased light curve plot symbols.
- phasebinms (float) – The marker size to use for the binned phased light curve plot symbols.
- xliminsetmode (bool) – If this is True, the generated phased light curve plot will use the values of plotxlim as the main plot x-axis limits (i.e. zoomed-in if plotxlim is a range smaller than the full phase range), and will show the full phased light curve plot as a smaller inset. Useful for planetary transit light curves.
- bestperiodhighlight (str or None) – If not None, this is a str with a matplotlib color specification to use as the background color to highlight the phased light curve plot of the ‘best’ period and epoch combination. If None, no highlight will be applied.
- circleoverlay (False or float) – If this is a float, it gives the radius in arcseconds of a circle to overlay.
- plotdpi (int) – Sets the resolution in DPI for PNG plots (default = 100).
- outfile (str or None) – The file name of the file to save the checkplot to. If this is None, will write to a file called ‘checkplot.png’ in the current working directory.
- figsize (tuple of two int) – The output figure size in inches.
- returnfigure (bool) – If True, will return the figure directly as a matplotlib.Figure object.
- xticksize,yticksize (int or None) – Fontsize for x and y ticklabels
- verbose (bool) – If False, turns off many of the informational messages. Useful for when an external function is driving lots of checkplot_png calls.
Returns: figure – The file path to the generated checkplot PNG file if returnfigure is False. A matplotlib.Figure if returnfigure is True.
Return type: str or matplotlib.Figure
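The trimylim limits described above, [median - 1.2*(median - 5th percentile), median + 1.2*(95th percentile - median)], can be worked through with a short standard-library sketch. The `trimmed_ylim` helper is hypothetical, and the percentile interpolation chosen here (the 'inclusive' method of `statistics.quantiles`) is an assumption that may differ slightly from what the package computes.

```python
from statistics import median, quantiles


def trimmed_ylim(values):
    """Return (ymin, ymax) from the median and the 5th/95th percentiles."""
    # n=100 yields the 1st..99th percentile cut points
    pcts = quantiles(values, n=100, method='inclusive')
    p5, p95 = pcts[4], pcts[94]
    med = median(values)
    return (med - 1.2 * (med - p5), med + 1.2 * (p95 - med))


# a smooth series plus one flare-like outlier
values = [float(v) for v in range(101)] + [1000.0]
ymin, ymax = trimmed_ylim(values)
print(ymin, ymax)
```

Because the limits are built from percentiles instead of the extremes, a single flare at 1000.0 falls outside the plotted range and the bulk of the light curve stays readable.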
astrobase.cpserver package¶
This package contains the implementation of the checkplotserver webapp to review large numbers of checkplot pickle files generated as part of a variable star classification pipeline. Also provided is a lightweight checkplot-viewer.html webapp to quickly glance through large numbers of checkplot PNGs.
If you made checkplot pickles (checkplot-*.pkl)¶
From the base directory above the one containing your checkplot pickles, invoke checkplotlist like so to generate the checkplot-filelist.json file:
$ checkplotlist pkl subdir/containing/the/checkplots
Then, from that directory, invoke the checkplotserver webapp (make sure the astrobase virtualenv is active, so the command below is in your path):
$ checkplotserver [list of options, use --help to see these]
The webapp will start up a Tornado web server running on your computer and listening on a local address (default: http://localhost:5225). This webapp will read the checkplot-filelist.json file to find the checkplots.
Browse to http://localhost:5225 (or whatever port you set in checkplotserver options) to look through or update all your checkplots. Any changes will be written back to the checkplot .pkl files, making this method of browsing better suited to serious variability searches on large numbers of checkplots.
If you made checkplot PNGs (checkplot-*.png)¶
Copy checkplot-viewer.html and checkplot-viewer.js to the base directory from which you intend to serve your checkplot images. Then invoke this command from that directory:
$ checkplotlist png subdir/containing/the/checkplots 'optional-glob*.png'
This will generate a checkplot-filelist.json file containing the file paths to the checkplots. You can then run a temporary Python web server from this base directory to browse through all the checkplots:
$ python -m SimpleHTTPServer # Python 2
$ python3 -m http.server # Python 3
then browse to http://localhost:8000/checkplot-viewer.html.
If this directory is already in a path served by a web server, then you can just browse to the checkplot-viewer.html file normally. Note that a file:/// URL may not work in some browsers (especially Google Chrome) because of security precautions.
Submodules¶
astrobase.cpserver.checkplotlist module¶
This makes a checkplot file list for use with the checkplot-viewer.html or the checkplotserver.py webapps. Checkplots are quick-views of object info, finder charts, light curves, phased light curves, and periodograms used to examine their stellar variability.
These are produced by several functions in the astrobase.checkplot module:
- astrobase.checkplot.pkl.checkplot_pickle(): makes a checkplot pickle file for any number of independent period-finding methods. Use checkplotserver.py to view these pickle files.
- astrobase.checkplot.png.checkplot_png(): makes a checkplot PNG for a single period-finding method. Use checkplot-viewer.html to view these image files.
- astrobase.checkplot.png.twolsp_checkplot_png(): does the same for two independent period-finding methods. Use checkplot-viewer.html to view these image files.
-
astrobase.cpserver.checkplotlist.
checkplot_infokey_worker
(task)[source]¶ This gets the required keys from the requested file.
Parameters: task (tuple) – A two-element tuple:
- task[0] is the dict to work on
- task[1] is a list of lists of str, each giving one key address to extract an item from the dict
Returns: This is a list of all of the items at the requested key addresses.
Return type: list
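To make the idea of a "key address" concrete, here is a minimal sketch of walking a nested dict with a list of keys. It is an illustration only and not the library's implementation: the function name, the example keys, and the choice to return None for a missing key are all assumptions.

```python
from functools import reduce


def get_items_at_key_addresses(task):
    """Return the items found at each key address in a nested dict."""
    cpdict, key_addresses = task
    results = []
    for address in key_addresses:
        try:
            # descend one level per key in the address
            item = reduce(lambda d, key: d[key], address, cpdict)
        except (KeyError, TypeError, IndexError):
            # assumption: an address that cannot be resolved yields None
            item = None
        results.append(item)
    return results


# hypothetical checkplot-like dict and key addresses
cpdict = {"objectinfo": {"sdssr": 12.3, "ndet": 4521},
          "varinfo": {"vartags": "EB"}}
task = (cpdict, [["objectinfo", "sdssr"],
                 ["varinfo", "vartags"],
                 ["objectinfo", "missing"]])
items = get_items_at_key_addresses(task)
```

Packing the dict and the addresses into a single tuple keeps the worker compatible with pool-style `map` calls, which pass one argument per task.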
-
astrobase.cpserver.checkplotlist.
main
()[source]¶ This is the main function of this script.
The current script args are shown below
Usage: checkplotlist [-h] [--search SEARCH] [--sortby SORTBY]
                     [--filterby FILTERBY] [--splitout SPLITOUT]
                     [--outprefix OUTPREFIX] [--maxkeyworkers MAXKEYWORKERS]
                     {pkl,png} cpdir

This makes a checkplot file list for use with the checkplot-viewer.html (for
checkplot PNGs) or the checkplotserver.py (for checkplot pickles) webapps.

positional arguments:
  {pkl,png}             type of checkplot to search for: pkl -> checkplot
                        pickles, png -> checkplot PNGs
  cpdir                 directory containing the checkplots to process

optional arguments:
  -h, --help            show this help message and exit
  --search SEARCH       file glob prefix to use when searching for checkplots,
                        default: '*checkplot*', (the extension is added
                        automatically - .png or .pkl)
  --sortby SORTBY       the sort key and order to use when sorting
  --filterby FILTERBY   the filter key and condition to use when filtering.
                        you can specify this multiple times to filter by
                        several keys at once. all filters are joined with a
                        logical AND operation in the order they're given.
  --splitout SPLITOUT   if there are more than SPLITOUT objects in the target
                        directory (default: 5000), checkplotlist will split
                        the output JSON into multiple files. this helps keep
                        the checkplotserver webapp responsive.
  --outprefix OUTPREFIX
                        a prefix string to use for the output JSON file(s).
                        use this to separate out different sort orders or
                        filter conditions, for example. if this isn't
                        provided, but --sortby or --filterby are, will use
                        those to figure out the output files' prefixes
  --maxkeyworkers MAXKEYWORKERS
                        the number of parallel workers that will be launched
                        to retrieve checkplot key values used for sorting and
                        filtering (default: 2)
astrobase.cpserver.checkplotserver module¶
checkplotserver is a Tornado web-server for visualizing the information stored in checkplot pickles, editing them, and exporting information to a variable star classification pipeline.
This is the main module used to launch the server.
-
astrobase.cpserver.checkplotserver.
main
()[source]¶ This launches the server. The current script args are shown below:
Usage: checkplotserver [OPTIONS]

Options:

  --help                show this help information
  --assetpath           Sets the asset (server images, css, js, DB) path for
                        checkplotserver. (default <astrobase install dir>
                        /astrobase/cpserver/cps-assets)
  --baseurl             Set the base URL of the checkplotserver. This is
                        useful when you're running checkplotserver on a
                        remote machine and are reverse-proxying more than one
                        instances of it so you can access them using HTTP
                        from outside on different base URLs like /cpserver1/,
                        /cpserver2/, etc. If this is set, all URLs will take
                        the form [baseurl]/..., instead of /... (default /)
  --checkplotlist       The path to the checkplot-filelist.json file listing
                        checkplots to load and serve. If this is not
                        provided, checkplotserver will look for a
                        checkplot-pickle-flist.json in the directory that it
                        was started in
  --debugmode           start up in debug mode if set to 1. (default 0)
  --maxprocs            Number of background processes to use for
                        saving/loading checkplot files and running light
                        curves tools (default 2)
  --port                Run on the given port. (default 5225)
  --readonly            Run the server in readonly mode. This is useful for a
                        public-facing instance of checkplotserver where you
                        just want to allow collaborators to review objects
                        but not edit them. (default False)
  --serve               Bind to given address and serve content. (default
                        127.0.0.1)
  --sharedsecret        a file containing a cryptographically secure string
                        that is used to authenticate requests that come into
                        the special standalone mode.
  --standalone          This starts the server in standalone mode. (default 0)
astrobase.cpserver.checkplotserver_handlers module¶
These are Tornado handlers for serving checkplots and operating on them.
-
class
astrobase.cpserver.checkplotserver_handlers.
FrontendEncoder
(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶ Bases:
json.encoder.JSONEncoder
This overrides Python’s default JSONEncoder so we can serialize custom objects.
-
default
(obj)[source]¶ Overrides the default serializer for JSONEncoder.
This can serialize the following objects in addition to what JSONEncoder can already do.
- np.array
- bytes
- complex
- np.float64 and other np.dtype objects
Parameters: obj (object) – A Python object to serialize to JSON.
Returns: A JSON encoded representation of the input object.
Return type: str
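The general pattern of overriding `JSONEncoder.default` looks roughly like the sketch below. This is not the actual FrontendEncoder source: the class name is hypothetical, and the specific output forms (UTF-8 text for bytes, a two-element list for complex numbers) are assumptions chosen for illustration. NumPy objects are handled by duck typing on `tolist()` so that the example runs without NumPy installed.

```python
import json


class SketchFrontendEncoder(json.JSONEncoder):
    """Illustrative encoder; not the actual FrontendEncoder source."""

    def default(self, obj):
        # NumPy arrays and scalars expose tolist(), which yields
        # plain Python lists and numbers
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # assumption: bytes are decoded as UTF-8 text
        if isinstance(obj, bytes):
            return obj.decode("utf-8")
        # assumption: complex numbers become [real, imag] pairs
        if isinstance(obj, complex):
            return [obj.real, obj.imag]
        # anything else falls through to the stock behaviour,
        # which raises TypeError
        return super().default(obj)


# "HAT-123" is a made-up object name
encoded = json.dumps({"name": b"HAT-123", "z": 1 + 2j},
                     cls=SketchFrontendEncoder)
```

Passing the class through the `cls` keyword of `json.dumps` is the standard way to apply a custom encoder to a whole structure.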
-
-
class
astrobase.cpserver.checkplotserver_handlers.
IndexHandler
(application: tornado.web.Application, request: tornado.httputil.HTTPServerRequest, **kwargs)[source]¶ Bases:
tornado.web.RequestHandler
This handles the index page.
This page shows the current project.
astrobase.varclass package¶
This contains various modules that obtain features to use in variable star classification.
- astrobase.varclass.starfeatures: features related to color, proper motion, neighbor proximity, cross-matches against GAIA and SIMBAD, etc.
- astrobase.varclass.varfeatures: non-periodic light curve variability features
- astrobase.varclass.periodicfeatures: light curve features for phased light curves
- astrobase.varclass.rfclass: random forest classifier and support functions for variability classification
Submodules¶
astrobase.varclass.periodicfeatures module¶
This contains functions that calculate various light curve features using information about periods and fits to phased light curves.
-
astrobase.varclass.periodicfeatures.
lcfit_features
(times, mags, errs, period, fourierorder=5, transitparams=(-0.01, 0.1, 0.1), ebparams=(-0.2, 0.3, 0.7, 0.5), sigclip=10.0, magsarefluxes=False, fitfailure_means_featurenan=False, verbose=True)[source]¶ This calculates various features related to fitting models to light curves.
This function:
- calculates R_ij and phi_ij ratios for Fourier fit amplitudes and phases.
- calculates the reduced chi-sq for Fourier, EB, and planet transit fits.
- calculates the reduced chi-sq for Fourier, EB, and planet transit fits at 2 x the period.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to calculate periodic features for.
- period (float) – The period of variability to use to phase the light curve.
- fourierorder (int) – The Fourier order to use to generate the sinusoidal function that is fit to the phased light curve.
- transitparams (list of floats) – The transit depth, duration, and ingress duration to use to generate a trapezoid planet transit model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- ebparams (list of floats) – The primary eclipse depth, eclipse duration, the primary-secondary depth ratio, and the phase of the secondary eclipse to use to generate an eclipsing binary model fit to the phased light curve. The period used is the one provided in period, while the epoch is automatically obtained from a spline fit to the phased light curve.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – Set this to True if the input measurements in mags are actually fluxes.
- fitfailure_means_featurenan (bool) – If the planet, EB and EBx2 fits don’t return standard errors because the covariance matrix could not be generated, then the fit is suspicious and the features calculated can’t be trusted. If fitfailure_means_featurenan is True, then the output features for these fits will be set to nan.
- verbose (bool) – If True, will indicate progress while working.
Returns: A dict of all the features calculated is returned.
Return type: dict
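The sigclip and magsarefluxes behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function name is hypothetical, and the use of the median as the center and 1.483 x the median absolute deviation as the scatter estimate is an assumption.

```python
import math
from statistics import median


def sigclip_values(values, sigclip, magsarefluxes=False):
    """Sketch of symmetric/asymmetric sigma-clipping on a list of floats."""
    # non-finite elements are always dropped
    finite = [v for v in values if math.isfinite(v)]
    if sigclip is None:
        return finite

    if isinstance(sigclip, (int, float)):
        dim_sigma = bright_sigma = float(sigclip)
    else:
        # first element: dimmings, second element: brightenings
        dim_sigma, bright_sigma = sigclip

    center = median(finite)
    # assumption: robust scatter is 1.483 x median absolute deviation
    scatter = 1.483 * median(abs(v - center) for v in finite)

    if magsarefluxes:
        # fluxes: dimmer means a smaller value
        lower = center - dim_sigma * scatter
        upper = center + bright_sigma * scatter
    else:
        # magnitudes: dimmer means a larger value
        lower = center - bright_sigma * scatter
        upper = center + dim_sigma * scatter

    return [v for v in finite if lower <= v <= upper]


series = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95, 12.0, 8.0]
clipped_as_mags = sigclip_values(series, [50.0, 3.0])
clipped_as_fluxes = sigclip_values(series, [50.0, 3.0], magsarefluxes=True)
```

With the same asymmetric setting, the outlier at 12.0 survives when the series is treated as magnitudes (it is a dimming), while the outlier at 8.0 survives when the series is treated as fluxes. This is why magsarefluxes must match the data.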
-
astrobase.varclass.periodicfeatures.
periodogram_features
(pgramlist, times, mags, errs, sigclip=10.0, pdiff_threshold=0.0001, sidereal_threshold=0.0001, sampling_peak_multiplier=5.0, sampling_startp=None, sampling_endp=None, verbose=True)[source]¶ This calculates various periodogram features (for each periodogram).
The following features are obtained:
- For all best periods from all periodogram methods in pgramlist, calculates the number of these with peaks that are at least sampling_peak_multiplier x time-sampling periodogram peak at the same period. This indicates how likely the pgramlist periodogram peaks are to be real, as opposed to being caused by the time-sampling window function of the observations.
- For all best periods from all periodogram methods in pgramlist, calculates the number of best periods which are consistent with a sidereal day (1.0027379 and 0.9972696), likely indicating that they’re not real.
- For all best periods from all periodogram methods in pgramlist, calculates the number of cross-wise period differences for all of these that fall below the pdiff_threshold value. If this is high, most of the period-finders in pgramlist agree on their best period results, so it’s likely the periods found are real.
Parameters: - pgramlist (list of dicts) – This is a list of dicts returned by any of the period-finding methods in astrobase.periodbase. This can also be obtained from the resulting pickle from the astrobase.lcproc.periodsearch.run_pf() function. It’s a good idea to make pgramlist a list of periodogram lists from all magnitude columns in the input light curve to test periodic variability across all magnitude columns (e.g. period diffs between EPD and TFA mags).
- times,mags,errs (np.array) – The input flux/mag time-series to use to calculate features. These are used to recalculate the time-sampling L-S periodogram (using astrobase.periodbase.zgls.specwindow_lsp()) if one is not present in pgramlist. If it’s present, these can all be set to None.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- pdiff_threshold (float) – This is the max difference between periods to consider them the same.
- sidereal_threshold (float) – This is the max difference between any of the ‘best’ periods and the sidereal day periods to consider them the same.
- sampling_peak_multiplier (float) – This is the minimum multiplicative factor of a ‘best’ period’s normalized periodogram peak over the sampling periodogram peak at the same period required to accept the ‘best’ period as possibly real.
- sampling_startp,sampling_endp (float) – If the pgramlist doesn’t have a time-sampling Lomb-Scargle periodogram, it will be obtained automatically. Use these kwargs to control the minimum and maximum period interval to be searched when generating this periodogram.
- verbose (bool) – If True, will indicate progress and report errors.
Returns: Returns a dict with all of the periodogram features calculated.
Return type: dict
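Two of the counts described above, the sidereal-day check and the cross-wise period agreement, reduce to simple threshold comparisons. The sketch below shows one way to compute them for a flat list of best periods; the function names and the example periods are hypothetical, and the library's exact bookkeeping may differ.

```python
from itertools import combinations

# sidereal-day periods quoted in the feature description above
SIDEREAL_PERIODS = (1.0027379, 0.9972696)


def count_sidereal_periods(best_periods, sidereal_threshold=1.0e-4):
    """Count best periods lying within the threshold of a sidereal day."""
    return sum(
        1 for period in best_periods
        if any(abs(period - sidereal) < sidereal_threshold
               for sidereal in SIDEREAL_PERIODS)
    )


def count_agreeing_pairs(best_periods, pdiff_threshold=1.0e-4):
    """Count pairs of best periods that differ by less than the threshold."""
    return sum(
        1 for first, second in combinations(best_periods, 2)
        if abs(first - second) < pdiff_threshold
    )


# made-up best periods from several period-finders
best_periods = [3.08578, 3.08581, 1.00274, 6.1716]
n_sidereal = count_sidereal_periods(best_periods)
n_agreeing = count_agreeing_pairs(best_periods)
```

Here one period is flagged as sidereal and one pair of period-finders agrees, so a downstream classifier would see both a warning sign and a mild confirmation for this object.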
-
astrobase.varclass.periodicfeatures.
phasedlc_features
(times, mags, errs, period, nbrtimes=None, nbrmags=None, nbrerrs=None)[source]¶ This calculates various phased LC features for the object.
Some of the features calculated here come from:
Kim, D.-W., Protopapas, P., Bailer-Jones, C. A. L., et al. 2014, Astronomy and Astrophysics, 566, A43, and references therein (especially Richards, et al. 2011).
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to calculate the phased LC features for.
- period (float) – The period used to phase the input mag/flux time-series.
- nbrtimes,nbrmags,nbrerrs (np.array or None) – If nbrtimes, nbrmags, and nbrerrs are all provided, they should be ndarrays with times, mags, errs of this object’s closest neighbor (close within some small number x FWHM of telescope to check for blending). This function will then also calculate extra features based on the neighbor’s phased LC using the period provided for the target object.
Returns: Returns a dict with phased LC features.
Return type: dict
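Phasing is the step that underlies all of these features: each observation time is folded on the period, and the measurements are reordered by phase. The sketch below shows the folding for a target and for a neighbor phased with the target's period. The function name and the `epoch` argument are assumptions for illustration; this is not the library's phasing routine.

```python
def phase_light_curve(times, mags, period, epoch=0.0):
    """Fold times on period and return (phases, mags) sorted by phase."""
    # phase is the fractional part of the number of elapsed cycles
    phases = [((t - epoch) / period) % 1.0 for t in times]
    order = sorted(range(len(phases)), key=phases.__getitem__)
    return [phases[i] for i in order], [mags[i] for i in order]


# made-up target and neighbor measurements
times = [0.0, 0.5, 1.0, 2.5, 3.0]
mags = [10.0, 10.2, 10.4, 10.3, 10.5]
nbrtimes = [0.25, 1.25]
nbrmags = [11.0, 11.1]

period = 2.0
phases, phased_mags = phase_light_curve(times, mags, period)
# the neighbor is folded on the target's period, not its own
nbrphases, nbrphased_mags = phase_light_curve(nbrtimes, nbrmags, period)
```

If the neighbor's phased light curve shows the same shape as the target's at this period, that is a hint that the two objects may be blended.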
astrobase.varclass.rfclass module¶
Does variable classification using random forests. Two types of classification are supported:
- Variable classification using non-periodic features: this is used to perform a binary classification between non-variable and variable. Uses the features in astrobase.varclass.varfeatures and astrobase.varclass.starfeatures.
- TODO: Periodic variable classification using periodic features: this is used to perform multi-class classification for periodic variables using the features in astrobase.varclass.periodicfeatures and astrobase.varclass.starfeatures. The classes recognized are listed in PERIODIC_VARCLASSES below and were generated from manual classification run on various HATNet, HATSouth and HATPI fields.
-
astrobase.varclass.rfclass.
collect_nonperiodic_features
(featuresdir, magcol, outfile, pklglob='varfeatures-*.pkl', featurestouse=['stetsonj', 'stetsonk', 'amplitude', 'magnitude_ratio', 'linear_fit_slope', 'eta_normal', 'percentile_difference_flux_percentile', 'mad', 'skew', 'kurtosis', 'mag_iqr', 'beyond1std', 'grcolor', 'gicolor', 'ricolor', 'bvcolor', 'jhcolor', 'jkcolor', 'hkcolor', 'gkcolor', 'propermotion'], maxobjects=None, labeldict=None, labeltype='binary')[source]¶ This collects variability features into arrays for use with the classifier.
Parameters: - featuresdir (str) – This is the directory where all the varfeatures pickles are. Use pklglob to specify the glob to search for. The varfeatures pickles contain objectids, a light curve magcol, and features as dict key-vals. The astrobase.lcproc.lcvfeatures module can be used to produce these.
- magcol (str) – This is the key in each varfeatures pickle corresponding to the magcol of the light curve the variability features were extracted from.
- outfile (str) – This is the filename of the output pickle that will be written containing a dict of all the features extracted into np.arrays.
- pklglob (str) – This is the UNIX file glob to use to search for varfeatures pickle files in featuresdir.
- featurestouse (list of str) – Each varfeatures pickle can contain any combination of non-periodic, stellar, and periodic features; these must have the same names as elements in the list of strings provided in featurestouse. This tries to get all the features listed in NONPERIODIC_FEATURES_TO_COLLECT by default. If featurestouse is provided as a list, gets only the features listed in this kwarg instead.
- maxobjects (int or None) – This controls how many pickles from the featuresdir to process. If None, will process all varfeatures pickles.
- labeldict (dict or None) –
If this is provided, it must be a dict with the following key:val list:
'<objectid>':<label value>
for each objectid collected from the varfeatures pickles. This will turn the collected information into a training set for classifiers.
Example: to carry out non-periodic variable feature collection of fake LCs prepared by astrobase.fakelcs.generation, use the value of the ‘isvariable’ dict elem from the fakelcs-info.pkl here, like so:
labeldict={x:y for x,y in zip(fakelcinfo['objectid'], fakelcinfo['isvariable'])}
- labeltype ({'binary', 'classes'}) – This is either ‘binary’ or ‘classes’ for binary/multi-class classification respectively.
Returns: This returns a dict with all of the features collected into np.arrays, ready to use as input to a scikit-learn classifier.
Return type: dict
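The labeldict construction from the example above can be tried out on a small stand-in for the fake light curve info dict. The object IDs and flags below are made up; in practice `fakelcinfo` would be loaded from the fakelcs-info.pkl file (for instance with `pickle.load`).

```python
# hypothetical stand-in for the dict loaded from fakelcs-info.pkl
fakelcinfo = {
    "objectid": ["fake-0001", "fake-0002", "fake-0003"],
    "isvariable": [True, False, True],
}

# one '<objectid>': <label value> entry per object
labeldict = {
    objectid: isvariable
    for objectid, isvariable in zip(fakelcinfo["objectid"],
                                    fakelcinfo["isvariable"])
}

# quick sanity check on class balance before training
n_variables = sum(1 for label in labeldict.values() if label)
```

Checking the class balance at this stage is worthwhile, because a heavily skewed labeldict affects which cross-validation scoring metric is meaningful later on.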
-
astrobase.varclass.rfclass.
train_rf_classifier
(collected_features, test_fraction=0.25, n_crossval_iterations=20, n_kfolds=5, crossval_scoring_metric='f1', classifier_to_pickle=None, nworkers=-1)[source]¶ This gets the best RF classifier after running cross-validation.
- splits the training set into test/train samples
- does KFold stratified cross-validation using RandomizedSearchCV
- gets the RandomForestClassifier with the best performance after CV
- gets the confusion matrix for the test set
Runs on the output dict from functions that produce dicts similar to that produced by collect_nonperiodic_features above.
Parameters: - collected_features (dict or str) – This is either the dict produced by a collect_*_features function or the pickle produced by the same.
- test_fraction (float) – This sets the fraction of the input set that will be used as the test set after training.
- n_crossval_iterations (int) – This sets the number of iterations to use when running the cross-validation.
- n_kfolds (int) – This sets the number of K-folds to use on the data when doing a test-train split.
- crossval_scoring_metric (str) –
This is a string that describes how the cross-validation score is calculated for each iteration. See the URL below for how to specify this parameter:
http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter By default, this is tuned for binary classification and uses the F1 scoring metric. Change the crossval_scoring_metric to another metric (probably ‘accuracy’) for multi-class classification, e.g. for periodic variable classification.
- classifier_to_pickle (str) – If this is a string indicating the name of a pickle file to write, will write the trained classifier to the pickle that can be later loaded and used to classify data.
- nworkers (int) – This is the number of parallel workers to use in the RandomForestClassifier. Set to -1 to use all CPUs on your machine.
Returns: A dict containing the trained classifier, cross-validation results, the input data set, and all input kwargs used is returned, along with cross-validation score metrics.
Return type: dict
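The first and last steps listed above, the stratified test/train split and the confusion matrix, can be illustrated without scikit-learn. The sketch below is a dependency-free illustration of those two ideas only; it is not how the library performs them, and the function names, the rounding rule, and the fixed seed are assumptions.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # take the same fraction from every class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) label pairs, i.e. confusion-matrix cells."""
    return Counter(zip(true_labels, predicted_labels))


# made-up binary labels: 8 non-variables, 4 variables
labels = [0] * 8 + [1] * 4
train_idx, test_idx = stratified_split(labels, test_fraction=0.25)

# made-up predictions on a four-object test set
cells = confusion_counts([0, 0, 1, 1], [0, 1, 1, 1])
```

Stratifying matters for variability work because variables are usually the minority class, and an unstratified split can leave the test set with too few of them to evaluate the classifier.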
-
astrobase.varclass.rfclass.
apply_rf_classifier
(classifier, varfeaturesdir, outpickle, maxobjects=None)[source]¶ This applies an RF classifier trained using train_rf_classifier to varfeatures pickles in varfeaturesdir.
Parameters: - classifier (dict or str) – This is the output dict or pickle created by train_rf_classifier. This will contain a features_name key that will be used to collect the same features used to train the classifier from the varfeatures pickles in varfeaturesdir.
- varfeaturesdir (str) – The directory containing the varfeatures pickles for objects that will be classified by the trained classifier.
- outpickle (str) – This is a filename for the pickle that will be written containing the result dict from this function.
- maxobjects (int) – This sets the number of objects to process in varfeaturesdir.
Returns: The classification results after running the trained classifier are returned as a dict. This contains predicted labels and their prediction probabilities.
Return type: dict
-
astrobase.varclass.rfclass.
plot_training_results
(classifier, classlabels, outfile)[source]¶ This plots the training results from the classifier run on the training set.
- plots the confusion matrix
- plots the feature importances
- FIXME: plot the learning curves too, see: http://scikit-learn.org/stable/modules/learning_curve.html
Parameters: - classifier (dict or str) – This is the output dict or pickle created by train_rf_classifier containing the trained classifier.
- classlabels (list of str) – This contains all of the class labels for the current classification problem.
- outfile (str) – This is the filename where the plots will be written.
Returns: The path to the generated plot file.
Return type: str
astrobase.varclass.starfeatures module¶
This calculates various features related to the color/proper-motion of stars.
All of the functions in this module require as input an ‘objectinfo’ dict. This should usually be taken from a light curve file read into an lcdict. The format and the minimum keys required are:
{'objectid': the name of the object,
'ra': the right ascension of the object in decimal degrees,
'decl': the declination of the object in decimal degrees,
'ndet': the number of observations of this object}
You can also provide magnitudes and proper motions of the object using the following keys and the appropriate values in the objectinfo dict. These will be used to calculate colors, total and reduced proper motion, etc.:
'pmra' -> the proper motion in mas/yr in right ascension,
'pmdecl' -> the proper motion in mas/yr in declination,
'umag' -> U mag -> colors: U-B, U-V, U-g
'bmag' -> B mag -> colors: U-B, B-V
'vmag' -> V mag -> colors: U-V, B-V, V-R, V-I, V-K
'rmag' -> R mag -> colors: V-R, R-I
'imag' -> I mag -> colors: g-I, V-I, R-I, B-I
'jmag' -> 2MASS J mag -> colors: J-H, J-K, g-J, i-J
'hmag' -> 2MASS H mag -> colors: J-H, H-K
'kmag' -> 2MASS Ks mag -> colors: g-Ks, H-Ks, J-Ks, V-Ks
'sdssu' -> SDSS u mag -> colors: u-g, u-V
'sdssg' -> SDSS g mag -> colors: g-r, g-i, g-K, u-g, U-g, g-J
'sdssr' -> SDSS r mag -> colors: r-i, g-r
'sdssi' -> SDSS i mag -> colors: r-i, i-z, g-i, i-J, i-W1
'sdssz' -> SDSS z mag -> colors: i-z, z-W2, g-z
'ujmag' -> UKIRT J mag -> colors: J-H, H-K, J-K, g-J, i-J
'uhmag' -> UKIRT H mag -> colors: J-H, H-K
'ukmag' -> UKIRT K mag -> colors: g-K, H-K, J-K, V-K
'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2
'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3
'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3
'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4
'wise1' -> WISE W1 mag -> colors: i-W1, W1-W2
'wise2' -> WISE W2 mag -> colors: W1-W2, W2-W3
'wise3' -> WISE W3 mag -> colors: W2-W3
'wise4' -> WISE W4 mag -> colors: W3-W4
-
astrobase.varclass.starfeatures.
coord_features
(objectinfo)[source]¶ Calculates object coordinates features, including:
- galactic coordinates
- total proper motion from pmra, pmdecl
- reduced J proper motion from propermotion and Jmag
Parameters: objectinfo (dict) – This is an objectinfo dict from a light curve file read into an lcdict. The format and the minimum keys required are:
{'ra': the right ascension of the object in decimal degrees,
'decl': the declination of the object in decimal degrees,
'pmra': the proper motion in right ascension in mas/yr,
'pmdecl': the proper motion in declination in mas/yr,
'jmag': the 2MASS J mag of this object}
Returns: A dict containing the total proper motion.
Return type: dict
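For orientation, the total and reduced proper motions mentioned above can be computed as in the sketch below. These are textbook forms and not necessarily the library's exact conventions: the function names are hypothetical, the handling of the cos(decl) factor is an assumption that depends on your catalog, and the additive constant in the reduced proper motion may differ from what the library uses.

```python
import math


def sketch_total_pm(pmra, pmdecl, decl=None):
    """Total proper motion in mas/yr.

    If decl (degrees) is given, pmra is treated as not yet including the
    cos(decl) factor; this convention is an assumption, check your catalog.
    """
    if decl is not None:
        pmra = pmra * math.cos(math.radians(decl))
    return math.hypot(pmra, pmdecl)


def sketch_reduced_pm(mag, propermotion_mas_yr):
    """Textbook reduced proper motion: H = m + 5 log10(mu [arcsec/yr]) + 5."""
    return mag + 5.0 * math.log10(propermotion_mas_yr / 1000.0) + 5.0


# made-up values: proper motions in mas/yr and a J magnitude
pm = sketch_total_pm(3.0, 4.0)
rpm_j = sketch_reduced_pm(10.0, 1000.0)
```

The reduced proper motion acts as a rough stand-in for absolute magnitude, which is why it is useful for separating nearby dwarfs from distant giants when no parallax is available.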
-
astrobase.varclass.starfeatures.
color_features
(in_objectinfo, deredden=True, custom_bandpasses=None, dust_timeout=10.0)[source]¶ Stellar colors and dereddened stellar colors using 2MASS DUST API:
http://irsa.ipac.caltech.edu/applications/DUST/docs/dustProgramInterface.html
Parameters: - in_objectinfo (dict) –
This is a dict that contains the object’s magnitudes and positions. This requires at least ‘ra’, and ‘decl’ as keys which correspond to the right ascension and declination of the object, and one or more of the following keys for object magnitudes:
'umag' -> U mag -> colors: U-B, U-V, U-g
'bmag' -> B mag -> colors: U-B, B-V
'vmag' -> V mag -> colors: U-V, B-V, V-R, V-I, V-K
'rmag' -> R mag -> colors: V-R, R-I
'imag' -> I mag -> colors: g-I, V-I, R-I, B-I
'jmag' -> 2MASS J mag -> colors: J-H, J-K, g-J, i-J
'hmag' -> 2MASS H mag -> colors: J-H, H-K
'kmag' -> 2MASS Ks mag -> colors: g-Ks, H-Ks, J-Ks, V-Ks
'sdssu' -> SDSS u mag -> colors: u-g, u-V
'sdssg' -> SDSS g mag -> colors: g-r, g-i, g-K, u-g, U-g, g-J
'sdssr' -> SDSS r mag -> colors: r-i, g-r
'sdssi' -> SDSS i mag -> colors: r-i, i-z, g-i, i-J, i-W1
'sdssz' -> SDSS z mag -> colors: i-z, z-W2, g-z
'ujmag' -> UKIRT J mag -> colors: J-H, H-K, J-K, g-J, i-J
'uhmag' -> UKIRT H mag -> colors: J-H, H-K
'ukmag' -> UKIRT K mag -> colors: g-K, H-K, J-K, V-K
'irac1' -> Spitzer IRAC1 mag -> colors: i-I1, I1-I2
'irac2' -> Spitzer IRAC2 mag -> colors: I1-I2, I2-I3
'irac3' -> Spitzer IRAC3 mag -> colors: I2-I3
'irac4' -> Spitzer IRAC4 mag -> colors: I3-I4
'wise1' -> WISE W1 mag -> colors: i-W1, W1-W2
'wise2' -> WISE W2 mag -> colors: W1-W2, W2-W3
'wise3' -> WISE W3 mag -> colors: W2-W3
'wise4' -> WISE W4 mag -> colors: W3-W4
These are basically taken from the available reddening bandpasses from the 2MASS DUST service. If B, V, u, g, r, i, z aren’t provided but 2MASS J, H, Ks are all provided, the former will be calculated using the 2MASS JHKs -> BVugriz conversion functions in astrobase.magnitudes.
- deredden (bool) – If True, will make sure all colors use dereddened mags where possible.
- custom_bandpasses (dict) –
This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:
{
'<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                    'label':'<band_label_1>',
                    'colors':[['<bandkey1>-<bandkey2>',
                               '<BAND1> - <BAND2>'],
                              ['<bandkey3>-<bandkey4>',
                               '<BAND3> - <BAND4>']]},
.
...
.
'<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                    'label':'<band_label_N>',
                    'colors':[['<bandkey1>-<bandkey2>',
                               '<BAND1> - <BAND2>'],
                              ['<bandkey3>-<bandkey4>',
                               '<BAND3> - <BAND4>']]},
}
Where:
bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band
twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):
|Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
|char       |float  |float             |float  |float           |float|
|           |microns|                  |mags   |                |mags |
 CTIO U       0.3734              4.107   0.209            4.968 0.253
 CTIO B       0.4309              3.641   0.186            4.325 0.221
 CTIO V       0.5517              2.682   0.137            3.240 0.165
.
.
...
The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.
band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.
The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:
['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
where the first item is the pair of bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:
['sdssu-sdssg','u - g']
- dust_timeout (float) – The timeout to use when contacting the 2MASS DUST web service.
Returns: An objectinfo dict with all of the generated colors, dereddened magnitudes, and dereddened colors, as specified in the input args.
Return type: dict
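As a concrete instance of the custom_bandpasses layout described above, the sketch below defines one made-up bandpass and shows how a color key of the form '<bandkey1>-<bandkey2>' maps to a magnitude difference. The bandpass key `mybandg`, its label, the magnitudes, and the helper `sketch_colors` are all hypothetical; the helper ignores dereddening entirely and is not the library's implementation.

```python
# hypothetical custom bandpass definition following the documented layout
custom_bandpasses = {
    "mybandg": {
        "dustkey": None,  # None: skip the DUST lookup for this bandpass
        "label": "g'",
        "colors": [["mybandg-sdssr", "g' - r"]],
    },
}

# made-up object with the minimum position keys and two magnitudes
objectinfo = {"ra": 290.0, "decl": 45.0, "mybandg": 13.5, "sdssr": 12.5}


def sketch_colors(objectinfo, bandpasses):
    """Compute each defined color as mag(bandkey1) - mag(bandkey2)."""
    colors = {}
    for band in bandpasses.values():
        for colorkey, _colorlabel in band["colors"]:
            key1, key2 = colorkey.split("-")
            # only compute colors for which both magnitudes are present
            if key1 in objectinfo and key2 in objectinfo:
                colors[colorkey] = objectinfo[key1] - objectinfo[key2]
    return colors


colors = sketch_colors(objectinfo, custom_bandpasses)
```

Keeping the color key and its display label together in each pair means that a frontend can show "g' - r" without having to know how the key was constructed.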
-
astrobase.varclass.starfeatures.
mdwarf_subtype_from_sdsscolor
(ri_color, iz_color)[source]¶ This calculates the M-dwarf subtype given SDSS r-i and i-z colors.
Parameters: - ri_color (float) – The SDSS r-i color of the object.
- iz_color (float) – The SDSS i-z color of the object.
Returns: (subtype, index1, index2) – subtype: if the star appears to be an M dwarf, will return an int between 0 and 9 indicating its subtype, e.g. will return 4 for an M4 dwarf. If the object isn’t an M dwarf, will return None
index1, index2: the M-dwarf color locus value and spread of this object calculated from the r-i and i-z colors.
Return type: tuple
-
astrobase.varclass.starfeatures.
color_classification
(colorfeatures, pmfeatures)[source]¶ This calculates rough star type classifications based on star colors in the ugrizJHK bands.
Uses the output from color_features and coord_features. By default, color_features will use dereddened colors, as are expected by most relations here.
Based on the color cuts from:
- SDSS SEGUE (Yanny+ 2009)
- SDSS QSO catalog (Schneider+ 2007)
- SDSS RR Lyrae catalog (Sesar+ 2011)
- SDSS M-dwarf catalog (West+ 2008)
- Helmi+ 2003
- Bochanski+ 2014
Parameters: - colorfeatures (dict) – This is the dict produced by the color_features function.
- pmfeatures (dict) – This is the dict produced by the coord_features function.
Returns: A dict containing all of the possible classes this object can belong to as a list in the color_classes key, and values of the various color indices used to arrive at that conclusion as the other keys.
Return type: dict
-
astrobase.varclass.starfeatures.
neighbor_gaia_features
(objectinfo, lclist_kdtree, neighbor_radius_arcsec, gaia_matchdist_arcsec=3.0, verbose=True, gaia_submit_timeout=10.0, gaia_submit_tries=3, gaia_max_timeout=180.0, gaia_mirror=None, gaia_data_release='dr2', complete_query_later=True, search_simbad=False)[source]¶ Gets several neighbor, GAIA, and SIMBAD features:
From the KD-Tree (lclist_kdtree) of the light curve catalog that the object is in:
- distance to closest neighbor in arcsec
- total number of all neighbors within 2 x neighbor_radius_arcsec
From the GAIA DR2 catalog:
- distance to closest neighbor in arcsec
- total number of all neighbors within 2 x neighbor_radius_arcsec
- gets the parallax for the object and neighbors
- calculates the absolute GAIA mag and G-K color for use in CMDs
- gets the proper motion in RA/Dec if available
From the SIMBAD catalog:
- the name of the object
- the type of the object
Parameters: - objectinfo (dict) –
This is the objectinfo dict from an object’s light curve. This must contain at least the following keys:
{'ra': the right ascension of the object, 'decl': the declination of the object}
- lclist_kdtree (scipy.spatial.cKDTree object) – This is a KD-Tree built on the Cartesian xyz coordinates from (ra, dec) of all objects in the same field as this object. It is similar to that produced by astrobase.lcproc.catalogs.make_lclist(), and is used to carry out the spatial search required to find neighbors for this object.
- neighbor_radius_arcsec (float) – The maximum radius in arcseconds around this object to search for neighbors in both the light curve catalog and in the GAIA DR2 catalog.
- gaia_matchdist_arcsec (float) – The maximum distance in arcseconds to use for a GAIA cross-match to this object.
- verbose (bool) – If True, indicates progress and warns of problems.
- gaia_submit_timeout (float) – Sets the timeout in seconds to use when submitting a request to look up the object’s information to the GAIA service. Note that if fast_mode is set, this is ignored.
- gaia_submit_tries (int) – Sets the maximum number of times the GAIA services will be contacted to obtain this object’s information. If fast_mode is set, this is ignored, and the services will be contacted only once (meaning that a failure to respond will be silently ignored and no GAIA data will be added to the checkplot’s objectinfo dict).
- gaia_max_timeout (float) – Sets the timeout in seconds to use when waiting for the GAIA service to respond to our request for the object’s information. Note that if fast_mode is set, this is ignored.
- gaia_mirror (str) – This sets the GAIA mirror to use. This is a key in the services.gaia.GAIA_URLS dict which defines the URLs to hit for each mirror.
- gaia_data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query. This provides hints for which table to use for the GAIA mirror being queried.
- search_simbad (bool) – If this is True, searches for objects in SIMBAD at this object’s location and gets the object’s SIMBAD main ID, type, and stellar classification if available.
Returns: Returns a dict with neighbor, GAIA, and SIMBAD features.
Return type: dict
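The two calculations mentioned above (a neighbor search on Cartesian xyz coordinates, and an absolute magnitude from a parallax) can be sketched with the standard library alone. This is an illustrative sketch and not astrobase's implementation: the helper names radec_to_xyz, arcsec_to_chord, and absolute_gaia_mag are hypothetical, and extinction is ignored in the magnitude calculation.

```python
import math

def radec_to_xyz(ra_deg, decl_deg):
    """Convert equatorial coordinates in degrees to a unit vector (x, y, z)."""
    ra, decl = math.radians(ra_deg), math.radians(decl_deg)
    return (math.cos(decl) * math.cos(ra),
            math.cos(decl) * math.sin(ra),
            math.sin(decl))

def arcsec_to_chord(radius_arcsec):
    """Convert an angular radius to the straight-line (chord) distance
    between two unit vectors, which is the metric a KD-Tree on xyz uses."""
    theta = math.radians(radius_arcsec / 3600.0)
    return 2.0 * math.sin(theta / 2.0)

def absolute_gaia_mag(gmag, parallax_mas):
    """M_G = G + 5 log10(parallax in mas) - 10, for positive parallaxes."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return gmag + 5.0 * math.log10(parallax_mas) - 10.0

xyz = radec_to_xyz(0.0, 0.0)
chord_30arcsec = arcsec_to_chord(30.0)
abs_mag = absolute_gaia_mag(12.0, 100.0)  # 100 mas corresponds to 10 pc
```

A KD-Tree query radius would be given as the chord distance rather than the angle; for small radii the two are nearly equal when the angle is expressed in radians.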
astrobase.varclass.varfeatures module¶
Calculates light curve features for variability classification.
-
astrobase.varclass.varfeatures.
stetson_jindex
(ftimes, fmags, ferrs, weightbytimediff=False)[source]¶ This calculates the Stetson index for the magseries, based on consecutive pairs of observations.
Based on Nicole Loncke’s work for her Planets and Life certificate at Princeton in 2014.
Parameters: - ftimes,fmags,ferrs (np.array) – The input mag/flux time-series with all non-finite elements removed.
- weightbytimediff (bool) –
If this is True, the Stetson index for any pair of mags will be reweighted by the difference in times between them using the scheme in Fruth+ 2012 and Zhang+ 2003 (as seen in Sokolovsky+ 2017):
w_i = exp(-(t_(i+1) - t_i) / delta_t)
Returns: The calculated Stetson J variability index.
Return type: float
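A minimal sketch of a Stetson J index over consecutive pairs is shown below. It is not astrobase's code: the function name is hypothetical, the residuals are normalized around the inverse-variance weighted mean, and delta_t is assumed to be the median of the consecutive time differences.

```python
import math
from statistics import median

def stetson_j_sketch(times, mags, errs, weightbytimediff=False):
    """Stetson J from consecutive pairs of observations (illustrative only)."""
    n = len(mags)
    # inverse-variance weighted mean of the light curve
    ivar = [1.0 / (e * e) for e in errs]
    wmean = sum(w * m for w, m in zip(ivar, mags)) / sum(ivar)
    # normalized residuals with the usual sqrt(n / (n - 1)) bias correction
    bias = math.sqrt(n / (n - 1.0))
    deltas = [bias * (m - wmean) / e for m, e in zip(mags, errs)]
    # products of residuals for consecutive observations
    products = [deltas[i] * deltas[i + 1] for i in range(n - 1)]
    if weightbytimediff:
        dts = [times[i + 1] - times[i] for i in range(n - 1)]
        delta_t = median(dts)  # assumption: median time difference
        pair_weights = [math.exp(-dt / delta_t) for dt in dts]
    else:
        pair_weights = [1.0] * (n - 1)
    numerator = sum(w * math.copysign(math.sqrt(abs(p)), p)
                    for w, p in zip(pair_weights, products))
    return numerator / sum(pair_weights)

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
trend = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5]
errs = [0.01] * 6
j_trend = stetson_j_sketch(times, trend, errs)
j_trend_weighted = stetson_j_sketch(times, trend, errs, weightbytimediff=True)
j_flat = stetson_j_sketch(times, [10.0] * 6, errs)
```

Correlated variability, such as the slow trend here, gives a positive index because neighboring residuals tend to share a sign. With evenly spaced times all pair weights are equal, so the weighted and unweighted values agree.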
-
astrobase.varclass.varfeatures.
stetson_kindex
(fmags, ferrs)[source]¶ This calculates the Stetson K index (a robust measure of the kurtosis).
Parameters: fmags,ferrs (np.array) – The input mag/flux time-series to process. Must have no non-finite elems.
Returns: The Stetson K variability index.
Return type: float
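The Stetson K index can be sketched as the ratio of the mean absolute normalized residual to the root-mean-square normalized residual. The function below is a hypothetical illustration using the standard library, not the astrobase implementation.

```python
import math

def stetson_k_sketch(mags, errs):
    """Stetson K: mean |delta| divided by the RMS of delta (illustrative)."""
    n = len(mags)
    ivar = [1.0 / (e * e) for e in errs]
    wmean = sum(w * m for w, m in zip(ivar, mags)) / sum(ivar)
    bias = math.sqrt(n / (n - 1.0))
    deltas = [bias * (m - wmean) / e for m, e in zip(mags, errs)]
    mean_abs = sum(abs(d) for d in deltas) / n
    rms = math.sqrt(sum(d * d for d in deltas) / n)
    return mean_abs / rms  # undefined for a perfectly constant series

# a square-wave-like series: every residual has the same absolute size
k_square = stetson_k_sketch([9.9, 10.1, 9.9, 10.1], [0.1] * 4)
```

For Gaussian-distributed residuals this ratio tends to sqrt(2/pi), about 0.798, and it approaches 1 when all residuals have the same absolute size, as in the example.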
-
astrobase.varclass.varfeatures.
lightcurve_moments
(ftimes, fmags, ferrs)[source]¶ This calculates the weighted mean, stdev, median, MAD, percentiles, skew, kurtosis, fraction of LC beyond 1-stdev, and IQR.
Parameters: ftimes,fmags,ferrs (np.array) – The input mag/flux time-series with all non-finite elements removed.
Returns: A dict with all of the light curve moments calculated.
Return type: dict
-
astrobase.varclass.varfeatures.
lightcurve_flux_measures
(ftimes, fmags, ferrs, magsarefluxes=False)[source]¶ This calculates percentiles and percentile ratios of the flux.
Parameters: - ftimes,fmags,ferrs (np.array) – The input mag/flux time-series with all non-finite elements removed.
- magsarefluxes (bool) – If the fmags array actually contains fluxes, will not convert mags to fluxes before calculating the percentiles.
Returns: A dict with all of the light curve flux percentiles and percentile ratios calculated.
Return type: dict
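As a rough illustration of a flux percentile ratio, the sketch below follows the definition commonly attributed to Richards et al. 2011: the width of a central flux percentile range divided by the 5th to 95th percentile range. The function names are hypothetical, and the magnitude-to-flux conversion uses an arbitrary zero point, which cancels in the ratio.

```python
import math

def percentile(sorted_vals, pct):
    """Percentile by linear interpolation between the closest ranks."""
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def flux_percentile_ratio(values, width, magsarefluxes=False):
    """Ratio of the central `width` percent flux range to the 5-95 range."""
    if magsarefluxes:
        fluxes = sorted(values)
    else:
        # convert mags to relative fluxes; the zero point cancels in the ratio
        fluxes = sorted(10.0 ** (-0.4 * m) for m in values)
    half = width / 2.0
    inner = percentile(fluxes, 50.0 + half) - percentile(fluxes, 50.0 - half)
    outer = percentile(fluxes, 95.0) - percentile(fluxes, 5.0)
    return inner / outer

fluxes = [float(i) for i in range(101)]
ratio_mid20 = flux_percentile_ratio(fluxes, 20, magsarefluxes=True)
ratio_mid50 = flux_percentile_ratio(fluxes, 50, magsarefluxes=True)
```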
-
astrobase.varclass.varfeatures.
lightcurve_ptp_measures
(ftimes, fmags, ferrs)[source]¶ This calculates various point-to-point measures (eta in Kim+ 2014).
Parameters: ftimes,fmags,ferrs (np.array) – The input mag/flux time-series with all non-finite elements removed.
Returns: A dict with values of the point-to-point measures, including the eta variability index (often used as its inverse inveta to have the same sense as increasing variability index -> more likely a variable star).
Return type: dict
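The eta index is usually defined as the von Neumann ratio: the mean squared difference between successive points divided by the variance of the series. The sketch below assumes that definition and uses a hypothetical function name; it is not the astrobase implementation.

```python
from statistics import variance

def eta_sketch(mags):
    """von Neumann ratio: mean squared successive difference / variance."""
    n = len(mags)
    mean_sq_diff = sum((mags[i + 1] - mags[i]) ** 2
                       for i in range(n - 1)) / (n - 1)
    return mean_sq_diff / variance(mags)

eta_smooth = eta_sketch([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])   # slow trend
eta_jumpy = eta_sketch([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])    # point-to-point scatter
inveta_smooth = 1.0 / eta_smooth
```

Smooth, correlated variability gives a small eta and therefore a large inveta, which is why the inverse is convenient as a "larger means more variable" index.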
-
astrobase.varclass.varfeatures.
nonperiodic_lightcurve_features
(times, mags, errs, magsarefluxes=False)[source]¶ This calculates the following nonperiodic features of the light curve (listed in Richards et al. 2011):
- amplitude
- beyond1std
- flux_percentile_ratio_mid20
- flux_percentile_ratio_mid35
- flux_percentile_ratio_mid50
- flux_percentile_ratio_mid65
- flux_percentile_ratio_mid80
- linear_trend
- max_slope
- median_absolute_deviation
- median_buffer_range_percentage
- pair_slope_trend
- percent_amplitude
- percent_difference_flux_percentile
- skew
- stdev
- timelength
- mintime
- maxtime
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to process.
- magsarefluxes (bool) – If True, will treat values in mags as fluxes instead of magnitudes.
Returns: A dict containing all of the features listed above.
Return type: dict
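Three of the simpler features in the list can be sketched as follows. The definitions here (half the peak-to-peak range for amplitude, the fraction of points more than one standard deviation from the weighted mean for beyond1std) follow common usage and are assumptions; astrobase's exact definitions may differ, and the function name is hypothetical.

```python
from statistics import median, pstdev

def simple_features_sketch(mags, errs):
    """A few nonperiodic features computed with the standard library."""
    ivar = [1.0 / (e * e) for e in errs]
    wmean = sum(w * m for w, m in zip(ivar, mags)) / sum(ivar)
    med = median(mags)
    stdev = pstdev(mags)
    return {
        # half the difference between the largest and smallest values
        'amplitude': (max(mags) - min(mags)) / 2.0,
        # median of the absolute deviations from the median
        'median_absolute_deviation': median(abs(m - med) for m in mags),
        # fraction of points further than one stdev from the weighted mean
        'beyond1std': sum(1 for m in mags if abs(m - wmean) > stdev) / len(mags),
        'stdev': stdev,
    }

features = simple_features_sketch([10.0, 10.0, 10.0, 10.0, 11.0], [0.1] * 5)
```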
-
astrobase.varclass.varfeatures.
gilliland_cdpp
(times, mags, errs, windowlength=97, polyorder=2, binsize=23400, sigclip=5.0, magsarefluxes=False, **kwargs)[source]¶ This calculates the CDPP of a timeseries using the method in the paper:
Gilliland, R. L., Chaplin, W. J., Dunham, E. W., et al. 2011, ApJS, 197, 6 (http://adsabs.harvard.edu/abs/2011ApJS..197....6G)
The steps are:
- pass the time-series through a Savitzky-Golay filter.
  - we use scipy.signal.savgol_filter, **kwargs are passed to this.
  - also see: http://scipy.github.io/old-wiki/pages/Cookbook/SavitzkyGolay.
  - the windowlength is the number of LC points to use (Kepler uses 2 days = (1440 minutes/day / 30 minutes/LC point) x 2 days = 96 -> 97 LC points).
  - the polyorder is a quadratic by default.
- subtract the smoothed time-series from the actual light curve.
- sigma clip the remaining LC.
- get the binned mag series by averaging over 6.5 hour bins, only retaining bins with at least 7 points.
- the standard deviation of the binned averages is the CDPP.
- multiply this by 1.168 to correct for over-subtraction of white-noise.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to calculate CDPP for.
- windowlength (int) – The smoothing window size to use.
- polyorder (int) – The polynomial order to use in the Savitzky-Golay smoothing.
- binsize (int) – The bin size to use for binning the light curve.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags.
- kwargs (additional kwargs) – These are passed directly to scipy.signal.savgol_filter.
Returns: The calculated CDPP value.
Return type: float
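The steps above can be sketched with the standard library. Note the substitution: this sketch uses a simple boxcar (moving-average) smoother in place of the Savitzky-Golay filter, so its numbers will not match gilliland_cdpp. It also assumes the times are in seconds so that binsize=23400 corresponds to 6.5 hours, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def cdpp_sketch(times_sec, fluxes, windowlength=97, binsize=23400.0,
                sigclip=5.0, minbinelems=7):
    """CDPP-like noise estimate with a boxcar smoother (illustrative only)."""
    npts = len(fluxes)
    half = windowlength // 2
    # 1. smooth the series (boxcar stands in for Savitzky-Golay here)
    smoothed = [mean(fluxes[max(0, i - half):i + half + 1])
                for i in range(npts)]
    # 2. subtract the smoothed series from the light curve
    residuals = [f - s for f, s in zip(fluxes, smoothed)]
    # 3. symmetric sigma-clip of the residuals
    center, spread = mean(residuals), pstdev(residuals)
    kept = [(t, r) for t, r in zip(times_sec, residuals)
            if abs(r - center) <= sigclip * spread]
    # 4. average in time bins, keeping bins with enough points
    bins = {}
    for t, r in kept:
        bins.setdefault(int((t - times_sec[0]) // binsize), []).append(r)
    binned = [mean(vals) for vals in bins.values() if len(vals) >= minbinelems]
    # 5-6. stdev of the binned averages, corrected by the 1.168 factor
    return 1.168 * pstdev(binned)

# window length for a 2-day window at 30-minute cadence, made odd
window_points = int((1440 / 30) * 2) + 1
times_sec = [1800.0 * i for i in range(200)]   # 30-minute cadence
flat_cdpp = cdpp_sketch(times_sec, [1.0] * 200)
```

A perfectly flat light curve has zero residuals and therefore a CDPP of zero. If no bin reaches minbinelems points, pstdev raises an error, so short light curves need a smaller bin-occupancy threshold.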
-
astrobase.varclass.varfeatures.
all_nonperiodic_features
(times, mags, errs, magsarefluxes=False, stetson_weightbytimediff=True)[source]¶ This rolls up the feature functions above and returns a single dict.
NOTE: this doesn’t calculate the CDPP to save time since binning and smoothing takes a while for dense light curves.
Parameters: - times,mags,errs (np.array) – The input mag/flux time-series to calculate features for.
- magsarefluxes (bool) – If True, indicates mags is actually an array of flux values.
- stetson_weightbytimediff (bool) –
If this is True, the Stetson index for any pair of mags will be reweighted by the difference in times between them using the scheme in Fruth+ 2012 and Zhang+ 2003 (as seen in Sokolovsky+ 2017):
w_i = exp(-(t_(i+1) - t_i) / delta_t)
Returns: Returns a dict with all of the variability features.
Return type: dict
astrobase.services package¶
This contains various modules to query online data services. These are not exhaustive and are meant to support other astrobase modules.
- astrobase.services.dust: interface to the 2MASS DUST extinction/emission service.
- astrobase.services.gaia: interface to the GAIA TAP+ ADQL query service.
- astrobase.services.lccs: interface to the LCC-Server API.
- astrobase.services.mast: interface to the MAST catalogs at STScI and the TESS Input Catalog in particular.
- astrobase.services.simbad: interface to the CDS SIMBAD service.
- astrobase.services.skyview: interface to the NASA SkyView finder-chart and cutout service.
- astrobase.services.trilegal: interface to the Girardi TRILEGAL galaxy model forms and service.
- astrobase.services.limbdarkening: utilities to get stellar limb darkening coefficients for use during transit fitting.
- astrobase.services.identifiers: utilities to convert from SIMBAD object names to GAIA DR2 source identifiers and TESS Input Catalog IDs.
- astrobase.services.tesslightcurves: utilities to download various TESS light curve products from MAST.
- astrobase.services.alltesslightcurves: utilities to download all TESS light curve products from MAST for a given TIC ID.
For a much broader interface to online data services, use the astroquery package by A. Ginsburg, B. Sipocz, et al.:
http://astroquery.readthedocs.io
Submodules¶
astrobase.services.alltesslightcurves module¶
A tool for acquiring TESS light curves from a variety of pipelines. Wraps functions found in tesslightcurves.py.
-
astrobase.services.alltesslightcurves.
get_all_tess_lightcurves
(tic_id, pipelines=('CDIPS', 'PATHOS', 'TASOC', '2minSPOC', 'eleanor'), download_dir=None)[source]¶ Gets all possible TESS light curves for a TIC ID.
Parameters: - tic_id (str) – The TIC ID of the object to get all the light curves for.
- pipelines (list or tuple) –
The pipeline products to search for light curves of the given TIC ID. Must be one or more of the following:
['CDIPS', 'PATHOS', 'TASOC', '2minSPOC', 'eleanor']
- download_dir (str or None) – The directory to download the light curves to. If None, will download to the current directory.
Returns: lcfile_list – Returns a list of light curve files that were downloaded for the object.
Return type: list of str
astrobase.services.dust module¶
This gets extinction tables from the 2MASS DUST service at:
http://irsa.ipac.caltech.edu/applications/DUST/
If you use this, please cite the SF11 and SFD98 papers and acknowledge the use of 2MASS/IPAC services.
- http://www.adsabs.harvard.edu/abs/1998ApJ...500..525S
- http://www.adsabs.harvard.edu/abs/2011ApJ...737..103S
Also see:
http://irsa.ipac.caltech.edu/applications/DUST/docs/background.html
-
astrobase.services.dust.
extinction_query
(lon, lat, coordtype='equatorial', sizedeg=5.0, forcefetch=False, cachedir='~/.astrobase/dust-cache', verbose=True, timeout=10.0, jitter=5.0)[source]¶ This queries the 2MASS DUST service to find the extinction parameters for the given lon, lat.
Parameters: - lon,lat (float) – These are decimal right ascension and declination if coordtype = ‘equatorial’. These are decimal Galactic longitude and latitude if coordtype = ‘galactic’.
- coordtype ({'equatorial','galactic'}) – Sets the type of coordinates passed in as lon, lat.
- sizedeg (float) – This is the width of the image returned by the DUST service. This can usually be left as-is if you’re interested in the extinction only.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our request.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
Returns: A dict of the following form is returned:
{'Amag':{dict of extinction A_v values for several mag systems}, 'table': array containing the full extinction table, 'tablefile': the path to the full extinction table file on disk, 'provenance': 'cached' or 'new download', 'request': string repr of the request made to 2MASS DUST}
Return type: dict
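The interplay of cachedir, forcefetch, and the provenance key can be illustrated with a small cache keyed on a hash of the request parameters. This is a sketch under stated assumptions and not how astrobase necessarily stores its cache: the function cached_query, the key scheme, and the fake_fetch stand-in for the network request are all hypothetical.

```python
import hashlib
import json
import os
import tempfile

def cached_query(params, fetch, cachedir, forcefetch=False):
    """Return a cached result if one exists, otherwise fetch and store it."""
    os.makedirs(cachedir, exist_ok=True)
    # hypothetical key scheme: hash of the sorted request parameters
    key = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode('utf-8')
    ).hexdigest()
    path = os.path.join(cachedir, key + '.txt')
    if os.path.exists(path) and not forcefetch:
        provenance = 'cached'
    else:
        with open(path, 'w') as outfd:
            outfd.write(fetch(params))
        provenance = 'new download'
    return {'tablefile': path, 'provenance': provenance}

calls = []

def fake_fetch(params):
    """Stand-in for the network request; records each call."""
    calls.append(params)
    return 'placeholder extinction table'

with tempfile.TemporaryDirectory() as tmpdir:
    query = {'lon': 10.0, 'lat': -5.0, 'coordtype': 'equatorial'}
    first = cached_query(query, fake_fetch, tmpdir)
    second = cached_query(query, fake_fetch, tmpdir)
    third = cached_query(query, fake_fetch, tmpdir, forcefetch=True)
```

The second call is served from disk, while forcefetch=True repeats the request even though a cached result exists.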
astrobase.services.fortney2k7 module¶
This contains data from Fortney et al. 2007 on planet compositions, masses and radii. Also contains functions that return numpy arrays from these data based on specified input.
Requires numpy.
-
astrobase.services.fortney2k7.
massradius
(age, planetdist, coremass, mass='massjupiter', radius='radiusjupiter')[source]¶ This function gets the Fortney mass-radius relation for planets.
Parameters: - age (float) – This should be one of: 0.3, 1.0, 4.5 [in Gyr].
- planetdist (float) – This should be one of: 0.02, 0.045, 0.1, 1.0, 9.5 [in AU]
- coremass (int) – This should be one of: 0, 10, 25, 50, 100 [in Mearth]
- mass ({'massjupiter','massearth'}) – Sets the mass units.
- radius (str) – Sets the radius units. Only ‘radiusjupiter’ is used for now.
Returns: A dict of the following form is returned:
{'mass': an array containing the masses to plot, 'radius': an array containing the radii to plot}
These can be passed to a plotting routine to make a mass-radius plot for the specified age, planet-star distance, and core-mass.
Return type: dict
astrobase.services.gaia module¶
This queries the GAIA catalog for object lists in specified areas of the sky. The main use of this module is to generate realistic spatial distributions of stars for variability recovery simulations in combination with colors and luminosities from the TRILEGAL galaxy model.
If you use this module, please cite the GAIA papers as outlined at:
Much of this module is derived from the example given at:
http://gea.esac.esa.int/archive-help/commandline/index.html
For a more general and useful interface to the GAIA catalog, see the astroquery package by A. Ginsburg, B. Sipocz, et al.:
http://astroquery.readthedocs.io/en/latest/gaia/gaia.html
-
astrobase.services.gaia.
tap_query
(querystr, gaia_mirror=None, data_release='dr2', returnformat='csv', forcefetch=False, cachedir='~/.astrobase/gaia-cache', verbose=True, timeout=15.0, refresh=2.0, maxtimeout=300.0, maxtries=3, complete_query_later=False)[source]¶ This queries the GAIA TAP service using an ADQL query string.
Parameters: - querystr (str) – This is the ADQL query string. See: http://www.ivoa.net/documents/ADQL/2.0 for the specification and http://gea.esac.esa.int/archive-help/adql/index.html for GAIA-specific additions.
- gaia_mirror ({'gaia','heidelberg','vizier'} or None) – This is the key used to select a GAIA catalog mirror from the GAIA_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query. This provides hints for which table to use for the GAIA mirror being queried.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the GAIA catalog service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
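The relationship between refresh and maxtimeout can be sketched as a polling loop. The helper below is hypothetical and not part of astrobase; check_ready stands in for "is the result file available yet?", and the sleep function is injectable so the example runs instantly.

```python
import time

def wait_for_result(check_ready, refresh=2.0, maxtimeout=300.0,
                    sleep=time.sleep):
    """Poll check_ready() every `refresh` seconds until it returns True
    or `maxtimeout` seconds have been spent waiting."""
    waited = 0.0
    while waited < maxtimeout:
        if check_ready():
            return True, waited
        sleep(refresh)
        waited += refresh
    # one last check after the final wait
    return check_ready(), waited

attempts = {'count': 0}

def ready_on_third_check():
    attempts['count'] += 1
    return attempts['count'] >= 3

no_sleep = lambda seconds: None   # skip real waiting in this example
done, waited = wait_for_result(ready_on_third_check, refresh=2.0,
                               maxtimeout=10.0, sleep=no_sleep)
timed_out, waited_max = wait_for_result(lambda: False, refresh=2.0,
                                        maxtimeout=10.0, sleep=no_sleep)
```

When the loop gives up, a caller following the complete_query_later behavior described above would save the request parameters and result URL so a later call can resume the wait.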
-
astrobase.services.gaia.
objectlist_conesearch
(racenter, declcenter, searchradiusarcsec, gaia_mirror=None, data_release='dr2', columns=('source_id', 'ra', 'dec', 'phot_g_mean_mag', 'l', 'b', 'parallax', 'parallax_error', 'pmra', 'pmra_error', 'pmdec', 'pmdec_error'), extra_filter=None, returnformat='csv', forcefetch=False, cachedir='~/.astrobase/gaia-cache', verbose=True, timeout=15.0, refresh=2.0, maxtimeout=300.0, maxtries=3, complete_query_later=True)[source]¶ This queries the GAIA TAP service for a list of objects near the coords.
Runs a conesearch around (racenter, declcenter) with radius in arcsec of searchradiusarcsec.
Parameters: - racenter,declcenter (float) – The center equatorial coordinates in decimal degrees.
- searchradiusarcsec (float) – The search radius of the cone-search in arcseconds.
- gaia_mirror ({'gaia','heidelberg','vizier'} or None) – This is the key used to select a GAIA catalog mirror from the GAIA_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query.
- columns (sequence of str) – This indicates which columns from the GAIA table to request for the objects found within the search radius.
- extra_filter (str or None) – If this is provided, must be a valid ADQL filter string that is used to further filter the cone-search results.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the GAIA catalog service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
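A cone search of this kind can be expressed in ADQL with the CONTAINS, POINT, and CIRCLE functions. The builder below is a hedged sketch: the function name is hypothetical, the table name gaiadr2.gaia_source is assumed for DR2, and the string that astrobase actually sends may be formatted differently.

```python
def build_conesearch_adql(racenter, declcenter, searchradiusarcsec,
                          columns=('source_id', 'ra', 'dec', 'phot_g_mean_mag'),
                          table='gaiadr2.gaia_source', extra_filter=None):
    """Build an ADQL cone-search query string (illustrative only)."""
    radius_deg = searchradiusarcsec / 3600.0   # ADQL CIRCLE takes degrees
    query = (
        "SELECT {cols} FROM {table} "
        "WHERE CONTAINS(POINT('ICRS', ra, dec), "
        "CIRCLE('ICRS', {ra:.5f}, {decl:.5f}, {radius:.6f})) = 1"
    ).format(cols=', '.join(columns), table=table,
             ra=racenter, decl=declcenter, radius=radius_deg)
    if extra_filter:
        # extra_filter is appended as an additional ADQL condition
        query += " AND ({})".format(extra_filter)
    return query

adql = build_conesearch_adql(290.0, 45.0, 30.0,
                             extra_filter='phot_g_mean_mag < 15.0')
```

A string like this could be passed as querystr to tap_query.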
-
astrobase.services.gaia.
objectlist_radeclbox
(radeclbox, gaia_mirror=None, data_release='dr2', columns=('source_id', 'ra', 'dec', 'phot_g_mean_mag', 'l', 'b', 'parallax, parallax_error', 'pmra', 'pmra_error', 'pmdec', 'pmdec_error'), extra_filter=None, returnformat='csv', forcefetch=False, cachedir='~/.astrobase/gaia-cache', verbose=True, timeout=15.0, refresh=2.0, maxtimeout=300.0, maxtries=3, complete_query_later=True)[source]¶ This queries the GAIA TAP service for a list of objects in an equatorial coordinate box.
Parameters: - radeclbox (sequence of four floats) –
This defines the box to search in:
[ra_min, ra_max, decl_min, decl_max]
- gaia_mirror ({'gaia','heidelberg','vizier'} or None) – This is the key used to select a GAIA catalog mirror from the GAIA_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query.
- columns (sequence of str) – This indicates which columns from the GAIA table to request for the objects found within the search box.
- extra_filter (str or None) – If this is provided, must be a valid ADQL filter string that is used to further filter the box-search results.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the GAIA catalog service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
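ADQL describes a box by its center and its width and height in degrees, so a [ra_min, ra_max, decl_min, decl_max] sequence has to be converted first. The helper below is a hypothetical sketch of that conversion; it assumes the box does not straddle RA = 0/360 degrees and is not taken from astrobase.

```python
def radeclbox_to_adql_box(radeclbox):
    """Convert [ra_min, ra_max, decl_min, decl_max] to an ADQL BOX clause."""
    ra_min, ra_max, decl_min, decl_max = radeclbox
    ra_center = (ra_min + ra_max) / 2.0
    decl_center = (decl_min + decl_max) / 2.0
    width = ra_max - ra_min        # extent in RA, degrees
    height = decl_max - decl_min   # extent in Dec, degrees
    return "BOX('ICRS', {:.5f}, {:.5f}, {:.5f}, {:.5f})".format(
        ra_center, decl_center, width, height
    )

box_clause = radeclbox_to_adql_box([10.0, 12.0, -5.0, -3.0])
where_clause = "CONTAINS(POINT('ICRS', ra, dec), {}) = 1".format(box_clause)
```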
-
astrobase.services.gaia.
objectid_search
(gaiaid, gaia_mirror=None, data_release='dr2', columns=('source_id', 'ra', 'dec', 'phot_g_mean_mag', 'phot_bp_mean_mag', 'phot_rp_mean_mag', 'l', 'b', 'parallax, parallax_error', 'pmra', 'pmra_error', 'pmdec', 'pmdec_error'), returnformat='csv', forcefetch=False, cachedir='~/.astrobase/gaia-cache', verbose=True, timeout=15.0, refresh=2.0, maxtimeout=300.0, maxtries=3, complete_query_later=True)[source]¶ This queries the GAIA TAP service for a single GAIA source ID.
Parameters: - gaiaid (str) – The source ID of the object whose info will be collected.
- gaia_mirror ({'gaia','heidelberg','vizier'} or None) – This is the key used to select a GAIA catalog mirror from the GAIA_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query.
- columns (sequence of str) – This indicates which columns from the GAIA table to request for the objects found within the search radius.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the GAIA catalog service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
astrobase.services.lccs module¶
This contains functions to search for objects and get light curves from a Light Curve Collection server (https://github.com/waqasbhatti/lcc-server) using its HTTP API.
The LCC-Server requires an API key to access most services. The service functions in this module will automatically acquire an anonymous user API key on first use (and upon API key expiry afterwards). If you sign up for an LCC-Server user account, you can import the API key generated for that account on the user home page. To do this, use the import_apikey function in this module.
This currently supports the following LCC-Server services:
conesearch : cone_search(lcc_server_url, center_ra, center_decl, ...)
ftsquery : fulltext_search(lcc_server_url, searchtxt, sesame=False, ...)
columnsearch : column_search(lcc_server_url, filters, ...)
xmatch : xmatch_search(lcc_server_url, file_to_upload, ...)
The functions above will download the data products (data table CSVs, light curve ZIP files) of the search results automatically, or in case the query takes too long, will return within a configurable timeout. The query information is cached to ~/.astrobase/lccs, and can be used to download data products for long-running queries later.
The functions below support various auxiliary LCC services:
get-dataset : get_dataset(lcc_server_url, dataset_id)
objectinfo : object_info(lcc_server_url, objectid, collection, ...)
dataset-list : list_recent_datasets(lcc_server_url, nrecent=25, ...)
collections : list_lc_collections(lcc_server_url)
-
astrobase.services.lccs.
check_existing_apikey
(lcc_server)[source]¶ This validates if an API key for the specified LCC-Server is available.
API keys are stored using the following file scheme:
~/.astrobase/lccs/apikey-domain.of.lccserver.org
e.g. for the HAT LCC-Server at https://data.hatsurveys.org:
~/.astrobase/lccs/apikey-https-data.hatsurveys.org
Parameters: lcc_server (str) – The base URL of the LCC-Server for which the existence of API keys will be checked.
Returns: (apikey_ok, apikey_str, expiry) – The returned tuple contains the status of the API key, the API key itself if present, and its expiry date if present.
Return type: tuple
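The file naming scheme in the example above can be reproduced from a server URL as follows. This is a sketch based only on the documented example (scheme and host joined by a hyphen); the helper name apikey_path is hypothetical and astrobase may build the name differently.

```python
import os
from urllib.parse import urlparse

def apikey_path(lcc_server, cachedir='~/.astrobase/lccs'):
    """Return the expected API key file path for an LCC-Server base URL."""
    parsed = urlparse(lcc_server)
    # e.g. https://data.hatsurveys.org -> apikey-https-data.hatsurveys.org
    fname = 'apikey-{}-{}'.format(parsed.scheme, parsed.netloc)
    return os.path.join(os.path.expanduser(cachedir), fname)

hat_keyfile = apikey_path('https://data.hatsurveys.org')
```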
-
astrobase.services.lccs.
get_new_apikey
(lcc_server)[source]¶ This gets a new API key from the specified LCC-Server.
NOTE: this only gets an anonymous API key. To get an API key tied to a user account (and associated privilege level), see the import_apikey function below.
Parameters: lcc_server (str) – The base URL of the LCC-Server from where the API key will be fetched.
Returns: (apikey, expiry) – This returns a tuple with the API key and its expiry date.
Return type: tuple
-
astrobase.services.lccs.
import_apikey
(lcc_server, apikey_json)[source]¶ This imports an API key from text and writes it to the cache dir.
Use this with the JSON file downloaded from the API key download link on your LCC-Server user home page. The API key will thus be tied to the privileges of that user account and can then access objects, datasets, and collections marked as private for the user only or shared with that user.
Parameters: - lcc_server (str) – The base URL of the LCC-Server to get the API key for.
- apikey_json (str) – The JSON string from the API key text box on the user’s LCC-Server home page at lcc_server/users/home.
Returns: (apikey, expiry) – This returns a tuple with the API key and its expiry date.
Return type: tuple
-
astrobase.services.lccs.
submit_post_searchquery
(url, data, apikey)[source]¶ This submits a POST query to an LCC-Server search API endpoint.
Handles streaming of the results, and returns the final JSON stream. Also handles results that time out.
Parameters: - url (str) – The URL of the search API endpoint to hit. This is something like https://data.hatsurveys.org/api/conesearch
- data (dict) – A dict of the search query parameters to pass to the search service.
- apikey (str) – The API key to use to access the search service. API keys are required for all POST requests made to an LCC-Server’s API endpoints.
Returns: (status_flag, data_dict, dataset_id) – This returns a tuple containing the status of the request: (‘complete’, ‘failed’, ‘background’, etc.), a dict parsed from the JSON result of the request, and a dataset ID, which can be used to reconstruct the URL on the LCC-Server where the results can be browsed.
Return type: tuple
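As a rough illustration of handling the returned tuple, the sketch below unpacks (status_flag, data_dict, dataset_id) and branches on the status values named above. The helper name describe_search_result and the message wording are hypothetical, and the example tuple is a stand-in rather than real server output.

```python
def describe_search_result(result):
    """Summarize a (status_flag, data_dict, dataset_id) tuple.

    The tuple layout follows the Returns description above; this helper
    and its messages are illustrative and not part of astrobase.
    """
    status_flag, data_dict, dataset_id = result

    if status_flag == 'complete':
        return 'dataset %s is ready' % dataset_id
    if status_flag == 'background':
        return 'dataset %s is not finished yet' % dataset_id
    if status_flag == 'failed':
        return 'query failed'

    # the docs list the statuses with an "etc.", so keep a fallback branch
    return 'unhandled status: %s' % status_flag


# stand-in tuple for illustration; a real one would come from
# submit_post_searchquery(url, data, apikey)
example_result = ('background', {'message': 'placeholder'}, 'abc123')
print(describe_search_result(example_result))
```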
-
astrobase.services.lccs.
retrieve_dataset_files
(searchresult, getpickle=False, outdir=None, apikey=None)[source]¶ This retrieves a search result dataset’s CSV and any LC zip files.
Takes the output from the submit_post_searchquery function above or a pickle file generated from that function’s output if the query timed out.
Parameters: - searchresult (str or tuple) – If provided as a str, points to the pickle file created using the output from the submit_post_searchquery function. If provided as a tuple, this is the result tuple from the submit_post_searchquery function.
- getpickle (bool) – If this is True, will also download the dataset’s pickle. Note that LCC-Server is a Python 3.6+ package (while lccs.py still works with Python 2.7) and it saves its pickles using pickle.HIGHEST_PROTOCOL for efficiency, so these pickles may be unreadable in older Python versions. As an alternative, the dataset CSV contains the full data table and all the information about the dataset in its header, which is JSON parseable. You can also use the function get_dataset below to get the dataset pickle information in JSON form.
- outdir (None or str) – If this is a str, points to the output directory where the results will be placed. If it’s None, they will be placed in the current directory.
- apikey (str or None) – If this is a str, uses the given API key to authenticate the download request. This is useful when you have a private dataset you want to get products for.
Returns: (local_dataset_csv, local_dataset_lczip, local_dataset_pickle) – This returns a tuple containing paths to the dataset CSV, LC zipfile, and the dataset pickle if getpickle was set to True (None otherwise).
Return type: tuple
-
astrobase.services.lccs.
cone_search
(lcc_server, center_ra, center_decl, radiusarcmin=5.0, result_visibility='unlisted', email_when_done=False, collections=None, columns=None, filters=None, sortspec=None, samplespec=None, limitspec=None, download_data=True, outdir=None, maxtimeout=300.0, refresh=15.0)[source]¶ This runs a cone-search query.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to. (e.g. for HAT, use: https://data.hatsurveys.org)
- center_ra,center_decl (float) –
These are the central coordinates of the search to conduct. These can be either decimal degrees of type float, or sexagesimal coordinates of type str:
- OK: 290.0, 45.0
- OK: 15:00:00 +45:00:00
- OK: 15 00 00.0 -45 00 00.0
- NOT OK: 290.0 +45:00:00
- NOT OK: 15:00:00 45.0
- radiusarcmin (float) – This is the search radius to use for the cone-search. This is in arcminutes. The maximum radius you can use is 60 arcminutes = 1 degree.
- result_visibility ({'private', 'unlisted', 'public'}) –
This sets the visibility of the dataset produced from the search result:
'private' -> the dataset and its products are not visible or accessible by any user other than the one that created the dataset.
'unlisted' -> the dataset and its products are not visible in the list of public datasets, but can be accessed if the dataset URL is known.
'public' -> the dataset and its products are visible in the list of public datasets and can be accessed by anyone.
- email_when_done (bool) – If True, the LCC-Server will email you when the search is complete. This will also set download_data to False. Using this requires an LCC-Server account and an API key tied to that account.
- collections (list of str or None) – This is a list of LC collections to search in. If this is None, all collections will be searched.
- columns (list of str or None) – This is a list of columns to return in the results. Matching objects’ object IDs, RAs, DECs, and links to light curve files will always be returned so there is no need to specify these columns. If None, only these columns will be returned: ‘objectid’, ‘ra’, ‘decl’, ‘lcfname’
- filters (str or None) –
This is an SQL-like string to use to filter on database columns in the LCC-Server’s collections. To see the columns available for a search, visit the Collections tab in the LCC-Server’s browser UI. The filter operators allowed are:
lt -> less than
gt -> greater than
ge -> greater than or equal to
le -> less than or equal to
eq -> equal to
ne -> not equal to
ct -> contains text
isnull -> column value is null
notnull -> column value is not null
You may use the 'and' and 'or' operators between filter specifications to chain them together logically.
Example filter strings:
"(propermotion gt 200.0) and (sdssr lt 11.0)"
"(dered_jmag_kmag gt 2.0) and (aep_000_stetsonj gt 10.0)"
"(gaia_status ct 'ok') and (propermotion gt 300.0)"
"(simbad_best_objtype ct 'RR') and (dered_sdssu_sdssg lt 0.5)"
- sortspec (tuple of two strs or None) –
If not None, this should be a tuple of two items:
('column to sort by', 'asc|desc')
This sets the column to sort the results by. For cone_search, the default column and sort order are ‘dist_arcsec’ and ‘asc’, meaning the distance from the search center in ascending order.
- samplespec (int or None) – If this is an int, will indicate how many rows from the initial search result will be sampled uniformly at random and returned.
- limitspec (int or None) –
If this is an int, will indicate how many rows from the initial search result to return in total.
sortspec, samplespec, and limitspec are applied in this order:
sample -> sort -> limit
- download_data (bool) –
This sets whether the accompanying data from the search results will be downloaded automatically. This includes the data table CSV, the dataset pickle file, and a light curve ZIP file. Note that if the search service indicates that your query is still in progress, this function will block until the light curve ZIP file becomes available. The maximum wait time in seconds is set by maxtimeout and the refresh interval is set by refresh.
To avoid the wait block, set download_data to False and the function will write a pickle file to ~/.astrobase/lccs/query-[setid].pkl containing all the information necessary to retrieve these data files later when the query is done. To do so, call the retrieve_dataset_files function with the path to this pickle file (it will be returned).
- outdir (str or None) – If this is provided, sets the output directory of the downloaded dataset files. If None, they will be downloaded to the current directory.
- maxtimeout (float) – The maximum time in seconds to wait for the LCC-Server to respond with a result before timing out. You can use the retrieve_dataset_files function to get results later as needed.
- refresh (float) – The time to wait in seconds before pinging the LCC-Server to see if a search query has completed and dataset result files can be downloaded.
Returns: Returns a tuple with the following elements:
(search result status dict, search result CSV file path, search result LC ZIP path)
Return type: tuple
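The coordinate and radius rules listed for center_ra, center_decl, and radiusarcmin can be checked locally before a query is sent. The following is a minimal sketch under stated assumptions: check_cone_center is a hypothetical helper (not part of astrobase), and the regular expression is only one plausible reading of the sexagesimal forms shown in the OK / NOT OK list.

```python
import re

# three numeric fields separated by ':' or a space, with an optional sign
# and optional fractional seconds (an assumed reading of the examples above)
_SEXAGESIMAL = re.compile(r'^[+-]?\d{1,2}[: ]\d{1,2}[: ]\d{1,2}(\.\d+)?$')


def _is_sexagesimal(value):
    return isinstance(value, str) and bool(_SEXAGESIMAL.match(value.strip()))


def check_cone_center(center_ra, center_decl, radiusarcmin=5.0):
    """Return 'decimal' or 'sexagesimal', or raise ValueError.

    Hypothetical pre-flight check mirroring the documented rules: both
    coordinates must be floats or both must be sexagesimal strings, and
    the radius may not exceed 60 arcminutes.
    """
    if radiusarcmin > 60.0:
        raise ValueError('radiusarcmin may not exceed 60 arcminutes')

    if isinstance(center_ra, float) and isinstance(center_decl, float):
        return 'decimal'
    if _is_sexagesimal(center_ra) and _is_sexagesimal(center_decl):
        return 'sexagesimal'

    raise ValueError('use two floats or two sexagesimal strings, not a mix')


print(check_cone_center(290.0, 45.0))
print(check_cone_center('15 00 00.0', '-45 00 00.0'))
```

A call such as cone_search(lcc_server, center_ra, center_decl, radiusarcmin=radiusarcmin) would then follow only if the check passes.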
-
astrobase.services.lccs.
fulltext_search
(lcc_server, searchterm, sesame_lookup=False, result_visibility='unlisted', email_when_done=False, collections=None, columns=None, filters=None, sortspec=None, samplespec=None, limitspec=None, download_data=True, outdir=None, maxtimeout=300.0, refresh=15.0)[source]¶ This runs a full-text search query.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to. (e.g. for HAT, use: https://data.hatsurveys.org)
- searchterm (str) – This is the term to look for in a full-text search of the LCC-Server’s collections. This can be an object name, tag, description, etc., as noted in the LCC-Server’s full-text search tab in its browser UI. To search for an exact match to a string (like an object name), you can add double quotes around the string, e.g. searchterm = '"exact match to me needed"'.
- sesame_lookup (bool) – If True, means the LCC-Server will assume the provided search term is a single object’s name, look up its coordinates using the CDS SIMBAD SESAME name resolution service, and then search the LCC-Server for any matching objects. The object name can be either a star name known to SIMBAD, or it can be an extended source name (e.g. an open cluster or nebula). In the first case, a search radius of 5 arcseconds will be used. In the second case, a search radius of 1 degree will be used to find all nearby database objects associated with an extended source name.
- result_visibility ({'private', 'unlisted', 'public'}) –
This sets the visibility of the dataset produced from the search result:
'private' -> the dataset and its products are not visible or accessible by any user other than the one that created the dataset.
'unlisted' -> the dataset and its products are not visible in the list of public datasets, but can be accessed if the dataset URL is known.
'public' -> the dataset and its products are visible in the list of public datasets and can be accessed by anyone.
- email_when_done (bool) – If True, the LCC-Server will email you when the search is complete. This will also set download_data to False. Using this requires an LCC-Server account and an API key tied to that account.
- collections (list of str or None) – This is a list of LC collections to search in. If this is None, all collections will be searched.
- columns (list of str or None) – This is a list of columns to return in the results. Matching objects’ object IDs, RAs, DECs, and links to light curve files will always be returned so there is no need to specify these columns. If None, only these columns will be returned: ‘objectid’, ‘ra’, ‘decl’, ‘lcfname’
- filters (str or None) –
This is an SQL-like string to use to filter on database columns in the LCC-Server’s collections. To see the columns available for a search, visit the Collections tab in the LCC-Server’s browser UI. The filter operators allowed are:
lt -> less than
gt -> greater than
ge -> greater than or equal to
le -> less than or equal to
eq -> equal to
ne -> not equal to
ct -> contains text
isnull -> column value is null
notnull -> column value is not null
You may use the 'and' and 'or' operators between filter specifications to chain them together logically.
Example filter strings:
"(propermotion gt 200.0) and (sdssr lt 11.0)"
"(dered_jmag_kmag gt 2.0) and (aep_000_stetsonj gt 10.0)"
"(gaia_status ct 'ok') and (propermotion gt 300.0)"
"(simbad_best_objtype ct 'RR') and (dered_sdssu_sdssg lt 0.5)"
- sortspec (tuple of two strs or None) –
If not None, this should be a tuple of two items:
('column to sort by', 'asc|desc')
This sets the column to sort the results by. For cone_search, the default column and sort order are ‘dist_arcsec’ and ‘asc’, meaning the distance from the search center in ascending order.
- samplespec (int or None) – If this is an int, will indicate how many rows from the initial search result will be sampled uniformly at random and returned.
- limitspec (int or None) –
If this is an int, will indicate how many rows from the initial search result to return in total.
sortspec, samplespec, and limitspec are applied in this order:
sample -> sort -> limit
- download_data (bool) –
This sets whether the accompanying data from the search results will be downloaded automatically. This includes the data table CSV, the dataset pickle file, and a light curve ZIP file. Note that if the search service indicates that your query is still in progress, this function will block until the light curve ZIP file becomes available. The maximum wait time in seconds is set by maxtimeout and the refresh interval is set by refresh.
To avoid the wait block, set download_data to False and the function will write a pickle file to ~/.astrobase/lccs/query-[setid].pkl containing all the information necessary to retrieve these data files later when the query is done. To do so, call the retrieve_dataset_files function with the path to this pickle file (it will be returned).
- outdir (str or None) – If this is provided, sets the output directory of the downloaded dataset files. If None, they will be downloaded to the current directory.
- maxtimeout (float) – The maximum time in seconds to wait for the LCC-Server to respond with a result before timing out. You can use the retrieve_dataset_files function to get results later as needed.
- refresh (float) – The time to wait in seconds before pinging the LCC-Server to see if a search query has completed and dataset result files can be downloaded.
Returns: Returns a tuple with the following elements:
(search result status dict, search result CSV file path, search result LC ZIP path)
Return type: tuple
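The order in which samplespec, sortspec, and limitspec are applied (sample -> sort -> limit) affects which rows come back. The sketch below reproduces that order on a local list of dict rows purely for illustration; apply_specs is a hypothetical helper, and the LCC-Server's own implementation is not shown in this documentation and will differ.

```python
import random


def apply_specs(rows, sortspec=None, samplespec=None, limitspec=None,
                seed=None):
    """Apply sample -> sort -> limit to a list of dict rows, in that order.

    Local illustration only; `seed` is an extra argument added here so the
    random sample can be made repeatable.
    """
    rows = list(rows)

    # 1. sample: pick rows uniformly at random, without replacement
    if samplespec is not None:
        rng = random.Random(seed)
        rows = rng.sample(rows, min(samplespec, len(rows)))

    # 2. sort: ('column to sort by', 'asc' or 'desc')
    if sortspec is not None:
        column, order = sortspec
        rows = sorted(rows, key=lambda row: row[column],
                      reverse=(order == 'desc'))

    # 3. limit: keep at most limitspec rows of what remains
    if limitspec is not None:
        rows = rows[:limitspec]

    return rows


demo_rows = [{'sdssr': float(i)} for i in range(10)]
print(apply_specs(demo_rows, sortspec=('sdssr', 'desc'), limitspec=3))
```

Because sampling happens first, a limit applied after sorting returns the top rows of the random sample, not the top rows of the full search result.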
-
astrobase.services.lccs.
column_search
(lcc_server, filters, result_visibility='unlisted', email_when_done=False, collections=None, columns=None, sortspec=('sdssr', 'asc'), samplespec=None, limitspec=None, download_data=True, outdir=None, maxtimeout=300.0, refresh=15.0)[source]¶ This runs a column search query.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to. (e.g. for HAT, use: https://data.hatsurveys.org)
- filters (str or None) –
This is an SQL-like string to use to filter on database columns in the LCC-Server’s collections. To see the columns available for a search, visit the Collections tab in the LCC-Server’s browser UI. The filter operators allowed are:
lt -> less than
gt -> greater than
ge -> greater than or equal to
le -> less than or equal to
eq -> equal to
ne -> not equal to
ct -> contains text
isnull -> column value is null
notnull -> column value is not null
You may use the 'and' and 'or' operators between filter specifications to chain them together logically.
Example filter strings:
"(propermotion gt 200.0) and (sdssr lt 11.0)"
"(dered_jmag_kmag gt 2.0) and (aep_000_stetsonj gt 10.0)"
"(gaia_status ct 'ok') and (propermotion gt 300.0)"
"(simbad_best_objtype ct 'RR') and (dered_sdssu_sdssg lt 0.5)"
- result_visibility ({'private', 'unlisted', 'public'}) –
This sets the visibility of the dataset produced from the search result:
'private' -> the dataset and its products are not visible or accessible by any user other than the one that created the dataset.
'unlisted' -> the dataset and its products are not visible in the list of public datasets, but can be accessed if the dataset URL is known.
'public' -> the dataset and its products are visible in the list of public datasets and can be accessed by anyone.
- email_when_done (bool) – If True, the LCC-Server will email you when the search is complete. This will also set download_data to False. Using this requires an LCC-Server account and an API key tied to that account.
- collections (list of str or None) – This is a list of LC collections to search in. If this is None, all collections will be searched.
- columns (list of str or None) – This is a list of columns to return in the results. Matching objects’ object IDs, RAs, DECs, and links to light curve files will always be returned so there is no need to specify these columns. If None, only these columns will be returned: ‘objectid’, ‘ra’, ‘decl’, ‘lcfname’
- sortspec (tuple of two strs or None) –
If not None, this should be a tuple of two items:
('column to sort by', 'asc|desc')
This sets the column to sort the results by. For column_search, the default given in the signature above is ('sdssr', 'asc'). For cone_search, the default column and sort order are ‘dist_arcsec’ and ‘asc’, meaning the distance from the search center in ascending order.
- samplespec (int or None) – If this is an int, will indicate how many rows from the initial search result will be sampled uniformly at random and returned.
- limitspec (int or None) –
If this is an int, will indicate how many rows from the initial search result to return in total.
sortspec, samplespec, and limitspec are applied in this order:
sample -> sort -> limit
- download_data (bool) –
This sets whether the accompanying data from the search results will be downloaded automatically. This includes the data table CSV, the dataset pickle file, and a light curve ZIP file. Note that if the search service indicates that your query is still in progress, this function will block until the light curve ZIP file becomes available. The maximum wait time in seconds is set by maxtimeout and the refresh interval is set by refresh.
To avoid the wait block, set download_data to False and the function will write a pickle file to ~/.astrobase/lccs/query-[setid].pkl containing all the information necessary to retrieve these data files later when the query is done. To do so, call the retrieve_dataset_files function with the path to this pickle file (it will be returned).
- outdir (str or None) – If this is provided, sets the output directory of the downloaded dataset files. If None, they will be downloaded to the current directory.
- maxtimeout (float) – The maximum time in seconds to wait for the LCC-Server to respond with a result before timing out. You can use the retrieve_dataset_files function to get results later as needed.
- refresh (float) – The time to wait in seconds before pinging the LCC-Server to see if a search query has completed and dataset result files can be downloaded.
Returns: Returns a tuple with the following elements:
(search result status dict, search result CSV file path, search result LC ZIP path)
Return type: tuple
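Since column_search takes its query entirely through the filters string, it can help to assemble that string programmatically. The sketch below builds strings in the same shape as the example filter strings in the filters description. make_filter and join_filters are hypothetical helpers, and the form used for isnull and notnull (column followed by the operator, with no value) is an assumption, since the examples do not show those operators in use.

```python
ALLOWED_OPERATORS = {'lt', 'gt', 'ge', 'le', 'eq', 'ne', 'ct',
                     'isnull', 'notnull'}


def make_filter(column, operator, value=None):
    """Build one parenthesized filter specification.

    Hypothetical helper; only the operators listed in the docs are accepted.
    """
    if operator not in ALLOWED_OPERATORS:
        raise ValueError('unknown filter operator: %s' % operator)

    # assumed form for the null checks: no value after the operator
    if operator in ('isnull', 'notnull'):
        return '(%s %s)' % (column, operator)

    # text values are single-quoted, as in the "ct" examples
    if isinstance(value, str):
        value = "'%s'" % value

    return '(%s %s %s)' % (column, operator, value)


def join_filters(filters, logic='and'):
    """Chain filter specifications with 'and' or 'or'."""
    if logic not in ('and', 'or'):
        raise ValueError("logic must be 'and' or 'or'")
    return (' %s ' % logic).join(filters)


filters = join_filters([
    make_filter('propermotion', 'gt', 200.0),
    make_filter('sdssr', 'lt', 11.0),
])
print(filters)
```

The resulting string could then be passed as the filters argument, for example column_search(lcc_server, filters).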
-
astrobase.services.lccs.
xmatch_search
(lcc_server, file_to_upload, xmatch_dist_arcsec=3.0, result_visibility='unlisted', email_when_done=False, collections=None, columns=None, filters=None, sortspec=None, limitspec=None, samplespec=None, download_data=True, outdir=None, maxtimeout=300.0, refresh=15.0)[source]¶ This runs a cross-match search query.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to. (e.g. for HAT, use: https://data.hatsurveys.org)
- file_to_upload (str) –
This is the path to a text file containing objectid, RA, declination rows for the objects to cross-match against the LCC-Server collections. This should follow the format of the following example:
# example object and coordinate list
# objectid ra dec
aaa 289.99698 44.99839
bbb 293.358 -23.206
ccc 294.197 +23.181
ddd 19 25 27.9129 +42 47 03.693
eee 19:25:27 -42:47:03.21
# .
# .
# .
# etc. lines starting with '#' will be ignored
# (max 5000 objects)
- xmatch_dist_arcsec (float) – This is the maximum distance in arcseconds to consider when cross-matching objects in the uploaded file to the LCC-Server’s collections. The maximum allowed distance is 30 arcseconds. Multiple matches to an uploaded object are possible and will be returned in order of increasing distance grouped by input objectid.
- result_visibility ({'private', 'unlisted', 'public'}) –
This sets the visibility of the dataset produced from the search result:
'private' -> the dataset and its products are not visible or accessible by any user other than the one that created the dataset.
'unlisted' -> the dataset and its products are not visible in the list of public datasets, but can be accessed if the dataset URL is known.
'public' -> the dataset and its products are visible in the list of public datasets and can be accessed by anyone.
- email_when_done (bool) – If True, the LCC-Server will email you when the search is complete. This will also set download_data to False. Using this requires an LCC-Server account and an API key tied to that account.
- collections (list of str or None) – This is a list of LC collections to search in. If this is None, all collections will be searched.
- columns (list of str or None) – This is a list of columns to return in the results. Matching objects’ object IDs, RAs, DECs, and links to light curve files will always be returned so there is no need to specify these columns. If None, only these columns will be returned: ‘objectid’, ‘ra’, ‘decl’, ‘lcfname’
- filters (str or None) –
This is an SQL-like string to use to filter on database columns in the LCC-Server’s collections. To see the columns available for a search, visit the Collections tab in the LCC-Server’s browser UI. The filter operators allowed are:
lt -> less than
gt -> greater than
ge -> greater than or equal to
le -> less than or equal to
eq -> equal to
ne -> not equal to
ct -> contains text
isnull -> column value is null
notnull -> column value is not null
You may use the 'and' and 'or' operators between filter specifications to chain them together logically.
Example filter strings:
"(propermotion gt 200.0) and (sdssr lt 11.0)"
"(dered_jmag_kmag gt 2.0) and (aep_000_stetsonj gt 10.0)"
"(gaia_status ct 'ok') and (propermotion gt 300.0)"
"(simbad_best_objtype ct 'RR') and (dered_sdssu_sdssg lt 0.5)"
- sortspec (tuple of two strs or None) –
If not None, this should be a tuple of two items:
('column to sort by', 'asc|desc')
This sets the column to sort the results by. For cone_search, the default column and sort order are ‘dist_arcsec’ and ‘asc’, meaning the distance from the search center in ascending order.
- samplespec (int or None) – If this is an int, will indicate how many rows from the initial search result will be sampled uniformly at random and returned.
- limitspec (int or None) –
If this is an int, will indicate how many rows from the initial search result to return in total.
sortspec, samplespec, and limitspec are applied in this order:
sample -> sort -> limit
- download_data (bool) –
This sets whether the accompanying data from the search results will be downloaded automatically. This includes the data table CSV, the dataset pickle file, and a light curve ZIP file. Note that if the search service indicates that your query is still in progress, this function will block until the light curve ZIP file becomes available. The maximum wait time in seconds is set by maxtimeout and the refresh interval is set by refresh.
To avoid the wait block, set download_data to False and the function will write a pickle file to ~/.astrobase/lccs/query-[setid].pkl containing all the information necessary to retrieve these data files later when the query is done. To do so, call the retrieve_dataset_files function with the path to this pickle file (it will be returned).
- outdir (str or None) – If this is provided, sets the output directory of the downloaded dataset files. If None, they will be downloaded to the current directory.
- maxtimeout (float) – The maximum time in seconds to wait for the LCC-Server to respond with a result before timing out. You can use the retrieve_dataset_files function to get results later as needed.
- refresh (float) – The time to wait in seconds before pinging the LCC-Server to see if a search query has completed and dataset result files can be downloaded.
Returns: Returns a tuple with the following elements:
(search result status dict, search result CSV file path, search result LC ZIP path)
Return type: tuple
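The file_to_upload format described above is simple enough to generate from an in-memory object list. The following is a minimal sketch, assuming one whitespace-separated row per object as in the example file; write_xmatch_file is a hypothetical helper, and the 5000-object cap is taken from the comment in that example.

```python
import os
import tempfile


def write_xmatch_file(path, objects, max_objects=5000):
    """Write (objectid, ra, decl) rows as a cross-match upload file.

    Hypothetical helper; coordinates may be decimal degrees or sexagesimal
    strings, written out exactly as given.
    """
    objects = list(objects)
    if len(objects) > max_objects:
        raise ValueError('at most %d objects are allowed' % max_objects)

    with open(path, 'w') as outfd:
        # lines starting with '#' are ignored by the server
        outfd.write('# objectid ra dec\n')
        for objectid, ra, decl in objects:
            outfd.write('%s %s %s\n' % (objectid, ra, decl))

    return path


example_path = os.path.join(tempfile.mkdtemp(), 'xmatch-input.txt')
write_xmatch_file(example_path, [
    ('aaa', 289.99698, 44.99839),
    ('eee', '19:25:27', '-42:47:03.21'),
])
with open(example_path) as infd:
    print(infd.read())
```

The path returned here is what would be passed as file_to_upload, with xmatch_dist_arcsec kept at or below the documented 30 arcsecond maximum.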
-
astrobase.services.lccs.
get_dataset
(lcc_server, dataset_id, strformat=False, page=1)[source]¶ This downloads a JSON form of a dataset from the specified lcc_server.
If the dataset contains more than 1000 rows, it will be paginated, so you must use the page kwarg to get the page you want. The dataset JSON will contain the keys ‘npages’, ‘currpage’, and ‘rows_per_page’ to help with this. The ‘rows’ key contains the actual data rows as a list of tuples.
The JSON contains metadata about the query that produced the dataset, information about the data table’s columns, and links to download the dataset’s products including the light curve ZIP and the dataset CSV.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to.
- dataset_id (str) – This is the unique setid of the dataset you want to get. In the results from the *_search functions above, this is the value of the infodict[‘result’][‘setid’] key in the first item (the infodict) in the returned tuple.
- strformat (bool) – This sets if you want the returned data rows to be formatted in their string representations already. This can be useful if you’re piping the returned JSON straight into some sort of UI and you don’t want to deal with formatting floats, etc. To do this manually when strformat is set to False, look at the coldesc item in the returned dict, which gives the Python and Numpy string format specifiers for each column in the data table.
- page (int) – This sets which page of the dataset should be retrieved.
Returns: This returns the dataset JSON loaded into a dict.
Return type: dict
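For datasets that span several pages, the 'npages' and 'rows' keys described above are enough to collect every row. The sketch below takes any callable that returns one page as a dict, so it can be exercised without a server; collect_all_rows and fake_fetch are hypothetical names, and fake_fetch only imitates the documented keys.

```python
def collect_all_rows(fetch_page):
    """Gather the 'rows' of every page of a paginated dataset.

    `fetch_page(page)` must return a dict with at least the 'npages' and
    'rows' keys, as described for the dataset JSON above.
    """
    first = fetch_page(1)
    rows = list(first['rows'])

    # pages are numbered from 1, and the first one is already in hand
    for page in range(2, first['npages'] + 1):
        rows.extend(fetch_page(page)['rows'])

    return rows


def fake_fetch(page):
    """Stand-in for a server call, returning one tiny page per request."""
    pages = {1: [('a', 1.0)], 2: [('b', 2.0)], 3: [('c', 3.0)]}
    return {'npages': 3, 'currpage': page, 'rows_per_page': 1,
            'rows': pages[page]}


all_rows = collect_all_rows(fake_fetch)
print(all_rows)
```

With the real function, the callable could be something like lambda page: get_dataset(lcc_server, dataset_id, page=page).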
-
astrobase.services.lccs.
object_info
(lcc_server, objectid, db_collection_id)[source]¶ This gets information on a single object from the LCC-Server.
Returns a dict with all of the available information on an object, including finding charts, comments, object type and variability tags, and period-search results (if available).
If you have an LCC-Server API key present in ~/.astrobase/lccs/ that is associated with an LCC-Server user account, objects that are visible to this user will be returned, even if they are not visible to the public. Use this to look up objects that have been marked as ‘private’ or ‘shared’.
NOTE: you can pass the result dict returned by this function directly into the astrobase.checkplot.checkplot_pickle_to_png function, e.g.:
astrobase.checkplot.checkplot_pickle_to_png(result_dict, 'object-%s-info.png' % result_dict['objectid'])
to generate a quick PNG overview of the object information.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to.
- objectid (str) – This is the unique database ID of the object to retrieve info for. This is always returned as the db_oid column in LCC-Server search results.
- db_collection_id (str) – This is the collection ID which will be searched for the object. This is always returned as the collection column in LCC-Server search results.
Returns: A dict containing the object info is returned. Some important items in the result dict:
- objectinfo: all object magnitude, color, GAIA cross-match, and object type information available for this object
- objectcomments: comments on the object’s variability if available
- varinfo: variability comments, variability features, type tags, period and epoch information if available
- neighbors: information on the neighboring objects of this object in its parent light curve collection
- xmatch: information on any cross-matches to external catalogs (e.g. KIC, EPIC, TIC, APOGEE, etc.)
- finderchart: a base-64 encoded PNG image of the object’s DSS2 RED finder chart. To convert this to an actual PNG, try the function: astrobase.checkplot.pkl_io._b64_to_file.
- magseries: a base-64 encoded PNG image of the object’s light curve. To convert this to an actual PNG, try the function: astrobase.checkplot.pkl_io._b64_to_file.
- pfmethods: a list of period-finding methods applied to the object if any. If this list is present, use the keys in it to get to the actual period-finding results for each method. These will contain base-64 encoded PNGs of the periodogram and phased light curves using the best three peaks in the periodogram, as well as period and epoch information.
Return type: dict
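The finderchart and magseries items are base-64 encoded PNGs, so they can also be written to disk with the standard library alone. This is a sketch under the assumption that the value is a plain base-64 string with no data-URI prefix; b64_png_to_file is a hypothetical helper and is not the astrobase function mentioned above. The input here is a stand-in built from the PNG file signature, not a real finder chart.

```python
import base64
import os
import tempfile

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def b64_png_to_file(b64_text, outpath):
    """Decode a base-64 string and write the bytes to `outpath`."""
    png_bytes = base64.b64decode(b64_text)
    with open(outpath, 'wb') as outfd:
        outfd.write(png_bytes)
    return outpath


# stand-in for result_dict['finderchart']: signature plus filler bytes
fake_finderchart = base64.b64encode(
    PNG_SIGNATURE + b'placeholder bytes'
).decode('ascii')

demo_path = b64_png_to_file(
    fake_finderchart,
    os.path.join(tempfile.mkdtemp(), 'finderchart.png')
)
print(demo_path)
```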
-
astrobase.services.lccs.
list_recent_datasets
(lcc_server, nrecent=25)[source]¶ This lists recent publicly visible datasets available on the LCC-Server.
If you have an LCC-Server API key present in ~/.astrobase/lccs/ that is associated with an LCC-Server user account, datasets that belong to this user will be returned as well, even if they are not visible to the public.
Parameters: - lcc_server (str) – This is the base URL of the LCC-Server to talk to.
- nrecent (int) – This indicates how many recent public datasets you want to list. This is always capped at 1000.
Returns: Returns a list of dicts, with each dict containing info on each dataset.
Return type: list of dicts
-
astrobase.services.lccs.
list_lc_collections
(lcc_server)[source]¶ This lists all light curve collections made available on the LCC-Server.
If you have an LCC-Server API key present in ~/.astrobase/lccs/ that is associated with an LCC-Server user account, light curve collections visible to this user will be returned as well, even if they are not visible to the public.
Parameters: lcc_server (str) – The base URL of the LCC-Server to talk to.
Returns: Returns a dict containing lists of info items per collection. This includes collection_ids, lists of columns, lists of indexed columns, lists of full-text indexed columns, detailed column descriptions, number of objects in each collection, collection sky coverage, etc.
Return type: dict
astrobase.services.mast module¶
This interfaces with the MAST API. The main use for this (for now) is to fill in TIC information for checkplots.
The MAST API service documentation is at:
https://mast.stsci.edu/api/v0/index.html
For a more general and useful interface to MAST, see the astroquery package by A. Ginsburg, B. Sipocz, et al.:
http://astroquery.readthedocs.io
-
astrobase.services.mast.
mast_query
(service, params, data=None, apiversion='v0', forcefetch=False, cachedir='~/.astrobase/mast-cache', verbose=True, timeout=10.0, refresh=5.0, maxtimeout=90.0, maxtries=3, raiseonfail=False, jitter=5.0)[source]¶ This queries the STScI MAST service for catalog data.
All results are downloaded as JSON files that are written to cachedir.
Parameters: - service (str) – This is the name of the service to use. See https://mast.stsci.edu/api/v0/_services.html for a list of all available services.
- params (dict) – This is a dict containing the input params to the service as described on its details page linked in the service description page on MAST.
- data (dict or None) – This contains optional data to upload to the service.
- apiversion (str) – The API version of the MAST service to use. This sets the URL that this function will call, using apiversion as key into the MAST_URLS dict above.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will keep waiting in intervals of refresh seconds until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- raiseonfail (bool) – If this is True, the function will raise an Exception if something goes wrong, instead of returning None.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query,
 'provenance':'cache' or 'new download',
 'result':path to the file on disk with the downloaded data table}
Return type: dict
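The interplay of refresh and maxtimeout described above amounts to a polling loop. The sketch below shows one way such a loop could look for a file on disk. It is not astrobase's implementation, wait_for_file is a hypothetical name, and the random jitter wait is left out.

```python
import os
import time


def wait_for_file(path, refresh=5.0, maxtimeout=90.0):
    """Check for `path` every `refresh` seconds, up to `maxtimeout` seconds.

    Returns True as soon as the file exists, or False once the total time
    waited reaches `maxtimeout`.
    """
    waited = 0.0
    while not os.path.exists(path):
        if waited >= maxtimeout:
            return False
        time.sleep(refresh)
        waited += refresh
    return True


# short intervals so the example finishes quickly
print(wait_for_file('this-file-should-not-exist.json',
                    refresh=0.01, maxtimeout=0.03))
```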
-
astrobase.services.mast.
tic_conesearch
(ra, decl, radius_arcmin=5.0, apiversion='v0', forcefetch=False, cachedir='~/.astrobase/mast-cache', verbose=True, timeout=10.0, refresh=5.0, maxtimeout=90.0, maxtries=3, jitter=5.0, raiseonfail=False)[source]¶ This runs a TESS Input Catalog cone search on MAST.
If you use this, please cite the TIC paper (Stassun et al. 2018; http://adsabs.harvard.edu/abs/2018AJ....156..102S). Also see the “living” TESS input catalog docs:
https://docs.google.com/document/d/1zdiKMs4Ld4cXZ2DW4lMX-fuxAF6hPHTjqjIwGqnfjqI
Also see: https://mast.stsci.edu/api/v0/_t_i_cfields.html for the fields returned by the service and present in the result JSON file.
Parameters: - ra,decl (float) – The center coordinates of the cone-search in decimal degrees.
- radius_arcmin (float) – The cone-search radius in arcminutes.
- apiversion (str) – The API version of the MAST service to use. This sets the URL that this function will call, using apiversion as key into the MAST_URLS dict above.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
- raiseonfail (bool) – If this is True, the function will raise an Exception if something goes wrong, instead of returning None.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
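A minimal sketch of consuming the returned dict. It assumes the file at 'result' is plain JSON (as the field documentation above suggests) and that the matched rows sit under a top-level 'data' key; that key is an assumption about the MAST response layout, not something stated here. The helper name load_mast_rows is hypothetical.

```python
import json
import os


def load_mast_rows(res, rows_key="data"):
    """Return the list of rows from a MAST result dict, or [] on failure.

    `res` is the dict returned by tic_conesearch, or None if the query
    failed with raiseonfail=False. `rows_key` is an ASSUMPTION about how
    the JSON file is laid out; adjust it after inspecting a real file.
    """
    if res is None:
        return []
    path = os.path.expanduser(res["result"])
    with open(path, "r") as infd:
        payload = json.load(infd)
    return payload.get(rows_key, [])
```

Checking res['provenance'] before loading tells you whether the rows came from the local cache or from a fresh download.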
-
astrobase.services.mast.
tic_xmatch
(ra, decl, radius_arcsec=5.0, apiversion='v0', forcefetch=False, cachedir='~/.astrobase/mast-cache', verbose=True, timeout=90.0, refresh=5.0, maxtimeout=180.0, maxtries=3, jitter=5.0, raiseonfail=False)[source]¶ This does a cross-match with TIC.
Parameters: - ra,decl (np.arrays or lists of floats) – The coordinates that will be cross-matched against the TIC.
- radius_arcsec (float) – The cross-match radius in arcseconds.
- apiversion (str) – The API version of the MAST service to use. This sets the URL that this function will call, using apiversion as key into the MAST_URLS dict above.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
- raiseonfail (bool) – If this is True, the function will raise an Exception if something goes wrong, instead of returning None.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
-
astrobase.services.mast.
tic_objectsearch
(objectid, idcol_to_use='ID', apiversion='v0', forcefetch=False, cachedir='~/.astrobase/mast-cache', verbose=True, timeout=90.0, refresh=5.0, maxtimeout=180.0, maxtries=3, jitter=5.0, raiseonfail=False)[source]¶ This runs a TIC search for a specified TIC ID.
Parameters: - objectid (str) – The object ID to look up information for.
- idcol_to_use (str) – This is the name of the object ID column to use when looking up the provided objectid. This is one of {‘ID’, ‘HIP’, ‘TYC’, ‘UCAC’, ‘TWOMASS’, ‘ALLWISE’, ‘SDSS’, ‘GAIA’, ‘APASS’, ‘KIC’}.
- apiversion (str) – The API version of the MAST service to use. This sets the URL that this function will call, using apiversion as key into the MAST_URLS dict above.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
- raiseonfail (bool) – If this is True, the function will raise an Exception if something goes wrong, instead of returning None.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
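The maxtries, jitter, and raiseonfail parameters shared by the functions in this module describe a common retry pattern. The sketch below illustrates that pattern in isolation; it is not the code astrobase runs, and the uniform distribution used for the random wait is an assumption. The helper name call_with_retries and its sleep argument are hypothetical.

```python
import random
import time


def call_with_retries(func, maxtries=3, jitter=5.0, raiseonfail=False,
                      sleep=time.sleep):
    """Call func() up to maxtries times after a random initial wait."""
    # Random wait so that parallel workers do not all hit the service
    # at the same instant (ASSUMPTION: uniform in [0, jitter] seconds).
    sleep(random.uniform(0.0, jitter))

    last_exc = None
    for _ in range(maxtries):
        try:
            return func()
        except Exception as exc:
            last_exc = exc

    # All tries failed: either re-raise the last error or return None.
    if raiseonfail and last_exc is not None:
        raise last_exc
    return None
```

Passing sleep as an argument keeps the helper testable without real delays.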
astrobase.services.simbad module¶
This queries the SIMBAD database using their TAP interface. The main use for this is to serve as a reverse name resolver (i.e. get all object names using a narrow cone-search).
For a more general and useful interface to SIMBAD, see the astroquery package by A. Ginsburg, B. Sipocz, et al.:
http://astroquery.readthedocs.io
-
astrobase.services.simbad.
tap_query
(querystr, simbad_mirror='simbad', returnformat='csv', forcefetch=False, cachedir='~/.astrobase/simbad-cache', verbose=True, timeout=10.0, refresh=2.0, maxtimeout=90.0, maxtries=3, complete_query_later=False, jitter=5.0)[source]¶ This queries the SIMBAD TAP service using the ADQL query string provided.
Parameters: - querystr (str) – This is the ADQL query string. See: http://www.ivoa.net/documents/ADQL/2.0 for the specification.
- simbad_mirror (str) – This is the key used to select a SIMBAD mirror from the SIMBAD_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the SIMBAD TAP service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
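As an illustration of what a querystr might look like, the sketch below builds an ADQL cone-search query. The table name basic and the columns main_id, ra, and dec are assumptions about the SIMBAD TAP schema, and simbad_cone_adql is a hypothetical helper, not part of astrobase.

```python
def simbad_cone_adql(ra_deg, dec_deg, radius_arcsec, maxrows=10):
    """Build an ADQL cone-search query string for a SIMBAD-like TAP service.

    ASSUMPTION: the service exposes a table called `basic` with columns
    `main_id`, `ra`, and `dec` (both in decimal degrees).
    """
    # ADQL's CIRCLE takes its radius in degrees.
    radius_deg = radius_arcsec / 3600.0
    return (
        "SELECT TOP {n} main_id, ra, dec FROM basic "
        "WHERE CONTAINS(POINT('ICRS', ra, dec), "
        "CIRCLE('ICRS', {ra:.6f}, {dec:.6f}, {rad:.6f})) = 1"
    ).format(n=int(maxrows), ra=ra_deg, dec=dec_deg, rad=radius_deg)
```

The resulting string could then be passed as the querystr argument of tap_query.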
-
astrobase.services.simbad.
objectnames_conesearch
(racenter, declcenter, searchradiusarcsec, simbad_mirror='simbad', returnformat='csv', forcefetch=False, cachedir='~/.astrobase/simbad-cache', verbose=True, timeout=10.0, refresh=2.0, maxtimeout=90.0, maxtries=1, complete_query_later=True)[source]¶ This queries the SIMBAD TAP service for a list of object names near the coords. This is effectively a “reverse” name resolver (i.e. this does the opposite of SESAME).
Parameters: - racenter,declcenter (float) – The cone-search center coordinates in decimal degrees
- searchradiusarcsec (float) – The radius in arcseconds to search around the center coordinates.
- simbad_mirror (str) – This is the key used to select a SIMBAD mirror from the SIMBAD_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the SIMBAD TAP service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: This returns a dict of the following form:
{'params':dict of the input params used for the query, 'provenance':'cache' or 'new download', 'result':path to the file on disk with the downloaded data table}
Return type: dict
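A minimal sketch of reading the downloaded table when returnformat='csv'. Whether the cached file is gzip-compressed is not stated here, so the helper handles both cases based on the file extension; read_csv_result is a hypothetical name.

```python
import csv
import gzip
import os


def read_csv_result(res):
    """Return the rows of a CSV result file as a list of dicts.

    `res` is the dict returned by the query function, or None on failure.
    ASSUMPTION: the file is CSV with a header row, optionally gzipped.
    """
    if res is None:
        return []
    path = os.path.expanduser(res["result"])
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as infd:
        return list(csv.DictReader(infd))
```

Each row maps column names to string values, so numeric columns need an explicit float() conversion.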
astrobase.services.skyview module¶
This gets cutout images from the Digitized Sky Survey using the NASA GSFC SkyView server.
-
astrobase.services.skyview.
get_stamp
(ra, decl, survey='DSS2 Red', scaling='Linear', sizepix=300, forcefetch=False, cachedir='~/.astrobase/stamp-cache', timeout=45.0, retry_failed=True, verbose=True, jitter=5.0)[source]¶ This gets a FITS cutout from the NASA GSFC SkyView service.
This downloads stamps in FITS format from the NASA SkyView service:
https://skyview.gsfc.nasa.gov/current/cgi/query.pl
Parameters: - ra,decl (float) – These are decimal equatorial coordinates for the cutout center.
- survey (str) – The survey name to get the stamp from. This is one of the values in the ‘SkyView Surveys’ option boxes on the SkyView webpage. Currently, we’ve only tested using ‘DSS2 Red’ as the value for this kwarg, but the other ones should work in principle.
- scaling (str) – This is the pixel value scaling function to use.
- sizepix (int) – The width and height of the cutout are specified by this value.
- forcefetch (bool) – If True, will disregard any existing cached copies of the stamp already downloaded corresponding to the requested center coordinates and redownload the FITS from the SkyView service.
- cachedir (str) – This is the path to the astrobase cache directory. All downloaded FITS stamps are stored here as .fits.gz files so we can immediately respond with the cached copy when a request is made for a coordinate center that’s already been downloaded.
- timeout (float) – Sets the timeout in seconds to wait for a response from the NASA SkyView service.
- retry_failed (bool) – If the initial request to SkyView fails, and this is True, will retry until it succeeds.
- verbose (bool) – If True, indicates progress.
- jitter (float) – This is used to control the scale of the random wait in seconds before starting the query. Useful in parallelized situations.
Returns: A dict of the following form is returned:
{ 'params':{input ra, decl and kwargs used}, 'provenance':'cached' or 'new download', 'fitsfile':FITS file to which the cutout was saved on disk }
Return type: dict
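The caching behaviour described for cachedir and forcefetch can be pictured as mapping each request to a deterministic file name. The sketch below shows one way to do that; the hashing scheme is an assumption made for illustration and is not necessarily how astrobase names its cache files. stamp_cache_path is a hypothetical helper.

```python
import hashlib
import os


def stamp_cache_path(ra, decl, survey="DSS2 Red", scaling="Linear",
                     sizepix=300, cachedir="~/.astrobase/stamp-cache"):
    """Map a cutout request to a deterministic cache file path.

    ASSUMPTION: identical requests should hit the same file, so the
    request parameters are hashed into the file name.
    """
    key = repr((round(ra, 6), round(decl, 6), survey, scaling, sizepix))
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(cachedir), digest + ".fits.gz")
```

A caller would check os.path.exists() on this path and only contact the service when the file is missing or a forced refetch is requested.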
astrobase.services.trilegal module¶
This downloads and interacts with galaxy models generated by the TRILEGAL web-form by Prof. Leo Girardi. This module requires the requests and astropy packages only and can be used without astrobase if the accompanying dust.py module is located in the same directory as this module.
If you use this, please cite the TRILEGAL papers:
http://stev.oapd.inaf.it/~webmaster/trilegal_1.6/papers.html
and link to the TRILEGAL website:
http://stev.oapd.inaf.it/cgi-bin/trilegal
The extinction coefficient Av_at_infinity for the requested coordinates is automatically obtained from the 2MASS DUST service at:
http://irsa.ipac.caltech.edu/applications/DUST/
-
astrobase.services.trilegal.
list_trilegal_filtersystems
()[source]¶ This just lists all the filter systems available for TRILEGAL.
-
astrobase.services.trilegal.
query_galcoords
(gal_lon, gal_lat, filtersystem='sloan_2mass', field_deg2=1.0, usebinaries=True, extinction_sigma=0.1, magnitude_limit=26.0, maglim_filtercol=4, trilegal_version=1.6, extraparams=None, forcefetch=False, cachedir='~/.astrobase/trilegal-cache', verbose=True, timeout=60.0, refresh=150.0, maxtimeout=700.0)[source]¶ This queries the TRILEGAL model form, downloads results, and parses them.
Parameters: - gal_lon,gal_lat (float) – These are the center galactic longitude and latitude in degrees.
- filtersystem (str) – This is a key in the TRILEGAL_FILTER_SYSTEMS dict. Use the function
astrobase.services.trilegal.list_trilegal_filtersystems()
to see a nicely formatted table with the key and description for each of these.
- field_deg2 (float) – The area of the simulated field in square degrees.
- usebinaries (bool) – If this is True, binaries will be present in the model results.
- extinction_sigma (float) – This is the applied std dev around the Av_extinction value for the galactic coordinates requested.
- magnitude_limit (float) – This is the limiting magnitude of the simulation in the maglim_filtercol band index of the filter system chosen.
- maglim_filtercol (int) – The index in the filter system list of the magnitude limiting band.
- trilegal_version (float) – This is the version of the TRILEGAL form to use. This can usually be left as-is.
- extraparams (dict or None) –
This is a dict that can be used to override parameters of the model other than the basic ones used for input to this function. All parameters are listed in TRILEGAL_DEFAULT_PARAMS above. See:
http://stev.oapd.inaf.it/cgi-bin/trilegal
for explanations of these parameters.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
Returns: This returns a dict of the form:
{'params':the input param dict used, 'extraparams':any extra params used, 'provenance':'cached' or 'new download', 'tablefile':the path on disk to the downloaded model text file}
Return type: dict
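The interplay of refresh and maxtimeout described above amounts to a polling loop. The following sketch shows that loop on its own; it is an illustration of the pattern, not the astrobase implementation. The names wait_for_result, is_ready, sleep, and clock are hypothetical.

```python
import time


def wait_for_result(is_ready, refresh=150.0, maxtimeout=700.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() every `refresh` seconds until `maxtimeout` elapses.

    Returns True as soon as is_ready() reports success, and False if the
    result is still unavailable once maxtimeout seconds have passed.
    """
    start = clock()
    while clock() - start < maxtimeout:
        if is_ready():
            return True
        sleep(refresh)
    # One last check after the final wait.
    return is_ready()
```

With the defaults shown in the signature above (refresh=150.0, maxtimeout=700.0), a result is checked for roughly five times before the loop gives up.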
-
astrobase.services.trilegal.
query_radecl
(ra, decl, filtersystem='sloan_2mass', field_deg2=1.0, usebinaries=True, extinction_sigma=0.1, magnitude_limit=26.0, maglim_filtercol=4, trilegal_version=1.6, extraparams=None, forcefetch=False, cachedir='~/.astrobase/trilegal-cache', verbose=True, timeout=60.0, refresh=150.0, maxtimeout=700.0)[source]¶ This runs the TRILEGAL query for decimal equatorial coordinates.
Parameters: - ra,decl (float) – These are the center equatorial coordinates in decimal degrees
- filtersystem (str) – This is a key in the TRILEGAL_FILTER_SYSTEMS dict. Use the function
astrobase.services.trilegal.list_trilegal_filtersystems()
to see a nicely formatted table with the key and description for each of these.
- field_deg2 (float) – The area of the simulated field in square degrees. This is in the Galactic coordinate system.
- usebinaries (bool) – If this is True, binaries will be present in the model results.
- extinction_sigma (float) – This is the applied std dev around the Av_extinction value for the galactic coordinates requested.
- magnitude_limit (float) – This is the limiting magnitude of the simulation in the maglim_filtercol band index of the filter system chosen.
- maglim_filtercol (int) – The index in the filter system list of the magnitude limiting band.
- trilegal_version (float) – This is the version of the TRILEGAL form to use. This can usually be left as-is.
- extraparams (dict or None) –
This is a dict that can be used to override parameters of the model other than the basic ones used for input to this function. All parameters are listed in TRILEGAL_DEFAULT_PARAMS above. See:
http://stev.oapd.inaf.it/cgi-bin/trilegal
for explanations of these parameters.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
Returns: This returns a dict of the form:
{'params':the input param dict used, 'extraparams':any extra params used, 'provenance':'cached' or 'new download', 'tablefile':the path on disk to the downloaded model text file}
Return type: dict
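Since the TRILEGAL form works in Galactic coordinates, an equatorial query presumably involves a conversion to (gal_lon, gal_lat) at some point; how astrobase does this internally is not stated here. The sketch below shows the standard J2000 conversion in pure Python for reference. The function name equatorial_to_galactic is hypothetical.

```python
import math

# J2000 orientation of the Galactic frame (standard constants).
RA_NGP_DEG = 192.85948    # RA of the north Galactic pole
DEC_NGP_DEG = 27.12825    # Dec of the north Galactic pole
L_NCP_DEG = 122.93192     # Galactic longitude of the north celestial pole


def equatorial_to_galactic(ra_deg, decl_deg):
    """Convert J2000 (ra, decl) in degrees to Galactic (l, b) in degrees."""
    ra = math.radians(ra_deg)
    dec = math.radians(decl_deg)
    ra_ngp = math.radians(RA_NGP_DEG)
    dec_ngp = math.radians(DEC_NGP_DEG)

    sin_b = (math.sin(dec) * math.sin(dec_ngp) +
             math.cos(dec) * math.cos(dec_ngp) * math.cos(ra - ra_ngp))
    # Clamp against floating-point round-off before taking the arcsine.
    gal_lat = math.asin(max(-1.0, min(1.0, sin_b)))

    y = math.cos(dec) * math.sin(ra - ra_ngp)
    x = (math.sin(dec) * math.cos(dec_ngp) -
         math.cos(dec) * math.sin(dec_ngp) * math.cos(ra - ra_ngp))
    gal_lon = (L_NCP_DEG - math.degrees(math.atan2(y, x))) % 360.0
    return gal_lon, math.degrees(gal_lat)
```

The returned pair could be passed to query_galcoords as gal_lon and gal_lat.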
astrobase.services.limbdarkening module¶
Utilities to get stellar limb darkening coefficients for use during transit fitting.
-
astrobase.services.limbdarkening.
get_tess_limb_darkening_guesses
(teff, logg)[source]¶ Given Teff and log(g), query the Claret+2017 limb darkening coefficient grid. Return the nearest match.
TODO: interpolate instead of doing the nearest match. Nearest match is good to maybe only ~200 K and ~0.3 in log(g).
Parameters: - teff (float) – The stellar effective temperature to use.
- logg (float) – The stellar log g value to use.
Returns: (linear_coeff, quadratic_coeff) – Returns a tuple containing the linear and quadratic limb-darkening coefficients for the given effective temperature and log g.
Return type: tuple
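To show how the returned pair might be used, the sketch below evaluates the standard quadratic limb-darkening law, I(mu)/I(1) = 1 - u1(1 - mu) - u2(1 - mu)^2. It assumes the returned linear and quadratic coefficients correspond to u1 and u2 of this law; the function name is hypothetical.

```python
def quadratic_limb_darkening(mu, linear_coeff, quadratic_coeff):
    """Relative intensity I(mu)/I(1) under the quadratic law.

    `mu` is the cosine of the angle between the line of sight and the
    surface normal: 1.0 at disk center, 0.0 at the limb.
    ASSUMPTION: the coefficients are u1 and u2 of the quadratic law.
    """
    return (1.0
            - linear_coeff * (1.0 - mu)
            - quadratic_coeff * (1.0 - mu) ** 2)
```

For example, with placeholder coefficients u1 = 0.4 and u2 = 0.2 (illustrative values, not taken from the Claret grid), the limb is 40% as bright as the disk center.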
astrobase.services.identifiers module¶
Easy conversion between survey identifiers. Works best on bright and/or famous objects, particularly when SIMBAD is involved.
simbad_to_gaiadr2()
: given simbad name, attempt to get GAIA DR2 source_id
gaiadr2_to_tic()
: given GAIA DR2 source_id, attempt to get TIC ID
simbad_to_tic()
: given simbad name, get TIC ID
tic_to_gaiadr2()
: given TIC ID, get GAIA DR2 source_id
-
astrobase.services.identifiers.
simbad_to_gaiadr2
(simbad_name, simbad_mirror='simbad', returnformat='csv', forcefetch=False, cachedir='~/.astrobase/simbad-cache', verbose=True, timeout=10.0, refresh=2.0, maxtimeout=90.0, maxtries=1, complete_query_later=True)[source]¶ Convenience function that, given a SIMBAD object name, returns the Gaia DR2 identifier as a string.
Parameters: - simbad_name (str) – The SIMBAD object name to search for.
- simbad_mirror (str) – This is the key used to select a SIMBAD mirror from the SIMBAD_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the SIMBAD TAP service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: gaiadr2_id – Returns the GAIA DR2 ID as a string.
Return type: str
-
astrobase.services.identifiers.
gaiadr2_to_tic
(source_id, gaia_mirror='heidelberg', gaia_data_release='dr2', returnformat='csv', forcefetch=False, cachedir='~/.astrobase/simbad-cache', verbose=True, timeout=10.0, refresh=2.0, maxtimeout=90.0, maxtries=1, complete_query_later=True)[source]¶ First, gets RA/dec from Gaia DR2, given source_id. Then searches TICv8 spatially, and returns matches with the correct DR2 source_id.
Parameters: - source_id (str) – The GAIA DR2 source identifier.
- gaia_mirror ({'gaia','heidelberg','vizier'} or None) – This is the key used to select a GAIA catalog mirror from the GAIA_URLS dict above. If set, the specified mirror will be used. If None, a random mirror chosen from that dict will be used.
- gaia_data_release ({'dr2', 'edr3'}) – The Gaia data release to use for the query. This provides hints for which table to use for the GAIA mirror being queried.
- returnformat ({'csv','votable','json'}) – The returned file format to request from the GAIA catalog service.
- forcefetch (bool) – If this is True, the query will be retried even if cached results for it exist.
- cachedir (str) – This points to the directory where results will be downloaded.
- verbose (bool) – If True, will indicate progress and warn of any issues.
- timeout (float) – This sets the amount of time in seconds to wait for the service to respond to our initial request.
- refresh (float) – This sets the amount of time in seconds to wait before checking if the result file is available. If the results file isn’t available after refresh seconds have elapsed, the function will wait for refresh seconds continuously, until maxtimeout is reached or the results file becomes available.
- maxtimeout (float) – The maximum amount of time in seconds to wait for a result to become available after submitting our query request.
- maxtries (int) – The maximum number of tries (across all mirrors tried) to make to either submit the request or download the results, before giving up.
- complete_query_later (bool) – If set to True, a submitted query that does not return a result before maxtimeout has passed will be cancelled but its input request parameters and the result URL provided by the service will be saved. If this function is then called later with these same input request parameters, it will check if the query finally finished and a result is available. If so, will download the results instead of submitting a new query. If it’s not done yet, will start waiting for results again. To force launch a new query with the same request parameters, set the forcefetch kwarg to True.
Returns: tic_id – Returns the TIC ID of the object as a string.
Return type: str
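The two functions above can be chained to go from a SIMBAD name to a TIC ID, which is what simbad_to_tic() in the module summary is described as doing. The sketch below shows the chaining with the two lookups passed in as callables; treating a None return as "lookup failed" is an assumption, and simbad_name_to_tic is a hypothetical helper.

```python
def simbad_name_to_tic(simbad_name, to_gaia, to_tic):
    """Chain a name -> Gaia DR2 lookup with a Gaia DR2 -> TIC lookup.

    `to_gaia` and `to_tic` are callables, e.g. simbad_to_gaiadr2 and
    gaiadr2_to_tic from astrobase.services.identifiers.
    ASSUMPTION: a failed first lookup is signalled by returning None.
    """
    gaia_id = to_gaia(simbad_name)
    if gaia_id is None:
        return None
    return to_tic(gaia_id)


# Hypothetical usage (requires astrobase and network access):
#   from astrobase.services.identifiers import (
#       simbad_to_gaiadr2, gaiadr2_to_tic
#   )
#   tic_id = simbad_name_to_tic("<object name>", simbad_to_gaiadr2,
#                               gaiadr2_to_tic)
```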
astrobase.services.tesslightcurves module¶
Useful tools for acquiring TESS light-curves. This module contains a number of non-standard dependencies, including lightkurve, eleanor, and astroquery.
Light-curve retrieval: get light-curves from all sectors for a tic_id:
get_two_minute_spoc_lightcurves
get_hlsp_lightcurves
get_eleanor_lightcurves
Visibility queries: check if an ra/dec was observed:
is_two_minute_spoc_lightcurve_available
get_tess_visibility_given_ticid
get_tess_visibility_given_ticids
Still TODO:
get_cpm_lightcurve
-
astrobase.services.tesslightcurves.
get_two_minute_spoc_lightcurves
(tic_id, download_dir=None)[source]¶ This downloads 2-minute TESS SPOC light curves.
Parameters: tic_id (str) – The TIC ID of the object as a string. Returns: lcfiles – List of light-curve file paths. None if none are found and downloaded. Return type: list or None
-
astrobase.services.tesslightcurves.
get_hlsp_lightcurves
(tic_id, hlsp_products=('CDIPS', 'TASOC', 'PATHOS'), download_dir=None, verbose=True)[source]¶ This downloads TESS HLSP light curves for a given TIC ID.
Parameters: - tic_id (str) – The TIC ID of the object as a string.
- hlsp_products (sequence of str) – List of desired HLSP products to search. For instance, [“CDIPS”].
- download_dir (str) – Path of directory to which light-curve will be downloaded.
Returns: lcfiles – List of light-curve file paths. None if none are found and downloaded.
Return type: list or None
-
astrobase.services.tesslightcurves.
get_eleanor_lightcurves
(tic_id, download_dir=None, targetdata_kwargs=None)[source]¶ This downloads light curves from the Eleanor project for a given TIC ID.
Parameters: - tic_id (str) – The TIC ID of the object as a string.
- download_dir (str) – The light curve FITS files will be downloaded here.
- targetdata_kwargs (dict) –
Optional dictionary of keys and values to be passed to
eleanor.TargetData
(see https://adina.feinste.in/eleanor/api.html). For instance, you might pass {'height':8, 'width':8, 'do_pca':True, 'do_psf':True, 'crowded_field':False}
to run these settings through to eleanor. The default options used if targetdata_kwargs is None are as follows: {'height':15, 'width':15, 'save_postcard':True, 'do_pca':False, 'do_psf':False, 'bkg_size':31, 'crowded_field':True, 'cal_cadences':None, 'try_load':True, 'regressors':None}
Returns: lcfiles – List of light-curve file paths. These are saved as CSV, rather than FITS, by this function.
Return type: list or None
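Because the defaults are described as applying only when targetdata_kwargs is None, it may be safest to pass a complete dict when overriding just a few options. The sketch below builds one from the defaults listed above; whether astrobase merges partial dicts itself is not stated here, and make_targetdata_kwargs is a hypothetical helper.

```python
# Defaults copied from the parameter description above.
ELEANOR_DEFAULTS = {
    "height": 15,
    "width": 15,
    "save_postcard": True,
    "do_pca": False,
    "do_psf": False,
    "bkg_size": 31,
    "crowded_field": True,
    "cal_cadences": None,
    "try_load": True,
    "regressors": None,
}


def make_targetdata_kwargs(**overrides):
    """Return a full kwargs dict: the documented defaults plus overrides."""
    merged = dict(ELEANOR_DEFAULTS)
    merged.update(overrides)
    return merged
```

For instance, make_targetdata_kwargs(height=8, width=8, do_pca=True) keeps bkg_size=31 and the other defaults while changing only the three named options.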
-
astrobase.services.tesslightcurves.
is_two_minute_spoc_lightcurve_available
(tic_id)[source]¶ This checks if a 2-minute TESS SPOC light curve is available for the TIC ID.
Parameters: tic_id (str) – The TIC ID of the object as a string. Returns: result – True if a 2 minute SPOC light-curve is available, else False. Return type: bool
-
astrobase.services.tesslightcurves.
get_tess_visibility_given_ticid
(tic_id)[source]¶ This checks if a given TIC ID is visible in a TESS sector.
Parameters: tic_id (str) – The TIC ID of the object as a string.
Returns: sector_str,full_sector_str – The first element of the tuple contains a string list of the sector numbers where the object is visible. The second element of the tuple contains a string list of the full sector names where the object is visible. For example, “[16, 17]” and “[tess-s0016-1-4, tess-s0017-2-3]”. If empty, will return “[]” and “[]”.
Return type: tuple of strings
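Because both elements come back as strings rather than lists, they need parsing before use. The sketch below does this for the formats shown in the example above; the sector names are unquoted, so the string is split manually instead of being evaluated as a Python literal. Both helper names are hypothetical.

```python
def parse_sector_list(list_str):
    """Split a string like "[a, b]" into a list of stripped items."""
    inner = list_str.strip().strip("[]")
    return [item.strip() for item in inner.split(",") if item.strip()]


def parse_sector_numbers(sector_str):
    """Turn a string like "[16, 17]" into a list of ints."""
    return [int(item) for item in parse_sector_list(sector_str)]
```

An empty result string "[]" yields an empty list from either helper.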
-
astrobase.services.tesslightcurves.
get_tess_visibility_given_ticids
(ticids)[source]¶ This gets TESS visibility info for an iterable container of TIC IDs.
Parameters: ticids (iterable of str) – The TIC IDs to look up.
Returns: Returns a two-element tuple containing lists of the sector numbers and the full names of the sectors containing the requested TIC IDs.
Return type: tuple
astrobase.checkplot
: contains functions to make checkplots: a grid of plots used to quickly decide if a period search for a possibly variable object was successful. Checkplots come in two forms:
- Python pickles: If you want to interactively browse through large numbers of checkplots (e.g., as part of a large variable star classification project), you can use the checkplotserver webapp that works on checkplot pickle files. This interface allows you to review all phased light curves from all period-finder methods applied, set and save variability tags, object type tags, best periods and epochs, and comments for each object using a browser-based UI (see below). The information entered can then be exported as CSV or JSON for the next stage of a variable star classification pipeline.
- PNG images: Alternatively, if you want to simply glance through lots of checkplots (e.g. for an initial look at a collection of light curves), there’s a checkplot-viewer webapp available that operates on checkplot PNG images.

astrobase.cpserver
: contains the implementation of the checkplotserver webapp to review, edit, and export information from checkplot pickles produced as part of a variable star classification effort run on a large light curve collection. Also contains the more light-weight checkplot-viewer webapp to glance through large numbers of checkplot PNGs.
astrobase.varclass
: functions for calculating various variability, stellar color and motion, and neighbor proximity features, along with a Random Forest based classifier.
astrobase.services
: modules and functions to query various astronomical catalogs and data services, including GAIA, SIMBAD, TRILEGAL, NASA SkyView, and 2MASS DUST.
Other useful bits¶
Modules
astrobase.coordutils
: functions for dealing with coordinates (conversions, distances, proper motion).
astrobase.timeutils
: functions for converting from Julian dates to Barycentric Julian dates, and precessing coordinates between equinoxes and due to proper motion; this will automatically download and save the JPL ephemerides file de430.bsp upon first import.
Subpackages
astrobase.fakelcs
: modules and functions to conduct an end-to-end variable star recovery simulation.
astrobase¶
astrobase package¶
Submodules¶
astrobase.awsutils module¶
This contains functions that handle various AWS services for use with lcproc_aws.py.
-
astrobase.awsutils.
ec2_ssh
(ip_address, keypem_file, username='ec2-user', raiseonfail=False)[source]¶ This opens an SSH connection to the EC2 instance at ip_address.
Parameters: - ip_address (str) – IP address of the AWS EC2 instance to connect to.
- keypem_file (str) – The path to the keypair PEM file generated by AWS to allow SSH connections.
- username (str) – The username to use to login to the EC2 instance.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: This has all the usual paramiko functionality:
- Use SSHClient.exec_command(command, environment=None) to exec a shell command.
- Use SSHClient.open_sftp() to get a SFTPClient for the server. Then call SFTPClient.get() and .put() to copy files from and to the server.
Return type: paramiko.SSHClient
-
astrobase.awsutils.
s3_get_file
(bucket, filename, local_file, altexts=None, client=None, raiseonfail=False)[source]¶ This gets a file from an S3 bucket.
Parameters: - bucket (str) – The AWS S3 bucket name.
- filename (str) – The full filename of the file to get from the bucket
- local_file (str) – Path to where the downloaded file will be stored.
- altexts (None or list of str) – If not None, this is a list of alternate extensions to try for the file other than the one provided in filename. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use altexts=[‘’] to strip the .gz.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Path to the downloaded filename or None if the download was unsuccessful.
Return type: str
-
astrobase.awsutils.
s3_get_url
(url, altexts=None, client=None, raiseonfail=False)[source]¶ This gets a file from an S3 bucket based on its s3:// URL.
Parameters: - url (str) – S3 URL to download. This should begin with ‘s3://’.
- altexts (None or list of str) – If not None, this is a list of alternate extensions to try for the file other than the one provided in url. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use altexts=[‘’] to strip the .gz.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Path to the downloaded filename or None if the download was unsuccessful. The file will be downloaded into the current working directory and will have a filename == basename of the file on S3.
Return type: str
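As a rough illustration of the s3:// URL convention and the altexts fallback described above, the sketch below splits a URL into a bucket and key and lists the keys one might try in order. The helper names split_s3_url and candidate_keys are hypothetical, the bucket and key are made up, and treating altexts as replacements for the final extension is an assumption based on the .sqlite.gz example rather than a statement of how astrobase implements it.

```python
import posixpath
from urllib.parse import urlparse


def split_s3_url(url):
    """Split an s3:// URL into (bucket, key). Hypothetical helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def candidate_keys(key, altexts=None):
    """Return the key followed by variants whose last extension is replaced.

    Assumption: each entry in altexts replaces the final extension, so
    altexts=[''] turns 'x.sqlite.gz' into 'x.sqlite'.
    """
    candidates = [key]
    root, _ext = posixpath.splitext(key)
    for alt in altexts or []:
        candidates.append(root + alt)
    return candidates


bucket, key = split_s3_url("s3://example-bucket/lcs/object-1.sqlite.gz")
print(bucket, key)
print(candidate_keys(key, altexts=[""]))
```

The local filename mentioned above would then be posixpath.basename(key) in the current working directory.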
-
astrobase.awsutils.
s3_put_file
(local_file, bucket, client=None, raiseonfail=False)[source]¶ This uploads a file to S3.
Parameters: - local_file (str) – Path to the file to upload to S3.
- bucket (str) – The AWS S3 bucket to upload the file to.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: If the file upload is successful, returns the s3:// URL of the uploaded file. If it failed, will return None.
Return type: str or None
-
astrobase.awsutils.
s3_delete_file
(bucket, filename, client=None, raiseonfail=False)[source]¶ This deletes a file from S3.
Parameters: - bucket (str) – The AWS S3 bucket to delete the file from.
- filename (str) – The full file name of the file to delete, including any prefixes.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: If the file was successfully deleted, will return the delete-marker (https://docs.aws.amazon.com/AmazonS3/latest/dev/DeleteMarker.html). If it wasn’t, returns None
Return type: str or None
-
astrobase.awsutils.
sqs_create_queue
(queue_name, options=None, client=None)[source]¶ This creates an SQS queue.
Parameters: - queue_name (str) – The name of the queue to create.
- options (dict or None) – A dict of options indicating extra attributes the queue should have. See the SQS docs for details. If None, no custom attributes will be attached to the queue.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: This returns a dict of the form:
{'url': SQS URL of the queue, 'name': name of the queue}
Return type: dict
-
astrobase.awsutils.
sqs_delete_queue
(queue_url, client=None)[source]¶ This deletes an SQS queue given its URL
Parameters: - queue_url (str) – The SQS URL of the queue to delete.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: True if the queue was deleted successfully. False otherwise.
Return type: bool
-
astrobase.awsutils.
sqs_put_item
(queue_url, item, delay_seconds=0, client=None, raiseonfail=False)[source]¶ This pushes a dict serialized to JSON to the specified SQS queue.
Parameters: - queue_url (str) – The SQS URL of the queue to push the object to.
- item (dict) – The dict passed in here will be serialized to JSON.
- delay_seconds (int) – The amount of time in seconds the pushed item will be held before going ‘live’ and being visible to all queue consumers.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: If the item was successfully put on the queue, will return the response from the service. If it wasn’t, will return None.
Return type: boto3.Response or None
-
astrobase.awsutils.
sqs_get_item
(queue_url, max_items=1, wait_time_seconds=5, client=None, raiseonfail=False)[source]¶ This gets items from the SQS queue (a single item by default).
The queue_url is composed of some internal SQS junk plus a queue_name. For our purposes (lcproc_aws.py), the queue name will be something like:
lcproc_queue_<action>
where action is one of:
runcp, runpf
The item is always a JSON object:
{'target': S3 bucket address of the file to process,
 'action': the action to perform on the file ('runpf', 'runcp', etc.),
 'args': the action's args as a tuple (not including filename, which is generated randomly as a temporary local file),
 'kwargs': the action's kwargs as a dict,
 'outbucket': S3 bucket to write the result to,
 'outqueue': SQS queue to write the processed item's info to (optional)}
The action MUST match the <action> in the queue name for this item to be processed.
Parameters: - queue_url (str) – The SQS URL of the queue to get messages from.
- max_items (int) – The number of items to pull from the queue in this request.
- wait_time_seconds (int) – This specifies how long the function should block until a message is received on the queue. If the timeout expires, an empty list will be returned. If the timeout doesn’t expire, the function will return a list of items received (up to max_items).
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: For each item pulled from the queue in this request (up to max_items), a dict will be deserialized from the retrieved JSON, containing the message items and various metadata. The most important item of the metadata is the receipt_handle, which can be used to acknowledge receipt of all items in this request (see sqs_delete_item below).
If the queue pull fails outright, returns None. If no messages are available for this queue pull, returns an empty list.
Return type: list of dicts or None
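The item layout and the rule that the action must match the queue name can be sketched without touching AWS. The helpers make_work_item and action_matches_queue are hypothetical, the queue URL and bucket names are made up, and the assumption that the queue name is the last path segment of the queue URL is mine.

```python
import json


def make_work_item(target, action, args, kwargs, outbucket, outqueue=None):
    """Serialize a work item with the keys listed above. Hypothetical helper."""
    item = {
        "target": target,
        "action": action,
        "args": list(args),  # JSON has no tuple type, so this becomes an array
        "kwargs": kwargs,
        "outbucket": outbucket,
    }
    if outqueue is not None:
        item["outqueue"] = outqueue
    return json.dumps(item)


def action_matches_queue(queue_url, item_json):
    """Check that the item's action matches lcproc_queue_<action>.

    Assumption: the queue name is the last path segment of the queue URL.
    """
    queue_name = queue_url.rstrip("/").rsplit("/", 1)[-1]
    action = json.loads(item_json)["action"]
    return queue_name == f"lcproc_queue_{action}"


queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/lcproc_queue_runpf"
item_json = make_work_item(
    target="s3://example-bucket/lcs/object-1.sqlite.gz",
    action="runpf",
    args=(),
    kwargs={},
    outbucket="example-results",
)
print(action_matches_queue(queue_url, item_json))
```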
-
astrobase.awsutils.
sqs_delete_item
(queue_url, receipt_handle, client=None, raiseonfail=False)[source]¶ This deletes a message from the queue, effectively acknowledging its receipt.
Call this only when all messages retrieved from the queue have been processed, since this will prevent redelivery of these messages to other queue workers pulling from the same queue channel.
Parameters: - queue_url (str) – The SQS URL of the queue where we got the messages from. This should be the same queue used to retrieve the messages in sqs_get_item.
- receipt_handle (str) – The receipt handle of the queue message that we’re responding to, and will acknowledge receipt of. This will be present in each message retrieved using sqs_get_item.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Nothing.
-
astrobase.awsutils.
make_ec2_nodes
(security_groupid, subnet_id, keypair_name, iam_instance_profile_arn, launch_instances=1, ami='ami-04681a1dbd79675a5', instance='t3.micro', ebs_optimized=True, user_data=None, wait_until_up=True, client=None, raiseonfail=False)[source]¶ This makes new EC2 worker nodes.
This requires a security group ID attached to a VPC config and subnet, a keypair generated beforehand, and an IAM role ARN for the instance. See:
https://docs.aws.amazon.com/cli/latest/userguide/tutorial-ec2-ubuntu.html
Use user_data to launch tasks on instance launch.
Parameters: - security_groupid (str) – The security group ID of the AWS VPC where the instances will be launched.
- subnet_id (str) – The subnet ID of the AWS VPC where the instances will be launched.
- keypair_name (str) – The name of the keypair to be used to allow SSH access to all instances launched here. This corresponds to an already downloaded AWS keypair PEM file.
- iam_instance_profile_arn (str) – The ARN string corresponding to the AWS instance profile that describes the permissions the launched instances have to access other AWS resources. Set this up in AWS IAM.
- launch_instances (int) – The number of instances to launch in this request.
- ami (str) – The Amazon Machine Image ID that describes the OS the instances will use after launch. The default ID is Amazon Linux 2 in the US East region.
- instance (str) – The instance type to launch. See the following URL for a list of IDs: https://aws.amazon.com/ec2/pricing/on-demand/
- ebs_optimized (bool) – If True, will enable EBS optimization to speed up IO. This is usually True for all instances made available in the last couple of years.
- user_data (str or None) – This is either the path to a file on disk that contains a shell-script or a string containing a shell-script that will be executed by root right after the instance is launched. Use to automatically set up workers and queues. If None, will not execute anything at instance start up.
- wait_until_up (bool) – If True, will not return from this function until all launched instances are verified as running by AWS.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Returns launched instance info as a dict, keyed by instance ID.
Return type: dict
-
astrobase.awsutils.
delete_ec2_nodes
(instance_id_list, client=None)[source]¶ This deletes EC2 nodes and terminates the instances.
Parameters: - instance_id_list (list of str) – A list of EC2 instance IDs to terminate.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Nothing.
-
astrobase.awsutils.
make_spot_fleet_cluster
(security_groupid, subnet_id, keypair_name, iam_instance_profile_arn, spot_fleet_iam_role, target_capacity=20, spot_price=0.4, expires_days=7, allocation_strategy='lowestPrice', instance_types=['m5.xlarge', 'm5.2xlarge', 'c5.xlarge', 'c5.2xlarge', 'c5.4xlarge'], instance_weights=None, instance_ami='ami-04681a1dbd79675a5', instance_user_data=None, instance_ebs_optimized=True, wait_until_up=True, client=None, raiseonfail=False)[source]¶ This makes an EC2 spot-fleet cluster.
This requires a security group ID attached to a VPC config and subnet, a keypair generated beforehand, and an IAM role ARN for the instance. See:
https://docs.aws.amazon.com/cli/latest/userguide/tutorial-ec2-ubuntu.html
Use user_data to launch tasks on instance launch.
Parameters: - security_groupid (str) – The security group ID of the AWS VPC where the instances will be launched.
- subnet_id (str) – The subnet ID of the AWS VPC where the instances will be launched.
- keypair_name (str) – The name of the keypair to be used to allow SSH access to all instances launched here. This corresponds to an already downloaded AWS keypair PEM file.
- iam_instance_profile_arn (str) – The ARN string corresponding to the AWS instance profile that describes the permissions the launched instances have to access other AWS resources. Set this up in AWS IAM.
- spot_fleet_iam_role (str) – This is the name of AWS IAM role that allows the Spot Fleet Manager to scale up and down instances based on demand and instances failing, etc. Set this up in IAM.
- target_capacity (int) – The number of instances to target in the fleet request. The fleet manager service will attempt to maintain this number over the lifetime of the Spot Fleet Request.
- spot_price (float) – The bid price in USD for the instances. This is per hour. Keep this at about half the hourly on-demand price of the desired instances to make sure your instances aren’t taken away by AWS when it needs capacity.
- expires_days (int) – The number of days this request is active for. All instances launched by this request will live at least this long and will be terminated automatically after.
- allocation_strategy ({'lowestPrice', 'diversified'}) – The allocation strategy used by the fleet manager.
- instance_types (list of str) – List of the instance type to launch. See the following URL for a list of IDs: https://aws.amazon.com/ec2/pricing/on-demand/
- instance_weights (list of float or None) – If instance_types is a list of different instance types, this is the relative weight applied towards launching each instance type. This can be used to launch a mix of instances in a defined ratio among their types. Doing this can make the spot fleet more resilient to AWS taking back the instances if it runs out of capacity.
- instance_ami (str) – The Amazon Machine Image ID that describes the OS the instances will use after launch. The default ID is Amazon Linux 2 in the US East region.
- instance_user_data (str or None) – This is either the path to a file on disk that contains a shell-script or a string containing a shell-script that will be executed by root right after the instance is launched. Use to automatically set up workers and queues. If None, will not execute anything at instance start up.
- instance_ebs_optimized (bool) – If True, will enable EBS optimization to speed up IO. This is usually True for all instances made available in the last couple of years.
- wait_until_up (bool) – If True, will not return from this function until the spot fleet request is acknowledged by AWS.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: This is the spot fleet request ID if successful. Otherwise, returns None.
Return type: str or None
-
astrobase.awsutils.
delete_spot_fleet_cluster
(spot_fleet_reqid, client=None)[source]¶ This deletes a spot-fleet cluster.
Parameters: - spot_fleet_reqid (str) – The fleet request ID returned by make_spot_fleet_cluster.
- client (boto3.Client or None) – If None, this function will instantiate a new boto3.Client object to use in its operations. Alternatively, pass in an existing boto3.Client instance to re-use it here.
Returns: Nothing.
astrobase.coordutils module¶
Contains various useful tools for coordinate conversion, etc.
-
astrobase.coordutils.
angle_wrap
(angle, radians=False)[source]¶ Wraps the input angle to 360.0 degrees.
Parameters: - angle (float) – The angle to wrap around 360.0 deg.
- radians (bool) – If True, will assume that the input is in radians. The output will then also be in radians.
Returns: Wrapped angle. If radians is True: input is assumed to be in radians, output is also in radians.
Return type: float
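A minimal sketch of angle wrapping, assuming the wrapped value falls in the half-open range [0, 360) degrees (or [0, 2π) radians). The function name wrap_angle is hypothetical and this is not necessarily how angle_wrap is implemented.

```python
import math


def wrap_angle(angle, radians=False):
    """Wrap an angle into [0, 360) degrees or [0, 2*pi) radians."""
    full_circle = 2.0 * math.pi if radians else 360.0
    # Python's % returns a result with the sign of the divisor,
    # so negative inputs wrap to positive angles.
    return angle % full_circle


print(wrap_angle(-30.0))
print(wrap_angle(370.0))
print(wrap_angle(3.0 * math.pi, radians=True))
```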
-
astrobase.coordutils.
decimal_to_dms
(decimal_value)[source]¶ Converts from decimal degrees (for declination coords) to DD:MM:SS.
Parameters: decimal_value (float) – A decimal value to convert to degrees, minutes, seconds sexagesimal format. Returns: A four element tuple is returned: (sign, DD, MM, SS.ssss…) Return type: tuple
-
astrobase.coordutils.
decimal_to_hms
(decimal_value)[source]¶ Converts from decimal degrees (for RA coords) to HH:MM:SS.
Parameters: decimal_value (float) – A decimal value to convert to hours, minutes, seconds. Negative values will be wrapped around 360.0. Returns: A three element tuple is returned: (HH, MM, SS.ssss…) Return type: tuple
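The two decimal-to-sexagesimal conversions above can be sketched as follows. The names to_dms and to_hms are hypothetical; returning '+' for non-negative declinations and dividing by 15 degrees per hour after wrapping to [0, 360) are assumptions consistent with the descriptions, not a copy of the library code.

```python
def to_dms(decimal_value):
    """Decimal degrees -> (sign, DD, MM, SS.ssss)."""
    sign = "-" if decimal_value < 0 else "+"
    value = abs(decimal_value)
    degrees = int(value)
    minutes_full = (value - degrees) * 60.0
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60.0
    return sign, degrees, minutes, seconds


def to_hms(decimal_value):
    """Decimal degrees -> (HH, MM, SS.ssss), wrapping negatives around 360."""
    hours_full = (decimal_value % 360.0) / 15.0  # 15 degrees per hour of RA
    hours = int(hours_full)
    minutes_full = (hours_full - hours) * 60.0
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60.0
    return hours, minutes, seconds


print(to_dms(-45.5))
print(to_hms(187.5))
```

Keeping the sign as its own tuple element avoids losing it for declinations between -1 and 0 degrees, where the integer degree part is zero.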
-
astrobase.coordutils.
hms_str_to_tuple
(hms_string)[source]¶ Converts a string of the form HH:MM:SS or HH MM SS to a tuple of the form (HH, MM, SS).
Parameters: hms_string (str) – A RA coordinate string of the form ‘HH:MM:SS.sss’ or ‘HH MM SS.sss’. Returns: A three element tuple is returned (HH, MM, SS.ssss…) Return type: tuple
-
astrobase.coordutils.
dms_str_to_tuple
(dms_string)[source]¶ Converts a string of the form [+-]DD:MM:SS or [+-]DD MM SS to a tuple of the form (sign, DD, MM, SS).
Parameters: dms_string (str) – A declination coordinate string of the form ‘[+-]DD:MM:SS.sss’ or ‘[+-]DD MM SS.sss’. The sign in front of DD is optional. If it’s not there, this function will assume that the coordinate string is a positive value. Returns: A four element tuple of the form: (sign, DD, MM, SS.ssss…). Return type: tuple
-
astrobase.coordutils.
hms_str_to_decimal
(hms_string)[source]¶ Converts a HH:MM:SS string to decimal degrees.
Parameters: hms_string (str) – A right ascension coordinate string of the form: ‘HH:MM:SS.sss’ or ‘HH MM SS.sss’. Returns: The RA value in decimal degrees (wrapped around 360.0 deg if necessary.) Return type: float
-
astrobase.coordutils.
dms_str_to_decimal
(dms_string)[source]¶ Converts a DD:MM:SS string to decimal degrees.
Parameters: dms_string (str) – A declination coordinate string of the form: ‘[+-]DD:MM:SS.sss’ or ‘[+-]DD MM SS.sss’. Returns: The declination value in decimal degrees. Return type: float
-
astrobase.coordutils.
hms_to_decimal
(hours, minutes, seconds, returndeg=True)[source]¶ Converts from HH, MM, SS to a decimal value.
Parameters: - hours (int) – The HH part of a RA coordinate.
- minutes (int) – The MM part of a RA coordinate.
- seconds (float) – The SS.sss part of a RA coordinate.
- returndeg (bool) – If this is True, then will return decimal degrees as the output. If this is False, then will return decimal HOURS as the output. Decimal hours are sometimes used in FITS headers.
Returns: The right ascension value in either decimal degrees or decimal hours depending on returndeg.
Return type: float
-
astrobase.coordutils.
dms_to_decimal
(sign, degrees, minutes, seconds)[source]¶ Converts from DD:MM:SS to a decimal value.
Parameters: - sign ({'+', '-', ''}) – The sign part of a Dec coordinate.
- degrees (int) – The DD part of a Dec coordinate.
- minutes (int) – The MM part of a Dec coordinate.
- seconds (float) – The SS.sss part of a Dec coordinate.
Returns: The declination value in decimal degrees.
Return type: float
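Going the other way, the string and tuple conversions above can be sketched with the standard library alone. The names parse_dms, dms_to_deg, and hms_to_deg are hypothetical, and the parsing rules (colon or whitespace separators, optional leading sign) follow the formats described in the parameter lists.

```python
def parse_dms(dms_string):
    """Parse '[+-]DD:MM:SS.sss' or '[+-]DD MM SS.sss' into (sign, DD, MM, SS)."""
    text = dms_string.strip()
    sign = "+"  # a missing sign is treated as positive
    if text[0] in "+-":
        sign, text = text[0], text[1:]
    dd, mm, ss = text.replace(":", " ").split()
    return sign, int(dd), int(mm), float(ss)


def dms_to_deg(sign, degrees, minutes, seconds):
    """(sign, DD, MM, SS) -> decimal degrees."""
    magnitude = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -magnitude if sign == "-" else magnitude


def hms_to_deg(hours, minutes, seconds, returndeg=True):
    """(HH, MM, SS) -> decimal degrees, or decimal hours if returndeg is False."""
    decimal_hours = hours + minutes / 60.0 + seconds / 3600.0
    return decimal_hours * 15.0 if returndeg else decimal_hours


print(dms_to_deg(*parse_dms("-00 30 00.0")))
print(hms_to_deg(12, 30, 0.0))
```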
-
astrobase.coordutils.
great_circle_dist
(ra1, dec1, ra2, dec2)[source]¶ Calculates the great circle angular distance between two coords.
This calculates the great circle angular distance in arcseconds between two coordinates (ra1,dec1) and (ra2,dec2). This is basically a clone of GCIRC from the IDL Astrolib.
Parameters: - ra1,dec1 (float or array-like) – The first coordinate’s right ascension and declination value(s) in decimal degrees.
- ra2,dec2 (float or array-like) – The second coordinate’s right ascension and declination value(s) in decimal degrees.
Returns: Great circle distance between the two coordinates in arcseconds.
Return type: float or array-like
Notes
If (ra1, dec1) is scalar and (ra2, dec2) is scalar: the result is a float distance in arcseconds.
If (ra1, dec1) is scalar and (ra2, dec2) is array-like: the result is an np.array with distance in arcseconds between (ra1, dec1) and each element of (ra2, dec2).
If (ra1, dec1) is array-like and (ra2, dec2) is scalar: the result is an np.array with distance in arcseconds between (ra2, dec2) and each element of (ra1, dec1).
If (ra1, dec1) and (ra2, dec2) are both array-like: the result is an np.array with the pair-wise distance in arcseconds between each element of the two coordinate lists. In this case, if the input array-likes are not the same length, then excess elements of the longer one will be ignored.
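For reference, a scalar-only sketch of a great circle distance in arcseconds using the haversine formula. The function name is hypothetical, array inputs are not handled, and the haversine form is my choice for illustration; the library function may use a different but equivalent expression.

```python
import math


def angular_distance_arcsec(ra1, dec1, ra2, dec2):
    """Great circle distance in arcseconds between two points in decimal degrees."""
    ra1_r, dec1_r, ra2_r, dec2_r = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2_r - dec1_r) / 2.0)
    sin_dra = math.sin((ra2_r - ra1_r) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1_r) * math.cos(dec2_r) * sin_dra ** 2
    a = min(1.0, max(0.0, a))  # guard against rounding just outside [0, 1]
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


# One degree apart in declination is 3600 arcseconds.
print(angular_distance_arcsec(0.0, 0.0, 0.0, 1.0))
# One degree apart in RA at declination 60 is close to half that.
print(angular_distance_arcsec(0.0, 60.0, 1.0, 60.0))
```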
-
astrobase.coordutils.
xmatch_basic
(ra1, dec1, ra2, dec2, match_radius=5.0)[source]¶ Finds the closest object in (ra2, dec2) to scalar coordinate pair (ra1, dec1) and returns the distance in arcseconds.
This is a quick matcher that uses the great_circle_dist function to find the closest object in (ra2, dec2) within match_radius arcseconds to (ra1, dec1). (ra1, dec1) must be a scalar pair, while (ra2, dec2) must be array-likes of the same lengths.
Parameters: - ra1,dec1 (float) – Coordinate of the object to find matches to. In decimal degrees.
- ra2,dec2 (array-like) – The coordinates that will be searched for matches. In decimal degrees.
- match_radius (float) – The match radius in arcseconds to use for the match.
Returns: A two element tuple like the following:
(True -> found a match or False -> no match found, minimum distance between target and list in arcseconds)
Return type: tuple
-
astrobase.coordutils.
xmatch_neighbors
(ra1, dec1, ra2, dec2, match_radius=60.0, includeself=False, sortresults=True)[source]¶ Finds the closest objects in (ra2, dec2) to scalar coordinate pair (ra1, dec1) and returns the indices of the objects that match.
This is a quick matcher that uses the great_circle_dist function to find the objects in (ra2, dec2) within match_radius arcseconds of (ra1, dec1). (ra1, dec1) must be a scalar pair, while (ra2, dec2) must be array-likes of the same lengths.
Parameters: - ra1,dec1 (float) – Coordinate of the object to find matches to. In decimal degrees.
- ra2,dec2 (array-like) – The coordinates that will be searched for matches. In decimal degrees.
- match_radius (float) – The match radius in arcseconds to use for the match.
- includeself (bool) – If this is True, the object itself will be included in the match results.
- sortresults (bool) – If this is True, the match indices will be sorted by distance.
Returns: A tuple like the following is returned:
(True -> matches found or False -> no matches found, minimum distance between target and list, np.array of indices where list of coordinates is closer than `match_radius` arcseconds from the target, np.array of distances in arcseconds)
Return type: tuple
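A brute-force neighbor match of the kind described above can be sketched in pure Python. The names sep_arcsec and neighbors_within are hypothetical, plain lists stand in for NumPy arrays, and the includeself and sortresults options are not modeled (results are always sorted by distance).

```python
import math


def sep_arcsec(ra1, dec1, ra2, dec2):
    """Haversine angular separation in arcseconds (inputs in decimal degrees)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, a)))) * 3600.0


def neighbors_within(ra1, dec1, ra_list, dec_list, match_radius=60.0):
    """Return (found, min distance, matching indices, matching distances).

    Assumes ra_list and dec_list are non-empty and of equal length.
    """
    dists = [sep_arcsec(ra1, dec1, ra, dec) for ra, dec in zip(ra_list, dec_list)]
    matches = sorted((d, i) for i, d in enumerate(dists) if d < match_radius)
    indices = [i for _d, i in matches]
    distances = [d for d, _i in matches]
    return bool(matches), min(dists), indices, distances


ras = [10.0, 10.01, 12.0]
decs = [20.0 + 10.0 / 3600.0, 20.0, 20.0]
print(neighbors_within(10.0, 20.0, ras, decs, match_radius=60.0))
```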
-
astrobase.coordutils.
make_kdtree
(ra, decl)[source]¶ This makes a scipy.spatial.cKDTree on (ra, decl).
Parameters: ra,decl (array-like) – The right ascension and declination coordinate pairs in decimal degrees. Returns: The cKDTree object generated by this function is returned and can be used to run various spatial queries. Return type: scipy.spatial.cKDTree
-
astrobase.coordutils.
conesearch_kdtree
(kdtree, racenter, declcenter, searchradiusdeg, conesearchworkers=1)[source]¶ This does a cone-search around (racenter, declcenter) in kdtree.
Parameters: - kdtree (scipy.spatial.cKDTree) – This is a kdtree object generated by the make_kdtree function.
- racenter,declcenter (float or array-like) – This is the center coordinate to run the cone-search around in decimal degrees. If this is an np.array, will search for all coordinate pairs in the array.
- searchradiusdeg (float) – The search radius to use for the cone-search in decimal degrees.
- conesearchworkers (int) – The number of parallel workers to launch for the cone-search.
Returns: If (racenter, declcenter) is a single coordinate, this will return a list of the indices of the matching objects in the kdtree. If (racenter, declcenter) are array-likes, this will return an object array containing lists of matching object indices for each coordinate searched.
Return type: list or np.array of lists
-
astrobase.coordutils.
xmatch_kdtree
(kdtree, extra, extdecl, xmatchdistdeg, closestonly=True)[source]¶ This cross-matches between kdtree and (extra, extdecl) arrays.
Returns the indices of the kdtree and the indices of extra, extdecl that xmatch successfully.
Parameters: - kdtree (scipy.spatial.cKDTree) – This is a kdtree object generated by the make_kdtree function.
- extra,extdecl (array-like) – These are np.arrays of ‘external’ coordinates in decimal degrees that will be cross-matched against the objects in kdtree.
- xmatchdistdeg (float) – The match radius to use for the cross-match in decimal degrees.
- closestonly (bool) – If closestonly is True, then this function returns only the closest matching indices in (extra, extdecl) for each object in kdtree if there are any matches. Otherwise, it returns a list of indices in (extra, extdecl) for all matches within xmatchdistdeg between kdtree and (extra, extdecl).
Returns: Returns a tuple of the form:
(list of `kdtree` indices matching to external objects, list of all `extra`/`extdecl` indices that match to each element in `kdtree` within the specified cross-match distance)
Return type: tuple of lists
-
astrobase.coordutils.
total_proper_motion
(pmra, pmdecl, decl)[source]¶ This calculates the total proper motion of an object.
Parameters: - pmra (float or array-like) – The proper motion(s) in right ascension, measured in mas/yr.
- pmdecl (float or array-like) – The proper motion(s) in declination, measured in mas/yr.
- decl (float or array-like) – The declination of the object(s) in decimal degrees.
Returns: The total proper motion(s) of the object(s) in mas/yr.
Return type: float or array-like
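A scalar sketch of the total proper motion calculation. It assumes pmra is the raw rate of change of right ascension, so it is scaled by cos(decl) before being combined with pmdecl; many catalogs already report pmra multiplied by cos(decl), in which case this scaling should not be applied again. The function name total_pm is hypothetical.

```python
import math


def total_pm(pmra, pmdecl, decl):
    """Total proper motion in mas/yr from its RA and Dec components.

    Assumption: pmra does NOT already include the cos(decl) factor.
    """
    pmra_cosdec = pmra * math.cos(math.radians(decl))
    return math.hypot(pmra_cosdec, pmdecl)


print(total_pm(3.0, 4.0, 0.0))
print(total_pm(10.0, 0.0, 60.0))
```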
-
astrobase.coordutils.
reduced_proper_motion
(mag, propermotion)[source]¶ This calculates the reduced proper motion using the mag measurement provided.
Parameters: - mag (float or array-like) – The magnitude(s) to use to calculate the reduced proper motion(s).
- propermotion (float or array-like) – The total proper motion of the object(s). Use the total_proper_motion function to calculate this if you have pmra, pmdecl, and decl values. propermotion should be in mas/yr.
Returns: The reduced proper motion for the object(s). This is effectively a measure of the absolute magnitude in the band provided.
Return type: float or array-like
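A sketch of a reduced proper motion calculation, assuming the convention H = mag + 5 log10(mu) with mu converted from mas/yr to arcsec/yr. Some references add a constant of 5 to this expression, so the offset parameter here is a hypothetical knob for matching whichever convention is in use; check the library source for the exact form it applies.

```python
import math


def reduced_pm(mag, propermotion, offset=0.0):
    """Reduced proper motion from a magnitude and a total proper motion in mas/yr.

    Assumption: H = mag + 5*log10(mu / 1000) + offset, with offset 0 by default.
    """
    mu_arcsec_per_yr = propermotion / 1000.0
    return mag + 5.0 * math.log10(mu_arcsec_per_yr) + offset


print(reduced_pm(10.0, 1000.0))
print(reduced_pm(10.0, 100.0))
```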
-
astrobase.coordutils.
equatorial_to_galactic
(ra, decl, equinox='J2000')[source]¶ This converts from equatorial coords to galactic coords.
Parameters: - ra (float or array-like) – Right ascension value(s) in decimal degrees.
- decl (float or array-like) – Declination value(s) in decimal degrees.
- equinox (str) – The equinox that the coordinates are measured at. This must be recognizable by Astropy’s SkyCoord class.
Returns: The galactic coordinates (l, b) for each element of the input (ra, decl).
Return type: tuple of (float, float) or tuple of (np.array, np.array)
-
astrobase.coordutils.
galactic_to_equatorial
(gl, gb)[source]¶ This converts from galactic coords to equatorial coordinates.
Parameters: - gl (float or array-like) – Galactic longitude value(s) in decimal degrees.
- gb (float or array-like) – Galactic latitude value(s) in decimal degrees.
Returns: The equatorial coordinates (RA, DEC) for each element of the input (gl, gb) in decimal degrees. These are reported in the ICRS frame.
Return type: tuple of (float, float) or tuple of (np.array, np.array)
-
astrobase.coordutils.
xieta_from_radecl
(inra, indecl, incenterra, incenterdecl, deg=True)[source]¶ This returns the image-plane projected xi-eta coords for inra, indecl.
Parameters: - inra,indecl (array-like) – The equatorial coordinates to get the xi, eta coordinates for in decimal degrees or radians.
- incenterra,incenterdecl (float) – The center coordinate values to use to calculate the plane-projected coordinates around.
- deg (bool) – If this is True, the input angles are assumed to be in degrees and the output is in degrees as well.
Returns: This is the (xi, eta) coordinate pairs corresponding to the image-plane projected coordinates for each pair of input equatorial coordinates in (inra, indecl).
Return type: tuple of np.arrays
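The image-plane projection can be illustrated with the standard gnomonic (tangent-plane) formulas. This is a scalar sketch under the assumption that xi-eta here means the usual standard coordinates about the projection center; the function name xieta is hypothetical and array inputs are not handled.

```python
import math


def xieta(ra, decl, center_ra, center_decl, deg=True):
    """Gnomonic projection of (ra, decl) about (center_ra, center_decl).

    If deg is True, inputs and outputs are in degrees; otherwise radians.
    """
    if deg:
        ra, decl, center_ra, center_decl = map(
            math.radians, (ra, decl, center_ra, center_decl)
        )
    dra = ra - center_ra
    # Cosine of the angular distance from the projection center.
    denom = (math.sin(decl) * math.sin(center_decl)
             + math.cos(decl) * math.cos(center_decl) * math.cos(dra))
    xi = math.cos(decl) * math.sin(dra) / denom
    eta = (math.sin(decl) * math.cos(center_decl)
           - math.cos(decl) * math.sin(center_decl) * math.cos(dra)) / denom
    if deg:
        return math.degrees(xi), math.degrees(eta)
    return xi, eta


# A point 0.1 deg north of the center projects to xi = 0, eta close to 0.1 deg.
print(xieta(10.0, 20.1, 10.0, 20.0))
```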
astrobase.emailutils module¶
This is a small utility module to send email using an SMTP server that requires logins. The email settings are stored in a file called .emailsettings that should be located in the ~/.astrobase/ directory. This file should have permissions 0600 (so only you can read/write to it), and should contain the following info in a single row, separated by the | character:
<email user>|<email password>|<email server>
Example:
exampleuser@email.com|correcthorsebatterystaple|mail.example.com
NOTE: This assumes the email server uses STARTTLS encryption and listens on SMTP port 587. Most email servers support this.
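Reading a settings line in this format might look like the sketch below. The function name parse_email_settings is hypothetical, and the sketch assumes none of the three fields contains a | character.

```python
def parse_email_settings(line):
    """Parse '<email user>|<email password>|<email server>' into a dict."""
    fields = line.strip().split("|")
    if len(fields) != 3:
        raise ValueError("expected exactly three |-separated fields")
    user, password, server = fields
    return {"EMAIL_USER": user, "EMAIL_PASS": password, "EMAIL_SERVER": server}


settings = parse_email_settings(
    "exampleuser@email.com|correcthorsebatterystaple|mail.example.com\n"
)
print(settings["EMAIL_SERVER"])
```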
-
astrobase.emailutils.
send_email
(sender, subject, content, email_recipient_list, email_address_list, email_user=None, email_pass=None, email_server=None)[source]¶ This sends an email to addresses, informing them about events.
The email account settings are retrieved from the settings file as described above.
Parameters: - sender (str) – The name of the sender to use in the email header.
- subject (str) – Subject of the email.
- content (str) – Content of the email.
- email_recipient_list (list) – This is a list of email recipient names of the form: [‘Example Person 1’, ‘Example Person 2’, …]
- email_address_list (list) – This is a list of email recipient addresses of the form: [‘example1@example.com’, ‘example2@example.org’, …]
- email_user (str) – The username of the email server account that will send the emails. If this is None, the value of EMAIL_USER from the ~/.astrobase/.emailsettings file will be used. If that is None as well, this function won’t work.
- email_pass (str) – The password of the email server account that will send the emails. If this is None, the value of EMAIL_PASS from the ~/.astrobase/.emailsettings file will be used. If that is None as well, this function won’t work.
- email_server (str) – The address of the email server that will send the emails. If this is None, the value of EMAIL_SERVER from the ~/.astrobase/.emailsettings file will be used. If that is None as well, this function won’t work.
Returns: True if email sending succeeded. False if email sending failed.
Return type: bool
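To illustrate the STARTTLS-on-port-587 flow and the pairing of recipient names with addresses, here is a sketch built only on the Python standard library. The helpers build_message and send_message are hypothetical and are not how send_email is necessarily implemented; the sender address argument is also an assumption.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr


def build_message(sender, sender_address, subject, content, names, addresses):
    """Compose a message, pairing each recipient name with its address."""
    msg = EmailMessage()
    msg["From"] = formataddr((sender, sender_address))
    msg["To"] = ", ".join(formataddr(pair) for pair in zip(names, addresses))
    msg["Subject"] = subject
    msg.set_content(content)
    return msg


def send_message(msg, user, password, server, port=587):
    """Send over STARTTLS; returns True on success and False on failure."""
    try:
        with smtplib.SMTP(server, port, timeout=30) as smtp:
            smtp.starttls()              # upgrade the connection to TLS
            smtp.login(user, password)
            smtp.send_message(msg)
        return True
    except (smtplib.SMTPException, OSError):
        return False


# Only compose the message here; sending needs a real server and account.
msg = build_message(
    sender="Astrobase",
    sender_address="exampleuser@email.com",
    subject="Light curve batch finished",
    content="All light curves were processed.",
    names=["Example Person 1", "Example Person 2"],
    addresses=["example1@example.com", "example2@example.org"],
)
print(msg["To"])
```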
astrobase.gcputils module¶
This contains useful functions to set up Google Cloud Platform services for use with lcproc_gcp.py.
-
astrobase.gcputils.
make_gce_instances
()[source]¶ This makes new GCE worker nodes.
Use preemptible instances and startup/shutdown scripts to emulate AWS spot fleet behavior and run stuff at cheaper prices.
TODO: finish this
-
astrobase.gcputils.
gcs_get_file
(bucketname, filename, local_file, altexts=None, client=None, service_account_json=None, raiseonfail=False)[source]¶ This gets a single file from a Google Cloud Storage bucket.
Parameters: - bucketname (str) – The name of the GCS bucket to download the file from.
- filename (str) – The full name of the file to download, including all prefixes.
- local_file (str) – Path to where the downloaded file will be stored.
- altexts (None or list of str) – If not None, this is a list of alternate extensions to try for the file other than the one provided in filename. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use altexts=[‘’] to strip the .gz.
- client (google.cloud.storage.Client instance) – The instance of the Client to use to perform the download operation. If this is None, a new Client will be used. If this is None and service_account_json points to a downloaded JSON file with GCS credentials, a new Client with the provided credentials will be used. If this is not None, the existing Client instance will be used.
- service_account_json (str) – Path to a downloaded GCS credentials JSON file.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Path to the downloaded filename or None if the download was unsuccessful.
Return type: str
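The altexts behavior can be pictured as building a list of candidate object names to try in order. The sketch below assumes that each alternate extension replaces the final extension of filename; candidate_names is a hypothetical helper and the exact lookup order used by gcs_get_file may differ.

```python
import os.path


def candidate_names(filename, altexts=None):
    """Return the names to try: the original first, then each alternate."""
    candidates = [filename]
    root, _final_ext = os.path.splitext(filename)
    for altext in (altexts or []):
        # An empty string strips the final extension (e.g. '.gz').
        candidates.append(root + altext)
    return candidates


names = candidate_names("lcs/object-1234.sqlite.gz", altexts=[""])
print(names)
```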
-
astrobase.gcputils.
gcs_get_url
(url, altexts=None, client=None, service_account_json=None, raiseonfail=False)[source]¶ This gets a single file from a Google Cloud Storage bucket.
This uses the gs:// URL instead of a bucket name and key.
Parameters: - url (str) – GCS URL to download. This should begin with ‘gs://’.
- altexts (None or list of str) – If not None, this is a list of alternate extensions to try for the file other than the one provided in url. For example, to get anything that’s an .sqlite where .sqlite.gz is expected, use altexts=[‘’] to strip the .gz.
- client (google.cloud.storage.Client instance) – The instance of the Client to use to perform the download operation. If this is None, a new Client will be used. If this is None and service_account_json points to a downloaded JSON file with GCS credentials, a new Client with the provided credentials will be used. If this is not None, the existing Client instance will be used.
- service_account_json (str) – Path to a downloaded GCS credentials JSON file.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: Path to the downloaded filename or None if the download was unsuccessful.
Return type: str
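A gs:// URL carries the same information as a bucket name plus a key. The following sketch shows one way to split it using the standard library; split_gcs_url is a hypothetical helper, not an astrobase function.

```python
from urllib.parse import urlparse


def split_gcs_url(url):
    """Split 'gs://bucket/path/to/file' into (bucketname, filename)."""
    parsed = urlparse(url)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a valid gs:// URL: {url!r}")
    # The key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucketname, filename = split_gcs_url("gs://example-bucket/lcs/object-1234.sqlite.gz")
print(bucketname, filename)
```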
-
astrobase.gcputils.
gcs_put_file
(local_file, bucketname, service_account_json=None, client=None, raiseonfail=False)[source]¶ This puts a single file into a Google Cloud Storage bucket.
Parameters: - local_file (str) – Path to the file to upload to GCS.
- bucketname (str) – The GCS bucket to upload the file to.
- service_account_json (str) – Path to a downloaded GCS credentials JSON file.
- client (google.cloud.storage.Client instance) – The instance of the Client to use to perform the upload operation. If this is None, a new Client will be used. If this is None and service_account_json points to a downloaded JSON file with GCS credentials, a new Client with the provided credentials will be used. If this is not None, the existing Client instance will be used.
- raiseonfail (bool) – If True, will re-raise whatever Exception caused the operation to fail and break out immediately.
Returns: If the file upload is successful, returns the gs:// URL of the uploaded file. If it failed, will return None.
Return type: str or None
-
astrobase.gcputils.
gps_create_topic
()[source]¶ This creates a Google Pub/Sub topic.
TODO: finish this
-
astrobase.gcputils.
gps_delete_topic
()[source]¶ This deletes a Google Pub/Sub topic.
TODO: finish this
astrobase.lcdb module¶
Serves as a lightweight PostgreSQL DB interface for other modules in this project.
-
class
astrobase.lcdb.
LCDB
(database=None, user=None, password=None, host=None)[source]¶ Bases:
object
This is an object serving as an interface to a PostgreSQL DB.
LCDB’s main purpose is to avoid creating new postgres connections for each query; these are relatively expensive. Instead, we get new cursors when needed, and then pass these around as needed.
-
database
¶ Name of the database to connect to.
Type: str
-
user
¶ User name of the database server user.
Type: str
-
password
¶ Password for the database server user.
Type: str
-
host
¶ Database hostname or IP address to connect to.
Type: str
-
connection
¶ The underlying connection to the database.
Type: psycopg2.Connection object
-
cursors
¶ The keys of this dict are random hash strings, the values of this dict are the actual Cursor objects.
Type: dict of psycopg2.Cursor objects
-
open
(database, user, password, host)[source]¶ This opens a new database connection.
Parameters: - database (str) – Name of the database to connect to.
- user (str) – User name of the database server user.
- password (str) – Password for the database server user.
- host (str) – Database hostname or IP address to connect to.
-
open_default
()[source]¶ This opens the database connection using the default database parameters given in the ~/.astrobase/astrobase.conf file.
-
autocommit
()[source]¶ This sets the database connection to autocommit. Must be called before any cursors have been instantiated.
-
cursor
(handle, dictcursor=False)[source]¶ This gets or creates a DB cursor for the current DB connection.
Parameters: - handle (str) – The name of the cursor to look up in the existing list or if it doesn’t exist, the name to be used for a new cursor to be returned.
- dictcursor (bool) – If True, returns a cursor where each returned row can be addressed as a dictionary by column name.
Returns: The existing or newly created cursor for handle. Return type: psycopg2.Cursor instance
-
newcursor
(dictcursor=False)[source]¶ This creates a DB cursor for the current DB connection using a randomly generated handle. Returns a tuple with cursor and handle.
Parameters: dictcursor (bool) – If True, returns a cursor where each returned row can be addressed as a dictionary by column name. Returns: The tuple is of the form (handle, psycopg2.Cursor instance). Return type: tuple
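The handle-to-cursor bookkeeping described for LCDB can be sketched with a small registry class. To keep the example self-contained it uses an in-memory sqlite3 connection as a stand-in for psycopg2; CursorRegistry is hypothetical, and the (handle, cursor) tuple order follows the Returns entry above, which is an assumption about the real return value.

```python
import hashlib
import os
import sqlite3


class CursorRegistry:
    """Hypothetical stand-in: one connection, many named cursors."""

    def __init__(self, connection):
        self.connection = connection
        self.cursors = {}

    def cursor(self, handle):
        # Reuse the cursor if the handle is known, otherwise create one.
        if handle not in self.cursors:
            self.cursors[handle] = self.connection.cursor()
        return self.cursors[handle]

    def newcursor(self):
        # A random hash string serves as the handle for the new cursor.
        handle = hashlib.sha256(os.urandom(12)).hexdigest()
        return handle, self.cursor(handle)


registry = CursorRegistry(sqlite3.connect(":memory:"))
main_cursor = registry.cursor("main")
handle, extra_cursor = registry.newcursor()

main_cursor.execute("select 1 + 1")
print(main_cursor.fetchone()[0], len(registry.cursors))
```

The point of the pattern is that the connection is opened once and only cursors are created on demand.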
-
astrobase.magnitudes module¶
Contains various useful functions for converting between magnitude systems.
-
astrobase.magnitudes.
convert_constants
(jmag, hmag, kmag, cjhk, cjh, cjk, chk, cj, ch, ck)[source]¶ This converts between JHK and BVRI/SDSS mags.
Not meant to be used directly. See the functions below for a more sensible interface. This function does the grunt work of converting from JHK to either BVRI or SDSS ugriz, while taking care of missing values for any of jmag, hmag, or kmag.
Parameters: - jmag,hmag,kmag (float) – 2MASS J, H, Ks mags to use to convert.
- cjhk,cjh,cjk,chk,cj,ch,ck (lists) – Constants to use when converting.
Returns: The converted magnitude in SDSS or BVRI system.
Return type: float
-
astrobase.magnitudes.
jhk_to_bmag
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to a B magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted B band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_vmag
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to a V magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted V band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_rmag
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an R magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted R band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_imag
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an I magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted I band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_sdssu
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an SDSS u magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted SDSS u band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_sdssg
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an SDSS g magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted SDSS g band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_sdssr
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an SDSS r magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted SDSS r band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_sdssi
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an SDSS i magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted SDSS i band magnitude. Return type: float
-
astrobase.magnitudes.
jhk_to_sdssz
(jmag, hmag, kmag)[source]¶ Converts given J, H, Ks mags to an SDSS z magnitude value.
Parameters: jmag,hmag,kmag (float) – 2MASS J, H, Ks mags of the object. Returns: The converted SDSS z band magnitude. Return type: float
-
astrobase.magnitudes.
absolute_gaia_magnitude
(gaia_mag, gaia_parallax_mas, gaia_mag_err=None, gaia_parallax_err_mas=None)[source]¶ Calculates the GAIA absolute magnitude for an object (or array of objects).
Given a G mag and the parallax measured by GAIA, gets the absolute mag using the usual equation:
G - M_G = 5 x log10(d_pc) - 5
M_G = 5 - 5 x log10(d_pc) + G
Parameters: - gaia_mag (float or array-like) – The measured GAIA G magnitude.
- gaia_parallax_mas (float or array-like) – The measured parallax of the object in mas.
- gaia_mag_err (float or array-like or None) – The measurement error in GAIA G magnitude.
- gaia_parallax_err_mas (float or array-like or None) – The measurement error in GAIA parallax in mas.
Returns: - float or array-like – The absolute magnitude M_G of the object(s).
- tuple – If both _err input kwargs are provided, returns a tuple of the form (M_G float or array-like, M_G_err float or array-like).
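As a worked example of the distance-modulus equation above, the sketch below computes M_G from a G magnitude and a parallax in mas, using d_pc = 1000 / parallax_mas. The function name absolute_g_mag is hypothetical, and the error term uses standard first-order propagation, which is an assumption and may not match what absolute_gaia_magnitude does internally.

```python
import numpy as np


def absolute_g_mag(gaia_mag, parallax_mas, mag_err=None, parallax_err_mas=None):
    """M_G = 5 - 5 log10(d_pc) + G, with d_pc = 1000 / parallax_mas."""
    gaia_mag = np.asarray(gaia_mag, dtype=float)
    parallax_mas = np.asarray(parallax_mas, dtype=float)

    d_pc = 1000.0 / parallax_mas
    abs_mag = 5.0 - 5.0 * np.log10(d_pc) + gaia_mag

    if mag_err is None or parallax_err_mas is None:
        return abs_mag

    # First-order propagation (assumed): dM/dparallax = 5 / (ln(10) * parallax)
    parallax_term = (5.0 / np.log(10.0)) * (
        np.asarray(parallax_err_mas, dtype=float) / parallax_mas
    )
    abs_mag_err = np.sqrt(np.asarray(mag_err, dtype=float) ** 2 + parallax_term**2)
    return abs_mag, abs_mag_err


# A star at 10 pc (parallax = 100 mas) has M_G equal to its apparent G mag.
m_abs = absolute_g_mag(10.0, 100.0)
m_abs2, m_abs2_err = absolute_g_mag(12.0, 10.0, mag_err=0.01, parallax_err_mas=0.1)
print(float(m_abs), float(m_abs2), float(m_abs2_err))
```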
astrobase.timeutils module¶
Contains various useful tools for dealing with time in astronomical contexts.
-
astrobase.timeutils.
precess_coordinates
(ra, dec, epoch_one, epoch_two, jd=None, mu_ra=0.0, mu_dec=0.0, outscalar=False)[source]¶ Precesses target coordinates ra, dec from epoch_one to epoch_two.
This takes into account the jd of the observations, as well as the proper motion of the target mu_ra, mu_dec. Adapted from J. D. Hartman’s VARTOOLS/converttime.c [coordprecess].
Parameters: - ra,dec (float) – The equatorial coordinates of the object at epoch_one to precess in decimal degrees.
- epoch_one (float) – Origin epoch to precess from to target epoch. This is a float, like: 1985.0, 2000.0, etc.
- epoch_two (float) – Target epoch to precess from origin epoch. This is a float, like: 2000.0, 2018.0, etc.
- jd (float) – The full Julian date to use along with the proper motions in mu_ra, and mu_dec to handle proper motion along with the coordinate frame precession. If one of jd, mu_ra, or mu_dec is missing, the proper motion will not be used to calculate the final precessed coordinates.
- mu_ra,mu_dec (float) – The proper motion in mas/yr in right ascension and declination. If these are provided along with jd, the total proper motion of the object will be taken into account to calculate the final precessed coordinates.
- outscalar (bool) – If True, converts the output coordinates from one-element np.arrays to scalars.
Returns: precessed_ra, precessed_dec – A tuple of precessed equatorial coordinates in decimal degrees at epoch_two taking into account proper motion if jd, mu_ra, and mu_dec are provided.
Return type: tuple
-
astrobase.timeutils.
get_epochs_given_midtimes_and_period
(t_mid, period, err_t_mid=None, t0_fixed=None, t0_percentile=None, verbose=False)[source]¶ This calculates the future epochs for a transit, given a period and a starting epoch.
The equation used is:
t_mid = period*epoch + t0
Default behavior if no kwargs are used is to define t0 as the median finite time of the passed t_mid array.
Only one of err_t_mid or t0_fixed should be passed.
Parameters: - t_mid (np.array) – A np.array of transit mid-time measurements
- period (float) – The period used to calculate epochs, per the equation above. For typical use cases, a period precise to ~1e-5 days is sufficient to get correct epochs.
- err_t_mid (None or np.array) – If provided, contains the errors of the transit mid-time measurements. The zero-point epoch is then set equal to the average of the transit times, weighted as 1/err_t_mid^2 . This minimizes the covariance between the transit epoch and the period (e.g., Gibson et al. 2013). For standard O-C analysis this is the best method.
- t0_fixed (None or float) – If provided, use this t0 as the starting epoch. (Overrides all others).
- t0_percentile (None or float) – If provided, use this percentile of t_mid to define t0.
Returns: This is a tuple of the form (integer_epoch_array, t0). integer_epoch_array is an array of integer epochs (float-type), of length equal to the number of finite mid-times passed.
Return type: tuple
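The relation t_mid = period*epoch + t0 can be inverted to get integer epochs by rounding. The sketch below is a simplified illustration, not the astrobase implementation: epochs_from_midtimes is a hypothetical name, the weighted branch uses the plain weighted mean as t0, and the t0_percentile option is left out.

```python
import numpy as np


def epochs_from_midtimes(t_mid, period, err_t_mid=None, t0_fixed=None):
    """Return (integer_epoch_array, t0) for the finite entries of t_mid."""
    t_mid = np.asarray(t_mid, dtype=float)
    finite = np.isfinite(t_mid)
    t_finite = t_mid[finite]

    if t0_fixed is not None:
        t0 = t0_fixed
    elif err_t_mid is not None:
        # Weighted mean with weights 1/err^2 (simplified choice of t0)
        weights = 1.0 / np.asarray(err_t_mid, dtype=float)[finite] ** 2
        t0 = np.average(t_finite, weights=weights)
    else:
        # Default: median of the finite mid-times
        t0 = np.median(t_finite)

    # Invert t_mid = period*epoch + t0 and round to the nearest integer.
    epochs = np.round((t_finite - t0) / period)
    return epochs, t0


t_mid = [100.0, 102.5, np.nan, 105.0, 110.0, 112.5]
epochs, t0 = epochs_from_midtimes(t_mid, period=2.5)
print(epochs, t0)
```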
-
astrobase.timeutils.
unixtime_to_jd
(unix_time)[source]¶ This converts UNIX time in seconds to a Julian date in UTC (JD_UTC).
Parameters: unix_time (float) – A UNIX time in decimal seconds since the 1970 UNIX epoch. Returns: jd – The Julian date corresponding to the provided UNIX time. Return type: float
-
astrobase.timeutils.
datetime_to_jd
(dt)[source]¶ This converts a Python datetime object (naive, time in UT) to JD_UTC.
Parameters: dt (datetime) – A naive Python datetime object (e.g. with no tz attribute) measured at UTC. Returns: jd – The Julian date corresponding to the datetime object. Return type: float
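Both conversions reduce to the fact that the UNIX epoch (1970-01-01T00:00:00 UTC) corresponds to JD 2440587.5. The sketch below uses only the standard library; the function names are hypothetical, and like UNIX time itself it ignores leap seconds, so it may differ slightly from what the astrobase functions return.

```python
from datetime import datetime, timezone

UNIX_EPOCH_JD = 2440587.5   # JD of 1970-01-01T00:00:00 UTC
SECONDS_PER_DAY = 86400.0


def unix_to_jd(unix_time):
    """Convert UNIX seconds to a Julian date (leap seconds ignored)."""
    return unix_time / SECONDS_PER_DAY + UNIX_EPOCH_JD


def naive_utc_datetime_to_jd(dt):
    """Convert a naive datetime, assumed to be in UTC, to a Julian date."""
    # Attach UTC explicitly so timestamp() does not apply the local timezone.
    return unix_to_jd(dt.replace(tzinfo=timezone.utc).timestamp())


jd_epoch = unix_to_jd(0.0)
jd_j2000 = naive_utc_datetime_to_jd(datetime(2000, 1, 1, 12, 0, 0))
print(jd_epoch, jd_j2000)
```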
-
astrobase.timeutils.
jd_to_datetime
(jd, returniso=False)[source]¶ This converts a UTC JD to a Python datetime object or ISO date string.
Parameters: - jd (float) – The Julian date measured at UTC.
- returniso (bool) – If False, returns a naive Python datetime object corresponding to jd. If True, returns the ISO format string corresponding to the date and time at UTC from jd.
Returns: Depending on the value of returniso.
Return type: datetime or str
-
astrobase.timeutils.
jd_now
()[source]¶ Gets the Julian date at the current time.
Returns: The current Julian date in days. Return type: float
-
astrobase.timeutils.
jd_to_mjd
(jd)[source]¶ Converts Julian Date to Modified Julian Date.
Parameters: jd (float) – The Julian date measured at UTC. Returns: mjd – mjd = jd - 2400000.5 Return type: float
-
astrobase.timeutils.
mjd_to_jd
(mjd)[source]¶ Converts Modified Julian date to Julian Date.
Parameters: mjd (float) – The Modified Julian date measured at UTC. Returns: jd – jd = mjd + 2400000.5 Return type: float
-
astrobase.timeutils.
jd_corr
(jd, ra, dec, obslon=None, obslat=None, obsalt=None, jd_type='bjd')[source]¶ Returns BJD_TDB or HJD_TDB for input JD_UTC.
The equation used is:
BJD_TDB = JD_UTC + JD_to_TDB_corr + romer_delay
where:
- JD_to_TDB_corr is the difference between UTC and TDB JDs
- romer_delay is the delay caused by the finite speed of light over the Earth-Sun distance
This is based on the code at:
https://mail.scipy.org/pipermail/astropy/2014-April/003148.html
Note that this does not correct for:
- precession of coordinates if the epoch is not 2000.0
- precession of coordinates if the target has a proper motion
- Shapiro delay
- Einstein delay
Parameters: - jd (float or array-like) – The Julian date(s) measured at UTC.
- ra,dec (float) – The equatorial coordinates of the object in decimal degrees.
- obslon,obslat,obsalt (float or None) – The longitude, latitude of the observatory in decimal degrees and altitude of the observatory in meters. If these are not provided, the corrected JD will be calculated with respect to the center of the Earth.
- jd_type ({'bjd','hjd'}) – Conversion type to perform, either to Barycentric Julian Date (‘bjd’) or to Heliocentric Julian Date (‘hjd’).
Returns: The converted BJD or HJD.
Return type: float or np.array
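The romer_delay term in the equation above is purely geometric: it is the projection of the observer's position vector onto the direction of the target, divided by the speed of light. The sketch below shows only that term. It assumes the caller already has the observer position relative to the chosen origin (barycenter or Sun) in AU on equatorial axes, for instance from an ephemeris; romer_delay_days is a hypothetical helper and the UTC to TDB correction is not covered.

```python
import numpy as np

AU_LIGHT_TIME_SEC = 499.004783836   # light travel time across 1 AU
SECONDS_PER_DAY = 86400.0


def romer_delay_days(ra_deg, dec_deg, observer_xyz_au):
    """Projected light travel time (days) from the observer to the origin."""
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)

    # Unit vector pointing from the origin toward the target
    target_unit = np.array(
        [np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]
    )

    # Positive when the observer is displaced toward the target, i.e. the
    # light reaches the observer before it would reach the origin.
    projected_au = np.dot(np.asarray(observer_xyz_au, dtype=float), target_unit)
    return projected_au * AU_LIGHT_TIME_SEC / SECONDS_PER_DAY


# Observer 1 AU from the origin, exactly in the direction of the target
delay_toward = romer_delay_days(0.0, 0.0, [1.0, 0.0, 0.0])
# Observer displaced perpendicular to the line of sight: no delay
delay_perp = romer_delay_days(0.0, 0.0, [0.0, 1.0, 0.0])
print(delay_toward * SECONDS_PER_DAY, delay_perp)
```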
Subpackages¶
astrobase.fakelcs package¶
This contains various modules that run variable star classification and characterize its reliability and completeness via simulating LCs:
- astrobase.fakelcs.generation: fake light curve generation and injection of variability.
- astrobase.fakelcs.recovery: recovery of fake light curve variability and periodic variable stars.
This generates light curves of variable stars using the astrobase.lcmodels package, adds noise and observation sampling to them based on given parameters (or example light curves). See fakelcrecovery.py for functions that run a full recovery simulation.
NOTE 1: the parameters for these light curves are currently all chosen from uniform distributions, which obviously doesn’t reflect reality (some of the current parameter upper and lower limits are realistic, however). Some of these will be updated with real-life distributions as soon as I find them, especially for periods and amplitudes (along with references).
NOTE 2: To generate custom distributions, one can subclass scipy.stats.rv_continuous and override the _pdf and _cdf methods (or just the _rvs method directly to get the distributed variables if the distribution’s location and scale don’t really matter). This is described here:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html
and doesn’t seem to be restricted to distributions described by analytic functions only. It’s probably possible to get a histogram of some complex distribution in parameter space and use a kernel-density estimator to get the PDF and CDF, e.g.:
http://scikit-learn.org/stable/modules/density.html#kernel-density.
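As an illustration of NOTE 2, the sketch below subclasses scipy.stats.rv_continuous with a deliberately simple density, pdf(x) = 2x on [0, 1], and then freezes it with a location and scale so it can be used like the other frozen distributions. The class LinearRampDist and the chosen loc/scale values are hypothetical and carry no astrophysical meaning.

```python
import numpy as np
from scipy import stats


class LinearRampDist(stats.rv_continuous):
    """Hypothetical distribution with pdf(x) = 2x on the interval [0, 1]."""

    def _pdf(self, x):
        return 2.0 * x

    def _cdf(self, x):
        return x * x

    def _ppf(self, q):
        # Analytic inverse of the CDF; avoids numerical root finding in rvs()
        return np.sqrt(q)


ramp = LinearRampDist(a=0.0, b=1.0, name="linear_ramp")

# Freezing with loc and scale maps the support onto [0.5, 2.5].
period_dist = ramp(loc=0.5, scale=2.0)
samples = period_dist.rvs(size=1000, random_state=42)

print(period_dist.cdf(1.5), samples.min(), samples.max())
```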
NOTE 3: any distribution parameter below having to do with magnitudes/flux is by default in MAGNITUDES. So depth, amplitude, etc. distributions and their limits will have to be adjusted appropriately for fluxes. IT IS NOT SUFFICIENT to just set magsarefluxes = True.
FIXME: in the future, we’ll do all the amplitude, etc. distributions in differential fluxes canonically, and then take logs where appropriate if magsarefluxes = False.
FIXME: we should add RA/DEC values that are taken from GAIA if we provide a radec box for the simulation to take place in. in this way, we can parameterize blending and take it into account in the recovery as well.
FIXME: check if object coordinates end up so that two or more objects lie within a chosen blend radius. if this happens, we should check if the blender(s) are variable, and add in some fraction of their phased light curve to the blendee. if the blender(s) are not variable, add in a constant fraction of the brightness to the blendee’s light curve. the blending fraction is multiplied into the light curve of the blender(s) and the resulting flux added to the blendee’s light curve.
- given the FWHM of the instrument, figure out the overlap
- we need to calculate the pixel area for blendee and the sum of pixel areas covered by the blenders. this will require input kwargs for pixel size of the detector and FWHM of the star (this might need to be calculated based on the brightness of the star)
FIXME: add input from TRILEGAL produced .dat files for color and mag information. This will let us generate pulsating variables with their actual colors.
-
astrobase.fakelcs.generation.
generate_transit_lightcurve
(times, mags=None, errs=None, paramdists={'transitdepth': <scipy.stats._distn_infrastructure.rv_frozen object>, 'transitduration': <scipy.stats._distn_infrastructure.rv_frozen object>, 'transitperiod': <scipy.stats._distn_infrastructure.rv_frozen object>}, magsarefluxes=False)[source]¶ This generates fake planet transit light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'transitperiod', 'transitdepth', 'transitduration'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The ingress duration will be automatically chosen from a uniform distribution ranging from 0.05 to 0.5 of the transitduration.
The transitdepth will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'planet',
 'params': {'transitperiod': generated value of period,
            'transitepoch': generated value of epoch,
            'transitdepth': generated value of transit depth,
            'transitduration': generated value of transit duration,
            'ingressduration': generated value of transit ingress duration},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'transitperiod',
 'varamplitude': the generated amplitude of variability == 'transitdepth'}
Return type: dict
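The paramdists argument and the automatic choices described above (epoch between times.min() and times.max(), ingress duration between 0.05 and 0.5 of the transit duration) can be sketched as follows. This does not call astrobase; it only shows how frozen scipy.stats distributions might be set up and sampled. All numeric limits here are illustrative assumptions, not the package defaults.

```python
import numpy as np
from scipy import stats

times = np.linspace(0.0, 30.0, 1000)

# Frozen distributions: stats.uniform(loc, scale) covers [loc, loc + scale].
paramdists = {
    "transitperiod": stats.uniform(loc=0.1, scale=49.9),
    "transitdepth": stats.uniform(loc=1.0e-4, scale=2.0e-2),
    "transitduration": stats.uniform(loc=0.01, scale=0.59),
}

rng = np.random.RandomState(0)

# Draw one value per model parameter.
params = {
    key: float(dist.rvs(random_state=rng)) for key, dist in paramdists.items()
}

# Epoch: uniform over the time base
params["transitepoch"] = float(rng.uniform(times.min(), times.max()))

# Ingress duration: uniform between 0.05 and 0.5 of the transit duration
duration = params["transitduration"]
params["ingressduration"] = float(rng.uniform(0.05 * duration, 0.5 * duration))

print(params)
```

A dict like paramdists above is what would be passed as the paramdists keyword argument.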
-
astrobase.fakelcs.generation.
generate_eb_lightcurve
(times, mags=None, errs=None, paramdists={'depthratio': <scipy.stats._distn_infrastructure.rv_frozen object>, 'pdepth': <scipy.stats._distn_infrastructure.rv_frozen object>, 'pduration': <scipy.stats._distn_infrastructure.rv_frozen object>, 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'secphase': <scipy.stats._distn_infrastructure.rv_frozen object>}, magsarefluxes=False)[source]¶ This generates fake EB light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'pdepth', 'pduration', 'depthratio', 'secphase'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The pdepth will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'EB',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'pdepth': generated value of primary eclipse depth,
            'pduration': generated value of primary eclipse duration,
            'depthratio': generated value of primary/secondary eclipse depth ratio},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'pdepth'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_flare_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'decayconst': <scipy.stats._distn_infrastructure.rv_frozen object>, 'nflares': [1, 5], 'risestdev': <scipy.stats._distn_infrastructure.rv_frozen object>}, magsarefluxes=False)[source]¶ This generates fake flare light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'amplitude', 'nflares', 'risestdev', 'decayconst'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The flare_peak_time for each flare will be generated automatically between times.min() and times.max() using a uniform distribution.
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'flare',
 'params': {'amplitude': generated value of flare amplitudes,
            'nflares': generated value of number of flares,
            'risestdev': generated value of stdev of rise time,
            'decayconst': generated value of decay constant,
            'peaktime': generated value of flare peak time},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_sinusoidal_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [2, 10], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 0.0}, magsarefluxes=False)[source]¶ This generates fake sinusoidal light curves.
This can be used for a variety of sinusoidal variables, e.g. RRab, RRc, Cepheids, Miras, etc. The functions that generate these model LCs below implement the following table:
## FOURIER PARAMS FOR SINUSOIDAL VARIABLES
#
# type        fourier             period [days]
#             order     dist      limits         dist
# RRab        8 to 10   uniform   0.45--0.80     uniform
# RRc         3 to 6    uniform   0.10--0.40     uniform
# HADS        7 to 9    uniform   0.04--0.10     uniform
# rotator     2 to 5    uniform   0.80--120.0    uniform
# LPV         2 to 5    uniform   250--500.0     uniform
FIXME: for better model LCs, figure out how scipy.signal.butter works and low-pass filter using scipy.signal.filtfilt.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude', 'phioffset'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'sinusoidal',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
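The 'fourieramps' and 'fourierphases' entries describe a Fourier series evaluated on the phased time base. The sketch below shows one common cosine-series convention; it is an assumption for illustration, and the model in astrobase.lcmodels may use a different normalization or phase convention. The function fourier_series_mags is hypothetical.

```python
import numpy as np


def fourier_series_mags(times, period, epoch, amplitudes, phases, base_mag=0.0):
    """Evaluate base_mag + sum_k A_k * cos(2*pi*k*phase + phi_k), k = 1..N."""
    times = np.asarray(times, dtype=float)
    amplitudes = np.asarray(amplitudes, dtype=float)
    phases = np.asarray(phases, dtype=float)

    # Orbital/pulsation phase in [0, 1)
    phase = ((times - epoch) / period) % 1.0

    # Harmonic orders 1..N, broadcast against the phase array
    orders = np.arange(1, amplitudes.size + 1)
    terms = amplitudes[:, None] * np.cos(
        2.0 * np.pi * orders[:, None] * phase[None, :] + phases[:, None]
    )
    return base_mag + terms.sum(axis=0)


times = np.array([0.0, 0.5, 1.0, 1.5])
single = fourier_series_mags(times, period=1.0, epoch=0.0,
                             amplitudes=[0.1], phases=[0.0])
multi = fourier_series_mags(times, period=1.0, epoch=0.0,
                            amplitudes=[0.1, 0.05, 0.02],
                            phases=[0.0, 0.3, 0.6])
print(single, multi)
```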
-
astrobase.fakelcs.generation.
generate_rrab_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [8, 11], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 3.141592653589793}, magsarefluxes=False)[source]¶ This generates fake RRab light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'RRab',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_rrc_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [2, 3], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 4.71238898038469}, magsarefluxes=False)[source]¶ This generates fake RRc light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'RRc',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_hads_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [5, 10], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 3.141592653589793}, magsarefluxes=False)[source]¶ This generates fake HADS light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'HADS',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_rotator_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [2, 3], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 4.71238898038469}, magsarefluxes=False)[source]¶ This generates fake rotator light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'rotator',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_lpv_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [2, 3], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 4.71238898038469}, magsarefluxes=False)[source]¶ This generates fake long-period-variable (LPV) light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'LPV',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
generate_cepheid_lightcurve
(times, mags=None, errs=None, paramdists={'amplitude': <scipy.stats._distn_infrastructure.rv_frozen object>, 'fourierorder': [8, 11], 'period': <scipy.stats._distn_infrastructure.rv_frozen object>, 'phioffset': 3.141592653589793}, magsarefluxes=False)[source]¶ This generates fake Cepheid light curves.
Parameters: - times (np.array) – This is an array of time values that will be used as the time base.
- mags,errs (np.array) – These arrays will have the model added to them. If either is None, np.full_like(times, 0.0) will be used as a substitute and the model light curve will be centered around 0.0.
- paramdists (dict) –
This is a dict containing parameter distributions to use for the model params, containing the following keys
{'period', 'fourierorder', 'amplitude'}
The values of these keys should all be ‘frozen’ scipy.stats distribution objects, e.g.:
https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions
The variability epoch will be automatically chosen from a uniform distribution between times.min() and times.max().
The amplitude will be flipped automatically as appropriate if magsarefluxes=True.
- magsarefluxes (bool) – If the generated time series is meant to be a flux time-series, set this to True to get the correct sign of variability amplitude.
Returns: A dict of the form below is returned:
{'vartype': 'cepheid',
 'params': {'period': generated value of period,
            'epoch': generated value of epoch,
            'amplitude': generated value of amplitude,
            'fourierorder': generated value of fourier order,
            'fourieramps': generated values of fourier amplitudes,
            'fourierphases': generated values of fourier phases},
 'times': the model times,
 'mags': the model mags,
 'errs': the model errs,
 'varperiod': the generated period of variability == 'period',
 'varamplitude': the generated amplitude of variability == 'amplitude'}
Return type: dict
-
astrobase.fakelcs.generation.
make_fakelc
(lcfile, outdir, magrms=None, randomizemags=True, randomizecoords=False, lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None)[source]¶ This preprocesses an input real LC and sets it up to be a fake LC.
Parameters: - lcfile (str) – This is an input light curve file that will be used to copy over the time-base. This will be used to generate the time-base for fake light curves to provide a realistic simulation of the observing window function.
- outdir (str) – The output directory where the fake light curve will be written.
- magrms (dict) – This is a dict containing the SDSS r mag-RMS (SDSS rmag-MAD preferably) relation based on all light curves that the input lcfile is from. This will be used to generate the median mag and noise corresponding to the magnitude chosen for this fake LC.
- randomizemags (bool) – If this is True, then a random mag between the first and last magbin in magrms will be chosen as the median mag for this light curve. This choice will be weighted by the mag bin probability obtained from the magrms kwarg. Otherwise, the median mag will be taken from the input lcfile’s lcdict[‘objectinfo’][‘sdssr’] key or a transformed SDSS r mag generated from the input lcfile’s lcdict[‘objectinfo’][‘jmag’], [‘hmag’], and [‘kmag’] keys. The magrms relation for each magcol will be used to generate Gaussian noise at the correct level for the magbin this light curve’s median mag falls into.
- randomizecoords (bool) – If this is True, will randomize the RA, DEC of the output fake object and not copy over the RA/DEC from the real input object.
- lcformat (str) – This is the formatkey associated with your input real light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve specified in lcfile.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curve.
- magcols (list of str or None) – The magcol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curve.
- errcols (list of str or None) – The errcol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curve.
Returns: A tuple of the following form is returned:
(fakelc_fpath, fakelc_lcdict['columns'], fakelc_lcdict['objectinfo'], fakelc_lcdict['moments'])
Return type: tuple
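The randomizemags behavior described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the function names, the use of bin centers as the chosen median mag, and the conversion of the MAD to a Gaussian sigma with the factor 1.483 are all assumptions made for this example.

```python
import random


def pick_mag_and_noise(binned_mags, binned_mads, bin_probs, rng):
    """Choose a median mag weighted by bin probability and its noise level.

    Hypothetical helper: picks a bin center rather than interpolating.
    """
    idx = rng.choices(range(len(binned_mags)), weights=bin_probs, k=1)[0]
    median_mag = binned_mags[idx]
    # Assumption: scale the MAD by 1.483 to approximate a Gaussian sigma.
    sigma = 1.483 * binned_mads[idx]
    return median_mag, sigma


def noisy_constant_lc(npoints, median_mag, sigma, rng):
    """Generate a non-variable light curve: median mag plus Gaussian noise."""
    return [rng.gauss(median_mag, sigma) for _ in range(npoints)]


rng = random.Random(42)
bin_mags = [9.5, 10.5, 11.5, 12.5]
bin_mads = [0.002, 0.004, 0.008, 0.02]
bin_probs = [0.1, 0.2, 0.3, 0.4]

chosen_mag, chosen_sigma = pick_mag_and_noise(bin_mags, bin_mads, bin_probs, rng)
fake_mags = noisy_constant_lc(100, chosen_mag, chosen_sigma, rng)
```

Fainter bins get larger noise in this example because the MAD values grow with magnitude, which is the usual shape of a mag-RMS relation.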
-
astrobase.fakelcs.generation.
collection_worker
(task)[source]¶ This wraps process_fakelc for make_fakelc_collection below.
Parameters: task (tuple) – This is of the form:
task[0] = lcfile
task[1] = outdir
task[2] = magrms
task[3] = dict with keys: {'lcformat', 'timecols', 'magcols', 'errcols', 'randomizeinfo'}
Returns: This returns a tuple of the form: (fakelc_fpath, fakelc_lcdict['columns'], fakelc_lcdict['objectinfo'], fakelc_lcdict['moments'])
Return type: tuple
-
astrobase.fakelcs.generation.
make_fakelc_collection
(lclist, simbasedir, magrmsfrom, magrms_interpolate='quadratic', magrms_fillvalue='extrapolate', maxlcs=25000, maxvars=2000, randomizemags=True, randomizecoords=False, vartypes=('EB', 'RRab', 'RRc', 'cepheid', 'rotator', 'flare', 'HADS', 'planet', 'LPV'), lcformat='hat-sql', lcformatdir=None, timecols=None, magcols=None, errcols=None)[source]¶ This prepares light curves for the recovery sim.
Collects light curves from lclist using a uniform sampling among them. Copies them to the simbasedir, zeroes out their mags and errs but keeps their time bases, also keeps their RMS and median mags for later use. Calculates the mag-rms relation for the entire collection and writes that to the simbasedir as well.
The purpose of this function is to copy over the time base and mag-rms relation of an existing light curve collection to use it as the basis for a variability recovery simulation.
This returns a pickle written to the simbasedir that contains all the information for the chosen ensemble of fake light curves and writes all generated light curves to the simbasedir/lightcurves directory. Run the add_variability_to_fakelc_collection function after this function to add variability of the specified type to these generated light curves.
Parameters: - lclist (list of str) – This is a list of existing project light curves. This can be generated from astrobase.lcproc.catalogs.make_lclist() or similar.
- simbasedir (str) – This is the directory where the fake light curves and their information will be copied.
- magrmsfrom (str or dict) –
This is used to generate magnitudes and RMSes for the objects in the output collection of fake light curves. This arg is either a string pointing to an existing pickle file that must contain a dict or a dict variable that MUST have the following key-vals at a minimum:
{'<magcol1_name>': {'binned_sdssr_median': array of median mags for each magbin,
                    'binned_lcmad_median': array of LC MAD values per magbin},
 '<magcol2_name>': {'binned_sdssr_median': array of median mags for each magbin,
                    'binned_lcmad_median': array of LC MAD values per magbin},
 ...}
where magcol1_name, etc. are the same as the magcols listed in the magcols kwarg (or the default magcols for the specified lcformat). Examples of the magrmsfrom dict (or pickle) required can be generated by the astrobase.lcproc.varthreshold.variability_threshold() function.
- magrms_interpolate,magrms_fillvalue (str) –
These are arguments that will be passed directly to the scipy.interpolate.interp1d function to generate interpolating functions for the mag-RMS relation. See:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html
for details.
- maxlcs (int) – This is the total number of light curves to choose from lclist and generate as fake LCs.
- maxvars (int) – This is the total number of fake light curves that will be marked as variable.
- vartypes (list of str) – This is a list of variable types to put into the collection. The vartypes for each fake variable star will be chosen uniformly from this list.
- lcformat (str) – This is the formatkey associated with your input real light curves’ format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in lclist.
- lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
- timecols (list of str or None) – The timecol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curves.
- magcols (list of str or None) – The magcol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curves.
- errcols (list of str or None) – The errcol keys to use from the input lcdict in generating the fake light curve. Fake LCs will be generated for each timecol/magcol/errcol combination in the input light curves.
Returns: Returns the string file name of a pickle containing all of the information for the fake LC collection that has been generated.
Return type: str
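The roles of maxlcs, maxvars, and vartypes can be illustrated with a small selection sketch. It only mirrors the description above (uniform sampling from lclist, a fixed number of objects marked variable, vartypes drawn uniformly); the function name and the record layout are hypothetical and are not the structure the library writes to its pickle.

```python
import random


def choose_collection(lclist, maxlcs, maxvars, vartypes, rng):
    """Pick light curves uniformly and mark some of them as variables."""
    nlcs = min(maxlcs, len(lclist))
    chosen = rng.sample(lclist, nlcs)

    # Mark at most maxvars of the chosen light curves as variable.
    nvars = min(maxvars, nlcs)
    variable_idx = set(rng.sample(range(nlcs), nvars))

    records = []
    for i, lcfile in enumerate(chosen):
        isvar = i in variable_idx
        records.append({
            "lcfile": lcfile,
            "isvariable": isvar,
            # Each fake variable gets a type drawn uniformly from vartypes.
            "vartype": rng.choice(vartypes) if isvar else None,
        })
    return records


example_lclist = ["lc-%03d.sqlite.gz" % i for i in range(100)]
example_vartypes = ["EB", "RRab", "RRc", "cepheid", "rotator"]
collection = choose_collection(
    example_lclist, 20, 5, example_vartypes, random.Random(1)
)
```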
-
astrobase.fakelcs.generation.
add_fakelc_variability
(fakelcfile, vartype, override_paramdists=None, magsarefluxes=False, overwrite=False)[source]¶ This adds variability of the specified type to the fake LC.
The procedure is (for each magcol):
- read the fakelcfile, get the stored moments and vartype info
- add the periodic variability specified in vartype and varparamdists. If vartype is None, then do nothing in this step. If override_vartype is not None, override stored vartype with specified vartype. If override_varparamdists provided, override with specified varparamdists. NOTE: the varparamdists must make sense for the vartype, otherwise, weird stuff will happen.
- add the median mag level stored in fakelcfile to the time series
- add Gaussian noise to the light curve as specified in fakelcfile
- add a varinfo key and dict to the lcdict with varperiod, varepoch, varparams
- write back to fake LC pickle
- return the varinfo dict to the caller
Parameters: - fakelcfile (str) – The name of the fake LC file to process.
- vartype (str) – The type of variability to add to this fake LC file.
- override_paramdists (dict) – A parameter distribution dict as in the generate_XX_lightcurve functions above. If provided, will override the distribution stored in the input fake LC file itself.
- magsarefluxes (bool) – Sets if the variability amplitude is in fluxes and not magnitudes.
- overwrite (bool) – This overwrites the input fake LC file with a new variable LC even if it’s been processed before.
Returns: A dict of the following form is returned:
{'objectid':lcdict['objectid'], 'lcfname':fakelcfile, 'actual_vartype':vartype, 'actual_varparams':lcdict['actual_varparams']}
Return type: dict
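The per-magcol procedure listed above (model, then median mag level, then Gaussian noise, then a varinfo record) can be sketched as below. A plain sinusoid stands in for the real variability model, and the function name and argument list are hypothetical; the library's own models are the generate_XX_lightcurve functions documented earlier.

```python
import math
import random


def add_variability_sketch(times, period, epoch, amplitude,
                           median_mag, noise_sigma, rng):
    """Build mags = median level + periodic model + Gaussian noise."""
    mags = []
    for t in times:
        phase = (t - epoch) / period
        # Stand-in model: a sinusoid with peak-to-peak size `amplitude`.
        model = 0.5 * amplitude * math.sin(2.0 * math.pi * phase)
        mags.append(median_mag + model + rng.gauss(0.0, noise_sigma))

    # Record what was injected so recovery can be checked later.
    varinfo = {
        "varperiod": period,
        "varepoch": epoch,
        "varparams": {"amplitude": amplitude},
    }
    return mags, varinfo


sketch_times = [0.0, 0.25, 0.5, 0.75]
sketch_mags, sketch_varinfo = add_variability_sketch(
    sketch_times, period=1.0, epoch=0.0, amplitude=0.4,
    median_mag=12.0, noise_sigma=0.0, rng=random.Random(0),
)
```

With noise_sigma set to zero, as in this call, the output is the noiseless model shifted to the median mag, which makes the order of the steps easy to check.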
-
astrobase.fakelcs.generation.
add_variability_to_fakelc_collection
(simbasedir, override_paramdists=None, overwrite_existingvar=False)[source]¶ This adds variability and noise to all fake LCs in simbasedir.
If an object is marked as variable in the fakelcs-info.pkl file in simbasedir, a variable signal will be added to its light curve based on its selected type, default period and amplitude distribution, the appropriate params, etc. The epochs for each variable object will be chosen uniformly from its time range (and may not necessarily fall on an actual observed time). Nonvariable objects will only have noise added as determined by their params; no variable signal will be added.
Parameters: - simbasedir (str) – The directory containing the fake LCs to process.
- override_paramdists (dict) –
This can be used to override the stored variable parameters in each fake LC. It should be a dict of the following form:
{'<vartype1>': {'<param1>': a scipy.stats distribution function or the np.random.randint function,
                ...
                '<paramN>': a scipy.stats distribution function or the np.random.randint function}}
for any vartype in VARTYPE_LCGEN_MAP. These are used to override the default parameter distributions for each variable type.
- overwrite_existingvar (bool) – If this is True, then will overwrite any existing variability in the input fake LCs in simbasedir.
Returns: This returns a dict containing the fake LC filenames as keys and variability info for each as values.
Return type: dict
This is a companion module for fakelcs/generation.py. It runs LCs generated using functions in that module through variable star detection and classification to see how well they are recovered.
-
astrobase.fakelcs.recovery.
read_fakelc
(fakelcfile)[source]¶ This just reads a pickled fake LC.
Parameters: fakelcfile (str) – The fake LC file to read.
Returns: This returns an lcdict.
Return type: dict
-
astrobase.fakelcs.recovery.
get_varfeatures
(simbasedir, mindet=1000, nworkers=None)[source]¶ This runs lcproc.lcvfeatures.parallel_varfeatures on fake LCs in simbasedir.
Parameters: - simbasedir (str) – The directory containing the fake LCs to process.
- mindet (int) – The minimum number of detections needed to accept an LC and process it.
- nworkers (int or None) – The number of parallel workers to use when extracting variability features from the input light curves.
Returns: The path to the varfeatures pickle created after running the lcproc.lcvfeatures.parallel_varfeatures function.
Return type: str
-
astrobase.fakelcs.recovery.
precision
(ntp, nfp)[source]¶ This calculates precision.
https://en.wikipedia.org/wiki/Precision_and_recall
Parameters: - ntp (int) – The number of true positives.
- nfp (int) – The number of false positives.
Returns: The precision calculated using ntp/(ntp + nfp).
Return type: float
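A minimal sketch of the ntp/(ntp + nfp) formula follows. It is not the library's source; the function name and the choice to return NaN when nothing was flagged are assumptions added for illustration.

```python
def precision_sketch(ntp, nfp):
    """Fraction of objects flagged as variable that really are variable."""
    denom = ntp + nfp
    # Assumption: return NaN instead of raising when nothing was flagged.
    if denom == 0:
        return float("nan")
    return ntp / denom


# 80 real variables among 100 flagged objects.
example_precision = precision_sketch(80, 20)
```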
-
astrobase.fakelcs.recovery.
recall
(ntp, nfn)[source]¶ This calculates recall.
https://en.wikipedia.org/wiki/Precision_and_recall
Parameters: - ntp (int) – The number of true positives.
- nfn (int) – The number of false negatives.
Returns: The recall calculated using ntp/(ntp + nfn).
Return type: float
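Recall differs from precision only in its denominator: it divides by all true variables rather than by all flagged objects. The sketch below is illustrative only; the function name and the NaN fallback are assumptions.

```python
def recall_sketch(ntp, nfn):
    """Fraction of real variables that were flagged as variable."""
    denom = ntp + nfn
    # Assumption: return NaN when there are no real variables at all.
    if denom == 0:
        return float("nan")
    return ntp / denom


# 80 variables found, 120 missed: a conservative threshold can give
# high precision while still recovering under half of the variables.
example_recall = recall_sketch(80, 120)
```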
-
astrobase.fakelcs.recovery.
matthews_correl_coeff
(ntp, ntn, nfp, nfn)[source]¶ This calculates the Matthews correlation coefficient.
https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
Parameters: - ntp (int) – The number of true positives.
- ntn (int) – The number of true negatives.
- nfp (int) – The number of false positives.
- nfn (int) – The number of false negatives.
Returns: The Matthews correlation coefficient.
Return type: float
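For reference, the standard definition from the linked article is (ntp*ntn - nfp*nfn) / sqrt((ntp+nfp)(ntp+nfn)(ntn+nfp)(ntn+nfn)). A minimal sketch, with a hypothetical function name and an assumed NaN result when the denominator vanishes:

```python
import math


def mcc_sketch(ntp, ntn, nfp, nfn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = ntp * ntn - nfp * nfn
    denominator = math.sqrt(
        (ntp + nfp) * (ntp + nfn) * (ntn + nfp) * (ntn + nfn)
    )
    # Assumption: undefined (NaN) if any row or column of the matrix is empty.
    if denominator == 0:
        return float("nan")
    return numerator / denominator


perfect = mcc_sketch(50, 50, 0, 0)      # every object classified correctly
inverted = mcc_sketch(0, 0, 50, 50)     # every object classified wrongly
chance = mcc_sketch(25, 25, 25, 25)     # no better than random
```

The three calls give +1.0, -1.0, and 0.0 respectively, which are the extremes and midpoint of the coefficient's range.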
-
astrobase.fakelcs.recovery.
get_recovered_variables_for_magbin
(simbasedir, magbinmedian, stetson_stdev_min=2.0, inveta_stdev_min=2.0, iqr_stdev_min=2.0, statsonly=True)[source]¶ This runs variability selection for the given magbinmedian.
To generate a full recovery matrix over all magnitude bins, run this function for each magbin over the specified stetson_stdev_min and inveta_stdev_min grid.
Parameters: - simbasedir (str) – The input directory of fake LCs.
- magbinmedian (float) – The magbin to run the variable recovery for. This is an item from the dict from simbasedir/fakelcs-info.pkl: the fakelcinfo[‘magrms’][magcol] list for each magcol, and designates which magbin to get the recovery stats for.
- stetson_stdev_min (float) – The minimum sigma above the trend in the Stetson J variability index distribution for this magbin to use to consider objects as variable.
- inveta_stdev_min (float) – The minimum sigma above the trend in the 1/eta variability index distribution for this magbin to use to consider objects as variable.
- iqr_stdev_min (float) – The minimum sigma above the trend in the IQR variability index distribution for this magbin to use to consider objects as variable.
- statsonly (bool) – If this is True, only the final stats will be returned. If False, the full arrays used to generate the stats will also be returned.
Returns: The returned dict contains statistics for this magbin and if requested, the full arrays used to calculate the statistics.
Return type: dict
-
astrobase.fakelcs.recovery.
magbin_varind_gridsearch_worker
(task)[source]¶ This is a parallel grid search worker for the function below.
-
astrobase.fakelcs.recovery.
variable_index_gridsearch_magbin
(simbasedir, stetson_stdev_range=(1.0, 20.0), inveta_stdev_range=(1.0, 20.0), iqr_stdev_range=(1.0, 20.0), ngridpoints=32, ngridworkers=None)[source]¶ This runs a variable index grid search per magbin.
For each magbin, this does a grid search using the stetson and inveta ranges provided and tries to optimize the Matthews Correlation Coefficient (best value is +1.0), indicating the best possible separation of variables vs. nonvariables. The thresholds on these two variable indexes that produce the largest coeff for the collection of fake LCs will probably be the ones that work best for actual variable classification on the real LCs.
https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
For each grid-point, calculates the true positives, false positives, true negatives, false negatives. Then gets the precision and recall, confusion matrix, and the ROC curve for variable vs. nonvariable.
Once we’ve identified the best thresholds to use, we can then calculate variable object numbers:
- as a function of magnitude
- as a function of period
- as a function of number of detections
- as a function of amplitude of variability
Writes everything back to simbasedir/fakevar-recovery.pkl. Use the plotting function below to make plots for the results.
Parameters: - simbasedir (str) – The directory where the fake LCs are located.
- stetson_stdev_range (sequence of 2 floats) – The min and max values of the Stetson J variability index. A grid is generated between these values and searched for the value of this index that produces the ‘best’ recovery rate for the injected variable stars.
- inveta_stdev_range (sequence of 2 floats) – The min and max values of the 1/eta variability index. A grid is generated between these values and searched for the value of this index that produces the ‘best’ recovery rate for the injected variable stars.
- iqr_stdev_range (sequence of 2 floats) – The min and max values of the IQR variability index. A grid is generated between these values and searched for the value of this index that produces the ‘best’ recovery rate for the injected variable stars.
- ngridpoints (int) –
The number of grid points for each variability index grid. Remember that this function will be searching in 3D and will require lots of time to run if ngridpoints is too large.
For the default number of grid points and 25000 simulated light curves, this takes about 3 days to run on a 40 (effective) core machine with 2 x Xeon E5-2650v3 CPUs.
- ngridworkers (int or None) – The number of parallel grid search workers that will be launched.
Returns: The returned dict contains a list of recovery stats for each magbin and each grid point in the variability index grids that were used. This dict can be passed to the plotting function below to plot the results.
Return type: dict
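The cost of the 3D search follows from the grid size: with ngridpoints values per index there are ngridpoints**3 threshold combinations per magbin, so the default of 32 gives 32,768 of them. The sketch below builds such a grid with the standard library; the even spacing and the helper names are assumptions, since the spacing is not stated above.

```python
import itertools


def even_spacing(lo, hi, npoints):
    """Return npoints evenly spaced values from lo to hi (npoints >= 2)."""
    step = (hi - lo) / (npoints - 1)
    return [lo + i * step for i in range(npoints)]


def make_threshold_grid(stetson_range, inveta_range, iqr_range, ngridpoints):
    """All (stetson, inveta, iqr) threshold combinations to evaluate."""
    axes = [
        even_spacing(lo, hi, ngridpoints)
        for lo, hi in (stetson_range, inveta_range, iqr_range)
    ]
    return list(itertools.product(*axes))


# A small grid for illustration: 4 points per axis gives 4**3 combinations.
small_grid = make_threshold_grid((1.0, 20.0), (1.0, 20.0), (1.0, 20.0), 4)
default_grid_size = 32 ** 3
```

Halving ngridpoints cuts the number of combinations by a factor of eight, which is usually the first thing to try when the run time is too long.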
-
astrobase.fakelcs.recovery.
plot_varind_gridsearch_magbin_results
(gridsearch_results)[source]¶ This plots the gridsearch results from variable_index_gridsearch_magbin.
Parameters: gridsearch_results (dict) – This is the dict produced by variable_index_gridsearch_magbin above.
Returns: The returned dict contains filenames of the recovery rate plots made for each variability index. These include plots of the precision, recall, and Matthews Correlation Coefficient over each magbin and a heatmap of these values over the grid points of the variability index stdev values arrays used.
Return type: dict
-
astrobase.fakelcs.recovery.
run_periodfinding
(simbasedir, pfmethods=('gls', 'pdm', 'bls'), pfkwargs=({}, {}, {'startp': 1.0, 'maxtransitduration': 0.3}), getblssnr=False, sigclip=5.0, nperiodworkers=10, ncontrolworkers=4, liststartindex=None, listmaxobjects=None)[source]¶ This runs periodfinding using several period-finders on a collection of fake LCs.
As a rough benchmark, 25000 fake LCs with 10000–50000 points per LC take about 26 days in total to run on an invocation of this function using GLS+PDM+BLS and 10 periodworkers and 4 controlworkers (so all 40 ‘cores’) on a 2 x Xeon E5-2660v3 machine.
Parameters: - pfmethods (sequence of str) – This is used to specify which periodfinders to run. These must be in the lcproc.periodsearch.PFMETHODS dict.
- pfkwargs (sequence of dict) – This is used to provide optional kwargs to the period-finders.
- getblssnr (bool) – If this is True, will run BLS SNR calculations for each object and magcol. This takes a while to run, so it’s disabled (False) by default.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- nperiodworkers (int) – This is the number of parallel period-finding worker processes to use.
- ncontrolworkers (int) – This is the number of parallel period-finding control workers to use. Each control worker will launch nperiodworkers worker processes.
- liststartindex (int) – The starting index of processing. This refers to the filename list generated by running glob.glob on the fake LCs in simbasedir.
- listmaxobjects (int) – The maximum number of objects to process in this run. Use this with liststartindex to effectively distribute working on a large list of input light curves over several sessions or machines.
Returns: The path to the output summary pickle produced by lcproc.periodsearch.parallel_pf
Return type: str
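The three forms of the sigclip kwarg described above can be sketched with the standard library. This is an illustration of the dimming/brightening convention only: the function name is hypothetical, and the use of the median and population standard deviation as the center and sigma is an assumption that may differ from what the library does.

```python
import statistics


def sigclip_sketch(values, sigclip, magsarefluxes=False):
    """Symmetric or asymmetric sigma-clip of a list of mags or fluxes."""
    if sigclip is None:
        return list(values)

    if isinstance(sigclip, (int, float)):
        dim_sigma = bright_sigma = float(sigclip)
    else:
        # First element applies to dimmings, second to brightenings.
        dim_sigma, bright_sigma = sigclip

    center = statistics.median(values)
    sigma = statistics.pstdev(values)

    kept = []
    for v in values:
        dev = v - center
        # Larger mags are fainter; smaller fluxes are fainter.
        is_dimming = dev < 0 if magsarefluxes else dev > 0
        limit = dim_sigma if is_dimming else bright_sigma
        if abs(dev) <= limit * sigma:
            kept.append(v)
    return kept


# Twenty constant points plus one dimming (10.5) and one brightening (9.5),
# both about 3.3 sigma from the median when treated as magnitudes.
clip_mags = [10.0] * 20 + [10.5, 9.5]
asym_as_mags = sigclip_sketch(clip_mags, [10.0, 3.0])
asym_as_fluxes = sigclip_sketch(clip_mags, [10.0, 3.0], magsarefluxes=True)
```

Treated as magnitudes, the 9.5 point is a brightening and is clipped at 3 sigma; treated as fluxes, the same array loses the 10.5 point instead. This is why magsarefluxes must be set correctly.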
-
astrobase.fakelcs.recovery.
check_periodrec_alias
(actualperiod, recoveredperiod, tolerance=0.001)[source]¶ This determines what kind of aliasing (if any) exists between recoveredperiod and actualperiod.
Parameters: - actualperiod (float) – The actual period of the object.
- recoveredperiod (float) – The recovered period of the object.
- tolerance (float) – The absolute difference required between the input periods to mark the recovered period as close to the actual period.
Returns: The type of alias determined for the input combination of periods. This will be a CSV string with values taken from the following list, based on the types of alias found:
['actual', 'twice', 'half', 'ratio_over_1plus', 'ratio_over_1minus', 'ratio_over_1plus_twice', 'ratio_over_1minus_twice', 'ratio_over_1plus_thrice', 'ratio_over_1minus_thrice', 'ratio_over_minus1', 'ratio_over_twice_minus1']
Return type: str
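The comparison against the tolerance can be sketched for the three simplest alias types. The formulas for the 'ratio_over_*' aliases are not given above, so they are left out here; the function name, the inclusive comparison, and the empty-string result when nothing matches are assumptions.

```python
def alias_sketch(actualperiod, recoveredperiod, tolerance=0.001):
    """Return a CSV string naming which simple aliases the recovery matches."""
    candidates = {
        "actual": actualperiod,
        "twice": 2.0 * actualperiod,
        "half": 0.5 * actualperiod,
    }
    matches = [
        name
        for name, period in candidates.items()
        # Absolute difference, as described for the tolerance kwarg.
        if abs(recoveredperiod - period) <= tolerance
    ]
    return ",".join(matches)


alias_exact = alias_sketch(1.5, 1.5)
alias_double = alias_sketch(1.5, 3.0004)
alias_none = alias_sketch(1.5, 2.2)
```

Because the tolerance is absolute, the same value is stricter in relative terms for long periods than for short ones.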
-
astrobase.fakelcs.recovery.
periodicvar_recovery
(fakepfpkl, simbasedir, period_tolerance=0.001)[source]¶ Recovers the periodic variable status/info for the simulated PF result.
- Uses simbasedir and the lcfbasename stored in fakepfpkl to figure out where the LC for this object is.
- Gets the actual_varparams, actual_varperiod, actual_vartype, actual_varamplitude elements from the LC.
- Figures out if the current objectid is a periodic variable (using actual_vartype).
- If it is a periodic variable, gets the canonical period assigned to it.
- Checks if the period was recovered in any of the five best periods reported by any of the period-finders, checks if the period recovered was a harmonic of the period.
- Returns the objectid, actual period and vartype, recovered period, and recovery status.
Parameters: - fakepfpkl (str) – This is a periodfinding-<objectid>.pkl[.gz] file produced in the simbasedir/periodfinding subdirectory after run_periodfinding above is done.
- simbasedir (str) – The base directory where all of the fake LCs and period-finding results are.
- period_tolerance (float) – The maximum difference that this function will consider between an actual period (or its aliases) and a recovered period to consider it as a ‘recovered’ period.
Returns: Returns a dict of period-recovery results.
Return type: dict
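The check against the five best periods from each period-finder can be sketched as below. The input layout (a dict mapping period-finder name to its ranked best periods) and the function name are hypothetical stand-ins for the contents of the period-finding pickle, and the harmonic check is omitted.

```python
def recovered_by_sketch(actualperiod, bestperiods_by_method,
                        period_tolerance=0.001, nbest=5):
    """Map each period-finder to the best periods matching the actual one."""
    recovered = {}
    for method, bestperiods in bestperiods_by_method.items():
        hits = [
            p for p in bestperiods[:nbest]
            if abs(p - actualperiod) <= period_tolerance
        ]
        # Only keep period-finders that recovered the period at least once.
        if hits:
            recovered[method] = hits
    return recovered


example_pf_results = {
    "gls": [3.1, 1.2503, 0.62],
    "pdm": [2.5, 0.4],
    "bls": [1.2498],
}
example_recovery = recovered_by_sketch(1.25, example_pf_results)
```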
-
astrobase.fakelcs.recovery.
periodrec_worker
(task)[source]¶ This is a parallel worker for running period-recovery.
Parameters: task (tuple) – This is used to pass args to the periodicvar_recovery function:
task[0] = period-finding result pickle to work on
task[1] = simbasedir
task[2] = period_tolerance
Returns: This is the dict produced by the periodicvar_recovery function for the input period-finding result pickle.
Return type: dict
-
astrobase.fakelcs.recovery.
parallel_periodicvar_recovery
(simbasedir, period_tolerance=0.001, liststartind=None, listmaxobjects=None, nworkers=None)[source]¶ This is a parallel driver for periodicvar_recovery.
Parameters: - simbasedir (str) – The base directory where all of the fake LCs and period-finding results are.
- period_tolerance (float) – The maximum difference that this function will consider between an actual period (or its aliases) and a recovered period to consider it as a ‘recovered’ period.
- liststartind (int) – The starting index of processing. This refers to the filename list generated by running glob.glob on the period-finding result pickles in simbasedir/periodfinding.
- listmaxobjects (int) – The maximum number of objects to process in this run. Use this with liststartind to effectively distribute working on a large list of input period-finding result pickles over several sessions or machines.
- nworkers (int) – This is the number of parallel period-recovery worker processes to use.
Returns: Returns the filename of the pickle produced containing all of the period recovery results.
Return type: str
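How liststartind and listmaxobjects split the work list, and how the task tuples for periodrec_worker are shaped, can be sketched as below. This is not the astrobase implementation. The glob pattern and the helper names select_chunk and make_tasks are assumptions made for illustration.

```python
import glob
import os


def select_chunk(pfpkl_list, liststartind=None, listmaxobjects=None):
    """Slice the work list so several sessions can share one big list."""
    chunk = list(pfpkl_list)
    if liststartind is not None:
        chunk = chunk[liststartind:]
    if listmaxobjects is not None:
        chunk = chunk[:listmaxobjects]
    return chunk


def make_tasks(simbasedir, period_tolerance=1.0e-3,
               liststartind=None, listmaxobjects=None):
    """Build (pfpkl, simbasedir, period_tolerance) tuples for the workers."""
    # assumed pattern; matches periodfinding-<objectid>.pkl and .pkl.gz
    pattern = os.path.join(simbasedir, 'periodfinding',
                           'periodfinding-*.pkl*')
    pfpkl_list = sorted(glob.glob(pattern))
    chunk = select_chunk(pfpkl_list, liststartind, listmaxobjects)
    return [(pfpkl, simbasedir, period_tolerance) for pfpkl in chunk]


# two sessions or machines each take 1000 objects from the same sorted list
first = make_tasks('/path/to/simbasedir', liststartind=0,
                   listmaxobjects=1000)
second = make_tasks('/path/to/simbasedir', liststartind=1000,
                    listmaxobjects=1000)
```

Sorting the glob results keeps the indices stable between sessions, which matters when the list is split across machines.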
-
astrobase.fakelcs.recovery.
plot_periodicvar_recovery_results
(precvar_results, aliases_count_as_recovered=None, magbins=None, periodbins=None, amplitudebins=None, ndetbins=None, minbinsize=1, plotfile_ext='png')[source]¶ This plots the results of periodic var recovery.
This function makes plots for periodicvar recovered fraction as a function of:
- magbin
- periodbin
- amplitude of variability
- ndet
with plot lines broken down by:
- magcol
- periodfinder
- vartype
- recovery status
The kwargs magbins, periodbins, amplitudebins, and ndetbins can be used to set the bin lists as needed. The kwarg minbinsize controls how many elements per bin are required to accept a bin in processing its recovery characteristics for mags, periods, amplitudes, and ndets.
Parameters: - precvar_results (dict or str) – This is either a dict returned by parallel_periodicvar_recovery or the pickle created by that function.
- aliases_count_as_recovered (list of str or 'all') –
This is used to set which kinds of aliases this function considers as ‘recovered’ objects. Normally, we require that recovered objects have a recovery status of ‘actual’ to indicate the actual period was recovered. To change this default behavior, aliases_count_as_recovered can be set to a list of alias status strings that should be considered as ‘recovered’ objects as well. Choose from the following alias types:
'twice'                     recovered_p = 2.0*actual_p
'half'                      recovered_p = 0.5*actual_p
'ratio_over_1plus'          recovered_p = actual_p/(1.0+actual_p)
'ratio_over_1minus'         recovered_p = actual_p/(1.0-actual_p)
'ratio_over_1plus_twice'    recovered_p = actual_p/(1.0+2.0*actual_p)
'ratio_over_1minus_twice'   recovered_p = actual_p/(1.0-2.0*actual_p)
'ratio_over_1plus_thrice'   recovered_p = actual_p/(1.0+3.0*actual_p)
'ratio_over_1minus_thrice'  recovered_p = actual_p/(1.0-3.0*actual_p)
'ratio_over_minus1'         recovered_p = actual_p/(actual_p - 1.0)
'ratio_over_twice_minus1'   recovered_p = actual_p/(2.0*actual_p - 1.0)
or set aliases_count_as_recovered='all' to include all of the above in the ‘recovered’ periodic var list.
- magbins (np.array) – The magnitude bins to plot the recovery rate results over. If None, the default mag bins will be used: np.arange(8.0,16.25,0.25).
- periodbins (np.array) – The period bins to plot the recovery rate results over. If None, the default period bins will be used: np.arange(0.0,500.0,0.5).
- amplitudebins (np.array) – The variability amplitude bins to plot the recovery rate results over. If None, the default amplitude bins will be used: np.arange(0.0,2.0,0.05).
- ndetbins (np.array) – The ndet bins to plot the recovery rate results over. If None, the default ndet bins will be used: np.arange(0.0,60000.0,1000.0).
- minbinsize (int) – The minimum number of objects per bin required to plot a bin and its recovery fraction on the plot.
- plotfile_ext ({'png','pdf'}) – Sets the plot output files’ extension.
Returns: A dict containing recovery fraction statistics and the paths to each of the plots made.
Return type: dict
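The role of the bin lists and minbinsize can be illustrated with a small sketch that computes a recovered fraction per bin. The function recovery_fraction_per_bin is hypothetical and uses half-open bins; the binning astrobase actually uses may differ.

```python
from bisect import bisect_right


def recovery_fraction_per_bin(values, recovered, bin_edges, minbinsize=1):
    """Return [(left_bin_edge, recovered_fraction), ...] for accepted bins.

    values    : one quantity per object (mag, period, amplitude, or ndet)
    recovered : one bool per object, True if its period was recovered
    bin_edges : sorted bin edges; bins are [edge[i], edge[i+1])
    """
    nbins = len(bin_edges) - 1
    counts = [0] * nbins
    hits = [0] * nbins
    for value, was_recovered in zip(values, recovered):
        idx = bisect_right(bin_edges, value) - 1
        if 0 <= idx < nbins:
            counts[idx] += 1
            hits[idx] += bool(was_recovered)
    # bins with too few objects are dropped instead of plotted
    return [
        (bin_edges[i], hits[i] / counts[i])
        for i in range(nbins)
        if counts[i] > 0 and counts[i] >= minbinsize
    ]


mags = [8.1, 8.2, 8.6, 9.5]
status = [True, False, True, True]
edges = [8.0, 8.5, 9.0, 9.5, 10.0]
print(recovery_fraction_per_bin(mags, status, edges, minbinsize=1))
print(recovery_fraction_per_bin(mags, status, edges, minbinsize=2))
```

Raising minbinsize suppresses bins where a fraction computed from one or two objects would be mostly noise.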
astrobase.periodbase package¶
This package contains various useful tools for finding periods in astronomical time-series observations.
- astrobase.periodbase.spdm: Stellingwerf (1978) phase-dispersion minimization period search algorithm.
- astrobase.periodbase.saov: Schwarzenberg-Czerny (1989) analysis of variance period search algorithm.
- astrobase.periodbase.smav: Schwarzenberg-Czerny (1996) multi-harmonic AoV period search algorithm.
- astrobase.periodbase.zgls: Zechmeister & Kurster (2009) generalized Lomb-Scargle period search algorithm.
- astrobase.periodbase.kbls: Kovacs et al. (2002) Box-Least-Squares search using a wrapped eebls.f from G. Kovacs.
- astrobase.periodbase.abls: Kovacs et al. (2002) BLS using Astropy’s implementation.
- astrobase.periodbase.htls: Hippke & Heller (2019) Transit-Least-Squares period search algorithm.
- astrobase.periodbase.macf: McQuillan et al. (2013a, 2014) ACF period search algorithm.
Some utility functions are present in:
- astrobase.periodbase.utils: Functions to generate frequency grids and other useful bits.
- astrobase.periodbase.falsealarm: Functions to calculate false-alarm probabilities.
This top-level module hoists all period-finder functions up into the astrobase.periodbase namespace, so you can do:
from astrobase import periodbase
periodbase.<name of period-finder function>
-
astrobase.periodbase.
use_astropy_bls
()[source]¶ This function can be used to switch from the default astrobase BLS implementation (kbls) to the Astropy version (abls).
If this is called, subsequent calls to the BLS periodbase functions will use the Astropy versions instead:
from astrobase import periodbase

# initially points to periodbase.kbls.bls_serial_pfind
periodbase.bls_serial_pfind(...)

# initially points to periodbase.kbls.bls_parallel_pfind
periodbase.bls_parallel_pfind(...)

periodbase.use_astropy_bls()

# now points to periodbase.abls.bls_serial_pfind
periodbase.bls_serial_pfind(...)

# now points to periodbase.abls.bls_parallel_pfind
periodbase.bls_parallel_pfind(...)
This contains some utilities for periodbase functions.
- independent_freq_count(): gets the number of independent frequencies when calculating false alarm probabilities.
- get_frequency_grid(): generates frequency grids automatically.
- make_combined_periodogram(): makes a combined periodogram from the results of several period-finders.
FIXME: add an iterative peak-removal and refit mode to all period-finders here.
-
astrobase.periodbase.utils.
resort_by_time
(times, mags, errs)[source]¶ Resorts the input arrays so they’re in time order.
NOTE: the input arrays must not have nans in them.
Parameters: times,mags,errs (np.arrays) – The times, mags, and errs arrays to resort by time. The times array is assumed to be the first one in the input args.
Returns: times,mags,errs – The resorted times, mags, errs arrays.
Return type: np.arrays
-
astrobase.periodbase.utils.
independent_freq_count
(frequencies, times, conservative=True)[source]¶ This estimates the number of independent frequencies in a periodogram.
This follows the terminology on page 3 of Zechmeister & Kurster (2009):
M = DELTA_f / delta_f
where:
DELTA_f = freq.max() - freq.min()
delta_f = 1.0/(times.max() - times.min())
Parameters: - frequencies (np.array) – The frequencies array used for the calculation of the GLS periodogram.
- times (np.array) – The array of input times used for the calculation of the GLS periodogram.
- conservative (bool) –
If True, will follow the prescription given in Schwarzenberg-Czerny (2003):
http://adsabs.harvard.edu/abs/2003ASPC..292..383S
and estimate the number of independent frequencies as:
min(N_obs, N_freq, DELTA_f/delta_f)
Returns: M – The number of independent frequencies.
Return type: int
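The estimate described above can be written out as a short sketch. It follows the two formulas given in the description and is not the astrobase source. The name independent_freq_count_sketch and the rounding to the nearest integer are assumptions. It uses only built-ins so it accepts lists as well as Numpy arrays.

```python
def independent_freq_count_sketch(frequencies, times, conservative=True):
    """Estimate M = DELTA_f / delta_f, optionally capped conservatively."""
    big_delta_f = max(frequencies) - min(frequencies)    # DELTA_f
    small_delta_f = 1.0 / (max(times) - min(times))      # delta_f
    m = big_delta_f / small_delta_f
    if conservative:
        # min(N_obs, N_freq, DELTA_f/delta_f)
        m = min(len(times), len(frequencies), m)
    return int(round(m))


# 11 observations over a 10-day baseline, 21 trial frequencies from 0.5 to 5.5
times = [float(i) for i in range(11)]
frequencies = [0.5 + 0.25 * i for i in range(21)]

print(independent_freq_count_sketch(frequencies, times, conservative=False))
print(independent_freq_count_sketch(frequencies, times, conservative=True))
```

In this example the conservative estimate is limited by the number of observations, since a periodogram cannot carry more independent information than the data points that produced it.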
-
astrobase.periodbase.utils.
get_frequency_grid
(times, samplesperpeak=5, nyquistfactor=5, minfreq=None, maxfreq=None, returnf0dfnf=False)[source]¶ This calculates a frequency grid for the period finding functions in this module.
Based on the autofrequency function in astropy.stats.lombscargle.
Parameters: - times (np.array) – The times to use to generate the frequency grid over.
- samplesperpeak (int) – The minimum sample coverage each frequency point in the grid will get.
- nyquistfactor (int) – The multiplier over the Nyquist rate to use.
- minfreq,maxfreq (float or None) – If not None, these will be the limits of the frequency grid generated.
- returnf0dfnf (bool) – If this is True, will return the values of f0, df, and Nf generated for this grid.
Returns: A grid of frequencies.
Return type: np.array
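Since this function is described as being based on Astropy's autofrequency, the heuristic can be sketched as follows. The sketch mirrors the Astropy approach (frequency step from the time baseline, upper limit from an average Nyquist frequency). The exact astrobase internals may differ, and frequency_grid_sketch is a hypothetical name. It returns a plain list instead of an np.array to stay dependency-free.

```python
def frequency_grid_sketch(times, samplesperpeak=5, nyquistfactor=5,
                          minfreq=None, maxfreq=None):
    """Build an evenly spaced frequency grid from the sampling of times."""
    baseline = max(times) - min(times)
    nsamples = len(times)

    # each periodogram peak is about 1/baseline wide; oversample it
    df = 1.0 / baseline / samplesperpeak

    f0 = 0.5 * df if minfreq is None else minfreq
    if maxfreq is None:
        average_nyquist = 0.5 * nsamples / baseline
        maxfreq = nyquistfactor * average_nyquist

    nf = 1 + int(round((maxfreq - f0) / df))
    return [f0 + df * i for i in range(nf)]


times = [float(i) for i in range(11)]   # 11 points over a 10-day baseline
grid = frequency_grid_sketch(times)
print(len(grid), grid[0], grid[-1])
```

Larger samplesperpeak values give a finer grid at a proportional cost in periodogram evaluations.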
-
astrobase.periodbase.utils.
make_combined_periodogram
(pflist, outfile, addmethods=False)[source]¶ This just puts all of the period-finders on a single periodogram.
This will renormalize all of the periodograms so their values lie between 0 and 1, with values lying closer to 1 being more significant. Periodograms that give the same best periods will have their peaks line up together.
Parameters: - pflist (list of dict) –
This is a list of result dicts from any of the period-finders in periodbase. To use your own period-finders’ results here, make sure the result dict is of the form and has at least the keys below:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above,
 'kwargs': dict of kwargs passed to your own period-finder function}
- outfile (str) – This is the output file to write the output to. NOTE: EPS/PS won’t work because we use alpha transparency to better distinguish between the various periodograms.
- addmethods (bool) – If this is True, will add all of the normalized periodograms together, then renormalize them to between 0 and 1. In this way, if all of the period-finders agree on something, it’ll stand out easily. FIXME: implement this kwarg.
Returns: The name of the generated plot file.
Return type: str
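As a rough illustration of the renormalization described above, the sketch below checks a result dict for the required keys and rescales its periodogram values to lie between 0 and 1. Min-max scaling and the lower_is_better flag (for statistics where smaller values are more significant) are assumptions about how this could be done, not a description of the astrobase internals. Both function names are hypothetical.

```python
import math

REQUIRED_KEYS = ('periods', 'lspvals', 'bestperiod', 'method',
                 'nbestperiods', 'nbestlspvals', 'kwargs')


def missing_keys(pfresult):
    """List the documented keys that a period-finder result dict lacks."""
    return [key for key in REQUIRED_KEYS if key not in pfresult]


def normalize_periodogram(lspvals, lower_is_better=False):
    """Rescale finite periodogram values to the range [0, 1]."""
    finite = [v for v in lspvals if math.isfinite(v)]
    if not finite:
        return list(lspvals)
    lo, hi = min(finite), max(finite)
    span = hi - lo
    if span == 0.0:
        return [0.0 for _ in lspvals]
    scaled = [(v - lo) / span for v in lspvals]
    if lower_is_better:
        # flip so that values nearer 1 are still the more significant ones
        scaled = [1.0 - v for v in scaled]
    return scaled


print(normalize_periodogram([2.0, 4.0, 6.0]))
print(normalize_periodogram([2.0, 4.0, 6.0], lower_is_better=True))
print(missing_keys({'periods': [], 'lspvals': []}))
```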
This contains functions useful for false-alarm probability calculation.
bootstrap_falsealarmprob()
: calculates the false alarm probability for a period using bootstrap resampling.
-
astrobase.periodbase.falsealarm.
bootstrap_falsealarmprob
(lspinfo, times, mags, errs, nbootstrap=250, magsarefluxes=False, sigclip=10.0, npeaks=None)[source]¶ Calculates the false alarm probabilities of periodogram peaks using bootstrap resampling of the magnitude time series.
The false alarm probability here is defined as:
(1.0 + sum(trialbestpeaks[i] > peak[j]))/(ntrialbestpeaks + 1)
for each best periodogram peak j. The index i is for each bootstrap trial. This effectively gives us a significance for the peak. Smaller FAP means a better chance that the peak is real.
The basic idea is to get the number of trial best peaks that are larger than the current best peak and divide this by the total number of trials. The distribution of these trial best peaks is obtained after scrambling the mag values and rerunning the specified periodogram method for a bunch of trials.
lspinfo is the output dict from a periodbase periodogram function and MUST contain a ‘method’ key that corresponds to one of the keys in the LSPMETHODS dict above. This will let this function know which periodogram function to run to generate the bootstrap samples. The lspinfo SHOULD also have a ‘kwargs’ key that corresponds to the input keyword arguments for the periodogram function as it was run originally, to keep everything the same during the bootstrap runs. If this is missing, default values will be used.
FIXME: this may not be strictly correct; must look more into bootstrap significance testing. Also look into if we’re doing resampling correctly for time series because the samples are not iid. Look into moving block bootstrap.
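The false alarm probability formula above can be sketched in a few lines. The resampling shown here draws indices with replacement and passes the resampled series to a user-supplied callable. That callable, best_peak_func, is hypothetical and stands in for rerunning the periodogram and returning its highest peak. Neither function is the astrobase implementation.

```python
import random


def false_alarm_probability(trialbestpeaks, peak):
    """(1 + number of trial best peaks above peak) / (ntrials + 1)."""
    n_larger = sum(1 for trial in trialbestpeaks if trial > peak)
    return (1.0 + n_larger) / (len(trialbestpeaks) + 1)


def bootstrap_trial_best_peaks(times, mags, errs, best_peak_func,
                               nbootstrap=250, seed=None):
    """Collect the best periodogram peak from each resampled series."""
    rng = random.Random(seed)
    npoints = len(mags)
    trialbestpeaks = []
    for _ in range(nbootstrap):
        # resample mags and errs together; times keep their original order
        indices = [rng.randrange(npoints) for _ in range(npoints)]
        trial_mags = [mags[i] for i in indices]
        trial_errs = [errs[i] for i in indices]
        trialbestpeaks.append(best_peak_func(times, trial_mags, trial_errs))
    return trialbestpeaks


# one of four trial peaks exceeds the observed peak of 0.5
print(false_alarm_probability([0.1, 0.2, 0.9, 0.3], 0.5))
```

The added 1 in the numerator and denominator keeps the estimate from ever reaching zero, so the smallest reportable FAP is 1/(nbootstrap + 1).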
Parameters: - lspinfo (dict) –
A dict of period-finder results from one of the period-finders in periodbase, or your own functions, provided it’s of the form and contains at least the keys listed below:
{'periods': np.array of all periods searched by the period-finder,
 'lspvals': np.array of periodogram power value for each period,
 'bestperiod': a float value that is the period with the highest peak in the periodogram, i.e. the most-likely actual period,
 'method': a three-letter code naming the period-finder used; must be one of the keys in the `astrobase.periodbase.METHODLABELS` dict,
 'nbestperiods': a list of the periods corresponding to periodogram peaks (`nbestlspvals` below) to annotate on the periodogram plot so they can be called out visually,
 'nbestlspvals': a list of the power values associated with periodogram peaks to annotate on the periodogram plot so they can be called out visually; should be the same length as `nbestperiods` above,
 'kwargs': dict of kwargs passed to your own period-finder function}
If you provide your own function’s period-finder results, you should add a corresponding key for it to the LSPMETHODS dict above so the bootstrap function can use it correctly. Your period-finder function should take times, mags, errs and any extra parameters as kwargs and return a dict of the form described above. A small worked example:
from your_module import your_periodfinder_func
from astrobase import periodbase

periodbase.LSPMETHODS['your-finder'] = your_periodfinder_func

# run a period-finder session
your_pfresults = your_periodfinder_func(times, mags, errs,
                                        **extra_kwargs)

# run bootstrap to find FAP
falsealarm_info = periodbase.bootstrap_falsealarmprob(
    your_pfresults,
    times, mags, errs,
    nbootstrap=250,
    magsarefluxes=False,
)
- times,mags,errs (np.arrays) – The magnitude/flux time-series to process along with their associated measurement errors.
- nbootstrap (int) – The total number of bootstrap trials to run. This is set to 250 by default, but should probably be around 1000 for realistic results.
- magsarefluxes (bool) – If True, indicates the input time-series is fluxes and not mags.
- sigclip (float or int or sequence of two floats/ints or None) –
If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.
If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.
If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.
- npeaks (int or None) – The number of peaks from the list of ‘nbestlspvals’ in the period-finder result dict to run the bootstrap for. If None, all of the peaks in this list will have their FAP calculated.
Returns: Returns a dict of the form:
{'peaks':allpeaks, 'periods':allperiods, 'probabilities':allfaps, 'alltrialbestpeaks':alltrialbestpeaks}
Return type: dict
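The symmetric and asymmetric sigclip behavior described in the parameter list can be sketched as below. The use of the median as the center and 1.483 times the median absolute deviation as the scatter estimate is an assumption made for this illustration. The function sigclip_sketch is hypothetical and is not the astrobase routine.

```python
import math
from statistics import median


def sigclip_sketch(times, mags, errs, sigclip=(10.0, 3.0),
                   magsarefluxes=False):
    """Drop non-finite points, then clip dimmings and brightenings."""
    rows = [(t, m, e) for t, m, e in zip(times, mags, errs)
            if math.isfinite(t) and math.isfinite(m) and math.isfinite(e)]

    if sigclip is not None and rows:
        if isinstance(sigclip, (int, float)):
            dim_sigma = bright_sigma = float(sigclip)
        else:
            dim_sigma, bright_sigma = (float(s) for s in sigclip)

        center = median(m for _, m, _ in rows)
        scatter = 1.483 * median(abs(m - center) for _, m, _ in rows)

        # larger magnitudes are fainter; larger fluxes are brighter
        sign = -1.0 if magsarefluxes else 1.0
        rows = [
            (t, m, e) for t, m, e in rows
            if -bright_sigma * scatter <= sign * (m - center)
            <= dim_sigma * scatter
        ]

    return ([r[0] for r in rows], [r[1] for r in rows],
            [r[2] for r in rows])


times = list(range(8))
mags = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.6, 9.4]
errs = [0.01] * 8

# as magnitudes, 9.4 is a strong brightening and is removed
print(sigclip_sketch(times, mags, errs, sigclip=[10.0, 3.0])[1])
# as fluxes, 10.6 is the brightening and is removed instead
print(sigclip_sketch(times, mags, errs, sigclip=[10.0, 3.0],
                     magsarefluxes=True)[1])
```

The same input loses a different point depending on magsarefluxes, which is why that kwarg has to be set correctly before an asymmetric clip is applied.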
Installation¶
Requirements¶
This package requires the following other packages:
- numpy
- scipy
- astropy
- matplotlib
- Pillow
- jplephem
- requests
- tornado
- pyeebls
- tqdm
- scikit-learn
For optional functionality, some additional packages from PyPI are required:
- To use the astrobase.lcdb module, you’ll need psycopg2-binary or psycopg2.
- To use astrobase.lcfit.transits.mandelagol_fit_magseries() for fitting Mandel-Agol planetary transit models, you’ll need batman-package, emcee, corner, and h5py.
- To use the Amazon AWS enabled light curve work drivers in the astrobase.lcproc.awsrun module, you’ll need paramiko, boto3, and awscli, as well as an AWS account.
Installing with pip¶
If you’re using:
- 64-bit Linux and Python 2.7, 3.4, 3.5, 3.6, 3.7
- 64-bit Mac OSX 10.12+ with Python 2.7 or 3.6
- 64-bit Windows with Python 2.7 and 3.6
You can simply install astrobase with:
(venv)$ pip install astrobase
Otherwise, you’ll need to make sure that a Fortran compiler and numpy are installed beforehand to compile the pyeebls package that astrobase depends on:
## you'll need a Fortran compiler. ##
## on Linux: dnf/yum/apt install gcc gfortran ##
## on OSX (using homebrew): brew install gcc && brew link gcc ##
## make sure numpy is installed as well! ##
## this is required for the pyeebls module installation ##
(venv)$ pip install numpy # in a virtualenv
# or use dnf/yum/apt install numpy to install systemwide
Once that’s done, install astrobase:
(venv)$ pip install astrobase
Other installation methods¶
To install all the optional dependencies as well:
(venv)$ pip install astrobase[all]
To install the latest version (may be unstable at times):
$ git clone https://github.com/waqasbhatti/astrobase
$ cd astrobase
$ python setup.py install
$ # or use pip install . to install requirements automatically
$ # or use pip install -e . to install in develop mode along with requirements
$ # or use pip install -e .[all] to install in develop mode along with all requirements
Citing Astrobase¶
Released versions of Astrobase are archived at the Zenodo repository. Zenodo provides a DOI that can be cited for each specific version. The following bibtex entry for Astrobase v0.3.8 may be useful as a template. You can substitute in values of month, year, version, doi, and url for the version of astrobase you used for your publication:
@misc{wbhatti_astrobase,
author = {Waqas Bhatti and
Luke G. Bouma and
Joshua Wallace},
title = {\texttt{Astrobase}},
month = feb,
year = 2018,
version = {0.3.8},
publisher = {Zenodo},
doi = {10.5281/zenodo.1185231},
url = {https://doi.org/10.5281/zenodo.1185231}
}
Alternatively, the following bibtex entry can be used for all versions of Astrobase (the DOI will always resolve to the latest version):
@misc{wbhatti_astrobase,
author = {Waqas Bhatti and
Luke G. Bouma and
Joshua Wallace},
title = {\texttt{Astrobase}},
month = oct,
year = 2017,
publisher = {Zenodo},
doi = {10.5281/zenodo.1185231},
url = {https://doi.org/10.5281/zenodo.1185231}
}
Also see this AAS Journals note on citing repositories.
Period-finder algorithms¶
If you use any of the period-finder methods implemented by astrobase.periodbase, please also cite their respective papers.
- the generalized Lomb-Scargle algorithm from Zechmeister & Kurster (2009)
- the phase dispersion minimization algorithm from Stellingwerf (1978, 2011)
- the AoV and AoV-multiharmonic algorithms from Schwarzenberg-Czerny (1989, 1996)
- the BLS algorithm from Kovacs et al. (2002)
- the ACF period-finding algorithm from McQuillan et al. (2013a, 2014)
Changelog¶
Please see https://github.com/waqasbhatti/astrobase/blob/master/CHANGELOG.md for the latest changelog for tagged versions.
License¶
Astrobase is provided under the MIT License. See the LICENSE file for the full text.