Welcome to Goodman HTS Pipeline’s Documentation


This is the User Manual for the Goodman Spectroscopic Data Reduction Pipeline. It provides an overview of the pipeline’s main features, instructions on its use and how to run it on our dedicated Data Reduction Server, and installation instructions for those who wish to run it on their own computers.

Overview

The Goodman Spectroscopic Data Reduction Pipeline - The Goodman Pipeline - is a Python-based package for producing science-ready, wavelength-calibrated, 1-D spectra. The goal of The Goodman Pipeline is to provide SOAR users with easy-to-use, well-documented software for reducing spectra obtained with the Goodman High Throughput Spectrograph. Though the current implementation assumes offline data reduction, our aim is to provide the capability to run it in real time, so 1-D wavelength-calibrated spectra can be produced shortly after the shutter closes.

The pipeline is primarily intended to be run on a dedicated data reduction computer, though it is also available for local installation. The Goodman Spectroscopic Pipeline project is hosted on GitHub at its GitHub Repository.

Instructions for running the software are provided in the Usage section of this guide. How to access the data reduction server is described in Setup for Remote Use; if you prefer to install a local copy, instructions are in Install.

Currently the pipeline is separated into two main components. The initial processing is done by redccd, which performs the following steps:

  • Identifies calibration and science frames.
  • Creates the master bias.
  • Creates master flats and normalizes them.
  • Applies overscan correction.
  • Trims the images.
  • For spectroscopic data, finds the slit edges and trims again.
  • Applies bias correction.
  • Applies flat correction.
  • Applies cosmic ray removal.

The spectroscopic processing is done by redspec and carries out the following steps:

  • Identifies point-source targets.
  • Traces the spectra.
  • Extracts the spectra.
  • Estimates and subtracts the background.
  • Saves extracted (1D) spectra, without wavelength calibration.
  • Finds the wavelength solution.
  • Linearizes the data (resampling).
  • Writes the wavelength solution to the FITS header.
  • Creates a new file for the wavelength-calibrated 1D spectrum.

Usage

The Goodman Spectroscopic Pipeline is designed to be simple to use. However, simple is not always best for every user or science case, so The Goodman Pipeline is also flexible.

Getting Help

This manual is intended to be the preferred way to get help. However, the quickest option is using -h or --help:

redccd --help

This will print the list of arguments along with a quick explanation and default values.

The same applies to redspec:

redspec --help

Prepare Data for Reduction

If you prepared and carried out your observations carefully, this should be an easy step. Either way, keep in mind the following:

  • Remove all focus sequence frames.
  • Remove all target acquisition or test frames.
  • Using your observation log, remove all unwanted files.
  • Make sure all data share the same gain (GAIN) and readout noise (RDNOISE).
  • Make sure all data share the same Region Of Interest (ROI).
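The header consistency checks above can be sketched in plain Python. Real headers would be read with astropy.io.fits; the dictionaries and values below are illustrative only:

```python
# Sketch: check that GAIN, RDNOISE and ROI agree across a set of headers.
# In practice each header would come from astropy.io.fits.getheader();
# plain dictionaries are used here so the example is self-contained.

def inconsistent_keywords(headers, keys=("GAIN", "RDNOISE", "ROI")):
    """Return the keywords whose values differ between headers."""
    return [key for key in keys
            if len({header.get(key) for header in headers}) > 1]

headers = [
    {"GAIN": 1.48, "RDNOISE": 3.89, "ROI": "Spectroscopic 2x2"},
    {"GAIN": 1.48, "RDNOISE": 3.89, "ROI": "Spectroscopic 2x2"},
]
print(inconsistent_keywords(headers))  # [] means the data set is consistent
```

An empty list means the files can be safely reduced together; any keyword returned points at files that should be separated or removed.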

The pipeline does not modify the original files unless there are problems with FITS compliance. Even so, it is never a bad idea to keep copies of your original data in a safe place.

Processing your 2D images

This is the first step in the reduction process; the main tasks are listed below.

  • Create master bias
  • Create master flats
  • Apply Corrections:
    • Overscan
    • Trim image
    • Detect slit and trim out non-illuminated areas
    • Bias correction
    • Normalized flat field correction
    • Cosmic ray rejection

Note

Some older Goodman HTS data have headers that are not FITS compliant. In such cases the headers are fixed; that is the only modification made to raw data.

The 2D images are initially reduced using redccd. You can simply move to the directory where your raw data is located and do:

redccd

Though you can modify the behavior in several ways.

Running redccd will create a directory called RED where it will put your reduced data. If you run it again, the pipeline will refuse to overwrite your already reduced data unless you use --auto-clean, which tells the pipeline to delete the RED directory and start over.

redccd --auto-clean

A summary of the most important command line arguments is presented below.

  • --cosmic <method> Lets you select the method for Cosmic Ray Removal.
  • --debug Shows extended messages and plots of intermediate steps.
  • --flat-normalize <method> Lets you select the method for Flat Normalization.
  • --flat-norm-order <order> Sets the order of the model used for Flat Normalization. Default 15.
  • --ignore-bias Ignores the existence or lack of BIAS data.
  • --ignore-flats Ignores the existence or lack of FLAT data.
  • --raw-path <path> Sets the directory where the raw data is located; can be relative.
  • --red-path <path> Sets the directory where the reduced data will be stored. Default RED.
  • --saturation <saturation> Sets the saturation level. Flats exceeding the saturation level will be discarded. Default 65000 ADU.

redccd is intended to work with both spectroscopic and imaging data, which is why the process is split in two.

Extracting the spectra

After you are done Processing your 2D images it is time to extract the spectrum into a wavelength-calibrated 1D file.

The script is called redspec. The tasks performed are the following:

  • Classifies data and matches OBJECT frames with their COMP frames, when the latter exist.
  • Identifies targets.
  • Extracts targets.
  • Saves extracted targets to 1D spectra.
  • Finds the wavelength solution automatically.
  • Linearizes the data.
  • Saves the wavelength-calibrated file.

First you have to move into the RED directory; this is a precautionary measure to avoid unintended deletion of your raw data. Then you can simply do:

redspec

And the pipeline should work its magic. Since this might not be the desired behavior for every user or science case, we have implemented a set of command line arguments, listed below.

  • --data-path <path> Folder where the data to be processed is located. Default is the current working directory.
  • --proc-path <path> Folder where processed data will be stored. Default is the current working directory.
  • --search-pattern <pattern> Prefix for picking up files. Default cfzsto. See File Prefixes.
  • --extraction <method> Selects one of the Extraction Methods. The only one implemented at the moment is fractional.
  • --reference-files <path> Folder where the reference lamps are located.
  • --debug Shows extended messages and plots of intermediate steps.
  • --max-targets <value> Maximum number of targets to detect in a single image. Default is 3.
  • --save-plots Saves plots as described in Plotting.
  • --plot-results Shows plots during execution.
The mathematical model used to define the wavelength solution is recorded in the header for the record, even though the data have been linearized.

Description of custom keywords

The pipeline adds several keywords to keep track of the process and in general for keeping important information available. The following table gives a description of all the keywords added by The Goodman Pipeline, though not all of them are added to all the images.

General Purpose Keywords

These keywords are used for record-keeping purposes, except for GSP_FNAM, which is used to keep track of the file name.

General purpose keywords, added to all images at the moment of the first read.
Keyword Purpose
GSP_VERS Pipeline version.
GSP_ONAM Original file name, first read.
GSP_PNAM Parent file name.
GSP_FNAM Current file name.
GSP_PATH Path from where the file was read.
GSP_TECH Observing technique. Imaging or Spectroscopy.
GSP_DATE Date of processing.
GSP_OVER Overscan region.
GSP_TRIM Trim section.
GSP_SLIT Slit trim section. From slit-illuminated area.
GSP_BIAS Master bias file used.
GSP_FLAT Master flat file used.
GSP_NORM Master flat normalization method.
GSP_COSM Cosmic ray rejection method.
GSP_WRMS Wavelength solution RMS Error.
GSP_WPOI Number of points used to calculate RMS Error.
GSP_WREJ Number of points rejected from RMS Error Calculation.
GSP_DCRR Reference paper for DCR software (cosmic ray rejection).

Non-linear wavelength solution

Since writing non-linear wavelength solutions to the headers using the FITS standard (reference) is extremely complex and not necessarily well documented, we opted for simply describing the mathematical model from astropy’s modeling. This keeps the data untouched while maintaining a reliable description of the wavelength solution.

The current implementation will work for writing any polynomial model. Reading is implemented only for Chebyshev1D, which is the default model.

Keywords used to describe a non-linear wavelength solution.
Keyword Purpose
GSP_FUNC Name of mathematical model from astropy’s modeling
GSP_ORDR Order of the model used.
GSP_NPIX Number of pixels.
GSP_C000 Value of parameter c0.
GSP_C001 Value of parameter c1.
GSP_C002 Value of parameter c2. Further keywords follow, depending on the order.
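As a sketch of how these keywords can be used, the snippet below rebuilds a wavelength array from a hypothetical header. The pipeline itself reconstructs an astropy Chebyshev1D model; here the Chebyshev series is evaluated by hand so the example is self-contained, and all keyword values are made up:

```python
def chebyshev_eval(coefficients, x):
    """Evaluate sum(c_k * T_k(x)) using the Chebyshev recurrence
    T_0 = 1, T_1 = x, T_k = 2*x*T_{k-1} - T_{k-2}."""
    t_prev, t_curr = 1.0, x
    total = coefficients[0] * t_prev
    if len(coefficients) > 1:
        total += coefficients[1] * t_curr
    for c in coefficients[2:]:
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        total += c * t_curr
    return total

# Hypothetical keyword values; a real header would be read with astropy.io.fits
header = {"GSP_FUNC": "Chebyshev1D", "GSP_ORDR": 2, "GSP_NPIX": 4096,
          "GSP_C000": 4500.0, "GSP_C001": 1.0, "GSP_C002": 0.0}
coefficients = [header["GSP_C%03d" % i]
                for i in range(header["GSP_ORDR"] + 1)]
wavelengths = [chebyshev_eval(coefficients, pixel)
               for pixel in range(header["GSP_NPIX"])]
print(wavelengths[100])  # wavelength (angstrom) at pixel 100
```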

Combined Images

Every image used in a combination of images is recorded in the header of the resulting one. The order is not important, though the header of the first image will most likely be used as the base.

The combination is made using the combine() method with the following parameters

  • method='median'
  • sigma_clip=True
  • sigma_clip_low_thresh=1.0
  • sigma_clip_high_thresh=1.0

At this moment these parameters are not user-configurable.
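The effect of these parameters can be illustrated with a single-pass, pure-Python version of a sigma-clipped median. The pipeline uses ccdproc's combine(), which operates on whole image stacks and clips iteratively; this sketch works on one pixel stack only:

```python
import statistics

def sigma_clipped_median(values, low=1.0, high=1.0):
    """Median of `values` after a single pass that rejects points more
    than `low`/`high` standard deviations below/above the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    kept = [v for v in values if mean - low * std <= v <= mean + high * std]
    return statistics.median(kept or values)

# The same pixel across four frames; the 50 is a cosmic ray hit
print(sigma_clipped_median([10, 11, 10, 50]))  # 10 -- the outlier is rejected
```

With low and high thresholds of 1.0, outliers such as cosmic ray hits are rejected aggressively before the median is taken.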

Keywords that list all the images used to produce a combined image.
Keyword Purpose
GSP_IC01 First image used to create combined.
GSP_IC02 Second image used to create combined.

Detected lines

The reference lamp library keeps the lamps non-linearized, and each lamp gets a record of the pixel value of every detected line and its equivalent in angstrom. The following table shows a three-line lamp.

Description of all the keywords used to list lines in lamps in Pixel and Angstrom.
Keyword Purpose
GSP_P001 Pixel value for the first line detected.
GSP_P002 Pixel value for the second line detected.
GSP_P003 Pixel value for the third line detected.
GSP_A001 Angstrom value for the first line detected.
GSP_A002 Angstrom value for the second line detected.
GSP_A003 Angstrom value for the third line detected.
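Reading these pairs back out of a header is straightforward; the sketch below collects them into a list. A plain dictionary stands in for a real FITS header, and the pixel/angstrom values are examples only:

```python
def detected_lines(header):
    """Collect (pixel, angstrom) pairs from GSP_Pnnn/GSP_Annn keywords."""
    pairs = []
    index = 1
    while "GSP_P%03d" % index in header:
        pairs.append((header["GSP_P%03d" % index],
                      header["GSP_A%03d" % index]))
        index += 1
    return pairs

# Example values only; real lamps record the actually detected lines
header = {"GSP_P001": 411.0, "GSP_A001": 4545.05,
          "GSP_P002": 809.0, "GSP_A002": 4879.86,
          "GSP_P003": 1201.0, "GSP_A003": 5187.75}
print(detected_lines(header))
```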

Cosmic Ray Removal

The argument --cosmic <method> has four options but there are only two real methods.

default (default):

Different methods work differently for different binnings. So if <method> is set to default, the pipeline decides as follows:

dcr for 1x1 binning.

lacosmic for 2x2 and 3x3 binning, though 3x3 binning has not been tested.

dcr:

As stated above, this method works better for 1x1 binning. More information can be found in Installing DCR. The disadvantage of this method is that it is a program written in C, and the file has to be written to disk, processed, and read back in. Even so, it is faster than lacosmic.

The parameters for running dcr are written in a file called dcr.par. A lookup table and a file generator have been implemented, but you can pass custom parameters by placing a dcr.par file in a different directory and pointing to it with --dcr-par-file <path>.

lacosmic:
This is the preferred method for files with 2x2 and 3x3 binning. It uses Astroscrappy’s implementation and is run with the default parameters. Future versions might include some parameter adjustment.
none:
Skips the cosmic ray removal process.

Asymmetric binnings have not been tested; the pipeline only takes the dispersion axis into consideration when deciding. This does not mean that the spatial binning does not impact the performance of any of the methods; we simply don’t know yet.

Flat Normalization

There are three possible values of <method> for the normalization of master flats. For the methods using a model, the default model order is 15. It can be set using --flat-norm-order <order>.

mean:
Calculates the mean of the image using numpy’s mean() and divides the image by it.
simple (default):
Collapses the master flat along the spatial direction, fits a Chebyshev1D model of order 15, and divides the full image by this fitted model.
full:
Fits a Chebyshev1D model to every line/column (dispersion axis) and divides it by the fitted model. This method takes too long to process and has been left in the code for experimentation purposes only.
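The mean method is simple enough to sketch in a few lines of plain Python. The pipeline operates on NumPy arrays with numpy.mean(); nested lists are used here only to keep the example self-contained:

```python
import statistics

def normalize_mean(image):
    """'mean' normalization: divide the master flat by its global mean."""
    mean = statistics.mean(pixel for row in image for pixel in row)
    return [[pixel / mean for pixel in row] for row in image]

master_flat = [[2.0, 4.0],
               [4.0, 6.0]]
print(normalize_mean(master_flat))  # [[0.5, 1.0], [1.0, 1.5]]
```

After normalization the flat averages to 1.0, so dividing a science frame by it corrects pixel-to-pixel sensitivity without changing the overall flux scale.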

Extraction Methods

The argument --extraction <method> has two options but only fractional is implemented.

fractional:
Fractional-pixel extraction differs from a simple, rough extraction in how it deals with the edges of the region. See pipeline.core.core.extract_fractional_pixel().
optimal:
Unfortunately this method has not been implemented yet.

File Prefixes

There are several ways this could be done, but we chose adding prefixes to the file name because prefixes are easy to add and easy to filter in a terminal, for instance:

ls cfzsto*fits

or in python

import glob

file_list = glob.glob('cfzsto*fits')

So what do all those letters mean? Here is a table to explain it.

Characters and meaning of prefixes
Letter Meaning
o Overscan Correction Applied
t Trim Correction Applied
s Slit trim correction applied
z Bias correction applied
f Flat correction applied
c Cosmic rays removed
e Spectrum extracted to 1D
w 1D Spectrum wavelength calibrated

So, for an original file named file.fits:

o_file.fits

Means the file has been overscan corrected, while

eczsto_file.fits

Means the spectrum has been extracted to a 1D file but the file has not been flat fielded (f missing).

Ideally after running redccd the file should be named:

cfzsto_file.fits

And after running redspec:

wecfzsto_file.fits
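The prefix can also be decoded programmatically. A small sketch, assuming file names of the form <prefix>_<name>.fits shown above:

```python
# Mapping of prefix letters to processing steps, from the table above
PREFIX_MEANING = {"o": "overscan corrected",
                  "t": "trimmed",
                  "s": "slit trimmed",
                  "z": "bias corrected",
                  "f": "flat corrected",
                  "c": "cosmic rays removed",
                  "e": "extracted to 1D",
                  "w": "wavelength calibrated"}

def applied_steps(filename):
    """List the reduction steps encoded in a file-name prefix."""
    prefix = filename.split("_")[0]
    return [PREFIX_MEANING[letter] for letter in prefix]

print(applied_steps("cfzsto_file.fits"))
```

Note that the letters read in reverse order of application: the most recent step is the leftmost letter.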

Setup for Remote Use

The Goodman Spectroscopic Data Reduction Pipeline has been installed on a dedicated computer at SOAR. The procedure requires opening a VNC session, for which you need to be connected to the SOAR VPN. The credentials for the VPN are the same ones you used for your observing run, provided by your Support Scientist, who will also give you the information for the data reduction computer’s VNC connection.

Note

IRAF is available on the data server at SOAR. Running iraf will open xgterm and ds9 windows; iraf-only will open xgterm but not ds9.

Establish a VNC connection

Separately, you should receive a server hostname, IP, display number and VNC-password.

VNC display number and working folder assigned to each partner.
Display Partner/Institution Folder
:1 NOAO /home/goodman/data/NOAO
:2 Brazil /home/goodman/data/BRAZIL
:3 UNC /home/goodman/data/UNC
:4 MSU /home/goodman/data/MSU
:5 Chile /home/goodman/data/CHILE

For this tutorial we will refer to the VNC server host name as <vnc-server>, the display number as <display-number>, and your password as <password>.

The VNC connection should work with any VNC Client like TightVNC, TigerVNC, RealVNC, etc. The first two run on Linux and can be used directly with the vncviewer command line.

Important

Please help us keep an organized environment by creating a new folder, using the format YYYY-MM-DD, within your institution’s directory and using it to process your data.

VNC from the Terminal

Find the <display-number> that corresponds to you in the VNC Displays table. Then open a terminal and, assuming you have vncviewer installed, run:

vncviewer <vnc-server>:<display-number>

You will be asked to type in the <password> provided.

Important

The real values for <vnc-server> and <password> should be provided by your support scientist.

If the connection succeeds, you will see a CentOS 7 desktop running GNOME.

Install

Using the pipeline remotely is the recommended method, in which case you don’t need to worry about software requirements.

However, for users who wish to go ahead with a local installation, we provide simple instructions in the current section.

Requirements

The Goodman Pipeline is completely written in Python 3.x and relies on several libraries, such as:

  • NumPy
  • SciPy
  • MatPlotLib
  • Pandas
  • AstroPy
  • AstroPy/ccdproc
  • AstroPy/astroplan
  • DCR

Using Conda

We do not recommend installing these libraries or The Goodman Pipeline directly into your system, since updates and upgrades may break it. We recommend the use of Virtual Environments instead. If you are not familiar with this term, please check the official documentation by visiting the links below:

Another option is to install Conda, a Virtual Environment Manager, or AstroConda, the same but for astronomers. Everything you need to know about installing both can be found in the link below:

Working with Virtual Environments

Virtual environments are a very useful tool; their main benefits are:

  • Portability
  • Protection to the host environment
  • Flexibility

If you know nothing about them, we recommend starting at the Conda site.

For the purpose of this manual we will just say that a Virtual Environment lets you keep a custom set of libraries/tools in one place and, most importantly, independent of your host system. Installation will not be discussed here, but you can visit this link for information.

Discover what environments exist in your system.

conda env list

This will print a list in which the first column is the environment name.

Activate (enter) the virtual environment.

source activate <venv-name>

Where <venv-name> is the name of your virtual environment. Your shell’s prompt will change to:

(<venv-name>) [user@hostname folder-name]$

Deactivate (leave) the virtual environment.

source deactivate

This time the prompt will change again to:

[user@hostname folder-name]$

Using PIP

Warning

You may find that ccdproc and astroplan do not come with Astroconda. They are not available on any Conda channel either. That means that you will have to install them separately. You can do so by downloading the source files and installing them by hand, or simply activate your Virtual Environment and then install these two packages using pip with

pip install ccdproc astroplan

Setup for local installation

System-wide installation is not recommended because it can mess things up, especially on Linux and macOS. Before you proceed, make sure that your system has all the required libraries, as described in Requirements.

Once you have Python running and all the libraries installed, whether using Conda/AstroConda or not, you may download the latest version available at the following address:

Before continuing, make sure that your Virtual Environment is active if this is the case. There are several ways of doing this but normally the command below should work:

$ source activate <my_environment_name>

Where <my_environment_name> is the name of your Virtual Environment (e.g. astroconda).

Now you can finally install The Goodman Pipeline. Download the file, decompress it, and enter the directory created during decompression. Test the installation by typing:

$ python setup.py test

If you get any errors, check the traceback. If you find difficulties carrying on at this point, you may contact us by opening a new issue or using the e-mail goodman-pipeline@ctio.noao.edu.

If no error messages pop up on your screen, you are good to carry on with the installation:

$ python setup.py install

Note

This will install the pipeline into the currently active Python version. If you use a Virtual Environment, make sure it is active. If not, you can add the --user option to install for your user only and avoid needing root access.

Installing DCR

Acknowledgement Note

Please cite: Pych, W., 2004, PASP, 116, 148

For cosmic ray rejection we shifted to a non-Python package because the results were much better compared to LACosmic’s implementation in Astropy; LACosmic was not designed to work with spectroscopy. Since version 1.1.0, though, we shifted from Astropy to Astroscrappy’s implementation of LACosmic.

The latest version of the Goodman Spectroscopic Pipeline uses a modified version of dcr to help with the pipeline’s workflow. It is included under

<path_to_download_location>/goodman/pipeline/data/dcr-source/dcr/

goodman is the folder that will be created once you untar or unzip the latest release of The Goodman Pipeline.

Important

The changes we made to DCR include the deletion of all HISTORY and COMMENT keywords, which the pipeline does not use, and the addition of a couple of custom keywords: GSP_FNAM, which stores the name of the file being created, and GSP_DCRR, which stores the reference to the paper to cite.

You are still encouraged to visit the official link. We remind users of the Goodman Pipeline to cite the DCR paper with the reference indicated above.

Compiling DCR

Compiling dcr is actually very simple.

cd <path_to_download_location>/goodman/pipeline/data/dcr-source/dcr/

Then simply type:

make

This will compile dcr and also create some other files. The executable binary here is dcr.

We have successfully compiled dcr right out of the box on several platforms, such as:

  • Ubuntu 16.04
  • Centos 7.1, 7.4
  • MacOS Sierra
  • Solaris 11

Installing the DCR binary

This is a suggested method. If you are not sure what you are doing, we recommend you follow the steps shown below. If you are a more advanced user and want to do it your own way, all you have to achieve is to have the dcr executable binary in your $PATH.

  1. Open a terminal

  2. In your home directory create a hidden directory .bin (Home directory should be the default when you open a new terminal window)

    mkdir ~/.bin

  3. Move the binary of your choice and rename it dcr. If you compiled it, most likely it’s already called dcr so you can ignore the renaming part of this step.

    mv dcr.Ubuntu16.04 ~/.bin/dcr

    Or

    mv dcr ~/.bin/dcr

  4. Add your $HOME/.bin directory to your $PATH variable. Open the file .bashrc and add the following line.

    export PATH=$PATH:/home/myusername/.bin

    Where /home/myusername is of course your home directory.

  5. Close and reopen the terminal or load the .bashrc file.

    source ~/.bashrc

License

BSD 3-Clause License

Copyright (c) 2018, SOAR Telescope

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Authors and Credits

Development Team

Bruno Quint dedicated part of his time as a post-doc to this project, and would like to acknowledge CNPq for the fellowship which allowed him to contribute to the development of the pipeline.

Contributors

We acknowledge the important contribution of David Sanmartim, who developed the initial incarnation of the redccd module. We thank Tina Armond for her invaluable help in adding calibrated comparison lamps to the library of reference comparison lamps for wavelength solution.

Acknowledgements

Our work would not be possible without the friendly work atmosphere at CTIO headquarters in La Serena, where we can interact with our SOAR and CTIO colleagues in lively and useful discussions that have been important in making the Goodman pipeline possible. We also acknowledge fruitful discussions and suggestions from our colleagues Bart Dunlop, Chris Clemens, and Erik Dennihy at the University of North Carolina at Chapel Hill.

Questions & Answers

  1. What is the Goodman High Throughput Spectrograph?
answer.
  2. How does the pipeline select the reference lamp?
answer.
  3. How should I organize the data?
More than organized, your data should be cleaned of undesired files. There are some general assumptions in the implementation of the pipeline’s data organization system that might be confused by files that are not supposed to be there.
  4. What is slit trim?

It is a process that trims the 2D spectroscopic images down to the slit-illuminated area only. It works by fitting a box function to the dispersion-axis-collapsed spatial profile.

The box function is Box1D. The reason for doing it is that the non-illuminated area causes all sorts of problems in later steps, such as the existence of NaN values in master flats.
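The idea behind slit trim can be sketched with a simple threshold-based edge finder. The pipeline actually fits an astropy Box1D model to the collapsed profile; the profile values below are made up:

```python
def slit_edges(profile, threshold_fraction=0.5):
    """Rough slit-edge finder: first and last indices where the
    dispersion-collapsed spatial profile exceeds a fraction of its peak."""
    threshold = max(profile) * threshold_fraction
    illuminated = [i for i, value in enumerate(profile) if value >= threshold]
    return illuminated[0], illuminated[-1]

# Made-up spatial profile: low counts outside the slit, high inside
profile = [0, 1, 2, 50, 52, 51, 49, 2, 1, 0]
print(slit_edges(profile))  # (3, 6)
```

Everything outside the returned rows is trimmed away, which avoids NaN values propagating into master flats and later steps.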

Change History

V1.1.0 24-07-2018

  • Bugs fixed
    • --keep-cosmic-file would work for dcr but not for lacosmic
  • Changed organization of ReadTheDocs information
    • New structure
    • Added references to external packages
    • This page is the single place to record change information. CHANGES.md still exists but contains a link here.
  • Added --version argument.
  • Implemented astroscrappy’s LACosmic method
  • Removed ccdproc’s cosmicray_lacosmic().
  • Created a default method for cosmic ray rejection.
    • For binning 1x1 default is dcr
    • For binning 2x2 default is lacosmic
    • For binning 3x3 default is lacosmic

The methods dcr, lacosmic, or none can still be forced by using --cosmic <method>.

V1.0.3 11-07-2018

  • Bugs fixed
    • Programmatic access to the version number did not work because it was based purely on setup.cfg; now setup.py has a function that creates the file pipeline.version, which is accessed by pipeline/__init__.py.
    • File naming was making some files disappear by being overwritten: for files that contained more than one target, the next file name would match the previous one. A differentiator was added.

V1.0.2 10-07-2018

  • Removed module goodman/pipeline/info.py and placed all metadata in goodman/setup.cfg.
  • Several updates to documentation
    • Added comment on how to organize data on soardata3.
    • Added link to licence on footer.
    • User manual now is in ReadTheDocs and no longer available as a pdf.
    • Improved information on debug plots
  • Bugs Fixed.
    • fixed GSP_FNAM value for reference lamps
    • Spectral limit calculation by including binning into the equation
    • Included binning in the calculation of the wavelength solution
    • Corrected messages and conditions under which the prefix for cosmic ray rejection is used
    • Image combination call and messages
  • Other additions
    • Added lookup table dcr.par file generator and found optimal parameters for Red camera and binning 2x2

V1.0.1 xx-xx-2018

  • Moved user manual from external repo to goodman/docs/
  • Added version checker
  • Centralised metadata (__version__, __licence__, etc) in goodman/setup.cfg
  • Added CHANGES.md

V1.0.0 29-04-2018

  • First production ready release

API Documentation