Analysis of large data sets
The upcoming surveys are so large that their analysis has to be automated. With this motto in mind, I have been exploring the use of the k-means clustering algorithm to organize (classify and process) large data sets.
- Pre-processing of raw data sets (in the solar-physics context).
- Classification of galaxy spectra: ASK classification of all the galaxies with spectra in SDSS DR7
- Classification of SEGUE stellar spectra
- A single-pass k-means ... faster than the traditional algorithm, by Ordovas and SA (2014)
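
For orientation, here is a minimal sketch of the traditional (Lloyd-type) k-means iteration that underlies the applications listed above. It is only an illustration, assuming NumPy and a data array with one spectrum (or feature vector) per row. It is neither the single-pass variant nor the code used for the classifications above, and the function name `kmeans` and its parameters `k`, `n_iter` and `seed` are hypothetical.

```python
import numpy as np


def kmeans(data, k, n_iter=100, seed=0):
    """Traditional k-means sketch: returns (centers, labels).

    data   : array of shape (n_samples, n_features)
    k      : number of clusters (illustrative parameter name)
    n_iter : maximum number of iterations
    seed   : seed for the random choice of initial centers
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct rows chosen at random.
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    labels = np.full(len(data), -1)
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dist2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist2.argmin(axis=1)
        # Stop when no row changes cluster.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Move each center to the mean of its members (skip empty clusters).
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels
```

Calling `kmeans(spectra, k)` on an array `spectra` returns the cluster centers (template spectra, in the classification context) and one integer label per row. Every iteration of this sketch revisits the full data set, which is the cost that grows with survey size.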