Public Data Sets

2009 PHM Challenge Competition Data Set

The 2009 PHM Challenge focused on fault detection and magnitude estimation for a generic gearbox using accelerometer data and information about bearing geometry. Participants were scored on their ability to correctly identify the type, location, and magnitude of damage in a gear system.


The data is representative of generic industrial gearbox data. Details of the apparatus used to collect the data are provided on the apparatus page.


Data were sampled synchronously from accelerometers mounted on both the input and output shaft retaining plates. An attached tachometer generates 10 pulses per revolution, providing very accurate zero-crossing information.
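Because the tachometer produces 10 pulses per revolution, the mean speed of the shaft it is attached to can be estimated by counting samples between pulses. The following is a minimal sketch rather than the organizers' method: the function names, the pulse threshold, and the sample rate are all assumptions (the sample rate is not stated on this page), and the signal used here is synthetic.

```python
def rising_edges(tach, threshold):
    """Return indices i where the signal crosses the threshold upward at sample i."""
    return [i for i in range(1, len(tach)) if tach[i - 1] < threshold <= tach[i]]


def mean_shaft_speed_hz(tach, sample_rate_hz, threshold, pulses_per_rev=10):
    """Estimate mean shaft speed in revolutions per second from tachometer pulses."""
    edges = rising_edges(tach, threshold)
    if len(edges) < 2:
        raise ValueError("need at least two tachometer pulses")
    # Average number of samples between consecutive pulses.
    samples_per_pulse = (edges[-1] - edges[0]) / (len(edges) - 1)
    return sample_rate_hz / (samples_per_pulse * pulses_per_rev)


# Synthetic signal (assumed values): one pulse every 10 samples at 1000 samples/s,
# i.e. 100 pulses/s, which at 10 pulses per revolution is 10 revolutions/s.
synthetic_tach = ([5.0] + [0.0] * 9) * 20
speed = mean_shaft_speed_hz(synthetic_tach, sample_rate_hz=1000.0, threshold=2.5)
```

With real data, the threshold would need to be chosen from the observed pulse amplitude, and the sample rate taken from the data set documentation.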

Data were collected at shaft speeds of 30, 35, 40, 45 and 50 Hz, under high and low loading. Repeated runs are also included in the data, although the run time and load were not sufficient to induce significant fault progression. There are a total of 560 samples to be classified.

Data are provided in .csv files with three columns: the first is the input accelerometer voltage, the second is the output accelerometer voltage, and the third is the tachometer signal.
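A run file with this three-column layout could be read along the following lines. This is a sketch under assumptions: it supposes comma-separated values with no header row, and the function name load_run is hypothetical. The in-memory text stands in for a real file, which would be passed as an open file object instead.

```python
import csv
import io


def load_run(source):
    """Read a three-column run into lists: input channel, output channel, tachometer."""
    input_v, output_v, tach = [], [], []
    for row in csv.reader(source):
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        input_v.append(float(row[0]))
        output_v.append(float(row[1]))
        tach.append(float(row[2]))
    return input_v, output_v, tach


# Tiny made-up stand-in for a real data file (values are illustrative only).
sample = io.StringIO("0.01,0.02,0.0\n-0.03,0.04,5.0\n")
inp, out, tach = load_run(sample)
```

If the actual files use a different delimiter or include a header, the reader arguments and the row filter would need adjusting.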

The data can be downloaded as five 84 MB Zip files: part 1, part 2, part 3, part 4, part 5.

Note that the above data set is unlabeled. A new (November 2009) 480 MB data set with labels is available for download here. These data are very similar to those used in the competition. These data may be used to develop and evaluate new algorithms.

Domain Fundamentals

Some fundamental signal processing techniques and diagnostic features for gearbox components are provided here. Additionally, code to assist in data analysis is posted on the challenge blog.

Performance Evaluation

The goal is to minimize the Hamming distance between your results matrix and the true state of the system. For example, if the true state of the system is [1,0,0,1,0] and you submit [1,1,0,0,0], your score is 2, because the two vectors differ in the second and fourth positions. The best possible performance is a Hamming distance of 0, and the worst score is 560 × 45 = 25,200, provided the uploaded file is correctly formatted. Only this Hamming distance score is used to determine the final scores and ranking.
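The scoring rule can be illustrated with a short sketch. The function name is hypothetical and this is not the organizers' scoring code. It counts the positions at which two equal-length binary vectors differ, using five-element vectors that differ in two places, and reproduces the worst-case figure from the number of samples and the 45 entries per sample implied by the 560 × 45 product.

```python
def hamming_distance(truth, submitted):
    """Count the positions at which two equal-length sequences differ."""
    if len(truth) != len(submitted):
        raise ValueError("vectors must have the same length")
    return sum(t != s for t, s in zip(truth, submitted))


# Five-element example: the vectors differ at the second and fourth positions.
score = hamming_distance([1, 0, 0, 1, 0], [1, 1, 0, 0, 0])

# Worst case: every entry wrong for all 560 samples with 45 entries each.
worst_score = 560 * 45
```

A full results matrix would be scored by summing this distance over all rows.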