Data Analysis Competition
THE 2009 COMPETITION HAS CONCLUDED. MANY THANKS TO ALL PARTICIPANTS.
The PHM Data Challenge is a competition open to all potential conference attendees. This year the challenge is focused on fault detection and magnitude estimation for a generic gearbox using accelerometer data and information about bearing geometry.
Participants will be scored based on their ability to correctly identify the type, location, and magnitude of damage in a gear system. Winners of the Student and the Professional categories who attend the conference and submit an invited paper to IJPHM on their technique will be awarded a cash prize. Top scoring participants will be invited to present at a special session of the conference.
Additional information can be found on the competition blog, http://phm09challenge.blogspot.com.
Teams may consist of one or more researchers. One winner from each of two categories will be determined on the basis of score. The categories are:
- Professional: open to anyone (including mixed teams)
- Student: open to any team with all members enrolled as full-time students during the spring 2009 or fall 2010 semesters.
Teams must declare what category they belong to when signing up. There is a cash prize of $1000 for the top entrant from each category, contingent upon:
- attending the conference
- giving an invited presentation on the winning technique
- submitting a journal-quality paper to the International Journal of Prognostics and Health Management (IJPHM) which discloses the full algorithm used.
Additionally, top scoring teams will be invited to give presentations at the special session, and submit papers to IJPHM. Submission of the challenge special session papers is outside the regular paper submission process and follows its own schedule (see the timetable at the bottom of this page).
The organizers of the competition reserve the right to both modify these rules and disqualify any team at their discretion.
Teams may register by contacting the competition organizers with their name(s), a team alias under which the scores will be posted, affiliation(s) with address(es), a contact phone number (for verification) and the competition category (professional or student). Student teams should also send the name of the university and the semesters in which they are enrolled full-time. You will be emailed your username and password after verification.
PLEASE NOTE: In the spirit of fair competition, we allow only one account per team. Please do not register multiple times under different user names, under fictitious names, or using anonymous accounts. Competition organizers reserve the right to delete multiple entries from the same person (or team) and/or to disqualify those who are trying to “game” the system or using fictitious identities.
The data is representative of generic industrial gearbox data. Details of the apparatus used to collect the data are provided on the apparatus page.
Data were sampled synchronously from accelerometers mounted on both the input and output shaft retaining plates. An attached tachometer generates 10 pulses per revolution providing very accurate zero crossing information.
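As a rough illustration of how the tachometer channel could be used, the sketch below counts rising edges in the tachometer signal and converts them to an average shaft speed using the 10 pulses per revolution stated above. The sampling rate and the pulse threshold are not given on this page, so `sample_rate_hz` and `threshold` are hypothetical parameters that you would need to set from the apparatus description and from inspecting the signal.

```python
def count_rising_edges(tach, threshold):
    """Count samples where the signal crosses the threshold going upward."""
    edges = 0
    for prev, cur in zip(tach, tach[1:]):
        if prev < threshold <= cur:
            edges += 1
    return edges


def mean_shaft_speed_hz(tach, sample_rate_hz, threshold, pulses_per_rev=10):
    """Average shaft speed over the whole record, in revolutions per second.

    sample_rate_hz and threshold are assumptions supplied by the caller;
    pulses_per_rev=10 comes from the data description.
    """
    duration_s = len(tach) / sample_rate_hz
    revolutions = count_rising_edges(tach, threshold) / pulses_per_rev
    return revolutions / duration_s


# Synthetic square wave (made-up values): 20 pulses in 80 samples.
tach = [0.0, 0.0, 5.0, 5.0] * 20
speed = mean_shaft_speed_hz(tach, sample_rate_hz=80.0, threshold=2.5)
```

This gives only a record-wide average. Interpolating the individual crossing times would give a finer speed estimate and a basis for time-synchronous averaging.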
Data were collected at 30, 35, 40, 45 and 50 Hz shaft speed, under high and low loading. Additionally, different repeated runs are included in the data, although the run time and load were not sufficient to induce significant fault progression. There are a total of 560 samples to be classified.
Data are provided in .csv files, with three columns – the first column is input voltage, second is output voltage, and the third is tachometer.
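A minimal sketch of reading one run file into its three channels is shown below. It assumes the files are comma-delimited with no header row, which this page does not state explicitly, and the function name `load_run` is purely illustrative.

```python
import csv
import io


def load_run(file_obj):
    """Read one run into three lists: input, output, tachometer.

    Assumes comma-delimited rows with no header line (an assumption,
    not something stated in the data description).
    """
    input_v, output_v, tach = [], [], []
    for row in csv.reader(file_obj):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        input_v.append(float(row[0]))   # column 1: input voltage
        output_v.append(float(row[1]))  # column 2: output voltage
        tach.append(float(row[2]))      # column 3: tachometer
    return input_v, output_v, tach


# Tiny in-memory stand-in for a run file (values are made up).
sample = io.StringIO("0.01,0.02,0.0\n-0.03,0.04,5.0\n")
inp, out, tach = load_run(sample)
```

For a real file, the same function could be called as `load_run(open("Run_1.csv", newline=""))`.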
The data can be downloaded as one 420 MB Zip file using the bittorrent protocol (torrent). Please seed!
Additionally, there is a new (November 2009) 480 MB data set with labels available for download here. These data are very similar to those used in the competition. These data may be used to develop and evaluate new algorithms.
Some fundamental signal processing techniques and diagnostic features for gearbox components are provided here. Additionally, code to assist in data analysis will be posted on the challenge blog http://phm09challenge.blogspot.com
Results must be submitted as a comma separated value (CSV) file, with exactly 560 rows and 45 columns. The values must be 1 or 0, with 1 corresponding to “true” and 0 corresponding to “false”. This page specifies what the values in the columns correspond to. Each row corresponds to run number, i.e., row 1 should correspond to the data in Run_1.csv, etc.
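Because a badly formatted file cannot be scored properly, it may be worth checking the shape and values before uploading. The sketch below is one possible way to do that. The function names are illustrative, and the all-zero matrix is only a placeholder for real predictions.

```python
import csv
import io

N_RUNS, N_COLS = 560, 45  # rows and columns required by the rules


def validate_submission(rows):
    """Raise ValueError unless rows is a 560 x 45 matrix of 0/1 values."""
    if len(rows) != N_RUNS:
        raise ValueError(f"expected {N_RUNS} rows, got {len(rows)}")
    for i, row in enumerate(rows, start=1):
        if len(row) != N_COLS:
            raise ValueError(f"row {i}: expected {N_COLS} columns, got {len(row)}")
        if any(v not in (0, 1) for v in row):
            raise ValueError(f"row {i}: values must be 0 or 1")


def write_submission(rows, file_obj):
    """Validate, then write the matrix as CSV (row i corresponds to Run_i.csv)."""
    validate_submission(rows)
    csv.writer(file_obj, lineterminator="\n").writerows(rows)


# Placeholder predictions: every label set to 0 ("false").
rows = [[0] * N_COLS for _ in range(N_RUNS)]
buf = io.StringIO()
write_submission(rows, buf)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("results.csv", "w", newline="")`.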
You can submit your results here.
The goal is to minimize the Hamming distance between your results matrix and the true state of the system. For example, if the true state of the system is [1,0,0,1,0] and you submit [1,1,0,0,0], your score is 2. The best possible score is a Hamming distance of 0 and the worst is 560 × 45 = 25200, provided the uploaded file is correctly formatted. Only this Hamming distance score is used to determine the final scores and ranking.
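To make the scoring concrete, here is a small sketch of a Hamming distance computed over equal-length binary vectors and summed over all rows of a results matrix. It is an illustration of the metric as described, not the organizers' scoring code, and the function names are hypothetical.

```python
def hamming_distance(truth, submitted):
    """Number of positions at which two equal-length vectors differ."""
    if len(truth) != len(submitted):
        raise ValueError("vectors must have the same length")
    return sum(t != s for t, s in zip(truth, submitted))


def score(truth_matrix, submitted_matrix):
    """Total Hamming distance over all rows; lower is better, 0 is perfect."""
    if len(truth_matrix) != len(submitted_matrix):
        raise ValueError("matrices must have the same number of rows")
    return sum(hamming_distance(t, s)
               for t, s in zip(truth_matrix, submitted_matrix))


# Two five-element vectors that differ in the second and fourth positions.
print(hamming_distance([1, 0, 0, 1, 0], [1, 1, 0, 0, 0]))  # prints 2
```

Since every cell counts equally, a submission that is wrong in every cell of a 560 × 45 matrix would score 25200.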
Scores from submitted test set results will be posted to the leaderboard. You can use the leaderboard to see where you are in comparison to the other competitors. Your best entry will be posted on the leaderboard. If your current upload ties your previous best (even after tie-breakers), then the latest upload will be displayed on the leaderboard. Our hope is that the leaderboard will inspire some friendly competition!
The leaderboard is updated only once per day around noon Pacific time.
Questions may be submitted using the site contact form. Answers will be posted to the FAQ at http://phm09challenge.blogspot.com
Schedule for PHM Data Challenge
| Date | Milestone |
| --- | --- |
| 10 April 2009 | Data released |
| 13 July 2009 | Final submissions due |
| 20 July 2009 | Winners announced, Invitation to submit paper |
| 24 July 2009 | Confirmation of willingness to present and publish FULL algorithm in IJPHM |
| 28 July 2009 | Winners announced |
| 31 August 2009 | Papers due to IJPHM |
| 14 September 2009 | Reviewers’ comments back to authors |
| 25 September 2009 | Final Paper Due |