Welcome to The Bump Hunting Project by Patient Rule Induction Method. This website hosts a brief description of the goal of the project and its software PRIMsrc. It describes why and how you can use the software and provides some general remarks and links about it.


Overview

"Bump Hunting" refers to the procedure of mapping out local regions of the input space (attribute/feature/predictor) where a target function of interest, usually unknown, assumes larger (or smaller) values than its average over the entire space. These sought-after extreme values of the target function are also known as local/global extrema.

The picture below illustrates the idea. The sunshine over the mountain range shows how light can uncover peaks, highlands and valleys, just as "Bump Hunting" aims to uncover structure in the target function.

Mountains (Bill Wight Photography, Copyright 2015, with permission)

The problem of "Bump Hunting" covers mathematical / statistical tasks such as:

PRIMsrc implements a unified treatment of the "Bump Hunting" task in high-dimensional space. It uses a generic rule-induction algorithm by recursive peelings derived from the Patient Rule Induction Method (PRIM), initially introduced by Friedman & Fisher in 1999 (see Wiki "References"). It generates simple decision rules delineating a region in the multi-dimensional input space where the target function is unusually large (or small).
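To make the idea of recursive peeling concrete, here is a minimal Python sketch of a greedy top-down peeling loop for a regression-type response. It is an illustration only and is not the PRIMsrc code: the function name prim_peel, its parameters alpha and min_support, and the rule of stopping as soon as no peel raises the in-box mean are all assumptions made for this example.

```python
import numpy as np

def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Greedy top-down peeling (illustrative sketch, hypothetical names).

    Starting from the box that contains all points, repeatedly remove a
    fraction `alpha` of the remaining points along one edge of the box,
    choosing the edge whose removal gives the highest mean of y inside.
    """
    n, p = X.shape
    lower = X.min(axis=0).astype(float)
    upper = X.max(axis=0).astype(float)
    inside = np.ones(n, dtype=bool)

    while inside.mean() > min_support:
        best = None  # (mean inside, variable, side, new bound, new mask)
        for j in range(p):
            xj = X[inside, j]
            for side, q in (("lower", alpha), ("upper", 1.0 - alpha)):
                bound = np.quantile(xj, q)
                keep = inside & (X[:, j] >= bound if side == "lower"
                                 else X[:, j] <= bound)
                # Skip peels that remove nothing or shrink the box too far.
                if keep.sum() == inside.sum() or keep.mean() < min_support:
                    continue
                m = y[keep].mean()
                if best is None or m > best[0]:
                    best = (m, j, side, bound, keep)
        # Simplifying assumption: stop when no peel improves the mean.
        if best is None or best[0] <= y[inside].mean():
            break
        _, j, side, bound, keep = best
        if side == "lower":
            lower[j] = bound
        else:
            upper[j] = bound
        inside = keep

    return {"lower": lower, "upper": upper,
            "support": inside.mean(), "mean": y[inside].mean()}

# Synthetic example: a "bump" in the upper-right corner of the unit square.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 2))
y = ((X[:, 0] > 0.6) & (X[:, 1] > 0.6)).astype(float)
y = y + rng.normal(scale=0.1, size=1000)
box = prim_peel(X, y)
print(box)
```

The returned lower and upper bounds read directly as a decision rule of the form "lower[j] <= x_j <= upper[j] for every j". A small alpha is what makes the procedure "patient": each step removes only a few points, so the box can adapt to the data gradually.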


Why Use PRIMsrc?

The method (i) makes minimal assumptions about the data, (ii) gives easily interpretable rules with estimated variance and (iii) can target any desired response, being supervised for Survival, Regression and Classification (SRC) settings. These properties make it highly attractive to the user.

Unlike classical regression, classification and clustering problems, "Bump Hunting" is interested in:

Applications exist in a growing range of fields, including medicine, engineering, marketing, business analysis and materials research:


Readme

Visit the software Readme webpage to learn about License, Downloads, Branches, Requirements, Installation and Usage.


Wiki

Visit the project Wiki webpage for Roadmap, Documentation, Examples, Publications, Case Studies, Support and How to Contribute (code and documentation).


Authors/Contributors

Jean-Eudes Dazard, PhD.
Center for Proteomics and Bioinformatics
Case Western Reserve University
Cleveland, Ohio, USA

J. Sunil Rao, PhD.
Division of Biostatistics
Department of Epidemiology and Public Health
The University of Miami
Miami, Florida, USA

Michael LeBlanc, PhD.
Fred Hutchinson Cancer Research Center
Public Health Sciences
Department of Biostatistics, School of Public Health
The University of Washington
Seattle, Washington, USA

Michael Choe, MD.
Case Western Reserve University
Cleveland, Ohio, USA

Tarn Duong, PhD.
Research scientist
Computer Science Laboratory (LIPN)
University of Paris 13
Paris, France 


Acknowledgements

Project funded in part by the National Institutes of Health - National Cancer Institute, Grant: R01-CA160593 awarded to J. Sunil Rao/J-E. Dazard (co-PIs). This work was also made possible thanks to the help of Alberto Santana, MBA (Analyst Programmer, CWRU) and the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. Thanks also to professional photographer Bill Wight CA for the illustration picture above.

