Weibull-R : Weibull Analysis on R
Package ‘WeibullR’ is now available on CRAN!
After several years of development, a fairly complete scope of analysis for Weibull and lognormal distributions was judged ready for CRAN submission. This makes the package easier to access and provides a compiled form for OSX users.
- Interval data can now be input, analyzed and displayed.
- Third parameter optimization with display of data modified by t0.
- Fisher Matrix bounds can be applied to MLE fitted data.
- A particularly stable likelihood ratio contour algorithm has been implemented in fast C++ code.
- Likelihood ratio bounds can be applied to MLE fitted data.
- Contour map displays can be drawn based on wblr data entry alone.
- Ability to alter display of tied values by highest, mean, lowest or sequential specification.
  (Code to drop tied values may yet follow, but we really don’t know why anyone would want to do this!)
- Weibayes analysis is available for datasets with few, one or even no failure points.
The WeibullR package provides a flexible data entry capability with three levels of usage.
*Quick Fit* Functions
These functions, with intuitive names from `MLEw2p` through `MRRln3p`, prepare simple fits, bounds, and displays using default options. Only data sets with exact failure times and/or suspensions are processed.
The quick fit functions return a simple named vector of the fitted parameters along with the appropriate goodness-of-fit measure(s).
Two final logical arguments control the optional preparation of interval bounds (at 90% confidence) and the display of the fit and bounds, so that a function call like `MLEw2p(input_data,T,T)` will generate a plot with the fitted data and confidence interval bounds.
When the first logical for bounds is set to TRUE, the returned object will be a list with the fitted parameter vector first and a dataframe of bound values second.
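As a minimal sketch of a quick fit call, the following uses made-up failure and suspension times. The argument names `s`, `bounds`, and `show` are assumptions about the function signature and should be checked against the package help pages. The package calls are guarded so that the script still runs where WeibullR is not installed.

```r
# Hypothetical life data, for illustration only (not from a real test)
fail <- c(30, 49, 82, 90, 96)   # exact failure times
susp <- c(100, 45, 10)          # suspension (right-censored) times

if (requireNamespace("WeibullR", quietly = TRUE)) {
  # Default call: a named vector of fitted parameters and a goodness-of-fit measure
  fit <- WeibullR::MLEw2p(fail, s = susp)
  print(fit)

  # With bounds requested, the result should be a list:
  # parameter vector first, data frame of bound values second.
  # `bounds` and `show` are assumed names for the two trailing logical arguments;
  # adding show = TRUE should also draw the plot.
  res <- WeibullR::MLEw2p(fail, s = susp, bounds = TRUE)
  str(res)
} else {
  message("WeibullR is not installed; run install.packages(\"WeibullR\") to try this sketch")
}
```

Naming the logical arguments, rather than passing them by position, keeps the call readable and avoids any ambiguity about which argument receives which value.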
wblr Object Model
Construction of a wblr object is initiated by providing a data set through function `wblr`.
Modification of the object with the progressive addition of fits and confidence interval bounds is made via functions `wblr.fit` and `wblr.conf`.
Fine control over many aspects of fit, confidence, and display is made possible by a flexible options mechanism.
Display for single object models is via S3 methods `plot` or `contour`, while multiple objects *(provided as a list)* can be displayed on a single plot using `plot.wblr`, `plot_contour`, or `contour.wblr`.
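The object model workflow might look like the sketch below, again on made-up data. The option names `method.fit` and `method.conf` and their values `"mle"` and `"lrb"` are assumptions about the options mechanism and should be checked against the package documentation. Only the function names come from the text above.

```r
# Hypothetical life data, for illustration only
fail <- c(30, 49, 82, 90, 96)   # exact failure times
susp <- c(100, 45, 10)          # suspension times

if (requireNamespace("WeibullR", quietly = TRUE)) {
  library(WeibullR)

  # Build the object, then add a fit and confidence bounds step by step
  obj_mle <- wblr(fail, susp)
  obj_mle <- wblr.fit(obj_mle, method.fit = "mle")    # assumed option name and value
  obj_mle <- wblr.conf(obj_mle, method.conf = "lrb")  # assumed option name and value
  plot(obj_mle)

  # A second object using the default fit options, shown together with the first
  obj_default <- wblr.fit(wblr(fail, susp))
  plot.wblr(list(obj_mle, obj_default))
}
```

Because each step returns the modified object, several fits or bound types can be accumulated on one data set before anything is drawn.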
Back-end Functions
The back-end functions that provide all the functionality of the upper levels of usage are also available as exported functions.
These functions may provide advanced users with resources to expand analysis further than has been implemented within the WeibullR package.
Data entry is made through the *Quick Fit* functions, `wblr`, or on the backend through `getPPP` for rank regression, and `mleframe` for MLE processing.
In all cases the primary argument `x` can be a vector of exact failure times or a dataframe with `time` and `event` columns as a minimum. An additional column `qty` may optionally be used to record duplicated data.
If the dataframe entry is not used (in favor of an exact failure time vector), a second argument, `s`, can be used to enter a vector of last observed success times for right-censored data (suspensions).
Beyond the entry of the first two data types, interval data (including discoveries with last known success time=0) are entered via argument `interval` as a dataframe with columns `left` and `right` as a minimum. As with the primary argument dataframe entry, an additional column `qty` may optionally be used to record duplicated interval data. Such interval data entry is not supported with the Quick Fit functions.
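The data layouts described above can be sketched in base R as follows. All values are invented for illustration. The coding `event = 1` for a failure and `event = 0` for a suspension is an assumption, as is passing the dataframes by these argument names.

```r
# Exact failures and suspensions in one data frame.
# Assumed coding: event = 1 is a failure, event = 0 is a suspension.
life_df <- data.frame(
  time  = c(30, 49, 82, 90, 96, 100, 45, 10),
  event = c(1, 1, 1, 1, 1, 0, 0, 0)
)

# Interval data: each row is an inspection interval (left, right].
# left = 0 marks a "discovery" (failed before the first inspection);
# qty records how many units share the same interval.
insp_df <- data.frame(
  left  = c(0, 100, 200),
  right = c(100, 200, 300),
  qty   = c(2, 5, 3)
)

if (requireNamespace("WeibullR", quietly = TRUE)) {
  # Object model entry with both data types
  obj <- WeibullR::wblr(life_df, interval = insp_df)

  # Back-end entry for MLE processing
  mle_input <- WeibullR::mleframe(life_df, interval = insp_df)
  print(mle_input)
}
```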
Please help the development of this significant package. We need realistic example data, particularly interval data. There are ways to ‘anonymize’ data so that it won’t reveal proprietary information. Good data examples can help the community appreciate the value of such reliability analysis and help us test and validate the code.
Historical notes on life data analysis and our original work
Life data analysis has long been available on R. The survival package, sponsored by the Mayo Foundation, has recommended status within the R community. Several other packages support distribution fitting, predominantly by maximum likelihood estimation. The distinction of the packages developed here is the focus on the linear transform of the CDF for graphical display. This is the hallmark of Waloddi Weibull's engineering approach to reliability data, which was originally a fit by eyesight in the absence of any computational automation. Least squares linear regression quickly followed due to its relative computational ease. Eventually, with wider access to computational resources, statisticians espoused the maximum likelihood estimation (MLE) technique due to its "excellent" properties.
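For the two-parameter Weibull distribution, the linear transform mentioned above can be written out as follows. The symbols $\eta$ (scale) and $\beta$ (shape) are conventional notation and are introduced here as assumptions, since the text does not name them.

```latex
F(t) = 1 - \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]
\quad\Longrightarrow\quad
\ln\!\left[-\ln\bigl(1 - F(t)\bigr)\right] = \beta \ln t - \beta \ln \eta
```

Plotting $y = \ln[-\ln(1 - F)]$ against $x = \ln t$ therefore gives a straight line with slope $\beta$ and intercept $-\beta \ln \eta$, which is what makes a fit by eye, or by least squares, possible on Weibull probability paper.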
Certainly, for the evaluation of large data sets (more than 200 points), the MLE approach should be a priority. Sample sizes of that order are rarely a problem in biostatistical studies. Unfortunately, reliability engineering tends to rely on small data sets: because of the severe financial consequences of failure, decisions must be made with limited data. In the small data set environment, however, the MLE technique produces a known bias in fitted results. When applied to predictive analysis of failures using the Weibull distribution, the MLE may result in undesirable optimism.
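The small-sample bias can be illustrated with a short simulation in base R. This is a sketch under stated assumptions, not code from the package: it considers only complete samples (no suspensions), uses a hypothetical helper `weibull_shape_mle`, and picks a true shape of 2 and a sample size of 5 arbitrarily.

```r
# Hypothetical helper: MLE of the Weibull shape for a complete sample,
# found as the root of the profile-likelihood score equation.
weibull_shape_mle <- function(t) {
  t  <- t / max(t)   # the shape estimate is scale-invariant; this avoids overflow
  lt <- log(t)
  score <- function(b) sum(t^b * lt) / sum(t^b) - 1 / b - mean(lt)
  uniroot(score, interval = c(0.01, 500), extendInt = "upX", tol = 1e-8)$root
}

set.seed(1)
true_shape <- 2    # arbitrary choice for the illustration
n <- 5             # a small sample, typical of reliability work

# Repeat the small-sample fit many times and collect the shape estimates
shape_hat <- replicate(
  2000,
  weibull_shape_mle(rweibull(n, shape = true_shape, scale = 100))
)

mean(shape_hat)     # compare with true_shape
median(shape_hat)   # compare with true_shape
```

Both the mean and the median of the estimates should come out above the true shape. An overestimated shape steepens the fitted line, which tends to lower the predicted probability of early failures, and that is the optimism referred to above.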
This project was started as an expository implementation of several functions supporting reliability analysis methods presented in "The New Weibull Handbook, Fifth Edition" by Dr. Robert B. Abernethy. Unfortunately it has been found that this self-published work lacked peer review and has come under academic attack [1] for some content. For this reason a determination was made not to submit the abrem package to CRAN and there has been a suspension in activity as the authors moved on to other priorities.
However, the abrem package has generated considerable interest among reliability engineers even in its developmental state. For this reason it is desirable to continue this work, perhaps with a re-packaging, in order to give the reliability practitioner a full array of analytical tools. In addition to the linear display mentioned above, this project contains several features not easily found elsewhere on CRAN. Features include varied options for least squares linear regression, maximum likelihood estimation, third parameter fitting for the distributions of interest, and confidence bound determination by several methods. Although not implemented in the application layer, predictive maintenance optimization and entry of interval data have been handled in the technical code. Additionally, methods of data set comparison are in development.
The packages forming the "Abernethy Reliability Methods", abrem, are distributed as back-end calculation packages, which are called by an application layer. The calculation packages require compilation, which results in a binary file for Windows with a .zip extension. These are not to be confused with compressed file archives; rather, they are intended to be opened only by R during an installation process.
Developmental source repositories are on GitHub.
Here is a user's guide for the abrem application layer.
Carles C.G. has produced a Shiny application using the original abrem packages.
1. Genschel, U. and Meeker, W.Q. (2010). A Comparison of Maximum Likelihood and Median-Rank Regression for Weibull Estimation. Quality Engineering, 22(4): 236–255.