Software

Column groups: CRAN Version, Development Version
Columns: Version, Downloads, Source Code, Tests (Linux), Tests (Windows), Test Coverage
betaboost
CoxFlexBoost
Daim
gamboostLSS
kangar00
lethal
mboost
OpenML
opm
papeR
stabs
  • The packages above are ordered alphabetically.

  • Badges show package-specific information and test results from GitHub, if available.

  • Click on the badges to go to the CRAN package, package source or test results.

  • For more information on a package's purpose, click on the package name (or scroll down).

  • Below, the packages are grouped by topic. Topics are in alphabetical order.

Boosting

The following packages are devoted to boosting methods. The package mboost is a very generic implementation of boosting methods for a wide range of models. The package gamboostLSS extends it to fit GAMLSS models, i.e., models that regress multiple distribution parameters on covariates.

  • mboost: Model-Based Boosting.

    Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data.

    Authors: Torsten Hothorn, Peter Bühlmann, Thomas Kneib, Matthias Schmid, Benjamin Hofner

    Find out more about mboost

    • A tutorial showing the usage of mboost (including some of the latest features) is available as
        R> vignette("mboost_tutorial", package = "mboost")
      and appeared in Computational Statistics.

    • A short paper describing the new features of the mboost 2.0 series is available at JMLR MLOSS: Model-based Boosting 2.0.

  • gamboostLSS: Boosting Methods for GAMLSS models.

    Boosting models for fitting generalized additive models for location, shape and scale (GAMLSS) to potentially high-dimensional data.

    Authors: Benjamin Hofner, Andreas Mayr, Nora Fenske, Matthias Schmid

    Find out more about gamboostLSS

  • CoxFlexBoost: Boosting Flexible Cox Models (with Time-Varying Effects).

    Likelihood-based boosting approach to fit flexible, structured survival models with component-wise linear or P-spline base-learners. Variable selection and model choice are built-in features.

    R package version 0.7-0 (beta) now available on R-Forge.

    Author: Benjamin Hofner

  • betaboost: Boosting Beta Regression

    Implements boosting beta regression for potentially high-dimensional data. The betaboost package uses the same parametrization as betareg to make results directly comparable. The underlying boosting algorithms are implemented via the R add-on packages mboost and gamboostLSS.

    Authors: Andreas Mayr, Benjamin Hofner, Leonie Weinhold, Matthias Schmid
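The component-wise functional gradient descent mentioned in the mboost description can be sketched in a few lines of base R. This is only an illustration for squared error loss with linear base-learners; the function name cw_l2boost and its arguments mstop and nu are chosen for this sketch and it does not reproduce mboost's actual implementation or interface.

```r
# Component-wise L2 boosting with linear base-learners (base R only).
# Illustrative sketch; names are assumptions, not mboost internals.
cw_l2boost <- function(X, y, mstop = 100, nu = 0.1) {
  X <- scale(X, center = TRUE, scale = FALSE)   # centre the covariates
  offset <- mean(y)                             # start from the mean of y
  fit <- rep(offset, length(y))
  beta <- setNames(numeric(ncol(X)), colnames(X))
  for (m in seq_len(mstop)) {
    u <- y - fit                                # negative gradient of the squared error loss
    # least squares coefficient of each single covariate fitted to u
    b <- as.vector(crossprod(X, u)) / colSums(X^2)
    # residual sum of squares of each candidate base-learner
    rss <- sapply(seq_len(ncol(X)), function(j) sum((u - X[, j] * b[j])^2))
    j <- which.min(rss)                         # best-fitting component
    beta[j] <- beta[j] + nu * b[j]              # small step for this component only
    fit <- fit + nu * X[, j] * b[j]
  }
  list(offset = offset, coefficients = beta, fitted = fit)
}

# Simulated example: only x1 and x3 carry signal
set.seed(1)
n <- 200; p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- 2 * X[, 1] - 1.5 * X[, 3] + rnorm(n)
mod <- cw_l2boost(X, y, mstop = 200, nu = 0.1)
round(mod$coefficients, 2)
```

Because only one component is updated per iteration, stopping early leaves the coefficients of rarely selected covariates at or near zero, which is how this type of algorithm performs variable selection.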

Biological applications

The package lethal can be used to compute lethal doses for count data outcomes (e.g., number of cells). It uses flexible smooth effects models and provides various statistical inference procedures. The package opm can be used to store, manipulate and analyze phenotype microarray data. Splines can be used to fit the growth curves.

  • lethal: Compute lethal doses (LD) with confidence intervals.

    Compute lethal doses for count data based on generalized additive models (GAMs) together with parametric bootstrap confidence intervals for the lethal dose. The package is designed for experiments with counts as outcome, which need a separate preparation for each measurement. Examples of such experiments are survival experiments where survival is measured as the number of colony forming units (c.f.u.). In this case, one cannot measure one preparation multiple times with various doses; instead, one needs one experiment (with one or more biological replicates) for each dose.

    Author: Benjamin Hofner

    Find out more about lethal

  • opm: Tools for analysing OmniLog(R) Phenotype Microarray data.

    Tools for analysing OmniLog® and MicroStation™ phenotype microarray (PM) data as produced by the devices distributed by BIOLOG Inc. as well as similar kinds of data such as growth curves. Major facilities are plotting data, accurately estimating curve parameters, comparing and discretising data, creating phylogenetic formats and reports for taxonomic journals, drawing the PM analysis results in biochemical pathway graphs optionally including genome annotations, running multiple comparisons of means, easy interaction with powerful feature-selection approaches, integrating metadata, using the YAML format for the storage of data and metadata, batch conversion of large numbers of files, and database I/O.

    R package version 1.1-0 available on CRAN (archived).
    Current development version available on R-Forge.
    An install script for current versions can be found at http://www.goeker.org/opm.

    Author: Markus Goeker with contributions by Benjamin Hofner, Lea A.I. Vaas, Johannes Sikorski, Nora Buddruhs and Anne Fiebig

    Find out more about opm

Classification

An implementation of various methods to evaluate the performance of classification models is given in the package Daim.

  • Daim: Diagnostic accuracy of classification models.

    Several functions for evaluating the accuracy of classification models. The package provides the following performance measures: repeated k-fold cross-validation, 0.632 and 0.632+ bootstrap estimation of the misclassification rate, sensitivity, specificity and AUC. If an application is computationally intensive, parallel execution can be used to reduce the computational effort.

    Authors: Sergej Potapov, Werner Adler, Benjamin Hofner and Berthold Lausen
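As an illustration of one of the measures listed above, the 0.632 bootstrap estimate of the misclassification rate combines the apparent (resubstitution) error with the error on observations left out of each bootstrap sample, weighted 0.368 and 0.632. The following base R sketch assumes a logistic regression classifier and a data frame with a 0/1 outcome column named y; the function boot632 is hypothetical and is not Daim's code or interface.

```r
# 0.632 bootstrap estimate of the misclassification rate (base R only).
# Sketch under stated assumptions: logistic regression, 0/1 outcome column `y`.
boot632 <- function(data, B = 50) {
  n <- nrow(data)
  fit_predict <- function(train, test) {
    fit <- glm(y ~ ., data = train, family = binomial)
    as.integer(predict(fit, newdata = test, type = "response") > 0.5)
  }
  # apparent error: fit and evaluate on the full data
  err_app <- mean(fit_predict(data, data) != data$y)
  err_sum <- numeric(n)   # number of out-of-bag misclassifications per observation
  n_oob <- numeric(n)     # number of times each observation was out-of-bag
  for (b in seq_len(B)) {
    idx <- sample.int(n, replace = TRUE)        # bootstrap sample
    oob <- setdiff(seq_len(n), idx)             # observations not drawn
    wrong <- fit_predict(data[idx, ], data[oob, ]) != data$y[oob]
    err_sum[oob] <- err_sum[oob] + wrong
    n_oob[oob] <- n_oob[oob] + 1
  }
  # average per observation first, then over observations
  err_oob <- mean(err_sum[n_oob > 0] / n_oob[n_oob > 0])
  c(apparent = err_app, oob = err_oob,
    err632 = 0.368 * err_app + 0.632 * err_oob)
}

# Simulated example
set.seed(123)
n <- 150
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.5 * dat$x1 - dat$x2))
res <- boot632(dat, B = 50)
round(res, 3)
```

The bootstrap replicates are independent of each other, which is why this kind of computation lends itself to the parallel execution mentioned in the package description.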

GWAS Studies

Methods to analyze genome-wide association studies (GWAS) are implemented in the package kangar00. In conjunction with mboost (see above), the package can also be used to fit kernel boosting models for GWAS.

  • kangar00: Kernel Approaches for Nonlinear Genetic Association Regression

    Methods to extract information on pathways, genes and SNPs from online databases. It provides functions for data preparation and evaluation of genetic influence on a binary outcome using the logistic kernel machine test (LKMT). Three different kernel functions are offered to analyze genotype information in this variance component test: a linear kernel, a size-adjusted kernel and a network-based kernel.

    Authors: Juliane Manitz, Stefanie Friedrichs, Patricia Burger, Benjamin Hofner, Ngoc Thuy Ha

Machine Learning

To allow better collaboration, the web platform OpenML provides a rich infrastructure. The following package provides direct access from within R:

  • OpenML: Exploring Machine Learning Better, Together

    'OpenML.org' is an online machine learning platform where researchers can automatically share data, machine learning tasks and experiments and organize them online to work and collaborate more effectively. We provide an R interface to the OpenML REST API in order to download and upload data sets, tasks, flows and runs; see http://www.openml.org/guide for more information.

    Authors: Giuseppe Casalicchio, Bernd Bischl, Dominik Kirchhoff, Michel Lang, Benjamin Hofner, Jakob Bossek, Pascal Kerschke, Joaquin Vanschoren

Reproducible Research

To achieve reproducible results, it is important to be able to easily generate and modify standard output. The package papeR aims at providing tools to prettify output for reports and to generate tables for easy usage in reports.

  • papeR: A Toolbox for Writing Pretty Papers and Reports

    A toolbox for writing knitr, Sweave or other LaTeX- or markdown-based reports and to prettify the output of various estimated models.

    Author: Benjamin Hofner

    Find out more about papeR

    • A tutorial on how to use papeR can be found here.

Stability Selection

Stability selection is implemented in the versatile and generic package stabs. It implements both the standard error bound derived by Meinshausen & Bühlmann (2010) and the improved error bounds of Shah & Samworth (2013).

  • stabs: Stability Selection with Error Control.

    Resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both standard stability selection (Meinshausen & Bühlmann, 2010) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013) are implemented. The package can be combined with arbitrary user-specified variable selection approaches.

    Authors: Benjamin Hofner and Torsten Hothorn

    Find out more about stabs
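The idea behind standard stability selection can be sketched in base R as follows: a selection procedure that picks q variables is applied to repeated subsamples of half the data, and variables selected with relative frequency at or above a cutoff are kept. Under the assumptions of Meinshausen & Bühlmann (2010), the expected number of false selections (PFER) is then bounded by q^2 / ((2 * cutoff - 1) * p). The functions select_top_q and stability_selection are hypothetical names for this sketch, the marginal-correlation ranking merely stands in for Lasso or boosting, and none of this reproduces the interface of stabs.

```r
# Stability selection by subsampling (base R only); illustrative sketch.
# `select_top_q` is a stand-in for any variable selection method
# that returns the indices of q selected variables.
select_top_q <- function(X, y, q) {
  score <- abs(cor(X, y))[, 1]                  # absolute marginal correlations
  order(score, decreasing = TRUE)[seq_len(q)]
}

stability_selection <- function(X, y, q, cutoff = 0.75, B = 100) {
  n <- nrow(X); p <- ncol(X)
  counts <- numeric(p)
  for (b in seq_len(B)) {
    idx <- sample.int(n, floor(n / 2))          # subsample of size n/2, without replacement
    sel <- select_top_q(X[idx, , drop = FALSE], y[idx], q)
    counts[sel] <- counts[sel] + 1
  }
  freq <- counts / B                            # selection frequencies
  list(frequency = freq,
       selected = which(freq >= cutoff),        # stable variables
       PFER = q^2 / ((2 * cutoff - 1) * p))     # upper bound on expected false selections
}

# Simulated example: only the first two variables carry signal
set.seed(2024)
n <- 100; p <- 50
X <- matrix(rnorm(n * p), n, p)
y <- 2 * X[, 1] + 2 * X[, 2] + rnorm(n)
res <- stability_selection(X, y, q = 5, cutoff = 0.75, B = 100)
res$selected
res$PFER
```

With q = 5, a cutoff of 0.75 and p = 50, the bound evaluates to 25 / (0.5 * 50) = 1. Complementary pairs stability selection would instead evaluate each subsample together with its complement; the improved bounds of Shah & Samworth (2013) are not shown here.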