R: Aggregate kinetics using curve-parameter estimation

do_aggr {opm}

R Documentation

Aggregate kinetics using curve-parameter estimation

Description

Aggregate the kinetic data using curve-parameter estimation, i.e. infer parameters from the kinetic data stored in an OPM object using either the grofit package or the built-in method. Optionally include the aggregated values in a new OPMA object together with previously collected information.

Usage

## S4 method for signature 'MOPMX'
do_aggr(object, ...)

## S4 method for signature 'OPM'
do_aggr(object, boot = 0L, verbose = FALSE,
    cores = 1L, options = if (identical(method, "splines"))
      set_spline_options()
    else
      list(), method = "splines", plain = FALSE, logt0 = FALSE)

## S4 method for signature 'OPMS'
do_aggr(object, ...)

## S4 method for signature 'matrix'
do_aggr(object, what = c("AUC", "A"),
    boot = 100L, ci = 0.95, as.pe = "median", ci.type = "norm",
    time.pos = 1L, transposed = FALSE, raw = FALSE, ...)

Arguments

`object`	`OPM`, `OPMS` or `MOPMX` object or matrix as output by `measurements`, i.e. with the time points in the first column and the measurements in the remaining columns (there must be at least two). For deviations from this scheme see `time.pos` and `transposed`.
`boot`	Integer scalar. Number of bootstrap replicates used to estimate 95-percent confidence intervals (CIs) for the parameters. Set this to zero to omit bootstrapping, resulting in `NA` entries for the CIs. Note that under the default settings of the matrix method for `as.pe`, bootstrapping is also necessary to obtain the point estimate.
`verbose`	Logical scalar. Print progress messages?
`cores`	Integer scalar. Number of cores to use. Setting this to a value larger than `1` requires that `mclapply` from the parallel package can be run with more than one core, which is impossible under Windows. The `cores` argument has no effect if `opm-fast` is chosen (see below). If `cores` is zero or negative, it is added to the overall number of cores on the system as determined by `detectCores` from the parallel package, and the resulting number is used. For instance, if the system has eight cores, `-1` means using seven cores.
`options`	List. For its use in grofit mode, see `grofit.control` in that package. The `boot` and `verbose` settings, as the most important ones, are added separately (see above). The verbose mode is not very useful in parallel processing. With `method` `"splines"`, options can be specified using the function `set_spline_options`.
`method`	Character scalar. The aggregation method to use. Currently only the following methods are supported:
	`splines`: Fit various splines (smoothing splines and P-splines from mgcv and smoothing splines via `smooth.spline`) to PM data. Recommended.
	`grofit`: The `grofit` function in the eponymous package, with spline fitting as default.
	`opm-fast`: The native, faster parameter estimation implemented in the matrix method. This will only yield two of the four parameters, the area under the curve and the maximum height. The area under the curve is estimated as the sum of the areas given by the trapezoids defined by each pair of adjacent time points. The maximum height is just the result of `max`. By default the median of the bootstrap values is used as point estimate. For details see `as.pe`.
`plain`	Logical scalar. If `TRUE`, only the aggregated values are returned (as a matrix, for details see below). Otherwise they are integrated in an `OPMA` object together with `object`.
`logt0`	Logical scalar passed to `measurements`.
`what`	Character scalar. Which parameter to estimate. Currently only two are supported: `AUC` (area under the curve) and `A` (maximum height).
`ci`	Confidence level of the interval reported in the output (`0.95` by default). Ignored if `boot` is not positive.
`as.pe`	Character scalar determining what to output as the point estimate. Either `median`, `mean` or `pe`; the first two calculate the point estimate from the bootstrapping replicates, the third one uses the point estimate from the raw data. If `boot` is 0, `as.pe` is reset to `pe`, if necessary, and a warning is issued.
`ci.type`	Character scalar determining the way the confidence intervals are calculated. Either `norm`, `basic` or `perc`; see `boot.ci` from the boot package for details.
`time.pos`	Character or integer scalar indicating the position of the column (or row, see next argument) with the time points.
`transposed`	Logical scalar indicating whether the matrix is transposed compared to the default.
`raw`	Logical scalar. Return the raw bootstrapping result without CI estimation and construction of the usually resulting matrix?
`...`	Optional arguments passed between the methods or to `boot` from the eponymous package.
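
The rule for zero or negative `cores` values described above can be sketched in base R. This is only an illustration: `resolve_cores` is a hypothetical helper, not an opm function, and the lower bound of one core is an assumption that the text does not state.

```r
# Hypothetical helper, not part of opm: turn the 'cores' argument into the
# number of cores that would actually be used.
resolve_cores <- function(cores, total = parallel::detectCores()) {
  cores <- as.integer(cores)
  if (cores > 0L) {
    return(cores)  # positive values are taken as given
  }
  # Zero or negative: add the request to the total number of cores.
  # Clamping to at least one core is an assumption made for this sketch.
  max(1L, as.integer(total) + cores)
}

# 'total' is fixed here so that the example does not depend on the machine.
resolve_cores(-1L, total = 8L)  # seven cores, as in the example in the text
resolve_cores(0L, total = 8L)   # all eight cores
resolve_cores(2L, total = 8L)   # two cores
```

Passing `total` explicitly keeps the sketch reproducible; in real use it would default to the value reported by `detectCores`.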

Details

Behaviour is special if the plate_type is one of those that have to be explicitly set using gen_iii and there is just one point measurement. Because this situation is usual for plates measured either in Generation-III (identification) mode or on a MicroStation(TM), the point estimate is simply regarded as the ‘A’ parameter (maximum height) and all other parameters are set to NA.

The OPMS method just applies the OPM method to each contained plate in turn; there are no inter-dependencies. The same holds for the MOPMX method.

Note that some spline-fitting methods would crash with constant input data (horizontal lines instead of curves). As it is not entirely clear that those input data always represent artefacts, spline-fitting is skipped in such cases and replaced by reading the maximum height and the area under the curve directly from the data but setting the slope and the lag phase to NA, with a warning.
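
The fallback for constant input can be illustrated with a small base-R sketch. All names here (`is_constant`, `direct_parameters`, `aggregate_curve`, and the labels `slope` and `lag`) are hypothetical and chosen for illustration; the actual opm code and its parameter names may differ.

```r
# Hypothetical helpers, not opm code.

# TRUE if all measurements of a well are identical (a horizontal line).
is_constant <- function(y) length(unique(y)) == 1L

# Read A and AUC directly from the data; slope and lag phase stay NA.
direct_parameters <- function(time, y) {
  auc <- sum(diff(time) * (y[-length(y)] + y[-1L]) / 2)  # trapezoidal rule
  c(slope = NA_real_, lag = NA_real_, A = max(y), AUC = auc)
}

# Skip the (user-supplied) fitting function for constant curves.
aggregate_curve <- function(time, y, fit) {
  if (is_constant(y)) {
    warning("constant measurements: fitting skipped, slope and lag set to NA")
    return(direct_parameters(time, y))
  }
  fit(time, y)
}

time <- 0:4
flat <- rep(50, 5)
# The dummy 'fit' would raise an error if it were called; it is not.
res <- suppressWarnings(
  aggregate_curve(time, flat, fit = function(time, y) stop("not reached"))
)
res
```

The warning is suppressed only to keep the example output short; in normal use it would be shown.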

Examples with plain = TRUE are not given, as only the return value is different: Let x be the normal result of do_aggr(). The matrix returned if plain is TRUE could then be obtained using aggregated(x), whereas the ‘method’ and the ‘settings’ attributes could be obtained as components of the list returned by aggr_settings(x).

The matrix method quickly estimates the curve parameters AUC (area under the curve) or A (maximum height). It is normally not called directly by an opm user but via the other do_aggr methods.
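
As a minimal sketch of the two estimates, the following base-R code computes the trapezoid-based AUC and the maximum height for a toy matrix laid out like the default input (time points in the first column). `trapezoid_auc` is a hypothetical helper written for this illustration, not the function used inside opm, and the data are made up.

```r
# Hypothetical helper: area under the curve as the sum of the trapezoids
# defined by each pair of adjacent time points.
trapezoid_auc <- function(time, y) {
  stopifnot(length(time) == length(y), length(time) >= 2L)
  sum(diff(time) * (y[-length(y)] + y[-1L]) / 2)
}

# Toy data: time in the first column, one column per well.
m <- cbind(
  Hour = c(0, 1, 2, 3),
  A01 = c(0, 10, 30, 40),
  A02 = c(5, 5, 5, 5)
)

wells <- m[, -1L, drop = FALSE]
auc <- apply(wells, 2L, trapezoid_auc, time = m[, 1L])  # area under the curve
a <- apply(wells, 2L, max)                              # maximum height
rbind(AUC = auc, A = a)
```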

The presence of aggregated values can be checked using has_aggr, and the values can be retrieved using aggregated.

Value

If plain is FALSE, an OPMA object. Otherwise a numeric matrix of the same structure as the one returned by aggregated but with an additional ‘settings’ attribute containing the (potentially modified) list provided via the options argument, and a ‘method’ attribute corresponding to the method argument.

The matrix method returns a numeric matrix with three rows (point estimate, lower and upper CI) and as many columns as data columns (or rows) in object. If raw is TRUE, it returns an object of the class ‘boot’.
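
The shape of such a result can be illustrated with a simplified percentile bootstrap in base R. This sketch rests on several assumptions: it resamples time points (rows), uses the median of the replicates as point estimate, and computes percentile intervals with `quantile` instead of calling the boot package. `boot_max` and the row labels are hypothetical, so the output only mimics the three-row layout and not the exact values or names opm would produce.

```r
# Hypothetical helper, not opm code: bootstrap the maximum height (A) of
# each well and return point estimate, lower and upper CI as three rows.
boot_max <- function(m, n_boot = 100L, ci = 0.95) {
  values <- m[, -1L, drop = FALSE]  # drop the time column
  one_replicate <- function(i) {
    idx <- sample.int(nrow(values), replace = TRUE)  # resample time points
    apply(values[idx, , drop = FALSE], 2L, max)
  }
  # One row per well, one column per bootstrap replicate.
  reps <- matrix(
    vapply(seq_len(n_boot), one_replicate, numeric(ncol(values))),
    nrow = ncol(values)
  )
  alpha <- (1 - ci) / 2
  result <- rbind(
    point = apply(reps, 1L, median),
    lower = apply(reps, 1L, quantile, probs = alpha, names = FALSE),
    upper = apply(reps, 1L, quantile, probs = 1 - alpha, names = FALSE)
  )
  colnames(result) <- colnames(values)
  result
}

set.seed(42)  # for reproducibility of the resampling
m <- cbind(
  Hour = 0:5,
  A01 = c(0, 5, 20, 60, 90, 100),
  A02 = c(0, 2, 4, 8, 16, 20)
)
res <- boot_max(m)
dim(res)  # three rows, one column per well
```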

References

Brisbin, I. L., Collins, C. T., White, G. C., McCallum, D. A. 1986 A new paradigm for the analysis and interpretation of growth data: the shape of things to come. The Auk 104, 552–553.

Efron, B. 1979 Bootstrap methods: another look at the jackknife. Annals of Statistics 7, 1–26.

Kahm, M., Hasenbrink, G., Lichtenberg-Frate, H., Ludwig, J., Kschischo, M. 2010 grofit: Fitting biological growth curves with R. Journal of Statistical Software 33, 1–21.

Vaas, L. A. I., Sikorski, J., Michael, V., Goeker, M., Klenk, H.-P. 2012 Visualization and curve parameter estimation strategies for efficient exploration of Phenotype Microarray kinetics. PLoS ONE 7, e34846.

Examples

# OPM method

# Run a fast estimate of A and AUC without bootstrapping
copy <- do_aggr(vaas_1, method = "opm-fast", boot = 0,
  options = list(as.pe = "pe"))
aggr_settings(vaas_1)

## $method
## [1] "grofit"
## 
## $options
## $options$neg.nan.act
## [1] FALSE
## 
## $options$clean.bootstrap
## [1] TRUE
## 
## $options$suppress.messages
## [1] TRUE
## 
## $options$fit.opt
## [1] "s"
## 
## $options$log.x.gc
## [1] FALSE
## 
## $options$log.y.gc
## [1] FALSE
## 
## $options$interactive
## [1] FALSE
## 
## $options$nboot.gc
## [1] 100
## 
## $options$smooth.gc
## NULL
## 
## $options$smooth.dr
## NULL
## 
## $options$have.atleast
## [1] 6
## 
## $options$parameter
## [1] 9
## 
## $options$log.x.dr
## [1] FALSE
## 
## $options$log.y.dr
## [1] FALSE
## 
## $options$nboot.dr
## [1] 0
## 
## $options$model.type
## [1] "logistic"     "richards"     "gompertz"     "gompertz.exp"
## 
## 
## $software
## [1] "opm"
## 
## $version
## [1] "0.1-0"

aggr_settings(copy)

## $method
## [1] "opm-fast"
## 
## $options
## $options$as.pe
## [1] "pe"
## 
## $options$boot
## [1] 0
## 
## $options$preceding_transformation
## [1] "none"
## 
## 
## $software
## [1] "opm"
## 
## $version
## [1] "1.3.63"

stopifnot(has_aggr(vaas_1), has_aggr(copy))

# Compare the results to the ones precomputed with grofit
# (1) A
a.grofit <- aggregated(vaas_1, "A", ci = FALSE)
a.fast <- aggregated(copy, "A", ci = FALSE)
plot(a.grofit, a.fast)

stopifnot(cor.test(a.fast, a.grofit)$estimate > 0.999)
# (2) AUC
auc.grofit <- aggregated(vaas_1, "AUC", ci = FALSE)
auc.fast <- aggregated(copy, "AUC", ci = FALSE)
plot(auc.grofit, auc.fast)