do_aggr {opm}    R Documentation

Aggregate kinetics using curve-parameter estimation

Description

Aggregate the kinetic data using curve-parameter estimation, i.e. infer parameters from the kinetic data stored in an OPM object using either the grofit package or one of the built-in methods. Optionally include the aggregated values in a new OPMA object together with the previously collected information.

Usage

## S4 method for signature 'MOPMX'
do_aggr(object, ...)

## S4 method for signature 'OPM'
do_aggr(object, boot = 0L, verbose = FALSE, cores = 1L,
    options = if (identical(method, "splines"))
      set_spline_options()
    else
      list(),
    method = "splines", plain = FALSE, logt0 = FALSE)

## S4 method for signature 'OPMS'
do_aggr(object, ...)

## S4 method for signature 'matrix'
do_aggr(object, what = c("AUC", "A"),
    boot = 100L, ci = 0.95, as.pe = "median", ci.type = "norm",
    time.pos = 1L, transposed = FALSE, raw = FALSE, ...)

Arguments

object

OPM, OPMS or MOPMX object, or a matrix as output by measurements, i.e. with the time points in the first column and the measurements in the remaining columns (there must be at least two). For deviations from this scheme see time.pos and transposed.

boot

Integer scalar. Number of bootstrap replicates used to estimate 95-percent confidence intervals (CIs) for the parameters. Set this to zero to omit bootstrapping, resulting in NA entries for the CIs. Note that under the default settings of the matrix method for as.pe, bootstrapping is also necessary to obtain the point estimate.

verbose

Logical scalar. Print progress messages?

cores

Integer scalar. Number of cores to use. Setting this to a value larger than 1 requires that mclapply from the parallel package can be run with more than 1 core, which is impossible under Windows. The cores argument has no effect if opm-fast is chosen (see below). If cores is zero or negative, the overall number of cores on the system as determined by detectCores from the parallel package is used after addition of the original cores argument. For instance, if the system has eight cores, -1 means using seven cores.

options

List. For its use in grofit mode, see grofit.control in that package. The boot and verbose settings, as the most important ones, are added separately (see above). The verbose mode is not very useful in parallel processing. With method "splines", options can be specified using the function set_spline_options.

method

Character scalar. The aggregation method to use. Currently only the following methods are supported:

splines

Fit various splines (smoothing splines and P-splines from mgcv and smoothing splines via smooth.spline) to PM data. Recommended.

grofit

The grofit function in the eponymous package, with spline fitting as default.

opm-fast

The native, faster parameter estimation implemented in the matrix method. This will only yield two of the four parameters, the area under the curve and the maximum height. The area under the curve is estimated as the sum of the areas given by the trapezoids defined by each pair of adjacent time points. The maximum height is just the result of max. By default the median of the bootstrap values is used as point estimate. For details see as.pe.

plain

Logical scalar. If TRUE, only the aggregated values are returned (as a matrix, for details see below). Otherwise they are integrated in an OPMA object together with object.

logt0

Logical scalar passed to measurements.

what

Character scalar. Which parameter to estimate. Currently only two are supported: AUC (area under the curve) and A (maximum height).

ci

Confidence interval to use in the output. Ignored if boot is not positive.

as.pe

Character scalar determining what to output as the point estimate. Either median, mean or pe; the first two calculate the point estimate from the bootstrapping replicates, whereas the third one uses the point estimate from the raw data. If boot is 0, as.pe is reset to pe, if necessary, and a warning is issued.

ci.type

Character scalar determining the way the confidence intervals are calculated. Either norm, basic or perc; see boot.ci from the boot package for details.

time.pos

Character or integer scalar indicating the position of the column (or row, see next argument) with the time points.

transposed

Logical scalar indicating whether the matrix is transposed compared to the default.

raw

Logical scalar. Return the raw bootstrapping result without CI estimation and construction of the usually resulting matrix?

...

Optional arguments passed between the methods or to boot from the eponymous package.
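
The rule given above for the cores argument (zero or negative values are added to the number of detected cores) can be illustrated with a small sketch. The helper resolve_cores is hypothetical and not part of opm; the lower bound of one core is an assumption made here for safety.

```r
# Hypothetical helper, not part of opm: resolve a 'cores' request.
# 'total' defaults to the number of cores detected on the system.
resolve_cores <- function(cores, total = parallel::detectCores()) {
  cores <- as.integer(cores)
  if (cores <= 0L) {
    # zero or negative: add the request to the overall number of cores
    cores <- as.integer(total) + cores
  }
  # assumption of this sketch: never return fewer than one core
  max(cores, 1L)
}

resolve_cores(-1L, total = 8L)  # 7 on a hypothetical eight-core system
```

Passing total explicitly, as in the call above, keeps the example independent of the machine it is run on.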

Details

Behaviour is special if the plate_type is one of those that have to be explicitly set using gen_iii and there is just one point measurement. Because this situation is usual for plates measured either in Generation-III (identification) mode or on a MicroStation(TM), the point measurement is simply regarded as the ‘A’ parameter (maximum height) and all other parameters are set to NA.
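
The single-measurement case can be pictured with the following sketch. It is not the opm implementation: the helper single_point_params and the row names slope and lag are assumptions chosen for illustration, and the input is assumed to be a one-row matrix with the time point in the first column.

```r
# Hypothetical helper, not part of opm. 'm' has a single row: the time
# point in column 1 and one measurement per well in the other columns.
single_point_params <- function(m) {
  stopifnot(is.matrix(m), nrow(m) == 1L, ncol(m) >= 2L)
  params <- c("slope", "lag", "A", "AUC")  # row names are assumptions
  out <- matrix(NA_real_, nrow = length(params), ncol = ncol(m) - 1L,
    dimnames = list(params, colnames(m)[-1L]))
  # the only available value is taken as the maximum height
  out["A", ] <- m[1L, -1L]
  out
}

m <- matrix(c(0, 120, 45), nrow = 1L,
  dimnames = list(NULL, c("Hour", "A01", "A02")))
single_point_params(m)
```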

The OPMS method just applies the OPM method to each contained plate in turn; there are no inter-dependencies. The same holds for the MOPMX method.

Note that some spline-fitting methods would crash with constant input data (horizontal lines instead of curves). As it is not entirely clear that those input data always represent artefacts, spline-fitting is skipped in such cases and replaced by reading the maximum height and the area under the curve directly from the data but setting the slope and the lag phase to NA, with a warning.
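
A minimal sketch of such a guard is shown here. It only illustrates the idea of the fallback; the function names, the parameter names slope and lag, and the fit argument (any curve-fitting function supplied by the caller) are hypothetical, and the data are assumed to contain no missing values.

```r
# Hypothetical helpers, not part of opm.
is_constant <- function(y) all(y == y[1L])

fit_or_fallback <- function(time, y, fit) {
  if (is_constant(y)) {
    warning("constant data: curve fitting skipped")
    # read A and AUC directly from the data; for a horizontal line the
    # area under the curve is height times duration
    return(c(slope = NA_real_, lag = NA_real_, A = max(y),
      AUC = max(y) * (max(time) - min(time))))
  }
  fit(time, y)  # regular path: delegate to the supplied fitting function
}

# the fitting function is never reached for constant input
res <- suppressWarnings(
  fit_or_fallback(0:4, rep(5, 5), function(time, y) stop("not reached")))
```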

Examples with plain = TRUE are not given, as only the return value is different: Let x be the normal result of do_aggr(). The matrix returned if plain is TRUE could then be retrieved using aggregated(x), whereas the ‘method’ and the ‘settings’ attributes could be obtained as components of the list returned by aggr_settings(x).

The matrix method quickly estimates the curve parameters AUC (area under the curve) or A (maximum height). An opm user would not normally call it directly but indirectly via the other do_aggr methods.
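
The two estimates, as described for the opm-fast method (trapezoids between adjacent time points, and max for the height), can be sketched in base R. The helpers trapezoid_auc and fast_params are hypothetical names used for illustration and are not the functions opm uses internally; the input follows the default layout with the time points in the first column.

```r
# Hypothetical helpers, not part of opm.
# Area under the curve: sum of the trapezoids between adjacent time points.
trapezoid_auc <- function(time, y) {
  stopifnot(length(time) == length(y), length(time) >= 2L)
  sum(diff(time) * (head(y, -1L) + tail(y, -1L)) / 2)
}

# Estimate A and AUC for every well of a measurements-like matrix.
fast_params <- function(m) {
  time <- m[, 1L]
  wells <- m[, -1L, drop = FALSE]
  rbind(
    A = apply(wells, 2L, max),
    AUC = apply(wells, 2L, trapezoid_auc, time = time)
  )
}

m <- cbind(Hour = c(0, 1, 2), A01 = c(0, 2, 4), A02 = c(1, 1, 1))
p <- fast_params(m)
p
##     A01 A02
## A     4   1
## AUC   4   2
```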

The presence of aggregated values can be checked using has_aggr, and the values themselves can be retrieved using aggregated.

Value

If plain is FALSE, an OPMA object. Otherwise a numeric matrix of the same structure as the one returned by aggregated but with an additional ‘settings’ attribute containing the (potentially modified) list provided via the options argument, and a ‘method’ attribute corresponding to the method argument.

The matrix method returns a numeric matrix with three rows (point estimate, lower and upper CI) and as many columns as data columns (or rows) in object. If raw is TRUE, it returns an object of the class ‘boot’.
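
How a three-row result of this kind can arise is sketched below in base R. This is not the opm code, which relies on the boot package: the helper boot_summary, the row names, and the resampling of individual measurements are assumptions of the sketch. The interval is a normal approximation with bias correction, in the spirit of the norm setting of ci.type.

```r
# Hypothetical helper, not part of opm. 'y' holds the measurements of one
# well (at least two values); 'stat' is the parameter estimator.
boot_summary <- function(y, stat = max, boot = 100L, ci = 0.95,
    as.pe = c("median", "mean", "pe")) {
  as.pe <- match.arg(as.pe)
  pe <- stat(y)  # point estimate from the raw data
  if (boot <= 0L)  # no bootstrapping: fall back to 'pe', CIs are NA
    return(c(point.est = pe, ci.low = NA_real_, ci.high = NA_real_))
  # assumption: resample the measurements with replacement
  reps <- replicate(boot, stat(sample(y, replace = TRUE)))
  half <- qnorm((1 + ci) / 2) * sd(reps)
  centre <- pe - (mean(reps) - pe)  # bias-corrected centre
  c(point.est = switch(as.pe, median = median(reps),
      mean = mean(reps), pe = pe),
    ci.low = centre - half, ci.high = centre + half)
}

set.seed(1)
wells <- list(A01 = c(1, 3, 2, 5, 4), A02 = c(2, 2, 3, 6, 5))
res <- sapply(wells, boot_summary)  # three rows, one column per well
```

With sapply, the named vector returned for each well becomes one column, so the result has the rows point.est, ci.low and ci.high.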

References

Brisbin, I. L., Collins, C. T., White, G. C., McCallum, D. A. 1986 A new paradigm for the analysis and interpretation of growth data: the shape of things to come. The Auk 104, 552–553.

Efron, B. 1979 Bootstrap methods: another look at the jackknife. Annals of Statistics 7, 1–26.

Kahm, M., Hasenbrink, G., Lichtenberg-Frate, H., Ludwig, J., Kschischo, M. 2010 grofit: Fitting biological growth curves with R. Journal of Statistical Software 33, 1–21.

Vaas, L. A. I., Sikorski, J., Michael, V., Goeker, M., Klenk H.-P. 2012 Visualization and curve parameter estimation strategies for efficient exploration of Phenotype Microarray kinetics. PLoS ONE 7, e34846.

See Also

grofit::grofit parallel::detectCores

Other aggregation-functions: set_spline_options

Examples

# OPM method

# Run a fast estimate of A and AUC without bootstrapping
copy <- do_aggr(vaas_1, method = "opm-fast", boot = 0,
  options = list(as.pe = "pe"))
aggr_settings(vaas_1)
## $method
## [1] "grofit"
## 
## $options
## $options$neg.nan.act
## [1] FALSE
## 
## $options$clean.bootstrap
## [1] TRUE
## 
## $options$suppress.messages
## [1] TRUE
## 
## $options$fit.opt
## [1] "s"
## 
## $options$log.x.gc
## [1] FALSE
## 
## $options$log.y.gc
## [1] FALSE
## 
## $options$interactive
## [1] FALSE
## 
## $options$nboot.gc
## [1] 100
## 
## $options$smooth.gc
## NULL
## 
## $options$smooth.dr
## NULL
## 
## $options$have.atleast
## [1] 6
## 
## $options$parameter
## [1] 9
## 
## $options$log.x.dr
## [1] FALSE
## 
## $options$log.y.dr
## [1] FALSE
## 
## $options$nboot.dr
## [1] 0
## 
## $options$model.type
## [1] "logistic"     "richards"     "gompertz"     "gompertz.exp"
## 
## 
## $software
## [1] "opm"
## 
## $version
## [1] "0.1-0"
aggr_settings(copy)
## $method
## [1] "opm-fast"
## 
## $options
## $options$as.pe
## [1] "pe"
## 
## $options$boot
## [1] 0
## 
## $options$preceding_transformation
## [1] "none"
## 
## 
## $software
## [1] "opm"
## 
## $version
## [1] "1.3.63"
stopifnot(has_aggr(vaas_1), has_aggr(copy))

# Compare the results to the ones precomputed with grofit
# (1) A
a.grofit <- aggregated(vaas_1, "A", ci = FALSE)
a.fast <- aggregated(copy, "A", ci = FALSE)
plot(a.grofit, a.fast)

stopifnot(cor.test(a.fast, a.grofit)$estimate > 0.999)
# (2) AUC
auc.grofit <- aggregated(vaas_1, "AUC", ci = FALSE)
auc.fast <- aggregated(copy, "AUC", ci = FALSE)
plot(auc.grofit, auc.fast)

stopifnot(cor.test(auc.fast, auc.grofit)$estimate > 0.999)

## Not run: 
##D  # Without confidence interval (CI) estimation
##D   x <- do_aggr(vaas_1, boot = 0, verbose = TRUE)
##D   aggr_settings(x)
##D   aggregated(x)
##D 
##D   # Calculate CIs with 100 bootstrap (BS) replicates, using 4 cores
##D   # (do not try to use > 1 core on Windows)
##D   x <- do_aggr(vaas_1, boot = 100, verbose = TRUE, cores = 4)
##D   aggr_settings(x)
##D   aggregated(x)
## End(Not run)

# matrix method
(x <- do_aggr(measurements(vaas_1)))[, 1:3]
##                    A01      A02      A03
## AUC.point.est 8939.750 18439.62 21999.06
## AUC.ci.low    8648.864 17788.22 21243.89
## AUC.ci.high   9166.651 18946.98 22618.26
stopifnot(identical(dim(x), c(3L, 96L)))

[Package opm version 1.3.63 Index]