explode_dir {opm}    R Documentation

Helper functions for file input and output

Description

Batch-collect information from a series of input files or batch-convert data from input files to data in output files. Alternatively, turn a mixed file/directory list into a list of files, create a regular expression matching certain file extensions, convert a wildcard pattern to a regular expression, or split files. These functions are normally not called directly by an opm user but by the other IO functions of the package, such as collect_template or batch_opm. One can use their demo argument directly for testing the results of the applied file name patterns.

Usage

  explode_dir(names, include = NULL, exclude = NULL,
    ignore.case = TRUE, wildcard = TRUE, recursive = TRUE,
    missing.error = TRUE, remove.dups = TRUE)

  batch_collect(names, fun, fun.args = list(), proc = 1L,
    ..., use.names = TRUE, simplify = FALSE, demo = FALSE)

  batch_process(names, out.ext, io.fun, fun.args = list(),
    proc = 1L, outdir = NULL,
    overwrite = c("yes", "older", "no"), in.ext = "any",
    compressed = TRUE,
    literally = inherits(in.ext, "AsIs"), ...,
    verbose = TRUE, demo = FALSE)

  file_pattern(type = c("both", "csv", "yaml", "json", "yorj", "lims", "nolims",
    "any", "empty"),
    compressed = TRUE, literally = inherits(type, "AsIs"))

  split_files(files, pattern, outdir = "", demo = FALSE,
    single = TRUE, wildcard = FALSE, invert = FALSE,
    include = TRUE, format = opm_opt("file.split.tmpl"),
    compressed = TRUE, ...)

  glob_to_regex(object)

  ## S3 method for class 'character'
  glob_to_regex(object)

  ## S3 method for class 'factor'
  glob_to_regex(object)

Arguments

names

Character vector containing file names or directories, or convertible to such.

object

Character vector or factor.

include

If a character scalar, used as regular expression or wildcard (see the wildcard argument) for selecting from the input files. If NULL, ignored. If a list, used as arguments of file_pattern and its result used as regular expression. Note that selection is done after expanding the directory names to file names.

For split_files a logical scalar. TRUE means to also include the separator lines in the output files.

exclude

Like include, but for excluding matching input files. Note that exclusion is done after applying include.

ignore.case

Logical scalar. Ignore differences between uppercase and lowercase when using include and exclude? Has no effect for NULL values for include or exclude, respectively.

wildcard

Logical scalar. Are include, exclude or pattern wildcards (as used by UNIX shells) that first need to be converted to regular expressions? Has no effect if lists are used for include or exclude, respectively. See below for details on such wildcards (a.k.a. globbing patterns).

recursive

Logical scalar. Traverse directories recursively and also consider all subdirectories? See list.files from the base package for details.

missing.error

Logical scalar. If a file/directory does not exist, raise an error or only a warning?

remove.dups

Logical scalar. Remove duplicates from names? Note that if requested this is done before expanding the names of directories, if any.

fun

Collecting function. Should use the file name as first argument.

fun.args

Optional list of arguments to fun or io.fun.

...

Optional further arguments passed from batch_process or batch_collect to explode_dir. For split_files, optional arguments passed to grepl, which is used for matching the separator lines. See also invert listed above.

proc

Integer scalar. The number of processes to spawn. Cannot be set to more than 1 core if running under Windows. See the cores argument of do_aggr for details.

simplify

Logical scalar. Should the resulting list be simplified to a vector or matrix if possible?

use.names

Logical scalar. Should names be used for naming the elements of the result?

out.ext

Character scalar. The extension of the output file names (without the dot).

outdir

Character vector. Directories in which to place the output files. If empty or only containing empty strings, the directory of each input file is used.

in.ext

Character scalar. Passed through file_pattern, then used for the replacement of old file extensions with new ones.

type

Character scalar indicating the file types to be matched by extension. For historical reasons, both means either CSV or YAML or JSON, whereas yorj means either YAML or JSON. CSV also includes the LIMS CSV format introduced in 2014, which can be specifically selected using lims or excluded using nolims. Alternatively, the extension or extensions can be given directly, or a list of file names (not NA).

compressed

Logical scalar. Shall compressed files also be matched? This affects the returned pattern as well as the pattern used for extracting file extensions from complete file names (if literally is TRUE).

split_files passes this argument to file_pattern, but here it only affects the way file names are split into extensions and base names. Should only be set to FALSE if input files are not compressed (and have the corresponding file extensions).

literally

Logical scalar. Interpret type literally? This also allows for vectors with more than a single element, as well as the extraction of file extensions from file names.

demo

Logical scalar. In the case of batch_process, if TRUE, do not convert files but print the attempted input file-output file conversions and invisibly return a matrix with input files in the first and output files in the second column. For the other functions, the effect is equivalent.

For split_files, do not create files, just return the usual list containing all potentially created files. Note that in contrast to the demo arguments of other IO functions, this requires the input files to be read.

files

Character vector or convertible to such. Names of the files to be split. In contrast to functions such as read_opm, names of directories are not supported (will not be expanded to lists of files).

pattern

Regular expression or shell globbing pattern for matching the separator lines if invert is FALSE (the default) or matching the non-separator lines if otherwise.

Conceptually each of the sections into which a file is split comprises a separator line followed by non-separator lines. That is, separator lines followed by another separator line are ignored. Non-separator lines not preceded by a separator line are treated as a section of their own, however.

single

Logical scalar. If there is only one group per file, i.e. only one output file would result from the splitting, create this file anyway? Such cases could be recognised by empty character vectors as values of the returned list (see below).

invert

Logical scalar. Invert pattern matching, i.e. treat all lines that do not match pattern as separators?

format

Character scalar determining the output file name format. It is passed to sprintf and expects three placeholders:

  • the base name of the file;

  • the index of the section;

  • the file extension.

Getting format wrong might result in non-unique file names and thus probably in overwritten files; accordingly, it should be used with care.

io.fun

Conversion function. Should accept infile and outfile as the first two arguments.

overwrite

Character scalar. If ‘yes’, conversion is always tried if infile exists and is not empty. If ‘no’, conversion is not tried if outfile exists and is not empty. If ‘older’, conversion is tried if outfile does not exist or is empty or is older than infile (with respect to the modification time).

verbose

Logical scalar. Print conversion and success/failure information?

Details

Other functions that call explode_dir have a demo argument which, if set to TRUE, causes the respective function to do no real work but to print the names of the files that it would process in normal running mode.

glob_to_regex changes a shell globbing wildcard into a regular expression. This is just a slightly extended version of glob2rx from the utils package, but more conversion steps might need to be added here in the future. Particularly explode_dir and the IO functions calling that function internally use glob_to_regex. Some hints when using globbing patterns are given in the following.

The globbing search patterns used here contain only two special characters, ‘?’ and ‘*’, and are thus easier to master than regular expressions. ‘?’ matches a single arbitrary character, whereas ‘*’ matches zero or more arbitrary characters. Some examples:

a?c

Matches abc, axc, a c etc. but not abbc, abbbc, ac etc.

a*c

Matches abc, abbc, ac etc. but not abd etc.

ab*

Matches abc, abcdefg, abXYZ etc. but not acdefg etc.

?bc

Matches abc, Xbc, ‘ bc’ (with a leading space) etc. but not aabc, abbc, bc etc.

Despite their simplicity, globbing patterns are often sufficient for selecting file names.
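
The globbing examples listed above can be checked with a short sketch. It relies on glob2rx from the utils package rather than on glob_to_regex, which is an assumption made so that the code runs without opm; matches is a hypothetical helper introduced only for this illustration.

```r
# Sketch only: utils::glob2rx() stands in for opm's glob_to_regex(), which is
# described as an extended version and may treat other characters differently.
matches <- function(glob, x) grepl(utils::glob2rx(glob), x)

m1 <- matches("a?c", c("abc", "axc", "a c", "abbc", "ac"))
m2 <- matches("a*c", c("abc", "abbc", "ac", "abd"))
m3 <- matches("ab*", c("abc", "abcdefg", "abXYZ", "acdefg"))
m4 <- matches("?bc", c("abc", "Xbc", " bc", "aabc", "bc"))
print(list(m1, m2, m3, m4))
```

In each call the strings that should match come first, so each result starts with TRUE values and ends with FALSE values.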

split_files subdivides each file into sections which are written individually to newly generated files. Sections are determined with patterns that match the start of a section. This function might be useful for splitting OmniLog(R) multiple-plate CSV files before inputting them with read_opm, even though that function could also input such files directly. It is used in one of the running modes of batch_opm for splitting files. See also the ‘Examples’. The newly generated files are numbered accordingly; they are not named after any csv_data entry because there is no guarantee that it is present.
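
The sectioning rule and the numbered output names can be illustrated with a minimal base-R sketch. It is not the code used by split_files: split_sections is a hypothetical helper, and the template "%s-%05i.%s" is merely an assumed example of a three-placeholder format, not necessarily the value of opm_opt("file.split.tmpl").

```r
# Hypothetical helper: split lines into sections that each start at a
# separator line. Separator lines directly followed by another separator
# line are dropped; leading non-separator lines form their own section.
split_sections <- function(lines, pattern) {
  is_sep <- grepl(pattern, lines)
  followed_by_sep <- is_sep & c(is_sep[-1L], FALSE)
  lines <- lines[!followed_by_sep]
  is_sep <- is_sep[!followed_by_sep]
  unname(split(lines, cumsum(is_sep)))
}

lines <- c("preamble", "Data File,A", "Data File,B", "1,2", "Data File,C", "3,4")
sections <- split_sections(lines, "^Data File")

# Assumed template with three placeholders: base name, section index, extension
out_names <- sprintf("%s-%05i.%s", "plates", seq_along(sections), "csv")
print(sections)
print(out_names)
```

A template lacking the index placeholder would yield the same name for every section, which is why a wrongly chosen format can lead to overwritten files.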

Value

explode_dir returns a character vector (which would be empty if all existing files, if any, had been unselected).
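
The order of operations documented for explode_dir (expand directories, then apply include, then apply exclude) can be mimicked in base R. The following is only a sketch under that assumption; the directory and file names are made up for the illustration, and glob2rx from the utils package stands in for glob_to_regex.

```r
# Build a small directory tree below the session's temporary directory
td <- file.path(tempdir(), "explode_sketch")
dir.create(file.path(td, "sub"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(td, c("a.csv", "b.html", "sub/c.csv", "sub/d.txt"))))

# 1. expand the directory to file names
files <- list.files(td, recursive = TRUE, full.names = TRUE)
# 2. keep files matching the include wildcard
selected <- files[grepl(utils::glob2rx("*.csv"), files, ignore.case = TRUE)]
# 3. drop files matching the exclude wildcard
selected <- selected[!grepl(utils::glob2rx("*/sub/*"), selected, ignore.case = TRUE)]

print(basename(selected))
unlink(td, recursive = TRUE) # tidy up
```

If the exclude pattern matched every remaining file, the result would be an empty character vector, as noted above.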

batch_collect returns a list, potentially simplified to a vector, depending on the output of fun and the value of simplify. See also demo.
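
How the shape of the result depends on simplify and use.names can be seen in a rough base-R analogue. collect is a hypothetical function written for this illustration; it does not expand directories or use several processes, both of which batch_collect supports.

```r
# Hypothetical analogue: call 'fun' on each file, passing the file name first
collect <- function(files, fun, fun.args = list(), use.names = TRUE,
    simplify = FALSE) {
  sapply(files, function(f) do.call(fun, c(list(f), fun.args)),
    simplify = simplify, USE.NAMES = use.names)
}

files <- c(tempfile(), tempfile())
writeLines(c("first", "second"), files[1L])
writeLines(c("uno", "dos"), files[2L])

x <- collect(files, readLines, list(n = 1L))                  # named list
y <- collect(files, readLines, list(n = 1L), simplify = TRUE) # named vector
unlink(files) # tidy up
```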

In normal mode, batch_process creates an invisibly returned character matrix in which each row corresponds to a named character vector with the keys infile, outfile, before and after. The latter two describe the result of the action(s) before and after attempting to convert infile to outfile. The after entry is the empty string if no conversion was tried (see overwrite), ok if conversion was successful and a message describing the problems otherwise. For the results of the demo mode see above.
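
Whether a conversion is tried at all depends on overwrite. The decision rule described for that argument can be written down as a small sketch; should_convert is a hypothetical helper that follows the documented wording, not the internal code of batch_process.

```r
# Hypothetical helper implementing the documented 'overwrite' rule
should_convert <- function(infile, outfile,
    overwrite = c("yes", "older", "no")) {
  overwrite <- match.arg(overwrite)
  nonempty <- function(f) file.exists(f) && file.size(f) > 0
  if (!nonempty(infile)) # nothing to convert
    return(FALSE)
  switch(overwrite,
    yes = TRUE,
    no = !nonempty(outfile),
    older = !nonempty(outfile) || file.mtime(outfile) < file.mtime(infile)
  )
}

infile <- tempfile()
outfile <- tempfile()
writeLines("input", infile)
res_missing <- should_convert(infile, outfile, "no") # outfile does not exist

writeLines("output", outfile)
# make outfile one hour older than infile so that the outcome is deterministic
Sys.setFileTime(outfile, file.mtime(infile) - 3600)
res_older <- should_convert(infile, outfile, "older")
res_no <- should_convert(infile, outfile, "no")
unlink(c(infile, outfile)) # tidy up
```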

file_pattern yields a character scalar, holding a regular expression. glob_to_regex yields a vector of regular expressions.
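
The structure of such a pattern can be reproduced by hand. The sketch below uses ext_pattern, a hypothetical helper; the list of compression suffixes is copied from the output shown in the ‘Examples’, and the extensions are assumed to contain no regular-expression metacharacters.

```r
# Hypothetical helper: build a regular expression matching file extensions,
# optionally followed by a compression suffix
ext_pattern <- function(exts, compressed = TRUE) {
  core <- sprintf("\\.(%s)", paste(exts, collapse = "|"))
  if (compressed)
    paste0(core, "(\\.(bz2|gz|lzma|xz))?$")
  else
    paste0(core, "$")
}

p_plain <- ext_pattern(c("csv", "exl"), compressed = FALSE)
p_comp <- ext_pattern(c("csv", "exl"))
hits <- grepl(p_comp, c("plate.csv", "plate.csv.xz", "plate.txt"),
  ignore.case = TRUE)
print(p_plain)
print(hits)
```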

split_files creates a list of character vectors, each vector containing the names of the newly generated files. The names of the list are the input file names. The list is returned invisibly.

See Also

base::list.files, base::Sys.glob, utils::glob2rx, base::regex, base::split, base::strsplit, base::file.rename

Other io-functions: batch_opm, collect_template, read_opm, read_single_opm, to_metadata

Examples

# explode_dir()
# Example with temporary directory
td <- tempdir()
tf <- tempfile()
(x <- explode_dir(td))
##   [1] "/tmp/RtmpFh9b6n/file2fca104a0301" "/tmp/RtmpFh9b6n/file2fca104cd350"
##   [3] "/tmp/RtmpFh9b6n/file2fca1063ef43" "/tmp/RtmpFh9b6n/file2fca115b149e"
##   [5] "/tmp/RtmpFh9b6n/file2fca14940993" "/tmp/RtmpFh9b6n/file2fca1749559d"
##   [7] "/tmp/RtmpFh9b6n/file2fca186403e8" "/tmp/RtmpFh9b6n/file2fca197c8e3a"
##   [9] "/tmp/RtmpFh9b6n/file2fca1c5bfb7c" "/tmp/RtmpFh9b6n/file2fca1d66e4d9"
##  [11] "/tmp/RtmpFh9b6n/file2fca1ec4a0e1" "/tmp/RtmpFh9b6n/file2fca1edadfad"
##  [13] "/tmp/RtmpFh9b6n/file2fca1ede8e60" "/tmp/RtmpFh9b6n/file2fca1f09a18d"
##  [15] "/tmp/RtmpFh9b6n/file2fca1f23fa83" "/tmp/RtmpFh9b6n/file2fca1f8c9582"
##  [17] "/tmp/RtmpFh9b6n/file2fca20b6c94a" "/tmp/RtmpFh9b6n/file2fca214a0a4b"
##  [19] "/tmp/RtmpFh9b6n/file2fca21fe224"  "/tmp/RtmpFh9b6n/file2fca22003349"
##  [21] "/tmp/RtmpFh9b6n/file2fca22b7e5c0" "/tmp/RtmpFh9b6n/file2fca240cffba"
##  [23] "/tmp/RtmpFh9b6n/file2fca241952e4" "/tmp/RtmpFh9b6n/file2fca243b8441"
##  [25] "/tmp/RtmpFh9b6n/file2fca2483f120" "/tmp/RtmpFh9b6n/file2fca24a641ee"
##  [27] "/tmp/RtmpFh9b6n/file2fca250238a6" "/tmp/RtmpFh9b6n/file2fca256c5626"
##  [29] "/tmp/RtmpFh9b6n/file2fca26ec2332" "/tmp/RtmpFh9b6n/file2fca2a0b4acc"
##  [31] "/tmp/RtmpFh9b6n/file2fca2a60e4e"  "/tmp/RtmpFh9b6n/file2fca2ba58bf3"
##  [33] "/tmp/RtmpFh9b6n/file2fca2cfd1eb1" "/tmp/RtmpFh9b6n/file2fca2e97095b"
##  [35] "/tmp/RtmpFh9b6n/file2fca303ab12e" "/tmp/RtmpFh9b6n/file2fca312cc31b"
##  [37] "/tmp/RtmpFh9b6n/file2fca313727c1" "/tmp/RtmpFh9b6n/file2fca31420e10"
##  [39] "/tmp/RtmpFh9b6n/file2fca31efa6f5" "/tmp/RtmpFh9b6n/file2fca336628f8"
##  [41] "/tmp/RtmpFh9b6n/file2fca3488b5f"  "/tmp/RtmpFh9b6n/file2fca3665757e"
##  [43] "/tmp/RtmpFh9b6n/file2fca3750e569" "/tmp/RtmpFh9b6n/file2fca388a8962"
##  [45] "/tmp/RtmpFh9b6n/file2fca38fb2e48" "/tmp/RtmpFh9b6n/file2fca3944d426"
##  [47] "/tmp/RtmpFh9b6n/file2fca3a81ea5f" "/tmp/RtmpFh9b6n/file2fca3b47b66e"
##  [49] "/tmp/RtmpFh9b6n/file2fca3b51bf18" "/tmp/RtmpFh9b6n/file2fca3b89a60b"
##  [51] "/tmp/RtmpFh9b6n/file2fca3dff0f36" "/tmp/RtmpFh9b6n/file2fca3e25dd2" 
##  [53] "/tmp/RtmpFh9b6n/file2fca3fc8c05d" "/tmp/RtmpFh9b6n/file2fca43725836"
##  [55] "/tmp/RtmpFh9b6n/file2fca43a276e9" "/tmp/RtmpFh9b6n/file2fca43cf2840"
##  [57] "/tmp/RtmpFh9b6n/file2fca46b5fecb" "/tmp/RtmpFh9b6n/file2fca476986c3"
##  [59] "/tmp/RtmpFh9b6n/file2fca4771e531" "/tmp/RtmpFh9b6n/file2fca479aeed6"
##  [61] "/tmp/RtmpFh9b6n/file2fca4b6c3530" "/tmp/RtmpFh9b6n/file2fca4bb546ae"
##  [63] "/tmp/RtmpFh9b6n/file2fca4bdbad78" "/tmp/RtmpFh9b6n/file2fca4c405ff" 
##  [65] "/tmp/RtmpFh9b6n/file2fca4ca8f0b8" "/tmp/RtmpFh9b6n/file2fca4e379ef8"
##  [67] "/tmp/RtmpFh9b6n/file2fca4f17a58d" "/tmp/RtmpFh9b6n/file2fca4f23425f"
##  [69] "/tmp/RtmpFh9b6n/file2fca503f7394" "/tmp/RtmpFh9b6n/file2fca50db1ebe"
##  [71] "/tmp/RtmpFh9b6n/file2fca519f287e" "/tmp/RtmpFh9b6n/file2fca523c2e97"
##  [73] "/tmp/RtmpFh9b6n/file2fca524e17b9" "/tmp/RtmpFh9b6n/file2fca5338e7f9"
##  [75] "/tmp/RtmpFh9b6n/file2fca53765438" "/tmp/RtmpFh9b6n/file2fca545e2225"
##  [77] "/tmp/RtmpFh9b6n/file2fca5511dcd8" "/tmp/RtmpFh9b6n/file2fca55e92889"
##  [79] "/tmp/RtmpFh9b6n/file2fca57035836" "/tmp/RtmpFh9b6n/file2fca57bfecb4"
##  [81] "/tmp/RtmpFh9b6n/file2fca5adca7e"  "/tmp/RtmpFh9b6n/file2fca5b3d0bd7"
##  [83] "/tmp/RtmpFh9b6n/file2fca5b448779" "/tmp/RtmpFh9b6n/file2fca5c05d09b"
##  [85] "/tmp/RtmpFh9b6n/file2fca5c2ef869" "/tmp/RtmpFh9b6n/file2fca5c9b153a"
##  [87] "/tmp/RtmpFh9b6n/file2fca5cb6dcf3" "/tmp/RtmpFh9b6n/file2fca5e439a59"
##  [89] "/tmp/RtmpFh9b6n/file2fca611fc18c" "/tmp/RtmpFh9b6n/file2fca627f1ecc"
##  [91] "/tmp/RtmpFh9b6n/file2fca65963d3f" "/tmp/RtmpFh9b6n/file2fca66e9299c"
##  [93] "/tmp/RtmpFh9b6n/file2fca6712b379" "/tmp/RtmpFh9b6n/file2fca67b5f12f"
##  [95] "/tmp/RtmpFh9b6n/file2fca67bc7863" "/tmp/RtmpFh9b6n/file2fca68a9018" 
##  [97] "/tmp/RtmpFh9b6n/file2fca69c6ff64" "/tmp/RtmpFh9b6n/file2fca6b3ca195"
##  [99] "/tmp/RtmpFh9b6n/file2fca6b5efab3" "/tmp/RtmpFh9b6n/file2fca6bb8bcd1"
## [101] "/tmp/RtmpFh9b6n/file2fca6cac32df" "/tmp/RtmpFh9b6n/file2fca6dee9b1" 
## [103] "/tmp/RtmpFh9b6n/file2fca6f74bb3a" "/tmp/RtmpFh9b6n/file2fca7002cde9"
## [105] "/tmp/RtmpFh9b6n/file2fca704c0bc5" "/tmp/RtmpFh9b6n/file2fca7056320a"
## [107] "/tmp/RtmpFh9b6n/file2fca77a1afa8" "/tmp/RtmpFh9b6n/file2fca78130371"
## [109] "/tmp/RtmpFh9b6n/file2fca78e2e46f" "/tmp/RtmpFh9b6n/file2fca7a13d2b4"
## [111] "/tmp/RtmpFh9b6n/file2fca7a24c5ed" "/tmp/RtmpFh9b6n/file2fca7a9bb91" 
## [113] "/tmp/RtmpFh9b6n/file2fca7b6cc343" "/tmp/RtmpFh9b6n/file2fca7d44bac2"
## [115] "/tmp/RtmpFh9b6n/file2fca7f35c4dd" "/tmp/RtmpFh9b6n/file2fca9082e5c" 
## [117] "/tmp/RtmpFh9b6n/file2fcaa014277"  "/tmp/RtmpFh9b6n/file2fcaba1b326" 
## [119] "/tmp/RtmpFh9b6n/file2fcac5d2aaf"  "/tmp/RtmpFh9b6n/file2fcadf352f5" 
## [121] "/tmp/RtmpFh9b6n/file2fcaeff1760"  "/tmp/RtmpFh9b6n/file2fcafee0ee5"
write(letters, tf)
(y <- explode_dir(td))
##   [1] "/tmp/RtmpFh9b6n/file2fca104a0301" "/tmp/RtmpFh9b6n/file2fca104cd350"
##   [3] "/tmp/RtmpFh9b6n/file2fca1063ef43" "/tmp/RtmpFh9b6n/file2fca115b149e"
##   [5] "/tmp/RtmpFh9b6n/file2fca14940993" "/tmp/RtmpFh9b6n/file2fca1749559d"
##   [7] "/tmp/RtmpFh9b6n/file2fca186403e8" "/tmp/RtmpFh9b6n/file2fca197c8e3a"
##   [9] "/tmp/RtmpFh9b6n/file2fca1c5bfb7c" "/tmp/RtmpFh9b6n/file2fca1d66e4d9"
##  [11] "/tmp/RtmpFh9b6n/file2fca1e6cdef9" "/tmp/RtmpFh9b6n/file2fca1ec4a0e1"
##  [13] "/tmp/RtmpFh9b6n/file2fca1edadfad" "/tmp/RtmpFh9b6n/file2fca1ede8e60"
##  [15] "/tmp/RtmpFh9b6n/file2fca1f09a18d" "/tmp/RtmpFh9b6n/file2fca1f23fa83"
##  [17] "/tmp/RtmpFh9b6n/file2fca1f8c9582" "/tmp/RtmpFh9b6n/file2fca20b6c94a"
##  [19] "/tmp/RtmpFh9b6n/file2fca214a0a4b" "/tmp/RtmpFh9b6n/file2fca21fe224" 
##  [21] "/tmp/RtmpFh9b6n/file2fca22003349" "/tmp/RtmpFh9b6n/file2fca22b7e5c0"
##  [23] "/tmp/RtmpFh9b6n/file2fca240cffba" "/tmp/RtmpFh9b6n/file2fca241952e4"
##  [25] "/tmp/RtmpFh9b6n/file2fca243b8441" "/tmp/RtmpFh9b6n/file2fca2483f120"
##  [27] "/tmp/RtmpFh9b6n/file2fca24a641ee" "/tmp/RtmpFh9b6n/file2fca250238a6"
##  [29] "/tmp/RtmpFh9b6n/file2fca256c5626" "/tmp/RtmpFh9b6n/file2fca26ec2332"
##  [31] "/tmp/RtmpFh9b6n/file2fca2a0b4acc" "/tmp/RtmpFh9b6n/file2fca2a60e4e" 
##  [33] "/tmp/RtmpFh9b6n/file2fca2ba58bf3" "/tmp/RtmpFh9b6n/file2fca2cfd1eb1"
##  [35] "/tmp/RtmpFh9b6n/file2fca2e97095b" "/tmp/RtmpFh9b6n/file2fca303ab12e"
##  [37] "/tmp/RtmpFh9b6n/file2fca312cc31b" "/tmp/RtmpFh9b6n/file2fca313727c1"
##  [39] "/tmp/RtmpFh9b6n/file2fca31420e10" "/tmp/RtmpFh9b6n/file2fca31efa6f5"
##  [41] "/tmp/RtmpFh9b6n/file2fca336628f8" "/tmp/RtmpFh9b6n/file2fca3488b5f" 
##  [43] "/tmp/RtmpFh9b6n/file2fca3665757e" "/tmp/RtmpFh9b6n/file2fca3750e569"
##  [45] "/tmp/RtmpFh9b6n/file2fca388a8962" "/tmp/RtmpFh9b6n/file2fca38fb2e48"
##  [47] "/tmp/RtmpFh9b6n/file2fca3944d426" "/tmp/RtmpFh9b6n/file2fca3a81ea5f"
##  [49] "/tmp/RtmpFh9b6n/file2fca3b47b66e" "/tmp/RtmpFh9b6n/file2fca3b51bf18"
##  [51] "/tmp/RtmpFh9b6n/file2fca3b89a60b" "/tmp/RtmpFh9b6n/file2fca3dff0f36"
##  [53] "/tmp/RtmpFh9b6n/file2fca3e25dd2"  "/tmp/RtmpFh9b6n/file2fca3fc8c05d"
##  [55] "/tmp/RtmpFh9b6n/file2fca43725836" "/tmp/RtmpFh9b6n/file2fca43a276e9"
##  [57] "/tmp/RtmpFh9b6n/file2fca43cf2840" "/tmp/RtmpFh9b6n/file2fca46b5fecb"
##  [59] "/tmp/RtmpFh9b6n/file2fca476986c3" "/tmp/RtmpFh9b6n/file2fca4771e531"
##  [61] "/tmp/RtmpFh9b6n/file2fca479aeed6" "/tmp/RtmpFh9b6n/file2fca4b6c3530"
##  [63] "/tmp/RtmpFh9b6n/file2fca4bb546ae" "/tmp/RtmpFh9b6n/file2fca4bdbad78"
##  [65] "/tmp/RtmpFh9b6n/file2fca4c405ff"  "/tmp/RtmpFh9b6n/file2fca4ca8f0b8"
##  [67] "/tmp/RtmpFh9b6n/file2fca4e379ef8" "/tmp/RtmpFh9b6n/file2fca4f17a58d"
##  [69] "/tmp/RtmpFh9b6n/file2fca4f23425f" "/tmp/RtmpFh9b6n/file2fca503f7394"
##  [71] "/tmp/RtmpFh9b6n/file2fca50db1ebe" "/tmp/RtmpFh9b6n/file2fca519f287e"
##  [73] "/tmp/RtmpFh9b6n/file2fca523c2e97" "/tmp/RtmpFh9b6n/file2fca524e17b9"
##  [75] "/tmp/RtmpFh9b6n/file2fca5338e7f9" "/tmp/RtmpFh9b6n/file2fca53765438"
##  [77] "/tmp/RtmpFh9b6n/file2fca545e2225" "/tmp/RtmpFh9b6n/file2fca5511dcd8"
##  [79] "/tmp/RtmpFh9b6n/file2fca55e92889" "/tmp/RtmpFh9b6n/file2fca57035836"
##  [81] "/tmp/RtmpFh9b6n/file2fca57bfecb4" "/tmp/RtmpFh9b6n/file2fca5adca7e" 
##  [83] "/tmp/RtmpFh9b6n/file2fca5b3d0bd7" "/tmp/RtmpFh9b6n/file2fca5b448779"
##  [85] "/tmp/RtmpFh9b6n/file2fca5c05d09b" "/tmp/RtmpFh9b6n/file2fca5c2ef869"
##  [87] "/tmp/RtmpFh9b6n/file2fca5c9b153a" "/tmp/RtmpFh9b6n/file2fca5cb6dcf3"
##  [89] "/tmp/RtmpFh9b6n/file2fca5e439a59" "/tmp/RtmpFh9b6n/file2fca611fc18c"
##  [91] "/tmp/RtmpFh9b6n/file2fca627f1ecc" "/tmp/RtmpFh9b6n/file2fca65963d3f"
##  [93] "/tmp/RtmpFh9b6n/file2fca66e9299c" "/tmp/RtmpFh9b6n/file2fca6712b379"
##  [95] "/tmp/RtmpFh9b6n/file2fca67b5f12f" "/tmp/RtmpFh9b6n/file2fca67bc7863"
##  [97] "/tmp/RtmpFh9b6n/file2fca68a9018"  "/tmp/RtmpFh9b6n/file2fca69c6ff64"
##  [99] "/tmp/RtmpFh9b6n/file2fca6b3ca195" "/tmp/RtmpFh9b6n/file2fca6b5efab3"
## [101] "/tmp/RtmpFh9b6n/file2fca6bb8bcd1" "/tmp/RtmpFh9b6n/file2fca6cac32df"
## [103] "/tmp/RtmpFh9b6n/file2fca6dee9b1"  "/tmp/RtmpFh9b6n/file2fca6f74bb3a"
## [105] "/tmp/RtmpFh9b6n/file2fca7002cde9" "/tmp/RtmpFh9b6n/file2fca704c0bc5"
## [107] "/tmp/RtmpFh9b6n/file2fca7056320a" "/tmp/RtmpFh9b6n/file2fca77a1afa8"
## [109] "/tmp/RtmpFh9b6n/file2fca78130371" "/tmp/RtmpFh9b6n/file2fca78e2e46f"
## [111] "/tmp/RtmpFh9b6n/file2fca7a13d2b4" "/tmp/RtmpFh9b6n/file2fca7a24c5ed"
## [113] "/tmp/RtmpFh9b6n/file2fca7a9bb91"  "/tmp/RtmpFh9b6n/file2fca7b6cc343"
## [115] "/tmp/RtmpFh9b6n/file2fca7d44bac2" "/tmp/RtmpFh9b6n/file2fca7f35c4dd"
## [117] "/tmp/RtmpFh9b6n/file2fca9082e5c"  "/tmp/RtmpFh9b6n/file2fcaa014277" 
## [119] "/tmp/RtmpFh9b6n/file2fcaba1b326"  "/tmp/RtmpFh9b6n/file2fcac5d2aaf" 
## [121] "/tmp/RtmpFh9b6n/file2fcadf352f5"  "/tmp/RtmpFh9b6n/file2fcaeff1760" 
## [123] "/tmp/RtmpFh9b6n/file2fcafee0ee5"
stopifnot(length(y) > length(x))
unlink(tf)
(y <- explode_dir(td))
##   [1] "/tmp/RtmpFh9b6n/file2fca104a0301" "/tmp/RtmpFh9b6n/file2fca104cd350"
##   [3] "/tmp/RtmpFh9b6n/file2fca1063ef43" "/tmp/RtmpFh9b6n/file2fca115b149e"
##   [5] "/tmp/RtmpFh9b6n/file2fca14940993" "/tmp/RtmpFh9b6n/file2fca1749559d"
##   [7] "/tmp/RtmpFh9b6n/file2fca186403e8" "/tmp/RtmpFh9b6n/file2fca197c8e3a"
##   [9] "/tmp/RtmpFh9b6n/file2fca1c5bfb7c" "/tmp/RtmpFh9b6n/file2fca1d66e4d9"
##  [11] "/tmp/RtmpFh9b6n/file2fca1ec4a0e1" "/tmp/RtmpFh9b6n/file2fca1edadfad"
##  [13] "/tmp/RtmpFh9b6n/file2fca1ede8e60" "/tmp/RtmpFh9b6n/file2fca1f09a18d"
##  [15] "/tmp/RtmpFh9b6n/file2fca1f23fa83" "/tmp/RtmpFh9b6n/file2fca1f8c9582"
##  [17] "/tmp/RtmpFh9b6n/file2fca20b6c94a" "/tmp/RtmpFh9b6n/file2fca214a0a4b"
##  [19] "/tmp/RtmpFh9b6n/file2fca21fe224"  "/tmp/RtmpFh9b6n/file2fca22003349"
##  [21] "/tmp/RtmpFh9b6n/file2fca22b7e5c0" "/tmp/RtmpFh9b6n/file2fca240cffba"
##  [23] "/tmp/RtmpFh9b6n/file2fca241952e4" "/tmp/RtmpFh9b6n/file2fca243b8441"
##  [25] "/tmp/RtmpFh9b6n/file2fca2483f120" "/tmp/RtmpFh9b6n/file2fca24a641ee"
##  [27] "/tmp/RtmpFh9b6n/file2fca250238a6" "/tmp/RtmpFh9b6n/file2fca256c5626"
##  [29] "/tmp/RtmpFh9b6n/file2fca26ec2332" "/tmp/RtmpFh9b6n/file2fca2a0b4acc"
##  [31] "/tmp/RtmpFh9b6n/file2fca2a60e4e"  "/tmp/RtmpFh9b6n/file2fca2ba58bf3"
##  [33] "/tmp/RtmpFh9b6n/file2fca2cfd1eb1" "/tmp/RtmpFh9b6n/file2fca2e97095b"
##  [35] "/tmp/RtmpFh9b6n/file2fca303ab12e" "/tmp/RtmpFh9b6n/file2fca312cc31b"
##  [37] "/tmp/RtmpFh9b6n/file2fca313727c1" "/tmp/RtmpFh9b6n/file2fca31420e10"
##  [39] "/tmp/RtmpFh9b6n/file2fca31efa6f5" "/tmp/RtmpFh9b6n/file2fca336628f8"
##  [41] "/tmp/RtmpFh9b6n/file2fca3488b5f"  "/tmp/RtmpFh9b6n/file2fca3665757e"
##  [43] "/tmp/RtmpFh9b6n/file2fca3750e569" "/tmp/RtmpFh9b6n/file2fca388a8962"
##  [45] "/tmp/RtmpFh9b6n/file2fca38fb2e48" "/tmp/RtmpFh9b6n/file2fca3944d426"
##  [47] "/tmp/RtmpFh9b6n/file2fca3a81ea5f" "/tmp/RtmpFh9b6n/file2fca3b47b66e"
##  [49] "/tmp/RtmpFh9b6n/file2fca3b51bf18" "/tmp/RtmpFh9b6n/file2fca3b89a60b"
##  [51] "/tmp/RtmpFh9b6n/file2fca3dff0f36" "/tmp/RtmpFh9b6n/file2fca3e25dd2" 
##  [53] "/tmp/RtmpFh9b6n/file2fca3fc8c05d" "/tmp/RtmpFh9b6n/file2fca43725836"
##  [55] "/tmp/RtmpFh9b6n/file2fca43a276e9" "/tmp/RtmpFh9b6n/file2fca43cf2840"
##  [57] "/tmp/RtmpFh9b6n/file2fca46b5fecb" "/tmp/RtmpFh9b6n/file2fca476986c3"
##  [59] "/tmp/RtmpFh9b6n/file2fca4771e531" "/tmp/RtmpFh9b6n/file2fca479aeed6"
##  [61] "/tmp/RtmpFh9b6n/file2fca4b6c3530" "/tmp/RtmpFh9b6n/file2fca4bb546ae"
##  [63] "/tmp/RtmpFh9b6n/file2fca4bdbad78" "/tmp/RtmpFh9b6n/file2fca4c405ff" 
##  [65] "/tmp/RtmpFh9b6n/file2fca4ca8f0b8" "/tmp/RtmpFh9b6n/file2fca4e379ef8"
##  [67] "/tmp/RtmpFh9b6n/file2fca4f17a58d" "/tmp/RtmpFh9b6n/file2fca4f23425f"
##  [69] "/tmp/RtmpFh9b6n/file2fca503f7394" "/tmp/RtmpFh9b6n/file2fca50db1ebe"
##  [71] "/tmp/RtmpFh9b6n/file2fca519f287e" "/tmp/RtmpFh9b6n/file2fca523c2e97"
##  [73] "/tmp/RtmpFh9b6n/file2fca524e17b9" "/tmp/RtmpFh9b6n/file2fca5338e7f9"
##  [75] "/tmp/RtmpFh9b6n/file2fca53765438" "/tmp/RtmpFh9b6n/file2fca545e2225"
##  [77] "/tmp/RtmpFh9b6n/file2fca5511dcd8" "/tmp/RtmpFh9b6n/file2fca55e92889"
##  [79] "/tmp/RtmpFh9b6n/file2fca57035836" "/tmp/RtmpFh9b6n/file2fca57bfecb4"
##  [81] "/tmp/RtmpFh9b6n/file2fca5adca7e"  "/tmp/RtmpFh9b6n/file2fca5b3d0bd7"
##  [83] "/tmp/RtmpFh9b6n/file2fca5b448779" "/tmp/RtmpFh9b6n/file2fca5c05d09b"
##  [85] "/tmp/RtmpFh9b6n/file2fca5c2ef869" "/tmp/RtmpFh9b6n/file2fca5c9b153a"
##  [87] "/tmp/RtmpFh9b6n/file2fca5cb6dcf3" "/tmp/RtmpFh9b6n/file2fca5e439a59"
##  [89] "/tmp/RtmpFh9b6n/file2fca611fc18c" "/tmp/RtmpFh9b6n/file2fca627f1ecc"
##  [91] "/tmp/RtmpFh9b6n/file2fca65963d3f" "/tmp/RtmpFh9b6n/file2fca66e9299c"
##  [93] "/tmp/RtmpFh9b6n/file2fca6712b379" "/tmp/RtmpFh9b6n/file2fca67b5f12f"
##  [95] "/tmp/RtmpFh9b6n/file2fca67bc7863" "/tmp/RtmpFh9b6n/file2fca68a9018" 
##  [97] "/tmp/RtmpFh9b6n/file2fca69c6ff64" "/tmp/RtmpFh9b6n/file2fca6b3ca195"
##  [99] "/tmp/RtmpFh9b6n/file2fca6b5efab3" "/tmp/RtmpFh9b6n/file2fca6bb8bcd1"
## [101] "/tmp/RtmpFh9b6n/file2fca6cac32df" "/tmp/RtmpFh9b6n/file2fca6dee9b1" 
## [103] "/tmp/RtmpFh9b6n/file2fca6f74bb3a" "/tmp/RtmpFh9b6n/file2fca7002cde9"
## [105] "/tmp/RtmpFh9b6n/file2fca704c0bc5" "/tmp/RtmpFh9b6n/file2fca7056320a"
## [107] "/tmp/RtmpFh9b6n/file2fca77a1afa8" "/tmp/RtmpFh9b6n/file2fca78130371"
## [109] "/tmp/RtmpFh9b6n/file2fca78e2e46f" "/tmp/RtmpFh9b6n/file2fca7a13d2b4"
## [111] "/tmp/RtmpFh9b6n/file2fca7a24c5ed" "/tmp/RtmpFh9b6n/file2fca7a9bb91" 
## [113] "/tmp/RtmpFh9b6n/file2fca7b6cc343" "/tmp/RtmpFh9b6n/file2fca7d44bac2"
## [115] "/tmp/RtmpFh9b6n/file2fca7f35c4dd" "/tmp/RtmpFh9b6n/file2fca9082e5c" 
## [117] "/tmp/RtmpFh9b6n/file2fcaa014277"  "/tmp/RtmpFh9b6n/file2fcaba1b326" 
## [119] "/tmp/RtmpFh9b6n/file2fcac5d2aaf"  "/tmp/RtmpFh9b6n/file2fcadf352f5" 
## [121] "/tmp/RtmpFh9b6n/file2fcaeff1760"  "/tmp/RtmpFh9b6n/file2fcafee0ee5"
stopifnot(length(y) == length(x))

# Example with R installation directory
(x <- explode_dir(R.home(), include = "*/doc/html/*"))
##  [1] "/usr/local/lib/R/doc/html/NEWS.2.html"            
##  [2] "/usr/local/lib/R/doc/html/NEWS.html"              
##  [3] "/usr/local/lib/R/doc/html/R.css"                  
##  [4] "/usr/local/lib/R/doc/html/Rlogo.pdf"              
##  [5] "/usr/local/lib/R/doc/html/Rlogo.svg"              
##  [6] "/usr/local/lib/R/doc/html/Search.html"            
##  [7] "/usr/local/lib/R/doc/html/SearchOn.html"          
##  [8] "/usr/local/lib/R/doc/html/about.html"             
##  [9] "/usr/local/lib/R/doc/html/favicon.ico"            
## [10] "/usr/local/lib/R/doc/html/index.html"             
## [11] "/usr/local/lib/R/doc/html/left.jpg"               
## [12] "/usr/local/lib/R/doc/html/logo.jpg"               
## [13] "/usr/local/lib/R/doc/html/logosm.jpg"             
## [14] "/usr/local/lib/R/doc/html/packages-head-utf8.html"
## [15] "/usr/local/lib/R/doc/html/packages.html"          
## [16] "/usr/local/lib/R/doc/html/resources.html"         
## [17] "/usr/local/lib/R/doc/html/right.jpg"              
## [18] "/usr/local/lib/R/doc/html/up.jpg"
(y <- explode_dir(R.home(), include = "*/doc/html/*", exclude = "*.html"))
## [1] "/usr/local/lib/R/doc/html/R.css"      
## [2] "/usr/local/lib/R/doc/html/Rlogo.pdf"  
## [3] "/usr/local/lib/R/doc/html/Rlogo.svg"  
## [4] "/usr/local/lib/R/doc/html/favicon.ico"
## [5] "/usr/local/lib/R/doc/html/left.jpg"   
## [6] "/usr/local/lib/R/doc/html/logo.jpg"   
## [7] "/usr/local/lib/R/doc/html/logosm.jpg" 
## [8] "/usr/local/lib/R/doc/html/right.jpg"  
## [9] "/usr/local/lib/R/doc/html/up.jpg"
stopifnot(length(x) == 0L || length(x) > length(y))

# batch_collect()
# Read the first line from each of the OPM test data set files
f <- opm_files("testdata")
if (length(f) > 0) { # if the files are found
  x <- batch_collect(f, fun = readLines, fun.args = list(n = 1L))
  # yields a list with the input files as names and the result from each
  # file as values (exactly one line)
  stopifnot(is.list(x), identical(names(x), f))
  stopifnot(sapply(x, is.character), sapply(x, length) == 1L)
} else {
  warning("test files not found")
}
# For serious tasks, consider first trying the function in 'demo' mode.

# batch_process()
# Read the first line from each of the OPM test data set files and store it
# in temporary files
pf <- function(infile, outfile) write(readLines(infile, n = 1), outfile)
infiles <- opm_files("testdata")
if (length(infiles) > 0) { # if the files are found
  x <- batch_process(infiles, out.ext = "tmp", io.fun = pf,
    outdir = tempdir())
  stopifnot(is.matrix(x), identical(x[, 1], infiles))
  stopifnot(file.exists(x[, 2]))
  unlink(x[, 2])
} else {
  warning("test files not found")
}
## infile: /usr/local/lib/R/library/opm/testdata/Example_1.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_1.tmp
## before: attempt to create outfile
## after: ok
## 
## infile: /usr/local/lib/R/library/opm/testdata/Example_2.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_2.tmp
## before: attempt to create outfile
## after: ok
## 
## infile: /usr/local/lib/R/library/opm/testdata/Example_3.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_3.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Ecoplate.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_Ecoplate.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_ID_run.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_ID_run.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_LIMS_Export.exl.xz
## outfile: /tmp/RtmpFh9b6n/Example_LIMS_Export.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Old_Style_1.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_Old_Style_1.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Old_Style_2.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_Old_Style_2.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Old_Style_3.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_Old_Style_3.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Old_Style_Multiple.csv.xz
## outfile: /tmp/RtmpFh9b6n/Example_Old_Style_Multiple.tmp
## before: attempt to create outfile
## after: ok
## 
## infile:
##        /usr/local/lib/R/library/opm/testdata/Example_Perkin_Elmer.txt.xz
## outfile: /tmp/RtmpFh9b6n/Example_Perkin_Elmer.tmp
## before: attempt to create outfile
## after: ok
## 
## infile: /usr/local/lib/R/library/opm/testdata/Example_TECAN.txt.xz
## outfile: /tmp/RtmpFh9b6n/Example_TECAN.tmp
## before: attempt to create outfile
## after: ok
# For serious tasks, consider first trying the function in 'demo' mode.

# file_pattern()
(x <- file_pattern())
## [1] "\\.(csv|exl|ya?ml|json)(\\.(bz2|gz|lzma|xz))?$"
(y <- file_pattern(type = "csv", compressed = FALSE))
## [1] "\\.(csv|exl)$"
stopifnot(nchar(x) > nchar(y))
# constructing pattern from existing files
(files <- list.files(pattern = "[.]"))
##  [1] "OPM.html"              "OPM_DB.html"          
##  [3] "R.css"                 "WMD.html"             
##  [5] "aggregated.html"       "annotated.html"       
##  [7] "as.data.frame.html"    "batch_opm.html"       
##  [9] "boccuto_et_al.html"    "bracket.html"         
## [11] "bracket.set.html"      "c.html"               
## [13] "ci_plot.html"          "collect_template.html"
## [15] "csv_data.html"         "dim.html"             
## [17] "discrete.html"         "discretized.html"     
## [19] "do_aggr.html"          "do_disc.html"         
## [21] "duplicated.html"
(x <- file_pattern(I(files))) # I() causes 'literally' to be TRUE
## [1] "\\.(html|css)(\\.(bz2|gz|lzma|xz))?$"
stopifnot(grepl(x, files, ignore.case = TRUE))

# glob_to_regex()
x <- "*what glob2rx() can't handle because a '+' is included*"
(y <- glob_to_regex(x))
## [1] "^.*what glob2rx\\() can't handle because a '\\+' is included"
(z <- glob2rx(x))
## [1] "^.*what glob2rx\\() can't handle because a '+' is included"
stopifnot(!identical(y, z))
# Factor method
(z <- glob_to_regex(as.factor(x)))
## [1] ^.*what glob2rx\\() can't handle because a '\\+' is included
## Levels: ^.*what glob2rx\\() can't handle because a '\\+' is included
stopifnot(identical(as.factor(y), z))

## split_files()

# Splitting an old-style CSV file containing several plates
(x <- opm_files("multiple"))
## [1] "/usr/local/lib/R/library/opm/testdata/Example_Old_Style_Multiple.csv.xz"
if (length(x) > 0) {
  outdir <- tempdir()
  # For old-style CSV, use either "^Data File" as pattern or "Data File*"
  # with 'wildcard' set to TRUE:
  (result <- split_files(x, pattern = "^Data File", outdir = outdir))
  stopifnot(is.list(result), length(result) == length(x))
  stopifnot(sapply(result, length) == 3)
  result <- unlist(result)
  stopifnot(file.exists(result))
  unlink(result) # tidy up
} else {
  warning("opm example files not found")
}
## One could split new-style CSV as follows (if x is a vector of file names):
# split_files(x, pattern = '^"Data File",')
## note the correct setting of the quotes
## A pattern that covers both old and new-style CSV is:
# split_files(x, pattern = '^("Data File",|Data File)')
## This is used by batch_opm() in 'split' mode and by the 'run_opm.R' script

[Package opm version 1.3.63 Index]