substrate_info {opm} | R Documentation |
Return information on substrates, such as their CAS number or another database ID, or convert substrate names.
## S4 method for signature 'MOPMX'
substrate_info(object, ...)

## S4 method for signature 'OPM'
substrate_info(object, ...)

## S4 method for signature 'OPMS'
substrate_info(object, ...)

## S4 method for signature 'character'
substrate_info(object, what = c("cas", "kegg", "drug", "metacyc", "chebi",
  "mesh", "seed", "downcase", "greek", "concentration", "html", "peptide",
  "peptide2", "all"), browse = 0L, download = FALSE, ...)

## S4 method for signature 'factor'
substrate_info(object, ...)

## S4 method for signature 'list'
substrate_info(object, ...)

## S4 method for signature 'substrate_match'
substrate_info(object, ...)
object | Query character vector, factor or list, S3 object of class ‘substrate_match’, or an OPM, OPMS or MOPMX object. |
what | Character scalar indicating which kind of information to output. See the references for information on the databases. |
browse | Numeric scalar. If non-zero, a URL is generated from each ID. If positive, this number of URLs (counted from the beginning) is also opened in the default web browser; if negative, the URLs are only returned. It is an error to try this with those values of what that do not yield an ID. |
download | Logical scalar indicating whether, using the available IDs, substrate information should be queried from the corresponding web services and returned in customised objects. Note that this is unavailable for most values of what. |
... | Optional other arguments passed between the methods. |
The query names must be written exactly as used in the stored plate annotations. To determine their spelling, use find_substrate. Each spelling might include a concentration indicator, but the same underlying substrate name yields the same ID irrespective of the concentration.
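The relationship between spellings that differ only in their concentration indicator can be sketched in base R. This is an illustration only, not the code used by opm: the helper strip_concentration is hypothetical, the indicator is assumed to have the form " #<number>" at the end of the name (as in the example output on this page), and the lookup table is a hand-made stand-in for the stored annotations.

```r
# Hypothetical helper, not part of opm: removes a trailing " #<number>"
# concentration indicator (format assumed from the example output).
strip_concentration <- function(x) sub("\\s*#\\s*[0-9]+$", "", x)

# Hand-made stand-in for the stored annotations (CAS numbers as shown in
# the examples of this page).
cas <- c(`D-Glucose` = "50-99-7", `2-Deoxy-D-Glucose` = "154-17-6")

query <- c("D-Glucose", "D-Glucose #2", "2-Deoxy-D-Glucose #4", "D-Gloucose")

# Look up each query by its underlying name; unknown names yield NA.
ids <- structure(unname(cas[strip_concentration(query)]), names = query)
ids
```

In this sketch both "D-Glucose" and "D-Glucose #2" map to the same CAS number, whereas the misspelled name remains NA.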
Note that the information is only partially complete, depending on the well and the database. While it is possible to link almost all substrates to, say, CAS numbers, they are not necessarily contained in the other databases. Thanks to the work of the ChEBI staff, which is gratefully acknowledged, ChEBI information is complete as far as possible (large molecules such as proteins or other polymers are not covered by ChEBI).
For some wells, even a main substrate cannot be identified, causing all its IDs to be missing. This holds for all control wells, for all wells that contain a mixture of (usually two) substrates, and for all wells that are only specified by a certain pH.
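Because missing IDs are to be expected, it can help to separate the wells with and without an ID before further processing. The following is a minimal base R sketch that uses a hand-made named vector as a stand-in for a result of substrate_info (values taken from the examples on this page); the variable names are chosen for illustration only.

```r
# Stand-in for a result of substrate_info(); a control well has no ID.
cas <- c(`Negative Control` = NA, Dextrin = "9004-53-9",
  `D-Maltose` = "6363-53-7")

with_id <- cas[!is.na(cas)]           # substrates for which an ID is known
without_id <- names(cas)[is.na(cas)]  # wells without an identifiable substrate

with_id
without_id
```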
The generated URLs should provide plenty of information on the respective substrate. In the case of ChEBI, KEGG and MetaCyc, much information is directly displayed on the page itself, whereas the chosen CAS site contains a number of links providing additional chemical details. The MeSH web pages directly link to the corresponding PubMed searches.
The character method returns a character vector with object used as names and either a matched entry or NA as value. A named list is returned instead only if what is set to ‘peptide’. The factor method works like the character method, whereas the list method traverses a list and calls substrate_info on suitable elements, leaving others unchanged. The OPM and OPMS methods work like the character method, using their own substrates.
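The traversal performed by the list method can be approximated in base R with rapply. This is a sketch under stated assumptions, not the implementation used by opm: the function lookup is a hypothetical stand-in for the character method and uses a hand-made table of CAS numbers taken from the examples on this page.

```r
# Hypothetical stand-in for the character method: returns a character
# vector named by the query, with NA for unknown names.
lookup <- function(x) {
  cas <- c(`D-Glucose` = "50-99-7", Dextrin = "9004-53-9")
  structure(unname(cas[x]), names = x)
}

query <- list(a = c("D-Glucose", "D-Gloucose"), b = 1:3,
  c = list(d = "Dextrin"))

# Apply 'lookup' to character elements only, at any nesting depth;
# all other elements are left unchanged.
result <- rapply(query, lookup, classes = "character", how = "replace")
str(result)
```

Here the integer vector in element b passes through untouched, while the character vectors at both nesting levels are replaced by named lookup results.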
Depending on the browse argument, the returned IDs might have been converted to URLs, and as a side effect tabs in the default web browser might have been opened. For suitable values of what, setting download to TRUE yields special objects as described above.
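How the sign of browse controls the result can be sketched as follows. The function to_urls is a hypothetical re-implementation for illustration and not part of opm; the URL prefix is copied from the example output on this page and may be outdated. Only utils::browseURL, which is listed in the cross-references of this page, is an existing function.

```r
# Hypothetical sketch of the 'browse' semantics, applied to CAS numbers.
to_urls <- function(ids, browse = 0L) {
  if (browse == 0L)
    return(ids)  # zero: IDs are returned unchanged
  base <- "http://chem.sis.nlm.nih.gov/chemidplus/direct.jsp?regno="
  urls <- ifelse(is.na(ids), NA_character_, paste0(base, ids))
  names(urls) <- names(ids)
  if (browse > 0L)  # positive: also open the first 'browse' URLs, skipping NA
    for (u in stats::na.omit(utils::head(urls, browse)))
      utils::browseURL(u)
  urls  # negative: URLs are only returned
}

x <- c(`D-Glucose` = "50-99-7", `D-Gloucose` = NA)
(y <- to_urls(x, browse = -1))
```

A negative value is the safe choice in scripts, since no browser tabs are opened and missing IDs simply remain NA.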
The MOPMX method yields a list with one element of one of the kinds described above per element of object.
Bochner, B. R., pers. comm.
http://www.cas.org/content/chemical-substances/faqs
Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., and Hirakawa, M. 2010 KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Research 38: D355–D360.
Caspi, R., Altman, T., Dreher, K., Fulcher, C.A., Subhraveti, P., Keseler, I.M., Kothari, A., Krummenacker, M., Latendresse, M., Mueller, L.A., Ong, Q., Paley, S., Pujar, A., Shearer, A.G., Travers, M., Weerasinghe, D., Zhang, P., Karp, P.D. 2012 The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Research 40: D742–D753.
http://www.ncbi.nlm.nih.gov/mesh
Coletti, M.H., Bleich, H.L. 2001 Medical subject headings used to search the biomedical literature. Journal of the American Medical Informatics Association 8: 317–323.
Hastings, J., de Matos, P., Dekker, A., Ennis, M., Harsha, B., Kale, N., Muthukrishnan, V., Owen, G., Turner, S., Williams, M., Steinbeck, C. 2013 The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Research 41: D456–D463.
Overbeek, R., Begley, T., Butler, R., Choudhuri, J., Chuang, H., Cohoon, M., de Crecy-Lagard, V., Diaz, N., Disz, T., Edwards, R., Fonstein, M., Frank, E., Gerdes, S., Glass, E., Goesmann, A., Hanson, A., Iwata-Reuyl, D., Jensen, R., Jamshidi, N., Krause, L., Kubal, M., Larsen, N., Linke, B., McHardy, A., Meyer, F., Neuweger, H., Olsen, G., Olson, R., Osterman, A., Portnoy, V., Pusch, G., Rodionov, D., Rueckert, C., Steiner, J., Stevens, R., Thiele, I., Vassieva, O., Ye, Y., Zagnitko, O., Vonstein, V. 2005 The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Research 33: 5691–5702.
utils::browseURL
Other naming-functions: find_positions, find_substrate, gen_iii, listing, opm_files, plate_type, register_plate, select_colors, wells
# Character method; compare correct and misspelled substrate name
(x <- substrate_info(c("D-Glucose", "D-Gloucose")))
## D-Glucose D-Gloucose
## "50-99-7" NA
stopifnot(anyNA(x), !all(is.na(x)))
stopifnot(identical(x, # Factor method yields same result
substrate_info(as.factor(c("D-Glucose", "D-Gloucose")))))
# Now with generation of URLs
(y <- substrate_info(c("D-Glucose", "D-Gloucose"), browse = -1))
## D-Glucose
## "http://chem.sis.nlm.nih.gov/chemidplus/direct.jsp?regno=50-99-7"
## D-Gloucose
## NA
stopifnot(is.na(y) | nchar(y) > nchar(x))
# NA remains NA (and the function would not try to open it in the browser)
# Character method, safe conversion to lower case
(x <- substrate_info(c("a-D-Glucose", "a-D-Gloucose"), "downcase"))
## a-D-Glucose a-D-Gloucose
## "alpha-D-glucose" "alpha-D-gloucose"
stopifnot(nchar(x) > nchar(c("a-D-Glucose", "a-D-Gloucose")))
# note the protection of 'D' and the conversion of 'a'
# whether or not substrate names are known does not matter here
# Peptide extraction (note treatment of non-standard amino acids)
(x <- substrate_info(c("Ala-b-Ala-D-Glu", "Glucose", "Trp-Val"), "peptide"))
## $`Ala-b-Ala-D-Glu`
## [1] "Ala" "b-Ala" "D-Glu"
##
## $Glucose
## character(0)
##
## $`Trp-Val`
## [1] "Trp" "Val"
stopifnot(is.list(x), sapply(x, length) == c(3, 0, 2))
# List method
(x <- substrate_info(find_substrate(c("D-Glucose", "D-Gloucose"))))
## D-Glucose:
## 1-Thio-b-D-Glucose: 10593-29-0
## '2-Deoxy-D-Glucose #1': 154-17-6
## '2-Deoxy-D-Glucose #2': 154-17-6
## '2-Deoxy-D-Glucose #3': 154-17-6
## '2-Deoxy-D-Glucose #4': 154-17-6
## 2-Deoxy-D-Glucose-6-Phosphate: 33068-19-8
## 3-O-Methyl-D-Glucose: 146-72-5
## D-Glucose: 50-99-7
## 'D-Glucose #1': 50-99-7
## 'D-Glucose #10': 50-99-7
## 'D-Glucose #11': 50-99-7
## 'D-Glucose #12': 50-99-7
## 'D-Glucose #2': 50-99-7
## 'D-Glucose #3': 50-99-7
## 'D-Glucose #4': 50-99-7
## 'D-Glucose #5': 50-99-7
## 'D-Glucose #6': 50-99-7
## 'D-Glucose #7': 50-99-7
## 'D-Glucose #8': 50-99-7
## 'D-Glucose #9': 50-99-7
## D-Glucose-6-Phosphate: 3671-99-6
## a-D-Glucose-1-Phosphate: 56401-20-8
## 'a-D-Glucose-1-Phosphate #1': 56401-20-8
## 'a-D-Glucose-1-Phosphate #10': 56401-20-8
## 'a-D-Glucose-1-Phosphate #11': 56401-20-8
## 'a-D-Glucose-1-Phosphate #12': 56401-20-8
## 'a-D-Glucose-1-Phosphate #2': 56401-20-8
## 'a-D-Glucose-1-Phosphate #3': 56401-20-8
## 'a-D-Glucose-1-Phosphate #4': 56401-20-8
## 'a-D-Glucose-1-Phosphate #5': 56401-20-8
## 'a-D-Glucose-1-Phosphate #6': 56401-20-8
## 'a-D-Glucose-1-Phosphate #7': 56401-20-8
## 'a-D-Glucose-1-Phosphate #8': 56401-20-8
## 'a-D-Glucose-1-Phosphate #9': 56401-20-8
## D-Gloucose: {}
stopifnot(length(x[[1]]) > length(x[[2]]))
# OPM and OPMS methods
(x <- substrate_info(vaas_1[, 1:3], "all"))
## Negative Control: []
## Dextrin:
## CAS: 9004-53-9
## ChEBI: '28675'
## KEGG compound: C00721
## KEGG drug: D00084
## MetaCyc: Dextrins
## SEED: cpd11594
## MeSH: Quaternary Ammonium Compounds
## D-Maltose:
## CAS: 6363-53-7
## ChEBI: '17306'
## KEGG compound: C00208
## KEGG drug: D00044
## MetaCyc: MALTOSE
## SEED: cpd00179
## MeSH: Maltose
stopifnot(inherits(x, "substrate_data"))
stopifnot(identical(x, substrate_info(vaas_4[, , 1:3], "all")))
## Not run:
##D
##D # this would open up to 96 tabs in your browser...
##D substrate_info(vaas_4, "kegg", browse = 100)
## End(Not run)