to_kmeans {opm} | R Documentation
Calculate or plot the Calinski-Harabasz statistics from kmeans results. The result of plot is a simple scatter plot which can be modified with arguments passed to plot from the graphics package. Alternatively, determine the borders between clusters of one-dimensional data, create a histogram in which these borders are plotted, or convert an object to one of class kmeans.
to_kmeans(x, ...)

## S3 method for class 'kmeans'
to_kmeans(x, ...)

## S3 method for class 'kmeanss'
to_kmeans(x, y, ...)

## S3 method for class 'Ckmeans.1d.dp'
to_kmeans(x, y, ...)

calinski(x, ...)

## S3 method for class 'kmeans'
calinski(x, ...)

## S3 method for class 'Ckmeans.1d.dp'
calinski(x, y, ...)

## S3 method for class 'kmeanss'
calinski(x, ...)

## S3 method for class 'kmeanss'
plot(x, xlab = "Number of clusters",
  ylab = "Calinski-Harabasz statistics", ...)

borders(x, ...)

## S3 method for class 'kmeans'
borders(x, y, ...)

## S3 method for class 'Ckmeans.1d.dp'
borders(x, y, ...)

## S3 method for class 'kmeanss'
borders(x, ...)

## S3 method for class 'kmeans'
hist(x, y, col = "black", lwd = 1L, lty = 1L,
  main = NULL, xlab = "Clustered values", ...)

## S3 method for class 'Ckmeans.1d.dp'
hist(x, y, ...)

## S3 method for class 'kmeanss'
hist(x, k = NULL, col = "black", lwd = 1L, lty = 1L,
  main = NULL, xlab = "Clustered values", ...)
x |
Object of class |
y |
Vector of original data subjected to clustering.
Automatically determined for the |
k |
Numeric vector or |
col |
Graphical parameter passed to |
lwd |
Like |
lty |
Like |
main |
Passed to |
xlab |
Character scalar passed to
|
ylab |
Character scalar passed to |
... |
Optional arguments passed to and from other
methods. For the |
The borders are calculated as the mean of the maximum of the cluster with the lower values and the minimum of the neighbouring cluster with the higher values. The hist method plots a histogram of one-dimensional data subjected to k-means partitioning in which these borders can be drawn.
y must also be in the same order as when it was subjected to clustering, but this is not checked. Using kmeanss objects might thus be preferable in most cases because they contain a copy of the input data.
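The border rule described above can be illustrated with a minimal base R sketch. The helper cluster_borders is hypothetical and not part of opm; it assumes one-dimensional data and a cluster assignment given in the same order as the data, which is exactly the requirement on y.

```r
# Hypothetical helper (not part of opm): borders between neighbouring
# clusters of one-dimensional data.
cluster_borders <- function(values, cluster) {
  # Row 1 holds the minimum and row 2 the maximum of each cluster.
  ranges <- vapply(split(values, cluster), range, numeric(2L))
  # Sort the clusters by their minimum so that neighbours are adjacent.
  ranges <- ranges[, order(ranges[1L, ]), drop = FALSE]
  k <- ncol(ranges)
  if (k < 2L)
    return(numeric(0L))  # a single cluster has no borders
  # Mean of the maximum of the lower cluster and the minimum of the
  # neighbouring higher cluster.
  unname((ranges[2L, -k] + ranges[1L, -1L]) / 2)
}

x <- c(1, 2, 4, 5, 7, 8)
cl <- c(1L, 1L, 2L, 2L, 3L, 3L)  # fixed assignment, same order as x
b <- cluster_borders(x, cl)
print(b)  # 3 6
```

If x were reordered without reordering cl in the same way, the sketch would silently return wrong borders, which is why objects that carry their own copy of the input data are the safer choice.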
to_kmeans creates an object of class kmeans.
borders creates a numeric vector or list of such vectors.
The return value of the hist method is like hist.default; see there for details.
calinski returns a numeric vector with one element per kmeans object. plot returns it invisibly. Its ‘names’ attribute indicates the original numbers of clusters requested.
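As a rough illustration of the statistic itself, the following sketch computes the textbook Calinski-Harabasz index from a stats::kmeans result. The helper ch_index is hypothetical, and it is an assumption that opm's calinski follows the same formula; edge cases such as a single cluster, for which this formula is undefined, may be handled differently there.

```r
# Hypothetical helper (not opm's implementation): textbook
# Calinski-Harabasz index, (B / (k - 1)) / (W / (n - k)).
ch_index <- function(km) {
  n <- length(km$cluster)  # number of observations
  k <- nrow(km$centers)    # number of clusters
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
}

x <- c(1, 2, 4, 5, 7, 8)
# Explicit starting centers keep the clustering deterministic.
km <- kmeans(x, centers = c(1.5, 4.5, 7.5))
ch <- ch_index(km)
print(ch)  # 36
```

Larger values indicate tighter, better separated clusters, which is why the statistic is usually plotted against the number of clusters requested.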
graphics::hist, graphics::abline, Ckmeans.1d.dp::Ckmeans.1d.dp
Other kmeans-functions: run_kmeans
x <- as.vector(extract(vaas_4, as.labels = NULL, subset = "A"))
x.km <- run_kmeans(x, k = 1:10)
# plot() method
# the usual arguments of plot() are available
show(y <- plot(x.km, col = "blue", pch = 19))
## 1 2 3 4 5 6 7 8
## -Inf 3507.297 3765.857 3879.576 4438.684 4779.887 5473.626 6056.848
## 9 10
## 6632.574 7411.174
stopifnot(is.numeric(y), names(y) == 1:10)
# borders() method
(x.b <- borders(x.km)) # => list of numeric vectors
## $`1`
## numeric(0)
##
## $`2`
## [1] 171.4392
##
## $`3`
## [1] 111.8538 230.9125
##
## $`4`
## [1] 100.0824 204.8658 283.0464
##
## $`5`
## [1] 70.74111 143.81037 230.91248 295.18021
##
## $`6`
## [1] 65.17332 127.05117 204.86583 261.09713 301.35835
##
## $`7`
## [1] 48.31749 89.73480 150.11714 223.42479 274.91820 306.03570
##
## $`8`
## [1] 40.65184 69.02271 111.85380 171.43923 230.91248 274.91820 306.03570
##
## $`9`
## [1] 40.65184 69.02271 111.85380 166.05988 223.42479 272.72290 301.35835
## [8] 335.07573
##
## $`10`
## [1] 40.65184 69.02271 111.85380 162.09539 217.67328 257.53834 283.04643
## [8] 305.18844 335.07573
stopifnot(is.list(x.b), length(x.b) == 10, sapply(x.b, is.numeric))
stopifnot(sapply(x.b, length) == as.numeric(names(x.b)) - 1)
# hist() methods
y <- hist(x.km[[2]], x, col = "blue", lwd = 2)
stopifnot(inherits(y, "histogram"))
y <- hist(x.km, 3:4, col = c("blue", "red"), lwd = 2)
stopifnot(inherits(y, "histogram"))
# to_kmeans() methods
x <- c(1, 2, 4, 5, 7, 8)
summary(y <- kmeans(x, 3))
## Length Class Mode
## cluster 6 -none- numeric
## centers 3 -none- numeric
## totss 1 -none- numeric
## withinss 3 -none- numeric
## tot.withinss 1 -none- numeric
## betweenss 1 -none- numeric
## size 3 -none- numeric
## iter 1 -none- numeric
## ifault 1 -none- numeric
stopifnot(identical(y, to_kmeans(y)))
# see particularly run_kmeans() which uses this internally if clustering is
# done with Ckmeans.1d.dp::Ckmeans.1d.dp()