
Classify the values of a numeric vector into numeric intervals. classify_intervals() is a wrapper around both classIntervals() and findCols().

Usage

classify_intervals(var, n, style = "quantile", rtimes = 3, ...,
 intervalClosure = c("left", "right"), dataPrecision = NULL,
 warnSmallN = TRUE, warnLargeN = TRUE, largeN = 3000L, samp_prop = 0.1,
 gr = c("[", "]"), factor = TRUE)

Arguments

var

a continuous numerical variable

n

number of classes required; if missing, nclass.Sturges is used. See also the "dpih" and "headtails" styles for automatic choice of the number of classes

style

chosen style: one of "fixed", "sd", "equal", "pretty", "quantile", "kmeans", "hclust", "bclust", "fisher", "jenks", "dpih", "headtails", or "maximum"

rtimes

number of replications of var to catenate and jitter; may be used with styles "kmeans" or "bclust" in case they have difficulties reaching a classification

intervalClosure

default “left”, allows specification of whether partition intervals are closed on the left or the right (added by Richard Dunlap). Note that the sense of interval closure is hard-coded as “right”-closed when style="jenks" (see Details below).

dataPrecision

default NULL, permits rounding of the interval endpoints (added by Richard Dunlap). This is the data precision used when printing interval values in the legend returned by findColours, and in the print method for classIntervals objects. If intervalClosure is “left”, the value returned is the ceiling of the data value multiplied by 10 to the dataPrecision power, divided by 10 to the dataPrecision power. The argument does not round var, the input variable.

warnSmallN

default TRUE; if FALSE, suppresses the warning issued when n >= nobs

warnLargeN

default TRUE; if FALSE, large data handling is not used

largeN

default 3000L, the QGIS sampling threshold; over 3000, the observations presented to "fisher" and "jenks" are either a samp_prop= sample or a sample of 3000, whichever is larger

samp_prop

default 0.1, QGIS 10% sampling proportion

gr

default c("[", "]"), if the units package is available, units::units_options("group") may be used directly to give the enclosing bracket style

...

arguments to be passed to the functions called in each style

factor

default TRUE; if TRUE, returns cols as a factor with intervals as labels rather than integers
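
Two of the arguments above describe simple arithmetic, which the following base R sketch works through. The helper names round_endpoint_up() and sample_size() are hypothetical and are not part of classInt; the first follows the formula given for dataPrecision with “left” closure, and the second is one reading of the largeN and samp_prop description (whether the package rounds the sample size is an assumption left open here).

```r
# Hypothetical helpers for illustration only; they are not part of classInt.

# Rounding described for dataPrecision when intervalClosure is "left":
# ceiling(value * 10^dataPrecision) / 10^dataPrecision
round_endpoint_up <- function(x, dataPrecision) {
  ceiling(x * 10^dataPrecision) / 10^dataPrecision
}

# Endpoints taken from the "sd" example on this page
breaks <- c(2025.578, 7230.531, 12435.48)
rounded1 <- round_endpoint_up(breaks, dataPrecision = 1)
rounded0 <- round_endpoint_up(breaks, dataPrecision = 0)
print(rounded1)  # 2025.6 7230.6 12435.5
print(rounded0)  # 2026 7231 12436

# Sample size implied by largeN and samp_prop for "fisher" and "jenks":
# above the threshold, the larger of largeN and samp_prop * nobs
# (assumed reading of the argument description).
sample_size <- function(nobs, largeN = 3000L, samp_prop = 0.1) {
  if (nobs <= largeN) {
    return(nobs)  # no sampling at or below the threshold
  }
  max(largeN, samp_prop * nobs)
}

print(sample_size(2000))   # all 2000 observations are used
print(sample_size(10000))  # 10% would be 1000, so 3000 is used
print(sample_size(50000))  # 10% is 5000, which exceeds 3000
```

Note that the rounding is applied to the printed interval endpoints only; var itself is left untouched.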

Value

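A vector of the same length as var. When factor = TRUE, returns a factor whose levels are the intervals and whose values give the interval of each observation. When factor = FALSE, returns the integer class index of each observation.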
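
Because classify_intervals() wraps classIntervals() and findCols(), the two-step call and the wrapper should agree. The sketch below assumes the classInt package is installed and uses the deterministic "sd" style; the variable names are illustrative.

```r
library(classInt)  # assumes classInt is installed

xvar <- c(22361, 9573, 4836, 5309, 10384, 4359, 11016, 4414, 3327, 3408,
  17816, 6909, 6936, 7990, 3758, 3569, 21965, 3605, 2181, 1892,
  2459, 2934, 6399, 8578, 8537, 4840, 12132, 3734, 4372, 9073,
  7508, 5203)

# Two-step route: compute the breaks, then look up each observation's class
ci <- classIntervals(xvar, n = 5, style = "sd")
codes_two_step <- findCols(ci)

# Wrapper route: integer class indices, then the factor form
codes_direct <- classify_intervals(xvar, n = 5, style = "sd", factor = FALSE)
classes_factor <- classify_intervals(xvar, n = 5, style = "sd", factor = TRUE)

# Both routes should give the same class indices, and the factor's
# underlying integer codes should match them as well
print(identical(as.integer(codes_two_step), as.integer(codes_direct)))
print(identical(as.integer(classes_factor), as.integer(codes_direct)))
```

The factor form is convenient for tables and plot legends, while the integer form is convenient for indexing into a vector of colours.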

See also

Examples

xvar <- c(22361, 9573, 4836, 5309, 10384, 4359, 11016, 4414, 3327, 3408, 
  17816, 6909, 6936, 7990, 3758, 3569, 21965, 3605, 2181, 1892, 
  2459, 2934, 6399, 8578, 8537, 4840, 12132, 3734, 4372, 9073, 
  7508, 5203)
classIntervals(xvar, 5, "sd")
#> style: sd
#>   one of 31,465 possible partitions of this variable into 5 classes
#> [-3179.375,2025.578)  [2025.578,7230.531)  [7230.531,12435.48) 
#>                    1                   19                    9 
#>  [12435.48,17640.44)  [17640.44,22845.39] 
#>                    0                    3 
classify_intervals(xvar, 5, "sd", factor = FALSE)
#>  [1] 5 3 2 2 3 2 3 2 2 2 5 2 2 3 2 2 5 2 2 1 2 2 2 3 3 2 3 2 2 3 3 2
classify_intervals(xvar, 5, "sd", factor = TRUE)
#>  [1] [17640.44,22845.39]  [7230.531,12435.48)  [2025.578,7230.531) 
#>  [4] [2025.578,7230.531)  [7230.531,12435.48)  [2025.578,7230.531) 
#>  [7] [7230.531,12435.48)  [2025.578,7230.531)  [2025.578,7230.531) 
#> [10] [2025.578,7230.531)  [17640.44,22845.39]  [2025.578,7230.531) 
#> [13] [2025.578,7230.531)  [7230.531,12435.48)  [2025.578,7230.531) 
#> [16] [2025.578,7230.531)  [17640.44,22845.39]  [2025.578,7230.531) 
#> [19] [2025.578,7230.531)  [-3179.375,2025.578) [2025.578,7230.531) 
#> [22] [2025.578,7230.531)  [2025.578,7230.531)  [7230.531,12435.48) 
#> [25] [7230.531,12435.48)  [2025.578,7230.531)  [7230.531,12435.48) 
#> [28] [2025.578,7230.531)  [2025.578,7230.531)  [7230.531,12435.48) 
#> [31] [7230.531,12435.48)  [2025.578,7230.531) 
#> 5 Levels: [-3179.375,2025.578) [2025.578,7230.531) ... [17640.44,22845.39]

if (!require("spData", quietly=TRUE)) {
  message("spData package needed for examples")
  run <- FALSE
} else {
  run <- TRUE
}

if (run) {
  data("jenks71", package = "spData")
  x <- jenks71$jenks71
  classify_intervals(x, n = 5, style = "fisher")
}
#>   [1] [43.3,61.36)    [78.475,105.95) [61.36,78.475)  [61.36,78.475) 
#>   [5] [78.475,105.95) [78.475,105.95) [78.475,105.95) [78.475,105.95)
#>   [9] [105.95,155.3]  [105.95,155.3]  [105.95,155.3]  [105.95,155.3] 
#>  [13] [78.475,105.95) [78.475,105.95) [61.36,78.475)  [78.475,105.95)
#>  [17] [61.36,78.475)  [61.36,78.475)  [78.475,105.95) [105.95,155.3] 
#>  [21] [43.3,61.36)    [61.36,78.475)  [61.36,78.475)  [61.36,78.475) 
#>  [25] [78.475,105.95) [61.36,78.475)  [61.36,78.475)  [43.3,61.36)   
#>  [29] [61.36,78.475)  [78.475,105.95) [43.3,61.36)    [43.3,61.36)   
#>  [33] [43.3,61.36)    [61.36,78.475)  [61.36,78.475)  [61.36,78.475) 
#>  [37] [43.3,61.36)    [43.3,61.36)    [61.36,78.475)  [43.3,61.36)   
#>  [41] [15.57,43.3)    [43.3,61.36)    [43.3,61.36)    [15.57,43.3)   
#>  [45] [43.3,61.36)    [61.36,78.475)  [43.3,61.36)    [61.36,78.475) 
#>  [49] [43.3,61.36)    [15.57,43.3)    [43.3,61.36)    [61.36,78.475) 
#>  [53] [43.3,61.36)    [43.3,61.36)    [43.3,61.36)    [43.3,61.36)   
#>  [57] [43.3,61.36)    [43.3,61.36)    [43.3,61.36)    [43.3,61.36)   
#>  [61] [43.3,61.36)    [43.3,61.36)    [43.3,61.36)    [43.3,61.36)   
#>  [65] [15.57,43.3)    [15.57,43.3)    [43.3,61.36)    [43.3,61.36)   
#>  [69] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#>  [73] [43.3,61.36)    [43.3,61.36)    [15.57,43.3)    [15.57,43.3)   
#>  [77] [15.57,43.3)    [15.57,43.3)    [43.3,61.36)    [43.3,61.36)   
#>  [81] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#>  [85] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#>  [89] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#>  [93] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#>  [97] [15.57,43.3)    [15.57,43.3)    [15.57,43.3)    [15.57,43.3)   
#> [101] [15.57,43.3)    [15.57,43.3)   
#> 5 Levels: [15.57,43.3) [43.3,61.36) [61.36,78.475) ... [105.95,155.3]
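
As a small follow-up to the "sd" example, the integer codes returned with factor = FALSE can be tabulated to recover the class counts printed by classIntervals(). This sketch uses base R only; the codes vector is copied from the output shown above rather than recomputed.

```r
# Integer class indices copied from the factor = FALSE output above
codes <- c(5, 3, 2, 2, 3, 2, 3, 2, 2, 2, 5, 2, 2, 3, 2, 2,
           5, 2, 2, 1, 2, 2, 2, 3, 3, 2, 3, 2, 2, 3, 3, 2)

# nbins = 5 keeps the empty fourth class in the result
counts <- tabulate(codes, nbins = 5)
print(counts)  # 1 19 9 0 3

# The same counts via a factor with all five levels declared
counts_tab <- table(factor(codes, levels = 1:5))
print(counts_tab)
```

Declaring the number of classes explicitly matters here because the fourth interval contains no observations and would otherwise be dropped from the tabulation.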