Replace the missing entries in each column of a matrix with that column's mean, or with the corresponding value from a supplied vector.

na.replace(x, m = rowSums(x, na.rm = TRUE))

Arguments

x

A matrix, possibly containing missing values and possibly in sparse format (i.e. inheriting from "sparseMatrix").

m

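Optional argument. A vector of replacement values, applied columnwise: entry j of 'm' fills the missing entries in column j of 'x'. This page states that the column means of 'x' are used when 'm' is omitted; note, however, that the default shown in the usage above is rowSums(x, na.rm = TRUE), so pass colMeans(x, na.rm = TRUE) explicitly if column means are what you want.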
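
To make the columnwise behaviour concrete, here is a minimal base-R sketch for a dense matrix. The helper name replace_na_by_column is hypothetical and not part of glmnet; it only illustrates the idea of filling each missing entry in column j with m[j], defaulting here to the column means.

```r
# Hypothetical helper (not part of glmnet): fill NAs in column j with m[j].
replace_na_by_column <- function(x, m = colMeans(x, na.rm = TRUE)) {
  stopifnot(is.matrix(x), length(m) == ncol(x))
  nas <- is.na(x)
  # col(x) holds the column index of every entry, so m[col(x)[nas]]
  # picks the replacement that belongs to each missing entry's column.
  x[nas] <- m[col(x)[nas]]
  x
}

x <- matrix(c(1, NA, 3,
              4, 5, NA), nrow = 3)
filled <- replace_na_by_column(x)
filled
```

In this toy matrix the first column has observed values 1 and 3, and the second has 4 and 5, so the two missing entries are filled with the means of those pairs.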

Value

A version of 'x' is returned with the missing values replaced.

Details

This is a simple imputation scheme. makeX calls this function when na.impute = TRUE, but it can also be used on its own. If 'x' is sparse, the result is sparse as well, and the replacements are made in a way that preserves sparsity.
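
One way such sparsity-preserving replacement could work is sketched below, assuming the Matrix package and a column-compressed ("CsparseMatrix") representation. The helper replace_na_sparse is hypothetical and is not claimed to be glmnet's implementation. It relies on the fact that a missing value in a sparse matrix is an explicitly stored entry, so only the stored values need to be touched and the zeros stay implicit.

```r
library(Matrix)

# Hypothetical helper (not glmnet's code): replace stored NAs columnwise
# without ever materialising the implicit zeros.
replace_na_sparse <- function(x, m) {
  x <- as(x, "CsparseMatrix")
  stopifnot(length(m) == ncol(x))
  # x@p holds the column pointers; diff(x@p) is the number of stored
  # entries in each column, which gives a column index per stored entry.
  col_of_entry <- rep(seq_len(ncol(x)), diff(x@p))
  nas <- is.na(x@x)
  x@x[nas] <- m[col_of_entry[nas]]
  x
}

x <- Matrix(c(0, NA, 3,
              0, 0, NA), nrow = 3, sparse = TRUE)
filled <- replace_na_sparse(x, m = c(10, 20))
filled
```

The number of stored entries is the same before and after the call, since each NA is overwritten in place rather than added to the structure.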

See also

makeX and glmnet

Author

Trevor Hastie
Maintainer: Trevor Hastie <hastie@stanford.edu>

Examples


library(glmnet)
set.seed(101)
### Single data frame
X = matrix(rnorm(20), 10, 2)
X[3, 1] = NA
X[5, 2] = NA
X3 = sample(letters[1:3], 10, replace = TRUE)
X3[6] = NA
X4 = sample(LETTERS[1:3], 10, replace = TRUE)
X4[9] = NA
dfn = data.frame(X, X3, X4)

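# Build the model matrix from the data frame; missing values are kept as NA
x = makeX(dfn)
# Replacement values: m[j] fills the missing entries in column j of x
m = rowSums(x, na.rm = TRUE)
na.replace(x, m)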
#>            X1         X2      X3a       X3b      X3c      X4A      X4B      X4C
#> 1  -0.3260365  0.5264481 0.000000 1.0000000 0.000000 0.000000 0.000000 1.000000
#> 2   0.5524619 -0.7948444 0.000000 0.0000000 1.000000 0.000000 1.000000 0.000000
#> 3   2.2004116  1.4277555 1.000000 0.0000000 0.000000 0.000000 1.000000 0.000000
#> 4   0.2143595 -1.4668197 1.000000 0.0000000 0.000000 1.000000 0.000000 0.000000
#> 5   0.3107692  1.7576174 1.000000 0.0000000 0.000000 0.000000 1.000000 0.000000
#> 6   1.1739663 -0.1933380 3.427756 0.7475398 2.310769 1.000000 0.000000 0.000000
#> 7   0.6187899 -0.8497547 1.000000 0.0000000 0.000000 1.000000 0.000000 0.000000
#> 8  -0.1127343  0.0584655 0.000000 1.0000000 0.000000 1.000000 0.000000 0.000000
#> 9   0.9170283 -0.8176704 0.000000 1.0000000 0.000000 1.980628 1.769035 1.945731
#> 10 -0.2232594 -2.0503078 0.000000 0.0000000 1.000000 0.000000 0.000000 1.000000

# Same replacement on a sparse model matrix; the result stays sparse
x = makeX(dfn, sparse = TRUE)
na.replace(x, m)
#> 10 x 8 sparse Matrix of class "dgCMatrix"
#>            X1         X2      X3a       X3b      X3c      X4A      X4B      X4C
#> 1  -0.3260365  0.5264481 .        1.0000000 .        .        .        1.000000
#> 2   0.5524619 -0.7948444 .        .         1.000000 .        1.000000 .       
#> 3   2.2004116  1.4277555 1.000000 .         .        .        1.000000 .       
#> 4   0.2143595 -1.4668197 1.000000 .         .        1.000000 .        .       
#> 5   0.3107692  1.7576174 1.000000 .         .        .        1.000000 .       
#> 6   1.1739663 -0.1933380 3.427756 0.7475398 2.310769 1.000000 .        .       
#> 7   0.6187899 -0.8497547 1.000000 .         .        1.000000 .        .       
#> 8  -0.1127343  0.0584655 .        1.0000000 .        1.000000 .        .       
#> 9   0.9170283 -0.8176704 .        1.0000000 .        1.980628 1.769035 1.945731
#> 10 -0.2232594 -2.0503078 .        .         1.000000 .        .        1.000000