## What is a GRNN? Not a neural net!

In a recent article (Abbot & Marohasy, 2017; see Notes) a type of neural net called a Generalized Regression Neural Network was used. What is it? It turns out to be no more than a simple smoothing algorithm. In the following code, variables that start with a capital letter are vectors or matrices. I can write the whole “GRNN” in three lines of R code:

```r
My_grnn <- function(Y, Xtrain, Xtest, sigma, m = nrow(Xtrain)) {
  D <- as.matrix(dist(rbind(Xtrain, Xtest)))[1:m, -(1:m)]
  W <- exp(-D^2 / (2 * sigma^2))
  return(Y %*% W / colSums(W))
}
```

Here Xtrain are observations with known results Y, and Xtest are observations for which we want to return predicted values. In this simple algorithm D holds the Euclidean distances between Xtrain and Xtest, W the Gaussian weights derived from those distances, and the function returns the weighted average of Y as the predictions.
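Spelled out, the prediction at a test point x is just a kernel-weighted average: ŷ(x) = Σᵢ wᵢ yᵢ / Σᵢ wᵢ, with wᵢ = exp(−‖x − xᵢ‖² / 2σ²). A tiny check of My_grnn on made-up 1-D data (the numbers here are illustrative, not from the original post):

```r
# Toy data: three training points on a line, two test points in between.
Xtr <- matrix(c(0, 1, 2), ncol = 1)
Ytr <- c(0, 10, 20)
Xte <- matrix(c(0.5, 1.5), ncol = 1)

# With a small sigma only the nearest neighbours carry weight, so each
# prediction sits close to the local average of Y.
My_grnn(Ytr, Xtr, Xte, sigma = 0.3)
# roughly 5 and 15: each test point averages its two nearest Y values
```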

As an algorithm this does not deserve the name neural network. It also performs poorly as a machine learning method compared to other generalized regressions. Compare it to the workhorse xgboost on the Boston Housing data.

```r
# ------------- real example
require(MASS)
data(Boston)
B <- scale(Boston, center = TRUE, scale = TRUE)
B[, 14] <- Boston[, 14]
# the medv column holds the median value of owner-occupied homes in $1000s
# 506 rows; let's hold out 80 to predict
set.seed(17*41)
h<- sample(nrow(Boston), 80)

Xtrain<- as.matrix(B[-h, -14])
Ytrain<- B[-h, 14]
Xtest<- as.matrix(B[h, -14])
Y_test<- B[h, 14]

# determine best sigma for grnn
range <- seq(0.1, 3, by=0.01)
result <- sapply(range, function(s){
Y_hat<- My_grnn(Ytrain, Xtrain, Xtest, sigma=s)
return(Metrics::rmse(Y_hat, Y_test))
})
best<- range[which(result==min(result))]
pred_grnn <- My_grnn(Ytrain, Xtrain, Xtest, sigma=best)

require(xgboost)
param <- list(
  eta = 0.005,
  subsample = 0.95,
  max_depth = 6,
  min_child_weight = 1,
  colsample_bytree = 0.7
)

dmodel <- xgb.DMatrix(as.matrix(Xtrain), label = Ytrain, missing=NA)
dvalid <- xgb.DMatrix(as.matrix(Xtest), label = Y_test, missing=NA)

model <- xgb.train(
data = dmodel,
nrounds = 10000,
params = param,
watchlist = list(val = dvalid),
early_stopping_rounds = 1
)
pred_xgb <- predict(model, dvalid, ntree_limit = model$best_iteration)
## compare predictions with root mean squared error
Metrics::rmse(pred_grnn, Y_test)
Metrics::rmse(pred_xgb, Y_test)
##  6.086455
##  2.993994
## yes -- grnn is very bad, a factor 2 worse than xgboost

```
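One caveat worth flagging: the sigma scan above picks the value that minimizes error on the test set itself, which flatters the GRNN. A sketch of choosing sigma from the training data alone via leave-one-out prediction (the `loo_rmse` helper is my own addition, not part of the original post):

```r
# Leave-one-out RMSE of the GRNN on the training data, for a given sigma.
loo_rmse <- function(Y, X, sigma) {
  D <- as.matrix(dist(X))            # train-vs-train distances
  W <- exp(-D^2 / (2 * sigma^2))
  diag(W) <- 0                       # each point must not predict itself
  Y_hat <- as.vector(Y %*% W) / colSums(W)
  sqrt(mean((Y_hat - Y)^2))
}

sigmas <- seq(0.1, 3, by = 0.1)
errs <- sapply(sigmas, function(s) loo_rmse(Ytrain, Xtrain, s))
best_loo <- sigmas[which.min(errs)]
```

A sigma chosen this way gives a fairer error estimate; it will typically not beat the test-set-tuned value, which only strengthens the comparison with xgboost.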

Notes

J. Abbot & J. Marohasy, “The application of machine learning for evaluating anthropogenic versus natural climate change”, GeoResJ, Dec. 2017, https://doi.org/10.1016/j.grj.2017.08.001

For fundamental criticism see, for instance, https://andthentheresphysics.wordpress.com/2017/08/22/machine-unlearning/
