Error when plotting the original data with the predictions of the best models using curve()

Question

I'm trying to plot the original data together with the predictions of the best models using curve(), because I'm working with polynomial models, but I keep getting this error:

Error in curve(predict(best_models[[i]]$best_model, newdata = 
  data.frame(get(normalized_columns[i]))),  :
  'expr' must be a function, or a call or an expression containing 'x'

This is the part of the code that generates the error:

par(mfrow=c(2, 3))   
for (i in 1:6) {  
  curve(predict(best_models[[i]]$best_model, 
                newdata=data.frame(get(normalized_columns[i]))),   
        from=min(filtered_resp_data[[normalized_columns[i]]]),        
        to=max(filtered_resp_data[[normalized_columns[i]]]),         
        col="blue", lwd=2,      
        xlab=normalized_columns[i], ylab="xxx",    
        main=paste("Real Data vs. Best Model (Degree", 
                   best_models[[i]]$best_degree, ")"))
}

If you want to use predict() with curve(), the proper idiom is curve(predict(fit, newdata = data.frame(nameofyourpredictor = x))). Note how x is supplied, which satisfies the requirement that x be part of the expression passed to curve(). (Roland, commented Jul 5 at 14:43)
If you need to pass the predictor name programmatically, do something like setNames(data.frame(x), variablecontainingthename). (Roland, commented Jul 5 at 14:47)
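
The idiom from these comments can be tried on a built-in data set. This is a minimal sketch, assuming a quadratic lm() fit on mtcars; `fit` and `pred_name` are illustrative names and not objects from the question.

```r
# Illustrative model on built-in data (assumption: not the asker's models)
fit <- lm(mpg ~ poly(wt, 2), data = mtcars)

plot(mpg ~ wt, data = mtcars)

# curve() evaluates its expression on a vector named x, so x must
# appear in the expression and be handed to predict() as the predictor
curve(predict(fit, newdata = data.frame(wt = x)), add = TRUE, col = "blue")

# Same call with the predictor name held in a variable
pred_name <- "wt"
curve(predict(fit, newdata = setNames(data.frame(x), pred_name)),
      add = TRUE, col = "red", lty = 2)
```

The call in the question fails because its expression never mentions x, so curve() has nothing to vary.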

jay.sf · Accepted Answer · 2024-07-17 11:22:26Z

If you want to plot the predicted values (y_hat), lines() is the better fit. For curve(), you can write a model_fun() that takes x as its argument and builds the polynomial from the coefficients, multiplying each coefficient by x raised to the corresponding power.

Let's fit a small polynomial model with raw polynomials.

fit <- lm(mpg ~ poly(wt, 2, raw=TRUE), mtcars)

Starting with the lines() solution, we predict at enough points for the line to look smooth.

ndat <- data.frame(wt=seq.int(0, 6, length.out=1e2))
pred <- predict(fit, newdata=ndat)

## plot data
plot(mpg ~ wt, data=mtcars)

## plot lines.
lines(ndat$wt, pred, col='red', lwd=2)

For curve(), we first take a look at the coefficients:

fit$coefficients |> unname()
# [1]  49.930811 -13.380337   1.171087

The model formula would be something like 49.93x^0 - 13.38x^1 + 1.17x^2, right? So let's write a small vectorized function that pairs each coefficient with the matching power of x,

model_fun <- Vectorize(
  \(x, cf) {sum(cf*x^(seq_along(cf) - 1L))},
  vectorize.args='x')

and put it into curve() which we will add to the plot.

curve(model_fun(x, cf=fit$coefficients), col='blue', add=TRUE, lty=3, lwd=3)
legend('topright', lty=c(1, 3), lwd=2, col=c('red', 'blue'), legend=c('lines', 'curve'))

As we can see, both ways match. You will know best which one is better for you.
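
Since the plot itself is not reproduced here, the agreement can also be checked numerically. The following is a small sketch that refits the same model and compares predict() with the coefficient-based polynomial; `grid`, `by_predict`, and `by_coef` are illustrative names only.

```r
fit <- lm(mpg ~ poly(wt, 2, raw = TRUE), data = mtcars)

# Evaluate sum(cf[k] * x^(k - 1)) for each x, as in model_fun() above
model_fun <- Vectorize(
  function(x, cf) sum(cf * x^(seq_along(cf) - 1L)),
  vectorize.args = "x")

grid <- seq(0, 6, length.out = 100)
by_predict <- unname(predict(fit, newdata = data.frame(wt = grid)))
by_coef    <- model_fun(grid, cf = coef(fit))

# TRUE if both approaches agree up to numerical tolerance
all.equal(by_predict, by_coef)
```

Note that the coefficient-based approach relies on raw = TRUE. With the default orthogonal polynomials from poly(), the coefficients do not correspond to plain powers of x, so predict() is the safer route there.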

Putting this in a loop over your models is left to you as an exercise.
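
One possible shape for such a loop, as a sketch under assumptions: the structure of `best_models`, `normalized_columns`, and `filtered_resp_data` is not shown in the question, so the stand-ins below are built from mtcars and only mirror the names used there. The response `mpg`, the three predictors, and the fixed degree 2 are placeholders as well.

```r
# Stand-ins for the asker's objects (assumed structure, built from mtcars)
filtered_resp_data <- mtcars
normalized_columns <- c("wt", "hp", "disp")
best_models <- lapply(normalized_columns, function(nm) {
  f <- reformulate(sprintf("poly(%s, 2)", nm), response = "mpg")
  list(best_model = lm(f, data = filtered_resp_data), best_degree = 2)
})

par(mfrow = c(1, 3))
for (i in seq_along(best_models)) {
  nm  <- normalized_columns[i]
  fit <- best_models[[i]]$best_model
  # Original data first, then the fitted curve on top
  plot(filtered_resp_data[[nm]], filtered_resp_data$mpg,
       xlab = nm, ylab = "mpg",
       main = paste0("Real Data vs. Best Model (Degree ",
                     best_models[[i]]$best_degree, ")"))
  # x is curve()'s own variable; setNames() gives it the predictor's name
  curve(predict(fit, newdata = setNames(data.frame(x), nm)),
        add = TRUE, col = "blue", lwd = 2)
}
```

Going through predict() here, as suggested in the comments, avoids depending on whether each model was fitted with raw or orthogonal polynomials.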
