I'm trying to plot the original data together with the predictions of the best models using curve(), because I'm working with polynomial models, but I keep getting this error:

Error in curve(predict(best_models[[i]]$best_model, newdata = 
  data.frame(get(normalized_columns[i]))),  :
  'expr' must be a function, or a call or an expression containing 'x'

This is the part of the code that generates the error:

par(mfrow=c(2, 3))   
for (i in 1:6) {  
  curve(predict(best_models[[i]]$best_model, 
                newdata=data.frame(get(normalized_columns[i]))),   
        from=min(filtered_resp_data[[normalized_columns[i]]]),        
        to=max(filtered_resp_data[[normalized_columns[i]]]),         
        col="blue", lwd=2,      
        xlab=normalized_columns[i], ylab="xxx",    
        main=paste("Real Data vs. Best Model (Degree", 
                   best_models[[i]]$best_degree, ")"))
}
  • You need an x somewhere, e.g. curve(1 + 2*x + 3*x^2).
    – jay.sf
    Commented Jul 5 at 14:17
  • If you want to use predict with curve, the proper idiom is curve(predict(fit, newdata = data.frame(nameofyourpredictor = x))). Note how x is supplied, which satisfies the requirement that the expression passed to curve contains an x.
    – Roland
    Commented Jul 5 at 14:43
  • If you need to pass the predictor name programmatically, do something like setNames(data.frame(x), variablecontainingthename).
    – Roland
    Commented Jul 5 at 14:47

1 Answer

If you want to plot the predicted values y_hat, lines() is the more natural choice. For curve(), you can instead define a model_fun() that takes x as an argument and builds the polynomial from the coefficients, multiplying each coefficient by x raised to the corresponding power.

Let's fit a small polynomial model with raw polynomials.

fit <- lm(mpg ~ poly(wt, 2, raw=TRUE), mtcars)

Taking the lines() solution first, we predict at enough points to make the line look smooth.

ndat <- data.frame(wt=seq.int(0, 6, length.out=1e2))
pred <- predict(fit, newdata=ndat)

## plot data
plot(mpg ~ wt, data=mtcars)

## plot lines.
lines(ndat$wt, pred, col='red', lwd=2)

For curve(), we first take a look at the coefficients:

fit$coefficients |> unname()
# [1]  49.930811 -13.380337   1.171087

The model formula would be something like 49.93x^0 - 13.38x^1 + 1.17x^2, right? So let's write a small vectorized function that pairs each coefficient with the matching power of x,

model_fun <- Vectorize(
  \(x, cf) {sum(cf*x^(seq_along(cf) - 1L))},
  vectorize.args='x')

and put it into curve() which we will add to the plot.

curve(model_fun(x, cf=fit$coefficients), col='blue', add=TRUE, lty=3, lwd=3)
legend('topright', lty=c(1, 3), lwd=2, col=c('red', 'blue'), legend=c('lines', 'curve'))

As we can see, both ways match. You will know best which one is better for you.
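The visual match can also be checked numerically. The following is only a sketch: it refits the same model so that it runs on its own, and the names grid, by_hand, by_predict and agree are made up for illustration. Note that the hand-built polynomial relies on raw=TRUE; with orthogonal polynomials (the default of poly()) the coefficients are not the coefficients of x^0, x^1, x^2, so model_fun() would not reproduce predict().

```r
# Refit the model from above so this block is self-contained.
fit <- lm(mpg ~ poly(wt, 2, raw = TRUE), data = mtcars)

model_fun <- Vectorize(
  function(x, cf) sum(cf * x^(seq_along(cf) - 1L)),
  vectorize.args = "x"
)

# Evaluate both approaches on the same grid of weights
# ('grid', 'by_hand', 'by_predict' are illustrative names).
grid <- seq(2, 5, length.out = 50)
by_hand <- model_fun(grid, cf = fit$coefficients)
by_predict <- unname(predict(fit, newdata = data.frame(wt = grid)))

# TRUE if the two agree up to numerical tolerance.
agree <- isTRUE(all.equal(by_hand, by_predict))
agree
```

If agree is FALSE for your own model, the first thing to check is probably whether it was fitted with raw=TRUE.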

Putting this in a loop over your models is left as an exercise.
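As a starting point, here is a rough sketch of what such a loop might look like. It combines the setNames() idiom from the comments with curve(predict(...)), so it does not depend on raw polynomials. Everything about the data is an assumption: predictors and best_models below are stand-ins built from mtcars that only mimic the structure in the question (a list with best_model and best_degree), not your actual objects.

```r
# Hypothetical stand-in for the question's objects: three mtcars predictors
# and, per predictor, a list holding a fitted model and its degree.
predictors <- c("wt", "hp", "disp")
best_models <- lapply(predictors, function(p) {
  f <- as.formula(sprintf("mpg ~ poly(%s, 2, raw = TRUE)", p))
  list(best_model = lm(f, data = mtcars), best_degree = 2L)
})

par(mfrow = c(1, 3))
for (i in seq_along(predictors)) {
  p <- predictors[i]
  m <- best_models[[i]]$best_model
  # Plot the observed data first, then add the fitted curve on top.
  plot(mtcars[[p]], mtcars$mpg, xlab = p, ylab = "mpg",
       main = paste0("Real Data vs. Best Model (Degree ",
                     best_models[[i]]$best_degree, ")"))
  # curve() needs an 'x' in its expression; setNames() gives the newdata
  # column the predictor name the model was fitted with.
  curve(predict(m, newdata = setNames(data.frame(x), p)),
        add = TRUE, col = "blue", lwd = 2)
}
```

With add=TRUE, curve() draws over the x-range of the existing plot, so from and to do not need to be set by hand.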
