
I am trying to add a regression line to my ggplot. Below is the code I am using to make the plot. I want to plot the regression line in the first plot (L5andDownTempAvgPlot) but keep running into the error shown below. I am also wondering whether there is a way to add a single line of best fit for the two lines in the plot (the ones from DailyAvgL5 and DailyDownGaugeAvg). I'm not sure if this is possible, but if so, that's ultimately what I would like to do.

#create plot of daily avgs
library(ggplot2)
library(dplyr)

L5andDownTempAvgPlot <- (ggplot(NULL, aes(Date, mean_temp)) +
    geom_line(data = DailyAvgL5, color = "black") +
    geom_line(data = DailyDownGaugeAvg, color = "red") +
    geom_smooth(data = DailyDownGaugeAvg, method = 'lm', formula = DailyDownGaugeAvg$Date~DailyDownGaugeAvg$mean_temp)
)

DownGaugeHeightAvgPlot <- (ggplot(DailyDownGaugeAvg, aes(Date, DailyGaugeHeightAvg)) +
                             geom_line(data = DailyDownGaugeAvg, color = "blue") +
                             geom_smooth(method = 'lm')
)

DownGaugeHeightAvgPlot + L5andDownTempAvgPlot

`geom_smooth()` using formula = 'y ~ x'
Warning message:
Failed to fit group -1.
Caused by error in `Ops.Date()`:
! * not defined for "Date" objects

> dput(head(DailyDownGaugeAvg))
structure(list(Date = structure(c(19789, 19790, 19791, 19792, 
19793, 19794), class = "Date"), mean_temp = c(7.94, 8.465625, 
7.965625, 7.64673913043478, 8.63645833333333, 9.146875), DailyGaugeHeightAvg = c(15.5787368421053, 
12.3515625, 9.34770833333333, 12.2685869565217, 15.6577083333333, 
16.4609375)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

> dput(head(DailyAvgL5))
structure(list(Date = structure(c(19791, 19792, 19793, 19794, 
19795, 19796), class = "Date"), mean_temp = c(9.98765502929687, 
9.884833984375, 8.01781209309896, 8.70198394775391, 9.21991678873698, 
9.69807739257812)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

I think this has something to do with the fact that I am trying to fit a regression line to data that is not passed as the first argument to ggplot(data, ...), but I'm honestly not sure. I am really new to R, so I am asking a lot of questions, but I appreciate the help. I am also unable to display the correlation between the downstream gauge mean_temp and the L5 mean_temp, but I believe that is beyond the scope of this question. Thanks.

    Remove your formula, as it is incorrect. You should not use data$var within a formula when the data is already passed to the function. You have also written x ~ y instead of y ~ x.
    – Onyambu
    Commented Jun 26 at 19:43

1 Answer

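In the formula, don't reference columns of a data frame. geom_smooth() expects the formula to be written in terms of the aesthetics x and y, which are mapped in aes().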

# this will fail: the formula refers to data frame columns, not to x and y
ggplot(data)+
  geom_smooth(
    aes(x = Date, y = mean_temp.x), 
    method = 'lm', 
    formula = data$Date ~ data$mean_temp.x,
    se = TRUE)

# this works
ggplot(data)+
  geom_smooth(
    aes(x = Date, y = mean_temp.x), 
    method = 'lm', 
    formula = y ~ x,
    se = TRUE)

# this works too
ggplot(data)+
  geom_smooth(
    aes(x = Date, y = mean_temp.x), 
    method = 'lm', 
    formula = y ~ poly(x, 2),
    se = TRUE)

# so does this    
ggplot(data)+
  geom_smooth(
    aes(x = Date, y = mean_temp.x), 
    method = 'lm', 
    formula = y ~ log(x),
    se = TRUE)
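
Applied to the data in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the two data frames are rebuilt from the dput() output in the question (temperature columns only), the names `both`, `site`, `paired`, `mean_temp.L5` and `mean_temp.down` are made up for this example, and stacking the two series with bind_rows() is just one way to get a single pooled line of best fit. The inner_join() at the end is a possible route to the correlation mentioned in the question, using only the dates both series share.

```r
library(ggplot2)
library(dplyr)

# Sample rows copied from the dput() output in the question
DailyDownGaugeAvg <- data.frame(
  Date = as.Date(19789:19794, origin = "1970-01-01"),
  mean_temp = c(7.94, 8.465625, 7.965625, 7.64673913043478,
                8.63645833333333, 9.146875)
)
DailyAvgL5 <- data.frame(
  Date = as.Date(19791:19796, origin = "1970-01-01"),
  mean_temp = c(9.98765502929687, 9.884833984375, 8.01781209309896,
                8.70198394775391, 9.21991678873698, 9.69807739257812)
)

# Stack both series into one data frame; `site` (hypothetical name) records the source
both <- bind_rows(L5 = DailyAvgL5, Downstream = DailyDownGaugeAvg, .id = "site")

p <- ggplot(both, aes(Date, mean_temp)) +
  # one line per series, coloured as in the question
  geom_line(aes(color = site)) +
  scale_color_manual(values = c(L5 = "black", Downstream = "red")) +
  # regression line for the downstream gauge only; the formula uses x and y
  geom_smooth(data = DailyDownGaugeAvg, method = "lm", formula = y ~ x,
              color = "red", se = FALSE) +
  # single pooled line of best fit through the points of both series
  geom_smooth(method = "lm", formula = y ~ x, color = "blue")

# Correlation between the two temperature series on the dates they share
paired <- inner_join(DailyAvgL5, DailyDownGaugeAvg, by = "Date",
                     suffix = c(".L5", ".down"))
r <- cor(paired$mean_temp.L5, paired$mean_temp.down)
```

Printing `p` draws the plot and `r` holds the correlation. Because color is mapped only inside geom_line(), the last geom_smooth() sees all points as one group and fits one line through them; mapping color in the top-level aes() instead would give one fitted line per series.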
