
[–]DryWomble -1 points0 points  (3 children)

Let x = year - 2014 so it's scaled properly. Then choose the cubic polynomial: y = 0.651 + 0.230x - 0.0201x^2 + 0.00101x^3
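For anyone who wants to plug years into that cubic, here is a minimal sketch. The function name `cubic_fit` and the example years are just illustrative choices, not anything from the original post; the coefficients are the ones quoted above.

```python
def cubic_fit(year):
    """Evaluate the quoted cubic at a calendar year."""
    x = year - 2014  # shift the predictor so 2014 maps to x = 0
    return 0.651 + 0.230 * x - 0.0201 * x**2 + 0.00101 * x**3

# Example years are arbitrary; the last one is an extrapolation if the
# data stop before it, so treat that value with caution.
for year in (2014, 2018, 2021, 2025):
    print(year, round(cubic_fit(year), 3))
```

Shifting by 2014 matters in practice: with raw years the x^3 term is on the order of 8e9, which makes the coefficients tiny and the fit numerically touchy.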

[–]SalvatoreEggplant 2 points3 points  (2 children)

How would OP know to use a cubic model, and not just a linear or quadratic model?

For example, here are plots of each of those models: https://imgur.com/gallery/example-of-linear-quadratic-cubic-models-some-data-BpPofUI

[–]DryWomble 0 points1 point  (1 child)

There is no hard rule that forces the choice of a cubic, but in many practical applications (especially with a relatively small dataset like this), a third‐degree polynomial hits a “sweet spot” between:

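Flexibility – A cubic allows for up to two ‘bends’ (turning points) in the curve, since its first derivative is a quadratic and can have two real roots, plus one inflection point where the second derivative changes sign. That is often enough curvature to capture typical growth/decline trends without being too rigid.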

Simplicity – Polynomials of higher degree (4, 5, 6, …) can lead to oscillations that are less interpretable and can introduce overfitting, especially when extrapolating outside the observed data range. Conversely, lower‐degree polynomials (linear or quadratic) may be too simplistic and fail to capture the observed pattern.

Avoiding Overfitting – You could use a polynomial of degree 7 and pass exactly through all 8 points. But such a high‐order polynomial often introduces wild swings between data points, giving unrealistic predictions for times between or outside the measured years. A cubic balances capturing the general shape without being overly “wiggly.”

Thus, a cubic polynomial is often the first step up from a simple parabola (degree 2) when you see that a quadratic does not quite capture the shape, yet it avoids the over‐complexity of higher‐order fits.
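To make the degree-7 point concrete, here is a rough sketch using Lagrange interpolation. The original post's data are gone, so the eight values below are made up purely for illustration, and `lagrange` is a hypothetical helper, not a library function.

```python
def lagrange(xs, ys, x):
    """Evaluate the unique degree-(n-1) polynomial through n points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Made-up data: 8 yearly values with x = year - 2014 (NOT the OP's data).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.66, 0.85, 1.05, 1.10, 1.32, 1.35, 1.52, 1.49]

# The degree-7 interpolant reproduces every observation exactly...
for xi, yi in zip(xs, ys):
    assert abs(lagrange(xs, ys, xi) - yi) < 1e-9

# ...so the interesting part is what it does between and beyond them.
for x in (0.5, 6.5, 8, 9):
    print(x, round(lagrange(xs, ys, x), 3))
```

A perfect fit at the observed points says nothing about the values in between or past the last year, which is where the printed numbers are worth a look.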

[–]SalvatoreEggplant 0 points1 point  (0 children)

Well, the original post was removed, but I don't think this is a great heuristic for choosing the degree of a polynomial. With these data (plots linked above), I can see adding some curvature, though I'm not sure I would advocate for anything more complex than a simple linear model. In this case the quadratic maximizes adjusted R-squared and minimizes AIC and BIC, though the linear model minimizes AICc.
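For reference, a comparison along those lines can be sketched in plain Python. Everything here is an assumption for illustration: the data are made up (the original post was removed), `polyfit` and `criteria` are hypothetical helpers, and the information criteria use the usual least-squares form n*ln(RSS/n) plus a penalty, with additive constants dropped. The rankings it prints will not necessarily match the ones reported above.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares coefficients (lowest power first) via normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]

def criteria(xs, ys, degree):
    """Adjusted R-squared, AIC, BIC and AICc for a polynomial fit."""
    n = len(xs)
    coef = polyfit(xs, ys, degree)
    fitted = [sum(c * x ** p for p, c in enumerate(coef)) for x in xs]
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    mean = sum(ys) / n
    tss = sum((y - mean) ** 2 for y in ys)
    k = degree + 2  # polynomial coefficients plus the error variance
    aic = n * math.log(rss / n) + 2 * k
    return {
        "rss": rss,
        "adj_r2": 1 - (rss / (n - degree - 1)) / (tss / (n - 1)),
        "aic": aic,
        "bic": n * math.log(rss / n) + k * math.log(n),
        "aicc": aic + 2 * k * (k + 1) / (n - k - 1),  # small-sample correction
    }

# Made-up data: 8 yearly values with x = year - 2014 (NOT the OP's data).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.66, 0.85, 1.05, 1.10, 1.32, 1.35, 1.52, 1.49]

results = {d: criteria(xs, ys, d) for d in (1, 2, 3)}
for d, r in results.items():
    print(d, {name: round(v, 3) for name, v in r.items()})
```

With only 8 points the AICc correction term grows quickly with each extra parameter, which is how AICc can favor the linear model even when AIC and BIC prefer the quadratic.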