If I recall, FPI actually holds up pretty well, and this calibration, while not perfect, is pretty good. There are all sorts of other statistical measures you could use, but I think the main takeaway is that FPI is a fine model, even if there might be better options out there.
I decided to look at this myself because I was curious whether the model's predictions get better later in the season. I expected they would, but it turns out they don't. That is mostly because of the 90-100 bucket. If you remove that bucket from the analysis, early- and late-season predictions are about the same from a statistical perspective. What happens is that the early-season model runs up its statistical score by confidently predicting a whole lot of games like BYU vs SIU. So it's not the best comparison.
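
To make the bucket effect concrete, here is a rough Python sketch of that kind of comparison. It is not the exact analysis described above: the Brier score is only an assumed stand-in for the "statistical score", the 10-point buckets on the favorite's win probability are an assumption, and the `early` and `late` lists are made-up toy numbers, not real FPI results.

```python
from collections import defaultdict


def bucket_of(p):
    """Lower edge (in percent) of the 10-point bucket for probability p.

    A probability of exactly 1.0 is placed in the top (90-100) bucket.
    """
    return min(int(p * 10), 9) * 10


def calibration(games):
    """Group (predicted_prob, outcome) pairs by bucket.

    Returns {bucket: (games_in_bucket, mean_predicted_prob, actual_win_rate)}.
    outcome is 1 if the favorite won, 0 otherwise.
    """
    groups = defaultdict(list)
    for p, won in games:
        groups[bucket_of(p)].append((p, won))
    table = {}
    for b in sorted(groups):
        rows = groups[b]
        n = len(rows)
        mean_pred = sum(p for p, _ in rows) / n
        win_rate = sum(w for _, w in rows) / n
        table[b] = (n, mean_pred, win_rate)
    return table


def brier(games, exclude_bucket=None):
    """Mean squared error between prediction and outcome (lower is better).

    If exclude_bucket is given, games in that bucket are dropped first.
    """
    kept = [(p, w) for p, w in games if bucket_of(p) != exclude_bucket]
    return sum((p - w) ** 2 for p, w in kept) / len(kept)


# Toy data (hypothetical): the early season has many lopsided games
# that the favorite wins; the close games are identical in both halves.
early = [(0.95, 1)] * 8 + [(0.60, 1)] * 3 + [(0.60, 0)] * 2
late = [(0.95, 1)] * 2 + [(0.60, 1)] * 3 + [(0.60, 0)] * 2

for name, games in (("early", early), ("late", late)):
    print(name, "all games:", brier(games))
    print(name, "without 90-100:", brier(games, exclude_bucket=90))
    print(name, "calibration:", calibration(games))
```

In this toy setup the early-season score looks better only because it contains more near-certain games. Once the 90-100 bucket is excluded, the two halves score the same, which is the pattern described above.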