Unfortunately, the resulting system of equations is usually not linear and is often impossible to solve by hand. It turns out that if f is an nth-degree polynomial (such as a line or a quadratic), then we get an (n + 1) × (n + 1) linear system of equations, which we could solve with backslash just like before. For most other functions f, the resulting system will be nonlinear and we will have to resort to some other method.

This means that we will almost always resort to MATLAB commands to find best fit curves. For some special types of best fit functions, there are built-in MATLAB commands that will find best fit curves for us. For other curves, we can use fminsearch to minimize the RMS error directly. In any case, the curve we find that minimizes the RMS error will be called the best fit curve for our data set.

Remember that our definition of "best" depends both on the choice of error function (we will always choose RMS error unless I specify otherwise, but other versions are possible) and on the choice of function. That is, the best fit line and the best fit quadratic are not really being measured on the same scale. We will talk more about the ramifications of this issue in the next section.

Higher order polynomial fits and overfitting

Because the new point is random, you will see a different final point every time you run this code, but it will essentially always be much closer to the line (in blue) than to the quadratic (in red). Since our data really did come from a line, any curve with an x^2 term will be worse than the line at predicting new data points. In essence, the x^2 term is capturing patterns in the noise from the original data set that don't really matter, not telling us something important about our data. Using a higher order polynomial like this (or using any curve with too many parameters in it) is called overfitting.
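As a rough illustration of the ideas in this section, here is a minimal MATLAB sketch. It is not the code from these notes: the data (a noisy sample from the line y = 2x + 1), the noise level, the new x value, and all variable names are assumptions made up for the example. It fits a line three ways (polyfit, fminsearch on the RMS error, and backslash on a 2 × 2 linear system written here as the normal equations), then fits a quadratic and compares both curves on one new random point.

```matlab
% Hypothetical data: noisy samples from the line y = 2x + 1 (illustrative values only)
rng(0);                                   % fix the seed so the fit is repeatable
x = (0:0.5:5)';
y = 2*x + 1 + 0.5*randn(size(x));

% RMS error of the polynomial with coefficients p (highest power first) on (x, y)
rms_err = @(p) sqrt(mean((polyval(p, x) - y).^2));

% Built-in polynomial fits: degree 1 (line) and degree 2 (quadratic)
p_line = polyfit(x, y, 1);
p_quad = polyfit(x, y, 2);

% The same line found by minimizing the RMS error directly
p_search = fminsearch(rms_err, [1, 1]);

% The same line from a 2x2 linear system solved with backslash
A = [x, ones(size(x))];
p_backslash = (A'*A) \ (A'*y);

% The quadratic never has a larger RMS error on the original data,
% because every line is also a quadratic with a zero x^2 coefficient
fprintf('RMS error on original data: line %.4f, quadratic %.4f\n', ...
        rms_err(p_line), rms_err(p_quad));

% One new random point from the same underlying line
x_new = 5.5;
y_new = 2*x_new + 1 + 0.5*randn;
fprintf('Miss at new point: line %.4f, quadratic %.4f\n', ...
        abs(polyval(p_line, x_new) - y_new), ...
        abs(polyval(p_quad, x_new) - y_new));
```

The quadratic always matches the original data at least as well as the line, which is why a smaller RMS error on the original data is not, by itself, evidence of a better model. The comparison at the new point changes from run to run if the seed is removed.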