by Bill Soo (4 Submissions)
Category: Complete Applications
Compatibility: Visual Basic 5.0
Difficulty: Intermediate
Date Added: Wed 3rd February 2021
Rating:
(6 Votes)

The line of best fit is the straight line that most closely follows a set of data that is more or less linear. There are a number of ways to calculate it, but I'll use the method of least squares. This method finds the line that minimizes the sum of the squared vertical distances from each point to the line. While that sounds complicated, the execution is actually quite simple.
If you are interested in the derivation of the math, I found a website through Google that has the required formulas:
https://people.hofstra.edu/faculty/S...regression.html
For the line of best fit Y = M * X + B:
M = (n * Sxy - Sx * Sy) / (n * Sx2 - Sx * Sx)
B = (Sy - M * Sx) / n
Where:
n = number of points
Sx = the SUM of all the X coordinates
Sy = the SUM of all the Y coordinates
Sxy = the SUM of all (x * y)
Sx2 = the SUM of all (x * x)
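As a quick check of the formulas above, here is a minimal sketch in Python (not the demo's own Visual Basic code). The function name best_fit_line and the sample points are made up for illustration; the variable names follow the definitions above.

```python
def best_fit_line(points):
    """Return (M, B) for the line Y = M * X + B through a list of (x, y) pairs."""
    n = len(points)                       # number of points
    sx = sum(x for x, _ in points)        # Sx: sum of all X coordinates
    sy = sum(y for _, y in points)        # Sy: sum of all Y coordinates
    sxy = sum(x * y for x, y in points)   # Sxy: sum of all (x * y)
    sx2 = sum(x * x for x, _ in points)   # Sx2: sum of all (x * x)

    m = (n * sxy - sx * sy) / (n * sx2 - sx * sx)
    b = (sy - m * sx) / n
    return m, b


# Three points that lie exactly on Y = 2 * X + 1
m, b = best_fit_line([(1, 3), (2, 5), (3, 7)])
print(m, b)  # prints: 2.0 1.0
```

Because the sample points sit exactly on a line, the fit recovers that line's slope and intercept.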
So to calculate the line of best fit, set up a variable for each of the sums above. Then, every time you get a new data point, update the variables and recalculate the line using the equations.
You can then check the correlation coefficient using a formula given on the page linked above.
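The running-sum approach might look something like the sketch below, written in Python rather than the demo's Visual Basic. The class name RunningFit and its methods are hypothetical and only illustrate the idea: each new point updates the five variables, and the line is recalculated from them without revisiting earlier points.

```python
class RunningFit:
    """Keeps the running sums n, Sx, Sy, Sxy, Sx2 (hypothetical helper)."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Reset all sums, for example when the form is cleared
        self.n = 0
        self.sx = self.sy = self.sxy = self.sx2 = 0.0

    def add_point(self, x, y):
        # Update every sum with the new data point
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxy += x * y
        self.sx2 += x * x

    def line(self):
        # One point does not define a line, so report "no line yet"
        if self.n < 2:
            return None
        # Note: this still divides by zero if every x value is identical
        m = (self.n * self.sxy - self.sx * self.sy) / (self.n * self.sx2 - self.sx * self.sx)
        b = (self.sy - m * self.sx) / self.n
        return m, b


fit = RunningFit()
fit.add_point(1, 3)
print(fit.line())  # prints: None
fit.add_point(2, 5)
print(fit.line())  # prints: (2.0, 1.0)
```

Keeping only the sums means the cost of adding a point stays constant no matter how many points have already been placed.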
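The linked URL is truncated here, so the exact form it gives cannot be confirmed. The Python sketch below uses the standard Pearson correlation coefficient, which needs one extra sum that is not in the list above: Sy2, the sum of all (y * y). The function name correlation is an assumption for illustration.

```python
import math


def correlation(points):
    """Pearson correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2 = sum(x * x for x, _ in points)
    sy2 = sum(y * y for _, y in points)   # extra sum: Sy2 (assumed name)

    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sx2 - sx * sx) * (n * sy2 - sy * sy))
    return numerator / denominator


print(correlation([(1, 3), (2, 5), (3, 7)]))  # prints: 1.0
```

A value near 1 or -1 means the points lie close to a straight line; a value near 0 means a line describes them poorly.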
In the Demo program, you simply click on the form to place data points. The program will then draw a line of best fit based on those data points. Double click on the form to clear it and start anew.
Side Effects
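If your data is close to a vertical line, the denominator (n * Sx2 - Sx * Sx) approaches zero and you may experience overflow or division by zero errors. The same denominator is exactly zero when there is only one data point.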
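One possible way to avoid these errors is to test the denominator before dividing. The Python sketch below is only an illustration, not what the demo does: the function name safe_best_fit, the rel_tol parameter, and its default value are all assumptions, and returning None simply stands for "skip drawing the line".

```python
def safe_best_fit(n, sx, sy, sxy, sx2, rel_tol=1e-9):
    """Return (M, B), or None when the data is too close to vertical to fit."""
    denom = n * sx2 - sx * sx
    # Compare against the size of the terms being subtracted, so the
    # check also works for large coordinates (rel_tol is an assumed value)
    if n < 2 or abs(denom) <= rel_tol * n * sx2:
        return None
    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return m, b


# Three points that all share x = 5: a vertical line, so no fit is returned
print(safe_best_fit(3, 15, 12, 60, 75))  # prints: None
# The points (1, 3), (2, 5), (3, 7) fit normally
print(safe_best_fit(3, 6, 15, 34, 14))   # prints: (2.0, 1.0)
```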