As mentioned in my last blog post, I have been exploring the concepts of least squares linear regression and kurtosis.
Least squares linear regression models the relationship between two variables in a scatterplot. The procedure fits a line to the data points so that the sum of the squared vertical distances between the line and the points is minimized. The resulting line is also known as the line of best fit or trendline.
The linear equation is y = b + mx
Where, y = dependent variable
x = independent variable
b = y-intercept
m = slope of the line
To get the value of m, the formula below is used
m = (NΣ(xy) − ΣxΣy) / (NΣ(x²) − (Σx)²)
To get the value of b, the formula below is used
b = (Σy − mΣx) / N
Where, N = Number of observations
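The two formulas above can be sketched in a few lines of Python. This is only a minimal illustration: the function name least_squares_fit and the sample points are my own assumptions, not something from class.

```python
def least_squares_fit(xs, ys):
    """Return (m, b) for the line y = b + m*x using the summation formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Denominator of the slope formula: N*sum(x^2) - (sum x)^2
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical, slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / n
    return m, b


# Hypothetical sample data lying exactly on y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b = least_squares_fit(xs, ys)
print("slope m =", m, "intercept b =", b)
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With real, noisy data the line would only approximate the points.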
Kurtosis is a statistical measure that quantifies the shape of a probability distribution. It provides information about the tails and peakedness of the distribution, which helps in analyzing the outliers of a data set. Peakedness is the degree to which data values are concentrated around the mean.
- Positive kurtosis (measured relative to the normal distribution, whose kurtosis is 3) indicates heavier tails and a sharper peak than the normal distribution; the kurtosis value is greater than 3.
- Negative kurtosis indicates lighter tails and a flatter distribution than the normal distribution; the kurtosis value is less than 3.
- Zero kurtosis indicates moderate tails and a medium peak, the same as the normal distribution; the kurtosis value is equal to 3.
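To make the three cases concrete, here is a small sketch that computes kurtosis as the fourth central moment divided by the squared variance, and then subtracts 3 to compare against the normal distribution. The function name and the two sample data sets are assumptions of mine, chosen only for illustration.

```python
def kurtosis(values):
    """Population kurtosis: fourth central moment / (variance squared)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all values are identical, kurtosis is undefined")
    return m4 / m2 ** 2


# Hypothetical data: evenly spread values vs. values with one outlier
flat = [1, 2, 3, 4, 5]
heavy = [0] * 9 + [10]

for name, data in (("flat", flat), ("heavy", heavy)):
    k = kurtosis(data)
    # k - 3 is the kurtosis relative to the normal distribution
    print(name, "kurtosis =", round(k, 3), "relative to normal =", round(k - 3, 3))
```

The evenly spread data gives a kurtosis of about 1.7 (below 3, the negative case), while the data with a single outlier gives about 8.1 (above 3, the positive case).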
Heteroscedasticity in Regression Analysis
Heteroscedasticity means unequal scatter. In regression analysis, we talk about heteroscedasticity in the context of residuals: it is a change in the spread of the residuals over the range of measured values. Heteroscedasticity produces a cone or funnel shape in residual plots.
The Breusch-Pagan test is used to test for heteroscedasticity in regression analysis.
As discussed in today’s class, the Breusch-Pagan test can be applied to a coin toss scenario. Suppose we want to test whether a coin is fair, meaning it has an equal probability of landing heads or tails, so we flip the coin 100 times and record the outcomes.
The test involves two hypotheses, as below
Null Hypothesis (H0): There is no heteroscedasticity in the coin toss data, implying that the variance of the outcomes (heads or tails) remains constant across all tosses.
Alternative Hypothesis (Ha): There is heteroscedasticity in the coin toss data, suggesting that the variance of the outcomes (heads or tails) is not constant and may vary across the tosses.
- If the p-value is greater than the significance level (e.g., 0.05), we do not have enough evidence to reject the null hypothesis. This suggests that the variance of the coin toss outcomes remains relatively constant, and there is no significant heteroscedasticity.
- If the p-value is less than the significance level, we reject the null hypothesis in favor of the alternative hypothesis. This implies that there is evidence of heteroscedasticity, indicating that the variance of the coin toss outcomes may not be constant, and there could be variations in the coin’s behavior across the tosses.
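For the regression setting, the same decision rule can be tried out with a rough sketch of the test. I am assuming the studentized (Koenker) form of the Breusch-Pagan statistic here: fit the line, regress the squared residuals on x, and use LM = N × R², which is compared against a chi-square distribution with 1 degree of freedom because there is a single predictor. The function names and the synthetic data are hypothetical, and the "errors" are a deterministic alternating pattern rather than real random noise so that the result is reproducible.

```python
import math


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def breusch_pagan(xs, ys):
    """Studentized Breusch-Pagan test for one predictor: returns (LM, p-value)."""
    n = len(xs)
    b, m = ols_fit(xs, ys)
    sq_resid = [(y - (b + m * x)) ** 2 for x, y in zip(xs, ys)]

    # Auxiliary regression: squared residuals on x, then its R^2
    b_aux, m_aux = ols_fit(xs, sq_resid)
    mean_sq = sum(sq_resid) / n
    ss_tot = sum((e - mean_sq) ** 2 for e in sq_resid)
    ss_res = sum((e - (b_aux + m_aux * x)) ** 2 for x, e in zip(xs, sq_resid))
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0

    lm = max(0.0, n * r2)  # clamp tiny negative rounding errors
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Synthetic data (assumption): alternating +/- errors around y = 2 + 3x
xs = list(range(1, 41))
signs = [1 if i % 2 == 0 else -1 for i in xs]
y_constant = [2 + 3 * x + s * 1.0 for x, s in zip(xs, signs)]    # constant spread
y_funnel = [2 + 3 * x + s * 0.5 * x for x, s in zip(xs, signs)]  # spread grows with x

lm_const, p_const = breusch_pagan(xs, y_constant)
lm_funnel, p_funnel = breusch_pagan(xs, y_funnel)
print("constant spread: LM =", round(lm_const, 3), "p =", round(p_const, 4))
print("funnel shape:    LM =", round(lm_funnel, 3), "p =", p_funnel)
```

With the constant-spread data the p-value stays above 0.05, so the null hypothesis is not rejected. With the funnel-shaped data the p-value falls far below 0.05, which is evidence of heteroscedasticity.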