# Linear Regression Modeling of Goldcorp Share Price vs. Spot Gold


## Introduction

The study sought to determine whether Spot Gold prices and Treasury bill returns can be used to predict the returns on Goldcorp’s share price. Goldcorp’s share price was the dependent variable, while the Spot Gold price and Treasury bill returns were the independent variables. Daily prices for all assets were used in this analysis. Regression analysis was used to determine the relationship between the three assets (Leech, Caplovitz & Morgan 2005). The following tests and statistics were also used in the analysis:

1. Descriptive statistics and tests. These included the mean, median, minimum, maximum, standard deviation, skewness, the Jarque-Bera statistic, and its probability (p-value).
2. Residual diagnostics. These included normality tests, the heteroscedasticity (White) test, the serial correlation test, the Quandt-Andrews break test, and the Ramsey RESET test.

The analysis was done using EViews statistical software. EViews was chosen because it is user-friendly, gives comprehensive output, and is well suited to this kind of analysis.

## Descriptive statistics

Daily data for Goldcorp, Spot Gold prices, and Treasury bill prices were used. The data was collected for the period between 10/13/2003 and 4/11/2014. This gave a sample of 2643 days.

### Series 1: Goldcorp

The mean of the Goldcorp series was 31.00746; that is, the average Goldcorp price over the sample period was 31.00746. The median, the value that lies in the middle when the data are arranged in increasing or decreasing order, was 30.33. It was slightly below the mean, but the two were close, which suggested that the series was roughly symmetric, although closeness of the mean and median does not by itself establish normality. The maximum value for this series was 55.84 and the minimum value was 10.50. These were the extreme values on the right-hand and left-hand sides of the mean respectively, lying about 24.83 points above and 20.51 points below it. Extreme values tend to pull the mean towards themselves, and here the larger upper extreme pushed the mean slightly above the median. However, the difference between the mean and the median was small. The standard deviation was about 11.44, meaning that Goldcorp prices typically deviated from the mean by about 11.44.

The Jarque-Bera test was used to test whether each series was normally distributed. The null hypothesis was that the series was normally distributed. The Jarque-Bera test statistic was 112.7109, with a probability of 0.000. The decision criterion was that if the probability (p-value) of the test statistic was less than the level of significance (0.05), the null hypothesis was rejected (Lehmann, 2005). For Goldcorp, the null hypothesis was rejected, so it was concluded that the series was not normally distributed. Skewness was less than zero, meaning that the distribution was skewed to the left, even though the median lay slightly below the mean; the ordering of the mean and median is only a rough guide to the direction of skew.
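
The Jarque-Bera calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the EViews routine itself: it assumes the divide-by-n convention for the moments, and the sample prices are hypothetical rather than the study's data.

```python
import math

def jarque_bera(values):
    """Return (skewness, kurtosis, JB statistic, p-value) for a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments using the population (divide-by-n) convention.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under the null, JB is asymptotically chi-square with 2 degrees of
    # freedom, whose upper-tail probability is exactly exp(-jb / 2).
    p_value = math.exp(-jb / 2.0)
    return skewness, kurtosis, jb, p_value

# Hypothetical prices, for illustration only (not the study's data).
sample = [10.5, 12.0, 15.3, 18.1, 22.4, 30.3, 31.0, 35.7, 41.2, 55.8]
skew, kurt, jb, p = jarque_bera(sample)
print(f"skewness={skew:.4f} kurtosis={kurt:.4f} JB={jb:.4f} p={p:.4f}")
```

For a statistic as large as the reported 112.7109, the chi-square tail probability `exp(-112.7109 / 2)` is far below 0.05, which is consistent with the printed probability of 0.000.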

### Series 2: Spot gold

Spot Gold had a mean of 983.0670 and a median of 921.2185. The gap between the two suggested that the distribution was not symmetric and, therefore, not normal. The maximum and minimum figures were reported as 921.2185 and 371.145 respectively; the maximum as reported coincides with the median, so it appears to be misstated. The mean lay above the median, which is consistent with a distribution skewed to the right. The standard deviation was 446.71, a little under half of the mean. The skewness was greater than zero, confirming that the distribution was skewed to the right.

The Jarque-Bera test statistic was 204.1004 and had a probability of 0.000. The probability (p-value) was less than the level of significance (0.05). The null hypothesis was, therefore, rejected leading to a conclusion that the series was not normally distributed.

### Series 3: LT Treasury

The mean for this series was 3.489769 and the median was 3.66, so the median lay on the right-hand side of the mean. The maximum and minimum values were 5.25 and 1.4 respectively. The low values in the series pulled the mean below the median. The coefficient of skewness was -0.330282, meaning that the distribution was skewed to the left. The Jarque-Bera test statistic was 169.1327 and had a probability of 0.000. The probability (p-value) was less than the level of significance (0.05), so the null hypothesis was rejected. Consequently, it was concluded that the series was not normally distributed.

## Residual diagnostics

These tests are done to validate the fit of the model. The residuals were computed by subtracting the fitted values from the actual values. The following residual diagnostic tests were conducted:

### Heteroscedasticity (White) test

The White test statistic (reported as Obs*R-squared in EViews output) was 88.16511. The null hypothesis for this test was that there was no heteroscedasticity. The test was conducted at the 5% level of significance. The White test statistic is calculated as the number of observations multiplied by R² (the coefficient of determination) of the auxiliary regression of the squared residuals. The critical value at the 5% level of significance was 11.0705, which is the chi-square critical value with 5 degrees of freedom. The statistic obtained was greater than the critical value, leading to rejection of the null hypothesis. It was concluded that heteroscedasticity was present.
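
As a sketch of what the statistic measures, the auxiliary regression behind the White test can be written out for a model with two regressors. The form below assumes that squares and the cross term are included, which would be consistent with the 5 degrees of freedom implied by the critical value; the symbols $\hat{e}_t$, $x_{1t}$, $x_{2t}$, $\alpha_j$ and $v_t$ are notation introduced here for illustration.

$$
\hat{e}_t^{2} = \alpha_0 + \alpha_1 x_{1t} + \alpha_2 x_{2t} + \alpha_3 x_{1t}^{2} + \alpha_4 x_{2t}^{2} + \alpha_5 x_{1t} x_{2t} + v_t
$$

$$
LM = n \cdot R^{2}_{\text{aux}} \sim \chi^{2}(5) \quad \text{under } H_0: \alpha_1 = \dots = \alpha_5 = 0, \qquad 88.16511 > 11.0705 \;\Rightarrow\; \text{reject } H_0
$$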

### Ramsey RESET test

This is a test for model specification. The null hypothesis was that the model was correctly specified, that is, that the coefficients on the powers of the fitted values added to the regression were jointly zero.

The F-statistic in the EViews output table is the Ramsey RESET test statistic. It was equal to 1.871992. The critical F-statistic at the 5% level of significance was 2.85. The calculated value was less than the critical value, meaning that the null hypothesis could not be rejected. There was, therefore, no evidence that the model was misspecified. This result says nothing about whether the individual regressors are significant.
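
For reference, the general form of the RESET regression and its F-statistic is sketched below. The number of added fitted-value terms $k$ is not stated in the analysis, so it is left general; $\hat{y}_t$, $\gamma_j$, $SSR_r$ and $SSR_u$ (the restricted and unrestricted sums of squared residuals) are notation assumed here.

$$
y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \gamma_1 \hat{y}_t^{2} + \dots + \gamma_k \hat{y}_t^{k+1} + u_t, \qquad H_0: \gamma_1 = \dots = \gamma_k = 0
$$

$$
F = \frac{(SSR_r - SSR_u)/k}{SSR_u/(n - 3 - k)}
$$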

### Breusch-Godfrey Serial correlation LM Test

This test was used to detect serial correlation in the residuals of the linear regression. In the EViews LM test output table, the test statistic is denoted by Obs*R-squared. Its value and corresponding probability were 136.4770 and 0.000000 respectively. The test was conducted at the 5% level of significance. The null hypothesis was that there was no serial correlation. Since the p-value of the test was less than the level of significance, the null hypothesis was rejected. It was, therefore, concluded that serial correlation was present.
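
The Obs*R-squared statistic here comes from an auxiliary regression of the residuals on the original regressors and on lagged residuals. A sketch of the general form follows; the lag order $p$ is not stated in the analysis, and $\delta_j$, $\rho_j$ and $v_t$ are notation assumed for illustration.

$$
\hat{e}_t = \delta_0 + \delta_1 x_{1t} + \delta_2 x_{2t} + \rho_1 \hat{e}_{t-1} + \dots + \rho_p \hat{e}_{t-p} + v_t
$$

$$
LM = n \cdot R^{2}_{\text{aux}} \sim \chi^{2}(p) \quad \text{under } H_0: \rho_1 = \dots = \rho_p = 0
$$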

### Quandt-Andrews break test

The null hypothesis was that there were no breakpoints within the 15% trimmed data. The p-value of the Wald F-statistic was 0.0001, which was less than the 5% level of significance. This led to rejection of the null hypothesis, and it was concluded that there was at least one breakpoint within the 15% trimmed data.
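
In outline, the test computes a Chow-type breakpoint F-statistic at every candidate date that remains after trimming and takes the largest. The notation below ($F(\tau)$ for the statistic at candidate date $\tau$, and $\tau_1$, $\tau_2$ for the first and last candidate dates) is assumed here for illustration.

$$
\sup F = \max_{\tau_1 \le \tau \le \tau_2} F(\tau)
$$

Because the maximum is taken over many dates, the statistic does not follow a standard F distribution, which is why the decision is based on the reported p-value rather than on an F table.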

### Normality tests

The test of normality was done using Jarque-Bera test. The test was conducted on the residuals. The Jarque-Bera test statistic was 2909.711 and its corresponding probability was 0.000. The null hypothesis claimed that the residual series was normally distributed. Since the p-value was less than the level of significance (0.05), the null hypothesis was rejected (Mann, 2006). It was concluded that the residuals were not normally distributed.

## Regression analysis

Linear regression analysis was used to find out whether Spot Gold prices and Treasury bill returns can be used to predict the returns on Goldcorp’s share price. To mitigate the violations of the Ordinary Least Squares assumptions detected at the diagnostic stage, the variables were expressed as logarithmic returns. The regression equation estimated was as follows:

$$
\mathrm{RLRGC} = \beta_0 + \beta_1\,\mathrm{RLRSG} + \beta_2\,\mathrm{RLRLTT} + \varepsilon
$$

Where:

- RLRGC = Goldcorp logarithmic returns
- RLRSG = Spot Gold logarithmic returns
- RLRLTT = Treasury bill logarithmic returns
- ε = the stochastic (error) term
- β0 = the intercept of the regression line
- βi = the coefficients of the right-hand variables. Each indicates how much the dependent variable changes when the corresponding independent variable changes by one unit, holding the other constant.
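
The logarithmic returns used as variables could be computed from a price series as sketched below. The analysis does not spell out the formula, so this assumes the usual definition ln(P_t / P_{t-1}); the function name and the sample prices are hypothetical.

```python
import math

def log_returns(prices):
    """Return ln(P_t / P_{t-1}) for each consecutive pair of prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive to take logarithms")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only.
goldcorp_prices = [30.00, 30.60, 30.30, 31.20]
rlrgc = log_returns(goldcorp_prices)
print([round(r, 6) for r in rlrgc])
```

A series of n prices yields n - 1 returns, so the regression sample is one observation shorter than the price sample.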

The estimated equation obtained was as follows:

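$$
\widehat{\mathrm{RLRGC}} = -1.67 \times 10^{-5} + 1.543579\,\mathrm{RLRSG} + 0.010740\,\mathrm{RLRLTT}
$$

| | Intercept | RLRSG | RLRLTT |
|---|---|---|---|
| Coefficient | -1.67E-05 | 1.543579 | 0.010740 |
| Std. error | 0.000606 | 0.033906 | 0.021126 |
| t-statistic | -0.027597 | 45.52486 | 0.508373 |
| p-value | 0.9780 | 0.0000 | 0.6112 |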
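
The reported t-statistics and p-values can be cross-checked from the coefficients and standard errors. The sketch below is an illustration only: it uses the standard normal distribution as an approximation to the t distribution (reasonable with a sample this large), whereas regression software would normally use the t distribution itself.

```python
import math

# Coefficients and standard errors as reported in the estimated equation.
estimates = {
    "intercept": (-1.67e-05, 0.000606),
    "RLRSG": (1.543579, 0.033906),
    "RLRLTT": (0.010740, 0.021126),
}

def t_and_p(coefficient, std_error):
    """Return the t-ratio and a two-sided p-value (normal approximation)."""
    t_stat = coefficient / std_error
    # With a few thousand observations the t distribution is very close to
    # the standard normal, so 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, p_value

results = {name: t_and_p(b, se) for name, (b, se) in estimates.items()}
for name, (t_stat, p_value) in results.items():
    print(f"{name:>9}: t = {t_stat:9.4f}, p = {p_value:.4f}")
```

Small differences from the reported t-statistics are expected because the printed coefficients and standard errors are rounded.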

For each independent variable, the null hypothesis was that its coefficient was zero, that is, that the variable was insignificant. As a rule, if the p-value was less than the level of significance, the null hypothesis was rejected (Belle, 2008). The p-values showed that Spot Gold was a significant determinant of Goldcorp returns (p = 0.0000), while Treasury bill returns were not (p = 0.6112). It was concluded that Spot Gold can be used to predict Goldcorp returns. Both slope coefficients were positive, indicating that an increase in Spot Gold or Treasury bill returns was associated with an increase in Goldcorp returns equal to the corresponding coefficient per unit increase.

The value of the coefficient of determination (R²) was 44.0072%. This meant that 44.0072% of the variation in Goldcorp returns was jointly explained by Spot Gold and Treasury bill returns. The remaining variation (about 56%) was attributable to factors not included in the model (Nick, 2007). According to Gelman (2005), if such factors were included, the value of the coefficient of determination would increase.
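
The coefficient of determination can be computed from actual and fitted values as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the function name and the numbers are hypothetical and are not the study's data.

```python
def r_squared(actual, fitted):
    """Return 1 - SSR/SST for paired actual and fitted values."""
    mean_actual = sum(actual) / len(actual)
    residuals = [a - f for a, f in zip(actual, fitted)]  # actual minus fitted
    ssr = sum(e ** 2 for e in residuals)
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ssr / sst

# Hypothetical returns and fitted values, for illustration only.
actual = [0.010, -0.020, 0.015, 0.005, -0.010]
fitted = [0.008, -0.012, 0.010, 0.001, -0.007]
print(f"R-squared = {r_squared(actual, fitted):.4f}")
```

For a least-squares fit with an intercept, this quantity equals the share of the variation in the dependent variable that the regressors explain.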

## Conclusion

The study revealed that Spot Gold was a strong determinant of Goldcorp share prices, and it was concluded that Spot Gold can be used to predict them. Treasury bill returns were not found to be significant. A regression analysis was carried out using EViews software. Descriptive statistics were used to determine the characteristics of each data series, and a number of diagnostic tests were conducted on the residuals to assess whether the data were suitable for model estimation. The coefficients of the two independent variables were positive. This showed that, even though one of them was not statistically significant, an increase in either variable was associated with an increase in Goldcorp prices. The coefficient of determination showed that a large share of the variation was left unexplained, which suggests that relevant variables were left out of the model. If more variables are added to the model, the value of R² would increase. Goldcorp cannot, therefore, be predicted using Spot Gold alone.

## Recommendations

The analysis revealed that Spot Gold is a significant determinant of Goldcorp. The null hypothesis was rejected because the p-value of the t-statistic was less than the level of significance. According to the regression analysis, a one-unit increase in the Spot Gold logarithmic return is associated with an increase of 1.543579 in the Goldcorp logarithmic return. Spot Gold can be used to predict Goldcorp because it accounts for much of its variation. An analyst who wishes to estimate the value of Goldcorp at a given point in the future would need information about Spot Gold at that point and would insert it into the estimated regression equation.
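
Inserting values into the estimated equation could look like the sketch below. The coefficients are those reported in the regression analysis; the function name and the input values are hypothetical, and the result is a fitted logarithmic return rather than a price level.

```python
# Coefficients as reported in the estimated equation.
INTERCEPT = -1.67e-05
BETA_SPOT_GOLD = 1.543579
BETA_TREASURY = 0.010740

def predict_goldcorp_return(rlrsg, rlrltt):
    """Return the fitted Goldcorp log return for given regressor values."""
    return INTERCEPT + BETA_SPOT_GOLD * rlrsg + BETA_TREASURY * rlrltt

# Hypothetical inputs: spot gold log return of 1%, Treasury log return of 0.
predicted = predict_goldcorp_return(0.01, 0.0)
print(f"predicted Goldcorp log return: {predicted:.6f}")
```

Because the regressors are contemporaneous, such a prediction is conditional on knowing (or separately forecasting) the Spot Gold return for the same day.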

Treasury bill returns do not have a significant impact on Goldcorp and may not be necessary for its prediction. The null hypothesis could not be rejected because the p-value was greater than the level of significance. If Treasury bill returns increase by one unit, Goldcorp returns would increase by 0.010740, an effect that was found to be insignificant.

There is a need to include more variables in the regression model when predicting Goldcorp. The coefficient of determination showed that the two variables on the right-hand side of the regression equation accounted for about 44% of the variation in Goldcorp. The remaining 56% was attributable to factors not included in the model. Since one of the two variables is not significant, adding further relevant variables should improve the model's predictions of Goldcorp share prices.

## References

Belle, G 2008, Statistical rules of thumb, 2nd edn, Wiley, Hoboken.

Gelman, A 2005, ‘Analysis of variance: Why it is more important than ever’, The Annals of Statistics, vol. 33, no. 1, pp. 1–53.

Leech, L, Caplovitz, K & Morgan, G 2005, SPSS for Intermediate Statistics: Use and Interpretation, Psychology Press, London.

Lehmann, E 2005, Testing Statistical Hypotheses, John Wiley & Sons, New York.

Mann, S 2006, Introductory Statistics, 2nd edn, Wiley, New York.

Nick, G 2007, ‘Descriptive Statistics’, in WT Ambrosius (ed.), Topics in Biostatistics, Springer, New York, pp. 33–53.