Overview
This section involves comparing the difference between tradition and Bayesian simple kriging. Because of the complexity of doing Bayesian kriging by hand, the spBayes package in R were used. In addition to comparing the point estimates between the two methods, uncertainty was also analyzed for a certain point. The code written to get the following results can be found here.
Data Simulation
To simulate data, the grf() function from the geoR package was used. This function simulates 2D data from a Gaussian random field. 100 points were simulated from [0,1] with the mean of the process set to be 5 and nugget effect to be zero. An exponential covariance function was used with σ^2 = 2 and the range parameter, a = 0.15. The data can be see in Figure 1. Knowing the true σ^2 and range was useful when looking at the Bayesian estimates. For kriging, a single point was selected to krige on, which is seen in Figure 1 as the black asterisk.
Traditional Simple Kriging
Since the mean of the process was set when simulating and the data were generated with a specific covariance function, there was no need to fit a semi-variogram since the true parameters were known. An image of the simulated data with the single simple kriged estimate can be seen in Figure 2. The value of the estimate at this point was 5.953. Overall, the kriged estimate seemed to follow the spatial pattern seen in the simulated data. Later on, the uncertainty in the prediction was analyzed.
Bayesian Estimation/Kriging
When doing traditional simple kriging, the covariance matrices were calculated based on the way the data were simulated. In real life, it is very unlikely that one knows the true variance and range of the spatial process. Semi-variograms can help to estimate these parameters, but most of the time this is done based off of an eye test. This can lead to good results, but there is a lot of room for improvement. It is best to estimate these parameters using Bayesian methods since the data and our prior knowledge is taken into account. As stated at the beginning of this project, using Bayesian methods in the spatial space can get tricky. Luckily there are packages in R to help with this and eliminate the need to derive posteriors and likelihoods by hand. The package used in this analysis was spBayes, but there are others options, such as the krige.bayes() function.
The spBayes package performs a geostatistical analysis on the data while allowing for uncertainty for certain parameters. In the code, the assumption that the nugget effect was known to be zero was used, but priors were set on σ^2, the mean (β), and the range parameter. The function used to get the posteriors of these parameters was spLM(). There are three components of the spLM() function: starting, tuning, and the priors. The starting argument indicates where the starting value for each parameter should be. The tuning argument indicates the variance that should be used for each parameter in the Metropolis-Hastings normal proposal distributions. Lastly, the priors arguments take in the priors for each parameter, including β which represents the mean. A normal prior with a mean of 0 and variance of 100 was used for the mean. The large variance indicates that this is an uninformative prior. A uniform prior was set on the range parameter and an inverse gamma prior was set on the variance. A snapshot of the setup for this function can be seen in Figure 3. This setup also shows that 20,000 iterations of Metropolis-Hastings were conducted. Also note the spRecover() function. This function will clip out a burn in period, which is useful because it is common to take a few iterations to get to the bulk of the distribution.
After running this function, the output showed images of the posterior distributions of the parameters along with trace plots. Figure 4 shows the posterior distributions for the range (denoted phi in this package) and the variance. Overall, the trace plots seemed to converge, but there is a pretty wide peak when looking at the distribution of the range. Figure 5 shows the posterior distribution and trace plot of the intercept term, which in this case was just the mean of the process. The trace plot for the mean really converged well, and it seemed to be centered around the true mean, which makes sense. Both the range and variance parameters had posteriors that were centered a little higher than the true values that were used to generate the data. While spLM() ran, it showed the acceptance rate of the Metropolis-Hastings algorithm at different intervals. The overall acceptance rate was 44.38% which was really good. The good acceptance value for that algorithm is anywhere from 20%-60%, so an acceptance rate of 44.38% indicates that the prior distributions were good and the distributions were accurately explored.
After calculating the posterior models, the next step was to do prediction on the same point predicted on above. The spBayes package has a function called spPredict() which allows for spatial prediction. Figure 7 shows the setup of this function. All that was needed was the spLM object, the location to be predicted at, and the predicted point covariance. Since only one point was being predicted, the covariance was set to be one.
The output of the prediction function contained 20,000 samples using the posteriors generated above. A histogram of the samples at that particular point can be seen in Figure 6, which is known as the posterior predictive distribution. The mode of the function was approximately 5.462, which is pretty close to the 5.953 that was calculated by kriging the traditional way. Having a distribution of predicted values for one point is a big benefit of using Bayesian methods to krige. Doing traditional kriging, unless the covariance function parameters are changed, there is only one estimate. In real life, it is best to have multiple values and a distribution of possible predictions for one point.
Comparing Kriging Uncertainty Between the Two Methods
Above it was observed that both Bayesian and traditional kriging yielded very similar results. It was interesting to look at the uncertainty of the predictions to see which method performed better. A 95% confidence interval was created for traditional kriging using the formula seen in Figure 7 to find the standard error of the estimate. Seen there is a distribution of values for the point using Bayesian methods, the credible interval was much easier to get. All that was needed was to grab the specific quantiles of the distribution using the quantile() function. Table 1 shows 95% confidence/credible intervals comparisons for one point, and the results were very interesting. As expected, both point estimates were very close to each other, but the uncertainty intervals between the methods were a little different. The width of the confidence interval was about 4.35 units wide and the width of the credible interval was about 4.30 unis wide. Ever so slightly the credible interval was a bit more narrow. Although this may not seem like a big difference between the two methods, the way confidence intervals are interpreted are different between frequentists and Bayesians. A frequentist would say, “in repeated sampling, 95% of all confidence intervals would contain the true point estimate”. A Bayesian would say, “we are 95% confident the true value at this point given the data is between 3.918 and 8.217”. Overall, the Bayesian interpretation of a credible interval is what most people would want to know/conclude after doing prediction.
Method | 2.5% | Estimate | 97.5% |
Traditional Kriging | 3.780 | 5.953 | 8.126 |
Bayesian Kriging | 3.918 | 5.462 | 8.217 |
There is one more important thing to mention about uncertainty between traditional and Bayesian kriging. When traditional kriging is done, usually the maximum likelihood estimator (MLE) is used at the parameter in the covariance function. Although this is a good guess and may be close to the true parameters, it is set to be fixed in the model. Traditional kriging is not taking into account the uncertainty in the parameters used in the covariance function, it is only taking in the uncertainty in the distance between the points. There is a difference when looking at Bayesian kriging. When Bayesian kriging is done, the posterior predictive distribution in Figure 8 is what is trying to be found. Since this is almost never available analytically, it is estimated using the formula in Figure 9. Not only is the posterior predictive distribution taking into consideration the uncertainty in the distance between the points, but it is also integrating out θ, which means that it is taking into account the uncertainty in the parameters in the covariance matrix. So although the confidence/credible intervals in Table 1 are very similar, the Bayesian way of kriging is taking integrating out the uncertainty in θ, and this is not the case in traditional kriging.
Conclusions
When doing Bayesian kriging, it was nearly impossible to do by hand without making many assumptions. Even when making certain assumptions, the computational time was very long for a small amount of data. Thankfully, built in R packages and functions made the process easy and produced great results. The built in R functions to Bayesian krige took only a few seconds to run, compared to almost an hour when doing it by hand. When doing traditional and Bayesian simple kriging, the overall estimates between the two methods were very similar. Digging deeper into uncertainty proved that Bayesian kriging comes out on top. Not only are the credible intervals a little narrower, but the interpretation of them are more natural and easier for people to understand. Bayesian kriging also accounts for the fact that the true values of the covariance function are not known. Traditional kriging is definitely a great way to estimate a spatial process at new locations, but if you understand Bayesian statistics and want a more accurate estimate, Bayesian is the way to go.