
Applications of Geospatial Analysis:

Using Point Separation and Hypothesis Testing to Observe and Better Understand Celestial Objects
Background Introduction
Numerical analysis is seldom an aesthetically pleasing aspect of Geomatics, especially when compared with the numerous Geographic Information System (GIS) products that fill our everyday lives. In spite of this, basic geostatistical analysis continues to play an important role in discovery and understanding, and it has delivered tangible benefits ever since John Snow's spot map of London identified a spatial relation that led to a practical understanding of the transmission of cholera in the 1850s.
Figure 1: Relative coordinates of the galaxies within the Shapley Supercluster as observed by the UK Schmidt Telescope

Brian Bancroft, Student, University of Ottawa

Methods
This test was conducted entirely in R with the splancs package, whose documentation is available both within the program and online. For ghat, a sample block of code is provided; pasted directly into R with the correct packages loaded, it will reproduce what is seen on this poster, except for a couple of colored lines in the legend. The Monte Carlo analysis was conducted by adding the mean nearest neighbor distance of all the galaxies within the study area to 99 simulated means. Each simulated mean was obtained by randomly placing as many objects within the region of interest as there are galaxies in the cluster, then taking the mean of their nearest neighbor distances. A histogram of all the means was then created, with reference lines for both the expected mean and the observed nearest neighbor mean of the cluster, to give a direct comparison that could reject or fail to reject the null hypothesis.
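The Monte Carlo procedure described above can be sketched in base R. This is a minimal illustration rather than the code used for the poster (which relied on splancs): the function names `mean_nn` and `mc_nn_test` are hypothetical, the study region is assumed to be the bounding box of the points, the toy pattern is synthetic, and the lower-tail p-value shown is one common convention that may differ from the one used for Figure 3.

```r
# Minimal base-R sketch of a Monte Carlo test on the mean nearest neighbor
# distance. Function names are hypothetical; the poster itself used splancs.

# Mean nearest neighbor distance of an n x 2 coordinate matrix
mean_nn <- function(pts) {
  d <- as.matrix(dist(pts))   # pairwise Euclidean distances
  diag(d) <- Inf              # ignore each point's distance to itself
  mean(apply(d, 1, min))
}

# Compare the observed mean with nsim CSR patterns in the bounding box
mc_nn_test <- function(pts, nsim = 99) {
  n   <- nrow(pts)
  xr  <- range(pts[, 1])
  yr  <- range(pts[, 2])
  obs <- mean_nn(pts)
  sims <- replicate(nsim, mean_nn(cbind(runif(n, xr[1], xr[2]),
                                        runif(n, yr[1], yr[2]))))
  # Assumed lower-tail convention: a small value suggests clustering
  p_lower <- (1 + sum(sims <= obs)) / (nsim + 1)
  list(observed = obs, simulated = sims, p_lower = p_lower)
}

set.seed(1)
# Synthetic clustered pattern for illustration, NOT the Shapley data
centres <- cbind(runif(5), runif(5))
idx     <- sample(5, 100, replace = TRUE)
toy     <- centres[idx, ] + matrix(rnorm(200, sd = 0.01), ncol = 2)
res     <- mc_nn_test(toy)
```

Calling `hist(c(res$simulated, res$observed))` and marking `res$observed` with `abline(v = res$observed)` would give a histogram of the same kind as the one described above.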

To analyze phenomena such as this supercluster, pattern description, point density and point separation methods are all available. Standard distance, a multidimensional form of standard deviation, describes how far the galaxies are spread, while quadrat counts and kernel densities indicate how heavily populated a region is in terms of galaxies. Nearest neighbor techniques such as ghat and fhat look at the proportion of points whose nearest neighbor distance, measured either to another point (ghat) or from a random set of overlaid points (fhat), falls within a given distance. In this analysis, we want to determine whether these galaxies are clustered, randomly placed, or placed in what appears to be a grid-like pattern. In this project, the ghat technique will be used. A Monte Carlo p-value is also an effective tool, as it provides a direct way of testing a null hypothesis such as the one used in this study. It does so by comparing the observed mean nearest neighbor distance with that of some n complete spatial randomness (CSR) simulations, each of which randomly places as many events as there are objects in the study area, within the same region.

Moving from biology to physics, this poster intends to show how numerical techniques can demonstrate relations between celestial objects. This is important because each advance in our understanding of how objects in space interact leads to practical discoveries, ranging from gravitational models to advances in remote sensing. These methods matter to researchers because they give them the means to differentiate between what could lead to a discovery and what is insignificant.

Results
Through observation of both the ghat of the nearest neighbors of our point set and a Monte Carlo simulation, we can reject the null hypothesis (H0).

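# Load the packages: spatstat supplies the shapley data, splancs the point-pattern tools
library(spatstat)
library(splancs)
data(shapley)
# Bounding box of the observed galaxies
maxx = max(shapley$x)
minx = min(shapley$x)
maxy = max(shapley$y)
miny = min(shapley$y)
# Rectangular study region as a polygon, one corner per row
mypoly = cbind(c(minx, maxx, maxx, minx), c(miny, miny, maxy, maxy))
shapley.pts = as.points(shapley)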

Hypotheses
If someone were to guess that there is a positional relation between these galaxies, they would first need to state a null hypothesis, which effectively says that there is no relation between the positions of these galaxies, and then rule it out.
#CREATE A REGULAR AND CLUSTERED POINT PATTERN
xy.regular=gridpts(mypoly,npts(shapley.pts))
xy.pcpcluster=pcp.sim(0.5, 39.5, .0000000000001, mypoly)
#GET GHAT FOR EACH PATTERN
shapley.ghat=Ghat(shapley.pts,seq(0,2,0.1))
regular.ghat<-Ghat(xy.regular,seq(0,2,0.1))
pcpcluster.ghat<-Ghat(xy.pcpcluster,seq(0,2,0.1))
#PLOT GHAT AND ADD LINES FOR EACH
plot(seq(0,2,0.1), pcpcluster.ghat, type="l", col="red", lwd=3,
     main = "Ghat of Shapley Supercluster", bg = "white",
     xlab = "Separation Between Nearest Neighbours (Degrees)", ylab = "G")
lines(seq(0,2,0.1), shapley.ghat, col="green", lwd=3)
lines(seq(0,2,0.1), regular.ghat, col="blue", lwd=3)
#LEGEND COLOURS LISTED IN THE SAME ORDER AS THE LABELS
legend(1, 0.6,
       legend = c("Clustered Distribution of Points", "Shapley Supercluster Points", "A Grid Pattern of Points"),
       lty = 1, lwd = 3, col = c("red", "green", "blue"), adj = c(0, .6))

These two hypotheses are as follows:
H0: The positions of the galaxies within the Shapley Supercluster are no different from those produced by a complete spatial randomness (CSR) process.
H1: The galaxies are positioned in a pattern significantly different from that of a CSR process.
Figure 2: Kernel Density Map of the Shapley Supercluster (kernel = 0.85 degrees)
Figure 3: A histogram of 99 Monte Carlo simulations. For this observation, the Monte Carlo p-value is 0.99.

Data
The data used in this project represent a two-dimensional map of the Shapley Supercluster of galaxies, about 650 million light years from Earth. The data set is included in the spatstat package for R, a freely available statistical computing program. Each point represents a galaxy, observed through spectroscopy by the UK Schmidt Telescope (UKST) and plotted in two dimensions, with right ascension in degrees on the x-axis and declination in degrees on the y-axis. Some things can be understood immediately from this data. As seen in the Kernel Density Estimate (KDE) map, the mean center, at 201.59 by -31.54 degrees, lies in the upper left of one of the two main groupings of galaxies within the supercluster. The standard distance is about 5% of a degree, which indicates how tightly grouped these galaxies are on an x-y plane. The nearest neighbor distances of this cluster are compact. They range from nil to 0.804 degrees, which indicates that most of these galaxies are relatively closely grouped, assuming they lie on a two-dimensional plane.
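As an illustration of the two descriptive statistics quoted above, the mean center and the standard distance can be computed in a few lines of base R. This is a sketch under assumptions: `standard_distance` is a hypothetical helper, it divides by n (some texts use a different denominator), and the coordinates are toy values rather than the Shapley data.

```r
# Hypothetical helper: mean center and standard distance of 2-D points
standard_distance <- function(x, y) {
  cx <- mean(x)
  cy <- mean(y)
  # Root of the summed mean squared deviations in x and y (divides by n)
  sd_xy <- sqrt(mean((x - cx)^2) + mean((y - cy)^2))
  list(centre = c(x = cx, y = cy), standard_distance = sd_xy)
}

# Toy coordinates (not the Shapley data): corners of a 2 x 2 square
s <- standard_distance(c(0, 2, 2, 0), c(0, 0, 2, 2))
s$centre              # mean center at (1, 1)
s$standard_distance   # sqrt(2), since every corner is sqrt(2) from the center
```

For the real data, the same helper could be called with the x and y columns of the point set, in which case the result is in degrees.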

Conclusion
With these results, we can say that the galaxy cluster is in fact a cluster and that the null hypothesis (H0) can be rejected, provided these galaxies actually lie on a two-dimensional plane. However, as the universe exists in three dimensions, a study such as the one on this poster cannot prove whether the Shapley Supercluster's formation or nearest neighbor index can be replicated by a CSR process. It is nevertheless entirely possible to use these methods to answer the same questions, given:
1. That for each galaxy, a distance from the observatory is given with a small enough experimental error to allow the researcher to differentiate between the distances of the separate galaxies
2. That these angles and distances are projected into a Cartesian coordinate system
With the two conditions above met, nearest neighbor analysis could give a reliable and useful interpretation to better understand systems of galaxies, or other celestial objects on a smaller scale.
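The second condition, projecting angles and distances into a Cartesian system, can be sketched with the standard spherical-to-Cartesian conversion. The helper name `radec_to_xyz` is hypothetical, the inputs are toy values rather than measurements, and obtaining reliable distances in the first place (the first condition) is outside the scope of this sketch.

```r
# Hypothetical helper: right ascension and declination (degrees) plus a
# distance -> Cartesian x, y, z in the same unit as the distance
radec_to_xyz <- function(ra_deg, dec_deg, dist) {
  ra  <- ra_deg  * pi / 180   # degrees to radians
  dec <- dec_deg * pi / 180
  cbind(x = dist * cos(dec) * cos(ra),
        y = dist * cos(dec) * sin(ra),
        z = dist * sin(dec))
}

# Toy input, not measured values: three objects along the three axes
xyz <- radec_to_xyz(ra_deg = c(0, 90, 0), dec_deg = c(0, 0, 90), dist = c(1, 2, 3))

# Three-dimensional nearest neighbor distances on the projected points
d3 <- as.matrix(dist(xyz))
diag(d3) <- Inf
nn3 <- apply(d3, 1, min)
```

Once the points are in this form, the same nearest neighbor statistics used on the poster could be computed on three-dimensional separations instead of angular ones.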

$$\hat{G}(d) = \frac{\#\left(d_{\min}(s_i) \le d\right)}{n}$$
Ghat (G hat of d) is the number of points (si) in the study area whose nearest neighbor distance (dmin) is less than or equal to a given distance (d), divided by the total number of points (n) within the study region.
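The definition above can be checked by hand on a tiny example. The following base-R sketch is illustrative only: `ghat_simple` is a hypothetical function written for this purpose, not the splancs `Ghat` function used for the poster, and it applies no edge correction.

```r
# Hypothetical base-R version of the estimator defined above
ghat_simple <- function(pts, d) {
  dm <- as.matrix(dist(pts))
  diag(dm) <- Inf
  nn <- apply(dm, 1, min)                # d_min(s_i) for every point
  sapply(d, function(h) mean(nn <= h))   # share of points with d_min <= h
}

# Toy pattern: three points on a line at x = 0, 1 and 3
toy <- cbind(c(0, 1, 3), c(0, 0, 0))
g <- ghat_simple(toy, c(0.5, 1, 2))
g   # 0, 2/3 and 1
```

The nearest neighbor distances here are 1, 1 and 2, so no point has a neighbor within 0.5, two of the three have one within 1, and all have one within 2.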

Bibliography:
Images:
Background: http://www.ifa.hawaii.edu/info/press-releases/kocevski-1-06/Fig1grayscale300dpi.jpg
Galaxy map: http://www.atlasoftheuniverse.com/superc/shapley.html
R packages and documentation:
R Software: http://www.r-project.org/
Spatstat: http://www.spatstat.org
Splancs: http://cran.r-project.org/web/packages/splancs/index.html

Figure 4: A ghat analysis of the cluster dataset in contrast with a random set of points and a grid-like set of points over the same area
