In the field of transiting body astronomy, one thing we look for is high-energy pulses. These pulses can emit several things: sometimes light, sometimes matter we can see, and in very rare and special cases, matter we can't see directly. The neutrino is one such particle, and it is sometimes classed as a form of (hot) dark matter.
Neutrino detection is a large field, with observatories all over the world. There are observatories in Japan, Australia, and the one I will focus on, the IceCube observatory in Antarctica. With all the information we have about neutrinos, we can ask how, why, where, and when these events occur. The question I aim to answer is: are all neutrino detections made under the same conditions, or do detections not depend on the observatory's conditions? My hypothesis is that neutrino detections do not all occur under the same conditions, which would mean there is no 'wrong way' to observe them once a proper observatory is set up. The null and alternative hypotheses are:
\(H_0\): All \(\mu_i\) between observations are the same
\(H_a\): Not all \(\mu_i\) between observations are the same
To begin analyzing the data, some exploration and formatting is required. IceCube publishes its data as .txt files, so I converted them to CSV files by replacing whitespace with commas. Once those files are loaded, exploration can begin.
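The conversion step can be sketched in base R. This is a minimal illustration, not the script actually used: the helper names `ws_to_csv_line` and `txt_to_csv` and the file paths are hypothetical, and it assumes every non-blank line is a whitespace-separated row.

```r
# Hypothetical helper: turn one whitespace-delimited line into a comma-delimited one.
ws_to_csv_line <- function(line) {
  line <- trimws(line)             # drop leading and trailing whitespace
  gsub("[[:space:]]+", ",", line)  # collapse each run of whitespace into one comma
}

# Hypothetical helper: convert a whole file. Both paths are placeholders.
txt_to_csv <- function(txt_path, csv_path) {
  lines <- readLines(txt_path)
  lines <- lines[nzchar(trimws(lines))]  # skip blank lines
  converted <- vapply(lines, ws_to_csv_line, character(1), USE.NAMES = FALSE)
  writeLines(converted, csv_path)
  invisible(csv_path)
}

ws_to_csv_line("  54567.1   79.4\t7.64 ")
```

An alternative that avoids the conversion altogether would be to read the whitespace-separated file directly, for example with `readr::read_table()`.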
# Load the tidyverse (readr, dplyr, forcats), which the rest of the analysis relies on
library(tidyverse)
# Read the converted CSV files; show_col_types = FALSE silences the column-type message
aeff_data <- read_csv('Data/IceCube/aeffs.csv', show_col_types=FALSE)
event_data <- read_csv('Data/IceCube/events.csv', show_col_types=FALSE)
head(aeff_data)
## # A tibble: 6 × 4
## File `log10(Emin/GeV)` `log10(Emax/GeV)` `Aeff[m2]`
## <chr> <dbl> <dbl> <dbl>
## 1 Aeff_IC40.csv 2 2.1 0
## 2 Aeff_IC40.csv 2.1 2.2 0.0000242
## 3 Aeff_IC40.csv 2.2 2.3 0.000110
## 4 Aeff_IC40.csv 2.3 2.4 0.000392
## 5 Aeff_IC40.csv 2.4 2.5 0.000889
## 6 Aeff_IC40.csv 2.5 2.6 0.00262
summary(aeff_data)
## File log10(Emin/GeV) log10(Emax/GeV) Aeff[m2]
## Length:420 Min. :2.00 Min. :2.10 Min. : 0.000
## Class :character 1st Qu.:3.70 1st Qu.:3.80 1st Qu.: 3.437
## Mode :character Median :5.45 Median :5.55 Median : 236.500
## Mean :5.45 Mean :5.55 Mean : 761.100
## 3rd Qu.:7.20 3rd Qu.:7.30 3rd Qu.:1593.500
## Max. :8.90 Max. :9.00 Max. :3369.000
head(event_data)
## # A tibble: 6 × 6
## File MJD Ra_deg Dec_deg Unc_deg `log10(Ereco)`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 events_IC40.csv 54567. 79.4 7.64 0.48 3.19
## 2 events_IC40.csv 54582. 76.2 6.91 0.79 3.69
## 3 events_IC40.csv 54596. 80.2 5.93 1.13 3.32
## 4 events_IC40.csv 54600. 78.8 5.09 0.51 3.5
## 5 events_IC40.csv 54635. 79.0 3.88 0.46 4.63
## 6 events_IC40.csv 54666. 78.1 6.91 1.51 5.5
summary(event_data)
## File MJD Ra_deg Dec_deg
## Length:1257 Min. :54567 Min. :74.37 Min. :2.730
## Class :character 1st Qu.:55917 1st Qu.:76.28 1st Qu.:4.520
## Mode :character Median :57014 Median :77.47 Median :5.740
## Mean :56759 Mean :77.44 Mean :5.708
## 3rd Qu.:57586 3rd Qu.:78.64 3rd Qu.:6.870
## Max. :58057 Max. :80.35 Max. :8.690
## Unc_deg log10(Ereco)
## Min. :0.1500 Min. :1.58
## 1st Qu.:0.4900 1st Qu.:2.88
## Median :0.7400 Median :3.04
## Mean :0.9197 Mean :3.10
## 3rd Qu.:1.1500 3rd Qu.:3.26
## Max. :8.0300 Max. :5.50
We can now see that the data is formatted into tibbles whose File column holds the source filenames, which we can recode into short identifiers for the detector configuration each row belongs to.
# Factor recoding: replace each source filename with a short configuration label
event_data_recoded <- event_data |>
  mutate(File = factor(File)) |>
  mutate(File = fct_recode(File,
    "IC40" = "events_IC40.csv",
    "IC59" = "events_IC59.csv",
    "IC79" = "events_IC79.csv",
    "IC86a" = "events_IC86a.csv",
    "IC86b" = "events_IC86b.csv",
    "IC86c" = "events_IC86c.csv"))
aeff_data_recoded <- aeff_data |>
  mutate(File = factor(File)) |>
  mutate(File = fct_recode(File,
    "IC40" = "Aeff_IC40.csv",
    "IC59" = "Aeff_IC59.csv",
    "IC79" = "Aeff_IC79.csv",
    "IC86a" = "Aeff_IC86a.csv",
    "IC86b" = "Aeff_IC86b.csv",
    "IC86c" = "Aeff_IC86c.csv"))
head(event_data_recoded)
## # A tibble: 6 × 6
## File MJD Ra_deg Dec_deg Unc_deg `log10(Ereco)`
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 IC40 54567. 79.4 7.64 0.48 3.19
## 2 IC40 54582. 76.2 6.91 0.79 3.69
## 3 IC40 54596. 80.2 5.93 1.13 3.32
## 4 IC40 54600. 78.8 5.09 0.51 3.5
## 5 IC40 54635. 79.0 3.88 0.46 4.63
## 6 IC40 54666. 78.1 6.91 1.51 5.5
head(aeff_data_recoded)
## # A tibble: 6 × 4
## File `log10(Emin/GeV)` `log10(Emax/GeV)` `Aeff[m2]`
## <fct> <dbl> <dbl> <dbl>
## 1 IC40 2 2.1 0
## 2 IC40 2.1 2.2 0.0000242
## 3 IC40 2.2 2.3 0.000110
## 4 IC40 2.3 2.4 0.000392
## 5 IC40 2.4 2.5 0.000889
## 6 IC40 2.5 2.6 0.00262
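The recoding above lists every filename by hand. As a hedged alternative, the labels could be derived from the filenames with a regular expression. The helper name `config_label` is hypothetical, and the sketch assumes every filename follows the `events_<label>.csv` or `Aeff_<label>.csv` pattern seen in the tables.

```r
# Hypothetical helper: strip the prefix and extension from a filename,
# e.g. "events_IC86a.csv" becomes "IC86a".
config_label <- function(file_name) {
  sub("^(events|Aeff)_(.*)\\.csv$", "\\2", file_name)
}

files <- c("events_IC40.csv", "Aeff_IC86b.csv")
factor(config_label(files))
```

Filenames that do not match the pattern are returned unchanged, so an unexpected file would show up as an odd factor level rather than being silently dropped.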
Next, to explore the means of the data, I'll group each data frame by File and summarize it, then combine the two sets of means with mutate().
# Compute the per-File mean of every column and store the results in new tibbles.
# The anonymous function replaces the `...` form of across(), deprecated since dplyr 1.1.0.
mus_cond <- event_data_recoded |>
  group_by(File) |>
  rename("log10ereco" = `log10(Ereco)`) |>
  summarize(across(everything(), \(x) mean(x, na.rm = TRUE)))
mus_aeff <- aeff_data_recoded |>
  group_by(File) |>
  summarize(across(everything(), \(x) mean(x, na.rm = TRUE)))
# Both summaries have one row per File in the same order, so columns are attached by position
mus <- mus_cond |>
  mutate("log10emingev" = mus_aeff$`log10(Emin/GeV)`) |>
  mutate("log10emaxgev" = mus_aeff$`log10(Emax/GeV)`) |>
  mutate("aeffm2" = mus_aeff$`Aeff[m2]`)
mus # Display the mean tibble
## # A tibble: 6 × 9
## File MJD Ra_deg Dec_deg Unc_deg log10ereco log10emingev log10emaxgev
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 IC40 54753. 77.4 5.51 0.949 3.65 5.45 5.55
## 2 IC59 55176. 77.7 5.72 1.04 3.47 5.45 5.55
## 3 IC79 55523. 77.6 5.52 0.684 3.45 5.45 5.55
## 4 IC86a 55864. 77.4 5.65 0.857 3.08 5.45 5.55
## 5 IC86b 56604. 77.5 5.86 0.777 3.06 5.45 5.55
## 6 IC86c 57608. 77.3 5.68 1.02 2.97 5.45 5.55
## # ℹ 1 more variable: aeffm2 <dbl>
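Attaching columns by position works here because both summaries have one row per File in the same order. A more defensive variant joins on the File key instead. The sketch below uses small made-up data frames (`events_toy`, `aeff_toy`) and base R only; it is an illustration of the idea, not a replacement for the analysis above.

```r
# Made-up data: note the two frames list the files in a different order.
events_toy <- data.frame(
  File = c("IC40", "IC40", "IC59", "IC59"),
  Unc_deg = c(0.5, 1.5, 1.0, 2.0)
)
aeff_toy <- data.frame(
  File = c("IC59", "IC59", "IC40", "IC40"),
  aeff = c(10, 30, 1, 3)
)

# Per-File means of each frame
mu_events <- aggregate(Unc_deg ~ File, data = events_toy, FUN = mean)
mu_aeff <- aggregate(aeff ~ File, data = aeff_toy, FUN = mean)

# Join on the key, so row order in either input no longer matters
mus_toy <- merge(mu_events, mu_aeff, by = "File")
mus_toy
```

With dplyr, `left_join(mus_cond, mus_aeff, by = "File")` would play the same role as `merge()`.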
To analyze the neutrino data and determine whether detection conditions vary, I fit a one-way ANOVA for each variable with File as the grouping factor, testing the same hypotheses as before:
\(H_0\): All \(\mu_i\) between observations are the same
\(H_a\): Not all \(\mu_i\) between observations are the same
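For reference, the aov() calls that follow compare the variation between File groups with the variation within them. A sketch of the test statistic, where \(k\) (the number of groups) and \(N\) (the number of observations) are symbols introduced here for illustration:

\[
F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
\]

For the event data, \(k = 6\) and \(N = 1257\), so the statistic has \(5\) and \(1251\) degrees of freedom. A large \(F\) means the group means differ by more than the within-group scatter would explain.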
# We need to conduct tests on Ra, Dec, Unc, log10Ereco, log10(Emin/GeV), log10(Emax/GeV), and Aeff[m2]
# Each model is a one-way ANOVA of one variable against File
ra <- aov(Ra_deg ~ File, data=event_data_recoded)
dec <- aov(Dec_deg ~ File, data=event_data_recoded)
unc <- aov(Unc_deg ~ File, data=event_data_recoded)
ereco <- aov(`log10(Ereco)` ~ File, data=event_data_recoded)
emin <- aov(`log10(Emin/GeV)` ~ File, data=aeff_data_recoded)
emax <- aov(`log10(Emax/GeV)` ~ File, data=aeff_data_recoded)
aeff <- aov(`Aeff[m2]` ~ File, data=aeff_data_recoded)
There are a lot of variables, so I limited the output to the most relevant ones:
summary(ra) # Right Ascension
## Df Sum Sq Mean Sq F value Pr(>F)
## File 5 21.6 4.329 1.987 0.078 .
## Residuals 1251 2726.1 2.179
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(unc) # Angular Uncertainty
## Df Sum Sq Mean Sq F value Pr(>F)
## File 5 20.3 4.066 9.271 1.11e-08 ***
## Residuals 1251 548.7 0.439
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ereco) # Reconstructed Muon Energy
## Df Sum Sq Mean Sq F value Pr(>F)
## File 5 47.29 9.459 115.7 <2e-16 ***
## Residuals 1251 102.26 0.082
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
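As a cross-check on the three tables, the F values and p-values can be recomputed from the printed mean squares and degrees of freedom. This is a sketch using the rounded numbers copied from the output above, so the results will differ slightly from the full-precision values; the data frame name `anova_rows` is made up for the example.

```r
# Mean squares copied from the ANOVA tables (rounded as printed)
anova_rows <- data.frame(
  variable = c("Ra_deg", "Unc_deg", "log10(Ereco)"),
  ms_between = c(4.329, 4.066, 9.459),
  ms_within = c(2.179, 0.439, 0.082)
)
df_between <- 5     # 6 File groups minus 1
df_within <- 1251   # 1257 events minus 6 groups

# F is the ratio of between-group to within-group mean squares
anova_rows$F_value <- anova_rows$ms_between / anova_rows$ms_within
# Upper-tail probability of the F distribution gives the p-value
anova_rows$p_value <- pf(anova_rows$F_value, df_between, df_within, lower.tail = FALSE)
anova_rows$reject_at_5pct <- anova_rows$p_value < 0.05
anova_rows
```

At the 5% level this rejects the null hypothesis for Unc_deg and log10(Ereco) but not for Ra_deg, in line with the significance codes printed by summary().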
These are the three variables I consider most relevant to the detection of neutrino particles at the IceCube observatory. Angular uncertainty and reconstructed energy differ significantly between configurations (p < 0.001), while right ascension does not at the 5% level (p = 0.078).
Because the mean right ascension is statistically similar across configurations, we can infer that most of the neutrinos of this kind that the observatory detects in the future would come from the same direction, the mean of Ra_deg, or roughly 77 degrees of right ascension, with an angular uncertainty typically between 0.5 degrees and 1 degree. Note that this dataset was released for the analysis of TXS 0506+056, so the clustering in direction may partly reflect how the events were selected.
The log10(Ereco) value, which represents the reconstructed muon energy, gives a glimpse of what can be expected. The group means of log10(Ereco) lie between about 2.97 and 3.65. More information on the Muon Energy Proxy (MEP) can be found in Berghaus (2009). This is a relatively low MEP, so detections of neutrinos will likely be made when MEP values are low.
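Because the column stores a base-10 logarithm, the values need to be back-transformed before they can be read as energies. A minimal sketch, using the smallest and largest group means from the `mus` table and assuming the underlying Ereco unit is GeV (an assumption; the table itself does not state a unit):

```r
# Smallest and largest group means of log10(Ereco) from the `mus` table
log10_ereco <- c(low = 2.97, high = 3.65)

# Undo the log10 transform; the unit is assumed to be GeV
ereco <- 10^log10_ereco
round(ereco)
```

Under that assumption, the group means correspond to energies from somewhat under 1,000 to around 4,500, rather than to 3 to 3.65 in the same unit.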
Even though some variables, such as right ascension, do not differ significantly, we can reject the null hypothesis for angular uncertainty and reconstructed energy, meaning not all \(\mu_i\) are equal.
In the future, data like this can be used to identify where to put more neutrino detection experiments: anywhere that can see the region of sky around 77 degrees of right ascension. This information could also be used to build a basic alarm system that triggers when log10(Ereco) values inside a detector approach the range of roughly 3 to 3.65.
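The alarm idea can be sketched as a simple band check. Everything here is hypothetical: the function name `in_alert_band` and its default limits are illustrative approximations of the band discussed above, not tuned thresholds, and a real trigger would need far more than one variable.

```r
# Hypothetical alarm rule: flag events whose log10(Ereco) falls inside a band.
# Missing values are never flagged.
in_alert_band <- function(log10_ereco, lower = 3.0, upper = 3.65) {
  !is.na(log10_ereco) & log10_ereco >= lower & log10_ereco <= upper
}

# Example with three values from the first rows of the event table plus a missing value
in_alert_band(c(3.19, 3.69, 5.5, NA))
```

Keeping the limits as arguments makes it easy to try other bands, for example the interquartile range reported by summary(event_data).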
IceCube Collaboration (2018). IceCube data from 2008 to 2017 related to analysis of TXS 0506+056. Dataset. DOI:10.21234/B4QG92
Berghaus, P. (2009). Direct measurement of the atmospheric muon energy spectrum with IceCube. arXiv preprint arXiv:0909.0679.