Introduction

In the field of transiting body astronomy, one thing we look for is high-energy pulses. These pulses can emit several things: sometimes light, sometimes matter we can see, and in very rare and special cases, matter we can’t see. The neutrino, sometimes classed as a form of (hot) dark matter, is one such particle.

Neutrino detection is a large field, with observatories all over the world. There are observatories in Japan, Australia, and the one I will focus on, the IceCube observatory in Antarctica. With all the information we have about neutrinos, we need to ask how, why, where, and when these events occur. So here’s the question we’re aiming to answer: are all neutrino detections made under the same conditions, or are detections not reliant on the observatory’s conditions? My hypothesis is that neutrino detections do not all occur under the same conditions, meaning that there’s no ‘wrong way’ to observe them once a proper observatory is set up. The null and alternative hypotheses are:

\(H_0\): All \(\mu_i\) between observations are the same
\(H_a\): Not all \(\mu_i\) between observations are the same

Data Analysis

To begin analyzing the data, we need some exploration and formatting. IceCube publishes its data as .txt files, so I converted them to CSV files by replacing whitespace with commas. Once those files are loaded, exploration can begin.
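
The whitespace-to-comma conversion can be sketched in base R as follows. This is a minimal illustration, not necessarily the exact procedure used for this report: the helper name `whitespace_to_csv` and the three sample lines are hypothetical, and the real IceCube files may contain header or comment lines that need extra handling.

```r
# Hypothetical sample that mimics a whitespace-delimited text file
txt_lines <- c(
  "MJD        Ra_deg   Dec_deg",
  "54567.1    79.4     7.64",
  "54582.3    76.2     6.91"
)
txt_path <- tempfile(fileext = ".txt")
writeLines(txt_lines, txt_path)

# Hypothetical helper: collapse each run of whitespace into a single comma
whitespace_to_csv <- function(in_path, out_path) {
  lines <- trimws(readLines(in_path))   # drop leading/trailing blanks
  lines <- lines[nzchar(lines)]         # skip empty lines
  writeLines(gsub("[[:space:]]+", ",", lines), out_path)
  invisible(out_path)
}

csv_path <- tempfile(fileext = ".csv")
whitespace_to_csv(txt_path, csv_path)

converted <- read.csv(csv_path)
print(converted)
```

Trimming each line first matters: without it, leading whitespace would turn into a leading comma and shift every column by one.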

# Assumed setup: the tidyverse provides read_csv(), mutate(), and fct_recode() used below
library(tidyverse)

# Load the converted CSV files; show_col_types = FALSE keeps the output uncluttered
aeff_data <- read_csv('Data/IceCube/aeffs.csv', show_col_types=FALSE)
event_data <- read_csv('Data/IceCube/events.csv', show_col_types=FALSE)

head(aeff_data)
## # A tibble: 6 × 4
##   File          `log10(Emin/GeV)` `log10(Emax/GeV)` `Aeff[m2]`
##   <chr>                     <dbl>             <dbl>      <dbl>
## 1 Aeff_IC40.csv               2                 2.1  0        
## 2 Aeff_IC40.csv               2.1               2.2  0.0000242
## 3 Aeff_IC40.csv               2.2               2.3  0.000110 
## 4 Aeff_IC40.csv               2.3               2.4  0.000392 
## 5 Aeff_IC40.csv               2.4               2.5  0.000889 
## 6 Aeff_IC40.csv               2.5               2.6  0.00262
summary(aeff_data)
##      File           log10(Emin/GeV) log10(Emax/GeV)    Aeff[m2]       
##  Length:420         Min.   :2.00    Min.   :2.10    Min.   :   0.000  
##  Class :character   1st Qu.:3.70    1st Qu.:3.80    1st Qu.:   3.437  
##  Mode  :character   Median :5.45    Median :5.55    Median : 236.500  
##                     Mean   :5.45    Mean   :5.55    Mean   : 761.100  
##                     3rd Qu.:7.20    3rd Qu.:7.30    3rd Qu.:1593.500  
##                     Max.   :8.90    Max.   :9.00    Max.   :3369.000
head(event_data)
## # A tibble: 6 × 6
##   File               MJD Ra_deg Dec_deg Unc_deg `log10(Ereco)`
##   <chr>            <dbl>  <dbl>   <dbl>   <dbl>          <dbl>
## 1 events_IC40.csv 54567.   79.4    7.64    0.48           3.19
## 2 events_IC40.csv 54582.   76.2    6.91    0.79           3.69
## 3 events_IC40.csv 54596.   80.2    5.93    1.13           3.32
## 4 events_IC40.csv 54600.   78.8    5.09    0.51           3.5 
## 5 events_IC40.csv 54635.   79.0    3.88    0.46           4.63
## 6 events_IC40.csv 54666.   78.1    6.91    1.51           5.5
summary(event_data)
##      File                MJD            Ra_deg         Dec_deg     
##  Length:1257        Min.   :54567   Min.   :74.37   Min.   :2.730  
##  Class :character   1st Qu.:55917   1st Qu.:76.28   1st Qu.:4.520  
##  Mode  :character   Median :57014   Median :77.47   Median :5.740  
##                     Mean   :56759   Mean   :77.44   Mean   :5.708  
##                     3rd Qu.:57586   3rd Qu.:78.64   3rd Qu.:6.870  
##                     Max.   :58057   Max.   :80.35   Max.   :8.690  
##     Unc_deg        log10(Ereco) 
##  Min.   :0.1500   Min.   :1.58  
##  1st Qu.:0.4900   1st Qu.:2.88  
##  Median :0.7400   Median :3.04  
##  Mean   :0.9197   Mean   :3.10  
##  3rd Qu.:1.1500   3rd Qu.:3.26  
##  Max.   :8.0300   Max.   :5.50

We can now see that the data is loaded into tibbles whose File column holds filenames, which we can recode into short identifiers for each data sample.

# Factor recoding
event_data_recoded <- event_data |> 
  mutate(File = factor(File)) |> 
  mutate(File = fct_recode(File,
                           "IC40" = "events_IC40.csv",
                           "IC59" = "events_IC59.csv",
                           "IC79" = "events_IC79.csv",
                           "IC86a" = "events_IC86a.csv",
                           "IC86b" = "events_IC86b.csv",
                           "IC86c" = "events_IC86c.csv"))

aeff_data_recoded <- aeff_data |> 
  mutate(File = factor(File)) |> 
  mutate(File = fct_recode(File,
                           "IC40" = "Aeff_IC40.csv",
                           "IC59" = "Aeff_IC59.csv",
                           "IC79" = "Aeff_IC79.csv",
                           "IC86a" = "Aeff_IC86a.csv",
                           "IC86b" = "Aeff_IC86b.csv",
                           "IC86c" = "Aeff_IC86c.csv"))
  
head(event_data_recoded)
## # A tibble: 6 × 6
##   File     MJD Ra_deg Dec_deg Unc_deg `log10(Ereco)`
##   <fct>  <dbl>  <dbl>   <dbl>   <dbl>          <dbl>
## 1 IC40  54567.   79.4    7.64    0.48           3.19
## 2 IC40  54582.   76.2    6.91    0.79           3.69
## 3 IC40  54596.   80.2    5.93    1.13           3.32
## 4 IC40  54600.   78.8    5.09    0.51           3.5 
## 5 IC40  54635.   79.0    3.88    0.46           4.63
## 6 IC40  54666.   78.1    6.91    1.51           5.5
head(aeff_data_recoded)
## # A tibble: 6 × 4
##   File  `log10(Emin/GeV)` `log10(Emax/GeV)` `Aeff[m2]`
##   <fct>             <dbl>             <dbl>      <dbl>
## 1 IC40                2                 2.1  0        
## 2 IC40                2.1               2.2  0.0000242
## 3 IC40                2.2               2.3  0.000110 
## 4 IC40                2.3               2.4  0.000392 
## 5 IC40                2.4               2.5  0.000889 
## 6 IC40                2.5               2.6  0.00262
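
The recoding above lists every level by hand. As a possible alternative, the labels could be derived from the filenames with a pattern. The sketch below uses base R; the helper name `strip_file_label` is hypothetical, and it assumes every filename follows the `events_<label>.csv` or `Aeff_<label>.csv` pattern seen in the output above.

```r
# Example filenames in the style shown in the File column
files <- c("events_IC40.csv", "events_IC86a.csv", "Aeff_IC59.csv")

# Hypothetical helper: keep only the part between the prefix and ".csv"
strip_file_label <- function(x) {
  factor(sub("^(events|Aeff)_(.+)\\.csv$", "\\2", x))
}

labels <- strip_file_label(files)
print(labels)
```

A filename that does not match the pattern is returned unchanged, so unexpected inputs stay visible instead of being silently relabeled.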

Next, to explore the means of the data, I’ll group and summarize each data frame by its File label, then combine the results into a single tibble with mutate().

# Collect the per-file means of the event data in a new tibble
mus_cond <- event_data_recoded |>
  group_by(File) |> 
  rename("log10ereco" = `log10(Ereco)`) |> 
  summarize(across(everything(), \(x) mean(x, na.rm = TRUE)))

# Same per-file means for the effective-area data
mus_aeff <- aeff_data_recoded |> 
  group_by(File) |> 
  summarize(across(everything(), \(x) mean(x, na.rm = TRUE)))

mus <- mus_cond |> 
  mutate("log10emingev" = mus_aeff$`log10(Emin/GeV)`) |> 
  mutate("log10emaxgev" = mus_aeff$`log10(Emax/GeV)`) |> 
  mutate("aeffm2" = mus_aeff$`Aeff[m2]`)

mus # Display the mean tibble
## # A tibble: 6 × 9
##   File     MJD Ra_deg Dec_deg Unc_deg log10ereco log10emingev log10emaxgev
##   <fct>  <dbl>  <dbl>   <dbl>   <dbl>      <dbl>        <dbl>        <dbl>
## 1 IC40  54753.   77.4    5.51   0.949       3.65         5.45         5.55
## 2 IC59  55176.   77.7    5.72   1.04        3.47         5.45         5.55
## 3 IC79  55523.   77.6    5.52   0.684       3.45         5.45         5.55
## 4 IC86a 55864.   77.4    5.65   0.857       3.08         5.45         5.55
## 5 IC86b 56604.   77.5    5.86   0.777       3.06         5.45         5.55
## 6 IC86c 57608.   77.3    5.68   1.02        2.97         5.45         5.55
## # ℹ 1 more variable: aeffm2 <dbl>
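
The mutate() calls above attach the effective-area means by row position, which is correct only while both summaries list the files in the same order. A keyed join removes that dependence. The sketch below shows the idea in base R with made-up numbers: the two small data frames are hypothetical and are not IceCube values.

```r
# Hypothetical inputs; note that the two tables list the files in different orders
events <- data.frame(
  File    = c("IC40", "IC40", "IC59", "IC59"),
  Unc_deg = c(0.4, 0.6, 1.0, 1.2)
)
aeff <- data.frame(
  File = c("IC59", "IC59", "IC40", "IC40"),
  Aeff = c(10, 30, 1, 3)
)

# Per-file means for each table
event_means <- aggregate(Unc_deg ~ File, data = events, FUN = mean)
aeff_means  <- aggregate(Aeff ~ File, data = aeff, FUN = mean)

# Join on the File key so each row pairs values from the same file
mus_joined <- merge(event_means, aeff_means, by = "File")
print(mus_joined)
```

In a tidyverse pipeline, the same keyed combination is usually written with `left_join(mus_cond, mus_aeff, by = "File")`.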

Statistical Analysis

To analyze the neutrino data and determine whether detection conditions vary between data samples, I set the hypotheses as follows, where \(\mu_i\) is the mean of a given variable in sample \(i\):

\(H_0\): All \(\mu_i\) between observations are the same
\(H_a\): Not all \(\mu_i\) between observations are the same

# We need to conduct tests on Ra, Dec, Unc, log10Ereco, log10(Emin/GeV), log10(Emax/GeV), and Aeff[m2]

ra <- aov(Ra_deg ~ File, data=event_data_recoded)
dec <- aov(Dec_deg ~ File, data=event_data_recoded)
unc <- aov(Unc_deg ~ File, data=event_data_recoded)
ereco <- aov(`log10(Ereco)` ~ File, data=event_data_recoded)
emin <- aov(`log10(Emin/GeV)` ~ File, data=aeff_data_recoded)
emax <- aov(`log10(Emax/GeV)` ~ File, data=aeff_data_recoded)
aeff <- aov(`Aeff[m2]` ~ File, data=aeff_data_recoded)
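
Each aov() call above fits a one-way ANOVA, whose F statistic is the ratio of the between-group mean square to the within-group mean square. To make that ratio concrete, the sketch below recomputes F by hand on a small made-up data set and compares it with aov(). The values and group labels are hypothetical.

```r
# Hypothetical data: three groups of three observations each
y <- c(1, 2, 3, 2, 3, 4, 5, 6, 7)
g <- factor(rep(c("A", "B", "C"), each = 3))

k <- nlevels(g)   # number of groups
n <- length(y)    # total number of observations

group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - mean(y))^2)
ss_within  <- sum((y - ave(y, g))^2)   # ave() gives each value's group mean

# F = (SS_between / (k - 1)) / (SS_within / (n - k))
f_manual <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_manual <- pf(f_manual, k - 1, n - k, lower.tail = FALSE)

# Same statistic as reported by aov()
f_aov <- summary(aov(y ~ g))[[1]][["F value"]][1]
print(c(manual = f_manual, aov = f_aov))
```

The same ratio can be read off the summary tables in this report: for right ascension, 4.329 / 2.179 gives about 1.99, in line with the reported F value of 1.987.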

There are a lot of variables, so I limited the summaries to the most relevant ones.

summary(ra) # Right Ascension
##               Df Sum Sq Mean Sq F value Pr(>F)  
## File           5   21.6   4.329   1.987  0.078 .
## Residuals   1251 2726.1   2.179                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(unc) # Angular Uncertainty
##               Df Sum Sq Mean Sq F value   Pr(>F)    
## File           5   20.3   4.066   9.271 1.11e-08 ***
## Residuals   1251  548.7   0.439                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ereco) # Reconstructed Muon Energy
##               Df Sum Sq Mean Sq F value Pr(>F)    
## File           5  47.29   9.459   115.7 <2e-16 ***
## Residuals   1251 102.26   0.082                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

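These are the three variables most relevant to the detection of neutrino particles at the IceCube observatory. Angular uncertainty and reconstructed muon energy differ significantly between data samples (p < 0.001), while right ascension is only marginally significant (p = 0.078) and does not reach the 0.05 level.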
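
To make the reading of the Pr(>F) column explicit, the sketch below compares the three reported p-values against a significance level. The 0.05 threshold is an assumption, since the text does not state a level, and the value for log10(Ereco) is entered as its reported upper bound of 2e-16.

```r
# p-values copied from the ANOVA summaries above
p_values <- c(
  Ra_deg      = 0.078,
  Unc_deg     = 1.11e-08,
  log10_Ereco = 2e-16    # reported as "< 2e-16", so this is an upper bound
)

alpha <- 0.05            # assumed significance level

# TRUE where the null hypothesis of equal means is rejected at alpha
significant <- p_values < alpha
print(significant)
```

Under this threshold, the null hypothesis is rejected for angular uncertainty and reconstructed energy but not for right ascension.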

Conclusion

The variables most closely tied to detections at the IceCube observatory are positional: right ascension does not differ significantly between data samples (p = 0.078), and angular uncertainty typically stays between 0.5 and 1 degree. From this we can infer that most of the neutrinos the observatory detects in the future would come from the direction given by the mean of Ra_deg, roughly 77 degrees of right ascension, with an uncertainty between 0.5 degrees and 1 degree.

The log10(Ereco) value, which represents the reconstructed muon energy, gives a glimpse of what can be expected. The mean log10(Ereco) is likely between 3 and 3.64. More information on the muon energy proxy (MEP) can be found in Berghaus (2009). This is a relatively low MEP, so detections of neutrinos will likely be made when MEP values are low.
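
Because these figures are base-10 logarithms, it can help to convert them back to energies. The sketch below assumes, as the text implies, that Ereco is expressed in GeV. The variable names are illustrative only.

```r
# Range of log10(Ereco) quoted in the text
log10_ereco_range <- c(low = 3, high = 3.64)

# Undo the logarithm to get energies in GeV (assumed unit)
ereco_gev <- 10^log10_ereco_range

# Same energies in TeV
ereco_tev <- ereco_gev / 1000

print(round(ereco_gev))
print(round(ereco_tev, 2))
```

On that assumption, the quoted range corresponds to roughly 1 TeV to 4.4 TeV.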

Even allowing for the presence of correlated variables in the data, we can reject the null hypothesis for angular uncertainty and reconstructed muon energy, meaning not all \(\mu_i\) are equal.

Future Direction

In the future, data like this can be used to identify where to put more neutrino detection experiments: anywhere that can see the region of sky around 77 degrees of right ascension. This information can also be used to build a basic alarm system that triggers when log10(Ereco) values inside a detector approach the 3 to 3.64 range.
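
A basic alarm of this kind could start as a simple range check. The sketch below is only an illustration: the function name `in_alert_band` and its default bounds are hypothetical, and the sample readings are the first six log10(Ereco) values shown in the event data earlier in this report.

```r
# Hypothetical helper: flag readings whose log10(Ereco) falls inside the band
in_alert_band <- function(log10_ereco, lower = 3, upper = 3.64) {
  !is.na(log10_ereco) & log10_ereco >= lower & log10_ereco <= upper
}

# First six log10(Ereco) values from head(event_data)
readings <- c(3.19, 3.69, 3.32, 3.5, 4.63, 5.5)

alerts <- in_alert_band(readings)
print(data.frame(log10_ereco = readings, alert = alerts))
```

Missing readings are treated as no alert, so a gap in the data does not trigger the alarm.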

References

IceCube Collaboration. (2018). IceCube data from 2008 to 2017 related to analysis of TXS 0506+056 [Dataset]. DOI:10.21234/B4QG92

Berghaus, P. (2009). Direct measurement of the atmospheric muon energy spectrum with IceCube. arXiv preprint arXiv:0909.0679.