PIC_Follower_SGE

Author

Andrea N

Meta data (column description)

Animal - ID of the animal making the feeder visit (direct effect); renamed from ID.
Follower_ID - ID of the next animal to enter the feeder in the same social group (the follower).
ENTRY - Entry time at the feeder
EXIT - Exit time at the feeder
STAY_IN - Length of the visit, i.e., how long the animal stays at the feeder.
Social_Group - Social group identifier: pen number followed by the test start and off-test dates (ddmmyy).
L_time - Natural log of the time spent at the feeder (log of STAY_IN)
Hour_ENTRY - Hour of the day at entry
time_between - Time in seconds between an animal's exit and the follower's entry at the feeder
FEED_INTK - Feed intake per visit at the feeder
PEN - Pen number
sdL_time - L_time divided by its standard deviation
Follower_IDpe - Copy of Follower_ID, used for the permanent environment of the follower (social) effect
Animalpe - Copy of Animal, used for the permanent environment of the direct effect
Unique_Animal_Count - Number of unique animals in the social group
feeding_rec - Number of feeding records in the social group

Packages

library(tidyr)
library(broom)
library(lme4)

Loading required package: Matrix


Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

library(dplyr)


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

library(lme4)
library(lmerTest)


Attaching package: 'lmerTest'

The following object is masked from 'package:lme4':

    lmer

The following object is masked from 'package:stats':

    step

library(emmeans)
library(car)

Loading required package: carData


Attaching package: 'car'

The following object is masked from 'package:dplyr':

    recode

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ readr     2.1.5
✔ ggplot2   3.5.0     ✔ stringr   1.5.1
✔ lubridate 1.9.3     ✔ tibble    3.2.1
✔ purrr     1.0.2

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ Matrix::expand() masks tidyr::expand()
✖ dplyr::filter()  masks stats::filter()
✖ dplyr::lag()     masks stats::lag()
✖ Matrix::pack()   masks tidyr::pack()
✖ car::recode()    masks dplyr::recode()
✖ purrr::some()    masks car::some()
✖ Matrix::unpack() masks tidyr::unpack()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(corrplot)

corrplot 0.92 loaded

library(RColorBrewer)
library(ggplot2)
library(MASS)


Attaching package: 'MASS'

The following object is masked from 'package:dplyr':

    select

library(agricolae)
library(vegan)

Loading required package: permute
Loading required package: lattice
This is vegan 2.6-4

library(dplyr)
library(readr)
library(DT)
library(ggplot2)
library(quantreg)

Loading required package: SparseM

Attaching package: 'SparseM'

The following object is masked from 'package:base':

    backsolve

library(broom.mixed)
library(pedigree)
library(pedigreemm)
library(pedtools)

Read data

rm(list = ls())

setwd("C:\\Users\\anunez\\OneDrive - Iowa State University\\Desktop\\PIC_DataAnalysis_files")

data_PIC <- read.csv("FINAL_FIRE_2019_2023_AN.csv")

summary(data_PIC)

       ID                 LINE    PED_IDENT_SIRE     PED_IDENT_DAM     
 Min.   : 80406526   Min.   :65   Min.   :72321895   Min.   :70838758  
 1st Qu.: 84007778   1st Qu.:65   1st Qu.:79257130   1st Qu.:79153747  
 Median : 88690351   Median :65   Median :83589571   Median :84025461  
 Mean   : 89253749   Mean   :65   Mean   :83724294   Mean   :83988626  
 3rd Qu.: 93986447   3rd Qu.:65   3rd Qu.:87591160   3rd Qu.:88191389  
 Max.   :100603759   Max.   :65   Max.   :95485445   Max.   :96751344  
 LIT_LITTER_ID          PEN              TEST_FARM    ENTRY_TIME       
 Min.   :68967039   Length:1048575     Min.   :774   Length:1048575    
 1st Qu.:71920248   Class :character   1st Qu.:774   Class :character  
 Median :74483118   Mode  :character   Median :774   Mode  :character  
 Mean   :74566106                      Mean   :774                     
 3rd Qu.:77074566                      3rd Qu.:774                     
 Max.   :80351636                      Max.   :774                     
  EXIT_TIME            STAY_IN            FEED_INTK       FEEDER_ENTRY_WT
 Length:1048575     Min.   :        2   Min.   :-3892.0   Min.   :-993   
 Class :character   1st Qu.:      513   1st Qu.:  231.0   1st Qu.: 786   
 Mode  :character   Median :     1103   Median :  521.0   Median :1051   
                    Mean   :     3582   Mean   :  575.4   Mean   :1106   
                    3rd Qu.:     1757   3rd Qu.:  848.0   3rd Qu.:1399   
                    Max.   :665125948   Max.   : 9280.0   Max.   :9282   
 FEEDER_EXIT_WT     FEEDER_NO      START_DAY           END_DAY         
 Min.   :-993.0   Min.   : 1.00   Length:1048575     Length:1048575    
 1st Qu.: 414.0   1st Qu.:14.00   Class :character   Class :character  
 Median : 525.0   Median :31.00   Mode  :character   Mode  :character  
 Mean   : 530.8   Mean   :30.19                                        
 3rd Qu.: 647.0   3rd Qu.:46.00                                        
 Max.   :8182.0   Max.   :66.00

data_PIC <- mutate(data_PIC,
                   ENTRY_DATE = as_date(mdy_hm(ENTRY_TIME, tz = "UTC")),
                   ENTRY = mdy_hm(ENTRY_TIME, tz = "UTC"),
                   EXIT_DATE = as_date(mdy_hm(EXIT_TIME, tz = "UTC")),
                   EXIT = mdy_hm(EXIT_TIME, tz = "UTC"),
                   START_DAY = format(as.Date(START_DAY, format = "%d-%b-%y"), "%d%m%y"),
                   OFFTEST_DAY = format(as.Date(END_DAY, format = "%d-%b-%y"), "%d%m%y"))
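
The raw ENTRY_TIME strings are not shown in this report, so the sketch below assumes a hypothetical value of the form "5/24/2023 5:40" (month/day/year hour:minute), which is the layout mdy_hm() expects. It uses base R only and is meant to illustrate roughly what the parsing step produces; the variable names are illustrative.

```r
# Hypothetical raw timestamp; the real ENTRY_TIME layout is an assumption.
raw_entry <- "5/24/2023 5:40"

# Base R counterpart of lubridate::mdy_hm(raw_entry, tz = "UTC")
entry <- as.POSIXct(raw_entry, format = "%m/%d/%Y %H:%M", tz = "UTC")

entry_date <- as.Date(entry)                   # calendar date of the visit
entry_hour <- as.integer(format(entry, "%H"))  # hour of the day (0-23)

print(entry)
print(entry_date)
print(entry_hour)
```

Note that the "%d-%b-%y" format used for START_DAY and END_DAY relies on abbreviated month names, so parsing it may depend on the R session's locale.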

summary(data_PIC$PEN)

   Length     Class      Mode 
  1048575 character character

dim(data_PIC)

[1] 1048575      21

class(data_PIC)

[1] "data.frame"

data_PIC$PEN <- as.factor(data_PIC$PEN)

data_PIC$Social_Group <- paste(data_PIC$PEN, data_PIC$START_DAY, data_PIC$OFFTEST_DAY, sep = "")

head(data_PIC$Social_Group)

[1] "B0315181023181223" "B0315181023181223" "B0315181023181223"
[4] "B0315181023181223" "B0315181023181223" "B0315181023181223"
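
As a small illustration of how the Social_Group key is assembled, the sketch below rebuilds the first key printed above. The two dates are inferred from the key itself (18 October 2023 and 18 December 2023) and are an assumption, not values read from the raw file.

```r
pen       <- "B0315"
start_day <- as.Date("2023-10-18")  # test start date (inferred from the key)
end_day   <- as.Date("2023-12-18")  # off-test date (inferred from the key)

# Pen number followed by both dates written as ddmmyy, with no separator
social_group <- paste(pen,
                      format(start_day, "%d%m%y"),
                      format(end_day, "%d%m%y"),
                      sep = "")
print(social_group)
```

Because the key combines pen and test dates, the same pen used in different test periods yields different social groups.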

data_PIC <- group_by(data_PIC, Social_Group)
data_PIC

# A tibble: 1,048,575 × 22
# Groups:   Social_Group [309]
          ID  LINE PED_IDENT_SIRE PED_IDENT_DAM LIT_LITTER_ID PEN   TEST_FARM
       <int> <int>          <int>         <int>         <int> <fct>     <int>
 1 100554211    65       94531098      94402440      80310618 B0315       774
 2 100554211    65       94531098      94402440      80310618 B0315       774
 3 100554211    65       94531098      94402440      80310618 B0315       774
 4 100554211    65       94531098      94402440      80310618 B0315       774
 5 100554211    65       94531098      94402440      80310618 B0315       774
 6 100554211    65       94531098      94402440      80310618 B0315       774
 7 100554211    65       94531098      94402440      80310618 B0315       774
 8 100554211    65       94531098      94402440      80310618 B0315       774
 9 100554211    65       94531098      94402440      80310618 B0315       774
10 100554211    65       94531098      94402440      80310618 B0315       774
# ℹ 1,048,565 more rows
# ℹ 15 more variables: ENTRY_TIME <chr>, EXIT_TIME <chr>, STAY_IN <int>,
#   FEED_INTK <int>, FEEDER_ENTRY_WT <int>, FEEDER_EXIT_WT <int>,
#   FEEDER_NO <int>, START_DAY <chr>, END_DAY <chr>, ENTRY_DATE <date>,
#   ENTRY <dttm>, EXIT_DATE <date>, EXIT <dttm>, OFFTEST_DAY <chr>,
#   Social_Group <chr>

Arranging data by social group (SG)

data_PIC.arrange <- arrange(data_PIC, Social_Group, ENTRY, EXIT, .by_group = TRUE) %>%
  mutate(line = row_number())


data_PIC.arrange <- data_PIC.arrange %>%
  dplyr::select(ID, ENTRY, EXIT, Social_Group)

head(data_PIC.arrange)

# A tibble: 6 × 4
# Groups:   Social_Group [1]
        ID ENTRY               EXIT                Social_Group     
     <int> <dttm>              <dttm>              <chr>            
1 98782150 2023-05-24 05:40:00 2023-05-24 05:40:00 B0102240523240723
2 98782154 2023-05-24 05:48:00 2023-05-24 05:49:00 B0102240523240723
3 98782330 2023-05-24 05:58:00 2023-05-24 06:00:00 B0102240523240723
4 98782150 2023-05-24 06:03:00 2023-05-24 06:03:00 B0102240523240723
5 98753116 2023-05-24 06:05:00 2023-05-24 06:07:00 B0102240523240723
6 98782152 2023-05-24 07:13:00 2023-05-24 07:26:00 B0102240523240723

dim(data_PIC.arrange)

[1] 1048575       4

Creating the time between feeder visits and the log transformation, without filtering by interval length

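data_PIC <- data_PIC %>%
  arrange(Social_Group, ENTRY) %>%
  group_by(Social_Group) %>%
  mutate(Follower_ID = lead(ID),        # next animal to enter within the social group
         Follower_Time = lead(ENTRY),   # entry time of that follower
         Follower_Social_Group = lead(Social_Group),
         line = row_number(),
         Hour_ENTRY = hour(ENTRY),
         # seconds between this animal's exit and the follower's entry
         time_between = as.numeric(Follower_Time - EXIT, units = "secs")) %>%
  # keep non-negative gaps shorter than 36000 s (10 hours)
  filter(time_between < 36000, time_between >= 0)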

data_PIC

# A tibble: 1,038,897 × 28
# Groups:   Social_Group [309]
         ID  LINE PED_IDENT_SIRE PED_IDENT_DAM LIT_LITTER_ID PEN   TEST_FARM
      <int> <int>          <int>         <int>         <int> <fct>     <int>
 1 98782150    65       93543534      94380138      79399025 B0102       774
 2 98782154    65       93543534      94380138      79399025 B0102       774
 3 98782330    65       93561960      92841243      79408790 B0102       774
 4 98782150    65       93543534      94380138      79399025 B0102       774
 5 98753116    65       93513866      93085200      79399020 B0102       774
 6 98782152    65       93543534      94380138      79399025 B0102       774
 7 98782188    65       93704219      94286020      79385516 B0102       774
 8 98791463    65       93679672      94402421      79408797 B0102       774
 9 98782150    65       93543534      94380138      79399025 B0102       774
10 98793248    65       93679672      91289790      79408788 B0102       774
# ℹ 1,038,887 more rows
# ℹ 21 more variables: ENTRY_TIME <chr>, EXIT_TIME <chr>, STAY_IN <int>,
#   FEED_INTK <int>, FEEDER_ENTRY_WT <int>, FEEDER_EXIT_WT <int>,
#   FEEDER_NO <int>, START_DAY <chr>, END_DAY <chr>, ENTRY_DATE <date>,
#   ENTRY <dttm>, EXIT_DATE <date>, EXIT <dttm>, OFFTEST_DAY <chr>,
#   Social_Group <chr>, Follower_ID <int>, Follower_Time <dttm>,
#   Follower_Social_Group <chr>, line <int>, Hour_ENTRY <int>, …
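
To make the follower logic concrete, the sketch below applies the same idea to the six visits of social group B0102240523240723 shown earlier in this report. It uses base R, with a small helper lead1() (a hypothetical stand-in for dplyr::lead()), and a single group, whereas the report applies the step within every Social_Group.

```r
visits <- data.frame(
  ID = c(98782150L, 98782154L, 98782330L, 98782150L, 98753116L, 98782152L),
  ENTRY = as.POSIXct(c("2023-05-24 05:40:00", "2023-05-24 05:48:00",
                       "2023-05-24 05:58:00", "2023-05-24 06:03:00",
                       "2023-05-24 06:05:00", "2023-05-24 07:13:00"), tz = "UTC"),
  EXIT = as.POSIXct(c("2023-05-24 05:40:00", "2023-05-24 05:49:00",
                      "2023-05-24 06:00:00", "2023-05-24 06:03:00",
                      "2023-05-24 06:07:00", "2023-05-24 07:26:00"), tz = "UTC")
)

# Shift a vector up by one position and pad the end with NA
lead1 <- function(x) c(x[-1], NA)

visits <- visits[order(visits$ENTRY), ]
visits$Follower_ID   <- lead1(visits$ID)     # who enters next
visits$Follower_Time <- lead1(visits$ENTRY)  # when they enter
# Gap in seconds between this animal's exit and the follower's entry
visits$time_between  <- as.numeric(difftime(visits$Follower_Time, visits$EXIT,
                                            units = "secs"))

print(visits[, c("ID", "Follower_ID", "time_between")])
```

The last visit has no follower, so its time_between is NA; in the report such rows are dropped by the filter on time_between.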

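# Exploratory check only (result is not stored): entry-to-entry time,
# shown as a number of seconds and as a lubridate Period
data_PIC %>%
  mutate(time_between = as.numeric(Follower_Time - ENTRY, units = "secs"),
         lapse_Time = seconds(Follower_Time - ENTRY)) %>%
  dplyr::select(time_between, lapse_Time)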

Adding missing grouping variables: `Social_Group`

# A tibble: 1,038,897 × 3
# Groups:   Social_Group [309]
   Social_Group      time_between lapse_Time
   <chr>                    <dbl> <Period>  
 1 B0102240523240723          480 480S      
 2 B0102240523240723          600 600S      
 3 B0102240523240723          300 300S      
 4 B0102240523240723          120 120S      
 5 B0102240523240723         4080 4080S     
 6 B0102240523240723          900 900S      
 7 B0102240523240723         2280 2280S     
 8 B0102240523240723          240 240S      
 9 B0102240523240723         1020 1020S     
10 B0102240523240723          300 300S      
# ℹ 1,038,887 more rows

data_PIC$time_between <- as.numeric(data_PIC$time_between)

data_PIC_pvalues60 <- filter(data_PIC, time_between >= 0) %>%
  mutate(TIME_FEEDER = as.numeric(STAY_IN))
dim(data_PIC_pvalues60)

[1] 1038897      29

summary(data_PIC_pvalues60$TIME_FEEDER)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      2     520    1107    1220    1759   11206

dim(data_PIC_pvalues60)

[1] 1038897      29

data_PIC_pvalues60

# A tibble: 1,038,897 × 29
# Groups:   Social_Group [309]
         ID  LINE PED_IDENT_SIRE PED_IDENT_DAM LIT_LITTER_ID PEN   TEST_FARM
      <int> <int>          <int>         <int>         <int> <fct>     <int>
 1 98782150    65       93543534      94380138      79399025 B0102       774
 2 98782154    65       93543534      94380138      79399025 B0102       774
 3 98782330    65       93561960      92841243      79408790 B0102       774
 4 98782150    65       93543534      94380138      79399025 B0102       774
 5 98753116    65       93513866      93085200      79399020 B0102       774
 6 98782152    65       93543534      94380138      79399025 B0102       774
 7 98782188    65       93704219      94286020      79385516 B0102       774
 8 98791463    65       93679672      94402421      79408797 B0102       774
 9 98782150    65       93543534      94380138      79399025 B0102       774
10 98793248    65       93679672      91289790      79408788 B0102       774
# ℹ 1,038,887 more rows
# ℹ 22 more variables: ENTRY_TIME <chr>, EXIT_TIME <chr>, STAY_IN <int>,
#   FEED_INTK <int>, FEEDER_ENTRY_WT <int>, FEEDER_EXIT_WT <int>,
#   FEEDER_NO <int>, START_DAY <chr>, END_DAY <chr>, ENTRY_DATE <date>,
#   ENTRY <dttm>, EXIT_DATE <date>, EXIT <dttm>, OFFTEST_DAY <chr>,
#   Social_Group <chr>, Follower_ID <int>, Follower_Time <dttm>,
#   Follower_Social_Group <chr>, line <int>, Hour_ENTRY <int>, …

data_PIC_pvalues60 <- data_PIC_pvalues60 %>%
  mutate(L_time = log(TIME_FEEDER))
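
As a quick check of the transformation, the sketch below takes the STAY_IN values of the first five records shown in this report (7, 45, 122, 24, 131) and confirms that log() is the natural logarithm and that exp() recovers the original values. The variable names are illustrative.

```r
stay_in <- c(7, 45, 122, 24, 131)  # STAY_IN of the first five records in this report
l_time  <- log(stay_in)            # natural logarithm, as in log(TIME_FEEDER)

print(round(l_time, 6))
print(exp(l_time))                 # back-transformation to the original scale
```

These values agree with the L_time column printed for the same records at the end of the report.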

Checking how many feeding records are immediate and distant

# threshold in seconds ("umbral" is Spanish for threshold)
umbral <- 60

total_counts1 <- data_PIC_pvalues60 %>%
  mutate(time_between_group = case_when(
    time_between <= umbral ~ "immediate",
    time_between > umbral ~ "distant"
  )) %>%
  group_by(time_between_group) %>%
  summarise(total = n())  

print(total_counts1)

# A tibble: 2 × 2
  time_between_group  total
  <chr>               <int>
1 distant            331674
2 immediate          707223
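
The classification rule can be sketched on a few illustrative gaps (the time_between values below are made up for the example), together with the share of immediate records implied by the counts printed above.

```r
umbral <- 60                               # threshold in seconds, as in the report
time_between <- c(0, 60, 61, 480, 3960)    # illustrative gaps in seconds

# Gaps at or below the threshold are "immediate", longer gaps are "distant"
group  <- ifelse(time_between <= umbral, "immediate", "distant")
counts <- table(group)
print(counts)

# Share of immediate records, using the counts reported above
share_immediate <- 707223 / (707223 + 331674)
print(round(share_immediate, 2))
```

A gap of exactly 60 seconds falls in the immediate class because the comparison is `<=`.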

Checking number of social groups with no filter for interval time

head(data_PIC_pvalues60)

# A tibble: 6 × 30
# Groups:   Social_Group [1]
        ID  LINE PED_IDENT_SIRE PED_IDENT_DAM LIT_LITTER_ID PEN   TEST_FARM
     <int> <int>          <int>         <int>         <int> <fct>     <int>
1 98782150    65       93543534      94380138      79399025 B0102       774
2 98782154    65       93543534      94380138      79399025 B0102       774
3 98782330    65       93561960      92841243      79408790 B0102       774
4 98782150    65       93543534      94380138      79399025 B0102       774
5 98753116    65       93513866      93085200      79399020 B0102       774
6 98782152    65       93543534      94380138      79399025 B0102       774
# ℹ 23 more variables: ENTRY_TIME <chr>, EXIT_TIME <chr>, STAY_IN <int>,
#   FEED_INTK <int>, FEEDER_ENTRY_WT <int>, FEEDER_EXIT_WT <int>,
#   FEEDER_NO <int>, START_DAY <chr>, END_DAY <chr>, ENTRY_DATE <date>,
#   ENTRY <dttm>, EXIT_DATE <date>, EXIT <dttm>, OFFTEST_DAY <chr>,
#   Social_Group <chr>, Follower_ID <int>, Follower_Time <dttm>,
#   Follower_Social_Group <chr>, line <int>, Hour_ENTRY <int>,
#   time_between <dbl>, TIME_FEEDER <dbl>, L_time <dbl>

dim(data_PIC_pvalues60)

[1] 1038897      30

# Arrange the data and add a row number
data_PIC_arranged <- data_PIC_pvalues60 %>%
  arrange(Social_Group, ENTRY, EXIT, .by_group = TRUE) %>%
  mutate(line = row_number())
dim(data_PIC_arranged)

[1] 1038897      30

# Filter the data and select specific columns
data_PIC_filtered <- data_PIC_pvalues60 %>%
  filter(ID != Follower_ID) %>%
  dplyr::select(ID, Follower_ID, ENTRY, EXIT, STAY_IN, Social_Group, L_time, Hour_ENTRY, time_between, FEED_INTK, PEN)


data_PIC_pvalues60

# A tibble: 1,038,897 × 30
# Groups:   Social_Group [309]
         ID  LINE PED_IDENT_SIRE PED_IDENT_DAM LIT_LITTER_ID PEN   TEST_FARM
      <int> <int>          <int>         <int>         <int> <fct>     <int>
 1 98782150    65       93543534      94380138      79399025 B0102       774
 2 98782154    65       93543534      94380138      79399025 B0102       774
 3 98782330    65       93561960      92841243      79408790 B0102       774
 4 98782150    65       93543534      94380138      79399025 B0102       774
 5 98753116    65       93513866      93085200      79399020 B0102       774
 6 98782152    65       93543534      94380138      79399025 B0102       774
 7 98782188    65       93704219      94286020      79385516 B0102       774
 8 98791463    65       93679672      94402421      79408797 B0102       774
 9 98782150    65       93543534      94380138      79399025 B0102       774
10 98793248    65       93679672      91289790      79408788 B0102       774
# ℹ 1,038,887 more rows
# ℹ 23 more variables: ENTRY_TIME <chr>, EXIT_TIME <chr>, STAY_IN <int>,
#   FEED_INTK <int>, FEEDER_ENTRY_WT <int>, FEEDER_EXIT_WT <int>,
#   FEEDER_NO <int>, START_DAY <chr>, END_DAY <chr>, ENTRY_DATE <date>,
#   ENTRY <dttm>, EXIT_DATE <date>, EXIT <dttm>, OFFTEST_DAY <chr>,
#   Social_Group <chr>, Follower_ID <int>, Follower_Time <dttm>,
#   Follower_Social_Group <chr>, line <int>, Hour_ENTRY <int>, …

data_PIC_filtered

# A tibble: 998,001 × 11
# Groups:   Social_Group [306]
         ID Follower_ID ENTRY               EXIT                STAY_IN
      <int>       <int> <dttm>              <dttm>                <int>
 1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
 2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
 3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
 4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
 5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
 6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
 7 98782188    98791463 2023-05-24 07:28:00 2023-05-24 07:38:00     552
 8 98791463    98782150 2023-05-24 08:06:00 2023-05-24 08:07:00      53
 9 98782150    98793248 2023-05-24 08:10:00 2023-05-24 08:25:00     905
10 98793248    98793431 2023-05-24 08:27:00 2023-05-24 08:27:00      25
# ℹ 997,991 more rows
# ℹ 6 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>

dim(data_PIC_filtered)

[1] 998001     11

dim(data_PIC_pvalues60)

[1] 1038897      30

Checking that records where an animal follows itself were filtered out, and the frequency of social group sizes

unique_ids_per_group <- data_PIC_filtered %>%
  group_by(Social_Group) %>%
  summarise(Unique_IDs = n_distinct(ID),
            .groups = 'drop')  

print(unique_ids_per_group)

# A tibble: 306 × 2
   Social_Group      Unique_IDs
   <chr>                  <int>
 1 B0102240523240723         16
 2 B0104090519160719         15
 3 B0104180719250919         15
 4 B0104201218260219         14
 5 B0104240523240723         16
 6 B0104240621300821         14
 7 B0104280219070519         15
 8 B0106090519160719          2
 9 B0106090720140920         16
10 B0106130220130420         16
# ℹ 296 more rows

## no filter
unique_ids_per_group1 <- data_PIC_pvalues60 %>%
  group_by(Social_Group) %>%
  summarise(Unique_IDs = n_distinct(ID),
            .groups = 'drop')  

print(unique_ids_per_group1)

# A tibble: 309 × 2
   Social_Group      Unique_IDs
   <chr>                  <int>
 1 B0102240523240723         16
 2 B0104090519160719         15
 3 B0104180719250919         15
 4 B0104201218260219         14
 5 B0104240523240723         16
 6 B0104240621300821         14
 7 B0104280219070519         15
 8 B0106090519160719          2
 9 B0106090720140920         16
10 B0106130220130420         16
# ℹ 299 more rows

unique_ids_per_group <- data_PIC_filtered %>%
  group_by(Social_Group) %>%
  summarise(Unique_IDs = n_distinct(ID),
            .groups = 'drop') %>%
  # Calculate the mean of Unique_IDs across all groups
  summarise(Mean_Unique_IDs = mean(Unique_IDs))


print(unique_ids_per_group)

# A tibble: 1 × 1
  Mean_Unique_IDs
            <dbl>
1            14.4

table(unique_ids_per_group1$Unique_IDs)


  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16 
  3   3   1   1   3   2   3   6   1   3   3   6  19  39  92 124

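# unique_ids_per_group was overwritten above by the one-row mean summary, so it no
# longer has a Unique_IDs column; the call below therefore returns an empty table.
table(unique_ids_per_group$Unique_IDs)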

Warning: Unknown or uninitialised column: `Unique_IDs`.

< table of extent 0 >

data_PIC_filtered0 <- data_PIC_filtered %>%
  group_by(Social_Group) %>%
  summarise(Unique_Animal_Count = n_distinct(ID), 
            FEED_INTK_SUM = sum(FEED_INTK, na.rm = TRUE), .groups = 'drop')
# Scatter plot: total feed intake vs. number of unique animals per social group
ggplot(data_PIC_filtered0, aes(x = Unique_Animal_Count, y = FEED_INTK_SUM)) +
  geom_point() +
  geom_smooth()

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: pseudoinverse used at 16.07

Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: neighborhood radius 2.07

Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: reciprocal condition number 1.0661e-15

Warning in simpleLoess(y, x, w, span, degree = degree, parametric = parametric,
: There are other near singularities as well. 1

Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : pseudoinverse used at
16.07

Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : neighborhood radius
2.07

Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : reciprocal condition
number 1.0661e-15

Warning in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x
else if (is.data.frame(newdata))
as.matrix(model.frame(delete.response(terms(object)), : There are other near
singularities as well. 1

self_interaction_count <- data_PIC_pvalues60 %>%
  filter(ID == Follower_ID) %>%
  nrow()

self_interaction_count

[1] 40896
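
The row counts reported so far can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in this report; the variable names are illustrative.

```r
n_before <- 1038897  # rows in data_PIC_pvalues60
n_self   <- 40896    # rows where ID == Follower_ID
n_after  <- 998001   # rows in data_PIC_filtered

# Removing the self-following records should account for the whole difference
print(n_before - n_self)
print(n_before - n_self == n_after)
```

Because the difference matches exactly, the filter `ID != Follower_ID` appears to have removed only self-following records, which also suggests that no rows with a missing Follower_ID remained at this stage.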

Creating permanent environment (PE) columns for the direct and social effects

data_PIC_filtered

# A tibble: 998,001 × 11
# Groups:   Social_Group [306]
         ID Follower_ID ENTRY               EXIT                STAY_IN
      <int>       <int> <dttm>              <dttm>                <int>
 1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
 2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
 3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
 4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
 5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
 6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
 7 98782188    98791463 2023-05-24 07:28:00 2023-05-24 07:38:00     552
 8 98791463    98782150 2023-05-24 08:06:00 2023-05-24 08:07:00      53
 9 98782150    98793248 2023-05-24 08:10:00 2023-05-24 08:25:00     905
10 98793248    98793431 2023-05-24 08:27:00 2023-05-24 08:27:00      25
# ℹ 997,991 more rows
# ℹ 6 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>

data_PIC_filtered <- data_PIC_filtered %>%
  dplyr::rename(Animal = ID)

dim(data_PIC_filtered)

[1] 998001     11

head(data_PIC_filtered)

# A tibble: 6 × 11
# Groups:   Social_Group [1]
    Animal Follower_ID ENTRY               EXIT                STAY_IN
     <int>       <int> <dttm>              <dttm>                <int>
1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
# ℹ 6 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>

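data_PIC_filtered <- within(data_PIC_filtered, {
  Animalpe <- Animal                # copy of Animal for the direct PE effect
  Follower_IDpe <- Follower_ID      # copy of Follower_ID for the follower PE effect
  sdL_time <- L_time / sd(L_time)   # L_time scaled by its standard deviation
})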

dim(data_PIC_filtered)

[1] 998001     14
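
The scaling step can be checked against values printed at the end of this report. The sketch below takes L_time and sdL_time from four records (three from group B0102240523240723 and one from B1006300120310320) and recovers the standard deviation that was used as the divisor. The same divisor in both groups suggests sd(L_time) was computed over all records rather than within social group.

```r
# Values copied from the final data printout in this report
L_time   <- c(1.945910, 3.806662, 4.804021, 3.218876)
sdL_time <- c(1.592346, 3.115006, 3.931149, 2.634018)

implied_sd <- L_time / sdL_time  # divisor used for each record
print(round(implied_sd, 4))

# Toy check of the technique: dividing by sd() gives a variable with sd 1
x <- c(2.1, 3.4, 4.8, 5.0, 6.3)  # illustrative values
print(sd(x / sd(x)))
```

Note that this scaling does not center the variable; it only changes its units to standard deviations.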

data_PIC_filtered

# A tibble: 998,001 × 14
# Groups:   Social_Group [306]
     Animal Follower_ID ENTRY               EXIT                STAY_IN
      <int>       <int> <dttm>              <dttm>                <int>
 1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
 2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
 3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
 4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
 5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
 6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
 7 98782188    98791463 2023-05-24 07:28:00 2023-05-24 07:38:00     552
 8 98791463    98782150 2023-05-24 08:06:00 2023-05-24 08:07:00      53
 9 98782150    98793248 2023-05-24 08:10:00 2023-05-24 08:25:00     905
10 98793248    98793431 2023-05-24 08:27:00 2023-05-24 08:27:00      25
# ℹ 997,991 more rows
# ℹ 9 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>, sdL_time <dbl>,
#   Follower_IDpe <int>, Animalpe <int>

colnames(data_PIC_filtered)

 [1] "Animal"        "Follower_ID"   "ENTRY"         "EXIT"         
 [5] "STAY_IN"       "Social_Group"  "L_time"        "Hour_ENTRY"   
 [9] "time_between"  "FEED_INTK"     "PEN"           "sdL_time"     
[13] "Follower_IDpe" "Animalpe"

Filtering for social groups with 14, 15, or 16 animals

social_group_summary <- data_PIC_filtered %>%
  group_by(Social_Group) %>%
  summarise(
    Unique_Animal_Count = n_distinct(Animal),  # Count distinct IDs in each social group
    feeding_rec = n()                      # Count all records in each social group
  )

# joining this summary back to the original data to keep all columns
extended_data <- data_PIC_filtered %>%
  left_join(social_group_summary, by = "Social_Group")

# Filter the data where the Unique_Animal_Count is greater than 13 to keep 14,15,16 SG
filtered_data <- filter(extended_data, Unique_Animal_Count > 13)

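# feeding_rec is repeated on every row of a group, so this sum gives feeding_rec
# squared per social group (e.g., 3295^2 = 10857025), not the number of records kept.
final_summary <- filtered_data %>%
  summarize(Total_rec = sum(feeding_rec))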

final_summary

# A tibble: 255 × 2
   Social_Group      Total_rec
   <chr>                 <int>
 1 B0102240523240723  10857025
 2 B0104090519160719  21409129
 3 B0104180719250919  23242041
 4 B0104201218260219  13875625
 5 B0104240523240723   8826841
 6 B0104240621300821   8105409
 7 B0104280219070519  14085009
 8 B0106090720140920  14876449
 9 B0106130220130420  11437924
10 B0106170920161120  13749264
# ℹ 245 more rows
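
The Total_rec values above are the squares of the per-group record counts (for example, 3295 × 3295 = 10857025), because feeding_rec is repeated on every row before being summed. A minimal base R sketch of the effect, and of one possible way to get the total number of records instead, is shown below; the toy data frame and its names are illustrative.

```r
# Toy data: group A has 3 records, group B has 2
toy <- data.frame(Social_Group = c("A", "A", "A", "B", "B"))

# Per-group record count repeated on every row, like feeding_rec after the join
toy$feeding_rec <- ave(seq_along(toy$Social_Group), toy$Social_Group, FUN = length)

# Summing the repeated count within a group gives n * n
summed <- tapply(toy$feeding_rec, toy$Social_Group, sum)
print(summed)

# Keeping one row per group before summing gives the total number of records
per_group <- unique(toy[, c("Social_Group", "feeding_rec")])
total_rec <- sum(per_group$feeding_rec)
print(total_rec)

# The same check with a number from this report
print(3295^2)
```

In the report itself, the total number of records kept is given by dim(filtered_data).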

# Display the filtered data with all original columns
print(head(filtered_data))

# A tibble: 6 × 16
# Groups:   Social_Group [1]
    Animal Follower_ID ENTRY               EXIT                STAY_IN
     <int>       <int> <dttm>              <dttm>                <int>
1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
# ℹ 11 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>, sdL_time <dbl>,
#   Follower_IDpe <int>, Animalpe <int>, Unique_Animal_Count <int>,
#   feeding_rec <int>

# per-group sum of feeding_rec (see dim(filtered_data) for the total number of records)
print(final_summary)

# A tibble: 255 × 2
   Social_Group      Total_rec
   <chr>                 <int>
 1 B0102240523240723  10857025
 2 B0104090519160719  21409129
 3 B0104180719250919  23242041
 4 B0104201218260219  13875625
 5 B0104240523240723   8826841
 6 B0104240621300821   8105409
 7 B0104280219070519  14085009
 8 B0106090720140920  14876449
 9 B0106130220130420  11437924
10 B0106170920161120  13749264
# ℹ 245 more rows

dim(filtered_data)

[1] 879515     16

filtered_data

# A tibble: 879,515 × 16
# Groups:   Social_Group [255]
     Animal Follower_ID ENTRY               EXIT                STAY_IN
      <int>       <int> <dttm>              <dttm>                <int>
 1 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
 2 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
 3 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
 4 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
 5 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
 6 98782152    98782188 2023-05-24 07:13:00 2023-05-24 07:26:00     814
 7 98782188    98791463 2023-05-24 07:28:00 2023-05-24 07:38:00     552
 8 98791463    98782150 2023-05-24 08:06:00 2023-05-24 08:07:00      53
 9 98782150    98793248 2023-05-24 08:10:00 2023-05-24 08:25:00     905
10 98793248    98793431 2023-05-24 08:27:00 2023-05-24 08:27:00      25
# ℹ 879,505 more rows
# ℹ 11 more variables: Social_Group <chr>, L_time <dbl>, Hour_ENTRY <int>,
#   time_between <dbl>, FEED_INTK <int>, PEN <fct>, sdL_time <dbl>,
#   Follower_IDpe <int>, Animalpe <int>, Unique_Animal_Count <int>,
#   feeding_rec <int>

social_group_summary <- filtered_data %>%
  group_by(Social_Group) %>%
  summarise(
    Unique_IDs = n_distinct(Animal)  
  )

print(social_group_summary)

# A tibble: 255 × 2
   Social_Group      Unique_IDs
   <chr>                  <int>
 1 B0102240523240723         16
 2 B0104090519160719         15
 3 B0104180719250919         15
 4 B0104201218260219         14
 5 B0104240523240723         16
 6 B0104240621300821         14
 7 B0104280219070519         15
 8 B0106090720140920         16
 9 B0106130220130420         16
10 B0106170920161120         16
# ℹ 245 more rows

unique_ids_frequency <- table(filtered_data$Unique_Animal_Count)

print(unique_ids_frequency)


    14     15     16 
127262 320849 431404

ggplot(filtered_data, aes(x = Unique_Animal_Count, y = feeding_rec)) +
  geom_point() +
  geom_smooth()

`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Warning: Failed to fit group -1.
Caused by error in `smooth.construct.cr.smooth.spec()`:
! x has insufficient unique values to support 10 knots: reduce k.
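
The smoother fails because Unique_Animal_Count takes only three values (14, 15, 16) after filtering. One possible alternative, sketched below in base R, is to summarise feeding_rec by group size instead of fitting a curve. The data frame is hypothetical: only 3295 and 3387 come from this report, the other counts are made up for illustration.

```r
# One row per social group; most feeding_rec values here are hypothetical
groups <- data.frame(
  Unique_Animal_Count = c(14, 14, 15, 15, 16, 16),
  feeding_rec         = c(3000, 3200, 3400, 3600, 3295, 3387)
)

# Mean number of feeding records for each group size
size_summary <- aggregate(feeding_rec ~ Unique_Animal_Count, data = groups, FUN = mean)
print(size_summary)
```

Working from one row per social group also avoids weighting each group by its number of records.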

dim(filtered_data)

[1] 879515     16

num_unique_animals <- n_distinct(filtered_data$Animal)
num_unique_social_groups <- n_distinct(filtered_data$Social_Group)

num_unique_animals

[1] 3910

num_unique_social_groups

[1] 255

num_unique_animals_follower <- n_distinct(filtered_data$Follower_ID)
num_unique_animals_follower

[1] 3910

Creating the CSV data file

write.csv(filtered_data, "data_PIC_JP.csv", row.names = FALSE)

Loading data file

colnames(filtered_data)

 [1] "Animal"              "Follower_ID"         "ENTRY"              
 [4] "EXIT"                "STAY_IN"             "Social_Group"       
 [7] "L_time"              "Hour_ENTRY"          "time_between"       
[10] "FEED_INTK"           "PEN"                 "sdL_time"           
[13] "Follower_IDpe"       "Animalpe"            "Unique_Animal_Count"
[16] "feeding_rec"

data_PIC_JP <- read.csv("data_PIC_JP.csv")

library(data.table)


Attaching package: 'data.table'

The following objects are masked from 'package:lubridate':

    hour, isoweek, mday, minute, month, quarter, second, wday, week,
    yday, year

The following object is masked from 'package:purrr':

    transpose

The following objects are masked from 'package:dplyr':

    between, first, last

Data_loaded_PIC <- fread("data_PIC_JP.csv")

print(Data_loaded_PIC)

          Animal Follower_ID               ENTRY                EXIT STAY_IN
           <int>       <int>              <POSc>              <POSc>   <int>
     1: 98782150    98782154 2023-05-24 05:40:00 2023-05-24 05:40:00       7
     2: 98782154    98782330 2023-05-24 05:48:00 2023-05-24 05:49:00      45
     3: 98782330    98782150 2023-05-24 05:58:00 2023-05-24 06:00:00     122
     4: 98782150    98753116 2023-05-24 06:03:00 2023-05-24 06:03:00      24
     5: 98753116    98782152 2023-05-24 06:05:00 2023-05-24 06:07:00     131
    ---                                                                     
879511: 84461467    84461368 2020-04-05 06:50:00 2020-04-05 06:50:00      25
879512: 84461368    84468375 2020-04-05 07:00:00 2020-04-05 07:01:00     111
879513: 84468375    84461342 2020-04-05 07:36:00 2020-04-05 07:38:00     167
879514: 84461342    84478471 2020-04-05 07:41:00 2020-04-05 07:45:00     225
879515: 84478471    84461342 2020-04-05 07:50:00 2020-04-05 07:50:00       9
             Social_Group   L_time Hour_ENTRY time_between FEED_INTK    PEN
                   <char>    <num>      <int>        <int>     <int> <char>
     1: B0102240523240723 1.945910          5          480         0  B0102
     2: B0102240523240723 3.806662          5          540        -2  B0102
     3: B0102240523240723 4.804021          5          180        32  B0102
     4: B0102240523240723 3.178054          6          120         0  B0102
     5: B0102240523240723 4.875197          6         3960        31  B0102
    ---                                                                    
879511: B1006300120310320 3.218876          6          600         6  B1006
879512: B1006300120310320 4.709530          7         2100         0  B1006
879513: B1006300120310320 5.117994          7          180        -1  B1006
879514: B1006300120310320 5.416100          7          300      1423  B1006
879515: B1006300120310320 2.197225          7           60        -1  B1006
        sdL_time Follower_IDpe Animalpe Unique_Animal_Count feeding_rec
           <num>         <int>    <int>               <int>       <int>
     1: 1.592346      98782154 98782150                  16        3295
     2: 3.115006      98782330 98782154                  16        3295
     3: 3.931149      98782150 98782330                  16        3295
     4: 2.600613      98753116 98782150                  16        3295
     5: 3.989392      98782152 98753116                  16        3295
    ---                                                                
879511: 2.634018      84461368 84461467                  16        3387
879512: 3.853826      84468375 84461368                  16        3387
879513: 4.188074      84461342 84468375                  16        3387
879514: 4.432015      84478471 84461342                  16        3387
879515: 1.797997      84461342 84478471                  16        3387