Indiana National Guard Retention

Author: Ryann Laky

With the appointment of the Indiana National Guard’s latest Adjutant General, Major General R. Dale Lyles, the motto for the state has also changed: ‘People First’. Over the years, the whole of the United States Army has come to realize that its greatest asset is its people. However, given that the Army is an operations-based organization, it is often easy to forget to care for the greatest asset. With the INNG’s motto of ‘People First’, a few well-respected leaders were asked to give their definitions of Soldier Care:

 

“Soldier Care is a function of basic leadership at all levels. Leaders provide purpose, motivation, and direction. This applies equally to mission accomplishment as well as Soldier well-being. An engaged leader must genuinely care to ensure that all aspects of a Soldier’s life are in balance so they may offer maximum effort toward their military task.”

– Major General Tim Winslow, Indiana Army National Guard, 81st Troop Command, Commanding General

 

“To care for Soldiers is to create an environment of living, learning, and training such that it gives the Soldier the greatest opportunity to thrive unrestricted [to] become the best and most lethal person that they could possibly be; and to identify, build on, and reinforce their strengths, and identify, reduce, and eliminate their weaknesses [to] allow that individual to do their part when the time comes to ensure mission success.”

– Sergeant Major Mike Vogt, Indiana Army National Guard, 76th Infantry Brigade Combat Team

 

“Soldier care is taking care of the whole Soldier: making sure they have what they need to do their job now, providing the mentorship and training to do their job in the future, and ensuring they are physically, medically, and mentally prepared. This includes taking care of the individual personal needs.”

– Lieutenant Colonel Bryan Peterson, Indiana Army National Guard, 76th Infantry Brigade Combat Team

 

Mission: The Indiana National Guard develops the most talented Soldiers and Airmen, in the best led teams, to fight and win our Nation’s wars and serve the Hoosier state when called upon.

Vision: The Indiana National Guard will be the premier community-based military force for state and international missions, by putting our people first.

 
 

Domains I & II: Business Understanding & Analytic Approach


 

The INNG has a dedicated personnel section, much like human resources in the civilian sector, called J1. However, J1 covers much more than just human resources; they handle various pay issues, personnel actions (such as changes of address), recruiting, retention, resolution of medical issues, and various training. One of the biggest problems the INNG faces is the retention of talent. Service Members usually join the Guard for education benefits, to gain skills, and to travel the world at low cost – all benefits for Service Members in a weakened economy. As Service Members stay in the Guard, they gain a set of indispensable skills and talents that translate and market well to the civilian market, pulling them away from the Guard and making the Guard more of an inconvenience than an asset for the individual. Retention has posed a major problem for the INNG, especially as the need for more specialized occupations increases and societal interest in a part-time military career decreases. Therefore, J1 must work harder to retain this talent by caring for Service Members and putting them and their families first.

 

Operational Definitions


Before addressing retention, there are a handful of definitions that will be used moving forward:

  • Absent Without Leave (AWOL) – any Soldier who has taken an unauthorized leave from his/her training or duty station is considered AWOL. Additional stipulations can be found here.

  • Civilian Education Level (CIVED) – the highest level of civilian education received by the Soldier at the date of the data pull.

  • Date of Rank (DOR) – Promotion date for the Soldier’s current rank.

  • Department of the Army (DOA) – Military Department within the United States Department of Defense.

  • Director’s Personnel Readiness Overview (DPRO) – the Army’s primary database and personnel readiness data collection tool that pulls from various personnel data sources, to include information regarding Soldier demographics, promotion periods, sources of enlistment/commission, etc.

  • Expiration Term of Service (ETS) – colloquially refers to an enlisted Soldier’s last date in service; the term is used interchangeably for enlisted Soldiers and commissioned Officers.

  • Fiscal Year (FY) – financial planning year for the US military, spanning from the first day of October through the last day of the following September; as an example, FY23 starts in October 2022 and ends in September 2023.

  • J1 – Director of Manpower and Personnel; equivalent of a Director of Human Resources in the private sector; J implies joint.

  • Joint – implies an operation or function including multiple branches of service, to include US entities and multinational agencies; in this case, it refers to the joining of the US Army and the US Air Force.

  • Initial Entry Training (IET) – formerly known as Basic Training, this is the program of physical and mental training required for an individual to become a Soldier in the United States Army, Army Reserve, or Army National Guard; both officers and enlisted Soldiers must complete IET to be considered qualified for entry into a formation.

  • Mandatory Removal Date (MRD) – the date at which a commissioned officer is required to be removed from service, usually on the date they reach 60 years of age or later (pending waivers).

  • Military Education Level (MILED) – the highest level of military education received by the Soldier at the date of the data pull.

  • Military Occupational Specialty (MOS) – a coded job for each service member, specific to their duties and assignments aligned within the military’s organizational structure.

  • National Guard Bureau (NGB) – the federal instrument responsible for the administration of the National Guard established by the United States Congress as a joint bureau of the Department of the Army and the Department of the Air Force, created by the Militia Act of 1903.

  • Pay Entry Base Date (PEBD) – the date at which the Soldier signed their first contract, whether commissioned or enlisted.

  • Soldier Care – a relatively new concept in the Army, it is the holistic approach to Soldier health and wellbeing and is defined differently by leaders across the United States Army.

  • Source of Commissioning (SOC) – the source through which a Soldier commissioned (e.g., Direct Commission, Officer Candidate School, Reserve Officer Training Corps).

  • Source of Enlistment (SOE) – the source through which a Soldier enlisted (e.g., Voluntary Enrollment, Voluntary Enlistment).

  • The Adjutant General (TAG) – the highest position in any state’s National Guard; formally politically appointed by the state’s governor, but with jurisdiction over all assets within a state’s National Guard.

  • Training Pay Category (TPC) – high-level generic code for pay status the Soldier is in, particularly as it pertains to their status of IET to become qualified in the position they hold.

  • War Fighting Function (WFF) – broad coverage of MOSs or series numbers categorized into the overall function in which they fall, to include Command and Control, Movement and Maneuver, Intelligence, Fires, Sustainment, and Protection.
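The fiscal year definition above (FY23 runs from October 2022 through September 2023) can be written as a small date rule. The sketch below is illustrative only; `fiscal_year()` is a hypothetical helper name and is not part of DPRO or of any package used in this project.

```r
# Map a calendar date to the US military fiscal year (FY) it falls in.
# fiscal_year() is a hypothetical helper, named here for illustration only.
fiscal_year <- function(date) {
  date  <- as.Date(date)
  year  <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  # October through December belong to the following calendar year's FY.
  ifelse(month >= 10L, year + 1L, year)
}

# First and last day of FY23, per the definition above.
fiscal_year(c("2022-10-01", "2023-09-30"))
```

Both dates in the example fall in FY23, while 30 September 2022 would still belong to FY22.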

 

Phased Approach


Because of the stratification of the J1, the response to the business problem has been phased to best serve the INNG, as follows:

  • Phase I: Gather and analyze retention data through DPRO data extraction from INNG Service Members to identify the generic profile of a non-retained Soldier. This project will primarily focus on this phase.

  • Phase II: Use this generic profile of a non-retained Soldier to address potential solutions in retaining this type of Soldier.

  • Phase III: Identify trends in self-identified justifications for Soldiers leaving the INNG, and pair those responses with profiles pulled from the DPRO data extraction.

  • Phase IV: Use this trend in self-identified attributes of non-retained Soldiers to address potential solutions in retaining this type of Soldier.

  • Phase V: Develop and implement a Service Member re-enlistment survey (much like the exit survey) to allow Service Members the opportunity to self-identify their reasons for staying in the INNG.

 

There are a handful of key questions that can be answered to help feed this phased approach in tackling retention, such as:

  • What, if any, indicators exist in a Soldier’s DPRO profile (i.e., sex, rank, MOS, etc.) to better determine which Soldiers are more likely to get out? This is largely dependent on consistencies between Soldiers within the data provided.

  • What are some quantifiable justifications that affect whether they get out (pay, education benefits, rank stagnancy, etc.)? This would further shape the efforts of the Indiana National Guard to retain Soldiers via tangible actions in their careers.

  • What factors are being identified within that retention window and being used to shape the Soldier’s exit interview? In other words, are the opinions shared in the exit survey reflective of their DPRO profile? This would provide legitimacy (or lack thereof) to the survey used when Soldiers leave the INNG.

  • What factors cause Soldiers to stay in the Guard (based on what career markers exist within a Soldier’s profile who does re-enlist prior to their contract end-date)? This is also largely dependent on consistencies between Soldiers within the data provided.

 

Key Stakeholders


To the average person, retention in the INNG is not a priority. In fact, most would argue that even members of the INNG are not worried about retention unless it is their own and they are approaching their ETS window. However, leaders across the National Guards of all states are interested in retention of their Soldiers. Without people (again, the greatest asset), the INNG and other states cannot operate. To assess the criticality of stakeholders to this project, the image to the right is used to generate a scale on which to prioritize stakeholders, with quadrants addressed from left-to-right, top-to-bottom.

  • Quadrant 1 (High Power, Low Interest): require some attention and must be satisfied

  • Quadrant 2 (High Power, High Interest): require the most attention and must be managed closely

  • Quadrant 3 (Low Power, Low Interest): require minimal attention and must be monitored with minimal effort

  • Quadrant 4 (Low Power, High Interest): require some attention and must be informed

 

Some key stakeholders of this project are outlined below, in order of their priority according to the image on the right:

  1. TAG of the INNG: With his ‘People First’ strategy and the amount of power he holds in the state, he is the greatest stakeholder in this project. He specifically requested formal regular updates on this project as it progresses to redirect his staff to assist and to reallocate resources for further research. The retention of Soldiers in the INNG also greatly affects his annual budget allotments sanctioned through NGB. TAG is assessed to have both high power and high interest.

  2. J1 of the INNG: This is the primary section responsible for pulling data that contributes to this project and the primary implementation group, pending the results. While they are under the authority of TAG, they hold the main power (with about 50 employees) for assistance with and implementation of this project. INNG’s J1 is assessed to have both high power and high interest, but less than that of TAG.

  3. Soldiers and Leaders of the INNG: While most Soldiers and Leaders are unaware of this project, they all hold power over this project in the tactical implementation of its results, their implementation of Soldier Care, and the effects they can have on this project moving forward. Soldiers and Leaders are assessed to have some power and low interest.

  4. The General Public: The general public is not aware of this project, but if the INNG can use this project to better shape its retention, that will leave the INNG with a sustained and experienced force ready to assist, protect, and defend the civilian populace of Indiana. The general public is assessed to have low power and low interest.

  5. DOA, NGB, all 54 States/Territories: While no individuals at NGB, DOA, or other states are aware of this ongoing project, this project could have impacts that may be worth implementing at their levels if it proves fruitful. Retention is an issue across all of DOA, to include NGB and all 54 states/territories. If this retention project and analysis affects the state appropriately, the results could benefit all leaders across all echelons. As a group, these are assessed to have high, but not applicable, power and very low interest. This is all contingent on the outcome of and the feasibility to implement the project.

 

Domain III: Data Understanding & Preparation


Below is information regarding the raw data files pulled from DPRO, including their specific number of rows and columns. Because this data is fragmented, the data understanding tables are the same (where duplicated) as labeled in the headers. Everything from this point forward is considered unclassified but for official use only (with limited distribution) – it is not to be used for public release.

 

Data Overview


Fiscal Year Losses: These tables contain the Soldier profiles for those not retained at the end of a FY. There are a total of six tables (by FY). This data can be used to compare qualities or attributes of Soldiers that stayed in versus those who exited the guard between FYs. This data contains basic Soldier information such as unit, months in current grade and MOS, gender, race, ethnicity, and other demographic data. It also includes the type of Soldier (whether enlisted, warrant officer, or commissioned officer) and basic unit information. These tables also contain information on why the Soldier exited the INNG, according to codes (including descriptions), as determined by the INNG J1 (i.e., medical retirement, regular retirement, adverse action, etc.).

  • Purpose: provide INNG with factual profile-based quantitative and qualitative data for state losses
  • Composition:
    • FY17 25 columns x 1,454 rows
    • FY18 25 columns x 1,823 rows
    • FY19 25 columns x 1,913 rows
    • FY20 25 columns x 1,957 rows
    • FY21 25 columns x 1,299 rows
    • FY22 25 columns x 1,039 rows
  • Limitations: DPRO has limitations in outputting large amounts of accurate data, thus this data was pulled by FYs to reduce the margin for error in production; losses should not be duplicated between FYs; these columns were initially selected to provide basic information on Soldiers for qualities suspected of having an impact on Soldier retention

 

Fiscal Year Strength: These tables contain basic Soldier information for the entire starting strength for the FY, which also includes the Soldiers that were lost at the end of a fiscal year. In contrast to the losses data, this strength data does not indicate codes for exiting the INNG. However, this does contain additional information regarding ETS or MRD.

  • Purpose: provide INNG with factual profile-based quantitative and qualitative data for state strength
  • Composition:
    • FY17 15 columns x 11,472 rows
    • FY18 15 columns x 11,708 rows
    • FY19 15 columns x 11,607 rows
    • FY20 15 columns x 11,078 rows
    • FY21 15 columns x 10,491 rows
    • FY22 15 columns x 10,704 rows
  • Limitations: DPRO has limitations in outputting large amounts of accurate data, thus this data was pulled by FYs to reduce the margin for error in production; strength may be duplicated between FYs if a Soldier was retained between FYs; these columns were initially selected to provide basic information on Soldiers for qualities suspected of having an impact on Soldier retention

 

Data Understanding


Below are the tables for data understanding for the data. Columns for tables across fiscal year losses are the same, just as columns for tables across fiscal year strength are the same. However, there are some duplicate columns between losses/strength, and some that are inconsistent between losses/strength.

  • Losses (FYXX Losses.xlsx):

 

  • Assigned Strength (FYXX Assigned Strength.xlsx):

 

Data Cleaning and Preparation


Cleaning of this data used several methods, to include manual entry, removal of duplicates, addition of columns for simpler filtration, and down-scaling sets for targeted analysis.

  1. Microsoft Excel: In Excel, the tables of starting strength and total losses for each FY were combined without removing duplicates. This leaves the following two tables:
    • Strength_v1 15 columns x 67,042 rows
    • Losses_v1 25 columns x 9,485 rows
  2. Microsoft Excel: In Excel, rows with duplicate values in the columns Soldier Name and Last Four (taken jointly) were removed from the tables for total strength and total losses. This leaves the following two tables:
    • Strength_v2 15 columns x 18,835 rows (48,207 duplicates removed, indicating there were 48,207 instances where Soldiers were retained between FYs; this is not indicative of number of total Soldiers retained. For example, one Soldier if retained across 3 FYs would count as 3.)
    • Losses_v2 25 columns x 9,223 rows (262 duplicates removed. Duplicates not expected; duplicates determined from dual unit assignment, priority given to highest tiered unit. As an example, Military Police Companies all fall under 81st Troop Command; a Soldier can have a dual-assignment to 81st Troop Command and an MP company.)
  3. Tableau Prep Builder 2023.1: In Tableau Prep, a left join of strength (left) to losses (right) was performed. This leaves one table, without duplicate information removed:
    • Joined_v1 40 columns x 18,835 rows (this is consistent with what is expected, given every Soldier considered a loss should also be counted in the initial strength)
  4. Tableau Prep Builder 2023.1: In Tableau Prep, columns containing duplicate or redundant information were removed, to include Unit State, UPC, Unit Name, POD, Soldier Name, Last Four, Months in Grade Completed, Grade, DMOS, Gender, Race/Ethnicity, Loss Reason, Unit State-1, POD-1, UPC-1, TPC, Mo in Grd, RSP Site, Military Personnel Class, and RSID (Key). This leaves one table with duplicate information removed:
    • Joined_v2 20 columns x 18,835 rows
  5. Tableau Prep Builder 2023.1: In Tableau Prep, the columns were renamed to be better suited to their contents. This leaves one table with cleaner column titles in accordance with the operational definitions listed previously: Joined_v3 20 columns x 18,835 rows. Below is a summary of the changes:
    • Soldier Name-1 → Name
    • Last Four-1 → Last Four
    • Gender-1 → Gender
    • Race / Ethnicity-1 → Race/Ethnicity
    • Grade-1 → Grade
    • DMOS-1 → MOS
    • Months in Grade Completed-1 → Months in Grade
    • Mo in Svc → Months in Service
    • ETS or MRD Date → ETS/MRD
    • CIVED Cert → CIVED
    • Unit Name-1 → Unit Name
    • TPC-Desc → TPC
    • Loss Reason-Desc → Loss Reason
  6. Tableau Prep Builder 2023.1: In Tableau Prep, the column data types were adjusted according to their true values. This leaves one table with cleaner data types (easier for analysis) in accordance with their contents: Joined_v4 20 columns x 18,835 rows. Below is a summary of the changes:
    • DOR: String to Date
    • ETS/MRD: String to Date
    • PEBD: String to Date
    • Date of Commission: String to Date
  7. Tableau Prep Builder 2023.1: In Tableau Prep, the column with Months in Service is complete only for those considered as losses. However, this information was reasonably deduced using the difference between the PEBD and the date for which this data was pulled. Reported Months in Service remained, while missing values were imputed using the DATEDIFF() function. This leaves one table with Months in Service imputed:
    • Joined_v5 20 columns x 18,835 rows
  8. Tableau Prep Builder 2023.1: In Tableau Prep, the column Loss Reason has many closely related loss reasons. These were consolidated for more meaningful analysis later on. This leaves one table with more consolidated content: Joined_v6 20 columns x 18,835 rows. Below is a summary of the remaining categories for separation:
    • null
    • Accepted Commission
    • Administrative
    • AWOL
    • Component/Service Transfer
    • Criminal Misconduct
    • Death
    • Enrolled in ROTC
    • Erroneous Enlistment
    • Failure to Meet Requirements
    • Hardship or Religious
    • IET Discharge
    • Medical Retirement/Separation
    • Non-Criminal Misconduct
    • Obligation Complete
    • Regular Retirement
    • Resigned Commission
    Below is an image showing the overall flow of the cleaning, which results in the final table Joined_v6 outlined in the Cleaned Data Understanding below.
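For readers who prefer code to the Excel and Tableau Prep steps above, the stacking, de-duplication, left join, and Months in Service imputation (steps 1 to 3 and 7) can be approximated in base R. This is a minimal sketch on synthetic rows, not the workflow actually used: the column names (`Name`, `Last4`, `Grade`, `PEBD`, `LossReason`, `MonthsInService`) are simplified stand-ins, the pull date is hypothetical because the source does not state it, and `months_between()` is an illustrative helper that counts calendar-month boundaries, which is assumed to approximate a month-level DATEDIFF().

```r
# Synthetic stand-ins for two fiscal years of strength and losses.
# All names and values are made up for illustration.
strength_fy_a <- data.frame(Name = c("A", "B", "C"), Last4 = c("1111", "2222", "3333"),
                            Grade = c("E4", "E5", "O3"),
                            PEBD = c("2019-03-15", "2010-11-02", "2003-08-15"),
                            stringsAsFactors = FALSE)
strength_fy_b <- data.frame(Name = c("A", "C", "D"), Last4 = c("1111", "3333", "4444"),
                            Grade = c("E4", "O3", "E2"),
                            PEBD = c("2019-03-15", "2003-08-15", "2021-06-01"),
                            stringsAsFactors = FALSE)
losses_fy_a <- data.frame(Name = "B", Last4 = "2222", LossReason = "Obligation Complete",
                          MonthsInService = 96, stringsAsFactors = FALSE)
losses_fy_b <- data.frame(Name = "D", Last4 = "4444", LossReason = "IET Discharge",
                          MonthsInService = 5, stringsAsFactors = FALSE)

# Step 1: stack the fiscal years without removing duplicates.
strength_v1 <- rbind(strength_fy_a, strength_fy_b)
losses_v1   <- rbind(losses_fy_a, losses_fy_b)

# Step 2: keep the first row for each Name + Last4 pair.
key_cols    <- c("Name", "Last4")
strength_v2 <- strength_v1[!duplicated(strength_v1[, key_cols]), ]
losses_v2   <- losses_v1[!duplicated(losses_v1[, key_cols]), ]

# Step 3: left join, keeping every strength row; retained Soldiers get NA
# in the columns that exist only in the losses table.
joined <- merge(strength_v2, losses_v2, by = key_cols, all.x = TRUE)

# Step 7: fill missing Months in Service from PEBD and an assumed pull date.
pull_date <- as.Date("2023-01-01")  # hypothetical; not stated in the source
months_between <- function(from, to) {
  from <- as.POSIXlt(from)
  to   <- as.POSIXlt(to)
  12 * (to$year - from$year) + (to$mon - from$mon)
}
missing <- is.na(joined$MonthsInService)
joined$MonthsInService[missing] <-
  months_between(as.Date(joined$PEBD[missing]), pull_date)

joined
```

Unlike the Tableau Prep join, `merge()` keeps a single copy of the key columns, so the column count differs from the 40 columns reported for Joined_v1. The de-duplication arithmetic reported above follows the same logic: 67,042 − 18,835 = 48,207 strength rows removed, and 9,485 − 9,223 = 262 loss rows removed.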

  1. Microsoft Excel: In Excel, multiple columns can be used to determine if a Soldier is a loss (based on missing data). However, to make filtering and processing data simpler, an additional column titled Loss was added with the values of Y and N to indicate whether the Soldier was a loss, leaving the following table:
    • Joined_v7 21 columns x 18,835 rows
  2. Microsoft Excel: While Name is a column containing full unique names for each Soldier, using this data isn’t in accordance with the Privacy Act. An additional column titled ID containing unique identifiers for each Soldier’s input was created, leaving the following table:
    • Joined_v8 22 columns x 18,835 rows
  3. Microsoft Excel: The remaining table still has Last Four in conjunction with Name, which together would be considered Personally Identifiable Information (PII) and in violation of the Privacy Act. To mitigate bias from identifying Soldiers’ names, the columns were removed, leaving:
    • Joined_v9 20 columns x 18,835 rows
  4. R Markdown: In cleaning the column MOS, it is much easier to analyze the series as opposed to the explicit MOS for each Soldier. The explicit MOS of a Soldier is outlined by several factors: their series number (indicated by the first two numbers of the MOS), the specification (indicated by the following letter of the MOS), and their proficiency (indicated by the string following the letter). As Soldiers rise in the ranks of their MOS, their series and specification remain the same but their proficiency changes. Delineating between specific MOS and series number is an easier way to group Soldiers for analysis, as a series specifies the general area they specialize in. As an example, a 25-series has a plethora of specific MOSs, but all have the same number of 25 designating their series. This leaves the following:
    • Joined_v10 20 columns x 18,835 rows
  5. Microsoft Excel: Two columns added to group each Soldier’s MOS and Grade. For MOS, a new column titled WFF for War Fighting Function was added. This WFF is a broader category that defines the generic purpose for each MOS. For Grade, a new column titled Type was added. These types include Warrant, Officer, and Enlisted. These changes leave the following table:
    • Joined_v11 22 columns x 18,835 rows
  6. Microsoft Excel: The column Race/Ethnicity has many microcategories. For ease of analysis, the default race/ethnicity was pushed to the first race/ethnicity listed in the column, leaving the following table:
    • Joined_v12 22 columns x 18,835 rows
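Several of the steps in this second list (the Loss flag, the surrogate ID, dropping the name columns, reducing MOS to its series, and deriving Type from Grade) are simple column transformations. The sketch below shows one way they might look in base R on synthetic rows. The MOS codes, the rule of flagging a loss from a missing Loss Reason, and the grade-prefix lookup are assumptions made for illustration; the WFF mapping is omitted because the source does not list it.

```r
# Synthetic rows; names, MOS codes, and grades are illustrative only.
joined <- data.frame(
  Name       = c("Soldier A", "Soldier B", "Soldier C"),
  Last4      = c("1111", "2222", "3333"),
  MOS        = c("25B10", "11B30", "12A"),   # hypothetical explicit MOS codes
  Grade      = c("E4", "W2", "O3"),
  LossReason = c(NA, "Regular Retirement", NA),
  stringsAsFactors = FALSE
)

# Step 1: flag losses (assumes a missing Loss Reason means the Soldier was retained).
joined$Loss <- ifelse(is.na(joined$LossReason), "N", "Y")

# Step 2: surrogate identifier; starting at 10000 mirrors the ID values
# visible in the str() output of the final table.
joined$ID <- 10000 + seq_len(nrow(joined)) - 1

# Step 3: drop the columns that together would be PII.
joined$Name  <- NULL
joined$Last4 <- NULL

# Step 4: keep only the MOS series (the first two characters).
joined$MOS <- substr(joined$MOS, 1, 2)

# Step 5: derive Type from the first letter of Grade (assumed E/W/O prefixes).
type_lookup <- c(E = "Enlisted", W = "Warrant", O = "Officer")
joined$Type <- unname(type_lookup[substr(joined$Grade, 1, 1)])

joined
```

Creating the ID before removing Name and Last Four matches the order of the steps above, so each row keeps a stable key once the identifying columns are gone.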

 

Below is the heading of the final table representing Joined_v12, including the data types and a snippet into the values these columns contain.

#######################
# Final Excel Heading #
#######################
str(retention)
## tibble [18,835 × 22] (S3: tbl_df/tbl/data.frame)
##  $ ID                : num [1:18835] 10000 10001 10002 10003 10004 ...
##  $ Gender            : chr [1:18835] "M" "F" "M" "M" ...
##  $ Race/Ethnicity    : chr [1:18835] "White" "Black" "White" "White" ...
##  $ MOS               : chr [1:18835] "01" "01" "01" "00" ...
##  $ WFF               : chr [1:18835] "Immaterial" "Immaterial" "Immaterial" "Immaterial" ...
##  $ Grade             : chr [1:18835] "O3" "O4" "O3" "E9" ...
##  $ Type              : chr [1:18835] "Officer" "Officer" "Officer" "Enlisted" ...
##  $ DOR               : POSIXct[1:18835], format: NA NA ...
##  $ Months in Grade   : num [1:18835] 9 57 13 46 58 20 53 16 2 97 ...
##  $ Months in Service : num [1:18835] 229 357 235 398 203 358 435 154 188 407 ...
##  $ ETS/MRD           : POSIXct[1:18835], format: "2039-06-30" "2027-09-30" ...
##  $ MILED             : chr [1:18835] NA NA NA "SSD Level 6 Grad" ...
##  $ CIVED             : chr [1:18835] NA NA NA "Completed 1 Semester to 1-4 Years College" ...
##  $ Unit Name         : chr [1:18835] "DET 1 (CERF P) 81ST TC" "81ST TROOP COMMAND (-)" "81ST TROOP COMMAND (-)" "81ST TROOP COMMAND (-)" ...
##  $ TPC               : chr [1:18835] "Completed Training" "Completed Training" "Completed Training" "Completed Training" ...
##  $ PEBD              : POSIXct[1:18835], format: "2003-08-15" "1992-12-21" ...
##  $ Transaction Date  : POSIXct[1:18835], format: NA NA ...
##  $ Attrition         : chr [1:18835] NA NA NA "Y" ...
##  $ SOC/SOE           : chr [1:18835] NA NA NA "Vol Enrl RC, Under 10 USC651 on/After 1 June 84" ...
##  $ Date of Commission: POSIXct[1:18835], format: NA NA ...
##  $ Loss Reason       : chr [1:18835] NA NA NA "Regular Retirement" ...
##  $ Loss              : chr [1:18835] "N" "N" "N" "Y" ...

 

Below is a summary of missing values per column in Joined_v12. One special attribute to note is that many columns have a total of 9,874 missing values. This is due to the initial data sets not having matching columns on this basic demographic data. In this case, the initial Strengths data set was missing information such as MILED, CIVED, and SOC/SOE. And of course, initial strength also did not include the Transaction Date, Attrition, or Loss Reason columns as these Soldiers were not initially counted as losses unless truly lost following the join.

##################
# Number Missing #
##################
colSums(is.na(retention))
##                 ID             Gender     Race/Ethnicity                MOS 
##                  0                  0                  0                  0 
##                WFF              Grade               Type                DOR 
##                  0                  0                  0               9874 
##    Months in Grade  Months in Service            ETS/MRD              MILED 
##                  0                  0                  0               9874 
##              CIVED          Unit Name                TPC               PEBD 
##               9874                  0                  0                  0 
##   Transaction Date          Attrition            SOC/SOE Date of Commission 
##               9874               9874               9874              18211 
##        Loss Reason               Loss 
##               9874                  0

 

Final Data Understanding

Cleaned Data Understanding Table

Below is the final data understanding table for Joined_v12 to be used throughout the remainder of this analysis. As with the summary of missing values above, many columns show 9,874 missing values because the initial strength data set did not include those columns (such as MILED, CIVED, and SOC/SOE), and Transaction Date, Attrition, and Loss Reason are populated only for Soldiers counted as losses.

The target value for this predictive analysis is: Loss. This would allow factors in the data set to potentially be used to predict whether a Soldier is lost or retained.

 

Domains IV & V: Method Selection & Data Analysis


This section provides an overview of some exploratory analysis into the data set, including two interactive portions from a Tableau Dashboard. This also provides some insight into the methods used and the goals of their use.

 

Exploratory Analysis

Exploratory Analysis: Entire Data Set

Below is a shallow step into some analysis to learn more about the configuration of the data set. The below figures explore all of the data, to include both standing strength and losses. These first three visualizations built in R detail several counts across the force, including counts by Gender, WFF, Type, and Race/Ethnicity. Some key trends can be identified in these exploratory graphs:

  • Soldiers of the INNG are predominantly male
  • Soldiers of the INNG are predominantly white
  • Soldiers of the INNG are predominantly enlisted
  • Soldiers of the INNG predominantly fill the WFF of Movement & Maneuver and Sustainment
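Counts like those summarized in the list above can be produced with base R tabulation. The sketch below uses a handful of synthetic rows that borrow the column names Gender, Type, and Loss from the final table; the values are made up and do not reflect INNG data.

```r
# Synthetic rows using column names from the final table; values are made up.
retention <- data.frame(
  Gender = c("M", "M", "F", "M", "F", "M"),
  Type   = c("Enlisted", "Enlisted", "Enlisted", "Officer", "Warrant", "Enlisted"),
  Loss   = c("N", "Y", "N", "N", "Y", "N"),
  stringsAsFactors = FALSE
)

# One-way counts and shares by Gender.
gender_counts <- table(retention$Gender)
gender_share  <- prop.table(gender_counts)

# Two-way counts of Soldier Type against the Loss flag.
type_by_loss <- table(Type = retention$Type, Loss = retention$Loss)

gender_counts
round(gender_share, 2)
type_by_loss
```

Passing `gender_counts` to `barplot()` would draw the corresponding bar chart.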

 

 

Exploratory Analysis: Losses

The below figures explore only information known for Soldiers who left the INNG. Because this data is not available for both lost and retained Soldiers, it’ll only be used here for broader understanding of the data set. The below bar plots show the counts of the highest level of military education achieved by Soldiers who left the INNG, as well as some basic counts of lost Soldiers by WFF, Grade, and Gender. From the below visualizations, a few observations can be made:

  • SSD Level 1 graduates and Soldiers with no military schooling make up the vast majority of those who got out. SSD (Structured Self-Development) Level 1 is a requirement and a prerequisite for E4 Soldiers to attend BLC (Basic Leadership Course) and promote to E5. These counts are consistent with trending data for Soldiers who leave the Guard, as depicted in the Tableau Dashboard below.
  • The majority of Soldiers who left the INNG have at least a high school diploma, and most of those have only a high school diploma. This is consistent with the general requirement for Soldiers to have a high school diploma to enter the National Guard. While exceptions can be made, these cases are very rare, as detailed below.
  • Consistently among all WFFs and Grades, males make up the majority of Soldiers lost, which is consistent with the majority of the overall data set being male, as depicted above.

 

 

 

Exploratory Analysis: Retained

The figures below show some basic counts of retained Soldiers by WFF, Grade, and Gender. From these visualizations, a few observations can be made:

  • Consistently among all WFFs and Grades, males make up the majority of Soldiers retained, which is consistent with the majority of the overall data set being male, as depicted above.
  • As compared to Soldiers lost, the composition of grades is much more evenly distributed, where E4 is almost on par with E1 and E2 retained.

 

Analytic Methods


Below is a description of the analytic methods used for predicting a loss based on the provided data that exists for all Soldiers.

  1. Correlation Heat Maps: While not an explicit predictive method, correlation maps and their associated matrices can assist in identifying relationships in the data prior to the analysis. In this case, WFF and Gender were correlated, Soldier Type and Gender were correlated, and Loss Reason and Gender were correlated. The correlation maps done were limited to explore the data and identify key relationships for future recommendations. However, this list is not exhaustive of all possibilities for correlation exploration.

  2. Classification Tree: Classification trees are used to predict categorical dependent variables using categorical and numeric covariates. In this case, Gender, WFF, Grade, Months in Grade, and Months in Service were used to build this model.

  3. KNN: KNN is used to predict categorical dependent variables using numeric independent variables. In this case, Months in Grade and Months in Service were used as predictors for whether the Soldier would be considered a loss.

  4. Logistic Regression: Logistic Regression is also used to predict categorical dependent variables using categorical and numeric independent variables, although conversion of character variables to factor variables is required. In this case, following multiple iterations of backwards selection (removing insignificant variables), the covariates of Gender, Months in Grade, and Months in Service are reliable and significant in predicting Loss as Y or N.

Where required, the same testing and training sets were used throughout all analytic methods, with a seed set to ensure consistent sampling for each of the sets.
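As a minimal sketch of what a seeded, reusable split looks like, the example below uses synthetic stand-in data. The seed value 123, the data frame `soldiers`, and its column names are assumptions for illustration and are not taken from the project data.

```r
# Minimal sketch of a reproducible 50/50 split (synthetic stand-in data).
# The seed value and the data frame `soldiers` are assumptions for illustration.
set.seed(123)
soldiers <- data.frame(
  months_in_grade   = sample(1:120, 200, replace = TRUE),
  months_in_service = sample(1:300, 200, replace = TRUE),
  Loss              = sample(c("Y", "N"), 200, replace = TRUE)
)

# Draw half of the row indices once, then reuse them for every model
index      <- sample(nrow(soldiers), nrow(soldiers) * 0.50)
train_rows <- soldiers[index, ]
test_rows  <- soldiers[-index, ]

# The two sets are the same size here and share no rows
nrow(train_rows)
nrow(test_rows)
length(intersect(rownames(train_rows), rownames(test_rows)))
```

Because the indices are drawn once after the seed is set, every method that reuses `train_rows` and `test_rows` is evaluated on the same Soldiers.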

 

R Packages


Below is a table of R packages used, their justifications for use, and the designated definitions for each.

Definitions for these R packages were taken directly from the R Project.

 

 

Analytics: Correlation

Correlation

Below are two correlation heat maps identifying correlation between the War Fighting Functions (WFF) and Gender, the first for losses and the second for Soldiers that were retained. While there are no obvious significant correlations between the variables given for Soldiers lost nor between the variables given for Soldiers retained, some subtle differences can be noted:

  • Females tend to trend more positively towards Sustainment
  • Males tend to trend more positively toward Movement & Maneuver and Fires
  • The remainder of the War Fighting Functions are split fairly evenly in this correlation with Females trending in Intelligence slightly more and Males trending in Protection, Immaterial, Fires, and Command and Control
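  • The categories within WFF and within Gender are each mutually exclusive, so the indicator columns belonging to the same variable are negatively correlated with each other by construction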
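To make the dummy-variable correlations behind these heat maps concrete, here is a small base-R sketch on invented data. The data frame `toy` and its values are hypothetical, and `model.matrix()` is used here as a stand-in for the dummy-coding step; it is not the function used in the analysis.

```r
# Synthetic stand-in data; the values below are invented for illustration only
toy <- data.frame(
  Gender = c("F", "M", "M", "F", "M", "M", "F", "M"),
  WFF    = c("Sustainment", "Fires", "Fires", "Sustainment",
             "Intelligence", "Fires", "Intelligence", "Sustainment")
)

# One 0/1 indicator column per category (a base-R form of dummy coding)
gender_dummies <- model.matrix(~ Gender - 1, data = toy)
wff_dummies    <- model.matrix(~ WFF - 1, data = toy)
dummies        <- cbind(gender_dummies, wff_dummies)

# Pairwise Pearson correlations between the indicator columns
cor_mat <- round(cor(dummies), 4)
cor_mat

# With only two mutually exclusive categories, the two indicators are
# perfectly negatively correlated
cor_mat["GenderF", "GenderM"]
```

The cells of interest in such a matrix are the cross-variable ones (for example `GenderF` against `WFFSustainment`); the within-variable cells are negative simply because a Soldier can belong to only one category.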

 

#####################
# Correlation Plots #
#####################

#Correlation for Losses
loss_Y<-filter(retention, Loss=="Y") #filter for lost
retention_loss_cor1 <- dummy_cols(loss_Y, select_columns = c("Gender", "WFF"))
retention_cor_mat <- round(cor(retention_loss_cor1[23:31]), 4) #to four decimals
melt_retention_cor_mat <- melt(retention_cor_mat) #melted correlation

#Correlation Plot for Losses
ggplot(data = melt_retention_cor_mat, aes(x=Var1, y=Var2, fill=value)) + geom_tile()  + geom_text(aes(label = value), color = "black", size = 2) + theme(axis.text.x = element_text(angle = 45, hjust=1)) + scale_fill_gradient(low = "white", high = "black", guide = "colorbar")

#Correlation for Retains
loss_N<-filter(retention, Loss=="N") #filter for retained
retention_loss_cor2 <- dummy_cols(loss_N, select_columns = c("Gender", "WFF"))
retention_cor_mat <- round(cor(retention_loss_cor2[23:31]), 4) #to four decimals
melt_retention_cor_mat <- melt(retention_cor_mat) #melted correlation

#Correlation Plot for Retains
ggplot(data = melt_retention_cor_mat, aes(x=Var1, y=Var2, fill=value)) + geom_tile()  + geom_text(aes(label = value), color = "black", size = 2) + theme(axis.text.x = element_text(angle = 45, hjust=1)) + scale_fill_gradient(low = "white", high = "black", guide = "colorbar")

Below are two correlation heat maps identifying correlation between the Type of Soldier (Officer, Enlisted, Warrant) and Gender, the first for losses and the second for Soldiers that were retained. While there are no obvious significant correlations between the variables given for Soldiers lost nor between the variables given for Soldiers retained, some subtle differences can be noted:

  • Males trend positively toward Warrant and Officer
  • Females trend slightly positively toward Enlisted
  • The categories within Type and within Gender are each mutually exclusive

 

#####################
# Correlation Plots #
#####################

retention_loss_cor3 <- dummy_cols(loss_Y, select_columns = c("Gender", "Type"))
retention_cor_mat <- round(cor(retention_loss_cor3[23:27]), 4) #to four decimals
melt_retention_cor_mat <- melt(retention_cor_mat)

ggplot(data = melt_retention_cor_mat, aes(x=Var1, y=Var2, fill=value)) + geom_tile()  + geom_text(aes(label = value), color = "black", size = 2) + theme(axis.text.x = element_text(angle = 45, hjust=1)) + scale_fill_gradient(low = "white", high = "black", guide = "colorbar")

retention_loss_cor4 <- dummy_cols(loss_N, select_columns = c("Gender", "Type"))
retention_cor_mat <- round(cor(retention_loss_cor4[23:27]), 4) #to four decimals
melt_retention_cor_mat <- melt(retention_cor_mat)

ggplot(data = melt_retention_cor_mat, aes(x=Var1, y=Var2, fill=value)) + geom_tile()  + geom_text(aes(label = value), color = "black", size = 2) + theme(axis.text.x = element_text(angle = 45, hjust=1)) + scale_fill_gradient(low = "white", high = "black", guide = "colorbar")

Below is a correlation heat map identifying correlation between Loss Reason and Gender for Soldiers lost. While there are no obvious significant correlations between these variables, some subtle differences can be noted:

  • Females trend more positively for loss due to IET Discharge and Hardship or Religious
  • Males trend slightly more positively for loss due to Resigned Commission, Regular Retirement, Obligation Complete, Non-Criminal Misconduct, Failure to Meet Requirements, Enrolled in ROTC, Death, Criminal Misconduct, AWOL, and Accepted Commission
  • Females trend slightly more positively for loss due to Medical Retirement/Separation, Erroneous Enlistment, Component/Service Transfer, and Administrative
  • The categories within Loss Reason and within Gender are each mutually exclusive

While correlations are neither inherently predictive nor evidence of causation, they can help tell a story about the data. The relationships above suggest that females tend to leave the INNG more often for administrative or medical reasons, whereas males tend to leave either for professional growth (e.g., accepting a commission or enrolling in ROTC) or for various forms of misconduct (criminal misconduct, non-criminal misconduct, or absence without leave). Because these correlations are weak, they should be read as tendencies rather than firm findings. Even so, they point to areas where the INNG could capitalize on professional growth, address misconduct, and shape the well-being of Soldiers.

 

#####################
# Correlation Plots #
#####################

retention_loss_cor5 <- dummy_cols(loss_Y, select_columns = c("Gender", "Loss Reason"))
retention_cor_mat <- round(cor(retention_loss_cor5[23:40]), 4) #to four decimals
melt_retention_cor_mat <- melt(retention_cor_mat)

ggplot(data = melt_retention_cor_mat, aes(x=Var1, y=Var2, fill=value)) + geom_tile()  + geom_text(aes(label = value), color = "black", size = 1) + theme(axis.text.x = element_text(angle = 45, hjust=1)) + scale_fill_gradient(low = "white", high = "black", guide = "colorbar")

While correlations can provide some basic insight into the internal relationships in the data, they do not by themselves predict losses or point to causation. The demographic relationships in the above heat maps are weak, but they do align with common perceptions of who serves in which roles. The correlation maps here were limited to WFF and Gender, Soldier Type and Gender, and Loss Reason and Gender, and were used to explore the data and identify key relationships for future recommendations. This list is not exhaustive of all possibilities for correlation exploration.

Analytics: Classification Tree

Classification Tree

Classification trees are easy to understand and digest because they follow an intuitive sequence of decisions. They also provide a simple visual representation of relationships and work well with qualitative data without the need for dummy variables, which the correlation analysis required. One disadvantage of classification trees is that they tend to have lower predictive accuracy than other methods.

This classification tree below uses the factors of Gender, WFF, Grade, Months in Grade, and Months in Service to predict whether Soldiers are retained or lost. To read the tree below, begin with the first node:

  1. At the top, 0.48 indicates that 48% of the total training sample were lost. With extra = 106, each node displays the probability of the second class of Loss (Y) and the percentage of the training sample that falls in that node.
  2. The first decision marks whether the Soldier falls within the ranks listed. If yes, move to the left-most node on the following level: this node holds 50% of the training sample, and these Soldiers have a 39% probability of being lost (61% of being retained).
  3. Following on the left, the next decision is whether they have served at least 14 months. If yes, move to the left: this node holds 47% of the training sample (Soldiers within the ranks listed AND with at least 14 months in service), and these Soldiers have a 36% probability of being lost (64% of being retained).

This methodology can be followed for all nodes. Without going into exhaustive detail on the classification tree, some important information can be extracted:

  • Although Gender and WFF were offered to the model, neither was important enough to appear in any split of the tree. Classification trees are flexible models that only use the variables that improve the splits, so adding more candidate variables does not necessarily add complexity.
  • Soldiers NOT within the ranks listed in the tree and with fewer than 87 months in service have the greatest chance (given the parameters) of getting out. This is likely around the time Soldiers complete their first contract.
  • Soldiers within the ranks listed in the tree, with at least 14 months in service, and with 12 or fewer months in grade have the greatest chance of staying in. This is likely due to a recent promotion and a correspondingly greater drive to serve the organization and extend their contracts (given an already short time in service).
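The two numbers printed in each node, and the fact that an offered variable may never appear in a split, can be checked directly on a fitted rpart object. The sketch below uses synthetic data in which the outcome depends only on a variable named `signal`; the names `signal`, `noise`, and `toy` and all values are assumptions for illustration.

```r
library(rpart)  # ships with standard R distributions as a recommended package

# Synthetic data: `signal` drives the outcome, `noise` is unrelated to it.
# All names and values here are assumptions for illustration.
set.seed(42)
n      <- 500
signal <- runif(n, 0, 100)
noise  <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
Loss   <- factor(ifelse(signal < 40, "Y", "N"))
toy    <- data.frame(signal, noise, Loss)

fit <- rpart(Loss ~ signal + noise, data = toy, method = "class")

# Variables that actually appear in a split (leaf rows are labelled "<leaf>")
split_vars <- unique(as.character(fit$frame$var[fit$frame$var != "<leaf>"]))
split_vars

# Share of the sample in each node and the fitted probability of the
# second factor level ("Y") in that node
node_share  <- fit$frame$n / fit$frame$n[1]
node_prob_y <- fit$frame$yval2[, 5]
data.frame(var    = fit$frame$var,
           share  = round(node_share, 2),
           prob_Y = round(node_prob_y, 2))
```

In this toy case only `signal` is used, even though `noise` was included in the formula, which mirrors how a candidate variable can be offered to a tree without ever being selected.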
#################
# Decision Tree #
#################

#Random Sampling
index <- sample(nrow(retention), nrow(retention)*0.50)
ret_train <- retention[index,]
ret_test <- retention[-index,]

#Fit and Display Decision Tree
ret_rpart <- rpart(formula = Loss ~ Gender + WFF + Grade + `Months in Grade` + `Months in Service`, data = ret_train, method = 'class')
rpart.plot(ret_rpart, extra = 106)

#Prediction
ret_pred_dt <- predict(ret_rpart, newdata = ret_test, type = "class")

#Confusion Matrix
cm1 <- confusionMatrix(ret_pred_dt, as.factor(ret_test$Loss))
cm1
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    N    Y
##          N 4129 1942
##          Y  838 2509
##                                          
##                Accuracy : 0.7048         
##                  95% CI : (0.6955, 0.714)
##     No Information Rate : 0.5274         
##     P-Value [Acc > NIR] : < 2.2e-16      
##                                          
##                   Kappa : 0.4001         
##                                          
##  Mcnemar's Test P-Value : < 2.2e-16      
##                                          
##             Sensitivity : 0.8313         
##             Specificity : 0.5637         
##          Pos Pred Value : 0.6801         
##          Neg Pred Value : 0.7496         
##              Prevalence : 0.5274         
##          Detection Rate : 0.4384         
##    Detection Prevalence : 0.6446         
##       Balanced Accuracy : 0.6975         
##                                          
##        'Positive' Class : N              
## 

The accuracy score for this prediction tool is listed below, which displays the percentage of values correctly predicted using this method. While this is not the greatest method for prediction, as will soon be shown, it can still be useful in quickly and intuitively categorizing Soldiers and determining, to the probabilities above, the likelihood they’ll stay in or get out (depending on the parameters above).

####################
# Display Accuracy #
####################
cm1$overall['Accuracy']
##  Accuracy 
## 0.7048206

Note: Whether a Soldier is retained is predicted in this model using Gender, WFF, Grade, Months in Grade, and Months in Service.
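For readers who want to see where the reported statistics come from, the following sketch recomputes them from the four counts in the classification tree confusion matrix above. The variable names are illustrative, and the use of `binom.test()` for the interval is an assumption about how the reported interval was produced; it reproduces the printed bounds to within rounding.

```r
# Counts copied from the classification tree confusion matrix above
# (rows = prediction, columns = reference; 'Positive' class is N)
cm <- matrix(c(4129, 838, 1942, 2509), nrow = 2,
             dimnames = list(Prediction = c("N", "Y"), Reference = c("N", "Y")))

total       <- sum(cm)
accuracy    <- sum(diag(cm)) / total          # correct predictions / all predictions
sensitivity <- cm["N", "N"] / sum(cm[, "N"])  # retained Soldiers correctly predicted N
specificity <- cm["Y", "Y"] / sum(cm[, "Y"])  # lost Soldiers correctly predicted Y
nir         <- max(colSums(cm)) / total       # accuracy of always guessing the larger class

round(c(accuracy = accuracy, sensitivity = sensitivity,
        specificity = specificity, no_information_rate = nir), 4)

# Exact binomial 95% interval for the accuracy
ci <- binom.test(sum(diag(cm)), total)$conf.int
round(ci, 4)
```

Read this way, the tree identifies retained Soldiers far better (about 83%) than it identifies losses (about 56%), which matters if the goal is to flag Soldiers at risk of leaving.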

 

Analytics: KNN Model

KNN

KNN below is used to predict categorical variables using numeric covariates. In this case, we are using the numeric Months in Grade and Months in Service as predictors in whether Loss is dictated as Y or N. This algorithm stores all available data and classifies new data points based on similarity. Through trial-and-error in verifying the performance of the algorithm, the optimum value for k is 5, shown below.

#############
# KNN Model #
#############

#Develop KNN Model
ret_knn <- knn(train = ret_train[, 9:10], test = ret_test[, 9:10], cl = as.vector(as.matrix(ret_train[, 22])), k = 5)

#Confusion Matrix for Misclassified
cm2 <- confusionMatrix(ret_knn, as.factor(ret_test$Loss))
cm2
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    N    Y
##          N 4077 1311
##          Y  890 3140
##                                           
##                Accuracy : 0.7663          
##                  95% CI : (0.7576, 0.7748)
##     No Information Rate : 0.5274          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.5289          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.8208          
##             Specificity : 0.7055          
##          Pos Pred Value : 0.7567          
##          Neg Pred Value : 0.7792          
##              Prevalence : 0.5274          
##          Detection Rate : 0.4329          
##    Detection Prevalence : 0.5721          
##       Balanced Accuracy : 0.7631          
##                                           
##        'Positive' Class : N               
## 

The accuracy score for this prediction tool is listed below, which displays the percentage of values correctly predicted using this method. At roughly 77%, KNN is more accurate than the classification tree previously used (roughly 70%), although visualizing its predictions is not as intuitive as with decision trees, so the visualization has been removed.

####################
# Display Accuracy #
####################
cm2$overall['Accuracy']
##  Accuracy 
## 0.7662986

Note: Whether a Soldier is retained is predicted in this model using Months in Grade and Months in Service.
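The trial-and-error search for k described above can be sketched as a simple loop over candidate values. The data below are synthetic and deliberately well separated, and the names `toy`, `train`, `test`, and `k_values` are hypothetical; the candidate values of k are also an assumption, not the values tried in the analysis.

```r
library(class)  # provides knn(); ships with standard R distributions

# Synthetic stand-in for two numeric predictors; names and values are assumptions
set.seed(1)
n <- 400
months_in_grade   <- c(rnorm(n / 2, 12, 4), rnorm(n / 2, 30, 4))
months_in_service <- c(rnorm(n / 2, 40, 10), rnorm(n / 2, 90, 10))
Loss <- factor(rep(c("N", "Y"), each = n / 2))
toy  <- data.frame(months_in_grade, months_in_service, Loss)

index <- sample(n, n * 0.5)
train <- toy[index, ]
test  <- toy[-index, ]

# Try several odd values of k and record test-set accuracy for each
k_values <- c(1, 3, 5, 7, 9)
acc <- sapply(k_values, function(k) {
  pred <- knn(train = train[, 1:2], test = test[, 1:2], cl = train$Loss, k = k)
  mean(pred == test$Loss)
})
names(acc) <- paste0("k=", k_values)
acc
```

Selecting k on the same test set that is later used to report accuracy can make the result look slightly better than it is; a separate validation set or cross-validation is a common safeguard.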

 

Analytics: Regression

Logistic Regression

Logistic regression, a type of generalized linear model, can be used to predict a binary categorical variable using numeric and categorical covariates. In this case, following iterative removal of insignificant variables, only Gender, Months in Grade, and Months in Service remain.

#######################
# Logistic Regression #
#######################

#Convert Train Variables
ret_train$Loss <- as.factor(ret_train$Loss)

#Convert Test Variables
ret_test$Loss <- as.factor(ret_test$Loss)

#Run Regression
ret_glm <- glm(Loss ~ Gender + `Months in Grade` + `Months in Service`, data = ret_train, family = binomial)
summary(ret_glm)
## 
## Call:
## glm(formula = Loss ~ Gender + `Months in Grade` + `Months in Service`, 
##     family = binomial, data = ret_train)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.5936  -1.1136  -0.7922   1.1736   2.3485  
## 
## Coefficients:
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept)         -0.083725   0.052798  -1.586   0.1128    
## GenderM              0.138438   0.054027   2.562   0.0104 *  
## `Months in Grade`    0.023110   0.001238  18.670   <2e-16 ***
## `Months in Service` -0.005759   0.000343 -16.789   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 13038  on 9416  degrees of freedom
## Residual deviance: 12586  on 9413  degrees of freedom
## AIC: 12594
## 
## Number of Fisher Scoring iterations: 4
#Generate Prediction
pred_ret_glm <- predict(ret_glm, newdata = ret_test, type = "response")
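#type = "response" returns probabilities between 0 and 1, so this cutoff of 0 labels every Soldier "Y"; 0.5 is the conventional cutoff
y_or_n <- ifelse(pred_ret_glm >= 0, "Y", "N")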
p_class <- factor(y_or_n, levels = levels(ret_test$Loss))

#Confusion Matrix
cm3 <- confusionMatrix(p_class, as.factor(ret_test$Loss))
cm3
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    N    Y
##          N    0    0
##          Y 4967 4451
##                                           
##                Accuracy : 0.4726          
##                  95% CI : (0.4625, 0.4827)
##     No Information Rate : 0.5274          
##     P-Value [Acc > NIR] : 1               
##                                           
##                   Kappa : 0               
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.0000          
##             Specificity : 1.0000          
##          Pos Pred Value :    NaN          
##          Neg Pred Value : 0.4726          
##              Prevalence : 0.5274          
##          Detection Rate : 0.0000          
##    Detection Prevalence : 0.0000          
##       Balanced Accuracy : 0.5000          
##                                           
##        'Positive' Class : N               
## 

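The accuracy score for this prediction tool is listed below, which displays the percentage of values correctly predicted using this method. This is much less accurate than the classification tree or the KNN model previously used, but the confusion matrix shows why: every Soldier in the test set was predicted as a loss (Y). Because predict() with type = "response" returns probabilities between 0 and 1, the cutoff of >= 0 used above is always met, so the model only ever produces true negatives and false negatives relative to the positive class N. The accuracy of 0.4726 is therefore simply the share of losses in the test set, not a measure of how well the fitted coefficients separate retained and lost Soldiers. As run, this leaves the model unreliable for predicting whether a Soldier will be retained; a cutoff such as 0.5 would be needed to evaluate it fairly.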

####################
# Display Accuracy #
####################
cm3$overall['Accuracy']
##  Accuracy 
## 0.4726056

Note: Whether a Soldier is retained is predicted in this model using Gender, Months in Grade, and Months in Service.
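To illustrate how the fitted coefficients translate into predictions, the sketch below applies the estimates from the regression summary above to three hypothetical Soldiers and compares a cutoff of 0 with a cutoff of 0.5. The example Soldiers and the names `example` and `result` are invented for illustration; only the four coefficients come from the output above.

```r
# Coefficients copied from the summary(ret_glm) output above
b0        <- -0.083725
b_male    <-  0.138438
b_grade   <-  0.023110
b_service <- -0.005759

# Hypothetical Soldiers (values invented for illustration)
example <- data.frame(
  male              = c(1, 0, 1),
  months_in_grade   = c(12, 6, 60),
  months_in_service = c(36, 200, 72)
)

# Linear predictor (log-odds of Loss = "Y"), then the logistic transform
log_odds <- b0 + b_male * example$male +
  b_grade * example$months_in_grade +
  b_service * example$months_in_service
prob_loss <- plogis(log_odds)

# A cutoff of 0 labels every case "Y"; a cutoff of 0.5 splits on the sign
# of the log-odds
result <- data.frame(example,
                     prob_loss   = round(prob_loss, 3),
                     cutoff_0    = ifelse(prob_loss >= 0,   "Y", "N"),
                     cutoff_half = ifelse(prob_loss >= 0.5, "Y", "N"))
result
```

The signs of the coefficients read the same way as the tree: more months in grade raises the estimated probability of loss, while more months in service lowers it.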

Domain VI: Summary


This analysis began with the discussion of Soldier care, and the intent of this analysis was initially to provide some insight into statistics around retention of Soldiers to lend further insights to leaders in the Indiana National Guard on just how to care for Soldiers. One of the ways to approach this question of Soldier care is by examining why Soldiers get out through statistically sound means. In this case, this project’s aim evolved into utilizing available data to determine the algorithm that would best reflect and predict whether a Soldier is to be retained.

 

Deployment and Recommendations


Each Soldier and leader has their own perception of how and why Soldiers leave the INNG. With a wide selection of predictor variables to choose from, an analysis such as this can be clouded by bias, or its variables can be selected based on our own perceptions. This analysis is by no means inclusive of all factors affecting retention. However, it does provide insight into some factors that can realistically affect whether a Soldier stays in.

According to the analyses done, both Months in Grade and Months in Service can be used in a variety of analytic methods to predict whether a Soldier will get out, alongside other variables in some cases (depending on the limitations of the method used).

  1. Classification Tree: This tree was offered the variables Gender, WFF, Grade, Months in Grade, and Months in Service. In the resulting tree, Gender and WFF were not important enough to determine the retention of a Soldier in the data set. Instead, starting from Grade at the root node, Months in Grade and Months in Service were used to break out the remaining decisions. This decision tree serves both as a visual aid and as an intuitive tool for understanding a Soldier’s retention, and its accuracy on the test set was 70.48%, with a 95% confidence interval of 69.55% to 71.4%. While not a perfect model, it is usable and easy to understand, and has an acceptable level of accuracy.

  2. KNN Model: While KNN models are not easy to visualize, they classify each Soldier by comparing them with the most similar Soldiers in the training data, which lets them capture relationships that a simple set of rules can miss. In this model, only the variables of Months in Grade and Months in Service were used, primarily because this form of model requires numeric covariates. Its accuracy on the test set was 76.63%, with a 95% confidence interval of 75.76% to 77.48%. This model predicts retention better than the Classification Tree above and the Logistic Regression below, but is more limited in its application as it does not accept non-numeric (or character) variables as inputs. As an accurate tool requiring minimal data collection, it lets leaders in the INNG rapidly predict the retention of their Soldiers.

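  3. Logistic Regression: This regression model, following multiple steps of backwards selection, used the variables of Gender, Months in Grade, and Months in Service for its predictions. As run, the model labeled every Soldier in the test set as a loss, so it produced only true negatives and false negatives, with a 95% confidence interval for accuracy of 46.25% to 48.27%. This result reflects the classification cutoff of 0 applied to the predicted probabilities rather than the quality of the fitted coefficients, so the comparison with the KNN model and the Classification Tree, both of which performed much better, should be revisited with a cutoff such as 0.5.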
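The three results can be placed side by side by recomputing accuracy from the confusion matrix counts reported earlier. This is a minimal sketch; the helper `summarize_model` and the object names are hypothetical, and only the counts come from the outputs above.

```r
# Counts copied from the three confusion matrices reported above
# (order: predicted N / actual N, predicted Y / actual N,
#         predicted N / actual Y, predicted Y / actual Y)
counts <- list(
  "Classification Tree" = c(4129, 838, 1942, 2509),
  "KNN (k = 5)"         = c(4077, 890, 1311, 3140),
  "Logistic Regression" = c(0, 4967, 0, 4451)
)

# Hypothetical helper: accuracy and share of the test set labelled as a loss
summarize_model <- function(x) {
  total <- sum(x)
  c(accuracy       = (x[1] + x[4]) / total,
    predicted_loss = (x[2] + x[4]) / total)
}

comparison <- round(t(sapply(counts, summarize_model)), 4)
comparison

# Accuracy of always predicting the larger class (N) in this test set
no_information_rate <- 4967 / 9418
round(no_information_rate, 4)
```

The `predicted_loss` column makes the logistic regression result easy to spot: it labels 100% of the test set as a loss, which is why its accuracy falls below the no-information rate.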

Overall, it is highly recommended that leaders primarily refer to the classification tree for determining whether a Soldier is to stay in. While the classification tree is not the most accurate, it does provide a relatively intuitive model for analysis into whether a Soldier will leave the INNG and uses minimal data entry for its decisions. Given the breadth of the data (spanning over 7 years), this specific tool can be quickly used by leaders scanning formations to determine the likelihood a Soldier will leave the guard. And without much additional data input, it can practically be deployed in use by leaders at all echelons of the INNG.

Limitations and Lessons Learned


Data for this project is by no means all-inclusive or exhaustive. There is a plethora of data points that could be used to increase the breadth and accuracy of this project. Initially, this project started with two distinct types of data sets: the total strength at the start of a fiscal year (by Soldier name) and the total losses by the end of that fiscal year (by Soldier name). However, some columns existed in the losses tables but not in the strength table. These columns included data about

  • the highest level of civilian education achieved,
  • the highest level of professional military education achieved,
  • the date of current rank,
  • and the Soldier’s source of entrance into the INNG.

These known attributes alone could provide more insight into predictors for whether a Soldier could get out, but because the data only existed for Soldiers that left the guard, it could not be used in this analysis.
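A quick way to see which attributes are unusable for this reason is to compare the column names of the two tables. The column names below are hypothetical stand-ins loosely based on the attributes described above, not the actual field names in the source tables.

```r
# Hypothetical column names standing in for the two source tables
strength_cols <- c("Name", "Gender", "Grade", "WFF",
                   "Months in Grade", "Months in Service")
losses_cols   <- c(strength_cols, "Civilian Education", "Military Education",
                   "Date of Rank", "Source of Entry", "Loss Reason")

# Columns recorded only for Soldiers who left; these cannot serve as
# predictors because retained Soldiers have no values for them
only_in_losses <- setdiff(losses_cols, strength_cols)
only_in_losses

# Columns available for every Soldier, and therefore usable as predictors
shared <- intersect(strength_cols, losses_cols)
shared
```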

There are additional qualities available for further examination to predict whether a Soldier leaves the INNG. The Director’s Personnel Readiness Overview, or DPRO, has a lot of versatility and variability in the predictors one can use in an analysis like this, such as

  • flags present on a Soldier,
  • eligibility for and reception of educational benefits,
  • eligibility for and reception of bonuses and entitlements,
  • specialty schools attendance,
  • awards received,
  • number, frequency, and most recent date of deployment(s),
  • additional military occupational specialties (or MOSs),
  • and more.

This list, again, is by no means exhaustive. In beginning this project, the hope was to analyze the data itself as opposed to weighing the methods used for prediction. However, given that some prediction methods are stronger than others using the minimal variables available, it is reasonable to expect that additional input variables could both increase the accuracy of the prediction methods and expand the range of methods available for further analysis.

Closing Remarks


While this project was limited in its data, leaders in the INNG are still strongly encouraged to use ADP 6-22: Army Leadership and the Profession to assist in developing a leadership style that serves Soldiers, promotes a team mindset, and lends to the self-policing profession that is the US Army. Although this data does not establish the true reasons Soldiers leave the INNG, it does give leaders the ability to identify Soldiers within their formations who are at risk of leaving the guard, even if only using the variables of Months in Grade and Months in Service. If this document is applied appropriately, leaders can identify those at-risk Soldiers and work with them to identify courses of action to keep them invested in the INNG. In my own anecdotal experience, a Soldier’s leaders are the foundation of whether that Soldier gets out or stays in:

  • If a Soldier hasn’t been paid, it is the leader’s responsibility to ensure that Soldier is taken care of.
  • If a Soldier is not meeting physical fitness requirements, it is the leader’s responsibility to ensure the Soldier is fit for service.
  • If a Soldier is not medically well, it is the leader’s responsibility to ensure the Soldier gets appropriate care.
  • If a Soldier desires attendance to a special school or training, it is the leader’s responsibility to ensure the Soldier attends.
  • If a Soldier is interested in attaining higher education, it is the leader’s responsibility to ensure the Soldier receives access to educational benefits.

Deep predictive analysis into the retention of Soldiers is a very broad and complicated endeavor, especially in searching to identify a predictive tool that can be used to shape retention programs. The long-term objective of this project is to identify, through quantitative and qualitative means, what factors can affect an Indiana National Guard Soldier’s retention, and then to use the results of this predictive analysis to prescribe a method for the maintenance of Soldier Care and Soldier retention programs. Some of this data can be pulled from DPRO (such as with this project), which primarily covers each Soldier’s profile. However, additional data would be needed to cultivate a holistic approach in developing the feedback loop that serves as a true Soldier retention protocol. This data includes, but is not limited to, command climate surveys conducted annually to assess leadership within a command echelon, surveys detailing personal revelations from Soldiers leaving the INNG, performance rates of the unit(s) as a whole, and career progression tracking for each Soldier.

In addition to quantitative data provided by DPRO used in this initial analysis, each Soldier is strongly encouraged to complete a standard paragraph-entry survey when exiting the Indiana National Guard. This lengthy survey requests that the Soldiers provide true reasons as to why they’re leaving the service. Unfortunately, this data is available only for Indiana, as other states have the freedom to implement surveys as they see fit, and the survey changes frequently, adding another layer of complication to its applicability. As it stands now, this survey is also not required of Soldiers as they exit the INNG. As a result, this survey obtains only about 10% participation, with some of the entries being unusable as they contain no information. Further textual analysis into the exit survey data available could lend to a justification to make the exit survey mandatory, as well as to develop a streamlined exit survey requirement for all 54 states and territories.

And finally, this project will be able to lend to the most important topic of all: the definition and implementation of Soldier Care. Developing this definition, while not a direct outcome of the project, will allow leaders across the National Guard to build and implement a program dedicated to the care of its Service Members by tying every fiber of action to the point of this definition. Further analysis of this problem set will solidify, using quantifiable analysis to support it, the intention behind caring for Soldiers.

In Closing: In an organization such as the Indiana National Guard, whose vision is focused on “putting our people first”, it is absolutely critical that every leader at every echelon applies this vision and this mission statement into Soldier Care.