PROJECT DESCRIPTION

See updates spanning 12/20/2024–01/08/2025 for initial Alaska carrier-route-year build and carrier lists, 06/01/2025–07/26/2025 for HHI, Bush vs. Mainline segmentation, and integrated bankruptcy/M&A analysis.

Full script builds the panel, merges event data (with alias/parent splits),tracks exits and market concentration.

  1. PI: Mike Jones - UAA-ISER -
  2. RP: Katie Longuevan - UAA-ISER -

Purpose of this code: - Build a detailed carrier-route-year dataset for Alaska (post-2002). - Add labels for route type (mainline vs. bush) and carrier exit timing. - Track passenger, freight, and mail totals per carrier per route per year. - Enrich the dataset with carrier counts per route and route-year.

Step 1: Filter All Data Post-2002 (Policy Change)

Step 2: Build Carrier-Route-Year Panel (Post-2002)

Katie code: Route is defined by the undirected pair of city names,origin_code and destination_code are airport codes in lexical order.

Each row is a unique carrier’s operation on a route in a given year/month,capturing summed traffic and performance metrics.

Note: Summarize traffic by carrier, route, year, and month; route is undirected city pair.

Step 3: Classify Routes by Mainline vs. Bush

Extract all mainline (hub) airport codes from the bush_mainline_cities reference sheet. ‘Bush1_Hub0’ == 0 denotes a mainline hub; these codes define the state’s high-capacity trunk network. These codes are critical for distinguishing trunk routes (hub-to-hub) from bush service. Get list of hub airports for trunk network (mainline hubs).

Step 4: QA Preview of Mainline vs. Bush Labeling

Shows first 10 rows so we can visually confirm classification results.

Note: Check first 10 rows to confirm classification worked.

Step 5: Count Carriers per Route (All Years)

While each row represents one carrier per route-year, this step finds the total number of unique carriers that ever operated on each route across the dataset. Adds context on market diversity.

Note: Count how many unique carriers ever flew each route (context for competition).

Joins the route-level carrier counts back to the detailed panel dataset so every carrier-route-year record has a field for overall market diversity.

Note: Add carrier count back to full dataset for each route.

Finds the maximum observed number of carriers on any single route,providing a quick diagnostic for market competition extremes.

Step 6: Identify Entry/Exit Years per Carrier

Creates route-specific and state-level first/last year markers for each carrier. These are essential for understanding timing of exits relative to route and state.

Note: First and last year for each carrier on each route.

Step 7: Add Traffic Totals (Passengers, Freight, Mail)

This calculation aggregates the passenger numbers to give insights into carrier performance and route popularity.

Analyzing freight allows us to gauge the volume of goods transported, offering insights into economic activity. This metric can highlight the role of carriers in facilitating communication and commerce through mail transport.

Creates a new ‘total_freight_mail’ column that represents the aggregate of both freight and mail per carrier, per route, per year.

Goal: HHI Calculations for Alaska Carrier Route-Year —

Authors: Katie, Mike

Step HH1: Prepare Data for HHI Calculations

The Herfindahl-Hirschman Index (HHI) measures market concentration by summing the squared market shares of all participants within a market. Higher HHI values reflect a more concentrated market and lower competition. These calculations will produce HHI values for passengers, freight, mail,and combined freight+mail by route and year.

Step HHI2: Calculate HHI for Passengers

Katie code: HHI is calculated for each route and year based on passenger totals.

Mike note: NEED ***** guardrails so the HHI is not calculated for a group when total_passengers = 0. Was previously defaulting to HHI = 10,000, which skews results. Includes filter to avoid zero-passenger markets.

Note: Calculate passenger-based HHI per route-year (guarded for zero totals).

Step HHI3: Calculate HHI for Freight –

HHI is computed for freight volumes per route-year using the same method as passengers. Note: Calculate freight-based HHI per route-year.

Step HHI4: Calculate HHI for Mail

HHI is computed for mail volumes per route-year. Note: Calculate mail-based HHI per route-year.

Step HHI5: Calculate HHI for Combined Freight + Mail

This combines freight and mail volumes to reflect total cargo activity. Note: Calculate combined cargo HHI per route-year.

Note: These HHI values (passenger, freight, mail, cargo) are now ready for integration into market concentration and bankruptcy/event carrier analysis.

Step HHI6: Sort and Count Carriers per Route-Year

Sort the data by the key variables (route, year, carrier_name) for clear output and analysis order.

Also, create the yearly number of carriers serving each route within this master dataset to support market concentration and HHI analysis.

Note: Order rows by route, year, and carrier for clarity.

Certified Airline Merger List Comparison and Analysis

Sources:

  1. U.S. Bankruptcies and Service Cessations: https://www.airlines.org/dataset/u-s-bankruptcies-and-services-cessations/
    • Provides detailed records of U.S. airline bankruptcies, including Chapter 7 and Chapter 11 filings, and information on service cessations.
  2. U.S. Airline Mergers and Acquisitions: https://www.airlines.org/dataset/u-s-airline-mergers-and-acquisitions/
    • Contains a comprehensive list of U.S. airline mergers and acquisitions, tracking changes in airline ownership and operations over time.
  3. deleted not needed 7/24/25 Certificated Air Carrier List: https://www.transportation.gov/mission/office-secretary/office-policy/aviation-policy/certificated-air-carrier-list
  • Provides a comprehensive list of certificated air carriers, including operating authorities and service types for U.S.-based airlines.

Goal:

The purpose of this script is to integrate data from two sources to analyze airline mergers and bankruptcies. By merging the Certified Carrier List with bankruptcy and merger data, the code aims to create a consolidated dataset for tracking airline operational changes, including aliases, operating statuses, and financial outcomes (e.g., reorganizations or service cessations). The final outputs support further analysis or reporting.

Bankruptcy Data Cleaning and Analysis

Goal: The objective of this script is to clean and process the Bankruptcy dataset.

Chapter 7 = liquidated (exited the market) Chapter 11 = reorganized (continues operating)

Key objectives: 1. Standardize column names and handle missing values. 2. Map Chapter data to descriptive statuses (Chapter 7 vs Chapter 11). 3. Extract and format date information (year and month). 4. Extract contextual notes from raw text. 5. Flag carriers relevant to Alaska via fuzzy matching.

Step 1: Import Bankruptcy Dataset

Step 2: Standardize Column Names

Step 3: Trim Whitespace & Map Statuses

Step 4: Extract and Standardize Date Components —-

Step 5: Extract Notes & Clean Carrier Names

Step 6: Retain Only Relevant Columns

Step 7: Identify Alaska‑Relevant Carriers (Combined Matching)

Goal: Use Exact, Token, and Fuzzy matching against the AK list (all_data_new$unique_carrier_name) – but match from bankruptcy_carriers.

Step 8: Build br_carriers_ak

Step 9: Review Non‑matched Carriers —-

Step 10: Inspect Fuzzy Near‑Misses —-

At this point: • br_carriers_ak contains your Alaska carriers, • bankruptcy_carriers remains the full, cleaned panel, • non_matches shows the carriers still unmatched and near_misses highlights borderline fuzzy cases for any manual review.