Date: 2026-01-30
Reported by: A. Perez
Affected table:
Alliance_RPT.DataScience.PIF_LIST_ALL (replicated from
PROD_EDW.DATASCIENCE.PIF_LIST_ALL)
Impact: “CLOSED” and “OPEN” appearing in HCPCS code
fields on the PDR dashboards
Severity: 100% of rows affected
(63,225 out of 63,225 contain a status value in
hcpcs_code_list)
Data from unrelated closed cases is bleeding into
columns 34-40 of every row in the SQL Server replica of
PIF_LIST_ALL. Starting at column 34
(HCPCS_CODE_LIST), the values are not HCPCS codes — they
are the STATUS, CASENUMBER,
CLAIMNUMBER, etc. of a different case that
got concatenated onto the row.
This is causing the word “CLOSED” to appear in the HCPCS Codes field on our PDR dashboards.
The Snowflake source table has 33 columns. The SQL Server replica has 40 columns.
Snowflake (PROD_EDW) |
SQL Server (Alliance_RPT) |
|
|---|---|---|
| Column count | 33 | 40 |
| Last column | HCPCS_CODE_LIST (col 33) |
PaidDateList (col 40) |
| Column casing | ALL UPPERCASE | Cols 1-33 lowercase, cols 34-40 mixed case |
| Extra columns | — | HCPCS_CODE_LIST, REVENUECODE_LIST,
FirstPaidDate, LastPaidDate,
DistinctPaidDates, MultiplePaidDates,
PaidDateList |
The 7 extra columns (34-40) in SQL Server do not exist in Snowflake. They are being created by the replication process. The different casing of these columns (mixed case vs. the lowercase of the original 33 columns) confirms they did not come from the same source DDL.
The contamination is NOT in the Snowflake
PIF_LIST_ALL table — it is in the Snowflake-to-SQL-Server
replication/ETL that populates
Alliance_RPT.DataScience.PIF_LIST_ALL.
Position Snowflake (PROD_EDW) SQL Server (Alliance_RPT)
-------- ---------------------------- ----------------------------
1 STATUS status
2 CASENUMBER casenumber
3 CLAIMNUMBER claimnumber
... ... (cols 4-32 match) ... ...
33 HCPCS_CODE_LIST icd10_rollup_list
---- Snowflake table ends here ----
34 (does not exist) HCPCS_CODE_LIST <-- contains STATUS of another case
35 (does not exist) REVENUECODE_LIST <-- contains CASENUMBER of another case
36 (does not exist) FirstPaidDate <-- contains CLAIMNUMBER of another case
37 (does not exist) LastPaidDate <-- contains UNIQUE_KEY of another case
38 (does not exist) DistinctPaidDates <-- contains BUS_DAYS of another case
39 (does not exist) MultiplePaidDates <-- contains FIRST_RECEIVEDDATE of another case
40 (does not exist) PaidDateList <-- contains remaining columns of another case
The extra columns 34-40 contain another case’s fields 1-7+, shifted by exactly 33 positions. The replication process is appending a second case’s entire row as extra columns.
We verified this by reading the contaminating case number directly
from the phantom columns, looking it up in PIF_LIST_CLOSED,
and confirming the values match field-by-field.
| Phantom column (SQL Server) | Value found | Maps to | Value from PIF_LIST_CLOSED for FL022084990 | Match |
|---|---|---|---|---|
HCPCS_CODE_LIST (col 34) |
CLOSED |
status | CLOSED |
TRUE |
REVENUECODE_LIST (col 35) |
FL022084990 |
casenumber | FL022084990 |
TRUE |
FirstPaidDate (col 36) |
44108602 |
claimnumber | 44108602 |
TRUE |
LastPaidDate (col 37) |
FL022084990_44108602 |
unique_key | FL022084990_44108602 |
TRUE |
FL022084990 does
not exist in PIF_LIST_ALL as its own
rowPIF_LIST_CLOSED —
it’s a closed case whose data got absorbed into
FL026174482’s row| Phantom column (SQL Server) | Value found | Maps to | Value from PIF_LIST_CLOSED for FL025036053 | Match |
|---|---|---|---|---|
HCPCS_CODE_LIST (col 34) |
CLOSED |
status | CLOSED |
TRUE |
REVENUECODE_LIST (col 35) |
FL025036053 |
casenumber | FL025036053 |
TRUE |
FirstPaidDate (col 36) |
46893019 |
claimnumber | 46893019 |
TRUE |
LastPaidDate (col 37) |
FL025036053_46893019 |
unique_key | FL025036053_46893019 |
TRUE |
FL025036053 exists only in
PIF_LIST_CLOSED, not in PIF_LIST_ALLtotal_rows contains_status starts_with_newline
63225 63225 43570
HCPCS_CODE_LIST
column (col 34)HCPCS_CODE_LIST starting with a newline character
(\n), which is the row-concatenation seamEvery row’s phantom REVENUECODE_LIST (col 35) contains a
case number instead of revenue codes:
casenumber claimnumber status hcpcs_code_list revenuecode_list firstpaiddate
FL027292966 <NA> OPEN \nCLOSED FL026433261 <NA>
FL013528788 <NA> CLOSED \nCLOSED FL08476382 <NA>
FL08474305 <NA> CLOSED \nCLOSED FL08899910 <NA>
FL026433982 <NA> CLOSED \nCLOSED FL010675624 <NA>
FL010235252 <NA> CLOSED \nCLOSED FL08410660 <NA>
FL026433911 <NA> CLOSED \nCLOSED FL09052943 <NA>
FL08381633 <NA> CLOSED \nCLOSED FL010295368 <NA>
FL07715766 <NA> CLOSED \nCLOSED FL013528874 <NA>
FL026433874 <NA> CLOSED \nCLOSED FL026360933 <NA>
FL013325035 <NA> CLOSED \nCLOSED FL018776585 <NA>
Column Value
---------------------------------------- -------------------------------------------------------
status : OPEN
casenumber : FL026174482
claimnumber : 44108531
unique_key : FL026174482_44108531
bus_days : 74.000000
first_receiveddate : 2025-10-14
last_receiveddate : 2025-10-14
first_closeddate : <NA>
last_closeddate : <NA>
rows_merged : 36
max_count : 1
vendorname_list : THE FOOT DOCTORS SANTA CRUZ CTY
vendornumber_list : 1091975
contracttype_list : NC
claimcategorycode_list : MED
determination_list : <NULL>
finalcategory_list : MED
maincategory_list : MED
has_subcategory_list : false
dim_cptprocedure_key_list : 206774,209983
dim_claimprofile_key_list : 11057,16048,24533,54792
claimtypedesc_list : Medical Claims
claimvisittype_list : Other Visit
claimpaidstatus_list : Denied,Paid
cptprocedurecode_list : 11055,11721
icd10diagnosisgroupsize_list : 4
hcpcs_key_list : <NA>
hcpcs_desc_list : Unknown
hcpcs_cat1_list : Unknown
hcpcs_cat2_list : Unknown
icd10_key_list : 2948,45832,63328,63332
icd10_desc_list : Corns And Callosities | Pain In Left Toe(S) | ...
icd10_rollup_list : B35,L84,M79
--- BELOW THIS LINE: PHANTOM COLUMNS (34-40) NOT IN SNOWFLAKE SOURCE ---
--- DATA IS FROM A DIFFERENT CASE (FL022084990) ---
HCPCS_CODE_LIST : \nCLOSED <-- STATUS of other case
REVENUECODE_LIST : FL022084990 <-- CASENUMBER of other case
FirstPaidDate : 44108602 <-- CLAIMNUMBER of other case
LastPaidDate : FL022084990_44108602 <-- UNIQUE_KEY of other case
DistinctPaidDates : 31.000000 <-- BUS_DAYS of other case
MultiplePaidDates : 2024-10-11 <-- FIRST_RECEIVEDDATE of other case
PaidDateList : 2024-10-11|*2024-11-26|*2024-11-26|*18|*1|*
VALLEY CHILDREN'S HOSPITAL OP|*1083798|*C|*
OTC|*Upheld|*Hospital Outpatient|*
<-- ALL remaining columns of other case
The Snowflake source table
PROD_EDW.DATASCIENCE.PIF_LIST_ALL has 33
columns and appears to be clean. The replication process that
copies this table to Alliance_RPT.DataScience.PIF_LIST_ALL
is creating 7 extra columns (34-40) that do not exist
in the source.
These phantom columns contain another case’s data — specifically, the first 7+ fields of an unrelated closed case get appended as new columns on every row:
| Phantom column position | Phantom column name | Contains |
|---|---|---|
| 34 | HCPCS_CODE_LIST |
STATUS of another case |
| 35 | REVENUECODE_LIST |
CASENUMBER of another case |
| 36 | FirstPaidDate |
CLAIMNUMBER of another case |
| 37 | LastPaidDate |
UNIQUE_KEY of another case |
| 38 | DistinctPaidDates |
BUS_DAYS of another case |
| 39 | MultiplePaidDates |
FIRST_RECEIVEDDATE of another case |
| 40 | PaidDateList |
Remaining columns (pipe-delimited) of another case |
The \n (newline) before CLOSED in column 34
is the seam — the second case’s data is appended as a new line during
replication, and the column parser interprets it as additional
columns.
The replication process (Snowflake -> SQL Server) is encountering
newline characters within the data of column 33
(HCPCS_CODE_LIST in Snowflake). The replication likely uses
a text-based export (CSV/flat file) and the newline in the data is being
interpreted as a row delimiter. This causes:
CLOSED (a status from
another case) lands in column 34Alternatively, the replication may be doing a join or union that
combines rows from PIF_LIST_ALL with rows from another
table (like PIF_LIST_CLOSED), and the result has more
columns than the source.
rows_merged = 36 for case FL026174482, which only has 1
claim (max_count = 1) — the extra rows likely come from the
other caseFL022084990,
FL025036053) are closed cases that exist
in PIF_LIST_CLOSED but not in PIF_LIST_ALL as
their own rows — their data was absorbed into other rowsPIF_LIST_ALL’s own schema-- Snowflake: verify only 33 columns
DESCRIBE TABLE PROD_EDW.DATASCIENCE.PIF_LIST_ALL;
-- Snowflake: check if HCPCS_CODE_LIST contains status values
SELECT
COUNT(*) AS total_rows,
SUM(CASE WHEN HCPCS_CODE_LIST LIKE '%CLOSED%'
OR HCPCS_CODE_LIST LIKE '%OPEN%' THEN 1 ELSE 0 END) AS contains_status
FROM PROD_EDW.DATASCIENCE.PIF_LIST_ALL;
-- If contains_status = 0, Snowflake is clean and the issue is in replication
PROD_EDW.DATASCIENCE.PIF_LIST_ALL to
Alliance_RPT.DataScience.PIF_LIST_ALL?-- Snowflake: the affected case
SELECT * FROM PROD_EDW.DATASCIENCE.PIF_LIST_ALL WHERE CASENUMBER = 'FL026174482';
-- Snowflake: the contaminating closed case is NOT in PIF_LIST_ALL
SELECT * FROM PROD_EDW.DATASCIENCE.PIF_LIST_ALL WHERE CASENUMBER = 'FL022084990';
-- returns 0 rows
-- Snowflake: but it IS in PIF_LIST_CLOSED
SELECT STATUS, CASENUMBER, CLAIMNUMBER
FROM PROD_EDW.DATASCIENCE.PIF_LIST_CLOSED
WHERE CASENUMBER = 'FL022084990';
-- returns CLOSED, FL022084990, 44108602
-- These match columns 34-36 of the SQL Server row exactly
rows_merged count may be inflated by rows from
unrelated casesAlliance_RPT), not Snowflake, so the Snowflake data being
clean does not help until replication is fixedPROD_EDW.DATASCIENCE.PIF_LIST_ALL —
33 columns| # | Column |
|---|---|
| 1 | STATUS |
| 2 | CASENUMBER |
| 3 | CLAIMNUMBER |
| 4 | UNIQUE_KEY |
| 5 | BUS_DAYS |
| 6 | FIRST_RECEIVEDDATE |
| 7 | LAST_RECEIVEDDATE |
| 8 | FIRST_CLOSEDDATE |
| 9 | LAST_CLOSEDDATE |
| 10 | ROWS_MERGED |
| 11 | MAX_COUNT |
| 12 | VENDORNAME_LIST |
| 13 | VENDORNUMBER_LIST |
| 14 | CONTRACTTYPE_LIST |
| 15 | CLAIMCATEGORYCODE_LIST |
| 16 | DETERMINATION_LIST |
| 17 | FINALCATEGORY_LIST |
| 18 | MAINCATEGORY_LIST |
| 19 | HAS_SUBCATEGORY_LIST |
| 20 | DIM_CPTPROCEDURE_KEY_LIST |
| 21 | DIM_CLAIMPROFILE_KEY_LIST |
| 22 | CLAIMTYPEDESC_LIST |
| 23 | CLAIMVISITTYPE_LIST |
| 24 | CLAIMPAIDSTATUS_LIST |
| 25 | CPTPROCEDURECODE_LIST |
| 26 | ICD10DIAGNOSISGROUPSIZE_LIST |
| 27 | HCPCS_KEY_LIST |
| 28 | HCPCS_DESC_LIST |
| 29 | HCPCS_CAT1_LIST |
| 30 | HCPCS_CAT2_LIST |
| 31 | ICD10_KEY_LIST |
| 32 | ICD10_DESC_LIST |
| 33 | ICD10_ROLLUP_LIST |
Alliance_RPT.DataScience.PIF_LIST_ALL — 40 columnsColumns 1-33 match Snowflake (but lowercase). Columns 34-40 are phantom columns not in the source:
| # | Column | Notes |
|---|---|---|
| 1-33 | (same as Snowflake, lowercase) | Legitimate data |
| 34 | HCPCS_CODE_LIST | Phantom — contains STATUS of another case |
| 35 | REVENUECODE_LIST | Phantom — contains CASENUMBER of another case |
| 36 | FirstPaidDate | Phantom — contains CLAIMNUMBER of another case |
| 37 | LastPaidDate | Phantom — contains UNIQUE_KEY of another case |
| 38 | DistinctPaidDates | Phantom — contains BUS_DAYS of another case |
| 39 | MultiplePaidDates | Phantom — contains FIRST_RECEIVEDDATE of another case |
| 40 | PaidDateList | Phantom — contains remaining columns of another case (pipe-delimited) |