Proximity Analysis 2016

M. Fawcett - 11/22/2016 - Original

Introduction

This report is an analysis of patients admitted to a CCHS hospital during 2016 who had an ICD9 diagnosis code indicating opioid dependency or withdrawal effect.

(See ProximityAnalysis.Rmd for a similar analysis of calendar year 2015 patients.)

It provides a statistical analysis of the distance between each patient's home residence and the office locations of top opioid prescribers, to see whether there could be a relationship between those distances and the likelihood of an opioid-related complication or diagnosis during the patient's hospitalization. In other words, do patients who live closer to high prescribers have a greater likelihood of being coded with an opioid-related condition while in the hospital?

This analysis assumes that an MS SQL Server database named OWRA exists on server VI_SS_Research and that an ODBC connection object pointing to it has been set up on this workstation.

Opioid prescription and prescriber data came from the Medicare Part D data published on www.data.gov.

See: SQL Server VI_SS_Research. Database OWRA for SQL tables & stored procedures.
See: \\ccor_ds\common\OpiateWithdraw\preparation\analysis\R for R code & data.

Set up the workspace.

## Clean up workspace
rm(list = ls())

#Load libraries
if (!require("curl")) install.packages("curl")
require("curl") ## for coding Web addresses

if (!require("httr")) install.packages("httr")
require("httr")  ## for posting soap requests

if (!require("RODBC")) install.packages("RODBC")
require("RODBC")  ## for working with SQL Server

if (!require("sp")) install.packages("sp")
require("sp")  ## for distance between points calculations.

## Set random number seed for reproducibility purposes.
set.seed(1234)

## Set working directory for R code
workingdir = "\\\\ccor_ds\\common\\dbase\\OpiateWithdraw\\preparation\\analysis\\R\\Code"
setwd(workingdir)

## Directory containing the data files (Windows UNC path)
datapath <- "\\\\ccor_ds\\common\\dbase\\OpiateWithdraw\\preparation\\analysis\\R\\Data"

Build Dataset

This assumes that pre-work has been done to create a SQL Server table containing a population of patients and mean distances to top opioid prescribers.

## Build distance/outcome matrix.
## Create connection to database
ch <- odbcConnect("OWRA")

## create SQL query string to return distance/outcome
sql <- "SELECT [GeometricMeanDistance], CASE WHEN COALESCE(ADM, PRI, SEC) IS NULL THEN '0' ELSE '1' END AS Outcome FROM [study].[tblPopulationCY2016] WHERE [GeometricMeanDistance] > 0 "

## Execute query
dist.df <- as.data.frame(sqlQuery(ch, sql))


odbcClose(ch)
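
The pre-work that fills GeometricMeanDistance is not shown in this report. As a rough sketch only, the value for one patient could be a geometric mean of great-circle distances to several prescriber offices. The helper names, the haversine formula, the Earth radius and the coordinates below are all assumptions made for illustration; the actual stored procedure may work differently.

```r
## Hypothetical helper: great-circle (haversine) distance in miles
## between points given as latitude/longitude in decimal degrees.
haversine_miles <- function(lat1, lon1, lat2, lon2) {
  earth_radius_miles <- 3958.8  # assumed mean Earth radius
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_radius_miles * asin(pmin(1, sqrt(a)))
}

## Hypothetical helper: geometric mean of strictly positive values.
geometric_mean <- function(x) exp(mean(log(x)))

## Made-up coordinates, not taken from the study data.
patient <- c(lat = 39.74, lon = -75.55)
prescribers <- data.frame(lat = c(39.68, 39.16, 38.77),
                          lon = c(-75.75, -75.52, -75.14))

## Distance from the patient to each prescriber, then the geometric mean.
d <- haversine_miles(patient[["lat"]], patient[["lon"]],
                     prescribers$lat, prescribers$lon)
gmd <- geometric_mean(d)
print(round(d, 1))
print(round(gmd, 1))
```

A geometric mean is less affected by one very distant prescriber than an arithmetic mean, and it is only defined for positive distances, which is consistent with the query's filter on GeometricMeanDistance > 0.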

Analyze Results

All distances are in miles.

Frequency distribution of distances between patients and prescribers:
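
A frequency distribution like this can be tabulated and drawn with base R. The sketch below uses simulated data in a stand-in data frame (sim.df) with the same two columns as dist.df; the 5-mile bin width and the log-normal shape of the simulated distances are assumptions, not properties of the study data.

```r
## Simulated stand-in for dist.df (NOT the study data).
set.seed(1234)
sim.df <- data.frame(
  GeometricMeanDistance = rlnorm(500, meanlog = 2.3, sdlog = 0.5),
  Outcome = rbinom(500, 1, 0.2)
)

## 5-mile bins covering the full range of distances (assumed bin width).
upper <- ceiling(max(sim.df$GeometricMeanDistance) / 5) * 5
breaks <- seq(0, upper, by = 5)

## Frequency table of distances by bin.
freq <- table(cut(sim.df$GeometricMeanDistance, breaks = breaks))
print(freq)

## Histogram using the same bins.
h <- hist(sim.df$GeometricMeanDistance, breaks = breaks,
          main = "Distribution of Distances (simulated)",
          xlab = "Geometric mean distance (miles)")
```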

Below are QQ plots to assess normal distribution of distance data. The data does not appear to be normally distributed.

## Q-Q Plot with no opioid codes
qqnorm(subset(dist.df[, 1], dist.df[, 2] == 0), main = 'Q-Q Plot of Distances, No Opioid Code, CY2016')
qqline(subset(dist.df[, 1], dist.df[, 2] == 0))

## Q-Q Plot with opioid codes
qqnorm(subset(dist.df[, 1], dist.df[, 2] == 1), main = 'Q-Q Plot of Distances, At Least One Opioid Code, CY2016')
qqline(subset(dist.df[, 1], dist.df[, 2] == 1))

## Q-Q Plot of both distributions
qqplot(x = subset(dist.df[, 1], dist.df[, 2] == 0), y = subset(dist.df[, 1], dist.df[, 2] == 1), main = 'Q-Q Plot of Both Distributions, CY2016')

Mean GeometricMeanDistance to prescribers for patients who had at least one opioid condition code: 11.3332946

Mean GeometricMeanDistance to prescribers for patients who did not have an opioid condition code: 11.8007546
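
Group means like the two above can be computed in one call with tapply. This is a minimal sketch on a made-up toy data frame (toy.df), assuming the same column names that the SQL query returns; it is not the code that produced the figures above.

```r
## Toy data (NOT the study data): distances in miles and a 0/1 outcome flag.
toy.df <- data.frame(
  GeometricMeanDistance = c(10, 12, 14, 9, 11),
  Outcome = c(0, 0, 0, 1, 1)
)

## Mean distance within each outcome group.
group.means <- tapply(toy.df$GeometricMeanDistance, toy.df$Outcome, mean)
print(group.means)
```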

Hypothesis Test

Null Hypothesis: The mean of the geometric mean distances between patients and prescribers is the same for the two groups (with and without opioid diagnosis codes) in CY2016.

## 2 Sample t-test comparing overall mean distance between patient and prescribers for the two groups (with opioid diagnosis codes and without opioid diagnosis codes)
t.test(subset(dist.df [,1], dist.df [, 2] == 0), subset(dist.df [,1], dist.df [, 2] == 1))

## 
##  Welch Two Sample t-test
## 
## data:  subset(dist.df[, 1], dist.df[, 2] == 0) and subset(dist.df[, 1], dist.df[, 2] == 1)
## t = 1.4407, df = 261.07, p-value = 0.1509
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1714662  1.1063862
## sample estimates:
## mean of x mean of y 
##  11.80075  11.33329

With a p-value of 0.1509 for the difference between the means of the two groups, and a 95 percent confidence interval for that difference that includes 0, we cannot reject the null hypothesis that the mean geometric mean distance between patients and prescribers is the same for the two groups.
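
To make the reported numbers easier to check, the sketch below computes the Welch statistic, the Welch-Satterthwaite degrees of freedom and the two-sided p-value by hand and compares them with t.test. The helper name welch_by_hand and the simulated samples are assumptions for illustration; only the t and df values on the last line come from the output above.

```r
## Hypothetical helper: Welch two-sample t-test computed step by step.
welch_by_hand <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of mean(x)
  vy <- var(y) / length(y)  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  ## Welch-Satterthwaite approximation to the degrees of freedom.
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p <- 2 * pt(-abs(t_stat), df)  # two-sided p-value
  c(t = t_stat, df = df, p.value = p)
}

## Simulated samples (NOT the study data).
set.seed(1234)
x <- rlnorm(400, meanlog = 2.3, sdlog = 0.5)
y <- rlnorm(150, meanlog = 2.25, sdlog = 0.6)

by.hand <- welch_by_hand(x, y)
built.in <- t.test(x, y)
print(by.hand)
print(c(built.in$statistic, built.in$parameter, p.value = built.in$p.value))

## Two-sided p-value implied by the reported t and df;
## it should be close to the reported 0.1509.
reported.p <- 2 * pt(-1.4407, df = 261.07)
print(reported.p)
```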

However, the distances within the two samples do not appear to be normally distributed (see the Q-Q plots above), so this conclusion may not be valid.
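
One way to check whether the conclusion depends on the normality assumption is to repeat the comparison with a rank-based test, or with a t-test on log-transformed distances. Neither was run in this report. The sketch below only shows the calls on simulated, right-skewed data, so its output says nothing about the CY2016 patients.

```r
## Simulated right-skewed distances (NOT the study data).
set.seed(1234)
no.code <- rlnorm(400, meanlog = 2.3, sdlog = 0.5)    # stand-in: no opioid code
with.code <- rlnorm(150, meanlog = 2.3, sdlog = 0.5)  # stand-in: opioid code

## Wilcoxon rank-sum (Mann-Whitney) test: does not assume normality.
w <- wilcox.test(no.code, with.code)
print(w$p.value)

## Welch t-test on log distances: appropriate if distances are roughly
## log-normal; requires strictly positive distances.
lt <- t.test(log(no.code), log(with.code))
print(lt$p.value)
```

If both checks agreed with the t-test on the real data, the non-normality noted above would be less of a concern for the conclusion.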