Introduction: This project examines which WHO regions and which countries have the highest HIV death rates. After identifying those countries, I look at which variables have the greatest impact on HIV control. If I can reach a conclusion, it would help the nations with the highest HIV death rates focus on those variables to control HIV infection and death. For this research I downloaded data from the WHO (World Health Organization) and stored it in a MySQL database. My work is shown step by step below:
# Create a function that installs a package if it is missing and then loads it
# Source: http://stackoverflow.com/questions/9341635/check-for-installed-packages-before-running-install-packages
packages <- function(x){
x <- as.character(match.call()[[2]])
if (!require(x,character.only=TRUE)){
install.packages(pkgs=x,repos="http://cran.r-project.org")
require(x,character.only=TRUE)
}
}
#install required packages by using the created function.
packages("rmongodb")
## Loading required package: rmongodb
packages("rjson")
## Loading required package: rjson
packages("mongolite")
## Loading required package: mongolite
packages("RMongo")
## Loading required package: RMongo
## Loading required package: rJava
packages("RMySQL")
## Loading required package: RMySQL
## Loading required package: DBI
##
## Attaching package: 'DBI'
## The following objects are masked from 'package:RMongo':
##
## dbDisconnect, dbGetQuery
packages("DBI")
library(stringr)
library(RCurl)
## Loading required package: bitops
##
## Attaching package: 'RCurl'
## The following object is masked from 'package:rJava':
##
## clone
library(knitr)
library(rmongodb)
library(jsonlite)
##
## Attaching package: 'jsonlite'
## The following objects are masked from 'package:rjson':
##
## fromJSON, toJSON
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(sqldf)
## Loading required package: gsubfn
## Loading required package: proto
## Loading required package: RSQLite
##
## Attaching package: 'RSQLite'
## The following object is masked from 'package:RMySQL':
##
## isIdCurrent
## sqldf will default to using MySQL
library(tidyr)
##
## Attaching package: 'tidyr'
## The following object is masked from 'package:RCurl':
##
## complete
library(ggplot2)
library(rjson)
library(mongolite)
library(RMongo)
library(RMySQL)
library(DBI)
library(RMySQL)
HIV_db1 = dbConnect(MySQL(), user='root', password='abcd1234', host='localhost')
dbSendQuery(HIV_db1, 'CREATE SCHEMA HIV_Final4;')
## <MySQLResult:1197116016,0,0>
dbSendQuery(HIV_db1, 'use HIV_Final4;')
## <MySQLResult:135677488,0,1>
dbSendQuery(HIV_db1,'CREATE TABLE HIV_Death (
Region_Id int not null primary key,
Region_name varchar(25) NOT null,
Number_of_Death varchar(25)
);')
## <MySQLResult:111532240,0,2>
dbSendQuery(HIV_db1,'CREATE TABLE HIV_Infection (
Infection_Region_Id int not null primary key,
Adult_infection_Rate varchar(25) NOT null,
Number_of_Infection varchar(25)
);')
## <MySQLResult:3409800,0,3>
dbSendQuery(HIV_db1,'CREATE TABLE HIV_Living (
Living_Region_Id int not null primary key,
Number_of_Living varchar(25)
);')
## <MySQLResult:112149448,0,4>
dbSendQuery(HIV_db1,'CREATE TABLE Mother_Child_Transmission (
Transmission_Id int not null primary key,
Country_Name varchar (100),
Number_Pregnant_antiretrovirals_preventing varchar (10),
Number_HIV_Need_antiretrovirals varchar (10),
Percentage_HIV_Receive_antiretrovirals varchar (10)
);')
## <MySQLResult:1448831808,0,5>
dbSendQuery(HIV_db1,'CREATE TABLE HIV_Counselling_Receive (
Counselling_Id int not null primary key,
Country_Name varchar (100),
Number_Receive_Counselling varchar (25),
Number_Receive_Counselling_PerThousand varchar (25)
);')
## <MySQLResult:111459640,0,6>
dbSendQuery(HIV_db1,'CREATE TABLE HIV_Theraphy_Coverage (
Coverage_Id int not null primary key,
Country_Name varchar (200),
Percentage_Receive_Therapy varchar (255),
Number_Receive_Therapy varchar (255)
);')
## <MySQLResult:3409800,0,7>
dbSendQuery(HIV_db1,'CREATE TABLE Highest_HIV_death (
Country_Id int not null primary key,
Country_Name varchar (200),
total_death varchar (255),
total_population varchar (255),
Percentage_death varchar (255)
); ')
## <MySQLResult:1450402152,0,8>
I downloaded the HIV death, HIV infection, and people-living-with-HIV data from the World Health Organization database to my desktop and imported them into the MySQL tables HIV_Death, HIV_Infection, and HIV_Living, respectively. After inserting the values, I wrote a join statement in MySQL to combine the three tables. My query is as follows:
SELECT Region_Id, Region_name, Number_of_Death, Adult_infection_Rate, Number_of_Infection, Number_of_Living FROM HIV_Death AS d INNER JOIN HIV_Infection AS i ON d.Region_Id = i.Infection_Region_Id INNER JOIN HIV_Living AS l ON i.Infection_Region_Id = l.Living_Region_Id;
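The same three-table inner join can also be expressed directly in R. The following is a minimal sketch, not the method used in this report: the small data frames `hiv_death`, `hiv_infection`, and `hiv_living` are hypothetical stand-ins for the MySQL tables, filled with the first three rows of the output shown in this document.

```r
# Hypothetical stand-ins for the three MySQL tables (first three regions only).
hiv_death <- data.frame(
  Region_Id = 1:3,
  Region_name = c("Africa", "Americas", "South-East Asia"),
  Number_of_Death = c("800 000", "62 000", "130 000")
)
hiv_infection <- data.frame(
  Infection_Region_Id = 1:3,
  Adult_infection_Rate = c(2.6, 0.3, 0.2),
  Number_of_Infection = c("1 400 000", "150 000", "230 000")
)
hiv_living <- data.frame(
  Living_Region_Id = 1:3,
  Number_of_Living = c("25 500 000", "3 400 000", "3 500 000")
)

# merge() performs an inner join by default (all = FALSE).
# by.x / by.y name the key column in the left and right table.
joined <- merge(hiv_death, hiv_infection,
                by.x = "Region_Id", by.y = "Infection_Region_Id")
joined <- merge(joined, hiv_living,
                by.x = "Region_Id", by.y = "Living_Region_Id")
print(joined)
```

Because the key columns have different names in each table, `by.x` and `by.y` play the role of the `ON` clauses in the SQL query.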
I saved the result to my desktop as Result_Death_Infection_Living.CSV and read the file into R Markdown with the following read.csv statement.
# Read Result_Death_Infection_Living.CSV file
HIV_Data <- read.csv('C:/Users/sql_ent_svc/Desktop/Final_presentation_Data/Result_Death_Infection_Living.CSV')
head (HIV_Data)
## Region_Id Region_name Number_of_Death Adult_infection_Rate
## 1 1 Africa 800 000 2.6
## 2 2 Americas 62 000 0.3
## 3 3 South-East Asia 130 000 0.2
## 4 4 Europe 56 000 0.4
## 5 5 Eastern Mediterranean 15 000 0.1
## 6 6 Western Pacific 44 000 0.1
## Number_of_Infection Number_of_Living
## 1 1 400 000 25 500 000
## 2 150 000 3 400 000
## 3 230 000 3 500 000
## 4 170 000 2 500 000
## 5 42 000 330 000
## 6 95 000 1 400 000
# Clean the data by deleting the spaces between the digits.
HIV_Data <- HIV_Data %>% mutate_each(funs(gsub(" ", "", .)),Number_of_Death:Number_of_Living)
head (HIV_Data)
## Region_Id Region_name Number_of_Death Adult_infection_Rate
## 1 1 Africa 800000 2.6
## 2 2 Americas 62000 0.3
## 3 3 South-East Asia 130000 0.2
## 4 4 Europe 56000 0.4
## 5 5 Eastern Mediterranean 15000 0.1
## 6 6 Western Pacific 44000 0.1
## Number_of_Infection Number_of_Living
## 1 1400000 25500000
## 2 150000 3400000
## 3 230000 3500000
## 4 170000 2500000
## 5 42000 330000
## 6 95000 1400000
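Note that `gsub()` returns character strings, so the cleaned columns still need to be converted before any arithmetic. A minimal sketch of that step, using a hypothetical helper `to_number()` and three values from the table above:

```r
# Hypothetical helper: strip spaces and commas used as thousands separators,
# then convert the remaining digits to a numeric value.
to_number <- function(x) as.numeric(gsub("[ ,]", "", x))

deaths <- c("800 000", "62 000", "130 000")
deaths_num <- to_number(deaths)

print(deaths_num)
print(sum(deaths_num))  # arithmetic works once the values are numeric
```

The same helper also handles the comma-separated figures that appear in the country-level files later in this report.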
# Pie Chart with Percentages of Death due to HIV throughout the region
Death <- c(800000, 62000, 130000, 56000, 15000, 44000)
Region <- c("Africa", "Americas", "South-East Asia", "Europe", "Eastern Mediterranean", "Western Pacific")
pct <- round(Death / sum(Death) * 100)
Region <- paste(Region, pct) # add percentages to labels
Region <- paste(Region, "%", sep = "") # add % to labels
pie(Death,labels = Region, col=rainbow(length(Region)),
main="Pie Chart of Death due to HIV throughout the region")
From the pie chart we can see that Africa has the highest share of HIV deaths (72%) and South-East Asia has the second-highest share (12%).
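The percentages quoted here can be reproduced from the regional death counts. The sketch below keeps the regions and counts together in one data frame (the name `regions` is only illustrative) so that the labels cannot drift out of step with the values.

```r
# Regional HIV death counts as shown in the table earlier in this report.
regions <- data.frame(
  Region = c("Africa", "Americas", "South-East Asia", "Europe",
             "Eastern Mediterranean", "Western Pacific"),
  Death = c(800000, 62000, 130000, 56000, 15000, 44000)
)

# Share of all deaths, rounded to whole percentages.
regions$pct <- round(regions$Death / sum(regions$Death) * 100)

# Label of the form "Africa 72%".
regions$label <- paste0(regions$Region, " ", regions$pct, "%")

# Show regions from largest to smallest share.
print(regions[order(-regions$pct), c("Region", "pct")])

# Uncomment to draw the chart with these labels:
# pie(regions$Death, labels = regions$label, col = rainbow(nrow(regions)))
```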
# Pie Chart with Percentages of HIV Infection throughout the region
Infection <- c(1400000, 150000, 230000, 170000, 42000, 95000)
Region <- c("Africa", "Americas", "South-East Asia", "Europe", "Eastern Mediterranean", "Western Pacific")
pct <- round(Infection / sum(Infection) * 100)
Region <- paste(Region, pct) # add percentages to labels
Region <- paste(Region, "%", sep = "") # add % to labels
pie(Infection,labels = Region, col=rainbow(length(Region)),
main="Pie Chart of Percentages of HIV Infection throughout the region")
From the pie chart we can see that Africa has the highest share of HIV infections (67%) and South-East Asia has the second-highest share (11%).
#install.packages("reshape")
library(reshape)
##
## Attaching package: 'reshape'
## The following objects are masked from 'package:tidyr':
##
## expand, smiths
## The following object is masked from 'package:dplyr':
##
## rename
HIV_Death1 <- data.frame(Region=c('Africa','America','South East Asia','Europe','Eastern Mediterranean', 'Western Pacific'), Total_Death= c(800000, 62000, 130000, 56000, 15000, 44000), Infection=c(1400000, 150000, 230000, 170000, 42000, 95000) , Living_With_HIV=c(25500000, 3400000, 3500000, 2500000, 330000, 1400000 ))
HIV_Death2 <- melt(HIV_Death1, id.vars='Region')
ggplot(HIV_Death2, aes(variable, value, width=2.5)) +
geom_bar(aes(fill = Region), position = "dodge", stat="identity")
## Warning: position_dodge requires non-overlapping x intervals
From the bar plot above we can see that Africa has more HIV infections and more HIV deaths than any other region.
Next, to check which 20 countries have the most HIV deaths and how those deaths compare to their populations, I created the Highest_HIV_death table in MySQL. I downloaded the data from the WHO website, imported it into the MySQL database, and then imported that table into R.
# Read table Highest_Death_2015.CSV
HIV_Data_Highest_Death1 <- read.csv('C:/Users/sql_ent_svc/Desktop/Final_presentation_Data/Highest_Death_2015.CSV')
# Clean the data by removing the commas.
Highest_Death2 <- HIV_Data_Highest_Death1 %>% mutate_each(funs(gsub(",", "", .)), Death:total_population)
Highest_Death2
## Country Death total_population
## 1 Nigeria 239700 182201962
## 2 South Africa 235100 54490406
## 3 India 135500 1311050527
## 4 Tanzania 80000 52646521
## 5 Mozambique 76800 27597070
## 6 Uganda 63300 38407677
## 7 Kenya 57500 46050302
## 8 Ethiopia 47200 99390750
## 9 Malawi 45600 17215232
## 10 Zimbabwe 39500 15424303
## 11 Cameroon 34600 23058597
## 12 Congo, Democratic Republic of the 31700 4562646
## 13 Cote d'Ivoire 31200 22701556
## 14 Zambia 30300 15519000
## 15 Indonesia 26800 257563815
## 16 China 26000 1376048943
## 17 Thailand 20800 67959359
## 18 Ukraine 18100 42635097
## 19 United States 17000 320090857
## 20 Lesotho 15500 2122110
## Perchentage_death
## 1 0.131557310
## 2 0.431452098
## 3 0.010335223
## 4 0.151956860
## 5 0.278290413
## 6 0.164810801
## 7 0.124863459
## 8 0.047489329
## 9 0.264881705
## 10 0.256089368
## 11 0.150052495
## 12 0.694772288
## 13 0.137435513
## 14 0.195244539
## 15 0.010405188
## 16 0.001889468
## 17 0.030606528
## 18 0.042453287
## 19 0.005310992
## 20 0.730405116
From this Highest_Death2 table I will create a bar plot and a pie chart to see which countries have the most HIV deaths.
# Simple Bar Plot of Highest 20 countries of HIV Death
ggplot(Highest_Death2, aes(Country, Perchentage_death, width= .75)) +
geom_bar(aes(fill = Country), position = "dodge", stat="identity")
# According to the bar graph, Lesotho, Congo, South Africa, and Mozambique have the highest, second-highest, third-highest, and fourth-highest HIV deaths as a percentage of population, respectively.
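The percentage shown in the table is the number of deaths divided by the total population, times 100. The sketch below recomputes it for four of the countries listed above and sorts by the result. The data frame name `top` is illustrative, and the column is spelled `Percentage_death` here rather than as in the CSV.

```r
# Four rows taken from the Highest_Death2 table above.
top <- data.frame(
  Country = c("Nigeria", "South Africa", "India", "Lesotho"),
  Death = c(239700, 235100, 135500, 15500),
  total_population = c(182201962, 54490406, 1311050527, 2122110)
)

# Deaths as a percentage of the total population.
top$Percentage_death <- top$Death / top$total_population * 100

# Rank from highest to lowest percentage.
ranked <- top[order(top$Percentage_death, decreasing = TRUE), ]
print(ranked)
```

Ranking by percentage gives a different order than ranking by total deaths: Nigeria has the most deaths in absolute terms, while Lesotho has the highest deaths relative to its population.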
# Pie Chart of HIV Death throughout the Country
Death <- c(239700 , 235100, 135500, 80000, 76800, 63300, 57500, 47200, 45600,39500, 34600, 31700 , 31200 , 30300 , 26800 , 26000 , 20800, 18100, 17000, 15500)
Country <- c("Nigeria", "South Africa", "India", "Tanzania", "Mozambique", "Uganda", "Kenya", "Ethiopia", "Malawi", "Zimbabwe", "Cameroon", "Congo", "Cote d'Ivoire", "Zambia", "Indonesia", "China", "Thailand", "Ukraine", "USA", "Lesotho")
pie(Death,labels = Country, col=rainbow(length(Country)),
main="Pie Chart of HIV Death throughout the Country")
# According to the pie chart, Nigeria, South Africa, and India had the highest, second-highest, and third-highest total numbers of HIV deaths in 2015, respectively.
I downloaded the mother-to-child HIV transmission, HIV counselling, and HIV therapy coverage data from the World Health Organization database to my desktop and imported them into the MySQL tables Mother_Child_Transmission, HIV_Counselling_Receive, and HIV_Theraphy_Coverage, respectively. After inserting the values, I wrote a join statement in MySQL to combine the three tables. My query is as follows:
SELECT * FROM Mother_Child_Transmission AS m INNER JOIN HIV_Counselling_Receive AS c ON m.Country_Name = c.Country_Name INNER JOIN HIV_Theraphy_Coverage AS t ON c.Country_Name = t.Country_Name;
I saved the result to my desktop as Result_Transmission_Counselling_Theraphy1.CSV and read the file into R Markdown with the following read.csv statement.
HIV_Data_Counselling <- read.csv('C:/Users/sql_ent_svc/Desktop/Final_presentation_Data/Result_Transmission_Counselling_Theraphy1.CSV')
head (HIV_Data_Counselling)
## Country antiretro_preventing Need_antiretro
## 1 Afghanistan 2 500
## 2 Albania 1 No data
## 3 Algeria 112 No data
## 4 Angola 8 709 19 000
## 5 Antigua and Barbuda No data
## 6 Argentina 1 401 No data
## Receive_antiretro. Receive_Counselling Receive_Counselling_Pthous
## 1 1 359 435 21
## 2 No data No data No data
## 3 32 668 996 23
## 4 45 No data No data
## 5 No data No data No data
## 6 No data 535 562 17
## Receive_Therapy_Per Receive_Therapy
## 1 4 281
## 2 No data 354
## 3 57 6 020
## 4 25 76 666
## 5 No data No data
## 6 47 59 751
After inserting the values, I wrote a join statement in MySQL to combine the data behind Result_Transmission_Counselling_Theraphy1.CSV and Highest_Death_2015.CSV. My query is as follows:
SELECT * FROM Highest_HIV_death AS h INNER JOIN HIV_Counselling_Receive AS c ON h.Country_Name = c.Country_Name INNER JOIN HIV_Theraphy_Coverage AS t ON h.Country_Name = t.Country_Name INNER JOIN Mother_Child_Transmission AS m ON h.Country_Name = m.Country_Name;
I saved the result to my desktop as Result_Highest_Transmission_Counselling_Theraphy1.CSV and read the file into R Markdown with the following read.csv statement.
# Read the file
HIV_Data_HighestRisk <- read.csv('C:/Users/sql_ent_svc/Desktop/Final_presentation_Data/Result_Highest_Transmission_Counselling_Theraphy1.CSV', header = TRUE, stringsAsFactors = TRUE)
# Clean the data by removing commas and spaces.
HIV_Data_HighestRisk1 <- HIV_Data_HighestRisk %>%
mutate_each(funs(as.character(.)), total_death:total_population) %>%
mutate_each(funs(gsub(",", "", .)),total_death:total_population) %>%
mutate_each(funs(gsub(" ", "", .)),Receive_Counselling:Receive_antiretrovirals_per) %>%
mutate_each(funs(as.numeric(.)), total_death:total_population)
head (HIV_Data_HighestRisk1)
## Country_name total_death total_population Percentage_death
## 1 Cote d'Ivoire 31200 22701556 0.137435513
## 2 Cameroon 34600 23058597 0.150052495
## 3 China 26000 1376048943 0.001889468
## 4 Ethiopia 47200 99390750 0.047489329
## 5 India 135500 1311050527 0.010335223
## 6 Indonesia 26800 257563815 0.010405188
## Receive_Counselling Receive_Counselling_PerThousand Receive_Therapy_per
## 1 1685285 138 31
## 2 667770 51 22
## 3 2474891 55 34
## 4 8885851 159 50
## 5 24476271 27 21
## 6 1105944 6 8
## Receive_Therapy antiretrovirals_preventing Need_antiretrovirals
## 1 140710 17763 22000
## 2 145038 22297 34000
## 3 295358 3576 33000
## 4 362041 20149 28000
## 5 830707 10656 21000
## 6 50072 1368 14000
## Receive_antiretrovirals_per
## 1 80
## 2 66
## 3 23
## 4 73
## 5 27
## 6 10
HIV_Data_HighestRisk1
## Country_name total_death total_population Percentage_death
## 1 Cote d'Ivoire 31200 22701556 0.137435513
## 2 Cameroon 34600 23058597 0.150052495
## 3 China 26000 1376048943 0.001889468
## 4 Ethiopia 47200 99390750 0.047489329
## 5 India 135500 1311050527 0.010335223
## 6 Indonesia 26800 257563815 0.010405188
## 7 Kenya 57500 46050302 0.124863459
## 8 Lesotho 15500 2122110 0.730405116
## 9 Malawi 45600 17215232 0.264881705
## 10 Mozambique 76800 27597070 0.278290413
## 11 Nigeria 239700 182201962 0.131557310
## 12 South Africa 235100 54490406 0.431452098
## 13 Thailand 20800 67959359 0.030606528
## 14 Uganda 63300 38407677 0.164810801
## 15 Ukraine 18100 42635097 0.042453287
## 16 Zambia 30300 15519000 0.195244539
## 17 Zimbabwe 39500 15424303 0.256089368
## Receive_Counselling Receive_Counselling_PerThousand Receive_Therapy_per
## 1 1685285 138 31
## 2 667770 51 22
## 3 2474891 55 34
## 4 8885851 159 50
## 5 24476271 27 21
## 6 1105944 6 8
## 7 7769024 294 55
## 8 475714 354 35
## 9 1895058 205 50
## 10 5726580 395 42
## 11 5943493 60 22
## 12 5622360 151 45
## 13 1341041 24 61
## 14 8638851 429 50
## 15 2776548 27 21
## 16 2453242 305 57
## 17 1465298 165 51
## Receive_Therapy antiretrovirals_preventing Need_antiretrovirals
## 1 140710 17763 22000
## 2 145038 22297 34000
## 3 295358 3576 33000
## 4 362041 20149 28000
## 5 830707 10656 21000
## 6 50072 1368 14000
## 7 755226 50259 75000
## 8 111322 8065 11000
## 9 536527 38506 60000
## 10 646312 94883 100000
## 11 747382 60955 210000
## 12 3078570 263674 240000
## 13 271652 4587 4800
## 14 749308 112909 120000
## 15 216781 21671 21600
## 16 658831 55045 64000
## 17 787980 58667 75000
## Receive_antiretrovirals_per
## 1 80
## 2 66
## 3 23
## 4 73
## 5 27
## 6 10
## 7 67
## 8 72
## 9 64
## 10 91
## 11 29
## 12 95
## 13 95
## 14 92
## 15 64
## 16 86
## 17 78
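The joined table has 17 rows, while the top-20 list earlier in this report has 20. An inner join keeps only countries whose names match in every table, so it is worth checking which ones dropped out. The sketch below does that with `setdiff()`. The two name vectors are copied from the outputs above, and the reason for the mismatch (for example, a different spelling of the country name in another table, or missing data) is an assumption that would need to be checked against the source tables.

```r
# Country names from the top-20 table (Highest_Death2).
top20 <- c("Nigeria", "South Africa", "India", "Tanzania", "Mozambique",
           "Uganda", "Kenya", "Ethiopia", "Malawi", "Zimbabwe", "Cameroon",
           "Congo, Democratic Republic of the", "Cote d'Ivoire", "Zambia",
           "Indonesia", "China", "Thailand", "Ukraine", "United States",
           "Lesotho")

# Country names that survived the four-table inner join.
joined <- c("Cote d'Ivoire", "Cameroon", "China", "Ethiopia", "India",
            "Indonesia", "Kenya", "Lesotho", "Malawi", "Mozambique",
            "Nigeria", "South Africa", "Thailand", "Uganda", "Ukraine",
            "Zambia", "Zimbabwe")

# Countries present in the top-20 list but absent from the joined result.
dropped <- setdiff(top20, joined)
print(dropped)
```

The correlation and regression results that follow are therefore based on 17 of the 20 countries.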
# Convert the character columns to numeric for the statistical analysis.
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk1, Receive_Counselling = as.numeric(Receive_Counselling))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, Receive_Counselling_PerThousand = as.numeric(Receive_Counselling_PerThousand))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, Receive_Therapy_per = as.numeric(Receive_Therapy_per))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, Receive_Therapy = as.numeric(Receive_Therapy))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, antiretrovirals_preventing = as.numeric(antiretrovirals_preventing ))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, Need_antiretrovirals = as.numeric (Need_antiretrovirals ))
HIV_Data_HighestRisk2 <- transform(HIV_Data_HighestRisk2, Receive_antiretrovirals_per = as.numeric(Receive_antiretrovirals_per))
st = as.data.frame(HIV_Data_HighestRisk2)
str(st)
## 'data.frame': 17 obs. of 11 variables:
## $ Country_name : Factor w/ 17 levels "Cameroon","China",..: 3 1 2 4 5 6 7 8 9 10 ...
## $ total_death : num 31200 34600 26000 47200 135500 ...
## $ total_population : num 2.27e+07 2.31e+07 1.38e+09 9.94e+07 1.31e+09 ...
## $ Percentage_death : num 0.13744 0.15005 0.00189 0.04749 0.01034 ...
## $ Receive_Counselling : num 1685285 667770 2474891 8885851 24476271 ...
## $ Receive_Counselling_PerThousand: num 138 51 55 159 27 6 294 354 205 395 ...
## $ Receive_Therapy_per : num 31 22 34 50 21 8 55 35 50 42 ...
## $ Receive_Therapy : num 140710 145038 295358 362041 830707 ...
## $ antiretrovirals_preventing : num 17763 22297 3576 20149 10656 ...
## $ Need_antiretrovirals : num 22000 34000 33000 28000 21000 14000 75000 11000 60000 100000 ...
## $ Receive_antiretrovirals_per : num 80 66 23 73 27 10 67 72 64 91 ...
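The seven `transform()` calls above each convert one column. As a sketch of a more compact alternative (not what was run for this report), the same conversion can be applied to a vector of column names with `lapply()`. The two-row data frame `df` is a hypothetical example using values for Kenya and Lesotho from the table above.

```r
# Small example with character columns, as they look after gsub() cleaning.
df <- data.frame(
  Country_name = c("Kenya", "Lesotho"),
  Receive_Counselling = c("7769024", "475714"),
  Receive_Therapy_per = c("55", "35"),
  stringsAsFactors = FALSE
)

# Columns to convert; Country_name is left untouched.
num_cols <- c("Receive_Counselling", "Receive_Therapy_per")

# as.character() first, so the conversion is also safe for factor columns.
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(as.character(x)))

str(df)
```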
# Find the correlation coefficient between pairs of variables to see which variable has the most impact on HIV deaths.
x <- HIV_Data_HighestRisk2[2]
y <- HIV_Data_HighestRisk2[3]
cor(x, y)
## total_population
## total_death 0.1078012
# total_death and total_population have a weak positive correlation (0.1078012): countries with larger populations tend to have slightly more deaths.
x <- HIV_Data_HighestRisk2[2]
y <- HIV_Data_HighestRisk2[6]
cor(x, y)
## Receive_Counselling_PerThousand
## total_death -0.1057016
# total_death and Receive_Counselling_PerThousand have a weak negative correlation (-0.1057016): countries where more people per thousand receive counselling tend to have slightly fewer deaths.
x <- HIV_Data_HighestRisk2[2]
y <- HIV_Data_HighestRisk2[7]
cor(x, y)
## Receive_Therapy_per
## total_death -0.1372914
# total_death and Receive_Therapy_per have a weak negative correlation (-0.1372914): countries where a higher percentage of people receive therapy tend to have slightly fewer deaths.
x <- HIV_Data_HighestRisk2[2]
y <- HIV_Data_HighestRisk2[9]
cor(x, y)
## antiretrovirals_preventing
## total_death 0.6591505
# total_death and antiretrovirals_preventing have a fairly strong positive correlation (0.6591505), which suggests that countries with more HIV deaths need more preventive antiretroviral medicine.
x <- HIV_Data_HighestRisk2[2]
y <- HIV_Data_HighestRisk2[11]
cor(x, y)
## Receive_antiretrovirals_per
## total_death -0.1068921
# total_death and Receive_antiretrovirals_per have a weak negative correlation (-0.1068921): countries where a higher percentage of people receive antiretrovirals tend to have slightly fewer HIV deaths.
pairs(HIV_Data_HighestRisk2) # Scatterplot matrix for all pairs of variables.
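The pairwise correlations above were computed one call at a time. The following sketch computes the correlation of `total_death` with several variables in a single `cor()` call. The data frame `risk` is a hypothetical copy of the relevant columns, typed in from the 17-row table printed earlier, so any transcription difference from the original CSV would change the results.

```r
# Columns copied from the 17-row HIV_Data_HighestRisk1 output above.
risk <- data.frame(
  total_death = c(31200, 34600, 26000, 47200, 135500, 26800, 57500, 15500,
                  45600, 76800, 239700, 235100, 20800, 63300, 18100, 30300,
                  39500),
  total_population = c(22701556, 23058597, 1376048943, 99390750, 1311050527,
                       257563815, 46050302, 2122110, 17215232, 27597070,
                       182201962, 54490406, 67959359, 38407677, 42635097,
                       15519000, 15424303),
  Receive_Counselling_PerThousand = c(138, 51, 55, 159, 27, 6, 294, 354, 205,
                                      395, 60, 151, 24, 429, 27, 305, 165),
  Receive_Therapy_per = c(31, 22, 34, 50, 21, 8, 55, 35, 50, 42, 22, 45, 61,
                          50, 21, 57, 51),
  antiretrovirals_preventing = c(17763, 22297, 3576, 20149, 10656, 1368,
                                 50259, 8065, 38506, 94883, 60955, 263674,
                                 4587, 112909, 21671, 55045, 58667),
  Receive_antiretrovirals_per = c(80, 66, 23, 73, 27, 10, 67, 72, 64, 91, 29,
                                  95, 95, 92, 64, 86, 78)
)

# Correlate total_death with every other column at once.
others <- setdiff(names(risk), "total_death")
r <- cor(risk$total_death, risk[, others])

# Turn the 1 x 5 matrix into a named vector and sort by absolute strength.
r_vec <- r[1, ]
print(round(r_vec[order(-abs(r_vec))], 4))
```

Sorting by absolute value puts the most strongly associated variable first, regardless of the sign of the correlation.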
# Fit a regression model for HIV deaths using the variables that have the most impact on HIV deaths.
input <- st[,c("total_death","Receive_Counselling_PerThousand","Receive_Therapy_per","Receive_antiretrovirals_per")]
model <- lm( total_death ~ Receive_Counselling_PerThousand + Receive_Therapy_per + Receive_antiretrovirals_per, data=input)
# Show the model.
print(model)
##
## Call:
## lm(formula = total_death ~ Receive_Counselling_PerThousand +
## Receive_Therapy_per + Receive_antiretrovirals_per, data = input)
##
## Coefficients:
## (Intercept) Receive_Counselling_PerThousand
## 90440.873 -22.515
## Receive_Therapy_per Receive_antiretrovirals_per
## -515.991 7.225
# Get the Intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","\n")
## # # # # The Coefficient Values # # #
a <- coef(model)[1]
print(a)
## (Intercept)
## 90440.87
Counselling <- coef(model)[2]
Theraphy <- coef(model)[3]
Antiretroviral_receive <- coef(model)[4]
print(Counselling)
## Receive_Counselling_PerThousand
## -22.51496
print(Theraphy)
## Receive_Therapy_per
## -515.9909
print(Antiretroviral_receive)
## Receive_antiretrovirals_per
## 7.225037
# Regression model (HIV_death_Control_Model): total_death = 90440.87 + (-22.51496) * Receive_Counselling_PerThousand + (-515.9909) * Receive_Therapy_per + (7.225037) * Receive_antiretrovirals_per
Conclusion: First, I showed which regions have the most HIV deaths and infections. Then I showed which countries have the highest HIV death rates. Finally, using correlation and regression analysis, I looked at which variables have the most impact on HIV deaths. From this research, counselling, therapy, and receipt of antiretrovirals appear to have the most impact on HIV control, and the regression model and correlation analysis above are the basis for this conclusion.
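As a worked example of how the fitted equation is used, the sketch below plugs Kenya's values from the table above (294 counselled per thousand, 55% receiving therapy, 67% receiving antiretrovirals) into the printed coefficients. The vector names `coefs` and `kenya` are illustrative, and the coefficients are the rounded values printed above, so the result is approximate.

```r
# Coefficients as printed by the model above (rounded).
coefs <- c(intercept = 90440.87,
           counselling = -22.51496,
           therapy = -515.9909,
           antiretroviral = 7.225037)

# Kenya's predictor values from the joined table.
kenya <- c(counselling = 294, therapy = 55, antiretroviral = 67)

# Prediction = intercept + sum of (coefficient * value) over the predictors.
predicted <- coefs[["intercept"]] + sum(coefs[names(kenya)] * kenya)

# Residual = observed deaths (57500 for Kenya) minus predicted deaths.
residual <- 57500 - predicted

print(predicted)
print(residual)
```

The model predicts about 55,926 deaths for Kenya against 57,500 observed, a residual of roughly 1,574. With the fitted `model` object available, `predict(model, newdata = ...)` would give the same kind of estimate without typing the coefficients by hand.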