Forex Part 02 Automated Trading Championship 2012

Background

The data for this R script comes from the MQL5.com web site. The site does not provide CSV files, so we have to scrape the
data from the web page using the XML package.
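
As a rough illustration of the scraping step, the sketch below parses an HTML table with `htmlParse()` and `readHTMLTable()` from the XML package. The inline HTML string, its column names and its values are hypothetical stand-ins for the real results page, and this is not the body of `forexAtcReadDfr()`, whose source is not shown here.

```r
library(XML)  # provides htmlParse() and readHTMLTable()

# Hypothetical stand-in for the results page (made-up rows, not real data).
# A real script would parse the downloaded page instead of this string.
html <- '<html><body><table>
  <tr><th>Login</th><th>Country</th><th>Profit</th></tr>
  <tr><td>trader_a</td><td>Country A</td><td>1 250.50</td></tr>
  <tr><td>trader_b</td><td>Country B</td><td>23 000.00</td></tr>
</table></body></html>'

# Parse the HTML text and extract every <table> as a data frame.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Keep the first (and here only) table.
scrapedDfr <- tables[[1]]
str(scrapedDfr)
```

The scraped columns arrive as text, which is why a coercion step is needed before any numeric analysis.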

Script

(1) Read the data.

(2) Part A: Which country has the most participants in ATC 2012?

(3) Part B: Which country has the best participants in ATC 2012?

(4) Part C: What is the correlation of deals, trades and profit?

source("C:/Users/denbrige/100 FxOption/103 FxOptionVerBack/080 Fx Git/R-source/PlusForex.R", 
    echo = FALSE)
## Loading required package: R.oo
## Loading required package: R.methodsS3
## R.methodsS3 v1.4.2 (2012-06-22) successfully loaded. See ?R.methodsS3 for
## help.
## R.oo v1.10.1 (2012-10-16) successfully loaded. See ?R.oo for help.
## Attaching package: 'R.oo'
## The following object(s) are masked from 'package:methods':
## 
## getClasses, getMethods
## The following object(s) are masked from 'package:base':
## 
## attach, detach, gc, load, save
## R.utils v1.16.2 (2012-09-12) successfully loaded. See ?R.utils for help.
## Attaching package: 'R.utils'
## The following object(s) are masked from 'package:utils':
## 
## timestamp
## The following object(s) are masked from 'package:base':
## 
## cat, commandArgs, getOption, inherits, isOpen, lapply, parse, warnings
library(psych)
library(RColorBrewer)
library(wordcloud)
## Loading required package: Rcpp
library(gclus)
## Loading required package: cluster
library(ltm)
## Loading required package: MASS
## Loading required package: msm
## Loading required package: mvtnorm
## Loading required package: polycor
## Loading required package: sfsmisc
## Attaching package: 'polycor'
## The following object(s) are masked from 'package:psych':
## 
## polyserial
## Attaching package: 'ltm'
## The following object(s) are masked from 'package:psych':
## 
## factor.scores

#
# |------------------------------------------------------------------------------------------|
# | M A I N   P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
# --- Init loading data
rawDfr <- forexAtcReadDfr()
# --- Count of rows of data
nrow(rawDfr)
## [1] 451

# --- Coerce character into numeric (remove embedded spaces first)
# Columns 5 to 10: deals, trades, pf, balance, profit, equity
for (i in 5:10) {
    rawDfr[, i] <- suppressWarnings(as.numeric(gsub(" ", "", rawDfr[, i])))
}
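
To show what the coercion does to individual values, here is a minimal sketch on a hypothetical character vector (the values are made up for illustration). Spaces are removed before conversion, and anything that still cannot be parsed becomes `NA`, which is why the warnings are suppressed.

```r
# Hypothetical raw values as they might look after scraping.
rawChr <- c("12 345.67", "980", "-1 050", "n/a")

# Remove embedded spaces, then convert to numeric.
cleanNum <- suppressWarnings(as.numeric(gsub(" ", "", rawChr)))
print(cleanNum)

# Entries that are not numeric become NA and can be counted.
naCount <- sum(is.na(cleanNum))
print(naCount)
```

Counting the `NA` values after coercion is a quick way to see how many rows could not be converted.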

Part A: Which country has the most participants in ATC 2012?

The country that has the most participants in ATC 2012 is the Russian Federation with 156 participants.
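
The word cloud shows relative frequencies visually; an exact count can be read from a frequency table. The sketch below uses a short hypothetical vector in place of `rawDfr$country`.

```r
# Hypothetical country column; the real one would be rawDfr$country.
country <- c("Russian Federation", "Ukraine", "Russian Federation",
             "China", "Ukraine", "Russian Federation")

# Tabulate participants per country and sort from most to fewest.
countTbl <- sort(table(country), decreasing = TRUE)
print(countTbl)

# Name and count of the country with the most participants.
topCountry <- names(countTbl)[1]
topCount <- as.integer(countTbl[1])
```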

#
# |------------------------------------------------------------------------------------------|
# | P A R T   A   P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
# --- Plot a wordcloud
if (length(unique(rawDfr$country)) > 1) {
    par(mfrow = c(1, 1), mar = c(2.1, 2.1, 2.1, 2.1))

    nameDfr <- rawDfr$country
    wordcloud(gsub(" ", ".", rawDfr$country), colors = brewer.pal(6, "Set2"), 
        random.order = FALSE)
}
## Loading required package: tm

Figure: word cloud of participant countries

Part B: Which country has the best participants in ATC 2012?

Measured by median profit, the country with the best participants in ATC 2012 is South Africa, with a median profit of about $23,000.

However, the top individual participant (JPAlonso) is from the USA, with a profit of about $90,000.
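
The two rankings can disagree because the median summarises a country's typical participant while the maximum picks out a single outlier. The sketch below illustrates this with base R on hypothetical data. It is an assumption about what a helper such as `ForexBoxplotFtr()` might compute, since that function's source is not shown.

```r
# Hypothetical profits for three countries (made-up values).
profitDfr <- data.frame(
    name  = c("A", "A", "A", "B", "B", "C", "C", "C"),
    value = c(-500, 200, 900, 3000, 5000, -2000, 100, 40000)
)

# Summarise profit per country by median and by maximum.
medianNum <- tapply(profitDfr$value, profitDfr$name, median)
maxNum <- tapply(profitDfr$value, profitDfr$name, max)

# The best country depends on which summary is used.
bestMedian <- names(which.max(medianNum))
bestMax <- names(which.max(maxNum))

# Reorder the factor levels by median so a boxplot would be drawn in rank order.
profitDfr$name <- reorder(factor(profitDfr$name), profitDfr$value, FUN = median)
levels(profitDfr$name)
```

With the levels reordered, `boxplot(value ~ name, data = profitDfr)` would draw the countries from lowest to highest median.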

#
# |------------------------------------------------------------------------------------------|
# | P A R T   B   P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
# --- Plot a boxplot
boxDfr <- data.frame(name = rawDfr$country, value = rawDfr$equity - 10000)

par(mfrow = c(1, 1), las = 2, mar = c(10.1, 5.1, 2.1, 2.1))
countryFtr <- ForexBoxplotFtr(boxDfr, FUN = median, main = "ATC 2012 Median Profit Ranking by Country")
abline(h = 0, col = "red")

Figure: boxplot, ATC 2012 Median Profit Ranking by Country


par(mfrow = c(1, 1), las = 2, mar = c(10.1, 5.1, 2.1, 2.1))
countryFtr <- ForexBoxplotFtr(boxDfr, FUN = max, main = "ATC 2012 Max Profit Ranking by Country")
abline(h = 0, col = "red")

Figure: boxplot, ATC 2012 Max Profit Ranking by Country

Part C: What is the correlation of deals, trades and profit?

The number of deals and the number of trades are strongly correlated (r = 0.908, p < 0.001), but neither is significantly correlated with profit (r = -0.037, p = 0.429 for deals; r = -0.050, p = 0.293 for trades).

#
# |------------------------------------------------------------------------------------------|
# | P A R T   C   P R O C E D U R E |
# |------------------------------------------------------------------------------------------|
# --- Scatterplot and Correlation Analysis (library gclus and ltm)
# Scatterplot
subDfr <- data.frame(deals = rawDfr$deals, trades = rawDfr$trades, profit = rawDfr$equity - 
    10000)
par(mfrow = c(1, 1), las = 1)
cpairs(subDfr, gap = 0.5, panel.colors = dmat.color(abs(cor(subDfr))), col = rgb(0, 
    0, 0, 0.1), main = "RAW Variables Ordered and Colored by Correlations (New)")

Figure: scatterplot matrix, RAW Variables Ordered and Colored by Correlations (New)


# --- Correlation matrix
cor(subDfr)
##           deals   trades   profit
## deals   1.00000  0.90824 -0.03733
## trades  0.90824  1.00000 -0.04963
## profit -0.03733 -0.04963  1.00000

# --- Perform correlation test for matrix (library ltm).
# The null hypothesis is that the correlation is zero (not correlated).
# If the p-value is less than the alpha level, the null hypothesis is
# rejected; here p < 0.05 is treated as correlated.
rcor.test(subDfr)
## 
##        deals  trades profit
## deals   *****  0.908 -0.037
## trades <0.001  ***** -0.050
## profit  0.429  0.293  *****
## 
## upper diagonal part contains correlation coefficient estimates 
## lower diagonal part contains corresponding p-values
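
For a single pair of variables, the same kind of test is available in base R as `cor.test()`. The sketch below runs it on simulated data; the variable names mirror the ones above, but the values are randomly generated and are not the championship data.

```r
set.seed(42)
n <- 200

# Simulated data: deals depend on trades, profit is independent noise.
trades <- rpois(n, lambda = 80)
deals <- 2 * trades + rpois(n, lambda = 5)
profit <- rnorm(n, mean = 0, sd = 5000)

# Pearson correlation tests; the null hypothesis is zero correlation.
dealsTradesTest <- cor.test(deals, trades)
tradesProfitTest <- cor.test(trades, profit)

# Estimate and p-value for each pair.
c(r = unname(dealsTradesTest$estimate), p = dealsTradesTest$p.value)
c(r = unname(tradesProfitTest$estimate), p = tradesProfitTest$p.value)
```

A p-value below the chosen alpha level (0.05 here) leads to rejecting the null hypothesis for that pair.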

# --- Simple Regression (unstandardized) Y = profit; X = trades;
raw1Lm <- lm(subDfr$profit ~ subDfr$trades)
summary(raw1Lm)
## 
## Call:
## lm(formula = subDfr$profit ~ subDfr$trades)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -9537  -3856   -231   1778  87297 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    -341.18     391.58   -0.87     0.38
## subDfr$trades    -4.31       4.09   -1.05     0.29
## 
## Residual standard error: 7380 on 449 degrees of freedom
## Multiple R-squared: 0.00246, Adjusted R-squared: 0.000242 
## F-statistic: 1.11 on 1 and 449 DF,  p-value: 0.293
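
In a simple regression with one predictor, R-squared equals the squared Pearson correlation, and the p-value of the slope equals the p-value of the correlation test. The figures above agree with this: (-0.04963)^2 is about 0.00246, and both p-values are 0.293. The sketch below checks the identity on simulated data (not the championship data) and uses the `data =` form of `lm()`, which is an equivalent way to write the model.

```r
set.seed(1)

# Simulated data with no built-in relationship between the two variables.
toyDfr <- data.frame(trades = rpois(100, lambda = 50))
toyDfr$profit <- rnorm(100, mean = 0, sd = 1000)

# Fit the simple regression and collect its summary.
toyLm <- lm(profit ~ trades, data = toyDfr)
toySum <- summary(toyLm)

# R-squared versus the squared correlation coefficient.
rSquared <- toySum$r.squared
corSquared <- cor(toyDfr$trades, toyDfr$profit)^2

# Slope p-value versus the correlation-test p-value.
slopeP <- toySum$coefficients["trades", "Pr(>|t|)"]
corP <- cor.test(toyDfr$trades, toyDfr$profit)$p.value

all.equal(rSquared, corSquared)
all.equal(slopeP, corP)

# Same arithmetic using the correlation reported in the matrix above.
reportedR2 <- round((-0.04963)^2, 5)
print(reportedR2)
```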

#
# |------------------------------------------------------------------------------------------|
# | E N D   O F   S C R I P T |
# |------------------------------------------------------------------------------------------|