Commit 40271ea

published Lasso, Elastic-Net and Ridge Regression paper
1 parent a412d8e commit 40271ea

4,465 files changed: +105,884 -3,723 lines changed

README.html (+63 -3)
@@ -103,10 +103,71 @@
 <div class="container-fluid main-container">
 
 <!-- tabsets -->
+
+<style type="text/css">
+.tabset-dropdown > .nav-tabs {
+  display: inline-table;
+  max-height: 500px;
+  min-height: 44px;
+  overflow-y: auto;
+  background: white;
+  border: 1px solid #ddd;
+  border-radius: 4px;
+}
+
+.tabset-dropdown > .nav-tabs > li.active:before {
+  content: "";
+  font-family: 'Glyphicons Halflings';
+  display: inline-block;
+  padding: 10px;
+  border-right: 1px solid #ddd;
+}
+
+.tabset-dropdown > .nav-tabs.nav-tabs-open > li.active:before {
+  content: "";
+  border: none;
+}
+
+.tabset-dropdown > .nav-tabs.nav-tabs-open:before {
+  content: "";
+  font-family: 'Glyphicons Halflings';
+  display: inline-block;
+  padding: 10px;
+  border-right: 1px solid #ddd;
+}
+
+.tabset-dropdown > .nav-tabs > li.active {
+  display: block;
+}
+
+.tabset-dropdown > .nav-tabs > li > a,
+.tabset-dropdown > .nav-tabs > li > a:focus,
+.tabset-dropdown > .nav-tabs > li > a:hover {
+  border: none;
+  display: inline-block;
+  border-radius: 4px;
+}
+
+.tabset-dropdown > .nav-tabs.nav-tabs-open > li {
+  display: block;
+  float: none;
+}
+
+.tabset-dropdown > .nav-tabs > li {
+  display: none;
+}
+</style>
+
 <script>
 $(document).ready(function () {
   window.buildTabsets("TOC");
 });
+
+$(document).ready(function () {
+  $('.tabset-dropdown > .nav-tabs > li').click(function () {
+    $(this).parent().toggleClass('nav-tabs-open')
+  });
+});
 </script>
 
 <!-- code folding -->
@@ -115,7 +176,6 @@
 
 
 
-
 <div class="fluid-row" id="header">
 
 
@@ -184,11 +244,11 @@ <h3>1.1) Answer</h3>
 <li><s>GO-GARCH</s></li>
 <li><s>Copula-GARCH</s></li>
 </ul>
-<p>In order to started the high-frequency-trading statistical modelling, I inspect the dataset via <a href="http://rpubs.com/englianhu/handle-missing-value">binary.com面试试题 I - 单变量数据缺失值管理</a> and also <a href="http://rpubs.com/englianhu/handle-multivariate-missing-value">binary.com 面试试题 I - 多变量数据缺失值管理 II</a> but the univariate modelling caused some statistical error. The papers compares multi-methods like <code>interpolatan</code>, <code>kalman</code>, <code>locf</code> and <code>ma</code>. The <a href="http://rpubs.com/englianhu/binary-Q1Inter-HFT">binary.com Interview Question I - Interday High Frequency Trading Models Comparison</a> compares SARIMA, mcsGARCH, <s>midasr, midas-garch, Levy process</s> models.</p>
+<p>In order to started the high-frequency-trading statistical modelling, I inspect the dataset via <a href="http://rpubs.com/englianhu/handle-missing-value">binary.com面试试题 I - 单变量数据缺失值管理</a> and also <a href="http://rpubs.com/englianhu/handle-multivariate-missing-value">binary.com 面试试题 I - 多变量数据缺失值管理 II</a> but the univariate modelling caused some statistical error. The papers compares multi-methods like <code>interpolatan</code>, <code>kalman</code>, <code>locf</code> and <code>ma</code>. The <a href="http://rpubs.com/englianhu/binary-Q1Inter-HFT">binary.com Interview Question I - Interday High Frequency Trading Models Comparison</a> compares ts, msts, SARIMA, mcsGARCH, <s>midasr, midas-garch, Levy process</s> models.</p>
 </div>
 <div id="blooper" class="section level3">
 <h3>1.2) <span style="color:red">Blooper</span></h3>
-<p>Initially, I wrote a shiny app (as showing in below gif file) but it is heavily budden for loading. Kindly browse over <a href="https://beta.rstudioconnect.com/content/2367/">ShinyApp</a> which contain the questions and answers of 3 questions. For the staking model, I simply forecast the highest and lowest price, and then :</p>
+<p>Initially, I wrote a shiny app (as showing in below gif file) but it is heavily budden for loading. Kindly browse over <a href="https://beta.rstudioconnect.com/content/2367/">ShinyApp</a> (Kindly refer to <a href="http://rpubs.com/englianhu/binary-Q1L-EN-R">binary.com Interview Question I - Lasso, Elastic-Net and Ridge Regression</a> for more information) which contain the questions and answers of 3 questions. For the staking model, I simply forecast the highest and lowest price, and then :</p>
 <ul>
 <li>Kelly criterion and using highest or lowest price for closing transaction, otherwise using closing price if the forecasted lowest/highest price is not occur.</li>
 <li>Placed $100 an each of the forecasted variance value and do the settlement based on the real variance value.</li>

README.md (+2 -2)
@@ -59,11 +59,11 @@ Besides, I wrote a shinyApp which display the real-time price through API. Kindl
 - <s>GO-GARCH</s>
 - <s>Copula-GARCH</s>
 
-In order to started the high-frequency-trading statistical modelling, I inspect the dataset via [binary.com面试试题 I - 单变量数据缺失值管理](http://rpubs.com/englianhu/handle-missing-value) and also [binary.com 面试试题 I - 多变量数据缺失值管理 II](http://rpubs.com/englianhu/handle-multivariate-missing-value) but the univariate modelling caused some statistical error. The papers compares multi-methods like `interpolatan`, `kalman`, `locf` and `ma`. The [binary.com Interview Question I - Interday High Frequency Trading Models Comparison](http://rpubs.com/englianhu/binary-Q1Inter-HFT) compares SARIMA, mcsGARCH, <s>midasr, midas-garch, Levy process</s> models.
+In order to started the high-frequency-trading statistical modelling, I inspect the dataset via [binary.com面试试题 I - 单变量数据缺失值管理](http://rpubs.com/englianhu/handle-missing-value) and also [binary.com 面试试题 I - 多变量数据缺失值管理 II](http://rpubs.com/englianhu/handle-multivariate-missing-value) but the univariate modelling caused some statistical error. The papers compares multi-methods like `interpolatan`, `kalman`, `locf` and `ma`. The [binary.com Interview Question I - Interday High Frequency Trading Models Comparison](http://rpubs.com/englianhu/binary-Q1Inter-HFT) compares ts, msts, SARIMA, mcsGARCH, <s>midasr, midas-garch, Levy process</s> models.
 
 ### 1.2) <span style='color:red'>Blooper</span>
 
-Initially, I wrote a shiny app (as showing in below gif file) but it is heavily budden for loading. Kindly browse over [ShinyApp](https://beta.rstudioconnect.com/content/2367/) which contain the questions and answers of 3 questions. For the staking model, I simply forecast the highest and lowest price, and then :
+Initially, I wrote a shiny app (as showing in below gif file) but it is heavily budden for loading. Kindly browse over [ShinyApp](https://beta.rstudioconnect.com/content/2367/) (Kindly refer to [binary.com Interview Question I - Lasso, Elastic-Net and Ridge Regression](http://rpubs.com/englianhu/binary-Q1L-EN-R) for more information) which contain the questions and answers of 3 questions. For the staking model, I simply forecast the highest and lowest price, and then :
 
 - Kelly criterion and using highest or lowest price for closing transaction, otherwise using closing price if the forecasted lowest/highest price is not occur.
 - Placed $100 an each of the forecasted variance value and do the settlement based on the real variance value.
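
Note on the staking rule in the README text above: it names the Kelly criterion without stating the formula. A minimal sketch in R follows, assuming a simple binary bet with win probability `p` and net odds `b` (profit per unit staked). The function name `kelly_fraction`, the inputs, and the bankroll are hypothetical illustrations and do not come from the repository; the author's own staking model may differ.

```r
# Kelly fraction for a binary bet: f* = (b * p - q) / b, with q = 1 - p.
# `kelly_fraction` is a hypothetical helper, not part of the repository.
kelly_fraction <- function(p, b) {
  stopifnot(p >= 0, p <= 1, b > 0)
  q <- 1 - p
  f <- (b * p - q) / b
  max(f, 0)  # stake nothing when the expected edge is negative
}

# Assumed inputs for illustration: 55% win probability at even odds.
stake_share <- kelly_fraction(p = 0.55, b = 1)  # roughly 0.10
bankroll <- 1000                                # assumed bankroll
stake <- bankroll * stake_share                 # roughly 100
```

Flooring the fraction at zero means no position is opened when the forecast gives no edge, which is one common way to apply the rule.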

binary-Q1BET.Rmd (+40 -53)
@@ -11,9 +11,10 @@ output:
     toc_float:
       collapsed: yes
       smooth_scroll: yes
+    code_folding: hide
 ---
 
-```{r setup, include = FALSE}
+```{r setup, include=FALSE}
 suppressPackageStartupMessages(library('BBmisc'))
 pkgs <- c('knitr', 'kableExtra', 'tint', 'devtools', 'lubridate', 'plyr', 'stringr', 'magrittr', 'dplyr', 'tidyr', 'tidyverse', 'tidyquant', 'turner', 'readr', 'quantmod', 'htmltools', 'highcharter', 'googleVis', 'formattable', 'ggfortify', 'DT', 'forecast', 'PerformanceAnalytics', 'broom', 'microbenchmark', 'doParallel', 'Boruta', 'fBasics', 'fPortfolio', 'rugarch', 'parma', 'rmgarch')
 
@@ -53,7 +54,7 @@ rm(pkgs)
 
 In order to test the timeline of daily highest and lowest price, here I created this file to read the high volume tick-data-history to test the efficiency of Kelly Criterion betting models. Kindly refer to [Reference] for further information.
 
-*binary.com Interview Question I - Tick-Data-HiLo For Daily Trading (Blooper)* descript that the VaR figure required in order to place orders. [What is the difference between Sharpe ratio and value at risk?](https://www.quora.com/What-is-the-difference-between-Sharpe-ratio-and-value-at-risk) states the difference between VaR and shape ratio where the shape ratio will be use in the future research.
+<span style='color:goldenrod'>*binary.com Interview Question I - Tick-Data-HiLo For Daily Trading (Blooper)*</span> descript that the VaR figure required in order to place orders. [What is the difference between Sharpe ratio and value at risk?](https://www.quora.com/What-is-the-difference-between-Sharpe-ratio-and-value-at-risk) states the difference between VaR and shape ratio where the shape ratio will be use in the future research.
 
 [ARIMA+GARCH Trading Strategy on the S&P500 Stock Market Index Using R](https://www.quantstart.com/articles/ARIMA-GARCH-Trading-Strategy-on-the-SP500-Stock-Market-Index-Using-R) compares the ROI of buy and hold and application of ARIMA + GARCH model.
 
@@ -66,7 +67,7 @@ In order to test the timeline of daily highest and lowest price, here I created
 
 ## Intro Reference
 
-*Currency Hedging Strategies Using Dynamic Multivariate GARCH* compares DCC, BEKK, CCC and VARMA-AGARCH models to examine the conditional volatilities among the spot and two distint futures maturities, namely near-month and next-to-near-month contracts. The estimated conditionl covariances matrices from these models were used to calculate the optimal portfolios weights and optimal hedge ratios. The empirical results in the paper reveal that there are not big differences either the near-month or next-to-near-month contract is used for hedge spot position on currencies. They also reveal that hedging ratios are lower for near-month contract when the USD/EUR and USD/JPY exchange rates are anlyzed. This result is explained in terms of the higher correlation between spot prices and the next-to-near-month future prices than that with near-month contract and additionally because of the lower volatility of the long maturity futures. Finally across all currencies and error densities, the CCC and VARMA-AGARCH models provide similar results in terms of hedging ratios, portfolio variance reduction and hedging effectiveness. Some difference might appear when the DCC and BEKK models are used. Below is the table summary of the paper.
+<span style='color:goldenrod'>*Currency Hedging Strategies Using Dynamic Multivariate GARCH*</span> compares DCC, BEKK, CCC and VARMA-AGARCH models to examine the conditional volatilities among the spot and two distint futures maturities, namely near-month and next-to-near-month contracts. The estimated conditionl covariances matrices from these models were used to calculate the optimal portfolios weights and optimal hedge ratios. The empirical results in the paper reveal that there are not big differences either the near-month or next-to-near-month contract is used for hedge spot position on currencies. They also reveal that hedging ratios are lower for near-month contract when the USD/EUR and USD/JPY exchange rates are anlyzed. This result is explained in terms of the higher correlation between spot prices and the next-to-near-month future prices than that with near-month contract and additionally because of the lower volatility of the long maturity futures. Finally across all currencies and error densities, the CCC and VARMA-AGARCH models provide similar results in terms of hedging ratios, portfolio variance reduction and hedging effectiveness. Some difference might appear when the DCC and BEKK models are used. Below is the table summary of the paper.
 
 ![](www/hedge-strategy-01A.jpg)
 
@@ -94,7 +95,7 @@ The correlations of the dynamic patterns in Tables 8A-8C are given in Tables 9A-
 
 In summary, the estimates based on both OHR and optimal weight values recommend holding more FUT2 than FUT1 contracts for USD/EUR and USD/JPY spot/futures portfolios, meaning that we should increase the percentage of futures contracts for longer term portfolios when these currencies are used.
 
-*Dynamic Portfolio Optimization using Generalized Dynamic Conditional Heteroskedastic Factor Models* compares . The paper studies the portfolio selection problem based on a generalized dynamic factor model (GDFM) with conditional heteroskedasticity in the idiosyncratic components. We propose a Generalized Smooth Transition Conditional Correlation (GSTCC) model for the idiosyncratic components combined with the GDFM. Among all the multivariate GARCH models that the authors propose, the generalized smooth transition conditional correlation provides the best result.
+<span style='color:goldenrod'>*Dynamic Portfolio Optimization using Generalized Dynamic Conditional Heteroskedastic Factor Models*</span> studies the portfolio selection problem based on a generalized dynamic factor model (GDFM) with conditional heteroskedasticity in the idiosyncratic components. We propose a Generalized Smooth Transition Conditional Correlation (GSTCC) model for the idiosyncratic components combined with the GDFM. Among all the multivariate GARCH models that the authors propose, the generalized smooth transition conditional correlation provides the best result.
 
 ![](www/ROI-DPO.jpg)
 
@@ -110,7 +111,7 @@ I try to surf over internet and the model has no yet widely use. Here I can only
 
 ## VaR
 
-I stored the forecast VaR value as well, kindly refer to *How Good Are Your VaR Estimates?*^ for more information.
+I stored the forecast VaR value as well, kindly refer to <span style='color:goldenrod'>*How Good Are Your VaR Estimates?*</span> for more information.
 
 - *ARMA(1,1)-GARCH(1,1)
 Estimation and forecast using rugarch 1.2-2*
@@ -166,7 +167,7 @@ Alternatively, using the rugarch package which defaults to standardized distribu
 
 I use more than 3 years data (from week 1 2015 until week 27 2018)^[You are feel feel to get the data via [FXCMTickData](https://github.com/fxcm/FXCMTickData)] for the question as experiment, 1st year data is burn-in data for statistical modelling and prediction purpose while following 2 years data for forecasting and staking. There have 52 trading weeks within a year.
 
-```{r read-data, echo = FALSE, eval = FALSE}
+```{r echo=FALSE, eval=FALSE}
 ## ================== eval = FALSE =============================
 ## Do not execute...
 ##
@@ -297,7 +298,7 @@ plot(forecast(fit))
 forecast(fit, h = 4)
 ```
 
-```{r read-data2, warning = FALSE}
+```{r warning=FALSE}
 ## get currency dataset online.
 ## http://stackoverflow.com/questions/24219694/get-symbols-quantmod-ohlc-currency-data
 #'@ getFX('USD/JPY', from = '2014-01-01', to = '2017-01-20')
@@ -308,6 +309,11 @@ forecast(fit, h = 4)
 #'@ names(USDJPY) <- str_replace_all(names(USDJPY), 'JPY=X', 'USDJPY')
 #'@ USDJPY <- xts(USDJPY[, -1], order.by = USDJPY$Date)
 
+cr_code <- c('AUDUSD=X', 'EURUSD=X', 'GBPUSD=X', 'CHF=X', 'CAD=X', 'CNY=X', 'JPY=X')
+
+names(cr_code) <- c('AUDUSD', 'EURUSD', 'GBPUSD', 'USDCHF', 'USDCAD', 'USDCNY', 'USDJPY')
+#'@ names(cr_code) <- c('USDAUD', 'USDEUR', 'USDGBP', 'USDCHF', 'USDCAD', 'USDCNY', 'USDJPY')
+
 #'@ saveRDS(USDJPY, './data/USDJPY.rds')
 USDJPY <- read_rds(path = './data/USDJPY.rds')
 mbase <- USDJPY
@@ -320,48 +326,19 @@ dateID <- dateID[dateID > dateID0]
 
 ```{r data-summary}
 dim(mbase)
-summary(mbase) %>% kable(width = 'auto')
+summary(mbase) %>%
+  tidy %>%
+  .[,-1] %>%
+  kable(caption = 'MSE of daily Opened and Closed Transaction Orders') %>%
+  kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive')) %>%
+  scroll_box(width = '100%', height = '400px')
 ```
 
 # Betting Strategy
 
+## Betting Model
 
 
-# Model Comparison
-
-```{r tidy-data1}
-##数据1
-fx1 <- llply(names(cr_code), function(x) {
-  fls <- list.files(paste0('data/fx/', x), pattern = '^pred1')
-  dfm <- ldply(fls, function(y) {
-    readRDS(paste0('data/fx/', x, '/', y))
-  }) %>% data.frame(Cat = 'pred1', .) %>% tbl_df
-  names(dfm)[4:5] <- c('Price', 'Price.T1')
-  dfm
-})
-names(fx1) <- names(cr_code)
-
-##数据2
-fx2 <- llply(names(cr_code), function(x) {
-  fls <- list.files(paste0('data/fx/', x), pattern = '^pred2')
-  dfm <- ldply(fls, function(y) {
-    readRDS(paste0('data/fx/', x, '/', y))
-  }) %>% data.frame(Cat = 'pred2', .) %>% tbl_df
-  names(dfm)[4:5] <- c('Price', 'Price.T1')
-  dfm
-})
-names(fx2) <- names(cr_code)
-
-## Merge and tidy dataset.
-fx1 %<>% ldply %>% tbl_df
-fx2 %<>% ldply %>% tbl_df
-fx <- suppressAll(
-  bind_rows(fx1, fx2) %>% arrange(Date) %>%
-    mutate(.id = factor(.id), Cat = factor(Cat), Price.T1 = lag(Price.T1, 56)) %>%
-    dplyr::filter(Date >= ymd('2013-01-01') & Date <= ymd('2017-08-30')))
-
-rm(fx1, fx2)
-```
 
 # Conclusion
 
@@ -383,23 +360,32 @@ It's useful to record some information about how your file was created.
 - R version (short form): `r getRversion()`
 - [**rmarkdown** package](https://github.com/rstudio/rmarkdown) version: `r packageVersion('rmarkdown')`
 - File version: 1.0.1
-- Author Profile: [®γσ, Eng Lian Hu](https://beta.rstudioconnect.com/content/3091/ryo-eng.html)
+- Author Profile: [®γσ, Eng Lian Hu](https://beta.rstudioconnect.com/content/4352/)
 - GitHub: [Source Code](https://github.com/englianhu/binary.com-interview-question)
 - Additional session information:
 
-```{r info, echo=FALSE, warning=FALSE, results='asis'}
-suppressMessages(require('dplyr', quietly = TRUE))
-suppressMessages(require('formattable', quietly = TRUE))
-
-sys1 <- devtools::session_info()$platform %>% unlist %>% data.frame(Category = names(.), session_info = .)
+```{r info, echo=FALSE, warning=FALSE, message=FALSE, results='asis'}
+sys1 <- session_info()$platform %>%
+  unlist %>%
+  data.frame(Category = names(.), session_info = .)
 rownames(sys1) <- NULL
 
-sys1 %<>% rbind(., data.frame(Category = 'Current time', session_info = paste(as.character(now('Asia/Tokyo')), 'JST')))
-
-sys2 <- data.frame(Sys.info()) %>% mutate(Category = rownames(.)) %>% .[2:1]
+sys2 <- data.frame(Sys.info()) %>%
+  mutate(Category = rownames(.)) %>%
+  .[2:1]
 names(sys2)[2] <- c('Sys.info')
 rownames(sys2) <- NULL
 
+if (nrow(sys1) == 7 & nrow(sys2) == 8) {
+  sys1 %<>% rbind(., data.frame(
+    Category = 'Current time',
+    session_info = paste(as.character(lubridate::now('Asia/Tokyo')), 'JST')))
+} else {
+  sys2 %<>% rbind(., data.frame(
+    Category = 'Current time',
+    Sys.info = paste(as.character(lubridate::now('Asia/Tokyo')), 'JST')))
+}
+
 cbind(sys1, sys2) %>%
   kable(caption = 'Additional session information:') %>%
   kable_styling(bootstrap_options = c('striped', 'hover', 'condensed', 'responsive'))
@@ -444,7 +430,8 @@ rm(sys1, sys2)
 33. [Comparison of Value at Risk Models and Forecasting Realized Volatility by using Intraday Data](https://github.com/scibrokes/real-time-fxcm/blob/master/reference/Comparison%20of%20Value%20at%20Risk%20Models%20and%20Forecasting%20Realized%20Volatility%20by%20using%20Intraday%20Data.pdf)
 34. [binary.com Interview Question I - Tick-Data-HiLo For Daily Trading (Blooper)](http://rpubs.com/englianhu/binary-Q1TD)
 35. [How Good Are Your VaR Estimates?](http://www.unstarched.net/2012/12/26/how-good-are-your-var-estimates/)
+36. [Kelly's Criterion in Portfolio Optimization - A Decoupled Problem](https://github.com/englianhu/binary.com-interview-question/blob/master/reference/Kelly's%20Criterion%20in%20Portfolio%20Optimization%20-%20A%20Decoupled%20Problem.pdf)
 
 ---
 
-**Powered by - Copyright® Intellectual Property Rights of <img src='www/oda-army2.jpg' width='24'> [Scibrokes®](http://www.scibrokes.com)個人の経営企業**
+<span style='color:RoyalBlue'>**Powered by - Copyright® Intellectual Property Rights of [<img src='www/scb_logo.jpg' width='64'>®](http://www.scibrokes.com)個人の経営企業**</span>

binary-Q1BET.html (+1,067): large diffs are not rendered by default.
