Authors

Hao Chen

Type

Text | Dissertation

Advisor

Zhu, Wei | Wu, Song | Kuan, Pei Fen | Xiao, Keli.

Date

2015-12-01

Keywords

Artificial Neural Networks, Genome-wide Association Study, High Frequency Trading | Statistics

Department

Department of Applied Mathematics and Statistics.

Language

en_US

Source

This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of degree.

Identifier

http://hdl.handle.net/11401/77543

Publisher

The Graduate School, Stony Brook University: Stony Brook, NY.

Format

application/pdf

Abstract

The artificial neural network is a powerful model that has been widely applied in many areas. It is essentially a nonlinear statistical model that has shown good prediction accuracy empirically and has been applied to both regression and classification problems. One challenge in applying artificial neural network models is constructing a structure suited to the specific problem. This thesis introduces a novel double-layered feed-forward neural network (DNN) model with special link patterns and explores its applications to genome-wide association studies and to stock price prediction at high-frequency time scales. In traditional genome-wide association studies (GWAS), gene-gene interactions are mostly detected at the SNP level, as SNP-SNP interactions, which ignores the large amount of correlation among nearby SNPs. Popular existing methods that work this way, such as multifactor dimensionality reduction (MDR) and random forests, usually suffer from redundant interaction tests because of the correlations between SNPs, and consequently from reduced power. Our new DNN model takes advantage of the correlations between SNPs and performs interaction tests at the level of SNP blocks. Extensive simulation studies have been conducted to compare the new method with random forests. The simulation results suggest that the DNN model can have higher power than random forests in detecting the existence of causal SNPs, whether the effect is interactive or marginal. We have also applied the DNN model to financial markets, forecasting high-frequency changes in stock prices. One advantage of our DNN model is that it uses correlation information between different stocks, a pattern more commonly observed in high-frequency data but ignored by most existing methods. Our method has been tested on the 100 stocks with the largest capitalization in the S&P 500 using 5-minute data, and its performance has been benchmarked against a single-layer neural network model and the classical ARMA-GARCH model. The DNN model clearly outperforms the other models in terms of prediction accuracy and Sharpe ratio. Given the parallelizable scheme of our method, DNN models may be suitable for designing profitable trading strategies at high-frequency time scales. | 90 pages
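
The abstract names the Sharpe ratio as one of the measures used to compare models. As a point of reference only, the following is a minimal sketch of how a per-period Sharpe ratio is commonly computed from a series of returns. The function name `sharpe_ratio`, the default risk-free rate of zero, the absence of annualization, and the sample returns are all assumptions made for illustration; the record does not state which convention the dissertation uses.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample standard deviation.

    Assumptions (not taken from the dissertation): the risk-free rate is
    expressed per period, and no annualization factor is applied.
    """
    excess = [r - risk_free_rate for r in returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns")
    spread = stdev(excess)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("returns have zero variance")
    return mean(excess) / spread


# Hypothetical 5-minute strategy returns, made up for illustration.
example_returns = [0.001, -0.002, 0.003, 0.0005, -0.0005]
ratio = sharpe_ratio(example_returns)
```

A higher value means more average return per unit of return variability, which is why the ratio can be used alongside prediction accuracy when ranking forecasting models.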
