Stock Price Prediction Using Machine Learning: Techniques and Insights

Livnat Cohen
4 min read · Aug 6, 2024


Predicting stock prices accurately can provide a crucial advantage in today’s fast-paced financial markets. Machine learning has revolutionized this process by analyzing vast amounts of historical data to forecast market trends and optimize trading strategies. This article explores how machine learning techniques, such as LSTM and transformer models, decode complex patterns in stock data to predict future price movements.

Part 1: Exploring The Data

Predictive modeling in stock markets entails a structured approach to analyzing historical data, choosing suitable models, and assessing their performance to forecast future market behavior. This process merges statistical rigor with advanced machine learning techniques to generate actionable insights for investors and traders. Before delving into the technical sections, let’s briefly explore our data.

Specifically, I utilized the yfinance package (from Yahoo Finance) to obtain all the necessary data for my exploration and training.

Let's begin by examining our dataset. For this analysis, I chose Apple stock, denoted by the ticker symbol 'AAPL'.

First, I downloaded the data for the last 12 years. Let's look at some statistics about it:

Fig 1: Apple stock statistics over the past 12 years
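
The download and summary step can be sketched roughly as follows. This is a minimal sketch, not the exact code behind Fig 1: it assumes yfinance's `Ticker.history` method, and the helper names `download_history` and `summarize` as well as the date range are illustrative choices.

```python
import pandas as pd


def download_history(ticker: str, start: str, end: str) -> pd.DataFrame:
    """Fetch daily OHLCV rows for one ticker (requires network access)."""
    import yfinance as yf  # imported lazily so summarize() also works offline

    return yf.Ticker(ticker).history(start=start, end=end)


def summarize(prices: pd.DataFrame) -> pd.DataFrame:
    """Return count, mean, std, min, quartiles and max for the price columns."""
    wanted = ("Open", "High", "Low", "Close", "Volume")
    columns = [c for c in wanted if c in prices.columns]
    return prices[columns].describe()


# Usage (assumed window of roughly 12 years ending near the article date):
# aapl = download_history("AAPL", start="2012-08-01", end="2024-08-01")
# print(summarize(aapl))
```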

If we focus solely on the closing prices, we can observe the stock’s trend over the specified timeline:

Fig 2: close prices of the stock

A Simple Moving Average shows the same patterns:

Fig 3: Moving avg. closing prices
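
A simple moving average is the mean of the most recent closes over a fixed window. The sketch below uses pandas; the 50-day window is an assumption, since the window behind Fig 3 is not stated.

```python
import pandas as pd


def simple_moving_average(close: pd.Series, window: int = 50) -> pd.Series:
    """Mean of the last `window` closes; the first window - 1 values are NaN."""
    return close.rolling(window=window).mean()


# Usage, assuming `aapl` is a DataFrame with a "Close" column:
# sma_50 = simple_moving_average(aapl["Close"], window=50)
```

A longer window gives a smoother line that reacts more slowly to recent price changes.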

We’ll also examine the average prices in the monthly and weekly data:

Fig 4: monthly and weekly closing prices over the past 12 years
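
Weekly and monthly averages like these can be computed by resampling the daily series. The sketch below assumes the closing prices are a pandas Series with a `DatetimeIndex`; the helper name `average_close` is illustrative.

```python
import pandas as pd


def average_close(close: pd.Series, rule: str) -> pd.Series:
    """Average closing price per calendar bucket; needs a DatetimeIndex."""
    return close.resample(rule).mean()


# Usage, assuming `aapl` is a DataFrame with a "Close" column:
# weekly = average_close(aapl["Close"], "W")    # weeks ending on Sunday
# monthly = average_close(aapl["Close"], "MS")  # buckets labeled by month start
```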

Part 2: Exploring Relationships Between Stocks

Stocks within the same sector, such as technology, pharmaceuticals, or energy, tend to exhibit some degree of correlation with each other. A correlation matrix for major tech giants like Apple, Microsoft, Amazon, and Google demonstrates this clearly.

Fig 5: Correlation matrix of the tech companies
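
A correlation matrix of this kind can be sketched with pandas as below. The article does not say whether raw closing prices or daily returns were correlated, so the `use_returns` flag is an assumption that exposes both options; the function name is illustrative.

```python
import pandas as pd


def correlation_matrix(closes: pd.DataFrame, use_returns: bool = False) -> pd.DataFrame:
    """Pearson correlation between columns (one column of closes per ticker)."""
    if use_returns:
        # Daily percentage change; the first row has no previous day and is dropped.
        data = (closes / closes.shift(1) - 1.0).dropna()
    else:
        data = closes
    return data.corr()


# Usage, assuming `closes` has columns such as "AAPL", "MSFT", "AMZN", "GOOGL":
# print(correlation_matrix(closes))
```

Price levels that trend in the same direction over many years tend to produce higher correlations than their daily returns do, so the two settings can give quite different pictures.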

But what if we’re interested in examining influences beyond a specific sector? For instance, what if another company, not within the tech sector, has an impact on or is influenced by our stock?

Let’s explore this further.
I will analyze all S&P 500 stocks to identify a notable match that isn’t a tech company.
Here are the results:

Ticker
AAPL 1.000000
VRSK 0.885680
TYL 0.860245
ORCL 0.841546
FDX 0.802560
TTWO 0.799141
GOOGL 0.791950
GOOG 0.790318
MCO 0.788160
PANW 0.786191
MSFT 0.782999
SPGI 0.782517
VRTX 0.781841
WELL 0.779521
RCL 0.778220
AJG 0.774949
GEV 0.767042
MMC 0.760805
CCL 0.756409
FICO 0.753192
ISRG 0.751818

As you can see, there are stocks with strong correlations to Apple that are not part of the major tech companies. We can conduct further research on each of these stocks to identify the most promising ones to follow.
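
A ranking like the one above can be sketched with `DataFrame.corrwith`. This is a minimal sketch assuming `closes` holds one column of closing prices per S&P 500 ticker; the function name and the cutoff of 20 are illustrative, and filtering out tech companies would need sector metadata that is not shown here.

```python
import pandas as pd


def top_correlated(closes: pd.DataFrame, target: str = "AAPL", n: int = 20) -> pd.Series:
    """Rank tickers by Pearson correlation with `target`, highest first."""
    corr = closes.corrwith(closes[target])
    # n + 1 rows because the target correlates perfectly with itself.
    return corr.sort_values(ascending=False).head(n + 1)


# Usage:
# print(top_correlated(closes, target="AAPL", n=20))
```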

Part 3: Predictive Algorithm

In our predictive algorithm, we leveraged Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (RNN) well-suited for time series forecasting. Unlike traditional models, LSTMs can capture long-term dependencies and trends in sequential data, making them ideal for predicting stock prices based on historical data.

The LSTM model was designed with multiple layers to capture complex patterns in the data. Our architecture included:

  • Input Layer: Accepts sequences of historical stock prices.
  • LSTM Layers: Two stacked LSTM layers with 128 units each, designed to capture temporal dependencies.
  • Dense Layer: A fully connected layer with 25 units to transform the LSTM output.
  • Output Layer: Produces the predicted stock price.

The model was compiled with the Adam optimizer and mean squared error loss function.
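
The architecture described above could look roughly like this in Keras. This is a sketch under assumptions rather than the exact model used here: the article does not name the framework, and the window length, the dropout rate and placement, and the default (linear) activation on the 25-unit dense layer are my own illustrative choices.

```python
def build_lstm_model(window: int, n_features: int, dropout_rate: float = 0.2):
    """Two stacked 128-unit LSTMs, a 25-unit dense layer, and a 1-unit output."""
    from tensorflow import keras  # imported lazily; requires TensorFlow

    model = keras.Sequential([
        keras.Input(shape=(window, n_features)),
        # return_sequences=True hands the full sequence to the second LSTM.
        keras.layers.LSTM(128, return_sequences=True),
        keras.layers.Dropout(dropout_rate),  # assumed placement and rate
        keras.layers.LSTM(128),  # returns only the final hidden state
        keras.layers.Dropout(dropout_rate),
        keras.layers.Dense(25),
        keras.layers.Dense(1),  # predicted price for the next step
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


# Usage with a hypothetical 60-day window and a single feature:
# model = build_lstm_model(window=60, n_features=1)
# model.summary()
```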

To train our LSTM model, we used historical stock price data, including features like opening price, closing price, volume, and other technical indicators. The data was normalized to ensure consistency and split into training and test sets. We employed a sliding window approach to create sequences of data points for training the LSTM.
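
The normalization and sliding-window steps can be sketched with NumPy. The scaling method is not specified above, so min-max scaling fitted on the training rows only is an assumption here, as are the function names and the choice of predicting the next value of one target column.

```python
import numpy as np


def min_max_scale(train: np.ndarray, test: np.ndarray):
    """Scale to [0, 1] using the training range only, to avoid test-set leakage."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (train - lo) / span, (test - lo) / span


def make_windows(values: np.ndarray, window: int, target_col: int = 0):
    """Turn a (rows, features) array into LSTM inputs and next-step targets.

    X has shape (samples, window, features); y[i] is the target column on the
    row right after window i. Requires len(values) > window.
    """
    X = np.stack([values[i:i + window] for i in range(len(values) - window)])
    y = values[window:, target_col]
    return X, y


# Usage with a chronological split (no shuffling, since order matters):
# split = int(len(values) * 0.8)
# train, test = min_max_scale(values[:split], values[split:])
# X_train, y_train = make_windows(train, window=60)
```

Because the scaler only sees the training range, scaled test values can fall outside [0, 1] when prices reach new highs or lows.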

The LSTM network was trained to minimize the mean squared error between predicted and actual stock prices. We employed techniques such as dropout regularization to prevent overfitting and early stopping to halt training when performance on the validation set ceased to improve.
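
The early stopping rule can be illustrated independently of any framework. The sketch below shows the logic on a plain list of validation losses; the function name and the `patience` and `min_delta` defaults are illustrative. In Keras, the `EarlyStopping` callback (with its `monitor`, `patience`, and `restore_best_weights` arguments) provides this behavior during `model.fit`.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs fail to beat the best
    loss so far by more than `min_delta`. Epochs are 0-indexed.
    """
    best_loss, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Never triggered: training ran through every epoch.
    return len(val_losses) - 1, best_epoch
```

The weights from `best_epoch` are the ones worth keeping, since later epochs did not improve on the validation set.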

After training, the model was evaluated on a test set to assess its predictive accuracy. We measured performance using metrics such as mean absolute error (MAE) and root mean squared error (RMSE). The LSTM model demonstrated an ability to capture underlying trends and fluctuations in stock prices.
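
For reference, the two metrics can be computed with NumPy as sketched below. The function names are illustrative; if the model was trained on normalized prices, the predictions would need to be mapped back to the original scale first for the errors to read in dollars.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred) -> float:
    """Average absolute difference between actual and predicted values."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def root_mean_squared_error(y_true, y_pred) -> float:
    """Square root of the mean squared difference; penalizes large misses more."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

RMSE is always at least as large as MAE on the same data, and a wide gap between the two points to a few large errors rather than many small ones.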

To visualize the model’s performance, we plotted the predicted stock prices against the actual prices. The results showed that the LSTM model could closely follow the actual stock price movements, with minor deviations attributed to market volatility and unforeseen events.

Fig 6: AAPL stock (predictions in red, true values in green)


Written by Livnat Cohen

Physicist, Data Scientist. Interests: Data Science, Machine Learning, AI, Python, Predictive Analytics, Stochastic Processes.
