This is a little intro to using an LSTM model for time series data. It’s not in any way a thorough introduction to how LSTMs work, which is pretty complex and far too much info for a short blog like this one…

I’m going to demonstrate it using some financial (stock market) data, where I’ll predict the adjusted close price of a stock from the other trading info. I’ll also add some error bars on those predictions because, well, why not?

## Getting the Data

I’m going to use the YahooFinancials python library to find time series data on particular stocks. It’s pip installable, and it is more reliable and offers wider functionality than the older yfinance python package. However, it’s worth noting that *neither of these packages is an official Yahoo! product*: they are just open source packages that access the Yahoo! public API.

```
from yahoofinancials import YahooFinancials
```

The next thing to do is choose a stock that you want to work with. You’ll need to find the ticker symbol abbreviation for the stock, but there are plenty of websites to help you do that. Here I’m choosing a random company from the London Stock Exchange (LSE): “YGEN.L”

```
ticker = 'YGEN.L'
yf = YahooFinancials(ticker)
```

There’s a bunch of information available through the `YahooFinancials` object (`yf` above), but just for now we’ll stick with the historical share price data. Here I’m extracting 12 months of data on a daily basis:

```
historical_stock_prices = yf.get_historical_price_data('2021-01-01', '2021-12-30', 'daily')
```

I’m going to extract the share pricing data into a pandas data frame, remove the Unix timestamp (`date`) and index the data using the `formatted_date` value instead:

```
import pandas as pd

df = pd.DataFrame(historical_stock_prices[ticker]['prices'])
df = df.drop('date', axis=1).set_index('formatted_date')
```
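To make this step concrete without hitting the network, here is a minimal sketch that runs the same two lines on a hand-built dictionary. The nested layout mirrors how the result is indexed above (`[ticker]['prices']`, holding one record per day); the name `fake_prices` and all of the numbers are made up for illustration and are not real YGEN.L prices.

```python
import pandas as pd

ticker = 'YGEN.L'

# Hypothetical stand-in for the dictionary returned by get_historical_price_data.
# The values are invented for illustration only.
fake_prices = {
    ticker: {
        'prices': [
            {'date': 1609747200, 'high': 10.5, 'low': 9.8, 'open': 10.0,
             'close': 10.2, 'volume': 12000, 'adjclose': 10.2,
             'formatted_date': '2021-01-04'},
            {'date': 1609833600, 'high': 10.9, 'low': 10.1, 'open': 10.2,
             'close': 10.7, 'volume': 15000, 'adjclose': 10.7,
             'formatted_date': '2021-01-05'},
        ]
    }
}

# Same two steps as above: one row per day, indexed by the readable date.
df_demo = pd.DataFrame(fake_prices[ticker]['prices'])
df_demo = df_demo.drop('date', axis=1).set_index('formatted_date')
print(df_demo)
```

The column order of the resulting frame follows the key order of the records, which matters later on when the target is picked out by position.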

Let’s take a look at the data:

```
df.head()
```

We can also plot the adjusted close values as a function of time:

```
import matplotlib.pyplot as pl
import matplotlib.dates as mdates
pl.rcParams["figure.figsize"] = (20,5)
date = df.index
adjclose = df['adjclose'].values
ax = pl.subplot(111)
ax.plot(date, adjclose)
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.set_ylabel("Adjusted Close")
pl.xticks(rotation=30)
pl.show()
```

## Extracting the features & the target

I’m going to use the `high`, `low`, `open` & `volume` data as my features – and use those to predict the `adjusted close` data. I’m dropping the `close` data because it feels like cheating to include it:

```
ticker_x = df.iloc[:, :-1].drop('close', axis=1)
ticker_y = df.iloc[:, -1:]
```

I’ll normalise the data using the routines from the scikit-learn library: a `StandardScaler` for the features and a `MinMaxScaler` for the target. You could also do this by hand, but it saves a few lines of code to use the library functions:

```
from sklearn.preprocessing import StandardScaler, MinMaxScaler
mm = MinMaxScaler()
ss = StandardScaler()
```

```
ticker_x = ss.fit_transform(ticker_x)
ticker_y = mm.fit_transform(ticker_y)
```
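For reference, the by-hand version is only a few lines of NumPy. This is a sketch on made-up numbers; it assumes the default settings of `StandardScaler` (per-column mean and population standard deviation) and `MinMaxScaler` (output range 0 to 1).

```python
import numpy as np

# Made-up feature matrix (rows = days, columns = features) and target column.
x = np.array([[10.5, 9.8, 10.0, 12000.0],
              [10.9, 10.1, 10.2, 15000.0],
              [11.2, 10.4, 10.8, 9000.0]])
y = np.array([[10.2], [10.7], [11.0]])

# Standardisation: zero mean and unit variance per column
# (population standard deviation, i.e. ddof=0).
x_mean, x_std = x.mean(axis=0), x.std(axis=0)
x_scaled = (x - x_mean) / x_std

# Min-max scaling: map each column onto the range [0, 1].
y_min, y_max = y.min(axis=0), y.max(axis=0)
y_scaled = (y - y_min) / (y_max - y_min)

# Inverting the min-max scaling recovers the original values,
# which is what inverse_transform does for the predictions later on.
y_restored = y_scaled * (y_max - y_min) + y_min
```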

Now let’s make a test/train split on the data. I’ll use 80% for training and 20% for test/validation:

```
n_train = int(0.8*ticker_x.shape[0])
train_x, train_y = ticker_x[:n_train, :], ticker_y[:n_train,:]
valid_x, valid_y = ticker_x[n_train:, :], ticker_y[n_train:,:]
print(train_x.shape, train_y.shape, valid_x.shape, valid_y.shape)
```

Then we need to convert these into tensor format and make them PyTorch variables:

```
import torch
from torch.autograd import Variable

train_x, train_y = Variable(torch.Tensor(train_x)), Variable(torch.Tensor(train_y))
valid_x, valid_y = Variable(torch.Tensor(valid_x)), Variable(torch.Tensor(valid_y))
```

Up until this point everything looks pretty much the same as if we were creating a PyTorch dataset for any standard neural network application. But for an LSTM we need to format the data slightly differently. The data need to have dimensions (*N*, *L*, *H*<sub>in</sub>), where *N* is the batch size, *L* is the sequence length and *H*<sub>in</sub> is the number of features.

Because LSTMs are a type of recurrent neural network (RNN), they are designed to use sequences of data samples as inputs. The *sequence length* defines how many samples are in each sequence – and they can also output a sequence of samples. However, here I’m just going to use a single sample as my sequence and predict a single output. Even so, I still need to reshape the data to have *L=1*.

```
train_x = torch.reshape(train_x, (train_x.shape[0], 1, train_x.shape[1]))
valid_x = torch.reshape(valid_x, (valid_x.shape[0], 1, valid_x.shape[1]))
```
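To show what the sequence length means in practice, here is a small NumPy sketch comparing the *L=1* reshape used in this post with overlapping windows of *L=3*. The helper `make_windows` is hypothetical and is not used anywhere else in this post; the feature values are just a counter.

```python
import numpy as np

def make_windows(features, seq_len):
    """Stack overlapping windows of `seq_len` consecutive rows.

    Hypothetical helper for illustration.
    Input shape (T, H_in) -> output shape (T - seq_len + 1, seq_len, H_in).
    """
    n_windows = features.shape[0] - seq_len + 1
    return np.stack([features[i:i + seq_len] for i in range(n_windows)])

features = np.arange(40, dtype=float).reshape(10, 4)  # 10 time steps, 4 features

# L = 1, as in this post: every time step is its own sequence.
single = features.reshape(features.shape[0], 1, features.shape[1])

# L = 3: every sequence holds three consecutive time steps.
windows = make_windows(features, seq_len=3)

print(single.shape)   # (10, 1, 4)
print(windows.shape)  # (8, 3, 4)
```

With *L>1* there are fewer sequences than time steps, so the targets would need trimming to match, for example by keeping the target that belongs to the last row of each window.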

## Model Architecture

Now we need to define the architecture for our LSTM model. Typically LSTM networks have LSTM layers followed by a small number of fully-connected layers. Here I’m using a single PyTorch LSTM layer, followed by a small fully-connected network.

```
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers):
        super().__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, num_classes)
        self.drop = nn.Dropout(p=0.5)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Initial hidden and cell states: zeros for every sequence in the batch
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)
        x, (hn, cn) = self.lstm(x, (h0, c0))
        # With L=1 this flattens (N, 1, hidden_size) to (N, hidden_size)
        x = x.view(-1, self.hidden_size)
        x = self.relu(x)
        x = self.fc(x)
        x = self.drop(self.relu(x))
        x = self.out(x)
        return x
```

In `__init__` you can see where the LSTM layer is being initialised (the `nn.LSTM(...)` call). A key thing to watch for here is `batch_first=True`, because it needs to match the layout of our data, i.e. *N* is the first dimension.
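A quick way to convince yourself about the dimensions is to push a dummy tensor through a standalone LSTM layer and look at the shapes. This is a minimal sketch with arbitrary sizes chosen for illustration; when no initial states are passed, PyTorch uses zeros, which matches the explicit `h0` and `c0` in the model above.

```python
import torch
import torch.nn as nn

n, seq_len, n_features, hidden = 8, 1, 4, 16   # arbitrary sizes for illustration

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden, num_layers=1, batch_first=True)
x = torch.zeros(n, seq_len, n_features)        # (N, L, H_in)

out, (hn, cn) = lstm(x)                        # initial states default to zeros

print(out.shape)  # torch.Size([8, 1, 16]) -> (N, L, hidden_size)
print(hn.shape)   # torch.Size([1, 8, 16]) -> (num_layers, N, hidden_size)
```

Note that `batch_first` only affects the input and output tensors; the hidden and cell states keep the layer index as their first dimension.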

## Model Training

Now that we have the data prepared and the model defined, we can start the training process. First let’s define our hyper-parameters:

```
input_size = 4
hidden_size = 128
num_layers = 1
num_classes = 1
epochs = 5000
learning_rate = 1e-3
```

Now let’s instantiate the model and select an optimiser and loss function:

```
model = LSTM(num_classes, input_size, hidden_size, num_layers)
loss_fnc = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```

Then we can run a training loop. I haven’t bothered to batch the data here because it’s pretty small.

```
for epoch in range(epochs):
    model.zero_grad()
    output = model(train_x)
    loss = loss_fnc(output, train_y)
    loss.backward()
    optimizer.step()
    if epoch % 500 == 0:
        print("Epoch: {}, loss: {:.3f}".format(epoch, loss.item()))
```
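The loop above only reports the training loss. If you also want a number for the held-out 20%, one common pattern is sketched below. It uses a toy model and random data as stand-ins (all of the `toy_` names are hypothetical), so treat it as an outline rather than part of the pipeline in this post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins for the real objects: a toy model with dropout and random validation data.
toy_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(8, 1))
toy_valid_x, toy_valid_y = torch.randn(20, 4), torch.randn(20, 1)
toy_loss_fnc = nn.MSELoss()

toy_model.eval()              # switch dropout off for a deterministic score
with torch.no_grad():         # no gradients are needed for evaluation
    valid_loss = toy_loss_fnc(toy_model(toy_valid_x), toy_valid_y)
toy_model.train()             # switch dropout back on afterwards

print("validation loss: {:.3f}".format(valid_loss.item()))
```

Switching back to `train()` matters here, because the uncertainty estimate later in this post relies on dropout still being active.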

## So how did it do..?

Let’s make a prediction over the whole dataset (train + validation). We’ll need to reshape the data and make it a PyTorch Variable first:

```
ticker_x = Variable(torch.Tensor(ticker_x))
ticker_x = torch.reshape(ticker_x, (ticker_x.shape[0], 1, ticker_x.shape[1]))
```

I want to predict an expectation value *and* an uncertainty for the prediction at each point, so I’m going to leave dropout turned on and make 100 forward passes through the model to get some posterior uncertainties:

```
import numpy as np

n_it = 100
for it in range(n_it):
    pred = model(ticker_x)
    if it == 0:
        all_predict = mm.inverse_transform(pred.detach().numpy())
    else:
        all_predict = np.concatenate((all_predict, mm.inverse_transform(pred.detach().numpy())), 1)

# One column per forward pass: summarise across passes for each time step
mean = np.expand_dims(np.mean(all_predict, axis=1), axis=1)
std = np.expand_dims(np.std(all_predict, axis=1), axis=1)
```
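The reason repeated passes give different answers is that a model stays in training mode, with dropout active, unless `eval()` is called. The sketch below shows the difference on a toy network with random inputs; `toy_model`, `toy_x` and `sample_predictions` are hypothetical names used only for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the trained model; it only needs to contain a Dropout layer.
toy_model = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(32, 1))
toy_x = torch.randn(10, 4)

def sample_predictions(model, x, n_it=100):
    """Run n_it forward passes and join them into shape (n_samples, n_it)."""
    with torch.no_grad():
        return torch.cat([model(x) for _ in range(n_it)], dim=1)

toy_model.train()                                # dropout active: passes differ
mc_passes = sample_predictions(toy_model, toy_x)
mc_mean = mc_passes.mean(dim=1)
mc_std = mc_passes.std(dim=1)                    # torch's default is the sample (n-1) estimator

toy_model.eval()                                 # dropout disabled: passes are identical
eval_passes = sample_predictions(toy_model, toy_x)

print(mc_passes.shape)                                     # torch.Size([10, 100])
print(torch.equal(eval_passes[:, 0], eval_passes[:, 1]))   # True
```

In evaluation mode the spread collapses to zero, so the error bars only mean something while dropout is left on.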

Now let’s plot the predictions (remembering to undo the scaling on the target data!):

```
ticker_y = mm.inverse_transform(ticker_y)
```

```
ax = pl.subplot(111)
pl.axvline(x=n_train, c='r', linestyle='--')
ax.plot(date, ticker_y, label='True Data')
pl.errorbar(x=date, y=np.squeeze(mean), yerr=np.squeeze(std), label='Predicted Data')
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax.set_ylabel("Adjusted Close")
pl.xticks(rotation=30)
pl.legend()
pl.show()
```

Well the outputs seem pretty decent, even for the validation samples on the right hand side of the red line.
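If you want a number to go with the plot, one option is to compare the root-mean-square error on either side of the split. The sketch below uses made-up arrays standing in for `ticker_y` and `mean`, and `toy_n_train` plays the role of `n_train`; the helper `rmse` is hypothetical and the values are not results from the model above.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two arrays of the same shape."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

# Made-up numbers for illustration only.
toy_true = np.array([[10.0], [10.5], [11.0], [11.5], [12.0]])
toy_pred = np.array([[10.1], [10.4], [11.2], [11.0], [12.6]])
toy_n_train = 3

train_rmse = rmse(toy_true[:toy_n_train], toy_pred[:toy_n_train])
valid_rmse = rmse(toy_true[toy_n_train:], toy_pred[toy_n_train:])
print("train RMSE: {:.3f}, validation RMSE: {:.3f}".format(train_rmse, valid_rmse))
```

A validation error that is much larger than the training error would be a hint that the model is overfitting.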
