Working with stock market data using Python: Part 1

With global stock markets reeling from the uncertainty around the current COVID-19 pandemic, I thought it might be interesting to see how we can pull some stock market data into Python for further analysis.

In this first post, we look at how to import daily trading data into Python and display the data for a chosen stock.

Before getting started, it is assumed that you are already familiar with Python and have both Python and its package manager, pip, installed.

To follow this project, some additional Python packages will also need to be installed.

The two packages we need are pandas and pandas-datareader. Using pandas allows us to easily interact with the data using two-dimensional structures called dataframes, whilst pandas-datareader is used for obtaining Yahoo stock market data.

The above packages (including dependencies) can be installed using the following two commands:
pip install pandas
pip install pandas-datareader

With the packages now installed, we are ready to get started.

First we import the modules from the installed packages, as well as the built-in datetime module:
import datetime as dt
import pandas as pd
import pandas_datareader as pdr

As we will soon see, the datetime module is needed to create date objects, which are used to extract stock prices in a specific date range.

Next, we set up some variables that describe the data we want to import: the ticker symbol and the date range in which we are interested. The stock symbols used are as per Yahoo Finance, so depending on which global market data you want to access, the ticker symbols may differ slightly from your local market. (For example, in Australia the ticker symbol for Coles supermarket group is COL, but on Yahoo it is COL.AX.)
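As a small illustration of the suffix convention described above, here is a minimal sketch of a helper that appends a Yahoo-style suffix to a local ticker. The function name to_yahoo_symbol and the YAHOO_SUFFIXES table are made up for this example, and only the '.AX' entry comes from the Coles example; check Yahoo Finance for the suffix used by any other market.

```python
# Hypothetical lookup of market codes to Yahoo Finance ticker suffixes.
# Only the '.AX' entry is taken from the Coles example; add others as needed.
YAHOO_SUFFIXES = {"ASX": ".AX"}


def to_yahoo_symbol(local_symbol, market):
    """Return the local ticker with the market's Yahoo suffix appended."""
    suffix = YAHOO_SUFFIXES.get(market, "")
    symbol = local_symbol.strip().upper()
    # Avoid doubling the suffix if it is already present
    if suffix and not symbol.endswith(suffix):
        symbol += suffix
    return symbol


print(to_yahoo_symbol("COL", "ASX"))  # COL.AX
```

Markets that are missing from the table are returned unchanged, so the helper is safe to use with symbols that need no suffix.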

Let's use the Australian All Ordinaries index as our target, starting from 1 January 2019 up to the current date:
# target stock details
stock_pick = '^AORD'
start_date = dt.datetime(2019, 1, 1)
end_date = dt.date.today()
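
If you would rather use a rolling window than a fixed start date, the dates can be calculated with the datetime module's timedelta. The following is only a sketch: the helper name trailing_window is made up for this example, and the fixed "today" value is there purely to make the result repeatable.

```python
import datetime as dt


def trailing_window(days, today=None):
    """Return (start, end) dates covering the last `days` days up to `today`."""
    end = today or dt.date.today()
    start = end - dt.timedelta(days=days)
    return start, end


# Fixed reference date used only so the example is repeatable;
# omit `today` to use the current date instead.
start_date, end_date = trailing_window(365, today=dt.date(2020, 4, 9))
print(start_date, end_date)
```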

To grab the data and place it in a dataframe, we pass the above variables to pandas_datareader and specify the target dataset as 'yahoo':
# get stock data
df = pdr.DataReader(stock_pick, 'yahoo', start_date, end_date)
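
When re-running a script many times, it can be handy to keep a local copy of the data rather than downloading it on every run. Below is a rough sketch of one way to do that with a CSV file. The helper load_or_fetch and the stand-in fake_fetch function are assumptions made for this example so that it runs without a network connection; the sample values are two rows taken from the output shown later in this post.

```python
import os
import tempfile

import pandas as pd


def load_or_fetch(csv_path, fetch):
    """Return the data cached at csv_path, or call fetch() and cache the result."""
    if os.path.exists(csv_path):
        # Restore the Date column as a datetime index
        return pd.read_csv(csv_path, index_col="Date", parse_dates=True)
    df = fetch()
    df.to_csv(csv_path)
    return df


def fake_fetch():
    """Stand-in for the real download, so the sketch runs offline."""
    dates = pd.to_datetime(["2020-04-08", "2020-04-09"]).rename("Date")
    return pd.DataFrame({"Adj Close": [5258.7998, 5439.3999]}, index=dates)


cache_path = os.path.join(tempfile.mkdtemp(), "aord.csv")
first = load_or_fetch(cache_path, fake_fetch)   # calls fake_fetch and writes the CSV
second = load_or_fetch(cache_path, fake_fetch)  # reads the CSV back instead
print(second)
```

In the script from this post, the fetch argument would be a small function wrapping the pdr.DataReader call shown above.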

To check that our data imported correctly, we can view the last five rows of data using the tail method:
# print stock data
print(df.tail())

In the output from the above command, you should see that the stock data is indexed by date and has columns for open, high, low, close, volume and adjusted close. (The data may be truncated as below, depending on your terminal width.)
                   High          Low  ...        Volume    Adj Close
Date                                  ...                           
2020-04-02  5282.600098  5063.500000  ...  1.548106e+09  5106.8999
2020-04-06  5338.000000  5106.899902  ...  1.273476e+09  5323.6000
2020-04-07  5464.200195  5237.000000  ...  1.523458e+09  5301.2998
2020-04-08  5368.000000  5176.000000  ...  1.507547e+09  5258.7998
2020-04-09  5439.399902  5258.799805  ...  1.363194e+09  5439.3999

It looks like everything has imported successfully and we are now ready to start working with the data.
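
If the truncated output hides columns you want to check, pandas display options can be adjusted so that every column is printed. The sketch below uses a small stand-in dataframe built from two rows of the output above (only the columns visible there are included); with the real data, the same option_context block can be wrapped around print(df.tail()). The width of 200 characters is an arbitrary choice.

```python
import pandas as pd

# Stand-in dataframe built from two rows of the output shown above;
# only the columns visible in that output are included.
dates = pd.to_datetime(["2020-04-08", "2020-04-09"]).rename("Date")
df = pd.DataFrame(
    {
        "High": [5368.000000, 5439.399902],
        "Low": [5176.000000, 5258.799805],
        "Volume": [1.507547e09, 1.363194e09],
        "Adj Close": [5258.7998, 5439.3999],
    },
    index=dates,
)

# List the column names without printing the data itself
print(df.columns.tolist())

# Temporarily lift the column limit and widen the display for this print only
with pd.option_context("display.max_columns", None, "display.width", 200):
    print(df.tail())
```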

The code for the above can be downloaded from my GitHub page.

In the next post, we will look at creating a standard chart using the data.
