With global stock markets reeling from the uncertainty around the current COVID-19 pandemic, I thought it might be interesting to see how we can pull some stock market data into Python for further analysis.
In this first post, we look at how to import daily trading data into Python and display the data for a chosen stock.
Before getting started, it is assumed that you are already familiar with Python and have both Python and its package manager, pip, installed.
To follow this project, some additional Python packages will also need to be installed.
The pandas package lets us easily interact with the data using two-dimensional structures called dataframes, whilst pandas-datareader is used for obtaining Yahoo stock market data.
The above packages (including dependencies) can be installed using the following two commands:
pip install pandas
pip install pandas-datareader
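As a quick sanity check after installing, the sketch below reports whether each package can be found by Python. The helper name is_installed is my own and purely illustrative. Note that the package installed as pandas-datareader is imported as pandas_datareader (with an underscore).

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


# Import names, not pip names: pandas-datareader is imported as pandas_datareader.
for name in ("pandas", "pandas_datareader"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, re-run the corresponding pip command in the same environment that you use to run Python.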
With the packages now installed, we are ready to get started.
First, we import the modules from the installed packages, as well as the built-in datetime module:
import datetime as dt
import pandas as pd
import pandas_datareader as pdr
As we will soon see, the datetime module is needed to create date objects, which are used to extract stock prices in a specific date range.
Next, we set some variables that specify the data we want to import: the ticker symbol and the date range in which we are interested. The stock symbols used are as per Yahoo Finance, so depending on which global market data you want to access, the ticker symbols may differ slightly from those used in your local market. (For example, in Australia the ticker symbol for Coles supermarket group is COL, but on Yahoo it is COL.AX.)
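To illustrate the suffix convention, here is a minimal sketch that builds a Yahoo-style symbol from a local ticker. The function yahoo_ticker and its parameters are hypothetical names of my own, and the only suffix taken from this post is .AX; check Yahoo Finance for the suffix used by other exchanges.

```python
def yahoo_ticker(local_symbol, exchange_suffix=""):
    """Join a local ticker and an exchange suffix into a Yahoo-style symbol.

    Hypothetical helper for illustration only: it tidies the symbol and
    appends whatever suffix it is given, without validating either.
    """
    return f"{local_symbol.strip().upper()}{exchange_suffix}"


# Coles on the Australian market, as in the example above
print(yahoo_ticker("COL", ".AX"))
```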
Let's use the Australian All Ordinaries index as our target, starting from 1 January 2019 up to the current date:
# target stock details
stock_pick = '^AORD'
start_date = dt.datetime(2019,1,1)
end_date = dt.date.today()
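One detail worth knowing: the start date above is a datetime object while the end date is a plain date object. Python will not let you subtract or order-compare the two directly, so if you want to work with the range yourself, convert one of them first. The sketch below uses only the standard library; the one_year_ago variable is an illustrative addition of mine showing how a rolling window could be defined instead of a fixed start date.

```python
import datetime as dt

start_date = dt.datetime(2019, 1, 1)
end_date = dt.date.today()

# Convert the datetime to a date so the two values can be subtracted
span = end_date - start_date.date()
print(f"Date range covers {span.days} calendar days")

# Illustrative alternative: a rolling window ending today
one_year_ago = end_date - dt.timedelta(days=365)
print(f"Rolling window: {one_year_ago} to {end_date}")
```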
To grab the data and place it in a dataframe, we pass the above variables to pandas_datareader and specify the data source as 'yahoo':
# get stock data
df = pdr.DataReader(stock_pick, 'yahoo', start_date, end_date)
To check that our data imported correctly, we can view the last five rows of data using the tail method:
# print stock data
print(df.tail())
Looking at the output of the above command, you should see that the data is indexed by date, with columns for open, high, low, close, volume and adjusted close. (The output may be truncated as below, depending on your terminal width):
                   High          Low  ...        Volume  Adj Close
Date                                  ...
2020-04-02  5282.600098  5063.500000  ...  1.548106e+09  5106.8999
2020-04-06  5338.000000  5106.899902  ...  1.273476e+09  5323.6000
2020-04-07  5464.200195  5237.000000  ...  1.523458e+09  5301.2998
2020-04-08  5368.000000  5176.000000  ...  1.507547e+09  5258.7998
2020-04-09  5439.399902  5258.799805  ...  1.363194e+09  5439.3999
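If the truncated columns get in the way, pandas display options can be changed temporarily so that every column is printed. The sketch below is self-contained, so it uses a small stand-in dataframe (named sample, my own choice) built only from values visible in the output above; with the real data you would print df.tail() inside the same with block. The width of 200 characters is an arbitrary assumption.

```python
import pandas as pd

# Stand-in for the downloaded data, using only values shown in the output above
sample = pd.DataFrame(
    {
        "High": [5282.600098, 5338.000000],
        "Low": [5063.500000, 5106.899902],
    },
    index=pd.DatetimeIndex(["2020-04-02", "2020-04-06"], name="Date"),
)

# Temporarily lift the column limit and widen the display area
with pd.option_context("display.max_columns", None, "display.width", 200):
    print(sample.tail())
```

The settings only apply inside the with block, so the rest of your session keeps the pandas defaults.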
It looks like everything has imported successfully and we are now ready to start working with the data.
The code for the above can be downloaded from my GitHub page.
In the next post, we will look at creating a standard chart using the data.