Python Application: Download and Plot Stock Prices with Moving Average

People learning to program often struggle with how to decompose a problem into the steps necessary to write a program to solve that problem. This is one of a series of posts in which I take a problem and go through my decision-making process that leads to a program.

Problem: Download stock prices from a website, extract the closing prices for each day, and then plot the stock prices as well as a moving average of the prices. I assume that you understand statements, conditionals, loops, lists, strings, functions, and file input/output.

You can find a video with more details at https://www.youtube.com/watch?v=55piV5_cFeA 


Design

Many programs that I write read a file that I have created. In this case, the program will download data from a website and therefore I am not completely sure of the form of the data. I start by downloading the data and looking at it; the first couple of lines are

Date,Open,High,Low,Close,Volume
2019-08-20,9.46,9.64,9.46,9.56,58377
2019-08-21,9.62,9.77,9.62,9.76,71992

This shows me that

  • the file format is CSV (comma separated values), so it will be easy to extract the desired information.
  • there is a header line, which I will need to skip, and the prices I want are in the 5th column (index 4).
  • the data is downloaded as a single string, so I need to split it on the newline character to create ‘lines’ from it.
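
These observations can be checked on a small sample before writing the full program. The sketch below is only an illustration: it uses the three lines shown above as a stand-in for the downloaded string, and the names sample, lines, header, and close_index are my own, not part of the final program.

```python
# Stand-in for the downloaded text: the three lines shown above,
# joined into one string with a newline after each line.
sample = (
    "Date,Open,High,Low,Close,Volume\n"
    "2019-08-20,9.46,9.64,9.46,9.56,58377\n"
    "2019-08-21,9.62,9.77,9.62,9.76,71992\n"
)

# Splitting on the newline turns the single string into 'lines'.
lines = sample.split('\n')

# The first line is the header; look up which column holds the close.
header = lines[0].split(',')
close_index = header.index('Close')

print(close_index)        # 4, i.e. the 5th column
print(lines[-1] == '')    # True: the final newline leaves a blank string
```

Looking up the column by name instead of hard-coding 4 is a design choice for this sketch only; it also shows why a blank string can appear at the end of the list after splitting.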

After downloading the data, we need to extract the closing prices. This consists of going through each line, tokenizing it, and getting the closing price on that line. The price is added to a list that will ultimately be plotted.
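
As a rough sketch of this extraction step, the function below walks through sample lines, tokenizes each one, and collects the closing prices. It assumes the column layout shown earlier; the name extract_closes and the blank-line check are illustrative choices, and it uses a for loop over a slice where the final program uses a while loop.

```python
def extract_closes(lines, column=4):
    """Return the value in `column` of every data line as a float."""
    closes = []
    for line in lines[1:]:              # lines[0] is the header
        line = line.strip()
        if not line:                    # ignore blank lines
            continue
        tokens = line.split(',')        # tokenize the CSV line
        closes.append(float(tokens[column]))
    return closes

sample_lines = [
    "Date,Open,High,Low,Close,Volume",
    "2019-08-20,9.46,9.64,9.46,9.56,58377",
    "2019-08-21,9.62,9.77,9.62,9.76,71992",
    "",
]
print(extract_closes(sample_lines))   # [9.56, 9.76]
```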

Now I want to produce the moving average. This is actually the most difficult part of the program. Let’s start with what a moving average is. A moving average of a set of values consists of many averages, each of which is based on a subset of the data. For example, if the data consists of daily prices and we want a 3-day moving average, then we find the average of the first three prices, then the average of days 2, 3, and 4, then the average of days 3, 4, and 5, and so forth. The purpose of the moving average is to smooth out the prices in order to see the general trend.

So how do we do this? Let’s say we have a list of prices

p = [1, 2, 3, 4, 5, 6]

and we want the 3-day moving average of it. The indices of the list begin at 0, so the first average in our moving average is (p[0] + p[1] + p[2])/3. The second average will be (p[1] + p[2] + p[3])/3. The third average will be (p[2] + p[3] + p[4])/3. So what is the pattern? If the index of the first number is i, then the current average is (p[i] + p[i+1] + p[i+2])/3. If I generalize to allow moving averages of sizes other than 3, then one average in a w-day moving average is (p[i] + p[i+1] + … + p[i+w-1])/w. In some languages I would have to use an inner loop to add up the w values for each i. In Python, I can use slicing to pick out the window, p[i : i+w], and sum it directly; I still need a loop that moves i along the list.
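
To make the pattern concrete, here is a small sketch that computes one window average two ways, with an explicit loop and with a slice, and then builds the whole 3-day moving average of the example list. The function names are hypothetical and chosen only for this illustration.

```python
def window_average_loop(p, i, w):
    """Average of p[i] .. p[i+w-1], summed with an explicit loop."""
    total = 0
    for k in range(i, i + w):
        total += p[k]
    return total / w

def window_average_slice(p, i, w):
    """The same average, using a slice to pick out the window."""
    return sum(p[i:i + w]) / w

p = [1, 2, 3, 4, 5, 6]
w = 3

# One average for every starting index where a full window fits.
moving = [window_average_slice(p, i, w) for i in range(len(p) - w + 1)]
print(moving)   # [2.0, 3.0, 4.0, 5.0]
```

A list of 6 prices with a window of 3 gives 6 - 3 + 1 = 4 averages, so the moving average is always w - 1 entries shorter than the price list.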

Finally, how will we plot the data? There is a library for Python called matplotlib that is not installed by default, so I need to install it myself. This library requires numpy, which I also need to install. matplotlib plots pairs of coordinates using

plot( X, Y )

where X is a list of the x-coordinates of the points and Y is a list of the y-coordinates of the points. The numbers in the moving average are the y-coordinates; their indices in the list are the x-coordinates.
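
The pairing of indices with values can be sketched as follows. This is an illustration under my own naming (plot_coordinates and show_plot are hypothetical helpers); matplotlib is imported only inside show_plot, so the coordinate-building part runs even where the library is not installed.

```python
def plot_coordinates(values):
    """Pair each value with its index: X is the indices, Y the values."""
    xs = list(range(len(values)))
    ys = list(values)
    return xs, ys

def show_plot(prices, averages):
    """Draw both series; needs matplotlib, so it is imported only here."""
    import matplotlib.pyplot as plt
    plt.plot(*plot_coordinates(prices))
    plt.plot(*plot_coordinates(averages))
    plt.show()

prices = [1, 2, 3, 4, 5, 6]
averages = [2.0, 3.0, 4.0, 5.0]    # 3-day moving average of prices
xs, ys = plot_coordinates(averages)
print(xs)   # [0, 1, 2, 3]
# show_plot(prices, averages)  # uncomment to display the chart
```

Because the x-coordinate is the list index, the average of days i through i+w-1 is drawn at x = i, and the moving-average line ends w - 1 points before the price line does.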


Final Program

import urllib.request

def getWebpage( src ) :
    fp = urllib.request.urlopen(src)
    webpage = fp.read().decode('utf-8')
    fp.close()

    # webpage is a string, so tokenize
    #   delimiter is a newline
    tokens = webpage.split('\n')

    return tokens

def getPrices( d ) :
    # header = Date,Open,High,Low,Close,Volume
    #          2019-08-20,9.46,9.64,9.46,9.56,58377
    prices = []
    size = len(d)
    
    i = 1  # skip header line
    while i < size :
        t = d[i].strip().split(',')
        prices.append( float(t[4]) )
        i += 1
    
    return prices

def produceAvg( d, windowSize ) :
    # purpose:  produce moving average of stock prices
    size = len( d )
    movavg = [ ]

    i = 0
    while i < size - windowSize + 1 :
        subTotal = sum(d[i : i+windowSize])
        movavg.append( subTotal/windowSize )

        i += 1
        
    return movavg

def plotStocks( p, m ) :
    # plotting requires matplotlib and numpy, which
    #   are not installed as part of the default
    #   Python installation.
    import matplotlib.pyplot as plt
    x = range( len(p) )
    plt.plot( x, p )

    x = range( len(m) )
    plt.plot( x, m )
    # window is the global variable set in the main section below
    avgTitle = "%d day moving average" % window
    plt.legend(["daily closing price", avgTitle], loc = "lower left")
    plt.title( 'stock prices' )
    plt.show()

#####  main  #####
# Note that at some point this link may no longer work
url = "https://stooq.com/q/d/l/?s=googl.us&d1=20190820&d2=20200820&i=d"
data = getWebpage( url )

# I noticed that the list contains an extra blank string
#   at the end (the text after the final newline), so
#   remove it with pop if it is there
if data and data[-1].strip() == '' :
    data.pop()

# extract just the daily closing prices
prices = getPrices( data )

# ask user how many numbers to use in moving average
window = int(input("Enter window size: "))
movavg = produceAvg( prices, window )

plotStocks( prices, movavg )