{"id":411,"date":"2020-08-21T10:26:07","date_gmt":"2020-08-21T15:26:07","guid":{"rendered":"https:\/\/www.brezeale.com\/?p=411"},"modified":"2020-08-21T20:42:32","modified_gmt":"2020-08-22T01:42:32","slug":"python-application-download-and-plot-stock-prices-with-moving-average","status":"publish","type":"post","link":"https:\/\/www.brezeale.com\/?p=411","title":{"rendered":"Python Application: Download and Plot Stock Prices with Moving Average"},"content":{"rendered":"\n<p>People learning to program often struggle with how to decompose a problem into the steps necessary to write a program to solve that problem. This is one of a series of posts in which I take a problem and go through my decision-making process that leads to a program.<\/p>\n\n\n\n<p>Problem: Download stock prices from a website, extract the closing prices for each day, and then plot the stock prices as well as a moving average of the prices.  I assume that you understand statements, conditionals, loops, lists, strings, functions, and file input\/output.<\/p>\n\n\n\n<p>You can find a video with more details at <a href=\"https:\/\/www.youtube.com\/watch?v=55piV5_cFeA\">https:\/\/www.youtube.com\/watch?v=55piV5_cFeA<\/a>&nbsp;<\/p>\n\n\n\n<br>\n\n\n\n<h3 class=\"wp-block-heading\">Design<\/h3>\n\n\n\n<p>Many programs that I write read a file that I have created.  In this case, the program will download data from a website and therefore I am not completely sure of the form of the data.  
I start by downloading the data and looking at it; the first few lines are<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nDate,Open,High,Low,Close,Volume\n2019-08-20,9.46,9.64,9.46,9.56,58377\n2019-08-21,9.62,9.77,9.62,9.76,71992\n<\/pre><\/div>\n\n\n<p>This shows me that<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>the file format is CSV (comma-separated values), so it will be easy to extract the desired information.<\/li><li>there is a header line, which I will need to skip, and the prices I want are in the 5th column (index 4).<\/li><li>the data is downloaded as a single string, so I need to split it on newlines to create &#8216;lines&#8217; from it.<\/li><\/ul>\n\n\n\n<p>After downloading the data, we need to extract the closing prices.  This consists of going through each line, tokenizing it, and getting the closing price on that line.  The price is added to a list that will ultimately be plotted.<\/p>\n\n\n\n<p>Now I want to produce the moving average.  This is actually the most difficult part of the program.  Let&#8217;s start with what a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Moving_average\">moving average<\/a> is.  A moving average of a set of values consists of many averages, each of which is based on a subset of consecutive values.  For example, if the data consists of daily prices and we want a 3-day moving average, then we find the average of the first three prices, then the average of days 2, 3, and 4, then the average of days 3, 4, and 5, and so forth.  The purpose of the moving average is to smooth out the prices in order to see the general trend.<\/p>\n\n\n\n<p>So how do we do this?  Let&#8217;s say we have a list of prices<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\np = &#x5B;1, 2, 3, 4, 5, 6]\n<\/pre><\/div>\n\n\n<p>and we want the 3-day moving average of it.  
The indices of the list begin at 0, so the first average in our moving average is (p[0] + p[1] + p[2])\/3.  The second average will be (p[1] + p[2] + p[3])\/3.  The third average will be (p[2] + p[3] + p[4])\/3.  So what is the pattern?  If the index of the first number is i, then the current average is (p[i] + p[i+1] + p[i+2])\/3.  If I generalize to allow moving averages of sizes other than 3, then an average as part of a w-day average is (p[i] + p[i+1] + &#8230; + p[i+w-1])\/w.  In some languages I would have to use a loop to perform this calculation, with i changing as the loop iterates.  In Python, I can use slicing to get the values for each average.<\/p>\n\n\n\n<p>Finally, how will we plot the data?  There is a library for Python called <code>matplotlib<\/code> that is not installed by default, so I need to install it myself.  This library requires <code>numpy<\/code>, which I also need to install.  <code>matplotlib<\/code> plots pairs of coordinates using<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nplot( X, Y )\n<\/pre><\/div>\n\n\n<p>where X is a list of the x-coordinates of the points and Y is a list of the y-coordinates of the points.  
The numbers in the moving average are the y-coordinates; their indices in the list are the x-coordinates.<\/p>\n\n\n\n<br>\n\n\n\n<h3 class=\"wp-block-heading\">Final Program<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport urllib.request\n\ndef getWebpage( src ) :\n    fp = urllib.request.urlopen(src)\n    webpage = fp.read().decode(&#039;utf-8&#039;)\n    fp.close()\n\n    # webpage is a string, so tokenize\n    #   delimiter is a newline\n    tokens = webpage.split(&#039;\\n&#039;)\n\n    return tokens\n\ndef getPrices( d ) :\n    # header = Date,Open,High,Low,Close,Volume\n    #          2019-08-20,9.46,9.64,9.46,9.56,58377\n    prices = &#x5B;]\n    size = len(d)\n    \n    i = 1  # skip header line\n    while i &lt; size :\n        t = d&#x5B;i].strip().split(&#039;,&#039;)\n        prices.append( float(t&#x5B;4]) )\n        i += 1\n    \n    return prices\n\ndef produceAvg( d, windowSize ) :\n    # purpose:  produce running average of stock prices\n    size = len( d )\n    movavg = &#x5B; ]\n\n    i = 0\n    while i &lt; size - windowSize + 1 :\n        subTotal = sum(d&#x5B;i : i+windowSize])\n        movavg.append( subTotal\/windowSize )\n\n        i += 1\n        \n    return movavg\n\ndef plotStocks( p, m ) :\n    # plotting requires matplotlib and numpy, which\n    #   are not installed as part of the default\n    #   Python installation.\n    import matplotlib.pyplot as plt\n    x = range( len(p) )\n    plt.plot( x, p )\n\n    x = range( len(m) )\n    plt.plot( x, m )\n    avgTitle = &quot;%d day moving average&quot; % window\n    plt.legend(&#x5B;&quot;daily closing price&quot;, avgTitle], loc = &quot;lower left&quot;)\n    plt.title( &#039;stock prices&#039; )\n    plt.show()\n\n#####  main  #####\n# Note that at some point this link may no longer work\nurl = &quot;https:\/\/stooq.com\/q\/d\/l\/?s=googl.us&amp;d1=20190820&amp;d2=20200820&amp;i=d&quot;\ndata = getWebpage( 
url )\n\n# I noticed that the list contains an extra blank string\n#   at the end, so remove with pop \ndata.pop()\n\n# extract just the daily closing prices\nprices = getPrices( data )\n\n# ask user how many numbers to use in moving average\nwindow = int(input(&quot;Enter window size: &quot;))\nmovavg = produceAvg( prices, window )\n\nplotStocks( prices, movavg )\n<\/pre><\/div>","protected":false},"excerpt":{"rendered":"<p>People learning to program often struggle with how to decompose a problem into the steps necessary to write a program to solve that problem. This is one of a series of posts in which I take a problem and go through my decision-making process that leads to a program. Problem: Download stock prices from a website, extract the closing prices for each day, and then plot the stock prices as well as a moving average of the prices. I assume&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/www.brezeale.com\/?p=411\"> Read More<span class=\"screen-reader-text\">  Read 
More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[9,5],"tags":[],"class_list":["post-411","post","type-post","status-publish","format-standard","hentry","category-programming","category-python-programming"],"_links":{"self":[{"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/posts\/411","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=411"}],"version-history":[{"count":10,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/posts\/411\/revisions"}],"predecessor-version":[{"id":523,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=\/wp\/v2\/posts\/411\/revisions\/523"}],"wp:attachment":[{"href":"https:\/\/www.brezeale.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=411"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.brezeale.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}