Python Application: Determining Which Degree Courses a Student is Eligible to Enroll In
People learning to program often struggle with decomposing a problem into the steps needed to write a program that solves it. This is one of a series of posts in which I take a problem and walk through the decision-making process that leads to a program.
Problem: Compare the courses that a student has completed to the courses required for a degree and determine which courses the student is eligible to enroll in. The program will output:
- completed courses
- courses remaining for degree
- courses for which the prerequisites have been met
I assume that you understand statements, conditionals, loops, lists, strings, functions, file input/output, and dictionaries.
You can find a video with more details at https://www.youtube.com/watch?v=7Ahla3KXKJA
Design
The program will read two files and I begin my design by looking at them. In some real-world situations you may only have a description of what the data could look like, but when given access to real data I think that there is no replacement for looking at the data in order to understand what you are working with.
completed.txt contains a list of the courses that the student has completed or is currently enrolled in. The student could simply update this file each semester and re-run the program to see what they are now eligible to enroll in. Here are a few lines from completed.txt:
CSE 1310
ENGL 1301
MATH 1323
HIST 1311
fine arts elective
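As a small illustration of how a file like this could be read, here is a minimal sketch. The function name read_completed is my own and is not part of the final program, and io.StringIO stands in for the real file so that the example runs on its own. It assumes one course name per line and skips blank lines.

```python
import io


def read_completed(fp):
    """Return a sorted list of course names, one per non-blank line.

    Hypothetical helper for illustration only.
    """
    return sorted(line.strip() for line in fp if line.strip())


# io.StringIO behaves like an open text file; the blank line is deliberate
sample = io.StringIO(
    "CSE 1310\nENGL 1301\n\nMATH 1323\nHIST 1311\nfine arts elective\n"
)
completed = read_completed(sample)
print(completed)
```

Note that Python sorts uppercase letters before lowercase ones, so "fine arts elective" ends up last.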
The file bscs-2013.csv lists the courses required for a Computer Science degree. Lines beginning with a pound sign (#) are comments for a human reader and must be skipped when processing the file. Here are some lines from bscs-2013.csv:
# courses required for BS Computer Science, 2012-2013 catalog
# course,category,prerequisites,notes
ENGL 1301,preprofessional,,
ENGL 1302,preprofessional,,
MATH 1426,preprofessional,MATH 1323,
MATH 2425,preprofessional,MATH 1426,
PHYS 1443,preprofessional,MATH 1426,
PHYS 1444,preprofessional,PHYS 1443|MATH 2425,
CSE 1104,preprofessional,,
CSE 1105,preprofessional,,
CSE 1320,preprofessional,CSE 1104|CSE 1105|CSE 1310|MATH 1323,
fine arts elective,general education,,note: see approved list
HIST 1311,general education,,
HIST 1312,general education,,
POLS 2311,general education,,
POLS 2312,general education,,
CSE 3380,professional,CSE 2315,note: MATH 3330 can be taken instead
CSE 4314,professional,COMS 2302,
CSE 4316,professional,CSE 3310|CSE 3320,
In bscs-2013.csv we can see that each course line consists of four comma-separated fields (some are empty):
- the course name, e.g., CSE 3380 or a generic name like literature elective.
- a category. The possibilities are preprofessional, professional, or general education. While these terms are not necessary for this problem, I included them in the data so that 1) they could be used in the future if necessary and 2) real-world data often includes extra information unnecessary for your needs that you just have to work around.
- course prerequisites is a vertical bar-delimited field of the courses that are prerequisites to the current course. This field plays a very important part in deciding what courses a student is eligible to enroll in.
- notes are additional information for some courses, such as “see approved list”.
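To make the four fields concrete, here is a minimal sketch of splitting one line of the file into its parts. The helper name parse_course_line is hypothetical and does not appear in the final program. It assumes that no field contains a comma, which holds for the sample data shown above.

```python
def parse_course_line(line):
    """Split one line of the degree file into its four fields.

    Hypothetical helper for illustration; assumes no field contains a comma.
    Returns None for comment lines and blank lines.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    course, category, prereqs, notes = line.split(",")
    # an empty prerequisites field means the course has no prerequisites
    prereq_list = prereqs.split("|") if prereqs else []
    return course, category, prereq_list, notes


print(parse_course_line("PHYS 1444,preprofessional,PHYS 1443|MATH 2425,"))
print(parse_course_line("fine arts elective,general education,,note: see approved list"))
print(parse_course_line("# course,category,prerequisites,notes"))
```

Splitting the prerequisites on the vertical bar at this stage is one option; the final program in this post keeps the field as a single string and splits it later.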
When designing, I think of there being three parts. The first part is the input, which in this case is the two files described above. The last part is the expected output. The middle part is how the input is transformed into the output, which typically includes choosing data structures. Data structures are ways of organizing data, and there are many, each with its own pros and cons. For very simple things, such as printing a sequence of numbers, there really is no need to store anything beyond the current number. But for larger problems you may need to store the input as well as generated values before eventually producing output. This requires some form of data structure, and you choose one that will help you get from input to output.
To help me answer the data structure question, I will work through a sample problem using some of the data above. The point is to understand how I make my decisions and then translate this into instructions for the computer.
I’m going to work my way down the list of courses in bscs-2013.csv, checking whether each one has already been taken and, if not, whether its prerequisites have been taken.
- ENGL 1301 — on the list of completed courses, so move on
- ENGL 1302 — not taken and it has no prerequisites, so I am eligible to take it
- MATH 1426 — not taken and it has a prerequisite of MATH 1323 which I have taken, so I am eligible to take it and need to print any notes (none in this case)
- MATH 2425 — not taken and it has a prerequisite of MATH 1426. Since I have not taken MATH 1426, I am not eligible to take it
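The hand check above can be sketched in a few lines of Python. This is only an illustration of the decision rule, with the data typed in directly rather than read from the files; the function name classify and the variable names are my own.

```python
completed = ["CSE 1310", "ENGL 1301", "MATH 1323", "HIST 1311",
             "fine arts elective"]

# (course, prerequisites) pairs copied from the first lines of the degree file
courses = [
    ("ENGL 1301", []),
    ("ENGL 1302", []),
    ("MATH 1426", ["MATH 1323"]),
    ("MATH 2425", ["MATH 1426"]),
]


def classify(course, prereqs, completed):
    """Apply the same three-way decision used in the hand check."""
    if course in completed:
        return "completed"
    # all() of an empty list is True, so no prerequisites means eligible
    if all(p in completed for p in prereqs):
        return "eligible"
    return "not eligible"


results = {course: classify(course, prereqs, completed)
           for course, prereqs in courses}
for course, status in results.items():
    print(course, "-", status)
```

The results match the four decisions made by hand: one course completed, two eligible, and one not eligible.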
So what information from bscs-2013.csv did I need to make my decisions? I needed the course name, the list of prerequisites, and will eventually need any notes associated with an eligible course. Even though I don’t need it for this problem, I chose to also store the category since including it was little additional work and it could potentially be useful for something else.
This leads to two possible data structures. The first is a 2D list, where the information for a course is contained within a list. It might look something like this (where the internal values are really strings):
[ [ENGL 1301,preprofessional,,],
[ENGL 1302,preprofessional,,],
[MATH 1426,preprofessional,MATH 1323,],
[MATH 2425,preprofessional,MATH 1426,],
[CSE 1320,preprofessional,CSE 1104|CSE 1105|CSE 1310|MATH 1323,],
[fine arts elective,general education,,note: see approved list] ]
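Written as real Python, the 2D list option might look like the sketch below. The index constants and the find_row helper are my own additions for illustration. Notice that finding a course by name means scanning the rows one at a time.

```python
rows = [
    ["ENGL 1301", "preprofessional", "", ""],
    ["MATH 1426", "preprofessional", "MATH 1323", ""],
    ["fine arts elective", "general education", "", "note: see approved list"],
]

# names for the column positions, so the code does not rely on bare numbers
COURSE, CATEGORY, PREREQS, NOTES = range(4)


def find_row(rows, name):
    """Scan the list for the row whose course name matches; None if absent."""
    for row in rows:
        if row[COURSE] == name:
            return row
    return None


row = find_row(rows, "MATH 1426")
print(row[PREREQS])
```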
The second data structure is a dictionary with the course name as the key. So how can I store the other three things as the value? Use a dictionary as the value. It might look like this:
{ ENGL 1301 : { category : preprofessional,
                prereqs : "",
                notes : "" },
  CSE 1320 : { category : preprofessional,
               prereqs : CSE 1104|CSE 1105|CSE 1310|MATH 1323,
               notes : "" },
  fine arts elective : { category : general education,
                         prereqs : "",
                         notes : note: see approved list }
}
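For comparison, here is the same nested dictionary as a runnable Python literal, using values from the data file. The variable name degree is just for this sketch. A course is reached directly by its name, and the keys can be sorted in one step.

```python
degree = {
    "ENGL 1301": {"category": "preprofessional", "prereqs": "", "notes": ""},
    "CSE 1320": {"category": "preprofessional",
                 "prereqs": "CSE 1104|CSE 1105|CSE 1310|MATH 1323",
                 "notes": ""},
    "fine arts elective": {"category": "general education",
                           "prereqs": "",
                           "notes": "note: see approved list"},
}

# look up one field of one course directly by name, with no scanning
prereqs = degree["CSE 1320"]["prereqs"].split("|")
print(prereqs)

# sorted() on a dictionary returns its keys in sorted order
names = sorted(degree)
for name in names:
    print(name, "->", degree[name]["category"])
```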
I chose the dictionary option since it is easy to extract the keys from the dictionary, sort them, and then process the courses by key. Here is the pseudocode that I produced based upon my design:
# read data
open degree file
for each line
    if not comment
        tokenize line
        store line in dictionary
open completed courses file
for each line
    tokenize line
    store line in list

# print courses completed and remaining
for course completed
    print course
for course in degree
    if course not completed
        print course

# print courses for which you have prereqs
for each degree course
    if course not completed
        if prereqs met
            print degree course
Final Program
#####################################
def getData(filenameReqs, filenameComp) :
    # purpose: read the degree requirements and the completed courses
    # input: two filenames
    # returns: dictionary, list
    ## read data
    degreeDict = { }
    fp = open(filenameReqs, "r")
    for line in fp :
        line = line.strip()                  # get rid of white space
                                             # at both ends
        if line != "" and line[0] != "#" :   # skip blank lines and comments
            course, category, prereqs, notes = line.split(',')
            degreeDict[course] = {"category" : category,
                                  "prereqs"  : prereqs,
                                  "notes"    : notes}
    fp.close()

    ### get list of courses completed (or currently enrolled in)
    completedList = [ ]
    fp = open(filenameComp, "r")
    for line in fp :
        line = line.strip()
        if line != "" :                      # skip blank lines
            completedList.append( line )
    fp.close()
    completedList.sort()
    return degreeDict, completedList
#####################################
def printData( degreeDict ) :
    # purpose: print course information (used for testing)
    # input: dictionary
    # returns: nothing
    keys = degreeDict.keys()
    keysList = sorted(keys)
    for k in keysList :
        print(k)
        inner = degreeDict[k].keys()
        innerList = sorted(inner)
        for i in innerList :
            print(" %-15s : %s" % (i, degreeDict[k][i]))
        print()
#####################################
def printCompleted( completedList ) :
    # purpose: print the courses completed
    # input: list
    # returns: nothing
    print("COURSES COMPLETED")
    print("-----------------")
    for c in completedList :
        print(" ", c)
#####################################
def printRemaining( degreeDict, completedList ) :
    # purpose: print the degree courses not yet completed
    # input: dictionary, list
    # returns: nothing
    print("COURSES REMAINING")
    print("-----------------")
    keys = degreeDict.keys()
    keysList = sorted(keys)
    for c in keysList :
        if c not in completedList :
            print(" ", c)
#####################################
def printEligible( degreeDict, completedList ) :
    # purpose: print the courses whose prerequisites have been met
    # input: dictionary, list
    # returns: nothing
    """
    for each degree course
        if course not completed
            if prereqs met
                print degree course
    """
    eligibleList = [ ]
    for c in degreeDict :
        if c not in completedList :
            #print("*** %s ***" % c)
            t = degreeDict[c]["prereqs"].split('|')
            eligible = True              # assume eligible unless missing
                                         # prereq found
            for p in t :
                if p not in completedList :
                    eligible = False     # found missing prereq
            # an empty prereqs field splits to [""], so check for it here
            if eligible or t[0] == "" :
                eligibleList.append( c )
    eligibleList.sort()
    print("COURSES ELIGIBLE TO TAKE")
    print("------------------------")
    for c in eligibleList :
        if degreeDict[c]["notes"] != "" :
            print(" %s (%s)" % (c, degreeDict[c]["notes"]))
        else :
            print(" ", c)
######################################
##### main #####
filenameReqs = "bscs-2013.csv"
filenameComp = "completed.txt"
degreeDict, completedList = getData(filenameReqs, filenameComp)
"""
# use for testing that degree information is being read correctly
printData( degreeDict )
exit()
"""
## print courses completed
printCompleted( completedList )
## print courses remaining
print()
printRemaining( degreeDict, completedList )
## print courses for which you have prereqs
print()
printEligible( degreeDict, completedList )
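As an aside, the prerequisite check could also be written with sets. The sketch below is an alternative, not the approach used in the program above. The function name eligible_courses is hypothetical, and it assumes the same dictionary layout with a bar-delimited "prereqs" string. With sets, "all prerequisites completed" becomes a subset test.

```python
def eligible_courses(degree, completed):
    """Return a sorted list of courses not yet completed whose
    prerequisites have all been completed.

    Alternative sketch for illustration only.
    """
    done = set(completed)
    result = []
    for course, info in degree.items():
        if course in done:
            continue
        # an empty prereqs string means no prerequisites at all
        prereqs = set(info["prereqs"].split("|")) if info["prereqs"] else set()
        if prereqs <= done:          # subset test: every prereq is in done
            result.append(course)
    return sorted(result)


degree = {
    "ENGL 1301": {"prereqs": ""},
    "ENGL 1302": {"prereqs": ""},
    "MATH 1426": {"prereqs": "MATH 1323"},
    "MATH 2425": {"prereqs": "MATH 1426"},
    "PHYS 1444": {"prereqs": "PHYS 1443|MATH 2425"},
}
completed = ["CSE 1310", "ENGL 1301", "MATH 1323"]
eligible = eligible_courses(degree, completed)
print(eligible)   # ['ENGL 1302', 'MATH 1426']
```

Because the empty set is a subset of every set, courses with no prerequisites need no special case here.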