S8-O/LT6-2 - SPAM: Simplifying Python for Approaching Machine Learning

1. Innovative Practice Work In Progress
Joel Rosiene¹, Carolyn Rosiene²
¹ Eastern Connecticut State University
² University of Hartford

This WIP paper presents an approach to teaching Python as a first course in machine learning for non-majors.  The flexibility of Python enables a skilled programmer to be very expressive, but it can be troublesome to the new student.  Students who come to Python from an imperative language often approach the language “non-Pythonically”, leading to reliance on traditional programming constructs.

Traditional imperative programming constructs focus on “how” to implement a function or structure data.  The building blocks of sequencing, selection and iteration are used to implement the function algorithmically and to structure the computation.  The approach used here is to use set-builder notation to specify data and the mapping of functions over the set.  In this approach, while iteration is present, the application of the map does not imply an order, which separates the processing orders that are required from coincidental orderings.  The students are introduced to the concepts of functors and monads early, with the aim of influencing their problem-solving approach.
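As an illustration of the set-builder style described above, the following is a minimal sketch rather than material from the course itself.  The set `S`, the function `square`, and the even-number predicate are hypothetical choices made only for this example.

```python
# Set-builder notation: S = { x : x is an integer, 0 <= x < 10 }
S = {x for x in range(10)}


def square(x):
    """Hypothetical function to be mapped over the set."""
    return x * x


# { square(x) : x in S, x even } written as a set comprehension.
even_squares = {square(x) for x in S if x % 2 == 0}

# The same set built with map and filter.  Python still iterates
# internally, but the resulting set does not depend on that order.
same_result = set(map(square, filter(lambda x: x % 2 == 0, S)))
```

Both expressions state what the resulting set contains and leave the order of evaluation unspecified, which is the distinction between required and coincidental ordering drawn in the text.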

Short Python brain teasers posed in set-builder notation are used to develop the students' skills in set comprehension and function composition in preparation for the machine learning component.  At this point, Python packages are introduced as a mechanism to construct complex sets of objects (data) and map existing functions across these sets.
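A brain teaser of this kind might look like the sketch below.  It is an assumed example, not one taken from the course: the helper `compose` and the functions `increment` and `double` are hypothetical names introduced for illustration.

```python
from functools import reduce


def compose(*functions):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)


def increment(x):
    return x + 1


def double(x):
    return 2 * x


# Teaser: what is { (double ∘ increment)(x) : x in {0, 1, 2, 3} } ?
double_after_increment = compose(double, increment)
teaser = {double_after_increment(x) for x in {0, 1, 2, 3}}
```

The student is asked to predict the contents of `teaser` before running the code, which exercises both the comprehension and the composition order.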

The above approach focuses on the formal definition of sets, functions and mapping and leads naturally to the problem of “learning” functions from examples.  One aspect of machine learning is the construction of functions from examples, where each example pairs an element of the domain with the corresponding element of the range (the training set), and the composition of the resulting functions.  In this approach, we discuss the differences between the estimation of an unknown function from incomplete training data (bias), the fitting of a known function to inaccurate data (regression) and the estimation of a random variable (randomness).
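The regression case can be sketched as follows, assuming a straight-line model fitted by ordinary least squares.  The training pairs and the name `fit_line` are hypothetical and chosen only for illustration; the paper does not prescribe a particular model or dataset.

```python
# Hypothetical training set: (x, y) pairs with y roughly equal to 2x + 1,
# where the y values carry some measurement error.
training_set = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8), (4, 9.0)]


def fit_line(pairs):
    """Return a function x -> slope * x + intercept fitted by least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# The "learned" function is an ordinary Python function and can be
# mapped over new elements of the domain like any other.
learned = fit_line(training_set)
predictions = {x: learned(x) for x in (5, 6)}
```

The result of training is itself a function, so it can be composed with and mapped alongside the functions the students have already written.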

The amount of information on machine learning can be daunting for a student who is new to programming.  Approaches that focus on control structures (or data structures) add complexity compared with a more functional approach.  While Python is not a functional language, it is popular and can be used to explore sets, functional programming and machine learning.