S1-COMP8-3 - Investigating the Incorporation of Machine Learning Concepts in Data Structure Education

1. Innovative Practice Work In Progress
Bo Liu1, Fengying Xie1
1 Department of Aerospace Information Engineering, School of Astronautics, Beihang University

The rapid growth of machine learning (ML), especially deep learning, has increased the demand for professionals with ML skills who can solve challenging engineering problems in many fields. Meanwhile, knowledge of data structures (DS) is one of the core foundations of information engineering and underlies many ML algorithms. While some specialized techniques such as image processing have been successfully applied to DS education, the integration of DS education with ML remains to be explored. We believe this integration not only lays a more solid foundation for advanced ML-related core courses but could also improve the teaching of DS, as students get the chance to practice DS knowledge flexibly in interesting and relatively sophisticated problem contexts. In this paper, we discuss several ML topics that could be integrated into the teaching of DS. The first suggested topic is the tensor, which is the basic data structure of deep learning and an extension of the multi-dimensional array. After being taught the basic concept and how powerful toolkits such as PyTorch handle tensors, students are instructed to implement 1D, 2D and 3D tensor convolution, which can greatly improve their ability to handle high-dimensional data. Building on this, an experiment on decision-tree-based classification is used to practice the tree structure. Given a set of samples containing features and class IDs, students are asked first to construct the decision tree using the ID3 feature selection algorithm, then to post-prune the tree to improve its generalization, and finally to test the tree on a set of test samples. In this experiment, some algorithms, such as ID3, can be provided to the students so that they focus on DS-related programming such as tree manipulation, file I/O and string processing.
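To make the decision tree exercise concrete, the following is a minimal sketch of ID3-style tree construction and classification over categorical features. The nested-dictionary tree representation, the function names and the toy weather data are illustrative assumptions and are not taken from the course material; post-pruning and file I/O are omitted.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the samples on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, features):
    """Return a nested dict (hypothetical layout); leaves are class labels."""
    if len(set(labels)) == 1:          # pure node: stop splitting
        return labels[0]
    if not features:                   # no feature left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    node = {"feature": best, "children": {}}
    remaining = [f for f in features if f != best]
    for value in set(row[best] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining)
    return node

def classify(tree, row, default=None):
    """Walk from the root to a leaf; unseen feature values yield `default`."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], default)
    return tree

# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["play", "play", "play", "stay", "play", "play"]
tree = build_tree(rows, labels, ["outlook", "windy"])
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```

On this toy data the root splits on `outlook`, and only the `rainy` branch needs a second split on `windy`, so the exercise touches recursive construction, child lookup and traversal of a tree.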
In addition, we use a computational graph experiment to support the teaching of the directed acyclic graph (DAG), in which students are required to implement the evaluation of a multivariable function and its gradient on a DAG. This mimics the forward and backward propagation of a neural network and covers several important knowledge points, including the design of the computing node structure, graph construction and topological sorting. In conclusion, this paper proposes introducing ML concepts into data structure education and discusses several possible ways of integration. Practicing DS knowledge in interesting ML-related problem contexts could stimulate students' enthusiasm for study and give them an overview of how DS knowledge is applied in frontier technology, which could benefit the teaching of both DS and ML-related courses.
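The computational graph exercise described above can be sketched as follows. This is one possible design under stated assumptions, not the authors' implementation: the `Node` class, the three supported operations and the example function f(x, y) = x*y + sin(x) are all hypothetical choices made for illustration.

```python
import math

class Node:
    """One vertex of the computation DAG (hypothetical node layout)."""
    def __init__(self, op="input", parents=(), value=0.0):
        self.op = op                    # "input", "add", "mul" or "sin"
        self.parents = list(parents)    # edges point from operands to result
        self.value = value              # filled in by the forward pass
        self.grad = 0.0                 # d(output)/d(this node)

def add(a, b): return Node("add", (a, b))
def mul(a, b): return Node("mul", (a, b))
def sin(a): return Node("sin", (a,))

def topological_order(output):
    """Depth-first post-order: each node appears after all of its parents."""
    order, visited = [], set()
    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)
    visit(output)
    return order

def forward(order):
    """Evaluate every node once, operands before results."""
    for node in order:
        vals = [p.value for p in node.parents]
        if node.op == "add":
            node.value = vals[0] + vals[1]
        elif node.op == "mul":
            node.value = vals[0] * vals[1]
        elif node.op == "sin":
            node.value = math.sin(vals[0])
        # "input" nodes keep the value assigned by the caller

def backward(order):
    """Accumulate gradients in reverse topological order (chain rule)."""
    for node in order:
        node.grad = 0.0
    order[-1].grad = 1.0                # the output node is last in the order
    for node in reversed(order):
        if node.op == "add":
            a, b = node.parents
            a.grad += node.grad
            b.grad += node.grad
        elif node.op == "mul":
            a, b = node.parents
            a.grad += node.grad * b.value
            b.grad += node.grad * a.value
        elif node.op == "sin":
            (a,) = node.parents
            a.grad += node.grad * math.cos(a.value)

# f(x, y) = x*y + sin(x), evaluated at x = 2, y = 3
x, y = Node(value=2.0), Node(value=3.0)
f = add(mul(x, y), sin(x))
order = topological_order(f)
forward(order)
backward(order)
print(f.value, x.grad, y.grad)
```

Because `x` feeds both the product and the sine, its gradient is accumulated from two paths, which is why the backward pass must visit nodes in reverse topological order rather than in an arbitrary one.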