Python Programming is designed to teach data science using Python. The course covers Python setup and familiarization with the environment; writing programs in Python; data structures and libraries that support Python development, including NumPy; exploratory data analysis using libraries such as Pandas; predictive model design, including regression analysis, decision trees, and other prediction models; visualization using Matplotlib; and an implementation project to practice the concepts learned during the course.
Statistical Theory
This course teaches the basic theory and methodologies of probability and statistical analysis. The course covers random variables and probability distributions; data collection; graphical and numerical methods for describing data; estimation; hypothesis testing; and regression analysis.
Database Principles and Development
This course introduces students to the key concepts of database systems, the basics of the Structured Query Language (SQL), and basic database design for storing data as part of a multi-step data gathering, analysis, and processing effort. The course covers concepts of database systems, the basics of SQL, data models and relational SQL, many-to-many relationships in SQL, databases and visualization, and an introduction to NoSQL.
Time Series Analysis
This course teaches the fundamental theory and techniques for processing and analyzing time series data. The course covers descriptive techniques for time series, ARMA models, model diagnostics, heteroskedasticity and GARCH models, and statistical software for time series analysis.
Machine Learning
This course introduces the basic theory, methodologies, and tools of machine learning. It covers supervised learning (support vector machines, neural networks, and kernel methods), unsupervised learning (ensemble methods, dimension reduction, and deep learning), and applications of machine learning.
Data Mining
This course teaches the fundamental theory and techniques for data processing and data mining. The course covers data pre-processing, feature selection and extraction, pattern discovery from data, classification, clustering and outlier detection.