Curriculum

Data Technology Leads the Future

Each course in the programme carries 3 credits and is taught in English. The duration of study is 24 months. Students of the Master of Science in Data Science programme have two routes to fulfill the graduation requirements. Option one requires students to complete 3 compulsory courses and 11 elective courses (14 courses × 3 credits = 42 credits). Option two requires 3 compulsory courses and 9 elective courses (12 courses × 3 credits = 36 credits), plus completion of a 6-credit graduation project under the guidance of academic mentors and/or industry mentors. Students who complete the 42 professional credits will be awarded a degree certificate from the Chinese University of Hong Kong.


Required Courses

Data Mining

This course teaches the fundamental theory and techniques for data processing and data mining. The course covers data pre-processing, feature selection and extraction, pattern discovery from data, classification, clustering and outlier detection.

Machine Learning

This course introduces basic theory, methodologies and tools for machine learning. It will cover supervised learning (support vector machine, neural network and kernel methods), unsupervised learning (ensemble, dimension reduction and deep learning), and applications of machine learning.

Optimization and Modeling

This course covers basic optimization theory and algorithms for optimization problems in machine learning and data science. Basic theory covered will include convex analysis and optimality conditions for unconstrained and constrained problems. Basic optimization algorithms such as the gradient method, the proximal gradient method and its accelerated versions, the alternating direction method of multipliers, and stochastic approaches will be presented. Second-order algorithms such as Newton’s method and inexact Newton methods will also be introduced. The course will require algorithm implementation and problem solving on computers.

Elective Courses

Internet Finance

The course provides the tools necessary to analyze the opportunities and potential competitive threats in commercial Web-based organizations. To quantify and apply the analysis, a particular focus is on valuing Internet companies based on a careful examination of their business model and environment. The course also covers the basic theory of financial intermediation as it applies to online financial service firms. It discusses the impact of a migration to online financial services and the competitive changes created.

Quantitative Portfolio Analysis

This course introduces formal quantitative analytical concepts and tools used to manage security portfolios from the perspective of an institutional investor. The following topics will be covered: market microstructure, margin purchasing, short selling, portfolio risk management, risk/return tradeoffs, strategic/tactical asset allocation, active versus passive management, portfolio revision, and performance evaluation.

Internship Training II

This course is designed to enhance the student internship experience and facilitate student professional development. It provides students with the opportunity to set specific and individualized goals and identify growth areas in an internship relevant to their potential career field in both big data analytics and business analytics areas. Students will be able to apply knowledge gained in the classroom to real-world challenges in an internship environment. This is a continuation of Internship Training I, allowing students to pursue more depth or breadth in their career paths.

Internship Training I

This course is designed to enhance the student internship experience and facilitate student professional development. It provides students with the opportunity to set specific and individualized goals and identify growth areas in an internship relevant to their potential career field in both big data analytics and business analytics areas. Students will be able to apply knowledge gained in the classroom to real-world challenges in an internship environment.

Capstone Project

This capstone project course enables experiential learning by applying the analytics methodologies, techniques, and tools learned throughout the Programme to real-world problems. Taking advantage of the dynamic business and technology markets in the Pearl River Delta region, students will work with real business clients to develop the solution and present the results, insights, and recommendations.

Operations Management and Analytics

This course introduces the analysis of key issues related to the design and management of operations using quantitative tools such as linear, integer, and non-linear programming, regression, and statistical analysis. It covers important topics such as forecasting, aggregate planning, inventory theory, transportation, production control and scheduling, and facility location, among others, and uses mathematical modeling, spreadsheet analysis, case studies, and simulations to deliver materials.

Reinforcement Learning

This course introduces one important subject of machine learning: reinforcement learning, which is considered core to artificial intelligence. Topics include fundamentals of reinforcement learning, bandit problems, Markov decision processes, dynamic programming, Monte Carlo methods, temporal-difference learning, on-policy vs. off-policy learning, learning vs. planning, approximation methods, eligibility traces, policy gradient methods, and actor-critic methods.

Deep Learning

Deep learning solves the central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations. Deep learning enables the computer to build complex concepts out of simpler concepts. This course is developed to study the popular concepts and techniques of deep learning. The contents include probability and information theory basics; machine learning basics; deep feedforward networks and autoencoder networks; regularization and optimization for deep learning; Convolutional Neural Networks (CNNs); Recurrent Neural Networks (RNNs); Generative Adversarial Networks (GANs); and their applications.

Stochastic Process

A stochastic process is a mathematical model for random phenomena that change in time. It builds on probability theory and functional analysis, the latter because a sample of the process is a function of time. Some knowledge of both would be helpful. However, it is more important to consider how randomness arises in time, intuitively and mathematically. Through examples, this course provides tools for this consideration and shows how to use them in applications. The examples are taken from coin tossing, lotteries, service systems including communication and queueing networks, inventories and risk analysis. To follow the course, students are expected to have some familiarity with basic mathematics at undergraduate level, including elementary mathematical logic and naive set theory.

Corporate Governance and Social Responsibility in China

This course aims to develop a sound understanding of the underlying concepts and theories of corporate governance and corporate social responsibility, which are both indispensable in today’s business environment. It explores the use of different internal control strategies and corporate governance practices and the integration of ethics in achieving efficiency, effectiveness and economy in operations and in complying with legal, regulatory, social and corporate oversight requirements, with a particular focus on issues in China.

Deep Learning and Their Applications

This course will cover key new techniques intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The topics include multilayer perceptrons (MLP), stacked autoencoders (denoising, variational and adversarial), convolutional neural networks (ResNet, Inception), recurrent neural networks (LSTM, GRU), multi-task deep learning, transfer learning, deep generative models and deep Gaussian processes. Selected use cases include sentiment analysis, chatbots, image-based document classification and machine translation.

Dynamic Programming

Dynamic Programming is a fundamental tool widely used to model and solve various engineering problems. This course is developed to study the popular concepts and techniques of dynamic programming. The contents include Principles of Optimality; Dynamic Programming Algorithm; Deterministic Dynamic Programming Problems; Stochastic Dynamic Programming Problems with Perfect and Imperfect Information; Approximate Dynamic Programming and Infinite Horizon Problems.

Fintech Theory and Practice

The objective of this course is to introduce students to Fintech theory and practice. This course covers the applications of new technologies such as big data, blockchain, and artificial intelligence (AI) in financial services; new forms of financial services such as peer-to-peer lending and crowdfunding; cryptocurrencies; and Fintech regulations. Representatives from banks, hedge funds, and insurance companies will share recent Fintech developments in their companies with students.

Derivatives and Risk Management Techniques

This course provides both introductory theory and a working knowledge of contemporary financial derivatives, with an emphasis on the use of derivatives in financial risk management. The theory component covers some fundamental pricing principles that apply to various derivative contracts in financial markets. The working knowledge component will cover the main types of derivatives contracts and valuation techniques. The course includes an Options Market Making Simulation, which aims to help students gain more practical knowledge of the sophisticated options market-making mechanism. This subject is both theoretical and practical; the emphasis will be on problem-solving and analytical skills.

Big Data Marketing

This course will introduce basic big data approaches and their marketing applications. The topics include trends of big data applications, consumer evolution in the digital age, big data insights into business, text mining and topic modeling, Web search data and Internet marketing, social network and social media marketing, mobile marketing, customer interaction strategies, and data-driven marketing strategy. Methodologies and techniques, including text analysis, Web crawling, logistic regression, and social network analysis, will be introduced and their business applications will be discussed. This course aims to help students develop analytics skills and abilities combined with innovative business ideas to create effective big-data marketing strategies in today’s business and technology environment.

Cloud Computing

This course is designed for graduate students from all academic programs at CUHK-Shenzhen. Topics covered include principles of Internet computing, cloud systems architecture, cloud enabling technologies including virtualization, software and language tools, and applications and programming of existing cloud systems. Case studies include popular clouds and application tools such as AWS, Google, Salesforce, Azure, Aliyun, Baidu, Hadoop, Spark, Python, TensorFlow, VMWare, etc. We will emphasize the interactions between existing clouds and the Internet of Things (IoT), 5G mobile networks, machine learning, big data analytics, cognitive computing, and artificial intelligence (AI) applications. Students will have hands-on experience in using the available AWS, Tencent, or Aliyun clouds.

Optimization Theory and Algorithms

This course covers basic theory and algorithms for unconstrained and constrained optimization problems, convex and non-convex optimization problems, and optimality conditions, including duality theory. Algorithms include basic first-order and second-order methods. Some applications of optimization, such as those in data science, will be introduced. The course also requires algorithm implementation and problem solving on computers.

Blockchain

This course is designed to introduce the fundamentals of blockchain, its solution scenarios, and enabling technologies. The main contents of the fundamentals part include, but are not limited to, the concepts of blockchain, the evolution of business-to-business collaboration, digital currency, cryptocurrency, bitcoin, mining machines, digital asset trading, initial coin offerings, the digital economy, smart contracts, consensus mechanisms, and privacy and security issues. The solution scenarios part includes case studies of using blockchain to transform individual industries and application domains. A blockchain solution reference architecture will be articulated and leveraged to analyze and design a blockchain solution. The enabling technologies part includes blockchain-related cloud computing, artificial intelligence, content-based file systems, and new programming models.

Artificial Intelligence: Law and Policy

Artificial intelligence is poised to become the fourth industrial revolution, fundamentally changing the way we live, work, and learn. This course seeks to explore the legal and policy aspects of artificial intelligence. In particular, the course will introduce the concepts of machine learning and artificial intelligence, examine how artificial intelligence changes the ways in which legal services are delivered and policy decisions are made, and evaluate the wider legal and policy issues created by the use of artificial intelligence.

Economic Analytics

This course is about applying economic models with data to deal with corporate decisions and strategies. The art and science of economic modeling for this purpose makes use of the principles and the tools derived from the studies of information economics, games, industrial organization and the related fields. The students will learn various econometric methods, such as least-squares, maximum likelihood and generalized method of moments estimators.

Data Visualization

Data visualization is the process used to communicate information clearly in graphical form. Topics covered include an overview of the concepts and models used to visualize data, data models, graphical perception, techniques for visual encoding and interaction, data visualization tools, and data visualization projects using business-focused data from sources such as social media, e-commerce, and marketing.

Data-Driven Experimentation and Measurement

This course introduces controlled experiments in business settings, experiment design, and A/B testing; specialized statistical methodologies; and fundamentals of econometrics, instrumental variable regression, and propensity score matching.

Web Analytics

Through a combination of case studies, theoretical concepts and frameworks used in business applications, this course introduces the theory and practice of Web Analytics and User Experience (UX) research. It provides a theoretical perspective on what Web Analytics is and the KPIs used in Web Analytics. The second part of the course focuses on UX Research and draws heavily on applied statistics and experimental design. Other topics covered in the course are social media, network security and user privacy. The tutorials will use the R programming language.

Big Data Modeling and Management

As web technology and mobile use rapidly evolve, the volume of user-generated data expands exponentially. Distilling knowledge from such a large amount of unstructured, dynamically changing data is an extremely difficult task without the help of distributed techniques. This course introduces state-of-the-art Big Data analytical concepts, techniques and tools. By taking this course, students will gain hands-on technical experience in solving Big Data problems using distributed algorithms and tools widely adopted in industry. The topics include basic concepts of Big Data, installation and configuration of Hadoop and Spark in a multi-node environment, distributed algorithms (recommender systems, clustering, classification, topic models, and network analysis), and web crawling and web data extraction using major application programming interfaces.

Applied Parallel Programming

This course introduces the fundamentals of parallel programming, from task parallelism to data parallelism. By taking this course, students will learn to reason about task and data parallel programs, express common algorithms in a functional style and solve them in parallel, and write programs that effectively use parallel collections to achieve performance. The topics include basics of parallel programming, task parallelism algorithms, data parallelism, data structures for parallel computing and practical applications.

Artificial Intelligence

This course provides a technical introduction to fundamental concepts of artificial intelligence (AI). Topics include: fuzzy logic and applications; fuzzy expert systems; fuzzy query; fuzzy data and knowledge engineering; fuzzy control; genetic algorithms and programming and their applications; parallel genetic algorithms; island model and coevolution; genetic programming.

Analysis of Numerical Algorithms

This course introduces the basic numerical techniques to solve mathematical problems on a computer. Algorithms for several common problems encountered in mathematics, science and engineering are introduced. Topics include linear equations, nonlinear equations, polynomial interpolation and splines, numerical integration, least squares and fast Fourier transform.

Applied Regression Analysis

This course introduces the basic theory and methodologies of regression analysis, and demonstrates its practical applications. Topics include scatter plots, simple linear regression, multiple regression, interpretation of main effects, complex regressors, analysis of variance, and nonlinear regression.

Image Processing and Computer Vision

A picture is worth a thousand words. How does a computer “see” and “understand” a picture? This course is an introductory graduate level course to answer such questions. We will first cover principles of image formation, operations to alter images, feature extraction and other image processing methods to turn images into abstract descriptions. We will then turn to computer vision topics that discuss how to perceive the structure and semantics of the world, including multi-view geometry, structure from motion, and visual localization and recognition. We will also touch upon related topics in machine learning which are widely used in computer vision.

Natural Language Processing

In the deep learning era, representation learning is a crucial task for many artificial intelligence (AI) applications. In natural language processing (NLP) especially, representing text properly is an important yet challenging task with great potential to improve NLP quality in many respects. Pre-trained models have recently become the prevailing technique for obtaining state-of-the-art performance in many NLP tasks, and text representation and the way it is learned have become a focal point in NLP and in the AI field more broadly. In this course, we will cover several basic representation (embedding) methods for different levels of text granularity, such as words, sentences and documents. We will also cover other advanced or task-specific embedding techniques that are learned on texts, and the cutting-edge learning approaches for pre-trained models. Students in this course will have the chance to learn and practice up-to-date research on text representation learning and see how it is applied in real-world applications.

Artificial Intelligence in Medical Imaging

Artificial Intelligence (AI) in healthcare and Medical Imaging has developed rapidly over the last decade, and now it has become one of the most exciting application areas of AI. This course will explore the basic knowledge and the latest advances of Deep Learning based methods in the medical field, with special attention to challenges and opportunities for Medical AI. This course will provide students with the opportunity to learn skills to train/learn/develop Deep Learning models from medical data. It covers knowledge on Digital Medical Imaging, supervised learning, semi-supervised learning and unsupervised learning. Importantly, some hot topics like cutting-edge research on few-shot learning and AI security in Medical Imaging will also be introduced.

Advanced Time Series Analysis

This course will cover basic concepts and analysis methods of time series data. Topics include: stationary and non-stationary linear time series models, model specification, parameter estimation, model diagnostics, forecasting, seasonal ARIMA models, deterministic trends and exponential smoothing, and ARCH and GARCH models. The course also offers a brief introduction to advanced topics including threshold models, spectral analysis, multivariate time series and machine learning for time series. We will use R for demonstration and projects.

Database Principles and Development

This course introduces students to the key concepts of database systems, the basics of the Structured Query Language (SQL) as well as basic database design for storing data as part of a multi-step data gathering, analysis, and processing effort. The course covers concepts of database systems, basics of SQL, Data Models and Relational SQL, Many-to-Many Relationships in SQL, Databases and Visualization, introduction to NoSQL.

Python Programming

Python Programming is designed to teach data science using Python. The course covers Python setup and familiarization with the environment, writing programs in Python, Python data structures and supporting libraries including NumPy, exploratory data analysis using libraries such as Pandas, predictive model design including regression analysis, decision trees and other prediction models, visualizations using Matplotlib, and an implementation project to practice the concepts learned during the course.

Theory of Statistics

This course teaches basic theory and methodologies for probability and statistical analysis. The course covers random variables and probability distributions; sufficiency and likelihood principles of data reduction; point estimation; hypothesis testing; interval estimation; and asymptotic evaluations.

Mining Massive Datasets

This course will survey state-of-the-art topics in Big Data, looking at data collection (smartphones, sensors, the Web), data storage and processing (scalable relational databases, Hadoop, Spark, etc.), extracting structured data from unstructured data, systems issues (exploiting multicore, security), analytics (machine learning, data compression, efficient algorithms), visualization, and a range of applications.

Microstructure and Algorithm Trading

This course introduces the foundations of securities trading and discusses market microstructure and optimal trading strategies. It covers the nature of markets and prices, trading mechanisms, market microstructure models, trading costs, optimal trading strategies, and high-frequency trading.

Fixed-income Securities Analysis

This course introduces the analytical tools and concepts needed to price fixed income securities. Topics include the pricing and hedging of bonds, inflation-indexed bonds, derivatives, and other types of fixed income securities. Emphasis will be placed on the student’s ability to price these securities by appropriately discounting future cash flows for time and risk.

Credit Risk Modelling and Products

The course introduces credit risk modelling and the evaluation and management of credit derivatives. It covers structural models of default risk, intensity-based modelling, risk structures of interest rates, credit default swaps, CDOs and related products.

Electronic Payment Systems and Blockchain

This course covers various methods of transferring payments over the Internet, compares their functionality, and provides a solid overall understanding of blockchain technology. Topics include electronic money; electronic contracts; micro-payments; authenticity, integrity and reliability of transactions; encryption and digital signature techniques needed to support electronic cash; technologies available to support secure transactions on the Internet; and the fundamentals of blockchain technology.

Alternative Investment

This course mainly explains how an institutional investor can make sound asset allocation, timing and securities selection decisions. The contents include investment philosophy, asset management methodology, traditional and alternative asset types, risk and return characteristics, and investment steps and strategies.

Chinese Economy and Financial Markets

The Chinese economy is developing rapidly as the country’s leadership pushes forward its liberalization process and integration with the rest of the world. This course aims to provide in-depth coverage of the Chinese economy and its financial system, with a focus on its distinct characteristics. The objective is to understand the factors that drove China’s economic miracle over the last 30 years, the challenges that China is currently facing and the reforms that can help China avoid the middle-income trap. The role of the financial system in China and the future directions of financial market reforms will be discussed in class.

Applied Econometrics

This course provides a unified framework to study the properties of popular econometric methods used in economic analysis such as least-squares, maximum likelihood and generalized method of moments estimators. Topics in this class include the applications of these popular econometric methods to cross-sectional data and time series data.

Game Theory and Auction Theory

This is an advanced course on game theory and auction theory. We will cover topics in strategic games, extensive games of complete or incomplete information, epistemic foundations of game theory, repeated games, bargaining theory, coalitional games and matching theory. We will also discuss various applications of game theory in economic activities such as auctions.

Entrepreneurial Finance and Economics

The course covers financial topics relevant to newly formed companies, with an emphasis on innovative start-ups that target large markets and seek to raise outside capital. Topics include: (1) valuation, which is the course’s primary theme, underlying all the topics covered; (2) evaluating business opportunities, which focuses on the underlying economic principles that differentiate large opportunities from small opportunities; (3) funding business opportunities, which covers both identifying a company’s needs and acquiring the capital to finance those needs; and (4) discussing how successful entrepreneurial ventures “exit”.

Stochastic Models and Their Business Applications

The course focuses on the mathematical methods applied to economics and financial derivatives products. Probability and stochastic calculus will be studied before introducing the modeling theory for options. The course bridges the gap between option pricing theory and practice with examples of popular structured products in the financial market. Topics include probability, stochastic calculus, risk-neutral modeling, the Black-Scholes-Merton model and applications. After the course, students will be well prepared to work in the financial industry as traders, structurers, salespeople and risk managers. The course grade will be based on attendance, homework or a project, and a final exam.

Information Management

This course emphasizes the importance of information in a business entity and how information management and technologies improve the entity’s competitive advantage. It introduces students to the role of electronic commerce in today’s business environment, the nature and value of information systems and information management, the process of system development, and information technology applications.

Forensic and Forecasting Analytics

This course explores the use of financial and non-financial data for solving problems in financial accounting, managerial accounting, audit, internal control and corporate governance contexts. Students will gain exposure to different advanced data analytics techniques and predictive models such as text analytics, neural networks and deep learning to detect irregularities, anomalies and potential fraud in accounting data. Students will gain knowledge and hands-on experience in applying these techniques to make predictions by generating value from accounting data.

Business Valuation and Financial Statement Analysis

This course introduces the valuation techniques in the fields of corporate finance, equity research, fund management and strategy consulting employed by analysts and investors while valuing stocks and firms. It explores how to use financial statements to develop an in-depth fundamental analysis of the business which can be applied to a range of investment and strategic decisions. Specific topics covered will include models of shareholder value, financial diagnosis, and future earnings and cash flow forecast. Much of the course’s emphasis is on case studies involving listed companies.

Data Mining and Business Analytics

This course introduces fundamental concepts, technologies, and applications of business analytics using Big Data. It covers the state-of-the-art topics in Big Data including data collection, data storage and processing, data mining, predictive analytics, and cloud computing.

Accounting Data Strategy and Visualization

The growing volume of both structured and unstructured data has pushed forward a more data-driven form of decision making. Future accountants need to be able to collect and work with data. This course aims first to introduce various accounting and financial research datasets and to provide students with various quantitative analysis techniques for developing analytical data models to support decision-making. With the information developed from data modeling, it is crucial to communicate the practical implications of quantitative analyses to any kind of audience. The course further provides an introduction to, and hands-on experience in, data visualization and visual analytics to help summarize large amounts of data effectively. Students will learn to combine analytic and interactive visualization approaches and use them to demonstrate or provide insights into real-world problems and situations.

Text Analytics in Financial Market

Thorough examination of huge amounts of text data is known to be a difficult task that requires an understanding of natural language processing. This course aims to provide students with fundamental techniques and major algorithms used for text processing and retrieval to extract useful information to support decision making. After taking the course, students will know how to independently obtain and analyze huge amounts of unstructured textual data to generate business insights for companies.

Financial Market and Instrument

This course aims to cover the composition of financial markets, the role financial markets and institutions play in the modern business environment and the common instruments and products in financial transactions with a particular focus on the risk-neutral pricing of securities and fundamental analysis of stocks.

Credit Rating and Credit Risk Management

This course aims to provide students with an in-depth understanding of the credit rating practices and methodologies employed by international credit rating agencies, including credit metric, distance to default and actuarial approaches, in assessing corporate credit risk. This course also examines the concept of credit risk and offers an in-depth understanding of credit risk management and credit derivatives, including new developments in both.

Artificial Intelligence Principles

This course presents students with a foundational understanding of state-of-the-art artificial intelligence (AI) technologies and their marketing implications as well as their limitations. We will cover three key AI technologies (machine learning, natural language processing, and robotics) and discuss their marketing applications. Students will gain a practical introduction to these key AI technologies and their marketing implications. The course does not assume any particular technological background, though some programming knowledge is a plus. Students will focus on the marketing and managerial implications of these technologies and how they can be applied in the workplace. In addition, students will have the opportunity to learn how to apply these AI technologies using real marketing datasets.

IOT and Retail Technology

The Internet of Things (also referred to as IoT) is an Internet technology that extends Internet connectivity beyond standard devices, such as desktops, laptops, and smartphones, to any range of everyday objects, including traditionally non-internet-enabled physical devices. By combining low-power, battery-free hardware with real-time digital analytics, IoT disrupts the traditional retail process and enables retailers to transform their customer service relationships and provide consumers with a seamless shopping experience, while meeting, and surpassing, the expectations of increasingly tech-savvy consumers. This course introduces the key concepts of IoT and its applications in the retail industry.