# Online courses directory (90)

**Please note that the verified certificate option is not currently open for this course. Please enroll in the audit track and you will be emailed when the verified certificate option is open for enrollment.**

The modern data analysis pipeline involves collection, preprocessing, storage, analysis, and interactive visualization of data.

The goal of this course, part of the Analytics: Essential Tools and Methods MicroMasters program, is for you to learn how to build these components and connect them using modern tools and techniques.

In the course, you’ll see how computing and mathematics come together. For instance, “under the hood” of modern data analysis lie numerical linear algebra, numerical optimization, and elementary data processing algorithms and data structures. Together, they form the foundations of numerical and data-intensive computing.

The hands-on component of this course will develop your proficiency with modern analytical tools. You will learn how to mash up Python, R, and SQL through Jupyter notebooks, among other tools. Furthermore, you will apply these tools to a variety of real-world datasets, thereby strengthening your ability to translate principles into practice.

This course is an introduction to linear algebra. It has been argued that linear algebra constitutes half of all mathematics. Whether or not everyone would agree with that, it is certainly true that practically every modern technology relies on linear algebra to simplify the computations required for Internet searches, 3-D animation, coordination of safety systems, financial trading, air traffic control, and everything in between. Linear algebra can be viewed either as the study of linear equations or as the study of vectors. It is tied to analytic geometry; practically speaking, this means that almost every fact you will learn in this course has a picture associated with it. Learning to connect the facts with their geometric interpretation will be very useful for you. The book which is used in the course focuses both on the theoretical aspects as well as the applied aspects of linear algebra. As a result, you will be able to learn the geometric interpretations of many of the algebraic concepts…

Are you interested in pursuing a degree in Data Science, but unsure whether you have the necessary math and programming skills? This assessment will help you identify your current readiness in three core areas required for the study of Data Science: Calculus, Linear Algebra, and Programming.

You can take this assessment at your own pace and receive a private score report that identifies your readiness in each specific area. We will also provide, when necessary, recommendations for additional free online study.

This assessment is free, unproctored, and not offered for credit; it is designed for enrichment and self-assessment for anyone interested in pursuing data science as a career.

The laws of nature are expressed as differential equations. Scientists and engineers must know how to model the world in terms of differential equations, and how to solve those equations and interpret the solutions. This course focuses on the equations and techniques most useful in science and engineering.

#### Course Format

This course has been designed for independent study. It provides everything you will need to understand the concepts covered in the course. The materials include:

- **Lecture Videos** by Professor Arthur Mattuck.
- **Course Notes** on every topic.
- **Practice Problems** with **Solutions**.
- **Problem Solving Videos** taught by experienced MIT Recitation Instructors.
- **Problem Sets** to do on your own with **Solutions** to check your answers against when you're done.
- A selection of **Interactive Java® Demonstrations** called *Mathlets* to illustrate key concepts.
- A full set of **Exams with Solutions**, including practice exams to help you prepare.

#### Content Development

Haynes Miller

Jeremy Orloff

Dr. John Lewis

Arthur Mattuck

## Other OCW Versions

OCW has published multiple versions of this subject.

## Related Content

Technological innovations have revolutionized the way we view and interact with the world around us. Editing a photo, re-mixing a song, automatically measuring and adjusting chemical concentrations in a tank: each of these tasks requires real-world data to be captured by a computer and then manipulated digitally to extract the salient information. Ever wonder how signals from the physical world are sampled, stored, and processed without losing the information required to make predictions and extract meaning from the data?

Students will find out in this rigorous mathematical introduction to the engineering field of signal processing: the study of signals and systems that extract information from the world around us. This course will teach students to analyze discrete-time signals and systems in both the time and frequency domains. Students will learn convolution, discrete Fourier transforms, the z-transform, and digital filtering. Students will apply these concepts in interactive MATLAB programming exercises (all done in browser, no download required).
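As a rough illustration of two of the topics named above, the sketch below implements linear convolution and a naive discrete Fourier transform. It is written in plain Python rather than the MATLAB used in the course exercises, and the function names `convolve` and `dft` and the sample sequences are assumptions made for this example only.

```python
import cmath


def convolve(x, h):
    """Linear convolution y[n] = sum_k x[k] * h[n - k] of two finite sequences."""
    if not x or not h:
        return []
    y = [0] * (len(x) + len(h) - 1)
    for i, x_i in enumerate(x):
        for j, h_j in enumerate(h):
            y[i + j] += x_i * h_j
    return y


def dft(x):
    """Naive O(N^2) DFT: X[k] = sum_n x[n] * exp(-2*pi*i*k*n / N)."""
    count = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / count) for n in range(count))
        for k in range(count)
    ]


# Made-up input: a short signal passed through a two-tap running sum.
print(convolve([1, 2, 3], [1, 1]))  # [1, 3, 5, 3]

# A unit impulse has a flat magnitude spectrum.
print([round(abs(value), 6) for value in dft([1, 0, 0, 0])])  # [1.0, 1.0, 1.0, 1.0]
```

The quadratic-time transform is chosen here for readability; a fast Fourier transform computes the same values in O(N log N) time.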

Part 1 of this course analyzes signals and systems in the time domain. Part 2 covers frequency domain analysis.

Prerequisites include strong problem solving skills, the ability to understand mathematical representations of physical systems, and advanced mathematical background (one-dimensional integration, matrices, vectors, basic linear algebra, imaginary numbers, and sum and series notation). Part 1 is a prerequisite for Part 2. This course is an excerpt from an advanced undergraduate class at Rice University taught to all electrical and computer engineering majors.

The modern smartphone is enabled by a billion-plus nanotransistors, each having an active region that is barely a few hundred atoms long. Interestingly, the same amazing technology has also led to a deeper understanding of the nature of current flow on an atomic scale, and my aim is to make these lessons from nanoelectronics accessible to anyone in any branch of science or engineering. I will assume very little background beyond linear algebra and differential equations, although we will be discussing advanced concepts in non-equilibrium statistical mechanics that should be of interest even to specialists.

In the first half of this course (4 weeks) we will introduce a new perspective connecting the quantized conductance of short ballistic conductors to the familiar Ohm's law of long diffusive conductors, along with a brief description of the modern nanotransistor. In the second half (4 weeks) we will address fundamental conceptual issues related to the meaning of resistance on an atomic scale, the interconversion of electricity and heat, the second law of thermodynamics and the fuel value of information.

Overall I hope to show that the lessons of nanoelectronics lead naturally to a new viewpoint, one that changes even some basic concepts we all learn in freshman physics. This unique viewpoint not only clarifies many old questions but also provides a powerful approach to new questions at the frontier of modern nanoelectronics, such as how devices can be built to control the spin of electrons.

This course was originally offered in 2012 on nanoHUB-U and the accompanying text was subsequently published by World Scientific. I am preparing a second edition for publication in 2015, which will be used for this course. The manuscript will be made available to registered students.

__Sample comments:__

From Roald Hoffmann, http://en.wikipedia.org/wiki/Roald_Hoffmann

Cornell University

*"… the pedagogical imperative in research is very important to me, and so I really value a kindred spirit. Your (Datta's) online courses are just wonderful!"*

From an anonymous student in a previous offering.

*"The course was just awesome .. Prof. Datta's style of delivering lecture is mind-blowing."*

This course is the latest in a series offered by the nanoHUB-U project which is jointly funded by Purdue and NSF with the goal of transcending disciplines through short courses accessible to students in any branch of science or engineering. These courses focus on cutting-edge topics distilled into short lectures with quizzes and practice exams.

This topic continues our journey through the world of Euclid by helping us understand angles and how they can relate to each other. Angle basics. Measuring angles in degrees. Using a protractor. Measuring angles. Measuring angles. Acute right and obtuse angles. Angle types. Vertical, adjacent and linearly paired angles. Exploring angle pairs. Introduction to vertical angles. Vertical angles. Using algebra to find the measures of vertical angles. Vertical angles 2. Proof-Vertical Angles are Equal. Angles Formed by Parallel Lines and Transversals. Identifying Parallel and Perpendicular Lines. Figuring out angles between transversal and parallel lines. Congruent angles. Parallel lines 1. Using algebra to find measures of angles formed from transversal. Parallel lines 2. CA Geometry: Deducing Angle Measures. Proof - Sum of Measures of Angles in a Triangle are 180. Triangle Angle Example 1. Triangle Angle Example 2. Triangle Angle Example 3. Challenging Triangle Angle Problem. Proof - Corresponding Angle Equivalence Implies Parallel Lines. Finding more angles. Angles 1. Angles 2. Sum of Interior Angles of a Polygon. Angles of a polygon. Sum of the exterior angles of convex polygon. Introduction to angles (old). Angles (part 2). Angles (part 3). Angles formed between transversals and parallel lines. Angles of parallel lines 2. The Angle Game. Angle Game (part 2). Acute right and obtuse angles. Complementary and supplementary angles. Complementary and supplementary angles. Example using algebra to find measure of complementary angles. Example using algebra to find measure of supplementary angles. Angle addition postulate.

The goal of this course is to give you solid foundations for developing, analyzing, and implementing parallel and locality-efficient algorithms. This course focuses on theoretical underpinnings. To give a practical feeling for how algorithms map to and behave on real systems, we will supplement algorithmic theory with hands-on exercises on modern HPC systems, such as Cilk Plus or OpenMP on shared memory nodes, CUDA for graphics co-processors (GPUs), and MPI and PGAS models for distributed memory systems.

This course is a graduate-level introduction to scalable parallel algorithms. “Scale” really refers to two things: staying efficient as the problem size grows, and staying efficient as the system size (measured in numbers of cores or compute nodes) grows. To really scale your algorithm in both of these senses, you need to be smart about reducing asymptotic complexity the way you’ve done for sequential algorithms since CS 101; but you also need to think about reducing communication and data movement. This course is about the basic algorithmic techniques you’ll need to do so.

The techniques you’ll encounter cover the main algorithm design and analysis ideas for three major classes of machines: for multicore and manycore shared memory machines, via the work-span model; for distributed memory machines like clusters and supercomputers, via network models; and for sequential or parallel machines with deep memory hierarchies (e.g., caches). You will see these techniques applied to fundamental problems, like sorting, search on trees and graphs, and linear algebra, among others.

The practical aspect of this course is implementing the algorithms and techniques you’ll learn to run on real parallel and distributed systems, so you can check whether what appears to work well in theory also translates into practice. (Programming models you’ll use include Cilk Plus, OpenMP, and MPI, and possibly others.)
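To make the work-span model mentioned above a little more concrete, here is a minimal sketch that counts work and span for a divide-and-conquer sum. It runs sequentially and only does the bookkeeping: the two recursive calls are the ones a Cilk Plus or OpenMP version could run in parallel. The function name `reduce_sum` and the cost model (one unit per addition) are assumptions for illustration, not material from the course.

```python
def reduce_sum(values):
    """Return (total, work, span) for a divide-and-conquer sum.

    work counts every addition performed; span counts the additions on the
    longest dependency chain, assuming both halves could run in parallel.
    """
    if not values:
        raise ValueError("reduce_sum needs at least one value")
    if len(values) == 1:
        return values[0], 0, 0
    mid = len(values) // 2
    left_total, left_work, left_span = reduce_sum(values[:mid])
    right_total, right_work, right_span = reduce_sum(values[mid:])
    total = left_total + right_total
    work = left_work + right_work + 1       # all additions are paid for
    span = max(left_span, right_span) + 1   # only the slower half delays us
    return total, work, span


total, work, span = reduce_sum(list(range(1, 9)))
print(total, work, span)  # 36 7 3
```

For n inputs this gives work n - 1 and span about log2(n), so the ratio of work to span, which bounds the useful parallelism, grows roughly as n / log n.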

If you’re interested in data analysis and interpretation, then this is the data science course for you. We start by learning the mathematical definition of distance and use this to motivate the use of the singular value decomposition (SVD) for dimension reduction and multi-dimensional scaling and its connection to principal component analysis. We will learn about the *batch effect*, the most challenging data analytical problem in genomics today, and describe how these techniques can be used to detect and adjust for batch effects. Specifically, we will describe principal component analysis and factor analysis and demonstrate how these concepts are applied to data visualization and data analysis of high-throughput experimental data.

Finally, we give a brief introduction to machine learning and apply it to high-throughput data. We describe the general idea behind clustering analysis, describe k-means and hierarchical clustering, and demonstrate how these are used in genomics. We also describe prediction algorithms such as k-nearest neighbors, along with the concepts of training sets, test sets, error rates, and cross-validation.
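The sketch below illustrates k-nearest neighbors and a test-set error rate on a tiny made-up dataset. It uses Python for illustration, whereas the course itself works in R; the function names, the data points, and the choice of k = 3 are all assumptions for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    by_distance = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_points, train_labels, test_points, test_labels, k=3):
    """Fraction of held-out test points whose predicted label is wrong."""
    wrong = sum(
        knn_predict(train_points, train_labels, point, k) != label
        for point, label in zip(test_points, test_labels)
    )
    return wrong / len(test_points)


# Made-up two-dimensional data: two well-separated groups.
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]
test_points = [(0.5, 0.5), (5.5, 5.5)]
test_labels = ["a", "b"]

print(error_rate(train_points, train_labels, test_points, test_labels))  # 0.0
```

Keeping the test points out of the training set is what makes the error rate an honest estimate; cross-validation repeats this split several times and averages the results.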

Given the diversity in the educational backgrounds of our students, we have divided the series into seven parts. You can take the entire series or individual courses that interest you. If you are a statistician, you should consider skipping the first two or three courses; similarly, if you are a biologist, you should consider skipping some of the introductory biology lectures. Note that the statistics and programming aspects of the class ramp up in difficulty relatively quickly across the first three courses. By the third course we will be teaching advanced statistical concepts such as hierarchical models, and by the fourth, advanced software engineering skills such as parallel computing and reproducible research concepts.

These courses make up 2 XSeries and are self-paced:

PH525.1x: Statistics and R for the Life Sciences

PH525.3x: Statistical Inference and Modeling for High-throughput Experiments

PH525.4x: High-Dimensional Data Analysis

PH525.5x: Introduction to Bioconductor: annotation and analysis of genomes and genomic assays

PH525.6x: High-performance computing for reproducible genomics

PH525.7x: Case studies in functional genomics

This class was supported in part by NIH grant R25GM114818.

HarvardX requires individuals who enroll in its courses on edX to abide by the terms of the edX honor code. HarvardX will take appropriate corrective action in response to violations of the edX honor code, which may include dismissal from the HarvardX course; revocation of any certificates received for the HarvardX course; or other remedies as circumstances warrant. No refunds will be issued in the case of corrective action for such violations. Enrollees who are taking HarvardX courses as part of another program will also be governed by the academic policies of those programs.

HarvardX pursues the science of learning. By registering as an online learner in an HX course, you will also participate in research about learning. Read our research statement to learn more.

Harvard University and HarvardX are committed to maintaining a safe and healthy educational and work environment in which no member of the community is excluded from participation in, denied the benefits of, or subjected to discrimination or harassment in our program. All members of the HarvardX community are expected to abide by Harvard policies on nondiscrimination, including sexual harassment, and the edX Terms of Service. If you have any questions or concerns, please contact harvardx@harvard.edu and/or report your experience through the edX contact form.

If you’re interested in data analysis and interpretation, then this is the data science course for you.

*Enhanced throughput*: Almost all recently manufactured laptops and desktops include multi-core CPUs. With R, it is very easy to obtain faster turnaround times for analyses by distributing tasks among the cores for concurrent execution. We will discuss how to use Bioconductor to simplify parallel computing for efficient, fault-tolerant, and reproducible high-performance analyses. This will be illustrated with common multicore architectures and Amazon’s EC2 infrastructure.

*Enhanced interactivity*: New approaches to programming with R and Bioconductor allow researchers to use the web browser as a highly dynamic interface for data interrogation and visualization. We will discuss how to create interactive reports that enable us to move beyond static tables and one-off graphics so that our analysis outputs can be transformed and explored in real time.

*Enhanced reproducibility*: New methods of virtualization of software environments, exemplified by the Docker ecosystem, are useful for achieving reproducible distributed analyses. The Docker Hub includes a considerable number of container images useful for important Bioconductor-based workflows, and we will illustrate how to use and extend these for sharable and reproducible analysis.

Are you thinking about teaching high school math? Learning how to implement effective teaching strategies is essential to creating a successful environment for both you and your students. In this education and teacher training course, you will learn, from experts at Teach for America, how to teach introductory fundamental geometry concepts and empower students to explore math on their own.

You will have the opportunity to engage in creative teaching practices and learn new ways to imagine and visualize approaches for teaching geometry. You will also learn best practices to better support student learning in a math classroom. If you are preparing for or considering teaching geometry, this course will be ideal for you as you explore effective classroom instructional strategies and operations and better understand the foundational content you will teach in a high school geometry class across diverse student communities.

This course includes a weekly module that covers teaching strategies while also introducing commonly taught high school geometry content including:

- segment/angle addition
- basic and parallel/transversal line angle
- angle relationships
- perpendicular and parallel linear equations

You can jump between lessons to quickly review geometry content and classroom strategies relevant to your learning interests. This course covers the geometry curriculum typically covered in a high school classroom, and Common Core State Standards (CCSS) alignment is indicated where applicable. No prerequisite knowledge is required; however, basic algebra skills will be useful (as well as a strong desire to make an engaging and inclusive geometry class)!

This course provides a brief review of introductory algebra topics. Topics to be covered include integer operations, order of operations, perimeter and area, fractions and decimals, scientific notation, ratios and rates, conversions, percents, algebraic expressions, linear equations, the Pythagorean theorem, and graphing.

We begin with an introduction to the biology, explaining what we measure and why. Then we focus on the two main measurement technologies: next generation sequencing and microarrays. We then move on to describing how raw data and experimental information are imported into R and how we use Bioconductor classes to organize these data, whether generated locally or harvested from public repositories or institutional archives. Genomic features are generally identified using intervals in genomic coordinates, and highly efficient algorithms for computing with genomic intervals will be examined in detail. Statistical methods for testing gene-centric or pathway-centric hypotheses with genome-scale data are found in packages such as limma; some of these techniques will be illustrated in lectures and labs.

Matrix algebra underlies many of the current tools for experimental design and the analysis of high-dimensional data. In this introductory data analysis course, we will use matrix algebra to represent the linear models that are commonly used to model differences between experimental units. We perform statistical inference on these differences. Throughout the course we will use the R programming language.
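As a small illustration of a linear model written in matrix form, the sketch below fits a two-group comparison by solving the normal equations (XᵀX)β = Xᵀy for a design matrix with an intercept column and a group indicator column. It is written in Python for illustration, whereas the course uses R; the function name, the design matrix, and the measurements are made up for this example.

```python
def fit_two_column_model(design, response):
    """Least-squares fit for a two-column design via the normal equations.

    Solves (X^T X) beta = X^T y using the closed-form 2x2 inverse.
    """
    # Entries of the symmetric 2x2 matrix X^T X.
    a = sum(row[0] * row[0] for row in design)
    b = sum(row[0] * row[1] for row in design)
    d = sum(row[1] * row[1] for row in design)
    # Entries of the vector X^T y.
    u = sum(row[0] * y for row, y in zip(design, response))
    v = sum(row[1] * y for row, y in zip(design, response))
    det = a * d - b * b
    if det == 0:
        raise ValueError("design columns are linearly dependent")
    return (d * u - b * v) / det, (a * v - b * u) / det


# Column 0 is the intercept; column 1 marks membership in the second group.
design = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
response = [1, 2, 3, 5, 6, 7]  # made-up measurements

intercept, difference = fit_two_column_model(design, response)
print(intercept, difference)  # 2.0 4.0
```

With this encoding the first coefficient is the mean of the first group and the second coefficient is the difference between the group means, which is the quantity on which inference is then performed.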

This course offers an advanced introduction to numerical linear algebra. Topics include direct and iterative methods for linear systems, eigenvalue decompositions and QR/SVD factorizations, stability and accuracy of numerical algorithms, the IEEE floating point standard, sparse and structured matrices, preconditioning, and linear algebra software. Problem sets require some knowledge of MATLAB®.
