What Should a Doctor Know About AI? has announced a new AI course for doctors. This is the first AI course specifically designed for doctors.

This course is the first in a series of data science, deep learning, and decision science courses for physicians scheduled to be launched in 2019 by Professors Chirag Patel and Arjun Manrai from Harvard Medical School, experts in bioinformatics and advisors at These courses will make machine learning tools accessible to a wide number of people in medicine.

The creation of this course was inspired by a conversation between Walter de Brouwer, PhD, co-founder of and Drs. Chirag Patel, PhD, and Arjun Manrai, PhD. The subject of their discussion was: what should a doctor know about AI? Although many patients and physicians are now collecting newly emerging, large-scale health data, few doctors are able to use these data. Further still, how does one make sense of the data in the context of the prodigious medical literature and the enormous growth of publicly available data? Medicine is far from having a systematic “evidence base” to support the use of these newly emerging data (or mitigate their misuse) in practice. What if doctors were empowered to use their own datasets to help their patients? The idea of developing a customized course for doctors was conceived during this conversation.

“A goal of this course is to impart skills in practical and applied machine learning for biomedicine. We envision that the tools of data science and machine learning will complement doctors’ existing armamentarium for diagnosing and providing care for their populations. At the very least, through practical and hands-on exercises, we aim to demonstrate how and when these methods work in an effort to take the hype out of the use of machine learning.”

Chirag Patel, PhD, Harvard Medical School

AI, Data Science, and Machine Learning

Data science uses computational methods to extract knowledge, insights and patterns from large datasets. It involves cleaning data, manipulating data, building customized algorithms, and communicating results.

Machine learning is a subdomain of AI that is used in data science. Machine learning algorithms extract patterns from data and perform tasks such as prediction and classification. It’s called machine learning because the computer “learns” from the data and improves over time.

Why doctors should learn about data science

Akshay Sharma, Chief Technology Officer at

Although large datasets are available, few doctors know how to use them, so they are rarely used to help patients. This course was designed to introduce doctors to data science so that they can process and analyze large datasets and interpret findings from the medical literature. Practical examples will be used throughout the course to help doctors understand how to implement these tools. Doctors in the course will learn how to procure a high-performance server online for machine learning and will create state-of-the-art, highly practical models for machine learning and decision science, with the aim of separating correlation from causation.

Data science can help doctors deliver contextualized and precise care to their patients. For example, a doctor may have a patient who is at risk for asthma or comes in with complaints of asthma. The doctor will administer tests that measure lung function to determine how well the patient can exhale. The doctor also collects information about the patient, including age, sex, and ethnicity. To understand how this particular patient’s lung function compares to that of other patients in the population, the doctor might search Google to determine the normal ranges for the lung function test for a patient of this age, sex, and ethnicity. For many tests there might be very little information across patients of different age, sex, and ethnic groups. Should this information be taken into account? If this doctor has data science training, she can download publicly available datasets from the CDC into her toolkit and quickly run analytical tests to determine the normal range of variation in patients with this background.
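The lung function example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's own material: the data are synthetic, the column names (`fev1_l`, `age_band`, `sex`, `ethnicity`), the group labels, and the helper `reference_ranges` are all hypothetical, and the formula used to generate the numbers is invented. A real analysis would load a public dataset in place of the synthetic table, and the resulting percentiles would not be a validated clinical reference.

```python
import numpy as np
import pandas as pd


def reference_ranges(df, value_col, group_cols, lower=0.025, upper=0.975):
    """Summarize value_col within each group as a central percentile interval."""
    grouped = df.groupby(group_cols, observed=True)[value_col]
    summary = pd.DataFrame({
        "n": grouped.size(),
        "lower": grouped.quantile(lower),
        "median": grouped.median(),
        "upper": grouped.quantile(upper),
    })
    return summary.reset_index()


# Synthetic stand-in for a public dataset (all values are made up).
rng = np.random.default_rng(0)
n = 2000
people = pd.DataFrame({
    "age": rng.integers(20, 80, size=n),
    "sex": rng.choice(["female", "male"], size=n),
    "ethnicity": rng.choice(["group_a", "group_b"], size=n),
})

# Invented relationship, used only to produce plausible-looking lung volumes in liters.
people["fev1_l"] = (
    4.0
    - 0.025 * (people["age"] - 20)
    - 0.6 * (people["sex"] == "female")
    + rng.normal(0, 0.4, size=n)
)

# Bucket ages so that each subgroup has enough people to estimate percentiles.
people["age_band"] = pd.cut(
    people["age"], bins=[20, 40, 60, 80], right=False,
    labels=["20-39", "40-59", "60-79"],
)

ranges = reference_ranges(people, "fev1_l", ["age_band", "sex", "ethnicity"])
print(ranges)

# Compare one hypothetical patient's measurement with her own subgroup.
match = ranges[
    (ranges["age_band"] == "40-59")
    & (ranges["sex"] == "female")
    & (ranges["ethnicity"] == "group_a")
].iloc[0]
patient_fev1 = 2.1  # hypothetical measurement in liters
below_range = patient_fev1 < match["lower"]
print("Below subgroup range:", below_range)
```

Grouping by age band, sex, and ethnicity also reports how many people fall in each subgroup (`n`), which makes it easy to see when a subgroup is too small for its range to be trusted.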

“So much information is buried in large publicly accessible datasets and never reaches the point of care. Our course will help doctors bring their keen clinical observations to repurpose these data streams using data science and machine learning.”

Arjun Manrai PhD, Harvard Medical School

Convergence and Collaboration

One big challenge in the healthcare industry is that there is a huge amount of clinical knowledge, but very few people are able to understand the data well enough to solve real-world problems. It’s important to build an ecosystem where people in AI, data science, and healthcare can collaborate to make the medical decision-making process transparent. Contributions from experts in all three of these domains are essential to advance AI in medicine. Data scientists generally come from a software engineering background. Although they have expertise in machine learning, statistics, math, and computer science, in most cases they lack domain expertise in healthcare.

To facilitate collaboration between these three groups, will host a platform for the global data science and medical communities. Using this platform, data scientists and doctors have instant “one-click” access to a scalable, powerful, inexpensive GPU and CPU infrastructure. To enhance this platform, has acquired Crestle (now, a popular data science analysis platform, and has rewritten it for greater scalability to ensure it meets the needs of the data science community. With the infrastructure, provides an environment for highly skilled AI physicians through providing educational courses and an easy to use deep learning infrastructure. The platform enables one-click deployment of deep learning packages to deliver the course. The course will also use foundational prediction modules developed by’s team.

The New AI Course

The new AI course for doctors was designed for interaction between instructors and students. The content was written specifically for doctors and is intended to be practical and directly usable. The course will cover basic programming, cleaning of data, and interpretation of medical literature. The course will include both lectures and labs and will culminate with students writing a paper. Doctors will be asked to pick a dataset, pick a tool, and ask a novel scientific question. They will then address the question by doing the appropriate data analysis and writing up their findings in a format that could potentially be submitted to a peer-reviewed journal.

Walter De Brouwer PhD — Chief Executive Officer , Akshay Sharma — Chief Technology Officer at

Although some programming experience is helpful for someone taking this course, it’s not required. The course was designed using modules so that doctors can get up and running using simple data science techniques from the beginning. Although large datasets are becoming more commonplace in medicine today, some physicians may not have had prior exposure to large datasets. This course was not designed with a specific type of medical practice in mind. Doctors in various specialties, from oncology to pediatrics to cardiovascular medicine, could benefit from it. In fact, having doctors from various specialties take this course would make it more interesting because of the questions that they will ask.

The course is intended to equip doctors with tools that they can use to analyze their own de-identified patient data in a practical way that will help their patients. Following this course, participants should be able to pick a data set and analyze it. The objective is to train doctors around the world to harness open source machine learning infrastructure to learn and practice building models with their own datasets. The course sets the stage for learning more advanced topics in machine learning, such as neural networks.

The course, scheduled to start in early 2019, will be taught by Drs. Patel and Manrai. Dr. Patel researches the influence of the genome and the environment on health, and Dr. Manrai researches the use of machine learning and informatics to extract reproducible signal from large-scale datasets. The new data science course is based on a data science course that Drs. Patel and Manrai are currently teaching for first-year PhD and Masters students at Harvard.

Chirag Patel PhD, Harvard Medical School and advisor at

Dr. Chirag Patel received his PhD in biomedical informatics from Stanford University. For his PhD thesis, Chirag created the first “search engine” to identify, with robust analytic support, environmental exposures associated with disease. The method, called an “Environment-wide association study”, is a machine learning approach to associate nutrients, pollutants, and infectious agents with complex diseases and outcomes, such as type 2 diabetes, cardiovascular disease, preterm birth, and aging. Chirag researches the role of the environment and the genome in health and is developing computational methods to infer over human genomic and environmental information with the tools of translational bioinformatics and data science.

Arjun Manrai PhD, Harvard Medical School and advisor at

Dr. Arjun Manrai received an undergraduate degree in Physics with Highest Honors from Harvard and earned his Ph.D. in Bioinformatics and Integrative Genomics from the Harvard-MIT Division of Health Sciences and Technology. Manrai’s research has two main foci: (1) Statistical and machine learning approaches to improve the use of genomic and laboratory data in the clinic, with the long-term goal of improving care across diverse demographic strata of the population; (2) Meta-research approaches to modeling the reproducibility of scientific inquiry and the value of research evidence, particularly in the increasingly common setting of communal investigation of large, shared datasets. This work involves integrating and analyzing massive heterogeneous datasets — including genetic, laboratory, clinical, and environmental data — with high-throughput, reproducible methods. His research has been published in the New England Journal of Medicine and JAMA, presented at the National Academy of Sciences, and featured in the Wall Street Journal, New York Times, and NPR.

Course Inquiries

would like to make this course broadly accessible; doctors from around the world are welcome to apply. Hospitals and medical groups from around the world are also welcome to contact if they are interested in this course. If you are interested in learning more please complete this inquiry form.

This article was written by Margaretta Colangelo. Margaretta is President of U1 Technologies and is a Partner at Deep Knowledge Ventures. Deep Knowledge Ventures is an early stage venture fund focused on AI, Blockchain and Longevity, with a specific interest in using AI in precision preventive medicine. Margaretta serves on the advisory board of the AI Precision Health Institute at the University of Hawaii Cancer Center. Margaretta is collaborating with to advance technical innovation in precision medicine, and to make medical innovation comprehensible. Margaretta is based in San Francisco. @realmargaretta