
BIOF 088 | Introduction to Text Mining Using Python

April 20, 2021 to April 23, 2021


Registration occurs on a first-come, first-served basis. The deadline for registration is one week before the first day of the course. If you are unable to register before the deadline, please email: or call 301-496-7977 for space availability.

NIH Fellows or NIH community members who are being sponsored by their lab and are awaiting payment authorization can tentatively hold a seat using the “Reserve A Seat” option. Payment must be received within 7 business days of the reservation date or before the start of the workshop, whichever comes first. Seat reservations will be cancelled if payment is not received before the workshop.


Course overview
Text mining is an interdisciplinary area that primarily combines advances in Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML) to help computers understand written human language, and thus transform information from free text into structured knowledge. The volume of textual data has been growing rapidly. For instance, over 100,000 articles on COVID-19 are tracked in LitCovid, growing at a rate of 10,000 articles per month. Text mining techniques can be applied to respond to the pandemic. For example, automatically extracting the symptoms, diseases, and drugs mentioned in the articles would assist the diagnosis and treatment management of COVID-19 patients. More generally, the amount of text in popular databases is at the million or trillion scale. Such scale necessitates the development of text mining tools to facilitate data curation and knowledge discovery.
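To give a flavor of the extraction task described above, here is a minimal sketch of dictionary-based term matching using only the Python standard library. The lexicon, the function name extract_mentions, and the sample sentence are all invented for illustration; this is not the course's material or method, and real systems typically rely on curated vocabularies and machine learning models rather than a hand-written word list.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon, invented for illustration only.
LEXICON = {
    "symptom": ["fever", "cough", "shortness of breath"],
    "disease": ["COVID-19", "pneumonia"],
    "drug": ["remdesivir", "dexamethasone"],
}


def extract_mentions(text, lexicon=LEXICON):
    """Return {category: sorted list of lexicon terms found in text}."""
    found = defaultdict(set)
    for category, terms in lexicon.items():
        for term in terms:
            # Case-insensitive match that must not be embedded in a longer word.
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found[category].add(term)
    return {category: sorted(terms) for category, terms in found.items()}


# Made-up sentence standing in for an article abstract.
abstract = (
    "Patients with COVID-19 commonly presented with fever and cough; "
    "some received dexamethasone."
)
mentions = extract_mentions(abstract)
print(mentions)
```

Simple matching like this misses synonyms, abbreviations, and misspellings, which is one reason named entity recognition is usually approached with machine learning.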

This course will introduce participants to a comprehensive set of text mining topics, tools, and techniques. It will cover three primary components: (1) the basics of Python and its related packages, (2) an overview of the text mining pipeline and its techniques, and (3) an introduction to machine learning and the development of text mining applications using machine learning. Each component will have hands-on exercises and case studies for practice.


At the end of the course a learner should be able to:

Write basic Python code and use Python packages such as pandas, NumPy, and scikit-learn (sklearn) for textual data analysis

Understand text mining pipelines and develop text mining methods for text processing

Understand machine learning related concepts and develop text mining applications (such as text classification and named entity recognition) using machine learning techniques


Who should attend?
This course is designed for learners and interested individuals with little or no experience in text mining or machine learning.

Prior exposure to programming languages is highly recommended but not required. We will spend the first day on Python fundamentals. Learners without experience in programming languages are expected to practice and grasp the basic syntax of Python. Basic computer skills are required.

General Training Rate:

Discounted Training Rate:
$1,075.00 - NIH Community (Trainees, Contractors, Employees, Tenants working at one of the NIH campuses)
$1,195.00 - Academia, US Government (Non-NIH), US Military

Technology Fee

Although no grades are given for courses, each participant will receive Continuing Education Units (CEUs) based on the number of contact hours. One CEU is equal to ten contact hours. Upon completion of this course, each participant will receive a certificate showing completion of the workshop and 2.8 CEUs.

Refund Policy
100% tuition refund for registrations cancelled 14 or more calendar days prior to the start of the workshop.

50% tuition refund for registrations cancelled 4 to 13 calendar days prior to the start of the workshop.

No refund will be issued for registrations cancelled 3 or fewer calendar days prior to the start of the workshop.


All cancellations must be received in writing via email to Ms. Carline Coote at

Cancellations received after 4:00 pm (ET) on business days, or received on non-business days, are treated as received on the following business day.

All refund payments will be processed by the start of the initial workshop.
