Courses


Workshop

Introduction to Python for Data Science

Description

This is a two-day course that provides a gentle, hands-on introduction to the Python programming language for data science applications. Students will learn the fundamentals of Python as a language and how to work with data using the pandas library.

Objectives
  1. Develop comprehensive skills in importing, exporting, wrangling, aggregating, and joining data using Python.
  2. Establish a mental model of the Python programming language to enable future self-learning.
  3. Build awareness and basic skills in the core data science area of data visualization.
Link to Materials: GitHub
Next Session: April 30 and May 1, 2020


Workshop

Intermediate Python for Data Science

Description

This is a two-day course that provides more detailed coverage of how programming with Python can make working with data easier, while diving deeper into the Python data science ecosystem. Students will learn to write more efficient data science programs in Python using techniques such as control flow and custom functions.

Objectives
  1. Learn to use control flow and custom functions to work with data more efficiently.
  2. Build awareness and basic skills in working with Python from the shell and in managing Python environments.
  3. Gain exposure to Python’s data science ecosystem and modeling via scikit-learn.
Link to Materials: GitHub
Next Session: June 25 and 26, 2020 (Tentative)


Workshop

Advanced Python for Data Science

Description

This is a two-day course that introduces Python for advanced data science tasks, such as deep learning and natural language processing (NLP). Most of the time will be spent working through example problems end-to-end in the classroom. Students will learn the fundamentals of the Keras package (for deep learning) and will explore several NLP packages and methodologies to see the strengths of each. Some additional time will be reserved for discussing real programming challenges students have encountered and for an overview of related technologies students may need in an industry setting (e.g., Git and GitHub).

Objectives
  1. Develop an intuition for which problems are suited to deep learning or NLP-based solutions.
  2. Build familiarity with the basic interfaces of key Python libraries for deep learning and NLP: Keras, FuzzyWuzzy, and gensim.
  3. Gain a high-level understanding of the function of data science-adjacent technologies that students will encounter in the workplace, focusing on Git and GitHub.
Link to Materials: GitHub
Next Session: TBD


Course (2 Credit Hours)

Python for Data Science

Description

This is a 7-week, 2-credit-hour course focused on using Python for data science. Topics include data wrangling, interaction with data sources, visualization, running scripts, the Python ecosystem, functions, and modeling.

Objectives
  1. Expose students to the Python data science ecosystem’s libraries, capabilities, and vocabulary.
  2. Build students’ proficiency in the core data wrangling skills: importing data, reshaping data, transforming data, and exporting data.
  3. Develop students’ ability to use Python within both interactive (Jupyter, REPL) and non-interactive (scripts) environments.
  4. Explore various methods of producing output in Python: plotting, exporting to various data formats, converting notebooks to static files as deliverables, and writing to a SQL database.
  5. Expose students to modeling via scikit-learn and discuss the fundamentals of building models in Python.
  6. Teach students how and when to teach themselves, through a discussion of widely available Python resources.
Course Page
Next Session: Spring 2020