Week 2) Method framework & first scraper

Week 2: Workflow & learning how to scrape

Learning goals

  • Differentiate between retrieving data from websites and APIs.
  • Retrieve and store web data in various formats using Python’s requests library and browser inspection tools.
  • Extract and manipulate data from websites and APIs using BeautifulSoup and JSON handling techniques.
  • Apply programming concepts to automate data collection and understand the use of Jupyter Notebooks vs. raw Python files.
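
The first and third goals can be illustrated with a minimal sketch. It runs offline on two hypothetical sample strings (SAMPLE_HTML and SAMPLE_JSON, invented for this illustration) rather than a live website or API, and it uses Python's built-in html.parser in place of BeautifulSoup so that nothing needs to be installed. The same titles are extracted twice: once from HTML, where the structure has to be parsed out of markup, and once from JSON, where the data arrives already structured.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample data standing in for what a server might return.
# A website returns HTML markup; an API typically returns JSON.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">Book A</h2>
  <h2 class="title">Book B</h2>
  <p class="note">Not a title</p>
</body></html>
"""
SAMPLE_JSON = '{"books": [{"title": "Book A"}, {"title": "Book B"}]}'


class TitleParser(HTMLParser):
    """Collect the text of every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h2" and dict(attrs).get("class") == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.titles.append(data.strip())


def titles_from_html(html):
    """Website route: parse the markup to locate the data."""
    parser = TitleParser()
    parser.feed(html)
    return parser.titles


def titles_from_json(text):
    """API route: the response is already structured data."""
    return [book["title"] for book in json.loads(text)["books"]]


html_titles = titles_from_html(SAMPLE_HTML)
api_titles = titles_from_json(SAMPLE_JSON)
print(html_titles)
print(api_titles)
```

With a live source, the input strings would typically come from requests.get(url).text, and BeautifulSoup's find_all("h2", class_="title") could replace the hand-written parser class; the tag and class names here are assumptions for the sake of the example.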

Lecture

Laptop required!

Coaching session

After the lecture and coaching session

  • Complete the exercises at the end of the tutorial (about 1-2 hours).

  • Watch “What is web scraping and what are Application Programming Interfaces (APIs)?” (30 minutes)

  • Read “Fields of Gold: Scraping Web Data for Marketing Insights”.

    This paper provides a guiding framework for the rest of this course, and chances are you’ll need to read it a couple of times (e.g., first to get an overview, and later to appreciate and use the details in your project). The web appendix contains valuable tables, so don’t skip it.

  • Finalize team enrollment on Canvas.
