Syllabus

Course information

First session: 25.10.2023
Time: Wednesdays, 11:30 - 13:00
Location (Room): Findelgasse, 2.024
Assignment and course language
  • Depending on the students’ feedback in the survey before the course starts, the course (incl. presentations, pitches, discussions, etc.) can be held in English or German.
  • Regardless of the feedback on the course language, the written assignments can be in German or English, depending on what the students/groups prefer.

Course description

In this seminar, students are introduced to working with digital behavioral data (DBD). DBD refer to digital traces of human behavior that are knowingly or unknowingly left in online environments (e.g. social media, messengers, entertainment media, or digital collaboration tools). These rich data are increasingly available to social scientific research in the public interest, but can also be used to derive strategic insights for business decisions.

Students learn how to work with DBD throughout the entire research process, from data collection, preprocessing and analysis, to reporting and provision (e.g. via open science tools). Students first get a comprehensive overview of the ways in which DBD can be collected (e.g. API scraping, usage logging, mock-up virtual environments, or data donations), as well as the requirements for data protection, research ethics, and data quality. Afterwards, students practice and apply their newly acquired knowledge in small projects on use cases from media and communication research. In doing so, they learn important computer-based methods with which large digital behavioral data sets (e.g. texts, images, usage behavior logs) can be processed and analyzed. By completing this module, participants will get an up-to-date overview of, and practical insights into, how the potential of observational data (digital traces) can be used to better understand the behavior of media users in digital environments.

Learning Objectives

Students will

  • gain an overview of and understand the central opportunities of DBD and the accompanying challenges for data collection and preprocessing
  • evaluate the strengths and weaknesses of different ways of collecting DBD
  • get to know and understand central requirements for data protection, research ethics, and data quality
  • get to know and gain an overview of key computational social science methods to analyze DBD
  • practice and apply knowledge of DBD, statistics, and data analysis in small projects of their own

Organization of the course

Registration for the course takes place via StudOn. There you will receive initial information and instructions. Please make sure that

  • you complete the short survey before the seminar begins.
  • you check if you have received the invitation and joined Zulip.

All slides, assignment instructions, an up-to-date schedule, and other course materials can be found on the course website. I will regularly send out course announcements by e-mail or Zulip. Make sure to check one or the other of these regularly.

(Preliminary) Schedule

Important information
  • Please note that this is a provisional schedule. Part of the Kick-Off session is the presentation, discussion and voting on different project ideas.
  • All sessions marked with a 🔨 are hands-on sessions in which we actively work with R.
  • All sessions marked with a 📚 or 📦 are presentation sessions where a group of students will give a detailed presentation.
Session Date Topic
Introduction
1 25.10.2023 Kick-Off
- 01.11.2023 🎃 Holiday (No Lecture)
2 08.11.2023 DBD: Overview
3 15.11.2023 🔨 Working with R
📂 Project 1 Analysis of media content
4 22.11.2023 📚 Digital disconnection
5 29.11.2023 📦 Data collection methods
6 06.12.2023 🔨 Text as data
7 13.12.2023 📊 Presentation & Discussion
8 20.12.2023 Buffer Session
- - 🎄 Christmas Break (No Lecture)
📂 Project 2 Analysis of media usage
9 11.01.2024 📚 Media habits & routines
10 18.01.2024 📦 Data donation methods
11 25.01.2024 🔨 Working with data logs
12 02.02.2024 📊 Presentation & Discussion
13 08.02.2024 🏁 Recap, Evaluation & Discussion
Note

For an even more detailed overview of the course schedule as well as the linked content of the individual sessions (e.g. slides or literature for the respective presentation), please see Schedule.

The course consists of three main parts:

Part I: Introduction

The first three sessions form the (theoretical) basis for the course.

  • The kick-off session is mainly for getting to know each other and organizing the course. 
  • The second session gives you an extended introduction to DBD, including challenges and important frameworks.
  • The third session is about practical work with R and RStudio.  

Part II: Analysis of media content (Project 1)

  • The focus of the second part is the observation and analysis of (written) social media content (e.g. posts and/or hashtags) with the help of different methods (e.g. topic modeling, sentiment analysis, etc.).

Part III: Analysis of media usage (Project 2)

  • The third part is about the effects of media use. Here you will learn how to process and analyze logging data and how to link it to survey data, using a survey conducted among the course participants during the first months of the semester.

Sessions

The goal of the sessions is to be as interactive as possible. In general, the sessions consist of two parts. In the first part (± 30 - 45 minutes) at the beginning of the session, there are usually presentations (including discussion), which are more or less detailed depending on the stage of the project. The second part (± 45 - 60 minutes) consists of a group activity (with concluding discussion), which should either be about deepening the presentation content or about independent work on one’s own or the group project.

My role as instructor is to introduce you to new tools and techniques, but it is up to you to take them and make use of them. A lot of what you do in this course will involve writing code, and coding is a skill that is best learned by doing. You are expected to bring a laptop to each class so that you can take part in the in-session exercises. Please make sure your laptop is fully charged before you come to class, as the number of outlets in the classroom will not be sufficient to accommodate everyone.

Where to ask questions

  • If you have a question during the lecture, feel free to ask it! There are likely other students with the same question, so by asking you will create a learning opportunity for everyone.
  • Any general questions about session content, assignments or about the project should be posted on Zulip, so that everyone can benefit from the answers. There is a chance another student has already asked a similar question, so please check the other posts before adding a new question. If you know the answer to a question, I encourage you to respond!
  • E-mails should be reserved for personal matters.

Assessment

In order to obtain credits and a grade, participants are required to

  1. attend regularly (at least 80% of the sessions) and participate actively. A maximum of two sessions can be missed without excuse. Absence from further sessions can only be excused in case of illness (i.e. with a medical certificate).
  2. complete various assignments as part of a portfolio. The type and scope of the assignments depend on the number of participants and the project(s). Detailed information can be found in the section Assignments.

Academic integrity

TL;DR

Do not cheat!

For general information on formatting, style, citation, appendices, wording of the affidavit, etc., see our Guide to Academic Writing.

Policy on sharing and reusing code

I am well aware that a huge volume of code is available on the web to solve any number of problems. Unless I explicitly tell you not to use something, the course’s policy is that you may make use of any online resources (e.g. StackOverflow) but you must explicitly cite where you obtained any code you directly use (or use as inspiration). Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism.

Policy on use of generative artificial intelligence (AI)

You should treat generative AI, such as ChatGPT, the same as other online resources. There are two guiding principles that govern how you can use AI in this course¹: (1) Cognitive dimension: Working with AI should not reduce your ability to think clearly. We will practice using AI to facilitate, rather than hinder, learning. (2) Ethical dimension: Students using AI should be transparent about their use and make sure it aligns with academic integrity.

  • ✅ AI tools for code: You may make use of the technology for coding examples on assignments; if you do so, you must explicitly cite where you obtained the code. Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism. You may use these guidelines for citing AI-generated content.

  • ❌ AI tools for narrative: Unless instructed otherwise, you may not use generative AI to write narrative on assignments. In general, you may use generative AI as a resource as you complete assignments but not to answer the exercises for you. You are ultimately responsible for the work you turn in; it should reflect your understanding of the course content.

Footnotes

  1. These guiding principles are based on Course Policies related to ChatGPT and other AI Tools developed by Joel Gladd, Ph.D.