Syllabus

Course information

First session	Time	Location (Room)
25.10.2023	Wednesdays, 11:30 - 13:00	Findelgasse, 2.024

Assignment and course language

Depending on the students’ feedback in the survey before the course starts, the course (including presentations, pitches, discussions, etc.) will be held in English or German.
Regardless of the feedback on the course language, written assignments can be submitted in German or English, whichever the student or group prefers.

Course description

In this seminar, students are introduced to working with digital behavioral data (DBD). DBD are digital traces of human behavior that are knowingly or unknowingly left in online environments (e.g. social media, messengers, entertainment media, or digital collaboration tools). These rich data are increasingly available for social scientific research in the public interest, but can also be used to derive strategic insights for business decisions.

Students learn how to work with DBD throughout the entire research process, from data collection, preprocessing, and analysis to reporting and provision (e.g. via open science tools). Students first get a comprehensive overview of the ways in which DBD can be collected (e.g. API scraping, usage logging, mock-up virtual environments, or data donations), as well as the requirements for data protection, research ethics, and data quality. Afterwards, students practice and apply their newly acquired knowledge in small projects on use cases from media and communication research. In doing so, they learn important computer-based methods for processing and analyzing large digital behavioral data sets (e.g. texts, images, usage behavior logs). By completing this module, participants gain an up-to-date overview of, and practical insight into, how observational data (digital traces) can be used to better understand the behavior of media users in digital environments.

Learning Objectives

Students will

gain an overview of and understand the central opportunities of DBD and the accompanying challenges for data collection and preprocessing
evaluate the strengths and weaknesses of different ways of collecting DBD
get to know and understand central requirements for data protection, research ethics, and data quality
get to know key computational social science methods for analyzing DBD
practice and apply their knowledge of DBD, statistics, and data analysis in small projects of their own

Recommended prerequisites

Interest in social scientific perspectives on media, communication, and digital technologies.
Basic knowledge of working with statistical software such as Stata, R, Python, or SPSS is required.
Students are encouraged, but not required, to also attend the lecture Data Science: Foundations, Tools, Applications in Socio-Economics and Marketing.

Organization of the course

Registration for the course takes place via StudOn. There you will receive initial information and instructions. Please make sure that

you complete the short survey before the seminar begins.
you check that you have received the Zulip invitation and have joined Zulip.

All slides, assignment instructions, an up-to-date schedule, and other course materials can be found on the course website. I will regularly send out course announcements by e-mail or Zulip, so make sure to check at least one of them regularly.

(Preliminary) Schedule

Important information

Please note that this is a provisional schedule. Part of the Kick-Off session is the presentation, discussion and voting on different project ideas.
All sessions marked with a 🔨 are hands-on sessions in which we actively work with R.
All sessions marked with a 📚 or 📦 are presentation sessions where a group of students will give a detailed presentation.

Session	Date	Topic
		Introduction
1	25.10.2023	Kick-Off
-	01.11.2023	🎃 Holiday (No Lecture)
2	08.11.2023	DBD: Overview
3	15.11.2023	🔨 Working with R
	📂 Project 1	Analysis of media content
4	22.11.2023	📚 Digital disconnection
5	29.11.2023	📦 Data collection methods
6	06.12.2023	🔨 Text as data
7	13.12.2023	📊 Presentation & Discussion
8	20.12.2023	Buffer Session
-	-	🎄 Christmas Break (No Lecture)
	📂 Project 2	Analysis of media usage
9	11.01.2024	📚 Media habits & routines
10	18.01.2024	📦 Data donation methods
11	25.01.2024	🔨 Working with data logs
12	02.02.2024	📊 Presentation & Discussion
13	08.02.2024	🏁 Recap, Evaluation & Discussion

Note

For an even more detailed overview of the course schedule as well as the linked content of the individual sessions (e.g. slides or literature for the respective presentation), please see Schedule.

The course consists of three main parts:

Part I: Introduction

The first three sessions form the (theoretical) basis for the course.

The kick-off session is mainly for getting to know each other and organizing the course.
The second session gives you an extended introduction to DBD, including challenges and important frameworks.
The third session is about practical work with R and RStudio.

Part II: Analysis of media content (Project 1)

The focus of the second part is the observation and analysis of (written) social media content (e.g. posts and/or hashtags) with the help of different methods (e.g. topic modeling, sentiment analysis, etc.).

Part III: Analysis of media usage (Project 2)

The third part is about the effects of media use. Here you will learn how to process and analyze logging data and how to link it to survey data, using a survey conducted among the course participants during the first months of the semester.

Sessions

The goal is for the sessions to be as interactive as possible. In general, each session consists of two parts. The first part (approx. 30-45 minutes) usually consists of presentations (including discussion), which are more or less detailed depending on the stage of the project. The second part (approx. 45-60 minutes) consists of a group activity (with a concluding discussion), which either deepens the presentation content or is devoted to independent work on your own or your group’s project.

My role as instructor is to introduce you to new tools and techniques, but it is up to you to take them and make use of them. A lot of what you do in this course will involve writing code, and coding is a skill that is best learned by doing. You are expected to bring a laptop to each class so that you can take part in the in-session exercises. Please make sure your laptop is fully charged before you come to class, as the number of outlets in the classroom will not be sufficient to accommodate everyone.

Where to ask questions

If you have a question during the lecture, feel free to ask it! There are likely other students with the same question, so by asking you will create a learning opportunity for everyone.
Any general questions about session content, assignments or about the project should be posted on Zulip, so that everyone can benefit from the answers. There is a chance another student has already asked a similar question, so please check the other posts before adding a new question. If you know the answer to a question, I encourage you to respond!
E-mails should be reserved for personal matters.

Assessment

In order to obtain credits and a grade, participants are required to

attend regularly (at least 80% of the sessions) and participate actively. A maximum of two sessions can be missed without excuse. Absence from further sessions can only be excused in case of illness (i.e. with a medical certificate).
complete various assignments as part of a portfolio. The type and scope of the assignments depend on the number of participants and the project(s). Detailed information can be found in the section Assignments.

Academic integrity

TL;DR

Do not cheat!

For general information on formatting, style, citation, appendices, wording of the affidavit, etc., see our Guide to Academic Writing.

Policy on use of generative artificial intelligence (AI):

You should treat generative AI, such as ChatGPT, the same as other online resources. There are two guiding principles that govern how you can use AI in this course¹: (1) Cognitive dimension: Working with AI should not reduce your ability to think clearly. We will practice using AI to facilitate—rather than hinder—learning. (2) Ethical dimension: Students using AI should be transparent about their use and make sure it aligns with academic integrity.

✅ AI tools for code: You may make use of the technology for coding examples on assignments; if you do so, you must explicitly cite where you obtained the code. Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism. You may use these guidelines for citing AI-generated content.
❌ AI tools for narrative: Unless instructed otherwise, you may not use generative AI to write narrative on assignments. In general, you may use generative AI as a resource as you complete assignments but not to answer the exercises for you. You are ultimately responsible for the work you turn in; it should reflect your understanding of the course content.

Atteveldt, W. van, Trilling, D., & Arcíla, C. (2021). Computational Analysis of Communication: A Practical Introduction to the Analysis of Texts, Networks, and Images with Code Examples in Python and R. John Wiley & Sons.
Engel, U., Quan-Haase, A., Liu, S. X., & Lyberg, L. (2021). Handbook of Computational Social Science, Volume 1: Theory, Case Studies and Ethics (1st ed.). Routledge. https://doi.org/10.4324/9781003024583
Engel, U., Quan-Haase, A., Liu, S. X., & Lyberg, L. (2021). Handbook of Computational Social Science, Volume 2. Routledge. https://doi.org/10.4324/9781003025245
Haim, M. (2023). Computational Communication Science: Eine Einführung. Springer Fachmedien Wiesbaden. https://link.springer.com/10.1007/978-3-658-40171-9
Salganik, M. J. (2018). Bit by Bit: Social Research in the Digital Age. Princeton University Press.

Footnotes

¹ These guiding principles are based on Course Policies related to ChatGPT and other AI Tools, developed by Joel Gladd, Ph.D.